Turkle
Turkle is a Django web application that provides a clone of Amazon's Mechanical Turk service in your local environment, allowing you to collect local expert annotations with the same templates and data files you use for crowd annotation. Meanwhile, our pip-installable ProtoTurk server can be used to rapidly prototype new templates and data files.
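To illustrate the template-plus-data-file workflow, here is a minimal sketch, not Turkle's own code. It assumes the Mechanical Turk convention in which a template holds `${column}` placeholders and each row of a CSV data file fills them to produce one task. The function name `render_tasks`, the sample template, and the sample data are all hypothetical.

```python
import csv
import io
from string import Template


def render_tasks(template_html, csv_text):
    """Return one rendered task per CSV row (hypothetical helper)."""
    template = Template(template_html)
    reader = csv.DictReader(io.StringIO(csv_text))
    # safe_substitute leaves unknown placeholders untouched instead of raising.
    return [template.safe_substitute(row) for row in reader]


# Made-up template and data file, for illustration only.
template_html = "<p>Is this sentence positive? ${sentence}</p>"
csv_text = "sentence\nI love this.\nNot great.\n"

tasks = render_tasks(template_html, csv_text)
for task in tasks:
    print(task)
```

Because the template and CSV file are plain files, the same pair can in principle be used for local annotation and for crowd annotation.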
Getting Started | ProtoTurk
Patapsco
Patapsco is a scalable Python framework for reproducible cross-language information retrieval (CLIR) experiments.
Repository | Colab Demo
Concrete
Concrete is a cross-platform data serialization format and communication protocol for language annotations. It replaces ad-hoc TSV, XML, JSON, and other formats for storing document- and sentence-level language annotations. We developed Concrete to record and share annotations on structured human language data, including both text and speech.
Getting Started | Python | JavaScript | Java
Concretely Annotated Corpora
Under the heading Concretely Annotated, we processed a variety of standard corpora with multiple popular NLP tool-chains using the Concrete data schema.
Wikipedia | English Gigaword | The New York Times