Compound recognition – one of three fundamental natural language processing tasks.
Morphosyntactic tagging – extends part-of-speech (POS) tagging with morphosyntactic features (e.g. case, number, gender) determined from the context.
MULTEXT-East – language resources, a multilingual dataset for language engineering research, focused on the morphosyntactic level of linguistic description. The MULTEXT-East dataset includes the EAGLES-based morphosyntactic specifications, morphosyntactic lexicons, and annotated multilingual corpora. The parallel corpus, the novel “1984” by George Orwell, is sentence-aligned and contains hand-validated morphosyntactic descriptions and lemmas. The resources are…
Multiword expressions – a type of lexical unit, made of two or more sequential tokens.
Named Entity (word type)
Named entity (NE) recognition – one of three fundamental natural language processing tasks.