XeLDA models and standardizes unstructured documents so that their content can be exploited automatically. Built on technology developed over 20 years of research and development, XeLDA provides advanced text mining features for processing textual information.
TEMIS further explains that XeLDA offers a scalable range of services based on natural language processing components that can be integrated into business applications, enabling:
• automatic identification of the language within each document,
• segmentation of text into sentences,
• splitting of text into basic lexical units (tokenization),
• morphological text analysis to return the normalized form (the lemma) and the potential grammatical categories for all the words identified during the tokenization stage,
• morpho-syntactic disambiguation to determine the exact grammatical category of a word according to its context,
• extraction of sequences of words that form noun phrases,
• identification of the context of a word to find the corresponding dictionary entry (dictionary lookup), and
• recognition of idiomatic expressions.
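
To make a few of the stages listed above concrete, here is a minimal sketch in Python of sentence segmentation, tokenization, morphological analysis and a very simple disambiguation step. It is an illustration only and does not use or reproduce XeLDA's API: the function names, the tiny lexicon and the single disambiguation rule are all hypothetical, and a real system would rely on full dictionaries and statistical or rule-based models.

```python
import re

# Toy lexicon (hypothetical): surface form -> possible (lemma, category) readings.
LEXICON = {
    "the": [("the", "DET")],
    "dogs": [("dog", "NOUN")],
    "bark": [("bark", "NOUN"), ("bark", "VERB")],
    "loudly": [("loudly", "ADV")],
    "is": [("be", "VERB")],
    "rough": [("rough", "ADJ")],
}


def split_sentences(text):
    """Segment text after '.', '!' or '?' when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def tokenize(sentence):
    """Split a sentence into word tokens and punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", sentence)


def analyze(token):
    """Morphological analysis: return every (lemma, category) reading."""
    if not token[0].isalnum():
        return [(token, "PUNCT")]
    return LEXICON.get(token.lower(), [(token.lower(), "UNKNOWN")])


def disambiguate(readings_per_token):
    """Keep one reading per token using a single contextual rule (an
    assumption for this sketch): right after a determiner prefer NOUN,
    otherwise prefer VERB."""
    chosen = []
    for readings in readings_per_token:
        pick = readings[0]
        if len(readings) > 1:
            prev_cat = chosen[-1][1] if chosen else None
            wanted = "NOUN" if prev_cat == "DET" else "VERB"
            pick = next((r for r in readings if r[1] == wanted), readings[0])
        chosen.append(pick)
    return chosen


def process(text):
    """Run the stages in order; returns one list of (token, reading) per sentence."""
    result = []
    for sentence in split_sentences(text):
        tokens = tokenize(sentence)
        tagged = disambiguate([analyze(t) for t in tokens])
        result.append(list(zip(tokens, tagged)))
    return result


if __name__ == "__main__":
    for sentence in process("The dogs bark loudly. The bark is rough."):
        print(sentence)
```

In this sketch the word "bark" receives two readings from the analysis step, and the disambiguation step keeps the verb reading in the first sentence and the noun reading in the second, based only on the category of the preceding word.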