Description
Intellexer Language Recognizer accurately determines not only the language of the whole document, but also the language of each text fragment.
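To illustrate what document-level versus fragment-level recognition means, here is a minimal sketch in Python. It is not Intellexer's algorithm or API: the stopword lists, the function names `guess_language` and `detect_fragments`, and the sentence-splitting rule are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stopword lists (hypothetical; a real recognizer
# relies on far richer statistical models and many more languages).
STOPWORDS = {
    "en": {"the", "and", "is", "of", "to", "in", "it", "this"},
    "de": {"der", "die", "das", "und", "ist", "nicht", "ein", "zu"},
    "fr": {"le", "la", "les", "et", "est", "un", "une", "des"},
    "es": {"el", "los", "y", "es", "una", "que", "en", "las"},
}


def guess_language(text):
    """Return the language whose stopwords occur most often in text,
    or None when no stopword from any list is found."""
    words = re.findall(r"\w+", text.lower())
    scores = Counter()
    for lang, stops in STOPWORDS.items():
        scores[lang] = sum(1 for w in words if w in stops)
    lang, hits = scores.most_common(1)[0]
    return lang if hits > 0 else None


def detect_fragments(document):
    """Split the document into sentence-like fragments and label each one."""
    fragments = re.split(r"(?<=[.!?])\s+", document.strip())
    return [(frag, guess_language(frag)) for frag in fragments if frag]


if __name__ == "__main__":
    doc = (
        "This is the first sentence of the report. "
        "Das ist nicht der Anfang und nicht das Ende. "
        "Le chat est sur la table et les chiens."
    )
    print("whole document:", guess_language(doc))
    for fragment, lang in detect_fragments(doc):
        print(lang, "|", fragment)
```

Calling `guess_language` on the full text gives a single label for the document, while `detect_fragments` labels each sentence separately, which is the distinction the description draws for mixed-language documents.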
For our evaluation experiments, we created a dataset of more than 1,300,000 documents in different languages (English, German, French, Spanish, Japanese, Chinese, etc.). On this collection, we achieved language identification accuracy of 95% to 99% (typical competitors' results: 86% to 96%). The average processing speed was over 8,000 KB/s.