Language Identification (State of the art)
"Standard" measure:
"Standard" datasets:
System Name | Short Description | Main Publications | Software (if available) | Results | Comments (e.g. extra resources used, train/test times, ...) |
---|---|---|---|---|---|
SystemName | How does it work? | Author and Article [1] | Software? | 98% according to... | Any extra comments? |
textcat | Character n-gram frequency profiles, matched by rank-order ("out-of-place") distance | Cavnar, W. B. and J. M. Trenkle (1994) "N-Gram-Based Text Categorization" | Yes | - | - |
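
As a rough illustration of the n-gram matching idea behind the textcat entry, the sketch below builds ranked character n-gram profiles and picks the language whose profile is closest under an out-of-place distance. It is not the textcat implementation: the function names (`ngram_profile`, `out_of_place`, `identify`), the parameters `max_n` and `top_k` with their default values, and the toy training sentences are all assumptions made for this example.

```python
from collections import Counter


def ngram_profile(text, max_n=3, top_k=300):
    """Return the top_k most frequent character n-grams (n = 1..max_n), ranked.

    max_n and top_k are illustrative defaults, not values taken from textcat.
    """
    counts = Counter()
    for token in text.lower().split():
        padded = f"_{token}_"  # mark word boundaries with underscores
        for n in range(1, max_n + 1):
            for i in range(len(padded) - n + 1):
                counts[padded[i:i + n]] += 1
    # Sort by descending frequency; break ties alphabetically so the
    # ranking is deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:top_k]]


def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences; n-grams missing from lang_profile get a fixed penalty."""
    rank = {gram: r for r, gram in enumerate(lang_profile)}
    max_penalty = len(lang_profile)
    return sum(
        abs(r - rank[gram]) if gram in rank else max_penalty
        for r, gram in enumerate(doc_profile)
    )


def identify(text, lang_profiles):
    """Return the language whose profile has the smallest out-of-place distance."""
    doc_profile = ngram_profile(text)
    return min(lang_profiles, key=lambda lang: out_of_place(doc_profile, lang_profiles[lang]))


# Hypothetical toy training data; a real system would use far larger samples.
TRAIN = {
    "en": "the quick brown fox jumps over the lazy dog and the cat sat on the mat",
    "de": "der schnelle braune fuchs springt ueber den faulen hund und die katze sitzt auf der matte",
}
PROFILES = {lang: ngram_profile(sample) for lang, sample in TRAIN.items()}

if __name__ == "__main__":
    print(identify("the dog sat on the mat", PROFILES))
```

With samples this small the result is only indicative. The rank-based distance is meant to make the comparison depend on which n-grams are common in a language rather than on raw counts, so profiles built from texts of different lengths stay comparable.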