On-line demo of Xerox's language identifier (commercial)
Covers 47 languages; not very actively maintained.
I believe this was originally created by one of their Finnish researchers at XRCE Grenoble. I also got the impression that it was the first to make a conscious effort to support different character set encodings.
Fun Observation: the Danish sample Sentence uses ancient German-Style Capitalization Rules (-:
... and the Norwegian is (predictably) unlabelled, although I believe it's Bokmål. And it's incorrectly punctuated.
by era, 2006-06-19 01:25
history · language · language.identification · server · tool · 20060619-0123
http://www.xrce.xerox.com/competencies/content-analysis/tools/guesser-ISO-8859-1.en.html