language identification
Wikipedia
a Python implementation
the code is available and readable
word set difference
simulation of 'Evaluation of a language identification system for mono- and multilingual text documents'
Language detection with Python NLTK
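
a minimal sketch of the word set idea, for orientation only: tokenize the text, compare its word set against a stopword set per language, and pick the language with the largest overlap. The notes above do not describe the exact method, so the set intersection scoring, the function names (tokenize, score_languages, guess_language) and the tiny STOPWORDS table are all assumptions made for illustration. A real setup would likely use much larger lists, such as the NLTK stopwords corpus.

```python
import re

# Hypothetical, deliberately tiny stopword sets (illustration only).
# Real systems use far larger lists, e.g. the NLTK stopwords corpus.
STOPWORDS = {
    "english": {"the", "and", "is", "of", "to", "in", "that", "it"},
    "french": {"le", "la", "et", "est", "de", "les", "que", "un"},
    "german": {"der", "die", "und", "ist", "das", "nicht", "ein", "zu"},
}


def tokenize(text):
    """Return the set of lowercase alphabetic words found in text."""
    # [^\W\d_]+ matches runs of letters, including accented ones.
    return set(re.findall(r"[^\W\d_]+", text.lower()))


def score_languages(text):
    """Count, per language, how many stopwords also occur in the text."""
    words = tokenize(text)
    return {lang: len(words & stops) for lang, stops in STOPWORDS.items()}


def guess_language(text):
    """Return the best-scoring language, or None if nothing overlaps."""
    scores = score_languages(text)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    for sample in ("the cat is in the garden", "le chat est dans la maison"):
        print(sample, "->", guess_language(sample))
```

ranking by the size of the set difference (text words minus stopwords) would give the same ordering here, since a larger overlap means a smaller difference. Short texts and closely related languages can easily tie or be misclassified with this approach.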