public class SimpleLanguageIdentifier extends LanguageIdentifier
Nested classes inherited from class LanguageIdentifier: LanguageIdentifier.ParsedLanguageLists
Fields inherited from class LanguageIdentifier: COMMON_WORDS_LANG_IDENTIFIER, CONSIDER_ONLY_PREFERRED_THRESHOLD, maxLength, NON_LATIN_CHARS_LANGUAGES, REMOVE_EMAIL_SIGNATURE_FILTER, REMOVE_MENTION_FILTER, REMOVE_NON_BREAKING_SPACES_FILTER, REMOVE_URL_FILTER, SCORE_THRESHOLD, UNICODE_BASED_LANG_IDENTIFIER
Constructor and Description |
---|
SimpleLanguageIdentifier() |
SimpleLanguageIdentifier(List<String> preferredLangCodes) |
Modifier and Type | Method and Description |
---|---|
Language | detectLanguage(String cleanText) |
DetectedLanguage | detectLanguage(String cleanText, List<String> noopLangsTmp, List<String> preferredLangsTmp) |
DetectedLanguage | detectLanguage(String cleanText, List<String> noopLangsTmp, List<String> preferredLangsTmp, boolean limitOnPreferredLangs) |
Methods inherited from class LanguageIdentifier: cleanAndShortenText, getHighestScoringResult, prepareDetectLanguage
@Nullable public DetectedLanguage detectLanguage(String cleanText, List<String> noopLangsTmp, List<String> preferredLangsTmp)
detectLanguage
in class LanguageIdentifier
cleanText
- a cleanText as returned by LanguageIdentifier.cleanAndShortenText(String)
noopLangsTmp
- list of codes that are detected but will lead to the NoopLanguage that has no rules
null
if language could not be identified
@Nullable public DetectedLanguage detectLanguage(String cleanText, List<String> noopLangsTmp, List<String> preferredLangsTmp, boolean limitOnPreferredLangs)
detectLanguage
in class LanguageIdentifier
@Nullable public Language detectLanguage(String cleanText)
detectLanguage
in class LanguageIdentifier
cleanText
- a cleanText as returned by LanguageIdentifier.cleanAndShortenText(String)
null
if language could not be identified
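Because every detectLanguage overload is marked @Nullable, callers need a fallback for text that cannot be identified. The following is a minimal, self-contained sketch of one way to handle that. It does not call the library: NullSafeDetection, describe, and toyDetector are hypothetical names, and the toy detector returns a plain String code, whereas the real detectLanguage(String) returns a Language.

```java
import java.util.Optional;
import java.util.function.Function;

class NullSafeDetection {

    // Hypothetical stand-in for a detector such as SimpleLanguageIdentifier.
    // Assumption: it returns null when no language is identified, mirroring
    // the @Nullable contract documented above.
    static String toyDetector(String cleanText) {
        // Toy rule for illustration only; not the library's algorithm.
        return cleanText.contains("the ") ? "en" : null;
    }

    // Wraps the nullable result so the "not identified" case is handled
    // explicitly instead of risking a NullPointerException later.
    static String describe(Function<String, String> detector, String cleanText) {
        return Optional.ofNullable(detector.apply(cleanText))
                .map(code -> "detected: " + code)
                .orElse("unknown language");
    }

    public static void main(String[] args) {
        System.out.println(describe(NullSafeDetection::toyDetector, "the quick brown fox"));
        System.out.println(describe(NullSafeDetection::toyDetector, "12345"));
    }
}
```

With the real class, the same pattern would presumably wrap the result of detectLanguage(cleanText) in Optional.ofNullable before using it.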