public class ArabicTagger extends BaseTagger
Fields inherited from class BaseTagger: locale, wordTagger
Constructor and Description |
---|
ArabicTagger() |
Modifier and Type | Method and Description |
---|---|
protected List<AnalyzedToken> | additionalTags(String word, morfologik.stemming.IStemmer stemmer) |
void | enableNewStylePronounTag() |
String | getEnclitic(AnalyzedToken token) |
String | getJarProclitic(AnalyzedToken token) |
List<String> | getLemmas(AnalyzedTokenReadings patternTokens, String type) |
String | getProclitic(AnalyzedToken token) |
List<AnalyzedTokenReadings> | tag(List<String> sentenceTokens) Returns a list of AnalyzedTokens that assigns each term in the sentence some kind of part-of-speech information (not necessarily just one tag). |
AnalyzedTokenReadings | tag(String word) |
Methods inherited from class BaseTagger: additionalTags, asAnalyzedToken, asAnalyzedTokenList, asAnalyzedTokenListForTaggedWords, createNullToken, createToken, getAnalyzedTokens, getDictionary, getDictionaryPath, getManualAdditionsFileNames, getManualRemovalsFileNames, getWordTagger, overwriteWithManualTagger
public List<AnalyzedTokenReadings> tag(List<String> sentenceTokens)
Description copied from interface: Tagger
Returns a list of AnalyzedTokens that assigns each term in the sentence some kind of part-of-speech information (not necessarily just one tag).
Note that this method takes exactly one sentence. An implementation may handle the first word of a sentence as a special case, since it is usually written with an uppercase letter.
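The contract described above (one sentence in, one entry per token out, each entry possibly carrying several readings) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: `Reading` and `TokenReadings` are hypothetical stand-ins for `AnalyzedToken` and `AnalyzedTokenReadings`, the dictionary and its tags are invented, and the lowercase retry for the first token is only one possible sentence-initial special case. It does not reflect how ArabicTagger itself is implemented.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Simplified model of the tagging contract. Reading and TokenReadings are
// hypothetical stand-ins, not LanguageTool's AnalyzedToken and AnalyzedTokenReadings.
class TaggerContractSketch {

    /** One reading of a token: a POS tag and a lemma (both null if unknown). */
    static final class Reading {
        final String posTag;
        final String lemma;

        Reading(String posTag, String lemma) {
            this.posTag = posTag;
            this.lemma = lemma;
        }
    }

    /** All readings collected for one token of the sentence. */
    static final class TokenReadings {
        final String token;
        final List<Reading> readings;

        TokenReadings(String token, List<Reading> readings) {
            this.token = token;
            this.readings = readings;
        }
    }

    // Toy dictionary with invented tags; a real tagger reads a compiled dictionary.
    static final Map<String, List<Reading>> DICTIONARY = Map.of(
        "the", List.of(new Reading("DET", "the")),
        "book", List.of(new Reading("NOUN", "book"), new Reading("VERB", "book")));

    /** Tags exactly one sentence, returning one entry per input token, in order. */
    static List<TokenReadings> tag(List<String> sentenceTokens) {
        List<TokenReadings> result = new ArrayList<>();
        for (int i = 0; i < sentenceTokens.size(); i++) {
            String token = sentenceTokens.get(i);
            List<Reading> readings = DICTIONARY.get(token);
            if (readings == null && i == 0) {
                // Sentence-initial special case: retry with the lowercased form.
                readings = DICTIONARY.get(token.toLowerCase(Locale.ROOT));
            }
            if (readings == null) {
                // Unknown word: keep the token, with a single untagged reading.
                readings = List.of(new Reading(null, null));
            }
            result.add(new TokenReadings(token, readings));
        }
        return result;
    }

    public static void main(String[] args) {
        for (TokenReadings tr : tag(List.of("The", "book", "xyz"))) {
            System.out.println(tr.token + " has " + tr.readings.size() + " reading(s)");
        }
    }
}
```

In this sketch the ambiguous token keeps both of its readings, and an unknown token still gets an entry, so the result list always has the same length as the input list.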
Specified by: tag in interface Tagger
Overrides: tag in class BaseTagger
Parameters: sentenceTokens - the text as returned by a WordTokenizer
@Nullable protected List<AnalyzedToken> additionalTags(String word, morfologik.stemming.IStemmer stemmer)
public void enableNewStylePronounTag()
public String getProclitic(AnalyzedToken token)
public String getEnclitic(AnalyzedToken token)
public String getJarProclitic(AnalyzedToken token)
public AnalyzedTokenReadings tag(String word)
public List<String> getLemmas(AnalyzedTokenReadings patternTokens, String type)