Constructor and Description |
---|
ChineseTagger() |
Modifier and Type | Method and Description |
---|---|
AnalyzedTokenReadings | createNullToken(String token, int startPos) Create the AnalyzedToken used for whitespace and other non-words. |
AnalyzedToken | createToken(String token, String posTag) Create a token specific to the language of the implementing class. |
List<AnalyzedTokenReadings> | tag(List<String> sentenceTokens) Returns a list of AnalyzedTokens that assigns each term in the sentence some kind of part-of-speech information (not necessarily just one tag). |
public List<AnalyzedTokenReadings> tag(List<String> sentenceTokens) throws IOException
Description copied from interface: Tagger
Returns a list of AnalyzedTokens that assigns each term in the sentence some kind of part-of-speech information (not necessarily just one tag).
Note that this method takes exactly one sentence. An implementation may handle the first word of a sentence specially, since it is usually written with an uppercase letter.
Specified by: tag in interface Tagger
Parameters: sentenceTokens - the text as returned by a WordTokenizer
Throws: IOException
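The contract described for tag (the tokens of exactly one sentence go in, one readings entry per token comes out, and each entry may carry more than one POS tag) can be illustrated with stand-in types. This is only a minimal sketch and not LanguageTool's implementation: SketchToken, SketchReadings, SketchTagger, the toy lexicon, and the tag names N and V are all hypothetical and invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a single (token, POS tag) pair.
// NOT the real LanguageTool AnalyzedToken class.
final class SketchToken {
    final String token;
    final String posTag; // null means "no POS information available"

    SketchToken(String token, String posTag) {
        this.token = token;
        this.posTag = posTag;
    }
}

// Hypothetical stand-in for all readings of one token plus its start offset.
final class SketchReadings {
    final List<SketchToken> readings;
    final int startPos;

    SketchReadings(List<SketchToken> readings, int startPos) {
        this.readings = readings;
        this.startPos = startPos;
    }
}

// Hypothetical tagger backed by a toy in-memory lexicon (an assumption).
final class SketchTagger {
    private final Map<String, List<String>> lexicon = new HashMap<>();

    SketchTagger() {
        lexicon.put("book", Arrays.asList("N", "V")); // ambiguous: two tags
        lexicon.put("flights", Arrays.asList("N"));
    }

    // Takes the tokens of ONE sentence and returns one entry per token.
    List<SketchReadings> tag(List<String> sentenceTokens) {
        List<SketchReadings> result = new ArrayList<>();
        int pos = 0; // running character offset within the sentence
        for (String word : sentenceTokens) {
            List<SketchToken> readings = new ArrayList<>();
            List<String> tags = lexicon.get(word);
            if (tags == null) {
                // Unknown words and whitespace get a single reading with a null tag.
                readings.add(new SketchToken(word, null));
            } else {
                for (String tag : tags) {
                    readings.add(new SketchToken(word, tag));
                }
            }
            result.add(new SketchReadings(readings, pos));
            pos += word.length();
        }
        return result;
    }
}
```

In this sketch, calling tag with the tokens "book", " ", "flights" yields three entries: the first holds two readings, the second a single reading with a null tag, and the third starts at offset 5.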
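public final AnalyzedTokenReadings createNullToken(String token, int startPos)
Description copied from interface: Tagger
Create the AnalyzedToken used for whitespace and other non-words. Use null as the POS tag for this token.
Specified by: createNullToken in interface Tagger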
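public AnalyzedToken createToken(String token, String posTag)
Description copied from interface: Tagger
Create a token specific to the language of the implementing class.
Specified by: createToken in interface Tagger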
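The difference between the two factory methods can be sketched as follows: a null token wraps the text in a readings object with a null POS tag, while a regular token simply pairs the text with the given tag. This is an illustration under stated assumptions only. PlainToken, PlainReadings, and TokenFactorySketch are hypothetical stand-ins, not the real AnalyzedToken and AnalyzedTokenReadings classes, and the tag value used in the usage note is made up.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a (token, POS tag) pair.
final class PlainToken {
    final String token;
    final String posTag; // null marks whitespace and other non-words

    PlainToken(String token, String posTag) {
        this.token = token;
        this.posTag = posTag;
    }
}

// Hypothetical stand-in for the readings of one token at a given offset.
final class PlainReadings {
    final List<PlainToken> readings;
    final int startPos;

    PlainReadings(List<PlainToken> readings, int startPos) {
        this.readings = readings;
        this.startPos = startPos;
    }
}

final class TokenFactorySketch {
    // Mirrors the shape of createNullToken: one reading, null POS tag.
    static PlainReadings createNullToken(String token, int startPos) {
        PlainToken nullTagged = new PlainToken(token, null);
        return new PlainReadings(Collections.singletonList(nullTagged), startPos);
    }

    // Mirrors the shape of createToken: a plain token with the given tag.
    static PlainToken createToken(String token, String posTag) {
        return new PlainToken(token, posTag);
    }
}
```

For example, TokenFactorySketch.createNullToken(" ", 4) produces a single reading whose posTag is null at start position 4, whereas TokenFactorySketch.createToken("x", "n") keeps the tag it was given.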