Modifier and Type | Class and Description |
---|---|
class | TagalogWordTokenizer |
Modifier and Type | Class and Description |
---|---|
class | GoogleStyleWordTokenizer: Tokenizes sentences into tokens the way Google does for its ngram index. |
Modifier and Type | Method and Description |
---|---|
protected WordTokenizer | EnglishNgramProbabilityRule.getGoogleStyleWordTokenizer() |
Modifier and Type | Class and Description |
---|---|
class | ArabicWordTokenizer |
class | PersianWordTokenizer |
Modifier and Type | Class and Description |
---|---|
class | BelarusianWordTokenizer: Specific to Belarusian, where apostrophes (', ’, ʼ) are part of the word. |
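
The apostrophe rule described for BelarusianWordTokenizer can be illustrated with a small standalone sketch. This is not the library's implementation: the class name ApostropheAwareTokenizerSketch and its methods are hypothetical, and the choices to treat an apostrophe as part of a word only when it sits between two letters, and to return separators as their own tokens, are assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the BelarusianWordTokenizer source.
class ApostropheAwareTokenizerSketch {

    // The three apostrophe variants named in the description:
    // ' (U+0027), ’ (U+2019), ʼ (U+02BC).
    private static final String APOSTROPHES = "'\u2019\u02BC";

    // A character belongs to a word if it is a letter or digit, or if it is
    // an apostrophe with a letter on both sides (assumption of this sketch).
    public static boolean isWordChar(String text, int i) {
        char c = text.charAt(i);
        if (Character.isLetterOrDigit(c)) {
            return true;
        }
        return APOSTROPHES.indexOf(c) >= 0
                && i > 0
                && i + 1 < text.length()
                && Character.isLetter(text.charAt(i - 1))
                && Character.isLetter(text.charAt(i + 1));
    }

    // Splits text into word tokens and single-character separator tokens.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (isWordChar(text, i)) {
                word.append(c);
            } else {
                if (word.length() > 0) {
                    tokens.add(word.toString());
                    word.setLength(0);
                }
                tokens.add(String.valueOf(c));
            }
        }
        if (word.length() > 0) {
            tokens.add(word.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Each word keeps its internal apostrophe; the spaces are separate tokens.
        System.out.println(tokenize("сям'я і аб’ява"));
    }
}
```

Under these assumptions, a quotation mark at the start or end of a word is still split off as its own token, while an apostrophe inside a word stays attached to it.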
Modifier and Type | Class and Description |
---|---|
class | BretonWordTokenizer |
Modifier and Type | Class and Description |
---|---|
class | CatalanWordTokenizer: Tokenizes a sentence into words. |
Modifier and Type | Class and Description |
---|---|
class | GermanWordTokenizer |
Modifier and Type | Class and Description |
---|---|
class | GreekWordTokenizer |
Modifier and Type | Class and Description |
---|---|
class | EnglishWordTokenizer |
Modifier and Type | Class and Description |
---|---|
class | EsperantoWordTokenizer |
Modifier and Type | Class and Description |
---|---|
class | SpanishWordTokenizer: Tokenizes a sentence into words. |
Modifier and Type | Class and Description |
---|---|
class | FrenchWordTokenizer: Tokenizes a sentence into words. |
Modifier and Type | Class and Description |
---|---|
class | GalicianWordTokenizer: Tokenizes a sentence into words. |
Modifier and Type | Class and Description |
---|---|
class | KhmerWordTokenizer: Tokenizes a sentence into words. |
Modifier and Type | Class and Description |
---|---|
class | DutchWordTokenizer |
Modifier and Type | Class and Description |
---|---|
class | PolishWordTokenizer |
Modifier and Type | Class and Description |
---|---|
class | PortugueseWordTokenizer: Tokenizes a sentence into words. |
Modifier and Type | Class and Description |
---|---|
class | RomanianWordTokenizer: Tokenizes a sentence into words. |
Modifier and Type | Class and Description |
---|---|
class | RussianWordTokenizer |