public class ChineseSentenceTokenizer extends Object implements SentenceTokenizer
| Constructor and Description |
|---|
| ChineseSentenceTokenizer() |
| Modifier and Type | Method and Description |
|---|---|
| void | setSingleLineBreaksMarksParagraph(boolean lineBreakParagraphs) Note: has no effect for Chinese. |
| boolean | singleLineBreaksMarksPara() Note: will always return false. |
| List<String> | tokenize(String text) Tokenize the given string into sentences. |
public List<String> tokenize(String text)
Specified by: tokenize in interface SentenceTokenizer
Specified by: tokenize in interface Tokenizer

public void setSingleLineBreaksMarksParagraph(boolean lineBreakParagraphs)
Specified by: setSingleLineBreaksMarksParagraph in interface SentenceTokenizer
Parameters: lineBreakParagraphs - if true, single line breaks are assumed to end a paragraph; if false, only two or more consecutive line breaks end a paragraph

public boolean singleLineBreaksMarksPara()
Note: will always return false.
Specified by: singleLineBreaksMarksPara in interface SentenceTokenizer
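
The contract described above (a tokenize method that returns sentences, a paragraph setter that is ignored, and a getter that always returns false) can be illustrated with a minimal, self-contained sketch. SketchTokenizer is a hypothetical stand-in, not the library class, and its splitting rule (break after the full-width marks 。！？) is an assumption made only for illustration; the real tokenizer may segment text differently.

```java
import java.util.ArrayList;
import java.util.List;

class SentenceContractSketch {

    /** Hypothetical stand-in mirroring the documented method signatures. */
    static class SketchTokenizer {

        // Assumption: a sentence ends at one of these full-width marks.
        private static final String SENTENCE_END = "。！？";

        /** Splits after each sentence-ending mark, keeping the mark with its sentence. */
        List<String> tokenize(String text) {
            List<String> sentences = new ArrayList<>();
            StringBuilder current = new StringBuilder();
            for (int i = 0; i < text.length(); i++) {
                char c = text.charAt(i);
                current.append(c);
                if (SENTENCE_END.indexOf(c) >= 0) {
                    sentences.add(current.toString());
                    current.setLength(0);
                }
            }
            if (current.length() > 0) {
                // Trailing text without a closing mark is returned as a final sentence.
                sentences.add(current.toString());
            }
            return sentences;
        }

        /** Accepted for interface compatibility; the value is ignored, as documented. */
        void setSingleLineBreaksMarksParagraph(boolean lineBreakParagraphs) {
            // Intentionally a no-op.
        }

        /** Always returns false, as documented. */
        boolean singleLineBreaksMarksPara() {
            return false;
        }
    }

    public static void main(String[] args) {
        SketchTokenizer tokenizer = new SketchTokenizer();
        tokenizer.setSingleLineBreaksMarksParagraph(true); // ignored
        System.out.println(tokenizer.singleLineBreaksMarksPara());
        for (String sentence : tokenizer.tokenize("你好。今天天气很好！")) {
            System.out.println(sentence);
        }
    }
}
```

Callers that share code across languages should therefore not rely on the paragraph setting to change this tokenizer's output.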