public class ChineseSentenceTokenizer extends Object implements SentenceTokenizer
Constructor and Description |
---|
ChineseSentenceTokenizer() |
Modifier and Type | Method and Description |
---|---|
void | setSingleLineBreaksMarksParagraph(boolean lineBreakParagraphs) Note: has no effect for Chinese. |
boolean | singleLineBreaksMarksPara() Note: always returns false. |
List<String> | tokenize(String text) Tokenize the given string into sentences. |
public List<String> tokenize(String text)
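Tokenize the given string into sentences (description copied from interface SentenceTokenizer).
Specified by: tokenize in interface SentenceTokenizer
Specified by: tokenize in interface Tokenizer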
public void setSingleLineBreaksMarksParagraph(boolean lineBreakParagraphs)
Specified by: setSingleLineBreaksMarksParagraph in interface SentenceTokenizer
Parameters: lineBreakParagraphs - if true, single line breaks are assumed to end a paragraph; if false, only two or more consecutive line breaks end a paragraph
public boolean singleLineBreaksMarksPara()
Note: always returns false.
Specified by: singleLineBreaksMarksPara in interface SentenceTokenizer
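The contract described above (tokenize returns a list of sentences, the paragraph setter has no effect, and the getter always returns false) can be illustrated with a small self-contained sketch. Everything in it is an assumption made for illustration: SentenceTokenizerSketch and ChineseSentenceTokenizerSketch are hypothetical stand-ins, not the library's classes, and splitting on the full-width terminators 。！？ is a simplification that is not claimed to match how the real ChineseSentenceTokenizer finds sentence boundaries.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the SentenceTokenizer interface; it declares only
// the three methods listed in the method summary above.
interface SentenceTokenizerSketch {
    List<String> tokenize(String text);
    void setSingleLineBreaksMarksParagraph(boolean lineBreakParagraphs);
    boolean singleLineBreaksMarksPara();
}

// Illustrative implementation only; NOT the library's algorithm.
class ChineseSentenceTokenizerSketch implements SentenceTokenizerSketch {
    // Assumption: a sentence ends at one of these full-width punctuation marks.
    private static final String TERMINATORS = "。！？";

    @Override
    public List<String> tokenize(String text) {
        List<String> sentences = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            current.append(c);
            if (TERMINATORS.indexOf(c) >= 0) {
                // Terminator reached: close the current sentence.
                sentences.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            // Keep trailing text that has no terminator.
            sentences.add(current.toString());
        }
        return sentences;
    }

    @Override
    public void setSingleLineBreaksMarksParagraph(boolean lineBreakParagraphs) {
        // Mirrors the documented behavior: the setting has no effect for Chinese.
    }

    @Override
    public boolean singleLineBreaksMarksPara() {
        // Mirrors the documented behavior: always false.
        return false;
    }
}

class ChineseSentenceTokenizerSketchDemo {
    public static void main(String[] args) {
        SentenceTokenizerSketch tokenizer = new ChineseSentenceTokenizerSketch();
        tokenizer.setSingleLineBreaksMarksParagraph(true); // ignored
        for (String sentence : tokenizer.tokenize("今天天气很好。你吃饭了吗？")) {
            System.out.println(sentence);
        }
        System.out.println(tokenizer.singleLineBreaksMarksPara());
    }
}
```

Callers that are written against the interface can therefore call the setter unconditionally; for this tokenizer the call is harmless and the getter still reports false afterwards.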