public class SRXSentenceTokenizer extends Object implements SentenceTokenizer
Constructor and Description |
---|
SRXSentenceTokenizer(Language language) Build a sentence tokenizer based on the rules in the segment.srx file that comes with LanguageTool. |
SRXSentenceTokenizer(Language language, String srxInClassPath) |
Modifier and Type | Method and Description |
---|---|
void | setSingleLineBreaksMarksParagraph(boolean lineBreakParagraphs) |
boolean | singleLineBreaksMarksPara() |
List<String> | tokenize(String text) Tokenize the given string to sentences. |
public SRXSentenceTokenizer(Language language)
Build a sentence tokenizer based on the rules in the segment.srx file that comes with LanguageTool.
public final List<String> tokenize(String text)
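Tokenize the given string to sentences (description copied from interface SentenceTokenizer).
Specified by: tokenize in interface SentenceTokenizer
Specified by: tokenize in interface Tokenizer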
public final boolean singleLineBreaksMarksPara()
Specified by: singleLineBreaksMarksPara in interface SentenceTokenizer
public final void setSingleLineBreaksMarksParagraph(boolean lineBreakParagraphs)
Specified by: setSingleLineBreaksMarksParagraph in interface SentenceTokenizer
Parameters: lineBreakParagraphs - if true, single line breaks are assumed to end a paragraph; if false, only two or more consecutive line breaks end a paragraph
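The paragraph rule described for lineBreakParagraphs can be illustrated with a small standalone sketch. It is not LanguageTool's implementation and does not use the SRX rules. The class ParagraphBreakSketch and the method splitParagraphs are hypothetical names introduced here, and the sketch assumes that a paragraph boundary is simply a run of line breaks of the required length.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of LanguageTool. */
final class ParagraphBreakSketch {

    /**
     * Splits text into paragraphs.
     * If singleLineBreakEndsParagraph is true, every line break ends a paragraph;
     * otherwise only a run of two or more consecutive line breaks does.
     */
    static List<String> splitParagraphs(String text, boolean singleLineBreakEndsParagraph) {
        // Normalize Windows (\r\n) and bare \r line endings so only '\n' remains.
        String normalized = text.replace("\r\n", "\n").replace('\r', '\n');
        // Regex for the paragraph separator: one or more breaks, or two or more.
        String separator = singleLineBreakEndsParagraph ? "\n+" : "\n{2,}";
        List<String> paragraphs = new ArrayList<>();
        for (String part : normalized.split(separator)) {
            // Skip empty or whitespace-only pieces (for example from leading line breaks).
            if (!part.trim().isEmpty()) {
                paragraphs.add(part);
            }
        }
        return paragraphs;
    }

    public static void main(String[] args) {
        String text = "First line.\nSecond line.\n\nNew paragraph.";
        // With true, each of the three lines is its own paragraph.
        System.out.println(splitParagraphs(text, true));
        // With false, the first two lines stay together in one paragraph.
        System.out.println(splitParagraphs(text, false));
    }
}
```

With the real class, the corresponding switch is setSingleLineBreaksMarksParagraph(boolean), called before tokenize(String); how the flag interacts with the SRX segmentation rules is not covered by this sketch.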