public class SimpleSentenceTokenizer extends SRXSentenceTokenizer
Splits text into sentences at [.!?…] followed by whitespace or an uppercase letter. You probably want to use an adapted SRXSentenceTokenizer instead.

Constructor and Description |
---|
SimpleSentenceTokenizer() |

Methods inherited from class SRXSentenceTokenizer: setSingleLineBreaksMarksParagraph, singleLineBreaksMarksPara, tokenize
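
The splitting rule in the class description (one of `.`, `!`, `?`, `…` followed by whitespace or an uppercase letter) can be illustrated with plain `java.util.regex`. This is only a sketch of the rule as described, not the library's implementation: the class `SentenceSplitSketch` and its `split` method are hypothetical names, the code does not call any LanguageTool API, and trimming the whitespace between sentences is an assumption made here for readability.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration of the described rule; not the LanguageTool code.
final class SentenceSplitSketch {

    // Zero-width boundary: directly after one of . ! ? or U+2026 (horizontal
    // ellipsis), and directly before whitespace or an uppercase letter.
    private static final Pattern BOUNDARY =
            Pattern.compile("(?<=[.!?\u2026])(?=\\s|\\p{Lu})");

    // Splits the text at every boundary and drops surrounding whitespace
    // (the trimming is an assumption of this sketch).
    static List<String> split(String text) {
        List<String> sentences = new ArrayList<>();
        for (String part : BOUNDARY.split(text)) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                sentences.add(trimmed);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        String text = "It works. Does it? Yes!Mostly. Dr. Smith disagrees.";
        for (String sentence : split(text)) {
            System.out.println(sentence);
        }
    }
}
```

In this sketch the abbreviation "Dr." ends a sentence like any other period, so "Dr. Smith disagrees." comes out as two pieces. A purely punctuation-based rule cannot tell abbreviations from sentence ends, which is presumably why the description recommends an adapted SRXSentenceTokenizer.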