public class BretonWordTokenizer extends WordTokenizer
Constructor and Description |
---|
BretonWordTokenizer() |
Modifier and Type | Method and Description |
---|---|
List<String> | tokenize(String text) Tokenizes just like WordTokenizer with the exception that "c’h" is not split. |
Methods inherited from class WordTokenizer: getProtocols, getTokenizingCharacters, isEMail, isUrl, joinEMails, joinEMailsAndUrls, joinUrls
public List<String> tokenize(String text)
Specified by: tokenize in interface Tokenizer
Overrides: tokenize in class WordTokenizer
Parameters: text - Text to tokenize
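
The behaviour described for tokenize can be illustrated with a small standalone sketch. This is not the library's implementation: the class name BretonTokenizeSketch, the delimiter set, and the placeholder characters are all assumptions made for illustration, and the sketch assumes that delimiters are returned as tokens. It shows one plausible way to keep "c’h" (or "c'h") in one piece while still splitting on other apostrophes: hide the apostrophe behind a placeholder, split, then restore it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class BretonTokenizeSketch {

    // Hypothetical delimiter set for this sketch only; the real class
    // inherits its delimiters from WordTokenizer.getTokenizingCharacters().
    private static final String DELIMS = " \t\n.,;:!?()\"'\u2019";

    // Private-use placeholders, assumed never to occur in the input text.
    private static final char STRAIGHT = '\uE000'; // stands in for '
    private static final char CURLY = '\uE001';    // stands in for ’

    public static List<String> tokenize(String text) {
        // Hide the apostrophe inside c'h / c’h (any letter case) so that
        // the splitter below does not treat it as a delimiter.
        String hidden = text
                .replaceAll("(?i)(c)'(h)", "$1" + STRAIGHT + "$2")
                .replaceAll("(?i)(c)\u2019(h)", "$1" + CURLY + "$2");

        List<String> tokens = new ArrayList<>();
        // returnDelims = true: delimiters come back as one-character tokens.
        StringTokenizer splitter = new StringTokenizer(hidden, DELIMS, true);
        while (splitter.hasMoreTokens()) {
            String token = splitter.nextToken();
            // Put the original apostrophes back.
            tokens.add(token.replace(STRAIGHT, '\'').replace(CURLY, '\u2019'));
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "c’hi" stays whole; the apostrophe in "d'an" is still split off.
        System.out.println(tokenize("Va c\u2019hi"));
        System.out.println(tokenize("d'an"));
    }
}
```

With the real class, the equivalent call would be new BretonWordTokenizer().tokenize(text); the exact delimiter handling there is determined by WordTokenizer and may differ from this sketch.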