public class DutchWordTokenizer extends WordTokenizer
| Constructor and Description |
|---|
| DutchWordTokenizer() |
| Modifier and Type | Method and Description |
|---|---|
| String | getTokenizingCharacters() |
| List<String> | tokenize(String text) Tokenizes just like WordTokenizer, except for words such as "oma's" that contain an apostrophe in the middle. |
Methods inherited from class WordTokenizer: getProtocols, isEMail, isUrl, joinEMails, joinEMailsAndUrls, joinUrls

public List<String> tokenize(String text)
Specified by: tokenize in interface Tokenizer
Overrides: tokenize in class WordTokenizer
Parameters: text - Text to tokenize

public String getTokenizingCharacters()
Overrides: getTokenizingCharacters in class WordTokenizer
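
The apostrophe handling described for tokenize(String text) can be illustrated with a small, self-contained sketch. This is not the DutchWordTokenizer implementation: the class name ApostropheSketch, the delimiter set DELIMS, and the rejoin rule (letters, apostrophe, letters) are all assumptions made for illustration. It also assumes that delimiters, including whitespace, are returned as their own tokens.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical illustration only; not the library's code.
final class ApostropheSketch {

    // Assumed delimiter set; the real tokenizing characters may differ.
    private static final String DELIMS = " \t\n.,;:!?()\"'";

    static List<String> tokenize(String text) {
        // Step 1: split on every delimiter, keeping delimiters as tokens.
        List<String> raw = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, DELIMS, true);
        while (st.hasMoreTokens()) {
            raw.add(st.nextToken());
        }

        // Step 2: rejoin "word ' word" triples so that "oma's" stays whole.
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < raw.size()) {
            String tok = raw.get(i);
            boolean apostropheInMiddle = i + 2 < raw.size()
                    && isWord(tok)
                    && raw.get(i + 1).equals("'")
                    && isWord(raw.get(i + 2));
            if (apostropheInMiddle) {
                out.add(tok + "'" + raw.get(i + 2));
                i += 3;
            } else {
                out.add(tok);
                i++;
            }
        }
        return out;
    }

    // A "word" here is any non-empty run of letters.
    private static boolean isWord(String s) {
        return !s.isEmpty() && s.chars().allMatch(Character::isLetter);
    }

    public static void main(String[] args) {
        System.out.println(tokenize("oma's huis"));
    }
}
```

In this sketch an apostrophe is merged only when letters sit on both sides of it, so a leading apostrophe (as in "'s ochtends") remains a separate token. Whether the real class treats that case the same way is not stated in this page.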