public class SentenceSourceIndexer extends DefaultHandler implements AutoCloseable
Indexes sentences from a SentenceSource.
Performance examples (Dell XPS 13 9360):
German Wikipedia and Tatoeba, with POS tags: 22,000 sentences per minute
German Wikipedia and Tatoeba, without POS tags: 2.4 million sentences per minute

| Modifier and Type | Field and Description |
|---|---|
| static String | MAX_DOC_COUNT_FIELD |
| static String | MAX_DOC_COUNT_FIELD_VAL |
| static String | MAX_DOC_COUNT_VALUE |

| Modifier and Type | Method and Description |
|---|---|
| void | close() |
| static void | main(String... args) |
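
The class signature combines a SAX `DefaultHandler` with `AutoCloseable`, which suggests it can be used in a try-with-resources block while a parser feeds it events. The constructor and indexing internals of `SentenceSourceIndexer` are not shown on this page, so the following is only a minimal sketch of that general pattern. `SentenceCollector`, the `<sentence>` element name, and the `parse` helper are hypothetical names chosen for illustration and are not part of the documented API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class SentenceCollectorDemo {

    /** Hypothetical handler: collects the text of every <sentence> element. */
    static class SentenceCollector extends DefaultHandler implements AutoCloseable {
        final List<String> sentences = new ArrayList<>();
        private final StringBuilder buffer = new StringBuilder();
        private boolean inSentence;
        boolean closed;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if ("sentence".equals(qName)) {
                inSentence = true;
                buffer.setLength(0); // start a fresh sentence
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // SAX may deliver text in several chunks, so append rather than assign.
            if (inSentence) {
                buffer.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("sentence".equals(qName)) {
                sentences.add(buffer.toString().trim());
                inSentence = false;
            }
        }

        @Override
        public void close() {
            // A real indexer would release its resources here (assumption).
            closed = true;
        }
    }

    /** Parses the XML string and returns the collected sentences. */
    static List<String> parse(String xml) throws Exception {
        try (SentenceCollector handler = new SentenceCollector()) {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.sentences;
        } // close() is called automatically here
    }

    public static void main(String... args) throws Exception {
        String xml = "<doc><sentence>Das ist ein Test.</sentence>"
                   + "<sentence>Noch ein Satz.</sentence></doc>";
        System.out.println(parse(xml));
    }
}
```

Running `main` prints `[Das ist ein Test., Noch ein Satz.]`. Implementing `AutoCloseable` lets the caller rely on try-with-resources to call `close()` even when parsing throws.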

Methods inherited from class org.xml.sax.helpers.DefaultHandler: characters, endDocument, endElement, endPrefixMapping, error, fatalError, ignorableWhitespace, notationDecl, processingInstruction, resolveEntity, setDocumentLocator, skippedEntity, startDocument, startElement, startPrefixMapping, unparsedEntityDecl, warning

public static final String MAX_DOC_COUNT_VALUE
public static final String MAX_DOC_COUNT_FIELD
public static final String MAX_DOC_COUNT_FIELD_VAL