MultiWordChunker2 (LanguageTool 6.4-SNAPSHOT API)

java.lang.Object
- org.languagetool.tagging.disambiguation.AbstractDisambiguator
- - org.languagetool.tagging.disambiguation.MultiWordChunker2

All Implemented Interfaces:

Disambiguator
```
public class MultiWordChunker2
extends AbstractDisambiguator
```
Multiword tagger-chunker. Note: currently does not support:
- overlapping tagging (first matching multiword entry wins)
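One way to read the "first matching multiword entry wins" limitation is sketched below. This is a stand-alone illustration that uses no LanguageTool classes; the class name `FirstMatchWinsSketch`, the `tag` method, and the map-based entry list are hypothetical and are not part of this API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not the LanguageTool implementation.
class FirstMatchWinsSketch {

  /** Returns one tag per token: the tag of the first entry matching there, or null. */
  static List<String> tag(List<String> tokens, Map<String, String> entries) {
    List<String> tags = new ArrayList<>();
    for (int k = 0; k < tokens.size(); k++) {
      tags.add(null);
    }
    int i = 0;
    while (i < tokens.size()) {
      int matchedLength = 0;
      for (Map.Entry<String, String> entry : entries.entrySet()) {
        String[] words = entry.getKey().split(" ");
        if (startsAt(tokens, i, words)) {
          for (int j = 0; j < words.length; j++) {
            tags.set(i + j, entry.getValue());
          }
          matchedLength = words.length;
          break; // first matching entry wins
        }
      }
      // Skip past the matched tokens so that no overlapping match can start inside them.
      i += Math.max(1, matchedLength);
    }
    return tags;
  }

  /** True if the words occur in the token list starting at index start. */
  static boolean startsAt(List<String> tokens, int start, String[] words) {
    if (start + words.length > tokens.size()) {
      return false;
    }
    for (int j = 0; j < words.length; j++) {
      if (!tokens.get(start + j).equals(words[j])) {
        return false;
      }
    }
    return true;
  }
}
```

With the entries "New York" and "York City" (in that order) and the tokens in, New, York, City, only "New York" is tagged in this sketch; "York City" overlaps the earlier match and is skipped.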
Author:

Andriy Rysin

Constructor Summary

Constructors
Constructor and Description
`MultiWordChunker2(String filename)`
`MultiWordChunker2(String filename, boolean allowFirstCapitalized)`

Method Summary

Modifier and Type	Method and Description
`AnalyzedSentence`	`disambiguate(AnalyzedSentence input)` Implements multiword POS tags, e.g., <ELLIPSIS> for ellipsis (...) start, and </ELLIPSIS> for ellipsis end.
`protected String`	`formatPosTag(String posTag, int position, int multiwordLength)` Override this method to format the POS tag differently.
`protected boolean`	`matches(String matchText, AnalyzedTokenReadings inputTokens)`
`protected AnalyzedTokenReadings`	`prepareNewReading(String tokens, String tok, AnalyzedTokenReadings token, String tag)`
`void`	`setRemoveOtherReadings(boolean removeOtherReadings)`
`void`	`setWrapTag(boolean wrapTag)`

Methods inherited from class org.languagetool.tagging.disambiguation.AbstractDisambiguator
preDisambiguate

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.languagetool.tagging.disambiguation.Disambiguator
disambiguate

Constructor Detail
- MultiWordChunker2
```
public MultiWordChunker2(String filename)
```
  Parameters:
  
  filename - text file with multiwords and tags
- MultiWordChunker2
```
public MultiWordChunker2(String filename,
                         boolean allowFirstCapitalized)
```
  Parameters:
  
  filename - text file with multiwords and tags
  
  allowFirstCapitalized - if set to true, the first word of the multiword can be capitalized
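The effect of `allowFirstCapitalized` can be pictured with the stand-alone sketch below. It is an assumption about the intended behavior based only on the parameter description above; `FirstCapitalizedSketch` and `matchesEntry` are hypothetical names, and the real class may compare tokens differently.

```java
// Hypothetical sketch, not the LanguageTool implementation.
class FirstCapitalizedSketch {

  /**
   * True if text equals the entry, or, when allowFirstCapitalized is set,
   * equals the entry with its first character upper-cased.
   */
  static boolean matchesEntry(String entry, String text, boolean allowFirstCapitalized) {
    if (entry.equals(text)) {
      return true;
    }
    if (!allowFirstCapitalized || entry.isEmpty()) {
      return false;
    }
    String capitalized = Character.toUpperCase(entry.charAt(0)) + entry.substring(1);
    return capitalized.equals(text);
  }
}
```

In this sketch the entry "a priori" matches the text "A priori" only when the flag is true, which is the sentence-initial case the flag appears to target.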

Method Detail

setRemoveOtherReadings
```
public void setRemoveOtherReadings(boolean removeOtherReadings)
```
Parameters:

removeOtherReadings - if true and a multiword matches, other readings will be removed
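A minimal sketch of what this flag implies for a token's list of readings, assuming readings can be modeled as plain tag strings. `OtherReadingsSketch` and `addMultiwordReading` are hypothetical and do not use `AnalyzedTokenReadings`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the LanguageTool implementation.
class OtherReadingsSketch {

  /** Adds the multiword reading, dropping the existing readings first when requested. */
  static List<String> addMultiwordReading(List<String> readings, String multiwordTag,
                                          boolean removeOtherReadings) {
    List<String> result = removeOtherReadings ? new ArrayList<>() : new ArrayList<>(readings);
    result.add(multiwordTag);
    return result;
  }
}
```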

setWrapTag
```
public void setWrapTag(boolean wrapTag)
```
Parameters:

wrapTag - if true, the tag will be wrapped with < and >

formatPosTag
```
protected String formatPosTag(String posTag,
                              int position,
                              int multiwordLength)
```
Override this method to format the POS tag differently.

Parameters:

posTag - POS tag for the multiword

position - Position of the token in the multiword

multiwordLength - Length of the multiword

Returns:

Formatted POS tag for the multiword
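The override pattern can be sketched without the library as follows. Only the `formatPosTag` signature and the wrapping with < and > come from this page; the stand-in base class `WrappingFormatterSketch`, the subclass `PositionalFormatterSketch`, the "TAG:n/m" output format, and the assumption that `position` is zero-based are all hypothetical.

```java
// Hypothetical stand-in for the chunker; it mirrors only the documented signature.
class WrappingFormatterSketch {
  private final boolean wrapTag;

  WrappingFormatterSketch(boolean wrapTag) {
    this.wrapTag = wrapTag;
  }

  /** Wraps the tag with < and > when wrapTag is set, otherwise returns it unchanged. */
  protected String formatPosTag(String posTag, int position, int multiwordLength) {
    return wrapTag ? "<" + posTag + ">" : posTag;
  }
}

// Override that records where in the multiword each token sits, e.g. "LOC:2/2".
class PositionalFormatterSketch extends WrappingFormatterSketch {

  PositionalFormatterSketch() {
    super(false);
  }

  @Override
  protected String formatPosTag(String posTag, int position, int multiwordLength) {
    // Assumes position is zero-based; the documentation does not say.
    return posTag + ":" + (position + 1) + "/" + multiwordLength;
  }
}
```

With the real class, the same idea would be a subclass of `MultiWordChunker2` that overrides `formatPosTag`.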

disambiguate
```
public AnalyzedSentence disambiguate(AnalyzedSentence input)
```
Implements multiword POS tags, e.g., <ELLIPSIS> for ellipsis (...) start, and </ELLIPSIS> for ellipsis end.

Parameters:

input - The tokens to be chunked.

Returns:

AnalyzedSentence with additional markers.
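The start and end markers described above can be illustrated with a stand-alone sketch. It assumes that only the first and last tokens of a multiword receive a marker and that tokens in between stay unmarked; this page does not state what happens to them. `BoundaryMarkerSketch` and `markers` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the LanguageTool implementation.
class BoundaryMarkerSketch {

  /**
   * Returns one marker per token of a matched multiword: an opening tag on the
   * first token, a closing tag on the last, and null for tokens in between.
   * A single-token multiword gets only the opening tag in this sketch.
   */
  static List<String> markers(int multiwordLength, String tag) {
    List<String> result = new ArrayList<>();
    for (int i = 0; i < multiwordLength; i++) {
      if (i == 0) {
        result.add("<" + tag + ">");
      } else if (i == multiwordLength - 1) {
        result.add("</" + tag + ">");
      } else {
        result.add(null);
      }
    }
    return result;
  }
}
```

For the three tokens of an ellipsis and the tag ELLIPSIS, the sketch yields <ELLIPSIS>, no marker, and </ELLIPSIS>.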

matches

```
protected boolean matches(String matchText,
                          AnalyzedTokenReadings inputTokens)
```

prepareNewReading

```
protected AnalyzedTokenReadings prepareNewReading(String tokens,
                                                  String tok,
                                                  AnalyzedTokenReadings token,
                                                  String tag)
```
