public class BaseSynthesizer extends Object implements Synthesizer
Modifier and Type | Field and Description |
---|---|
protected List<String> | possibleTags |
String | SPELLNUMBER_FEMININE_TAG |
String | SPELLNUMBER_ROMAN_TAG |
String | SPELLNUMBER_TAG |
Constructor and Description |
---|
BaseSynthesizer(String resourceFileName, String tagFileName, Language lang): Deprecated. |
BaseSynthesizer(String resourceFileName, String tagFileName, String langShortCode) |
BaseSynthesizer(String sorosFileName, String resourceFileName, String tagFileName, Language lang): Deprecated. |
BaseSynthesizer(String sorosFileName, String resourceFileName, String tagFileName, String langShortCode) |
Modifier and Type | Method and Description |
---|---|
protected morfologik.stemming.IStemmer | createStemmer(): Creates a new IStemmer based on the configured dictionary. |
protected morfologik.stemming.Dictionary | getDictionary(): Returns the Dictionary used for this synthesizer. |
String | getPosTagCorrection(String posTag): Gets a corrected version of the POS tag used for synthesis. |
String | getRomanNumber(String arabicNumeral) |
String | getSpelledNumber(String arabicNumeral): Spells out a number. |
morfologik.stemming.IStemmer | getStemmer() |
String | getTargetPosTag(List<String> posTags, String targetPosTag): Selects the desired POS tag to synthesize. |
protected void | initPossibleTags() |
protected boolean | isException(String w) |
protected List<String> | lookup(String lemma, String posTag): Looks up the inflected forms of a lemma defined by a part-of-speech tag. |
protected String[] | removeExceptions(String[] words) |
String[] | synthesize(AnalyzedToken token, String posTag): Gets a form of a given AnalyzedToken, where the form is defined by a part-of-speech tag. |
String[] | synthesize(AnalyzedToken token, String posTag, boolean posTagRegExp): Generates a form of the word with a given POS tag for a given lemma. |
String[] | synthesizeForPosTags(String lemma, Predicate<String> acceptTag): Synthesizes forms for the given lemma and for all POS tags satisfying the given predicate. |
public final String SPELLNUMBER_TAG
public final String SPELLNUMBER_FEMININE_TAG
public final String SPELLNUMBER_ROMAN_TAG
public BaseSynthesizer(String sorosFileName, String resourceFileName, String tagFileName, Language lang)
Deprecated. See BaseSynthesizer(String, String, String, String).
Parameters:
resourceFileName - The dictionary file name.
tagFileName - The name of a file containing all possible tags.

public BaseSynthesizer(String sorosFileName, String resourceFileName, String tagFileName, String langShortCode)
Parameters:
resourceFileName - The dictionary file name.
tagFileName - The name of a file containing all possible tags.
langShortCode - The language short code used to find the data files.

public BaseSynthesizer(String resourceFileName, String tagFileName, Language lang)
Deprecated. See BaseSynthesizer(String, String, String).
protected morfologik.stemming.Dictionary getDictionary() throws IOException
Returns the Dictionary used for this synthesizer. The dictionary file can be defined in the constructor.
Throws:
IOException - In case the dictionary cannot be loaded.

protected morfologik.stemming.IStemmer createStemmer()
Creates a new IStemmer based on the configured dictionary. The result must not be shared among threads.

protected List<String> lookup(String lemma, String posTag)
Looks up the inflected forms of a lemma defined by a part-of-speech tag.
Parameters:
lemma - the lemma to be inflected.
posTag - the desired part-of-speech tag.

public String[] synthesize(AnalyzedToken token, String posTag) throws IOException
Gets a form of a given AnalyzedToken, where the form is defined by a part-of-speech tag.
Specified by:
synthesize in interface Synthesizer
Parameters:
token - AnalyzedToken to be inflected.
posTag - The desired part-of-speech tag.
Throws:
IOException
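
The createStemmer() entry states that its result must not be shared among threads. One common way to respect that kind of constraint is to keep one instance per thread. The sketch below illustrates the idea with java.lang.ThreadLocal; ToyStemmer and the static createStemmer() factory are hypothetical stand-ins, not morfologik or LanguageTool classes, and this is not necessarily how BaseSynthesizer itself manages its stemmer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class PerThreadStemmerSketch {

    /** Hypothetical stand-in for a stateful stemmer that is unsafe to share. */
    static final class ToyStemmer {
        // Reused internal buffer: concurrent callers would corrupt each other's results.
        private final List<String> buffer = new ArrayList<>();

        List<String> lookup(String word) {
            buffer.clear();
            buffer.add(word.toLowerCase(Locale.ROOT));
            return new ArrayList<>(buffer);
        }
    }

    /** Hypothetical factory playing the role that createStemmer() plays in the class above. */
    static ToyStemmer createStemmer() {
        return new ToyStemmer();
    }

    // One instance per thread, created lazily the first time a thread asks for it.
    private static final ThreadLocal<ToyStemmer> STEMMER =
            ThreadLocal.withInitial(PerThreadStemmerSketch::createStemmer);

    static ToyStemmer stemmerForCurrentThread() {
        return STEMMER.get();
    }

    public static void main(String[] args) throws InterruptedException {
        ToyStemmer mainStemmer = stemmerForCurrentThread();
        ToyStemmer[] other = new ToyStemmer[1];
        Thread worker = new Thread(() -> other[0] = stemmerForCurrentThread());
        worker.start();
        worker.join();
        System.out.println(mainStemmer == stemmerForCurrentThread()); // true: same thread reuses its instance
        System.out.println(mainStemmer != other[0]);                  // true: the worker got its own instance
    }
}
```

Creating a fresh instance per call is an equally valid way to avoid sharing; the thread-local variant only saves repeated construction.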
public String[] synthesize(AnalyzedToken token, String posTag, boolean posTagRegExp) throws IOException
Description copied from interface: Synthesizer
Generates a form of the word with a given POS tag for a given lemma.
Specified by:
synthesize in interface Synthesizer
Parameters:
token - the token to be used for synthesis
posTag - POS tag of the form to be generated
posTagRegExp - Specifies whether the posTag string is a regular expression.
Throws:
IOException
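
To make the effect of the posTagRegExp flag concrete, here is a minimal, self-contained sketch. It does not use LanguageTool: the FORMS map, the POSSIBLE_TAGS list, the English-style tags, and this synthesize(String, String, boolean) helper are all hypothetical, and the real method takes an AnalyzedToken and reads a morfologik dictionary. The sketch assumes that, in regular-expression mode, the pattern is matched against each known tag and the forms of every matching tag are collected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

public class RegexTagSketch {

    // Hypothetical toy dictionary: "lemma|tag" -> inflected forms.
    static final Map<String, List<String>> FORMS = new LinkedHashMap<>();

    // Hypothetical list of all known tags (the role a tag file would play).
    static final List<String> POSSIBLE_TAGS =
            Arrays.asList("VB", "VBD", "VBG", "VBZ", "NN", "NNS");

    static {
        FORMS.put("walk|VB", Arrays.asList("walk"));
        FORMS.put("walk|VBD", Arrays.asList("walked"));
        FORMS.put("walk|VBG", Arrays.asList("walking"));
        FORMS.put("walk|VBZ", Arrays.asList("walks"));
        FORMS.put("walk|NN", Arrays.asList("walk"));
        FORMS.put("walk|NNS", Arrays.asList("walks"));
    }

    static String[] synthesize(String lemma, String posTag, boolean posTagRegExp) {
        List<String> result = new ArrayList<>();
        if (posTagRegExp) {
            // Treat posTag as a pattern and collect forms for every known tag it fully matches.
            Pattern pattern = Pattern.compile(posTag);
            for (String tag : POSSIBLE_TAGS) {
                if (pattern.matcher(tag).matches()) {
                    result.addAll(FORMS.getOrDefault(lemma + "|" + tag, Collections.emptyList()));
                }
            }
        } else {
            // Treat posTag as a literal tag.
            result.addAll(FORMS.getOrDefault(lemma + "|" + posTag, Collections.emptyList()));
        }
        return result.toArray(new String[0]);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(synthesize("walk", "VB[DG]", true)));  // [walked, walking]
        System.out.println(Arrays.toString(synthesize("walk", "VB[DG]", false))); // []
        System.out.println(Arrays.toString(synthesize("walk", "VBZ", false)));    // [walks]
    }
}
```

With the flag set to false, a pattern such as VB[DG] is looked up as a literal tag and finds nothing in this toy data, which is why callers need to pass the flag that matches how the tag string was written.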
public String[] synthesizeForPosTags(String lemma, Predicate<String> acceptTag) throws IOException
Synthesizes forms for the given lemma and for all POS tags satisfying the given predicate.
Throws:
IOException
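
The predicate-based selection in synthesizeForPosTags can be pictured with a small stand-alone sketch. Everything in it is hypothetical: the DICT map, the add helper, the sample tags, and this static synthesizeForPosTags are illustrative only, whereas the real method works on the synthesizer's dictionary and may differ in ordering and duplicate handling.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class PredicateTagSketch {

    // Hypothetical toy dictionary: lemma -> (POS tag -> inflected form), in insertion order.
    static final Map<String, Map<String, String>> DICT = new HashMap<>();

    static void add(String lemma, String tag, String form) {
        DICT.computeIfAbsent(lemma, k -> new LinkedHashMap<>()).put(tag, form);
    }

    static {
        add("go", "VB", "go");
        add("go", "VBD", "went");
        add("go", "VBG", "going");
        add("go", "VBZ", "goes");
        add("mouse", "NN", "mouse");
        add("mouse", "NNS", "mice");
    }

    /** Returns the forms of the lemma whose tag is accepted by the predicate. */
    static String[] synthesizeForPosTags(String lemma, Predicate<String> acceptTag) {
        Map<String, String> byTag = DICT.getOrDefault(lemma, Collections.emptyMap());
        List<String> forms = new ArrayList<>();
        for (Map.Entry<String, String> entry : byTag.entrySet()) {
            if (acceptTag.test(entry.getKey())) {
                forms.add(entry.getValue());
            }
        }
        return forms.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Accept every verb tag except the bare base form.
        Predicate<String> inflectedVerb = tag -> tag.startsWith("VB") && tag.length() > 2;
        System.out.println(Arrays.toString(synthesizeForPosTags("go", inflectedVerb)));         // [went, going, goes]
        System.out.println(Arrays.toString(synthesizeForPosTags("mouse", "NNS"::equals)));      // [mice]
    }
}
```

A predicate is useful when the wanted tags are easier to describe in code than as a single tag or regular expression, for example by combining several conditions.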
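public String getPosTagCorrection(String posTag)
Description copied from interface: Synthesizer
Gets a corrected version of the POS tag used for synthesis.
Specified by:
getPosTagCorrection in interface Synthesizer
Parameters:
posTag - original POS tag to correct

public morfologik.stemming.IStemmer getStemmer()

protected void initPossibleTags() throws IOException
Throws:
IOException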
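public String getSpelledNumber(String arabicNumeral)
Description copied from interface: Synthesizer
Spells out a number.
Specified by:
getSpelledNumber in interface Synthesizer
Parameters:
arabicNumeral - the number, in Arabic numerals

protected boolean isException(String w)

public String getTargetPosTag(List<String> posTags, String targetPosTag)
Description copied from interface: Synthesizer
Selects the desired POS tag to synthesize.
Specified by:
getTargetPosTag in interface Synthesizer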