public class BaseSynthesizer extends Object implements Synthesizer
| Modifier and Type | Field and Description |
|---|---|
| protected List<String> | possibleTags |
| String | SPELLNUMBER_FEMININE_TAG |
| String | SPELLNUMBER_ROMAN_TAG |
| String | SPELLNUMBER_TAG |
| Constructor and Description |
|---|
| BaseSynthesizer(String resourceFileName, String tagFileName, Language lang) Deprecated. |
| BaseSynthesizer(String resourceFileName, String tagFileName, String langShortCode) |
| BaseSynthesizer(String sorosFileName, String resourceFileName, String tagFileName, Language lang) Deprecated. |
| BaseSynthesizer(String sorosFileName, String resourceFileName, String tagFileName, String langShortCode) |
| Modifier and Type | Method and Description |
|---|---|
| protected morfologik.stemming.IStemmer | createStemmer() Creates a new IStemmer based on the configured dictionary. |
| protected morfologik.stemming.Dictionary | getDictionary() Returns the Dictionary used for this synthesizer. |
| String | getPosTagCorrection(String posTag) Gets a corrected version of the POS tag used for synthesis. |
| String | getRomanNumber(String arabicNumeral) |
| String | getSpelledNumber(String arabicNumeral) Spells out a number. |
| morfologik.stemming.IStemmer | getStemmer() |
| String | getTargetPosTag(List<String> posTags, String targetPosTag) Selects the desired POS tag to synthesize. |
| protected void | initPossibleTags() |
| protected boolean | isException(String w) |
| protected List<String> | lookup(String lemma, String posTag) Looks up the inflected forms of a lemma defined by a part-of-speech tag. |
| protected String[] | removeExceptions(String[] words) |
| String[] | synthesize(AnalyzedToken token, String posTag) Gets a form of a given AnalyzedToken, where the form is defined by a part-of-speech tag. |
| String[] | synthesize(AnalyzedToken token, String posTag, boolean posTagRegExp) Generates a form of the word with a given POS tag for a given lemma. |
| String[] | synthesizeForPosTags(String lemma, Predicate<String> acceptTag) Synthesizes forms for the given lemma and for all POS tags satisfying the given predicate. |
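
The three synthesis entry points in the table above differ mainly in how the POS tag is matched: literally, as a regular expression, or through a predicate. The sketch below illustrates that distinction only. `ToySynthesizer`, its in-memory form table, and its tag list are hypothetical stand-ins, it takes a plain lemma string rather than an `AnalyzedToken`, and it assumes the regular expression must match the whole tag. It is not LanguageTool's implementation, which reads forms from a morfologik dictionary.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.regex.Pattern;

// Hypothetical stand-in, NOT LanguageTool's BaseSynthesizer: forms live in an
// in-memory map keyed by "lemma|posTag" instead of a morfologik dictionary.
class ToySynthesizer {
    private final Map<String, List<String>> forms = new LinkedHashMap<>();
    // Plays the role of the possibleTags field (assumed tag set for illustration).
    private final List<String> possibleTags = List.of("VB", "VBD", "VBG", "VBZ");

    ToySynthesizer() {
        forms.put("walk|VB", List.of("walk"));
        forms.put("walk|VBD", List.of("walked"));
        forms.put("walk|VBG", List.of("walking"));
        forms.put("walk|VBZ", List.of("walks"));
    }

    // Plays the role of lookup(lemma, posTag): all forms for one exact tag.
    List<String> lookup(String lemma, String posTag) {
        return forms.getOrDefault(lemma + "|" + posTag, List.of());
    }

    // posTagRegExp == false: posTag is matched literally.
    // posTagRegExp == true: posTag is a regex tested against every known tag
    // (assumption: the regex must match the whole tag).
    String[] synthesize(String lemma, String posTag, boolean posTagRegExp) {
        if (!posTagRegExp) {
            return lookup(lemma, posTag).toArray(new String[0]);
        }
        Pattern pattern = Pattern.compile(posTag);
        return synthesizeForPosTags(lemma, tag -> pattern.matcher(tag).matches());
    }

    // Collects forms for every known tag accepted by the predicate.
    String[] synthesizeForPosTags(String lemma, Predicate<String> acceptTag) {
        List<String> result = new ArrayList<>();
        for (String tag : possibleTags) {
            if (acceptTag.test(tag)) {
                result.addAll(lookup(lemma, tag));
            }
        }
        return result.toArray(new String[0]);
    }
}
```

With this toy data, `synthesize("walk", "VBD", false)` yields only `walked`, while `synthesize("walk", "VB[DG]", true)` yields `walked` and `walking`.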
public final String SPELLNUMBER_TAG
public final String SPELLNUMBER_FEMININE_TAG
public final String SPELLNUMBER_ROMAN_TAG

public BaseSynthesizer(String sorosFileName, String resourceFileName, String tagFileName, Language lang)
Deprecated. See BaseSynthesizer(String, String, String, String).
Parameters:
resourceFileName - The dictionary file name.
tagFileName - The name of a file containing all possible tags.

public BaseSynthesizer(String sorosFileName, String resourceFileName, String tagFileName, String langShortCode)
Parameters:
resourceFileName - The dictionary file name.
tagFileName - The name of a file containing all possible tags.
langShortCode - The language short code used to find the data files.

public BaseSynthesizer(String resourceFileName, String tagFileName, Language lang)
Deprecated. See BaseSynthesizer(String, String, String).

protected morfologik.stemming.Dictionary getDictionary() throws IOException
Returns the Dictionary used for this synthesizer. The dictionary file can be defined in the constructor.
Throws:
IOException - In case the dictionary cannot be loaded.

protected morfologik.stemming.IStemmer createStemmer()
Creates a new IStemmer based on the configured dictionary. The result must not be shared among threads.

protected List<String> lookup(String lemma, String posTag)
Looks up the inflected forms of a lemma defined by a part-of-speech tag.
Parameters:
lemma - the lemma to be inflected.
posTag - the desired part-of-speech tag.

public String[] synthesize(AnalyzedToken token, String posTag) throws IOException
Gets a form of a given AnalyzedToken, where the form is defined by a part-of-speech tag.
Specified by:
synthesize in interface Synthesizer
Parameters:
token - AnalyzedToken to be inflected.
posTag - The desired part-of-speech tag.
Throws:
IOException

public String[] synthesize(AnalyzedToken token, String posTag, boolean posTagRegExp) throws IOException
Description copied from interface: Synthesizer
Generates a form of the word with a given POS tag for a given lemma.
Specified by:
synthesize in interface Synthesizer
Parameters:
token - the token to be used for synthesis
posTag - POS tag of the form to be generated
posTagRegExp - Specifies whether the posTag string is a regular expression.
Throws:
IOException

public String[] synthesizeForPosTags(String lemma, Predicate<String> acceptTag) throws IOException
Synthesizes forms for the given lemma and for all POS tags satisfying the given predicate.
Throws:
IOException

public String getPosTagCorrection(String posTag)
Description copied from interface: Synthesizer
Gets a corrected version of the POS tag used for synthesis.
Specified by:
getPosTagCorrection in interface Synthesizer
Parameters:
posTag - original POS tag to correct

public morfologik.stemming.IStemmer getStemmer()

protected void initPossibleTags() throws IOException
Throws:
IOException

public String getSpelledNumber(String arabicNumeral)
Description copied from interface: Synthesizer
Spells out a number.
Specified by:
getSpelledNumber in interface Synthesizer
Parameters:
arabicNumeral - in arabic numerals

protected boolean isException(String w)

public String getTargetPosTag(List<String> posTags, String targetPosTag)
Description copied from interface: Synthesizer
Selects the desired POS tag to synthesize.
Specified by:
getTargetPosTag in interface Synthesizer
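
The description of createStemmer() states that its result must not be shared among threads. One common way to honor such a constraint is to keep one instance per thread in a `ThreadLocal`. The sketch below shows that pattern under stated assumptions: `ToyStemmer` and `PerThreadStemmer` are hypothetical stand-ins rather than morfologik or LanguageTool classes, and nothing here is claimed about how BaseSynthesizer itself manages its stemmer.

```java
import java.util.Locale;

// Hypothetical stand-in for a stateful stemmer that is not thread-safe.
class ToyStemmer {
    // Mutable scratch buffer reused across calls; this is what makes sharing unsafe.
    private final StringBuilder buffer = new StringBuilder();

    String stem(String word) {
        buffer.setLength(0);
        buffer.append(word.toLowerCase(Locale.ROOT));
        return buffer.toString();
    }
}

class PerThreadStemmer {
    // Plays the role of createStemmer(): returns a fresh instance on every call.
    static ToyStemmer createStemmer() {
        return new ToyStemmer();
    }

    // Each thread lazily gets its own instance the first time it calls get().
    private static final ThreadLocal<ToyStemmer> STEMMER =
            ThreadLocal.withInitial(PerThreadStemmer::createStemmer);

    static ToyStemmer get() {
        return STEMMER.get();
    }
}
```

Repeated calls to `PerThreadStemmer.get()` on one thread return the same instance, while a different thread receives its own, so no stemmer state crosses thread boundaries.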