public final class AnalyzedSentence extends Object
Constructor and Description |
---|
AnalyzedSentence(AnalyzedTokenReadings[] tokens) Creates an AnalyzedSentence from the given AnalyzedTokenReadings. |
AnalyzedSentence(AnalyzedTokenReadings[] tokens, AnalyzedTokenReadings[] preDisambigTokens) |
Modifier and Type | Method and Description |
---|---|
AnalyzedSentence | copy(AnalyzedSentence sentence) The method copies AnalyzedSentence and returns the copy. |
boolean | equals(Object o) |
String | getAnnotations() Get disambiguator actions log. |
int | getCorrectedTextLength() Text length taking position fixes (for removed soft hyphens etc.) into account, so this is _not_ always equal to getText(). |
List<Integer> | getLemmaOffsets(String token) |
Set<String> | getLemmaSet() Get the lowercase lemmas of this sentence in a set. |
int | getOriginalPosition(int nonWhPosition) Get a position of a non-whitespace token in the original sentence with whitespace. |
AnalyzedTokenReadings[] | getPreDisambigTokens() |
AnalyzedTokenReadings[] | getPreDisambigTokensWithoutWhitespace() |
String | getText() Return the original text. |
List<Integer> | getTokenOffsets(String token) |
AnalyzedTokenReadings[] | getTokens() Returns the AnalyzedTokenReadings of the analyzed text. |
Set<String> | getTokenSet() Get the lowercase tokens of this sentence in a set. |
AnalyzedTokenReadings[] | getTokensWithoutWhitespace() Returns the AnalyzedTokenReadings of the analyzed text, with whitespace tokens removed but with the artificial SENT_START token included. |
int | hashCode() |
String | toShortString(String readingDelimiter) Return string representation without chunk information. |
String | toString() |
String | toString(String readingDelimiter) Return string representation with chunk information. |
public AnalyzedSentence(AnalyzedTokenReadings[] tokens)
Creates an AnalyzedSentence from the given AnalyzedTokenReadings. Whitespace is also a token.
public AnalyzedSentence(AnalyzedTokenReadings[] tokens, AnalyzedTokenReadings[] preDisambigTokens)
public AnalyzedSentence copy(AnalyzedSentence sentence)
Copies the given AnalyzedSentence and returns the copy. Useful for performing local immunization (for example).
Parameters: sentence - the AnalyzedSentence to be copied
public AnalyzedTokenReadings[] getTokens()
Returns the AnalyzedTokenReadings of the analyzed text. Whitespace is also a token.
public AnalyzedTokenReadings[] getPreDisambigTokens()
public AnalyzedTokenReadings[] getTokensWithoutWhitespace()
Returns the AnalyzedTokenReadings of the analyzed text, with whitespace tokens removed but with the artificial SENT_START token included.
public AnalyzedTokenReadings[] getPreDisambigTokensWithoutWhitespace()
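public int getOriginalPosition(int nonWhPosition)
Gets the position of a non-whitespace token in the original sentence with whitespace.
Parameters: nonWhPosition - position of a non-whitespace token
public String toShortString(String readingDelimiter)
Returns a string representation without chunk information.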
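
The relationship between getTokensWithoutWhitespace() and getOriginalPosition(int) can be illustrated with a small, self-contained sketch. It does not use the library's classes: tokens are modeled as plain strings, and the class name, the "SENT_START" placeholder string, and the helper nonWhitespaceToOriginal are hypothetical names chosen for illustration. It assumes that a position in the whitespace-free view maps to the index of the same token in the full token list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class OriginalPositionSketch {

    // Hypothetical stand-in for the artificial sentence-start token.
    static final String SENT_START = "SENT_START";

    /**
     * For each non-whitespace token, records its index in the full token list.
     * Element i of the result is the original position of non-whitespace token i.
     */
    static List<Integer> nonWhitespaceToOriginal(List<String> tokens) {
        List<Integer> mapping = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            // Whitespace-only tokens are skipped; SENT_START is kept.
            if (!tokens.get(i).trim().isEmpty()) {
                mapping.add(i);
            }
        }
        return mapping;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(SENT_START, "This", " ", "is", " ", "fine", ".");
        List<Integer> mapping = nonWhitespaceToOriginal(tokens);
        System.out.println(mapping); // prints [0, 1, 3, 5, 6]
    }
}
```

In this model, the non-whitespace token at position 2 ("is") sits at position 3 of the full list, which is the kind of lookup getOriginalPosition(int) is described as providing.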
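public String getText()
Returns the original text.
public int getCorrectedTextLength()
Text length taking position fixes (for removed soft hyphens etc.) into account, so this is not always equal to the length of getText().
public String toString(String readingDelimiter)
Returns a string representation with chunk information.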
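
Why getCorrectedTextLength() can differ from the length of getText() may be easier to see with a plain-Java sketch. It only demonstrates that removing soft hyphens (U+00AD) changes a string's length. It does not reproduce the library's computation, and the class and method names are hypothetical.

```java
class CorrectedLengthSketch {

    // Unicode soft hyphen: usually invisible, but it counts as one character.
    static final char SOFT_HYPHEN = '\u00AD';

    /** Returns the text with all soft hyphens removed. */
    static String stripSoftHyphens(String text) {
        return text.replace(String.valueOf(SOFT_HYPHEN), "");
    }

    /** Number of characters lost when soft hyphens are removed. */
    static int removedCount(String text) {
        return text.length() - stripSoftHyphens(text).length();
    }

    public static void main(String[] args) {
        String raw = "co" + SOFT_HYPHEN + "operate";
        System.out.println(raw.length());                   // prints 10
        System.out.println(stripSoftHyphens(raw).length()); // prints 9
        System.out.println(removedCount(raw));              // prints 1
    }
}
```

The difference between the two lengths is the kind of adjustment a position fix has to account for when offsets are mapped between the two versions of the text.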
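public String getAnnotations()
Gets the disambiguator actions log.
public Set<String> getTokenSet()
Gets the lowercase tokens of this sentence in a set.
public Set<String> getLemmaSet()
Gets the lowercase lemmas of this sentence in a set.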
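
As a rough illustration of what a set of lowercase tokens looks like, the following sketch lowercases plain strings and collects them into a set, so differently cased duplicates collapse into one entry. It is a stand-in written for this page: the class and method names are hypothetical, and the use of Locale.ROOT and of an insertion-ordered set are assumptions, not statements about how getTokenSet() or getLemmaSet() are implemented.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class TokenSetSketch {

    /** Lowercases every token and collects the distinct results. */
    static Set<String> lowercaseSet(List<String> tokens) {
        Set<String> result = new LinkedHashSet<>();
        for (String token : tokens) {
            result.add(token.toLowerCase(Locale.ROOT));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("The", "cat", "saw", "the", "Cat");
        System.out.println(lowercaseSet(tokens)); // prints [the, cat, saw]
    }
}
```

A set like this supports a cheap membership check (does the sentence contain a given word at all?) before any per-token work is done.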
@Nullable @ApiStatus.Internal public List<Integer> getTokenOffsets(String token)
Returns: the offsets in getTokensWithoutWhitespace() where tokens with the given text occur (case-insensitive), or null if there are no such occurrences
@Nullable @ApiStatus.Internal public List<Integer> getLemmaOffsets(String token)
Returns: the offsets in getTokensWithoutWhitespace() where tokens with the given lemma occur (case-insensitive), or null if there are no such occurrences
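
Because both offset methods are documented as returning null rather than an empty list when nothing matches, callers need a null check. The sketch below models that contract on a plain list of strings. The class and its methods are hypothetical and are not the library's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TokenOffsetsSketch {

    /**
     * Returns the indices at which the given token occurs (case-insensitive),
     * or null if it does not occur at all.
     */
    static List<Integer> offsetsOf(List<String> tokensWithoutWhitespace, String token) {
        List<Integer> result = null;
        for (int i = 0; i < tokensWithoutWhitespace.size(); i++) {
            if (tokensWithoutWhitespace.get(i).equalsIgnoreCase(token)) {
                if (result == null) {
                    result = new ArrayList<>(); // allocate only on the first match
                }
                result.add(i);
            }
        }
        return result;
    }

    /** Caller-side pattern: treat null as "no occurrences". */
    static int countOccurrences(List<String> tokens, String token) {
        List<Integer> offsets = offsetsOf(tokens, token);
        return offsets == null ? 0 : offsets.size();
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("SENT_START", "The", "cat", "saw", "the", "dog", ".");
        System.out.println(offsetsOf(tokens, "THE"));         // prints [1, 4]
        System.out.println(offsetsOf(tokens, "bird"));        // prints null
        System.out.println(countOccurrences(tokens, "bird")); // prints 0
    }
}
```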