Java Reference
In-Depth Information
new ObjectInputStream(inputStream);) {
HiddenMarkovModel hmm = (HiddenMarkovModel)
objectStream.readObject();
HmmDecoder decoder = new HmmDecoder(hmm);
} catch (IOException ex) {
// Handle exceptions
} catch (ClassNotFoundException ex) {
// Handle exceptions
};
We will perform tagging on theSentence variable. First, it needs to be tokenized. We
will use an Indo-European tokenizer as shown here. The tokenizer method requires
that the text string be converted to an array of chars. The tokenize method then returns
an array of tokens as strings:
TokenizerFactory TOKENIZER_FACTORY =
IndoEuropeanTokenizerFactory.INSTANCE;
char[] charArray = theSentence.toCharArray();
Tokenizer tokenizer =
TOKENIZER_FACTORY.tokenizer(
charArray, 0, charArray.length);
String[] tokens = tokenizer.tokenize();
The actual tagging is performed by the HmmDecoder class' tag method. However, this
method requires a List instance of String tokens. This list is created using the Ar-
rays class' asList method. The Tagging class holds a sequence of tokens and tags:
List<String> tokenList = Arrays.asList(tokens);
Tagging<String> tagString = decoder.tag(tokenList);
We are now ready to display the tokens and their tags. The following loop uses the
token and tag methods to access the tokens and tags, respectively, in the Tagging ob-
ject. They are then displayed:
for (int i = 0; i < tagString.size(); ++i) {
System.out.print(tagString.token(i) + "/"
+ tagString.tag(i) + " ");
}
Search WWH ::




Custom Search