classifier = DynamicLMClassifier.createNGramProcess(
    categories, nGramSize);
As we did earlier, we will create a series of instances based on the contents of the
training files. We will not detail the following code, as it is very similar to that found in the
"Training text using the Classified class" section. The main difference is that there are only two
categories to process:
String directory = "...";
File trainingDirectory = new File(directory,
    "txt_sentoken");
// Train one category at a time; each category has its own subdirectory
for (int i = 0; i < categories.length; ++i) {
    Classification classification =
        new Classification(categories[i]);
    File file = new File(trainingDirectory, categories[i]);
    File[] trainingFiles = file.listFiles();
    for (int j = 0; j < trainingFiles.length; ++j) {
        try {
            // Read the whole review and pair it with its category
            String review = Files.readFromFile(
                trainingFiles[j], "ISO-8859-1");
            Classified<CharSequence> classified =
                new Classified<>(review, classification);
            // Update the classifier with this labeled example
            classifier.handle(classified);
        } catch (IOException ex) {
            ex.printStackTrace();
        }
    }
}
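The training loop calls listFiles() directly. That method returns null when the category directory is missing or unreadable, so a mistyped path surfaces as a NullPointerException rather than a clear error. As a minimal sketch, assuming only the Java standard library, the file-reading step could be separated from training as follows. The ReviewLoader class and its loadReviews method are hypothetical names introduced for illustration and are not part of LingPipe. Note also that Files here is java.nio.file.Files, not the Files helper used in the listing.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration; not part of LingPipe. */
final class ReviewLoader {

    /**
     * Reads every regular file under trainingDirectory/<category> and
     * returns the texts grouped by category. A missing category directory
     * yields an empty list instead of a NullPointerException.
     */
    static Map<String, List<String>> loadReviews(
            Path trainingDirectory, String[] categories) {
        Map<String, List<String>> reviews = new LinkedHashMap<>();
        for (String category : categories) {
            List<String> texts = new ArrayList<>();
            Path categoryDir = trainingDirectory.resolve(category);
            if (Files.isDirectory(categoryDir)) {
                try (DirectoryStream<Path> files =
                        Files.newDirectoryStream(categoryDir)) {
                    for (Path file : files) {
                        if (Files.isRegularFile(file)) {
                            // Same encoding as the listing: ISO-8859-1
                            byte[] bytes = Files.readAllBytes(file);
                            texts.add(new String(bytes,
                                    StandardCharsets.ISO_8859_1));
                        }
                    }
                } catch (IOException ex) {
                    throw new UncheckedIOException(ex);
                }
            }
            reviews.put(category, texts);
        }
        return reviews;
    }
}
```

Each returned text could then be wrapped in a Classified instance and passed to classifier.handle, exactly as in the loop above.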
The model is now ready to be used. We will use the review for the movie Forrest Gump:
String review = "An overly sentimental film with a somewhat "
    + "problematic message, but its sweetness and charm "
    + "are occasionally enough to approximate true depth "
    + "and grace. ";
We use the classify method to perform the actual work. It returns a Classification
instance whose bestCategory method returns the best category, as shown here:
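To make the train-then-classify flow concrete, here is a minimal, self-contained sketch of a character n-gram classifier. It is a simplified stand-in and not LingPipe's implementation: the TinyNGramClassifier class, its logScore method, the add-one smoothing, the toy training strings, and the "neg" and "pos" category names are all assumptions made for illustration. In this sketch bestCategory is called on the classifier itself, whereas in LingPipe it is called on the Classification returned by classify.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for illustration; not LingPipe's classifier. */
final class TinyNGramClassifier {
    private final int nGramSize;
    // Per-category counts of each character n-gram seen in training
    private final Map<String, Map<String, Integer>> counts = new LinkedHashMap<>();
    // Per-category total number of n-grams seen in training
    private final Map<String, Integer> totals = new HashMap<>();
    // Distinct n-grams seen in any category (used for smoothing)
    private final Set<String> vocabulary = new HashSet<>();

    TinyNGramClassifier(String[] categories, int nGramSize) {
        this.nGramSize = nGramSize;
        for (String category : categories) {
            counts.put(category, new HashMap<>());
            totals.put(category, 0);
        }
    }

    /** Adds one labeled training text to the model of its category. */
    void handle(CharSequence text, String category) {
        Map<String, Integer> categoryCounts = counts.get(category);
        if (categoryCounts == null) {
            throw new IllegalArgumentException("Unknown category: " + category);
        }
        String s = text.toString();
        for (int i = 0; i + nGramSize <= s.length(); i++) {
            String gram = s.substring(i, i + nGramSize);
            categoryCounts.merge(gram, 1, Integer::sum);
            totals.merge(category, 1, Integer::sum);
            vocabulary.add(gram);
        }
    }

    /** Log-likelihood of the text under one category (add-one smoothing). */
    double logScore(CharSequence text, String category) {
        Map<String, Integer> categoryCounts = counts.get(category);
        // The extra 1.0 reserves probability mass for unseen n-grams
        double denominator = totals.get(category) + vocabulary.size() + 1.0;
        String s = text.toString();
        double score = 0.0;
        for (int i = 0; i + nGramSize <= s.length(); i++) {
            String gram = s.substring(i, i + nGramSize);
            int count = categoryCounts.getOrDefault(gram, 0);
            score += Math.log((count + 1.0) / denominator);
        }
        return score;
    }

    /** Returns the category with the highest score; ties go to the first. */
    String bestCategory(CharSequence text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String category : counts.keySet()) {
            double score = logScore(text, category);
            if (score > bestScore) {
                bestScore = score;
                best = category;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        TinyNGramClassifier classifier =
                new TinyNGramClassifier(new String[] {"neg", "pos"}, 3);
        classifier.handle("wonderful charming sweet film", "pos");
        classifier.handle("dull boring awful mess", "neg");
        System.out.println(classifier.bestCategory("charming and sweet")); // prints pos
    }
}
```

The design mirrors the steps in the text: one model per category is updated during training, and classification picks the category whose model assigns the input the highest score. Two training sentences are far too few for real use; the corpus-based training shown earlier serves that purpose.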