How to do it…
To set things up, we have to load the models and bind them to function names. To load the
models, we'll use the opennlp.nlp/make-name-finder function. We can use this to load
each recognizer individually, as follows:
(def get-persons
  (nlp/make-name-finder "models/en-ner-person.bin"))
(def get-orgs
  (nlp/make-name-finder "models/en-ner-organization.bin"))
(def get-date
  (nlp/make-name-finder "models/en-ner-date.bin"))
(def get-location
  (nlp/make-name-finder "models/en-ner-location.bin"))
(def get-money
  (nlp/make-name-finder "models/en-ner-money.bin"))
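These definitions rely on an nlp alias for the opennlp.nlp namespace, and the next step relies on a tokenize function. Neither is defined in this section. A minimal sketch of that setup might look like the following. It assumes the clojure-opennlp library is on the classpath; the tokenizer model path models/en-token.bin is an assumption, chosen to match the layout of the other model files.

```clojure
;; Assumption: clojure-opennlp is available as a project dependency.
(require '[opennlp.nlp :as nlp])

;; Assumption: a tokenizer model has been downloaded to this path,
;; next to the en-ner-*.bin models loaded above.
(def tokenize
  (nlp/make-tokenizer "models/en-token.bin"))
```

The name finders expect a sequence of tokens rather than a raw string, which is why the text is passed through tokenize before any recognizer is called.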
Now, to test this out, let's load the latest State of the Union (SOTU) address in our
corpus, Barack Obama's 2013 address:
(def sotu (tokenize (slurp "sotu/2013-0.txt")))
We can call each of these functions on the tokenized text to see the results, as shown here:
user=> (get-persons sotu)
("John F. Kennedy" "Most Americans—Democrats" "Government" "John
McCain" "Joe Lieberman" "So" "Tonight I" "Joe Biden" "Joe" "Tonight"
"Al Qaida" "Russia" "And" "Michelle" "Hadiya Pendleton" "Gabby
Giffords" "Menchu Sanchez" "Desiline Victor" "Brian Murphy" "Brian")
user=> (get-orgs sotu)
("Congress" "Union" "Nation" "America" "Tax" "Apple" "Department
of Defense and Energy" "CEOs" "Siemens America—a" "New York Public
Schools" "City University of New York" "IBM" "American" "Higher
Education" "Federal" "Senate" "House" "CEO" "European Union" "It")
user=> (get-date sotu)
("this year" "18 months ago" "Last year" "Today" "last 15" "2007"
"today" "tomorrow" "20 years ago" "This" "last year" "This spring"
"next year" "2014" "next two decades" "next month" "a")
user=> (get-location sotu)
("Washington" "United States of America" "Earth" "Japan" "Mexico"
"America" "Youngstown" "Ohio" "China" "North Carolina" "Georgia"
"Oklahoma" "Germany" "Brooklyn" "Afghanistan" "Arabian Peninsula"
"Africa" "Libya" "Mali" "And" "North Korea" "Iran" "Russia" "Asia"
"Atlantic" "United States" "Rangoon" "Burma" "the Americas" "Europe"
"Middle East" "Egypt" "Israel" "Chicago" "Oak Creek" "New York City"
"Miami" "Wisconsin")
user=> (get-money sotu)
("$ 2.5 trillion" "$ 4 trillion" "$ 140 to" "$")
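The results above contain some obvious false positives, such as "So" and "And" among the persons and a bare "$" among the money amounts. One way to trim them is a small post-filter. The sketch below is only an illustration and not part of the recipe: the names noise-words and drop-noise are hypothetical, and the stop-list is a hand-picked assumption based on the output shown here.

```clojure
(require '[clojure.string :as str])

;; Hypothetical stop-list of tokens that the recognizers above
;; mislabeled as entities. Extend it to suit your own corpus.
(def noise-words
  #{"and" "so" "it" "a" "this" "tonight"})

(defn drop-noise
  "Removes single-character strings and stop-listed words
  (compared case-insensitively) from a sequence of entity strings."
  [entities]
  (remove #(or (< (count %) 2)
               (contains? noise-words (str/lower-case %)))
          entities))

(drop-noise ["Joe Biden" "So" "And" "Michelle" "$"])
;; => ("Joe Biden" "Michelle")
```

In a REPL session it could be applied directly to a recognizer's output, for example (drop-noise (get-persons sotu)). A filter like this only removes exact matches, so entries such as "Tonight I" would still need separate handling.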
 