Classifying the TRIZ Contradiction Problem of the Patents Based on Engineering Parameters - Technologies and Applications of Artificial Intelligence


Fig. 2. The flowchart of the CFF process

The Processes for Generating TFIDF Type Features

Given all sentences as input, the bag-of-words process calculates the TFIDF value of each word that appears in any sentence of the document, and then generates the TFIDF vectors for the training and testing documents. Given the strong sentences as input, the bag-of-words process first extracts the sentences whose count of related important word sets is equal to or larger than one. The TFIDF value of each word that appears in the extracted sentences of the document is then calculated, and the STFIDF vectors for the training and testing documents are generated.
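The two feature types can be sketched as follows. This is a minimal illustration, not the authors' implementation: the tokenizer, the rule that a sentence "relates" to an important set when it shares at least one word with it, the particular TFIDF weighting (relative term frequency times log inverse document frequency), and all function names and sample data are assumptions introduced here.

```python
import math
from collections import Counter

def tokenize(sentence):
    # Assumed tokenizer: split on whitespace, strip basic punctuation, lowercase.
    words = (w.strip(".,;:()").lower() for w in sentence.split())
    return [w for w in words if w]

def related_set_count(sentence, important_sets):
    # Assumed definition: a sentence relates to a set if they share a word.
    words = set(tokenize(sentence))
    return sum(1 for s in important_sets if words & s)

def strong_sentences(sentences, important_sets, min_count=1):
    # Keep sentences related to at least `min_count` important word sets.
    return [s for s in sentences
            if related_set_count(s, important_sets) >= min_count]

def tfidf_vectors(docs):
    # docs: list of documents, each a list of sentences.
    # Returns one {word: weight} dict per document.
    token_lists = [[w for s in doc for w in tokenize(s)] for doc in docs]
    n_docs = len(token_lists)
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))  # document frequency of each word
    vectors = []
    for tokens in token_lists:
        tf = Counter(tokens)
        total = len(tokens)
        vectors.append({w: (c / total) * math.log(n_docs / df[w])
                        for w, c in tf.items()})
    return vectors

# Hypothetical sample data for illustration only.
docs = [
    ["The spring reduces the weight of the device.",
     "It is shown in figure one."],
    ["The coating increases the strength of the panel.",
     "The panel is painted blue."],
]
important_sets = [{"weight", "mass"}, {"strength", "durability"}]

tfidf = tfidf_vectors(docs)  # built from all sentences
stfidf = tfidf_vectors([strong_sentences(d, important_sets) for d in docs])
```

In this sketch the STFIDF vectors simply reuse the TFIDF routine on the filtered sentences, so words that occur only in non-strong sentences (such as "figure" above) drop out of the STFIDF vector.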

The Processes for Generating Termset Type Features

We propose an algorithm named Verb Including Split and Associate Termsets (VISAT). It is a very important part of the CFF process, and we illustrate it in detail later. We observed that the more important word sets a sentence relates to, the more important the sentence is. Therefore, for training documents, only the strong sentences whose count of related important word sets is larger than one need POS tagging to obtain their part-of-speech information. For testing documents, all sentences need POS tagging because the Engineering Parameter labels of the document are unknown.
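The selection rule for POS tagging can be sketched as below. This is only an illustration under stated assumptions: the word-overlap test for "relating" to an important set, the function names, and the sample sentences are hypothetical, and the POS tagger itself is left out.

```python
def related_set_count(sentence, important_sets):
    # Assumed definition: a sentence relates to a set if they share a word.
    words = {w.strip(".,;:()").lower() for w in sentence.split()}
    return sum(1 for s in important_sets if words & s)

def sentences_to_tag(sentences, important_sets, is_training):
    # Testing documents: labels are unknown, so every sentence is tagged.
    if not is_training:
        return list(sentences)
    # Training documents: only sentences related to more than one set.
    return [s for s in sentences
            if related_set_count(s, important_sets) > 1]

# Hypothetical sample data for illustration only.
important_sets = [{"weight", "mass"}, {"strength", "durability"}]
doc = [
    "The lighter frame lowers the weight but reduces the strength.",
    "The spring reduces the weight of the device.",
    "It is shown in figure one.",
]

train_selected = sentences_to_tag(doc, important_sets, is_training=True)
test_selected = sentences_to_tag(doc, important_sets, is_training=False)
```

Here only the first sentence would be tagged for a training document, because it relates to both sets, while all three sentences would be tagged for a testing document.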

We use the VISAT algorithm to extract the candidate termsets in the CFF process. According to our observation, the termsets containing two words are strong enough to
