3. Morphosyntactic linguistic wavelet approach
3.1 A sequential approach to wavelets
Because language is complex, its soft decomposition into a set of basis functions (as in
traditional wavelets) is a multi-step process with several components.
Developing numeric wavelets usually includes the following steps:
1. Take the original signal sample
2. Apply filtering (decomposition using the mother wavelet)
3. Analyze coefficients defined by the basis function
4. If the granularity and details are inadequate for the current problem, repeat from step 2
5. Take the resulting coefficients as a current representation of the signal
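The five numeric steps above can be sketched in a few lines. This is only an illustration, assuming the Haar wavelet as the mother wavelet and a caller-supplied adequacy test; the names haar_step, decompose, and detail_is_adequate are hypothetical and do not come from the chapter.

```python
import math

def haar_step(signal):
    """One level of Haar filtering: returns (approximation, detail) coefficients."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def decompose(signal, detail_is_adequate, max_levels=10):
    """Repeat the filtering step until the (hypothetical) adequacy test passes."""
    approx = list(signal)                        # step 1: original signal sample
    details = []
    for _ in range(max_levels):
        if len(approx) < 2 or len(approx) % 2 != 0:
            break                                # nothing left to split
        approx, detail = haar_step(approx)       # step 2: apply filtering
        details.append(detail)                   # step 3: coefficients to analyze
        if detail_is_adequate(approx, details):  # step 4: otherwise repeat from step 2
            break
    return approx, details                       # step 5: current representation

sample = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
# Assumed stopping rule for the illustration: two levels of detail are enough.
approx, details = decompose(sample, lambda a, d: len(d) >= 2)
```

Because the Haar basis is orthonormal, the sum of squared coefficients equals the sum of squared samples, which gives a quick sanity check on the decomposition. In practice a wavelet library would normally replace the hand-written filter.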
Language requires additional steps, which are described in more detail in the following
section. In brief, these steps are:
1. Take the original text sample
2. Compress and translate text into an oriented graph (E_ci) preserving most
morphosyntactic properties
3. Apply filtering using the most suitable approach
4. If abstraction granularity and details are insufficient for the current problem:
4.1 Insert a new filter, E_ce, in the knowledge organization
4.2 Repeat from step 3
5. Take the resulting sequence of filters as the current representation of the knowledge
about, and the ontology of, the text
6. Take the resulting E_ci as the internal representation of the new text event
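The control flow of these six steps can be sketched as a loop. This is a toy illustration under strong assumptions, not the chapter's implementation: the graph builder simply links consecutive words, each filter is modeled as a plain function of the graph, and the names text_to_graph, run_mlw, is_sufficient, and make_filter are all hypothetical.

```python
def text_to_graph(text, stopwords):
    """Step 2 (toy stand-in): oriented graph linking consecutive kept words."""
    words = [w.strip(".,;:!?").lower() for w in text.split()]
    words = [w for w in words if w and w not in stopwords]
    edges = {}
    for a, b in zip(words, words[1:]):
        edges.setdefault(a, set()).add(b)
    return {"nodes": words, "edges": edges}

def run_mlw(text, filters, is_sufficient, make_filter,
            stopwords=frozenset(), max_rounds=5):
    """Apply filters to the graph, adding a new filter while detail is insufficient."""
    e_ci = text_to_graph(text, stopwords)                   # steps 1-2
    filters = list(filters)
    applied = []
    for _ in range(max_rounds):
        applied = [(f.__name__, f(e_ci)) for f in filters]  # step 3
        if is_sufficient(applied):                          # step 4
            break
        filters.append(make_filter(e_ci, applied))          # step 4.1, then repeat
    return applied, e_ci                                    # steps 5 and 6

# Hypothetical filters: each one summarizes a single property of the graph.
def node_count(graph):
    return len(graph["nodes"])

def edge_count(graph):
    return sum(len(targets) for targets in graph["edges"].values())

trace, e_ci = run_mlw(
    "The cat sees the dog.",
    filters=[node_count],
    is_sufficient=lambda applied: len(applied) >= 2,  # assumed stopping rule
    make_filter=lambda graph, applied: edge_count,    # assumed new filter
    stopwords={"the"},
)
```

The returned pair mirrors steps 5 and 6: the sequence of filter responses stands in for the representation of the knowledge, and the graph stands in for the internal representation of the text event.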
A short description of the MLW steps is presented below, with an example in the Use case.
3.2 Details of the MLW process
Further details of the MLW process are provided in this section, with the considerations
relevant to each step included.
3.2.1 Take the original text sample
Text can be extracted from Spanish dialogs, Web pages, documents, speech transcriptions,
and other sources. The case study in Section 4 uses dialogs, transcriptions, and other
documents. Several references mentioned in this chapter were based on Web pages.
3.2.2 Compress and translate text into an oriented graph (called E_ci) preserving most
morphosyntactic properties
Original text is processed using predefined and static tables. The main components of this
step are as follows:
- Filter useless morphemes [5] using reference tables.
[5] Syntagm (linguistics) is any sequenced combination of morphological elements that is
considered a unit, is stable, and is commonly accepted by native speakers.