Remarks on a Fuzzy Approach to Flexible Database Querying, Its Extension and Relation to Data Mining and Summarization - Advanced Database Query Systems


The matching degree of a tuple against a bipolar query (C, P) is thus meant here as the truth value of (21), computed in the framework of fuzzy (multivalued) logic using the right-hand side of (22). Thus, the evaluation of a bipolar query in this approach produces a fuzzy set of tuples, where the membership function value for a tuple t corresponds to the matching degree of this tuple against the query. The answer to a bipolar query is then a list of the tuples, non-increasingly ordered according to their membership degree.

In (22), the min, max and 1 − x operators are used to model the connectives of conjunction, disjunction and negation, respectively. Moreover, the implication connective → is modeled by the Kleene-Dienes implication operator, I(x, y) = max(1 − x, y), and the existential quantifier ∃ is modeled via the maximum operator. As there are many other alternatives, the issue arises of how to appropriately model the logical connectives in (22). For more information on this issue, as well as on other issues related to bipolar queries, we refer the reader to our works (Zadrożny & Kacprzyk, 2007, 2009a; Matthé & De Tré, 2009; De Tré et al., 2009).
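The evaluation described above can be sketched in Python. Formulas (21) and (22) are not reproduced in this excerpt, so the sketch assumes that the bipolar query (C, P) is read as "C and possibly P", i.e. that a tuple t must satisfy C and, if some tuple satisfies both C and P, then t should satisfy P as well. That reading, the function names and the sample membership degrees are assumptions made for illustration only; the operators (min, max, 1 − x, Kleene-Dienes implication, maximum for the existential quantifier) are those named in the text.

```python
from typing import Dict, List, Tuple


def kleene_dienes(x: float, y: float) -> float:
    """Kleene-Dienes implication: I(x, y) = max(1 - x, y)."""
    return max(1.0 - x, y)


def bipolar_matching_degrees(c: Dict[str, float],
                             p: Dict[str, float]) -> Dict[str, float]:
    """Matching degree of every tuple against the bipolar query (C, P).

    c[t] and p[t] are the degrees to which tuple t satisfies C and P.
    Assumed form of the query (not taken from the excerpt):
        C(t) and (exists s: C(s) and P(s)) -> P(t)
    """
    # Existential quantifier modeled by max, conjunction by min.
    exists = max((min(c[t], p[t]) for t in c), default=0.0)
    return {t: min(c[t], kleene_dienes(exists, p[t])) for t in c}


def ranked_answer(c: Dict[str, float],
                  p: Dict[str, float]) -> List[Tuple[str, float]]:
    """Tuples ordered non-increasingly by their matching degree."""
    degrees = bipolar_matching_degrees(c, p)
    return sorted(degrees.items(), key=lambda item: item[1], reverse=True)


# Hypothetical membership degrees for three tuples.
c_degrees = {"t1": 1.0, "t2": 0.8, "t3": 0.3}
p_degrees = {"t1": 0.2, "t2": 0.9, "t3": 1.0}
answer = ranked_answer(c_degrees, p_degrees)
```

Under this reading, when no tuple satisfies both C and P the antecedent of the implication has degree 0, the implication evaluates to 1, and the matching degree reduces to C(t) alone. Replacing `kleene_dienes` with another implication operator is one way to explore the modeling alternatives mentioned above.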

Concluding this section, it has to be stressed that the research on bipolarity in the framework of database querying is still in its infancy. Despite some very advanced theoretical treatments (cf. Dubois & Prade, 2008), a vast area of possible interpretations is not yet covered, and further research is definitely needed.

FUZZY QUERIES AND LINGUISTIC DATA SUMMARIES

Though the power of FQUERY for Access, or maybe more appropriately of its underlying idea of using a linguistic quantifier in the query, is quite obvious for retrieving information of interest to a human user from a database, it has proven even more effective and efficient as a tool for the implementation of linguistic data summarization. This is one of the basic capabilities needed by any “intelligent” system that is meant to operate in real-life situations. Since for the human being the only fully natural means of communication is natural language, a linguistic summarization of a set of data (say, by a sentence or a small number of sentences in a natural language) would be very desirable and human consistent.

Unfortunately, data summarization is still, in general, an unsolved problem in spite of vast research efforts. In this paper we will use a simple yet effective and efficient approach to the linguistic summarization of data sets (databases) proposed by Yager (1982), and then presented in a more advanced and implementable form by Kacprzyk & Yager (2001) and Kacprzyk, Yager & Zadrożny (2000). This will provide a point of departure for our further analysis of more complicated and realistic summaries.

Let us assume the following notation and terminology:

• V is a quality (attribute) of interest, e.g. salary in a database of workers,
• Y = {y₁, …, yₙ} is a set of objects (records) that manifest quality V, e.g. the set of workers; hence V(yᵢ) are values of quality V for object yᵢ ∈ Y,
• D = {V(y₁), …, V(yₙ)} is a set of data (the “database” in question).

A linguistic summary of a data set D consists of:

• a summarizer S (e.g. young),
• a qualifier R (e.g. recently hired),
• a quantity in agreement Q (e.g. most),
• truth T (e.g. 0.7),
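To make the four components concrete, the following Python sketch evaluates a summary of the form "Q R objects are S", e.g. "most recently hired workers are young". The excerpt does not state how the truth T is computed, so the sketch assumes Zadeh's calculus of linguistically quantified propositions, T = μ_Q(Σ min(μ_R(yᵢ), μ_S(yᵢ)) / Σ μ_R(yᵢ)). The `Worker` record, the piecewise-linear membership functions for "young", "recently hired" and "most", and the sample data are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Worker:
    age: float              # in years
    years_employed: float   # time since hiring, in years


def clamp(x: float) -> float:
    """Restrict a value to the unit interval [0, 1]."""
    return max(0.0, min(1.0, x))


def young(w: Worker) -> float:
    """Summarizer S (hypothetical): fully young up to 30, not young from 40."""
    return clamp((40.0 - w.age) / 10.0)


def recently_hired(w: Worker) -> float:
    """Qualifier R (hypothetical): fully recent up to 1 year, not recent from 3."""
    return clamp((3.0 - w.years_employed) / 2.0)


def most(x: float) -> float:
    """Quantity in agreement Q (hypothetical): 0 up to 0.3, 1 from 0.8."""
    return clamp(2.0 * x - 0.6)


def truth(objects: Sequence[Worker],
          s: Callable[[Worker], float],
          r: Callable[[Worker], float],
          q: Callable[[float], float]) -> float:
    """Truth T of "Q R objects are S" under the assumed Zadeh calculus."""
    numerator = sum(min(r(y), s(y)) for y in objects)
    denominator = sum(r(y) for y in objects)
    if denominator == 0.0:
        # Assumption: if no object matches the qualifier, report truth 0.
        return 0.0
    return q(numerator / denominator)


# Hypothetical data set of four workers.
workers = [Worker(25, 0.5), Worker(35, 1.0), Worker(50, 2.0), Worker(28, 10.0)]
t_value = truth(workers, young, recently_hired, most)
```

Each component of the summary maps to one argument of `truth`, so other summarizers, qualifiers or quantifiers can be tried by passing different membership functions.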
