Table 2. Classification of linguistic summaries (S structure denotes that attributes and their connection in a
summary are known, while S value denotes a non-instantiated part of a protoform (a summarizer sought)).

Type   Given               Sought     Remarks
1      S                   Q          Simple summaries through ad-hoc queries
2      S B                 Q          Conditional summaries through ad-hoc queries
3      Q S structure       S value    Simple value oriented summaries
4      Q S structure B     S value    Conditional value oriented summaries
5      Nothing             S B Q      General fuzzy rules

Kacprzyk & Zadrożny (2005, 2009) proposed using the concept of a protoform, in the sense of Zadeh
(2006), as a template underlying both the internal representation of linguistic summaries and their
formation in a dialogue with the user. A protoform is defined as an abstract prototype of a linguistically
quantified proposition, and its most abstract form is given by (14). Less abstract protoforms are obtained
by instantiating particular elements of (14), for example by replacing F with a condition/property “price
is cheap”. A more subtle instantiation is also possible where, e.g., only the attribute “price” is specified
and its (fuzzy) value is left open. Thus, the user constructs a more or less abstract protoform, and the
role of the system is to complete it with all missing elements (in the preceding example, all possible
fuzzy values representing the price) and to check the truth value (or other quality indicator) of each
linguistic summary thus obtained. Of course, this is fairly easy for a fully instantiated protoform, such
as (23), but much more difficult, if possible at all, for the fully abstract protoform (14).
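
The completion step described above can be sketched in Python. This is only an illustration under stated assumptions, not the authors' implementation: the truth value is computed with Zadeh's calculus of linguistically quantified propositions (the quantifier's membership applied to the average degree to which rows satisfy the summarizer), and the names trapezoid, most, truth_of_summary and complete_protoform, the fuzzy terms for price, and all numeric breakpoints are hypothetical.

```python
from typing import Callable, Dict, List

Membership = Callable[[float], float]


def trapezoid(a: float, b: float, c: float, d: float) -> Membership:
    """Trapezoidal membership function with support [a, d] and core [b, c]."""
    def mu(x: float) -> float:
        if b <= x <= c:
            return 1.0
        if x <= a or x >= d:
            return 0.0
        if x < b:
            return (x - a) / (b - a)
        return (d - x) / (d - c)
    return mu


def most(p: float) -> float:
    """Assumed definition of the quantifier "most" over a fraction p in [0, 1]."""
    if p >= 0.8:
        return 1.0
    if p <= 0.3:
        return 0.0
    return 2.0 * p - 0.6


def truth_of_summary(values: List[float], summarizer: Membership,
                     quantifier: Membership) -> float:
    """Truth of "Q rows are S": quantifier applied to the mean membership in S."""
    if not values:
        return 0.0
    fraction = sum(summarizer(v) for v in values) / len(values)
    return quantifier(fraction)


def complete_protoform(values: List[float], candidates: Dict[str, Membership],
                       quantifier: Membership) -> Dict[str, float]:
    """Fill the open fuzzy value with every candidate term and score each summary."""
    return {label: truth_of_summary(values, mu, quantifier)
            for label, mu in candidates.items()}


# Hypothetical data and hypothetical linguistic terms for the attribute "price".
prices = [10, 12, 15, 18, 20, 55, 90]
price_terms = {
    "cheap": trapezoid(0, 0, 20, 40),
    "moderate": trapezoid(20, 40, 60, 80),
    "expensive": trapezoid(60, 80, 1000, 1000),
}

truths = complete_protoform(prices, price_terms, most)
best = max(truths, key=truths.get)
for label, truth in truths.items():
    print(f"most records have price = {label}: truth {truth:.3f}")
```

Here the partially instantiated protoform “most records have price = ?” is completed with each candidate term in turn, and the term with the highest truth value is the one the system would report.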
Table 2 shows a classification of linguistic summaries into 5 basic types corresponding to protoforms
of an increasingly abstract form.
Type 1 and 2 summaries may be easily produced by a simple extension of a fuzzy querying interface
as provided by FQUERY for Access. Basically, the user constructs a query (a candidate summary),
and the system determines the fraction of rows matching this query and the linguistic quantifier that
best denotes this fraction. Type 3 summaries require much more effort. Their primary goal is
to determine typical (exceptional) values of an attribute. So, query S consists of only one simple
condition built of the attribute whose typical (exceptional) value is sought, the “=” relational operator,
and a placeholder for the value sought. The latter corresponds to the non-instantiated part of the
underlying protoform. For example, using the following summary in the context of a personnel database:
Q = “most” and S = “age=?” (here “?” denotes the placeholder mentioned above), we look for a typical
value of age. A Type 4 summary may produce typical (exceptional) values for some, possibly fuzzy, subset
of rows. From the computational point of view, Type 5 summaries, corresponding to the most abstract
protoform (14), represent fuzzy rules describing dependencies between specific values of particular
attributes. The summaries of Type 1 and 3 have actually been implemented in the framework of FQUERY
for Access.
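
The Type 1 procedure (given a query S, find the quantifier Q) can be sketched in Python as follows. This is a minimal sketch and does not reproduce the FQUERY for Access interface: the function names, the personnel rows, the fuzzy condition “young”, and the quantifier definitions are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, float]


def matching_fraction(rows: List[Row], query: Callable[[Row], float]) -> float:
    """Average matching degree of the rows against a (possibly fuzzy) query."""
    if not rows:
        return 0.0
    return sum(query(row) for row in rows) / len(rows)


# Hypothetical linguistic quantifiers, each mapping a fraction in [0, 1] to [0, 1].
QUANTIFIERS: Dict[str, Callable[[float], float]] = {
    "few": lambda p: max(0.0, min(1.0, (0.4 - p) / 0.3)),
    "about half": lambda p: max(0.0, 1.0 - abs(p - 0.5) / 0.25),
    "most": lambda p: max(0.0, min(1.0, 2.0 * p - 0.6)),
}


def best_quantifier(rows: List[Row], query: Callable[[Row], float],
                    quantifiers: Dict[str, Callable[[float], float]]
                    ) -> Tuple[str, float]:
    """Return the quantifier label that best denotes the matching fraction."""
    fraction = matching_fraction(rows, query)
    scores = {label: mu(fraction) for label, mu in quantifiers.items()}
    label = max(scores, key=scores.get)
    return label, scores[label]


def young(row: Row) -> float:
    """Assumed fuzzy condition: fully young up to 30, not young at all from 40."""
    return max(0.0, min(1.0, (40.0 - row["age"]) / 10.0))


# Hypothetical personnel table.
personnel = [{"age": 25}, {"age": 28}, {"age": 31}, {"age": 35}, {"age": 52}]

label, truth = best_quantifier(personnel, young, QUANTIFIERS)
print(f"{label} employees are young (truth {truth:.2f})")
```

A Type 2 summary would follow the same pattern with the fraction taken relative to the rows satisfying the qualifier B instead of all rows.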
As for possible future directions, we can mention new proposals to explicitly base linguistic data
summarization in the sense considered here, i.e., founded on the concept of Zadeh's computing with
words, on some developments in computational linguistics. First, Kacprzyk & Zadrożny (2010a) have
proposed to consider linguistic summarization in the context of natural language generation (NLG).
Second, Kacprzyk & Zadrożny (2010b) suggested the use of some natural language generation (NLG)