• To show totals and compute percentages for each speaker, we need the full set of
spoken lines in the play:
    let $all-lines := $play-document//SPEECH/LINE
• We need the lines spoken by each speaker. Assuming the name of the speaker is
in $speaker, we can get these with:
    $speaker-lines := $play-document//SPEECH[SPEAKER eq $speaker]/LINE
• Given a sequence of LINE elements in a variable $elms (the sequence of LINE
elements we retrieved in one of the previous two bullets), how many words are
spoken? A rough but usable approximation can be calculated by tokenizing
everything said on whitespace boundaries and counting/aggregating the results:
    sum($elms ! count(tokenize(., '\s+')))
This might need a little further explanation:
— The exclamation mark after the $elms expression is an XQuery 3.0 “bang” or
“simple map” operator (see “The simple map operator” on page 114). It performs
the operation on the right for all members of the sequence on the left. You
could have done this in several other (and probably more customary) ways
(e.g., using a FLWOR expression), but this seemed a useful way to introduce
one of the new XQuery 3.0 capabilities.
— The tokenize function tokenizes (breaks up) strings on boundaries given by a
regular expression. The regular expression '\s+' signifies a sequence of
whitespace characters, so that gives us the words (more or less; sometimes
punctuation gets in the way, but let's forget about that for now).
— We're not interested in the words themselves but only in how many there are.
So we simply count them.
— The outer sum function sums the numeric results of what's inside, returning
the total of all words spoken in the elements in $elms.
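To make the comparison with the more customary FLWOR style concrete, here is a minimal sketch that computes the same total both ways. The two LINE elements are hypothetical sample data invented for this illustration, not taken from the play document, and the variable names $with-map and $with-flwor are likewise assumptions.

```xquery
xquery version "3.0";

(: Hypothetical sample data standing in for the LINE elements of a play. :)
let $elms := (
  <LINE>To be, or not to be: that is the question:</LINE>,
  <LINE>Whether 'tis nobler in the mind to suffer</LINE>
)

(: Simple map version, as used in the text. :)
let $with-map := sum($elms ! count(tokenize(., '\s+')))

(: Equivalent FLWOR version: one count per LINE, then summed. :)
let $with-flwor := sum(
  for $elm in $elms
  return count(tokenize($elm, '\s+'))
)

return ($with-map, $with-flwor, $with-map eq $with-flwor)
```

For this sample both expressions should yield 18 (10 words in the first line, 8 in the second), so the third item is true. Note that a line with leading or trailing whitespace produces an extra empty token with this regular expression; wrapping the context item in normalize-space() before tokenizing is one way to tighten the approximation.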
As a last step, let's put this functionality into a local function (because we're going to
use it twice in our code: once for the full play and once for every speaker):
    declare function local:word-count($elms as element()*) as xs:integer {
        sum($elms ! count(tokenize(., '\s+')))
    };
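As a quick check of the function in both of its intended roles (the whole play and a single speaker), the following sketch runs it against a tiny inline document. The PLAY fragment is hypothetical sample data that only mimics the SPEECH/SPEAKER/LINE structure assumed above, and the speaker result element with its attributes is made up for this illustration; it is not the table page the chapter builds.

```xquery
xquery version "3.0";

declare function local:word-count($elms as element()*) as xs:integer {
  sum($elms ! count(tokenize(., '\s+')))
};

(: Hypothetical miniature play, standing in for the real play document. :)
let $play-document :=
  <PLAY>
    <SPEECH>
      <SPEAKER>BERNARDO</SPEAKER>
      <LINE>Who's there?</LINE>
    </SPEECH>
    <SPEECH>
      <SPEAKER>FRANCISCO</SPEAKER>
      <LINE>Nay, answer me: stand, and unfold yourself.</LINE>
    </SPEECH>
    <SPEECH>
      <SPEAKER>BERNARDO</SPEAKER>
      <LINE>Long live the king!</LINE>
    </SPEECH>
  </PLAY>

(: Total for the whole play. :)
let $all-lines := $play-document//SPEECH/LINE
let $total := local:word-count($all-lines)

(: Per-speaker totals and their share of the whole. :)
for $speaker in distinct-values($play-document//SPEAKER)
let $speaker-lines := $play-document//SPEECH[SPEAKER eq $speaker]/LINE
let $words := local:word-count($speaker-lines)
return
  <speaker name="{$speaker}" words="{$words}"
           percentage="{round(100 * $words div $total)}"/>
```

With this sample, BERNARDO should come out at 6 of the 13 words and FRANCISCO at 7. The order of the returned elements depends on distinct-values, which does not guarantee one.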
Now let's put this all together and create a page that shows us the results of our
analysis (just for Hamlet) in a table. The code for this is shown in Example 3-7.