Application (scalar multiplication):
The next program scalarMultiplication implements scalar multiplication. If aFile is a vector and n is a number, then it is used as scalarMultiplication aFile n.
#!/bin/sh
# scalarMultiplication
# First argument $1 is vector. Second argument $2 is scalar.
awk '{ $(NF)*=('$2'+0) ; print }' $1
Explanation:
The scalar $2 is spliced into the awk program by sh using string concatenation of the strings '{ $(NF)*=(', the content of the second argument $2 of the command, and '+0) ; print }'. Then, for every input line of the file $1, the frequency stored in the last field $(NF) is multiplied by the scalar, and the resulting line is printed.
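Splicing the scalar into the program text works as long as $2 is a plain number. As a sketch of an alternative, not the form used in the program above, the scalar can be handed to awk as a variable with the -v option. The function name scalar_multiply and the sample data are made up for illustration, and a vector is assumed to be a file whose last field on each line is a frequency.

```shell
#!/bin/sh
# Hypothetical variant of scalarMultiplication: the scalar is passed to awk
# with -v instead of being spliced into the program text.
scalar_multiply() {
    # $1: vector file (last field = frequency), $2: scalar
    awk -v s="$2" '{ $(NF) *= (s + 0); print }' "$1"
}

# Made-up sample vector, written to a temporary file.
tmp=$(mktemp)
printf 'the 4\ncat 1\nsat 2\n' > "$tmp"

result=$(scalar_multiply "$tmp" 3)
printf '%s\n' "$result"
# prints:
# the 12
# cat 3
# sat 6

rm -f "$tmp"
```

Since the scalar never becomes part of the awk program text in this variant, an unusual argument cannot alter the program's syntax.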
Application (absolute value):
The next program computeAbsoluteValue computes the absolute value of the frequencies of items in a vector.
#!/bin/sh
# computeAbsoluteValue
awk '$(NF)<0 { $(NF)=-$(NF) } ;
{ print }' $1
Explanation:
If the last field of an input line is negative, then its sign is reversed.
Next, every line is printed.
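The two rules can be traced on a small example. The following is only a sketch: the function name abs_vector and the sample entries are invented for illustration, and the file is assumed to hold one item per line with its (possibly negative) frequency in the last field.

```shell
#!/bin/sh
# Same awk rules as computeAbsoluteValue, wrapped in a function for the demo.
abs_vector() {
    # $1: vector file (last field = frequency)
    awk '$(NF) < 0 { $(NF) = -$(NF) }
         { print }' "$1"
}

# Made-up sample vector with a positive, a negative, and a zero entry.
tmp=$(mktemp)
printf 'gain 3\nloss -2\nflat 0\n' > "$tmp"

result=$(abs_vector "$tmp")
printf '%s\n' "$result"
# prints:
# gain 3
# loss 2
# flat 0

rm -f "$tmp"
```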
Application (sign function):
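Similar to the previous program, the next program frequencySign computes the sign of the frequencies of items in a vector: a positive frequency is replaced by 1, a negative frequency by -1, and a frequency of zero is left unchanged.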
#!/bin/sh
# frequencySign
awk '$(NF)>0 {$(NF)=1}; $(NF)<0 {$(NF)=-1}; {print}' $1
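As a brief illustration of the three rules in frequencySign, the following sketch runs them on a small made-up vector. The function name sign_vector and the sample data are assumptions for the demo; the file is again assumed to carry the frequency in the last field.

```shell
#!/bin/sh
# Same awk rules as frequencySign, wrapped in a function for the demo.
sign_vector() {
    # $1: vector file (last field = frequency)
    awk '$(NF) > 0 { $(NF) = 1 }    # positive frequency becomes 1
         $(NF) < 0 { $(NF) = -1 }   # negative frequency becomes -1
         { print }                  # zero falls through unchanged
        ' "$1"
}

# Made-up sample vector.
tmp=$(mktemp)
printf 'gain 3\nloss -2\nflat 0\n' > "$tmp"

result=$(sign_vector "$tmp")
printf '%s\n' "$result"
# prints:
# gain 1
# loss -1
# flat 0

rm -f "$tmp"
```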
Application (selecting frequencies):
The next program cuts away low frequencies from a file $1 that is a vector. The limit value $2 is the second argument to the program. We shall refer to the program as filterHighFrequencies. It can be used to obtain files with very common words that are functional in the grammatical sense but not in regard to the context.
#!/bin/sh
# filterHighFrequencies
# First argument $1: vector. Second argument $2: cut-off threshold.
awk '$(NF)>='$2 $1
Explanation:
$2 stands for the second argument to filterHighFrequencies. If this program is invoked with filterHighFrequencies fname 5, then sh passes $(NF)>=5 as the selecting address pattern to awk. Consequently, all lines of fname where the last field is greater than or equal to 5 are printed.
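To make the splicing step visible, the following sketch first builds the program text that sh hands to awk and prints it, then applies it to a small vector. The variable names threshold, program, and selected, as well as the sample data, are made up for this illustration.

```shell
#!/bin/sh
# Build the awk program text the same way sh does in filterHighFrequencies:
# a single-quoted string followed directly by the expanded threshold.
threshold=5
program='$(NF)>='$threshold
printf '%s\n' "$program"
# prints: $(NF)>=5

# Made-up sample vector.
tmp=$(mktemp)
printf 'the 12\nof 5\ncat 1\n' > "$tmp"

# A pattern without an action prints every line that matches it.
selected=$(awk "$program" "$tmp")
printf '%s\n' "$selected"
# prints:
# the 12
# of 5

rm -f "$tmp"
```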
Application:
The vector operations presented above make it possible to analyse and compare, e.g., the vocabulary use of students in a class in a large variety of ways (vocabulary use of a single student vs. the class or vs. a dedicated list of words, similarity/distinction of vocabulary use among students, computation of probability distributions over vocabulary use (normalization), etc.).
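As one possible reading of the normalization mentioned above, a frequency vector can be turned into a probability distribution by dividing every frequency by the sum of all frequencies. The sketch below is an assumption about how this might be done with the same tools, not a program taken from the text; the function name normalize_vector and the sample data are hypothetical.

```shell
#!/bin/sh
# Hypothetical normalization: the file is read twice by one awk process.
normalize_vector() {
    # $1: vector file (last field = frequency)
    # First pass (NR == FNR): add up all frequencies.
    # Second pass: divide each frequency by the total, if the total is non-zero.
    awk 'NR == FNR  { total += $(NF); next }
         total != 0 { $(NF) = $(NF) / total }
         { print }' "$1" "$1"
}

# Made-up sample vector with frequencies summing to 4.
tmp=$(mktemp)
printf 'the 2\ncat 1\nsat 1\n' > "$tmp"

result=$(normalize_vector "$tmp")
printf '%s\n' "$result"
# prints:
# the 0.5
# cat 0.25
# sat 0.25

rm -f "$tmp"
```

Reading the file twice keeps the sketch to a single awk program; it requires a regular file rather than a pipe as input.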