represented using kanji, on pronunciation and kun pronunciation. The on pronunciation of kanji is typeset in italics. In the source file, the associated text is framed by a unique pair of control sequences. Similarly, the kun pronunciation of kanji is represented by small caps.
3) The source was typed by a human with a regular layout on paper (i.e., in the printed topic) in mind. Though quite regular, it contains a certain collection of describable irregularities. For example, the ranges of framing pairs of control sequences sometimes overlap. In order to match on and kun pronunciation in the source file of [37] properly, a collection of commutation rules for control sequences was implemented such that the control sequences needed for pattern matching framed only a piece of text and no other control sequences. These commutation rules were implemented in much the same way as the latter example shows.
Application (sorting results into different files): The following example illustrates how to sort/copy results of text-processing with sed into a number of dedicated files. We shall refer to the following program as sortByVowel. sortByVowel sorts all words in a text file $1=fName into several files depending upon the vowels occurring in the words. For example, all words containing the vowel “a” are put into one file fName.a.
1: #!/bin/sh
2: # sortByVowel
3: echo >$1.a; echo >$1.e; echo >$1.i; echo >$1.o; echo >$1.u;
4: leaveOnlyWords $1 | oneItemPerLine - |
5: sed -n '/a/w '$1'.a
6:         /e/w '$1'.e
7:         /i/w '$1'.i
8:         /o/w '$1'.o
9:         /u/w '$1'.u'
Explanation: Line 3 of this sh program generates empty files $1.a ... $1.u in case the program has been used before on the same file. This is done by echoing “nothing plus a terminating newline character” into the files.[12]
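What line 3 leaves behind in each file can be checked directly. The following is a minimal sketch, not part of the original program: the file name demo.a and the scratch directory are hypothetical, and mktemp -d is assumed to be available. It shows that echo with redirection replaces earlier content by a single newline character, and that the null command `:` with redirection is an alternative that leaves a file of zero bytes.

```shell
#!/bin/sh
# Sketch: what `echo >file` leaves behind (demo.a is a hypothetical name).
tmpdir=$(mktemp -d)            # scratch directory so nothing real is overwritten
f="$tmpdir/demo.a"

echo 'stale result from an earlier run' >"$f"

echo >"$f"                     # overwrite: the file now holds one newline only
bytes_after_echo=$(wc -c <"$f" | tr -d ' ')

: >"$f"                        # alternative: truncate the file to zero bytes
bytes_after_colon=$(wc -c <"$f" | tr -d ' ')

echo "after echo: $bytes_after_echo byte(s); after ':': $bytes_after_colon byte(s)"
rm -r "$tmpdir"
```
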
First, we observe that the option[13] -n (no printing) suppresses[14] any output of sed. Next, observe the use of the single quotes and string concatenation in sh in lines 5-9. For example, if the argument $1 to sh equals the string fName, then sh passes the string /a/w fName.a to sed as the first line of the script in line 5. Thus, if the current pattern space (input line) contains the vowel “a”, then sed writes to the file fName.a in line 5. The program assumes that output by the w operator is appended to an existing file. Thus, the files have to be removed or empty versions have to be created in case the program has been used before on the same file. (Consult man echo and man rm.) GNU sed, by contrast, creates or truncates every file named in a w operator itself before it reads the first input line, so with that implementation line 3 is merely a safeguard. Note that everything after a w operator and separating white space until the end of the line is understood as the filename the w operator is supposed to write to. Note in addition that a w operator can follow and be part of a substitution command. In that case the w operator writes to the named file if a substitution was made.
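The quoting in lines 5-9 and the two uses of w can be tried out in isolation. The following is a minimal sketch under stated assumptions, not the program above: leaveOnlyWords and oneItemPerLine are not reproduced here, so a tr call stands in for them as a hypothetical tokenizer; the input text, the scratch directory, and the file name subst are invented for illustration; mktemp -d is assumed to be available. Unlike the listing, the sketch puts the file name in double quotes, which keeps the concatenation intact even if the name contains white space.

```shell
#!/bin/sh
# Sketch only: tr stands in for `leaveOnlyWords | oneItemPerLine` (an assumption).
tmpdir=$(mktemp -d)                 # scratch directory for all output files
f="$tmpdir/fName"                   # plays the role of $1
printf 'The liberal sun\nrose over hills\n' >"$f"

# One alphabetic word per line, then one w command per vowel.
# The shell glues '...w ' + "$f" + '.a' + ... into a single sed script;
# each file name ends at the end of its script line.
tr -cs 'A-Za-z' '\n' <"$f" |
sed -n '/a/w '"$f"'.a
/e/w '"$f"'.e
/i/w '"$f"'.i
/o/w '"$f"'.o
/u/w '"$f"'.u'

# w as part of a substitution: only lines that were changed are written.
printf 'liberal\nsun\n' | sed -n 's/a/A/w '"$tmpdir"'/subst'

# Collect each result file on one line for display.
in_a=$(paste -s -d ' ' "$f.a")
in_e=$(paste -s -d ' ' "$f.e")
in_i=$(paste -s -d ' ' "$f.i")
in_o=$(paste -s -d ' ' "$f.o")
in_u=$(paste -s -d ' ' "$f.u")
subst=$(cat "$tmpdir/subst")

printf 'a: %s\ne: %s\ni: %s\no: %s\nu: %s\nsubst: %s\n' \
    "$in_a" "$in_e" "$in_i" "$in_o" "$in_u" "$subst"
rm -r "$tmpdir"
```

A word such as liberal lands in three files (a, e and i), because every w command whose pattern matches is executed for the same pattern space.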
[12] For example, echo 'liberal' >fName overwrites (>) the file fName with the content liberal followed by a newline.
[13] When used, options are usually listed directly after a UNIX command with a leading hyphen before the first “real” argument $1 of the command.
[14] Printing can always be triggered explicitly by the print operator p. For example, /liberal/p prints the pattern space, if the string liberal has been found.
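The interplay of the option -n and the print operator p can be seen in a small experiment. The following is a minimal sketch with invented input lines: with -n only the explicitly printed matches appear, while without -n every line is printed once by default and each matching line a second time by p.

```shell
#!/bin/sh
# Sketch: -n silences sed; p prints the pattern space on request.
input='liberal
conservative
liberally'

# With -n: only lines matching /liberal/ are printed.
with_n=$(printf '%s\n' "$input" | sed -n '/liberal/p')

# Without -n: default printing plus p, so matching lines appear twice.
without_n=$(printf '%s\n' "$input" | sed '/liberal/p')
lines_without_n=$(printf '%s\n' "$without_n" | wc -l | tr -d ' ')

printf 'with -n:\n%s\n' "$with_n"
printf 'without -n: %s lines\n' "$lines_without_n"
```
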