represented using kanji, on pronunciation and kun pronunciation. The on pronunciation
of kanji is typeset in italics. In the source file, the associated text is framed
by a unique pair of control sequences. Similarly, the kun pronunciation of kanji is
represented by small caps.
3) The source was typed by a human with a regular layout on paper (i.e., in the
printed topic) in mind. Though quite regular, it contains a certain collection of
describable irregularities. For example, the ranges of framing pairs of control
sequences sometimes overlap. In order to match on and kun pronunciation in the source
file of [37] properly, a collection of commutation rules for control sequences was
implemented such that the control sequences needed for pattern matching framed only a
piece of text and no other control sequences. These commutation rules were
implemented in a similar way as the latter example shows.
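To make the idea of a commutation rule concrete, here is a minimal sketch. It does not reproduce the rules used for [37]; the markers <on>, </on>, <x> and </x> and the function names commute and on_reading are hypothetical and chosen only for illustration. The single rule swaps two adjacent closing markers so that the pair <on>...</on> ends up framing plain text only, which a simple pattern can then match.

```shell
#!/bin/sh
# Hypothetical markup (not the control sequences of the actual source file):
#   <on>...</on>  frames an on pronunciation
#   <x>...</x>    stands for any other framing pair

# Commutation rule: when </x> is immediately followed by </on>, swap them.
# Both pairs still cover the same text, but <on>...</on> no longer
# contains another marker.
commute() {
    sed -e 's|</x></on>|</on></x>|g'
}

# Extract the text framed by <on>...</on>, assuming it contains no markers.
on_reading() {
    sed -n 's|.*<on>\([^<]*\)</on>.*|\1|p'
}

# Overlapping ranges: </x> closes inside the range of <on>.
normalized=$(printf '%s\n' '<x>old <on>kan</x></on>ji' | commute)
printf '%s\n' "$normalized"

reading=$(printf '%s\n' "$normalized" | on_reading)
printf '%s\n' "$reading"
```

After the rule is applied, the sample line reads <x>old <on>kan</on></x>ji, and the framed text kan can be matched without any further markers in the way.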
Application (sorting results into different files): The following example illustrates
how to sort/copy results of text-processing with sed into a number of dedicated files.
We shall refer to the following program as sortByVowel. sortByVowel sorts all words
in a text file $1 = fName into several files depending upon the vowels occurring in the
words. For example, all words containing the vowel “a” are put into one file fName.a.
1: #!/bin/sh
2: # sortByVowel
3: echo >$1.a; echo >$1.e; echo >$1.i; echo >$1.o; echo >$1.u;
4: leaveOnlyWords $1 | oneItemPerLine - |
5: sed -n '/a/w '$1'.a
6: /e/w '$1'.e
7: /i/w '$1'.i
8: /o/w '$1'.o
9: /u/w '$1'.u'
Explanation: Line 3 of this sh program generates empty files $1.a ... $1.u in case
the program has been used before on the same file. This is done by echoing “nothing
plus a terminating newline character” into the files¹² (strictly speaking, each file
then contains one empty line). First, we observe that the option¹³ -n (no printing)
suppresses¹⁴ the automatic output of sed. Next, observe the use of the single quotes
and string concatenation in sh in lines 5-9. For example, if the argument $1 to sh
equals the string fName, then sh passes the string /a/w fName.a to sed in line 5.
Thus, if the current pattern space (input line) contains the vowel “a”, then sed
writes it to the file fName.a in line 5. Output by the w operator is always appended
to an existing file. Thus, the files have to be removed or empty versions have to be
created in case the program has been used before on the same file. (Consult man echo
and man rm. Some implementations, such as GNU sed, instead create or truncate every
w file before the first input line is read; line 3 is then harmless but not needed.)
Note that everything after a w operator and separating white space until the end of
the line is understood as the filename the w operator is supposed to write to. Note in
addition that a w operator can follow and be part of a substitution command. In that
case the w operator writes to the named file only if a substitution was made.
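The helper programs leaveOnlyWords and oneItemPerLine are not shown in this section, so the listing cannot be run on its own. The following is a minimal stand-alone sketch of the same idea, not the program above: the function name sort_by_vowel is hypothetical, a single tr call is assumed as a rough stand-in for both helpers (it may differ from them in detail), and the files are emptied with ": >" instead of echo so that they contain no leading empty line. The sed script is built with double quotes, an alternative to the single-quote concatenation used in lines 5-9.

```shell
#!/bin/sh
# sort_by_vowel: hypothetical stand-alone variant of sortByVowel.
# Assumption: tr replaces leaveOnlyWords and oneItemPerLine.
sort_by_vowel() {
    f=$1
    # Create or empty the five output files (":" writes nothing at all).
    for v in a e i o u; do
        : >"$f.$v"
    done
    # Turn every run of non-letters into one newline (one word per line),
    # then let sed copy each word into the file of every vowel it contains.
    # Inside double quotes, sh expands $f before sed sees the script.
    tr -cs 'A-Za-z' '\n' <"$f" |
    sed -n "/a/w $f.a
/e/w $f.e
/i/w $f.i
/o/w $f.o
/u/w $f.u"
}

# Usage on a small sample file in a temporary directory.
dir=$(mktemp -d)
printf 'The quick brown fox\njumps over a lazy dog\n' >"$dir/sample"
sort_by_vowel "$dir/sample"
cat "$dir/sample.o"
```

For this sample, the file sample.o receives the words brown, fox, over and dog, one per line. A word with several different vowels, such as quick, is written to several files.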
12 For example, echo 'liberal' >fName overwrites (>) the file fName with the content
liberal followed by a newline.
13 When used, options are usually listed directly after a UNIX command with a
leading hyphen before the first “real” argument $1 of the command.
14 Printing can always be triggered explicitly by the print operator p. For example,
/liberal/p prints the pattern space if the string liberal has been found.
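The interplay of -n, the print operator p, and a w operator attached to a substitution can be tried out with a short sketch. The sample text, the temporary directory and the file names in and changed are assumptions made only for this illustration.

```shell
#!/bin/sh
dir=$(mktemp -d)
printf 'liberal arts\nconservative\nliberal party\n' >"$dir/in"

# -n switches off automatic printing; p prints only the matching lines.
printed=$(sed -n '/liberal/p' "$dir/in")
printf '%s\n' "$printed"

# w as part of a substitution: a line is written to the named file
# only if the substitution was actually made on that line.
sed -n "s/liberal/LIBERAL/w $dir/changed" "$dir/in"
cat "$dir/changed"
```

The first sed call prints the two lines containing liberal. The second prints nothing to standard output because of -n, and the file changed holds only the two modified lines; the line conservative is not written.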