special characters in sed. 2) The second substitution command generates an
s-command from a given string on a single line. In fact, out of the pattern space (line)
containing solely the string built\-in, which is matched by .* and reproduced by
& in the substitution command in line 7, the following s-command is generated in
$1.sed:
s/\([^A-Za-z]\)built\-in\([^A-Za-z]\)/\1\2/g
Note that all slash and backslash characters occurring in the latter line (except the
one in built\-in) have to be preceded by an additional backslash in the replacement
s\/\\([^A-Za-z]\\)&\\([^A-Za-z]\\)\/\\1\\2\/g
of the generating second substitution command listed above in order to represent themselves.
The list of generated s-commands is stored in a new file $1.sed in line 8. Using
sed -f $1.sed in line 9, this file of s-commands is then applied to the file whose
name is given to sh as second argument $2.
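
The escaping described above can be checked in isolation. The following is a minimal sketch, not the script from the text: the function name gen_cmd is hypothetical, and the string passed in is assumed to have its sed special characters escaped already.

```shell
#!/bin/sh
# gen_cmd STRING: emit the s-command that deletes STRING when it
# stands between two non-letters. (gen_cmd is a hypothetical name;
# STRING must already have its sed special characters escaped.)
gen_cmd() {
    # .* matches the whole line and & reproduces it; every slash and
    # backslash that must appear in the output is escaped once more.
    printf '%s\n' "$1" |
        sed 's/.*/s\/\\([^A-Za-z]\\)&\\([^A-Za-z]\\)\/\\1\\2\/g/'
}

gen_cmd 'built\-in'
# prints: s/\([^A-Za-z]\)built\-in\([^A-Za-z]\)/\1\2/g
```

Single quotes around the sed program keep the shell from interpreting any of the backslashes, so only sed's own escaping rules are in play.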
Application (checking summary writing): With the technique introduced above,
one can compare student summaries against the original text by deleting, in the
students' writings, all words that occur in the original.
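
As a rough illustration of this application, the sketch below strings the two stages together. The helper name strip_known_words and the sample files are hypothetical, and the word list is assumed to hold one purely alphabetic word per line, so that no escaping step is needed.

```shell
#!/bin/sh
# strip_known_words WORDLIST SUMMARY  (hypothetical helper)
# Assumes WORDLIST holds one purely alphabetic word per line, so no
# sed special characters need escaping.
strip_known_words() {
    script=$(mktemp) || return 1
    # Stage 1: turn every word into an s-command that deletes it.
    sed 's/.*/s\/\\([^A-Za-z]\\)&\\([^A-Za-z]\\)\/\\1\\2\/g/' "$1" > "$script"
    # Stage 2: pad each line with blanks so that words at the line
    # boundaries also have a non-letter on both sides, then apply.
    sed 's/^/ /; s/$/ /' "$2" | sed -f "$script"
    rm -f "$script"
}

# Demo with throwaway files.
words=$(mktemp) && summary=$(mktemp) || exit 1
printf '%s\n' the cat > "$words"
printf '%s\n' 'the dog saw the cat' > "$summary"
strip_known_words "$words" "$summary"   # only "dog saw" (plus blanks) remains
rm -f "$words" "$summary"
```

Because each match consumes its delimiting non-letters, a word repeated immediately after itself is deleted only once per command; the sketch does not handle that case.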
12.3.5 Further Processing Techniques
Processing an input line repeatedly with the same (fragment of the) cycle is an
important technique in sed programs. This involves the address operator (:) and
the loop test operator (t) of sed. The next example illustrates the basic
mechanism in an elementary setup.
Example: The following program moves all characters 0 (zero) to the very right
of a line. This shows the typical use of the t operator.
#!/bin/sh
# Move all zeroes to the right in a line.
sed ': again
s/0\([^0]\)/\10/g
t again' "$1"
Explanation: The first command of the sed program defines the address again.
The second command exchanges every character 0 with a neighboring non-zero to its
right. Here, the non-zero is encoded as [^0], tagged, and reused as \1 to the left of
the 0 in the replacement of the substitution command. The last command tests whether
or not a substitution happened. If a substitution happened, then the cycle is continued
at : again. Otherwise, the cycle is terminated. A single pass moves each 0 by at most
one position, so the loop repeats until no 0 is followed by a non-zero.
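
To see the loop at work, the following sketch compares a single pass of the substitution with the full loop on a sample line. The function names one_pass and zeros_right and the input 100203 are chosen here for illustration only; both functions read standard input rather than a file argument.

```shell
#!/bin/sh
# One application of the substitution, without the loop.
one_pass() {
    sed 's/0\([^0]\)/\10/g'
}

# The looping program from the example, reading standard input.
zeros_right() {
    sed ': again
s/0\([^0]\)/\10/g
t again'
}

printf '%s\n' 100203 | one_pass      # prints 102030
printf '%s\n' 100203 | zeros_right   # prints 123000
```

After one pass the zeroes have each moved at most one place to the right; the loop needs a second pass to reach 123000 and a third, unsuccessful pass to detect that nothing is left to exchange.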
Application (defining commutation relations and standardization for control
sequences in a non-plain-text file): In the course of the investigation in [1, 2, 3],
techniques were developed by one of the authors to transform the source file of [37]
(which was generated with a What-You-See-Is-What-You-Get editor) into a Prolog
database. This raised the following problems:
1) The source is “dirty”: it contains many control sequences coming from the
WYSIWYG editor which have no meaning, but were used for the format and the
spacing in the printed book. Such control sequences had to be removed. This was done
using substitution commands with empty replacements.
2) The source cannot simply be “cleaned” of the control sequences mentioned
in 1). Some of the control sequences in the source are important with regard
to the database which was generated. In [37], Japanese words and compounds are