Database Reference
In-Depth Information
Then the
VAR 3
says that the first 3 bytes of each input record contain the length of that input record. If we take a
data file such as the following, we can load it using that control file:
[tkyte@desktop tkyte]$ cat demo.dat
05510,Sales,Virginia,This is the Sales
Office in Virginia
06520,Accounting,Virginia,This is the Accounting
Office in Virginia
06530,Consulting,Virginia,This is the Consulting
Office in Virginia
05940,Finance,Virginia,This is the Finance
Office in Virginia
[tkyte@desktop tkyte]$
In our input data file, we have four rows of data. The first row starts with
055
, meaning that the next 55 bytes
represent the first input record. This 55 bytes includes the terminating newline after the word
Virginia
. The next
row starts with
065
. It has 65 bytes of text, and so on. Using this format data file, we can easily load our data with
embedded newlines.
Again, if you are using UNIX/Linux and Windows (the preceding example was with UNIX/Linux, where a
newline is one character long), you would have to adjust the length field for each record. On Windows, the preceding
example's
.dat
file would have to contain 56, 66, 66, and 60 for the values in the length fields.
Use the STR Attribute
This is perhaps the most flexible method of loading data with embedded newlines. Using the
STR
attribute, we can
specify a new end-of-line character (or sequence of characters). This allows us to create an input data file that has
some special character at the end of each line—the newline is no longer “special.”
I prefer to use a sequence of characters, typically some special marker, and then a newline. This makes it easy
to see the end-of-line character when viewing the input data in a text editor or some utility, as each record still
has a newline at the end of it. The
STR
attribute is specified in hexadecimal, and perhaps the easiest way to get the
exact hexadecimal string we need is to use SQL and
UTL_RAW
to produce the hexadecimal string for us. For example,
assuming we are on UNIX/Linux where the end-of-line marker is
CHR(10)
(linefeed) and our special marker character
is a pipe symbol (
|
), we can write this:
EODA@ORA12CR1> select utl_raw.cast_to_raw( '|'||chr(10) ) from dual;
UTL_RAW.CAST_TO_RAW('|'||CHR(10))
-------------------------------------------------------------------------------
7C0A
It shows us that the
STR
we need to use on UNIX/Linux is
X'7C0A'
.
■
Note
on Windows, you would use
UTL_RAW.CAST_TO_RAW( '|'||chr(13)||chr(10) )
.