generally, it is useful to choose an arbitrary character as the field separator
depending on the characters that are common in the field data. BigQuery
supports a field delimiter option that allows overriding the comma as the
separator.
load_config['fieldDelimiter'] = '\011'
This code sets the delimiter to the tab character, which corresponds to
octal code 011. We used the octal code instead of the C escape code '\t'
to emphasize that the delimiter has to be set to a single byte in the range
0-255. Setting a byte using the octal code escape only works for bytes in
the range 0-127. If you are sending a byte in the range 128-255 it needs
to be encoded as a multi-byte UTF-8 character in the request. The service
converts the UTF-8 string back to ISO-8859-1 encoding (also known as Latin1)
and uses the first byte in the converted string. To get a feel for
what is happening on the wire, try the following commands on the Python
interactive prompt:
>>> separator = b'\246'.decode('latin1')
>>> separator
u'\xa6'
>>> print separator
¦
>>> separator.encode('utf8')
'\xc2\xa6'
If you want to use the byte with decimal value 166 (octal value 246), which
corresponds to the broken bar (¦) in Latin1 encoding, as the separator for
field values, you would need to send the bytes '\xc2\xa6' on the wire,
which is the UTF-8 encoding of this character.
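The round trip described above can be sketched as a pair of small helpers. This is only an illustration written for Python 3 (the interactive session above uses Python 2 syntax). The function names delimiter_wire_bytes and service_delimiter_byte are hypothetical and not part of any BigQuery client library. The second helper only mimics the conversion as this section describes it and is not the service's actual code.

```python
def delimiter_wire_bytes(byte_value):
    """Return the UTF-8 bytes to send for a single-byte delimiter (0-255).

    Hypothetical helper for illustration; not a BigQuery API.
    """
    if not 0 <= byte_value <= 255:
        raise ValueError("delimiter must be a single byte in the range 0-255")
    # Treat the byte as a Latin1 code point, then encode that character as UTF-8.
    return bytes([byte_value]).decode('latin1').encode('utf8')


def service_delimiter_byte(wire_bytes):
    """Mimic the described conversion: UTF-8 -> Latin1, keep the first byte.

    This is an assumption based on the text above, not the real service code.
    """
    return wire_bytes.decode('utf8').encode('latin1')[:1]


print(delimiter_wire_bytes(0o011))           # tab stays a single byte
print(delimiter_wire_bytes(0o246))           # broken bar becomes two bytes
print(service_delimiter_byte(b'\xc2\xa6'))   # back to the single byte 0xa6
```

Bytes in the range 0-127 pass through unchanged because their UTF-8 and Latin1 encodings are identical, which is why the octal escape works directly for the tab character.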
quote
Just as you can customize the field delimiter, you can change the default
quoting character (") to any single-byte character. The setting is
interpreted like fieldDelimiter: the first byte of the string after
conversion to Latin1 encoding is used. For example, this is how you would
set it to the single quote character:
load_config['quote'] = "'"
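To see what the two options mean for the data itself, the following sketch parses a small tab-separated sample locally with Python's standard csv module (Python 3). It assumes the load configuration is a plain dictionary holding only these two keys, and the sample data is made up. The csv module is used here as a stand-in and may not match BigQuery's parser in every detail.

```python
import csv
import io

# Assumed shape: a plain dict with only the two options discussed here.
load_config = {
    'fieldDelimiter': '\t',
    'quote': "'",
}

# Made-up sample: the quoted field in the first row contains the delimiter.
sample = "1\t'Hello\tworld'\n2\tplain\n"

reader = csv.reader(
    io.StringIO(sample),
    delimiter=load_config['fieldDelimiter'],
    quotechar=load_config['quote'],
)
rows = list(reader)

for row in rows:
    print(row)
```

The quoted field keeps its embedded tab as part of the value, so both rows parse into exactly two fields.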