Database Reference
character (comma by default). It is legal to quote any field even if the
quotes are not required. Floating point values can use either a decimal
representation or scientific notation. Boolean values can be any of true,
false, t, and f, and case is not significant.
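The value rules above can be modeled locally when checking data before a load. The following is a minimal Python sketch, not the service's parser; the function name parse_bool is hypothetical, and the built-in float() is used only to show that decimal and scientific notation denote the same kind of value.

```python
def parse_bool(text):
    """Interpret a CSV field as a boolean using the spellings listed above."""
    # Case is not significant, so compare in lowercase.
    value = text.strip().lower()
    if value in ("true", "t"):
        return True
    if value in ("false", "f"):
        return False
    raise ValueError(f"not a boolean field: {text!r}")


# Decimal and scientific notation describe the same floating point value.
decimal_form = float("1500.0")
scientific_form = float("1.5e3")
```

For example, parse_bool("TRUE") and parse_bool("t") both yield True, and the two floating point spellings compare equal.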
Whitespace handling is easy to overlook in CSV. Leading and trailing
whitespace characters in fields are ignored. So on the fourth line, the first
field is parsed as “bare string” with the trailing space dropped. If you
need to preserve whitespace in a string field, use quotes.
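If you generate the CSV yourself, one way to make sure padding survives is to quote every string field when writing. The sketch below uses Python's standard csv module; the helper name to_csv_line is hypothetical, and quoting all fields is just one simple policy, chosen here because quoting a field that does not need it is legal.

```python
import csv
import io


def to_csv_line(fields):
    """Write one record with every field quoted so whitespace is preserved."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(fields)
    return buffer.getvalue()


# The leading and trailing spaces sit inside the quotes, so a parser that
# trims unquoted fields still sees them.
line = to_csv_line(["  padded  ", "bare string"])
```

Here line holds "  padded  ","bare string" followed by a newline.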
Timestamp values can be represented as a calendar date and time or as
seconds since the UNIX epoch (1970-01-01 00:00:00 UTC). The first and
fourth lines use the string format, and the third line uses seconds since the
epoch. The string format for timestamps is:
YYYY-MM-DD HH:MM:SS[.ssssss] [±00:00]
The fractional seconds and time zone offset are optional. If the offset is not
present, the UTC time zone is assumed. Time zone offset codes (e.g. UTC,
EST, PDT) are not supported.
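To make the two timestamp representations concrete, here is a minimal Python sketch that accepts either form and normalizes to UTC. It is a local model of the rules described above, not the service's implementation, and the function name parse_timestamp is hypothetical.

```python
from datetime import datetime, timezone

# String layouts, from most to least specific: fractional seconds and the
# offset are each optional.
_FORMATS = (
    "%Y-%m-%d %H:%M:%S.%f %z",
    "%Y-%m-%d %H:%M:%S %z",
    "%Y-%m-%d %H:%M:%S.%f",
    "%Y-%m-%d %H:%M:%S",
)


def parse_timestamp(text):
    """Parse epoch seconds or a calendar date and time into an aware UTC datetime."""
    text = text.strip()
    try:
        seconds = float(text)
    except ValueError:
        pass  # Not a number, so try the string layouts below.
    else:
        return datetime.fromtimestamp(seconds, tz=timezone.utc)
    for layout in _FORMATS:
        try:
            parsed = datetime.strptime(text, layout)
        except ValueError:
            continue
        if parsed.tzinfo is None:
            # No offset present: assume UTC.
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed.astimezone(timezone.utc)
    raise ValueError(f"unrecognized timestamp: {text!r}")
```

With this sketch, "2014-08-19 07:41:35.220 -05:00" and "2014-08-19 12:41:35.220" resolve to the same instant, while a value such as "2014-08-19 12:41:35 PDT" is rejected because named zones are not handled.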
The line (record) separators are not easy to see in the previous sample. Each
line must be terminated by a newline (\n), a carriage return (\r), or a carriage
return followed by a newline (\r\n). By default the service assumes
that these characters do not appear within fields, even if the field is quoted.
The reasons for this are explained in a separate section, but the main thing
to note is that these characters serve as the record separators in the CSV
format.
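The default record-splitting behavior described above can be illustrated with a short Python sketch. It is only a model of the stated rule, with a hypothetical function name split_records, and it deliberately ignores quotes to show what happens to a quoted field that contains a newline.

```python
import re


def split_records(data):
    """Split raw CSV text into records on \\r\\n, \\r, or \\n, ignoring quotes."""
    # Try \r\n first so a carriage return plus newline counts as one separator.
    records = re.split(r"\r\n|\r|\n", data)
    # A terminated final record leaves an empty string at the end; drop it.
    if records and records[-1] == "":
        records.pop()
    return records


mixed = split_records("a,1\nb,2\r\nc,3\r")
broken = split_records('"x\ny",1\n')
```

The first call yields three records regardless of which terminator was used. The second call shows the consequence of the default assumption: the quoted field is cut in two, producing the fragments "x and y",1 instead of one record.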
This covers the basics of the CSV format. If you can directly generate the
CSV to be loaded into BigQuery, this is all you need. However, in many
scenarios you cannot control the details of how the CSV is generated, so
it is necessary to adjust how the data is parsed. The next few sections are
organized by parsing options specific to the CSV format.
fieldDelimiter
A common variant of the CSV format is Tab-Separated-Values where the
tab (\t) character is used to separate fields rather than a comma. This is
convenient when the fields are text and frequently contain commas. With
tab as the separator, commas can appear in fields without quoting.
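As an illustration of the tab-separated variant, the following Python sketch writes rows with the standard csv module using a tab delimiter. The helper name to_tsv is hypothetical, and the sketch covers only generating the file; setting fieldDelimiter on the load itself is not shown.

```python
import csv
import io


def to_tsv(rows):
    """Serialize rows as tab-separated values with newline terminators."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


# The comma is ordinary text here, so the writer leaves the field unquoted.
tsv_text = to_tsv([["Seattle, WA", "42"], ["Portland, OR", "17"]])
```

Each output line contains the city text with its comma intact, a tab, and the number, with no quotes added.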