BigQuery tables support fields that are arrays and fields that are nested
within other fields. When a table contains such fields, it is not possible to
represent a record in the table as a simple list of values. The CSV family of
formats was designed to represent tabular data, so each record or line of
data is a simple list of values. As a result, the CSV input format is not
compatible with BigQuery schemas that contain fields that are arrays or have
type RECORD. If you are constrained to using CSV as an input format, you
should avoid schemas that include these features.
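For concreteness, here is a sketch of the kind of schema CSV cannot feed. The field names are illustrative, not from the original text; loading a table like this requires an input format that can express nesting, such as newline-delimited JSON.

```python
# Illustrative only: a schema with a REPEATED RECORD field. CSV has no way
# to express the nested, repeated structure of 'addresses'.
nested_config = {
    'schema': {
        'fields': [
            {'name': 'name', 'type': 'STRING'},
            {'name': 'addresses',          # an array of nested records
             'type': 'RECORD',
             'mode': 'REPEATED',
             'fields': [
                 {'name': 'city', 'type': 'STRING'},
                 {'name': 'zip', 'type': 'STRING'},
             ]},
        ]
    }
}
```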
To understand how CSV formatted data is turned into a BigQuery record,
consider the following concrete schema:
load_config['schema'] = {
    'fields': [
        {'name': 'string_f', 'type': 'STRING'},
        {'name': 'boolean_f', 'type': 'BOOLEAN'},
        {'name': 'integer_f', 'type': 'INTEGER'},
        {'name': 'float_f', 'type': 'FLOAT'},
        {'name': 'timestamp_f', 'type': 'TIMESTAMP'}
    ]
}
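For context, a schema assignment like this typically sits inside a larger load configuration. The sketch below is hedged: the key names follow the BigQuery jobs.insert "configuration.load" request body, and the project, dataset, and table values are placeholders, not names from the original text.

```python
# Hedged sketch of the surrounding load configuration. 'sourceFormat'
# selects the CSV input format discussed here; the destinationTable
# identifiers are placeholders.
load_config = {
    'sourceFormat': 'CSV',
    'destinationTable': {
        'projectId': 'my-project',
        'datasetId': 'my_dataset',
        'tableId': 'csv_example',
    },
}
# ... followed by the schema assignment shown above.
```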
Because the default mode for a field is NULLABLE, all the fields in this
schema are optional. Now look at a few lines of CSV to see how they are
transformed into records.
"one",true,1,1.0,2011-11-11 11:11:11
,,,,
"",false,,3.14e-1,1380378423
bare string ,"TRUE","0","0.000","2013-01-03
09:15:02.478 -05:00"
"quoted , and "" in a string",,,,
All the preceding lines import correctly into the table. The field mapping
is purely positional, so the order of the fields in the schema is
significant, and the order of the values in the CSV data must match it.
The first line illustrates the basic formatting of values. An unquoted empty
string is interpreted as a missing or null value, so the second line generates
a record with null values for every field. The fourth line shows
that quoting is optional for strings that do not contain the field-separating
character.
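The positional mapping can be sketched with Python's standard csv module. One caveat makes this a rough approximation: csv.reader cannot distinguish a quoted empty string ("") from an unquoted empty field, so the sketch treats both as null, whereas BigQuery treats only the unquoted form as null.

```python
import csv
import io

# Schema field names in order; values map to fields strictly by position.
FIELDS = ['string_f', 'boolean_f', 'integer_f', 'float_f', 'timestamp_f']

csv_data = '"one",true,1,1.0,2011-11-11 11:11:11\n,,,,\n'

records = []
for row in csv.reader(io.StringIO(csv_data)):
    # Approximate BigQuery's rule: an empty value becomes null.
    records.append({name: (value or None)
                    for name, value in zip(FIELDS, row)})
```

After running this, the second record holds None for every field, mirroring how the all-empty CSV line above produces a record of nulls.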