more akin to a BCP format file or a view than it is a table. (We will look at
the internal implementation of external tables shortly.)
It is important to note that the coupling between PDW and Hadoop is
very, very loose. The external table merely defines the interface for data
transmission; it does not contain any data itself nor does it bind itself to the
data in Hadoop. Changes to the structure of the data residing in Hadoop are
possible, and even the complete removal of the data is conceivable; PDW
would be none the wiser. In tech speak, unlike database views, there is
no schema-binding option available here. We cannot prevent a table from
disappearing in Hadoop by creating an external table. Likewise, when we
delete an external table, we do not delete the data from HDFS.
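As a minimal sketch of that last point, assuming the HDFS_FactInternetSales external table defined later in this section already exists, dropping it removes only the definition held in PDW:

```sql
-- Removes only the external table definition (metadata) from the PDW database.
-- The files in HDFS that the table pointed at are left in place.
DROP EXTERNAL TABLE [dbo].[HDFS_FactInternetSales];
```

Re-creating the external table with the same definition and location should make the data in Hadoop queryable again.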
Furthermore, no additional concurrency controls or isolation levels are in
operation when accessing data through an external table. Although queries are
validated against the existing external table object definitions, the
underlying data in Hadoop may have changed, so it is quite possible you will
see a runtime error when using Polybase. This is by design and helps Polybase
retain its agnostic approach to Hadoop integration.
External tables are exposed via the sys.external_tables catalog view and in
the SSDT tree control. The view inherits from sys.objects and exposes all the
configuration and connection metadata for the external table.
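A minimal sketch of such a lookup follows. It names only columns inherited from sys.objects; the additional configuration columns vary by product version, so they are not named here.

```sql
-- List every external table in the current database with its schema.
-- name, schema_id, create_date, and modify_date are inherited from sys.objects.
SELECT s.name AS schema_name,
       t.name AS table_name,
       t.create_date,
       t.modify_date
FROM sys.external_tables AS t
JOIN sys.schemas AS s
    ON s.schema_id = t.schema_id
ORDER BY s.name, t.name;

-- To see the version-specific configuration and connection columns as well:
SELECT * FROM sys.external_tables;
```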
Following is an example of an external table. Imagine if you will that the
data for the AdventureWorks table FactInternetSales existed in HDFS. We
are still bound by the restriction of having unique names for objects in the
same database, so for clarity I have named data residing in Hadoop with an
HDFS prefix:
CREATE EXTERNAL TABLE [dbo].[HDFS_FactInternetSales]
(
[ProductKey] int NOT NULL,
[OrderDateKey] int NOT NULL,
[DueDateKey] int NOT NULL,
[ShipDateKey] int NOT NULL,
[CustomerKey] int NOT NULL,
[PromotionKey] int NOT NULL,
[CurrencyKey] int NOT NULL,
[SalesTerritoryKey] int NOT NULL,
[SalesOrderNumber] nvarchar(20) NOT NULL,