Databases Reference
In-Depth Information
You do not do anything to identify a distributed query as different in the
SELECT statement. The difference is that you specify data sources located on
different database servers. Data sources must be fully qualified, at least to the
point that the database server can uniquely identify and locate the source. How-
ever, depending on the remote sources, special clauses or functions may be
needed to access the remote servers and retrieve the necessary data.
The server executing the query is also the controlling server. It issues the
commands necessary to retrieve the data from the remote databases. The specifics
of this process are somewhat transparent to the user. Once the controlling server
has all of the data that it needs cached in local memory, it completes the process
as if running a local query.
As with distributed transactions, this process depends on well-connected
servers to keep performance at an acceptable level. It is also suggested that you
filter the results at the source servers to minimize the amount of data trans-
ferred. This is another situation where views and UDFs can be helpful in man-
aging and optimizing the process. The queries may also require special secu-
rity configurations to enable connectivity between the servers involved in the
query.
Some DBMSs, like SQL Server, require you to fully qualify each data source
so that it is uniquely identified. Other DBMSs do not require you to fully qual-
ify the tables and columns used in a distributed join, instead relying on a dis-
tributed data dictionary to locate the table data. Even then, using fully qualified
object names is still an option and is the preferred way of running the join. One
potential problem with this is that if there are any variations in the data in the
different tables, such as might occur because of replication latency, you might
not get the results you expect from your query.
Managing Management Issues
Managing and administering a distributed data environment often requires a bal-
ancing act between local autonomy and company standards. In order to prevent
complete chaos, it is necessary for someone to have final say over the data envi-
ronment as a whole, even if various administrative duties are delegated to local
administrators.
The two most common management models are a centralized management
model and a localized management with central oversight model. The central-
ized management model is the easiest one to understand and is the least likely
to result in administrative errors. In this model, all database servers, no matter
where they are physically located, are administered from a central location using
remote management tools. This is possible because most modern DBMSs ship
with embedded remote management utilities. SQL Server Management Studio,
for example, lets you register any number of local and remote servers for man-
agement and administration purposes. The primary advantage of this model is that
one person, or one administration group, has total control. There is no question
Search WWH ::




Custom Search