Amazon: S3/SimpleDB/Amazon RDS
Amazon Simple Storage Service (S3) is an online public storage web service offered
by Amazon Web Services. Conceptually, S3 is an infinite store for objects of
variable sizes. An object is simply a byte container which is identified by a URI.
Clients can read and update S3 objects remotely using a simple web services
(SOAP or REST-based) interface. For example, get(uri) returns an object and
put(uri, bytestream) writes a new version of the object. In principle, S3 can
serve as an online backup solution or as an archive for large objects that are
not frequently updated.
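The get/put semantics described above can be illustrated with a minimal in-memory sketch. This is not the S3 API: the class name ObjectStore, its methods, and the example URI are hypothetical stand-ins chosen only to show that an object is a byte container identified by a URI and that put replaces the previous version.

```python
class ObjectStore:
    """Hypothetical in-memory stand-in for an S3-like store: URI -> bytes."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, uri: str, data: bytes) -> None:
        # Writes a new version of the object; the old content is replaced.
        self._objects[uri] = bytes(data)

    def get(self, uri: str) -> bytes:
        # Returns the current version; raises KeyError if the URI is unknown.
        return self._objects[uri]


store = ObjectStore()
store.put("bucket/report.txt", b"first version")
store.put("bucket/report.txt", b"second version")
latest = store.get("bucket/report.txt")
```

After the second put, get returns only the most recent bytestream, which matches the "writes a new version of the object" behaviour described in the text.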
Amazon has not published details on the implementation of S3. However,
Brantner et al. [85] have presented initial efforts of building Web-based database
applications on top of S3. They described various protocols for storing, reading and
updating objects and indexes using S3. For example, the record manager component
is designed to manage records, where each record is composed of a key and payload
data. Both key and payload are bytestreams of arbitrary length where the only
constraint is that the size of the whole record must be smaller than the page size.
Physically, each record is stored in exactly one page which in turn is stored as a
single object in S3. Logically, each record is part of a collection (e.g., a table).
The record manager provides functions to create new objects, read objects, update
objects, and scan collections. The page manager component implements a buffer
pool for S3 pages. It supports reading pages from S3, pinning the pages in the buffer
pool, updating the pages in the buffer pool, and marking the pages as updated. All
these functionalities are implemented in a straightforward way, just as in any standard
database system. Furthermore, the page manager implements the commit and abort
methods where it is assumed that the write set of a transaction (i.e., the set of updated
and newly created pages) fits into the client's main memory or secondary storage
(flash or disk). If an application commits, all the updates are propagated to S3
and all the affected pages are marked as unmodified in the client's buffer pool.
Moreover, they implemented standard B-tree indexes on top of the page manager
and basic redo log records. On the other hand, there are many database-specific
issues that have not yet been addressed by this work. For example, DB-style strict
consistency and transaction mechanisms are not provided. Furthermore, query
processing techniques (e.g., join algorithms and query optimization techniques) and
traditional database functionalities such as bulk-loading a database, creating indexes, and
dropping a whole collection still need to be devised.
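The division of labour between the record manager and the page manager can be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of Brantner et al.: a plain dict stands in for S3, PAGE_SIZE is an arbitrary value, each collection is assumed to fit in a single page, abort is assumed to simply discard updated pages from the buffer pool, and all class and method names are hypothetical.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumed page size in bytes; the source gives no figure


@dataclass
class Page:
    uri: str
    records: dict = field(default_factory=dict)  # key bytes -> payload bytes
    dirty: bool = False                          # True once marked as updated


class PageManager:
    """Buffer pool over a dict that stands in for S3 (page URI -> records)."""

    def __init__(self, store: dict) -> None:
        self.store = store
        self.pool: dict[str, Page] = {}

    def read(self, uri: str) -> Page:
        # Fetch the page from the store on first access and keep it in the pool.
        if uri not in self.pool:
            self.pool[uri] = Page(uri, dict(self.store.get(uri, {})))
        return self.pool[uri]

    def mark_updated(self, page: Page) -> None:
        page.dirty = True

    def commit(self) -> None:
        # Propagate every updated page to the store, then mark it unmodified.
        for page in self.pool.values():
            if page.dirty:
                self.store[page.uri] = dict(page.records)
                page.dirty = False

    def abort(self) -> None:
        # Assumption: aborting drops updated pages so they are re-read later.
        for uri in [u for u, p in self.pool.items() if p.dirty]:
            del self.pool[uri]


class RecordManager:
    """Stores (key, payload) records of one collection in one page (assumed)."""

    def __init__(self, pages: PageManager, collection: str) -> None:
        self.pages = pages
        self.page_uri = f"{collection}/page0"  # hypothetical naming scheme

    def write(self, key: bytes, payload: bytes) -> None:
        if len(key) + len(payload) >= PAGE_SIZE:
            raise ValueError("record must be smaller than the page size")
        page = self.pages.read(self.page_uri)
        page.records[key] = payload
        self.pages.mark_updated(page)

    def read(self, key: bytes):
        return self.pages.read(self.page_uri).records.get(key)

    def scan(self) -> list:
        return sorted(self.pages.read(self.page_uri).records.items())


s3 = {}
pm = PageManager(s3)
users = RecordManager(pm, "users")

users.write(b"alice", b"payload-1")
visible_before_commit = "users/page0" in s3  # update is still only in the pool
pm.commit()

users.write(b"bob", b"payload-2")
pm.abort()  # the uncommitted record is discarded
```

In this sketch, updates reach the backing store only at commit, which mirrors the assumption in the text that the write set of a transaction is held on the client until the application commits.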
SimpleDB is another Amazon service which is designed for providing structured
data storage in the cloud and backed by clusters of Amazon-managed database
servers. It is a highly available and flexible non-relational data store that offloads
the work of database administration. Storing data in SimpleDB does not require
any pre-defined schema information. Developers simply store and query data items
via web services requests and Amazon SimpleDB does the rest. There is no rule
that forces every data item (data record) to have the same fields. However, the lack
of a schema also means that there are no data types, as all data values are treated as