Legacy file storage systems do not expose the metadata of a file other than what the
operating system reveals. The concept of defining custom metadata for a file is also alien
to legacy file systems.
Data/Blob
In the world of structured data, a blob is considered a lump of raw data that cannot be
operated on. Even though there is structure in every type of data that exists today, the
definition of structure in terms of data is limited to data elements that can be operated on by a
database system. A list of patient names, their addresses, and SSNs would be an example of
structured data, whereas X-ray files, 3D scan files, and such visual data would be classified
as unstructured data.
Any database system would be able to run SQL queries on the structured data and treat
unstructured data as a lump that can be identified by some form of associated structured
data, like an ID or patient name in this case. A database cannot “sort” X-ray images by
their brightness or some other image-related characteristic, but it can always sort a
patient list in ascending order of SSN.
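The distinction can be sketched in a few lines of Python. This is only an illustration: the `PatientRecord` type, the placeholder SSNs, and the fake image bytes are assumptions made for the example, not part of any real system.

```python
from dataclasses import dataclass


@dataclass
class PatientRecord:
    """Hypothetical record mixing structured fields with an opaque blob."""
    name: str
    ssn: str      # structured: can be compared, filtered, and sorted
    xray: bytes   # unstructured: an opaque lump as far as queries go


# Placeholder data; the bytes stand in for real image files.
records = [
    PatientRecord("Patient B", "900-00-0002", b"\x89PNG...raw image bytes"),
    PatientRecord("Patient A", "900-00-0001", b"\x89PNG...other image bytes"),
]

# Sorting operates on the structured field only.
by_ssn = sorted(records, key=lambda r: r.ssn)

# The blob is reachable only through its associated structured key.
xray_by_ssn = {r.ssn: r.xray for r in records}
```

Nothing here inspects the image content; the blob is simply carried along and looked up by the SSN attached to it.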
An RDBMS is one of the most inefficient ways to store unstructured data. Consider a
typical example of a patient table with the following columns:
serial_no
checkin_date
SSN
treatment_id
Suppose you are building an online electronic medical record (EMR) application and
you have to keep a record of every test and its result, including all imagery/audio data of
the patient within the system, and be able to fetch them on demand. This would mean that
you insert all this “unstructured” data as blobs into the database and hurt the overall query
performance, because a single row of maybe a few kilobytes has now bloated to anywhere
from a few megabytes to hundreds of megabytes or even gigabytes. This is the most
inefficient but also a highly convenient way to do it, because all the information for any given
patient is centralized in the database. Databases were simply not made to embrace
unstructured data; blob support was always an afterthought, or a default option for building your
application quickly at the expense of performance.
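The row bloat described above can be illustrated with a minimal sketch using Python's built-in `sqlite3` module. The table names `patient_inline` and `patient_ref`, the `scan` and `scan_key` columns, the sample values, and the key format are all assumptions made for this example; the second table shows the common alternative of keeping only a reference to a blob stored elsewhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Variant 1: the scan is stored inline as a BLOB column.
conn.execute("""
    CREATE TABLE patient_inline (
        serial_no    INTEGER PRIMARY KEY,
        checkin_date TEXT,
        ssn          TEXT,
        treatment_id INTEGER,
        scan         BLOB
    )""")

# Variant 2: the row holds only a key that points to the scan,
# which would live in an external object store (hypothetical here).
conn.execute("""
    CREATE TABLE patient_ref (
        serial_no    INTEGER PRIMARY KEY,
        checkin_date TEXT,
        ssn          TEXT,
        treatment_id INTEGER,
        scan_key     TEXT
    )""")

# 1 MiB of zero bytes standing in for a real image file.
fake_scan = bytes(1024 * 1024)

conn.execute(
    "INSERT INTO patient_inline VALUES (1, '2024-01-15', '900-00-0001', 42, ?)",
    (fake_scan,),
)
conn.execute(
    "INSERT INTO patient_ref VALUES (1, '2024-01-15', '900-00-0001', 42, ?)",
    ("scans/1/xray-001",),
)

# Compare how many bytes/characters each row carries for the scan.
inline_size = conn.execute("SELECT length(scan) FROM patient_inline").fetchone()[0]
ref_size = conn.execute("SELECT length(scan_key) FROM patient_ref").fetchone()[0]
print(inline_size, ref_size)
```

The structured columns are identical in both tables; only the inline variant forces the database to carry the full payload in the row.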
Within the cloud, your data is considered a blob and put into the same pool as any other
structured or unstructured data element. The recommended practice, though, is to store
structured data in document-based storage and route unstructured data to Amazon S3,
Microsoft Azure Storage, or Google Cloud Storage.
Extended Metadata
As we discussed previously, attaching metadata to an object, both system-generated
metadata and custom key-value metadata created by the user, adds information to it,
which is especially useful for unstructured data objects. Extended metadata primarily consists
of a unique identifier for every single object in the pool of the cloud vendor, which helps in
fetching the object regardless of where it's physically located.
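As a rough illustration of the idea, the sketch below models an in-memory object pool in Python. It is not any vendor's API: `ObjectPool`, `StoredObject`, and the metadata keys are hypothetical names chosen for the example. Each stored object gets a generated unique identifier, a small set of system-generated metadata, and whatever custom key-value metadata the caller supplies.

```python
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class StoredObject:
    """Hypothetical stored object: payload plus two kinds of metadata."""
    data: bytes
    system_metadata: dict                                 # generated by the store
    custom_metadata: dict = field(default_factory=dict)   # supplied by the user


class ObjectPool:
    """Toy stand-in for a vendor's object pool, keyed by unique ID."""

    def __init__(self):
        self._objects = {}

    def put(self, data, custom_metadata=None):
        # The unique identifier is what callers use to fetch the object;
        # they never need to know where it is physically stored.
        object_id = str(uuid.uuid4())
        self._objects[object_id] = StoredObject(
            data=data,
            system_metadata={
                "size": len(data),
                "created": datetime.now(timezone.utc).isoformat(),
            },
            custom_metadata=dict(custom_metadata or {}),
        )
        return object_id

    def get(self, object_id):
        return self._objects[object_id]


pool = ObjectPool()
oid = pool.put(b"raw x-ray bytes", {"patient": "Patient A", "modality": "x-ray"})
obj = pool.get(oid)
```

The caller keeps only `oid`; the custom metadata travels with the object and can describe content that the store itself cannot interpret.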