I want to stress that even when we are talking about technical clouds, there are a
number of them, and you should not make the mistake of thinking they play
particularly nicely with each other: they are all a bit different. In many cases it is
quite a pain, and quite a substantial, expensive, and time-consuming technical
enterprise, to migrate something from one cloud to another.
Furthermore, bandwidth is a big issue for data-intensive applications, the sort of
thing many of us think about in the world of scientific or scholarly data curation,
libraries, and the management of cultural materials. Suppose it takes you months to
get your data in, and further months to get it out, because you are constrained by
some limited bandwidth bottleneck at the entry point to the cloud. Or suppose you
discover that bandwidth in and out of a given cloud is very expensive (and hence the
switching costs to move from one cloud to another are very high), or that you face
these problems because you want to use data stored across multiple clouds. You
suddenly realize that this world of computational utilities we have been dreaming
about since the 1960s, and now, in some ways, can see manifest in the clouds, is still
a place with lots of barriers and bottlenecks, and many of them revolve around the
availability of bandwidth. You need to look very carefully at how things are
interconnected and where your data flows to and from.
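To make that concrete, here is a little back-of-the-envelope sketch in Python. The dataset sizes, link speeds, efficiency factor, and per-gigabyte egress price are my own illustrative assumptions, not figures from any particular provider.

```python
# Back-of-the-envelope estimate of how long it takes to move a dataset
# in or out of a cloud, and roughly what egress might cost.
# All figures (sizes, link speeds, price) are illustrative assumptions.

def transfer_days(size_tb: float, link_mbps: float, efficiency: float = 0.7) -> float:
    """Days to move size_tb terabytes over a link_mbps link.

    efficiency discounts protocol overhead and contention on the
    shared bottleneck at the cloud's entry point (assumed 70%).
    """
    bits = size_tb * 1e12 * 8                        # terabytes -> bits
    seconds = bits / (link_mbps * 1e6 * efficiency)  # bits / effective bits-per-second
    return seconds / 86_400

def egress_cost(size_tb: float, price_per_gb: float = 0.09) -> float:
    """Rough egress charge; $0.09/GB is an assumed order-of-magnitude rate."""
    return size_tb * 1000 * price_per_gb

for size_tb in (1, 50, 500):
    for link_mbps in (100, 1000, 10_000):
        print(f"{size_tb:>4} TB over {link_mbps:>6} Mb/s: "
              f"{transfer_days(size_tb, link_mbps):7.1f} days, "
              f"egress ~${egress_cost(size_tb):,.0f}")
```

Under these assumed rates, a 50 TB collection takes roughly two months each way over a 100 Mb/s path, and a 500 TB collection takes nearly two years, which is exactly the kind of switching cost I am talking about.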
So, for instance, you see commercial players like Amazon or Microsoft offering
compute-on-demand services. You might think these would be of great interest to
people doing high-end scientific computing who need extra cycles at peak load. They
are indeed of some interest, but until fairly recently there were real barriers: you did
not have direct connections and peering between the research and education networks
(where most of the data was housed) and these commercial clouds. A lot of research
computing, particularly computing with irregular demands for cycles driven by
observational campaigns or natural events that occur irregularly, turns out to be data
intensive as well; it is not like the transaction peak an Internet florist sees on
Mother's Day, hitting a fairly small commercial database already in the cloud. The
problem has been getting data to where the cycles are.
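One way to see why that peering matters: a burst of rented cycles only helps if staging the input data does not swamp the computation itself. The sketch below is a deliberately simplified model with hypothetical numbers, not a description of any real workload.

```python
# Simplified model: does bursting a data-intensive job to a remote cloud
# pay off once you account for staging the input data over the network?
# All parameters are hypothetical.

def burst_speedup(data_tb: float, link_gbps: float,
                  local_hours: float, cloud_hours: float) -> float:
    """Effective speedup of running in the cloud, staging time included."""
    staging_hours = (data_tb * 1e12 * 8) / (link_gbps * 1e9) / 3600
    return local_hours / (cloud_hours + staging_hours)

# A hypothetical observational campaign: 20 TB of input data,
# 100 hours to process locally, 10 hours on rented cloud capacity.
for link_gbps in (0.1, 1, 10):
    s = burst_speedup(20, link_gbps, local_hours=100, cloud_hours=10)
    print(f"{link_gbps:>5} Gb/s link: effective speedup {s:4.1f}x")
```

With the assumed 20 TB input, a 100 Mb/s path actually makes the cloud slower than staying local; only at the multi-gigabit rates that direct peering with the research networks can provide does the burst pay off.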
Another likely bottleneck (and we will come back to this a little later as well) is the
mismatch between the bandwidth available to an individual, what you can get at
home at some reasonable price from DSL, from cable broadband, or from fiber (if
you are lucky enough to have it), and the kind of bandwidth provisioning available to
support activities that take place inside a cloud. One implication of this mismatch is
that you cannot casually say, "I'm going to do a really big data extract and move it
down to my desktop and deal with it." You cannot do that, not because your desktop
machine is not big enough or because you lack local disk storage, but because in
many countries the policies surrounding the deployment of consumer broadband
have left us in situations where that broadband is really not very broad, and it is
probably not going to get much better anytime soon; that is unquestionably the case
in much of the United States right now. There are some wild cards in the States, like
Google's threat (or promise) to go out and connect every home in a couple of select
cities with direct fiber and really fast, inexpensive connections, but for most people it
looks pretty grim for the next few years.