Visual Data Exploration Using Webbles
Jonas Sjobergh and Yuzuru Tanaka
Meme Media Lab., Hokkaido University, Japan
{js,tanaka}@meme.hokudai.ac.jp
Abstract. We describe a system for visual exploration of data built using
pluggable software components called Webbles. The system specifies
a small common interface, a set of slots the plugins are expected to have,
and any Webble following this interface can be plugged in at runtime.
The system contains several types of visualization components, some
built from scratch, some built by writing Webble wrappers for existing
software, and some built by writing small interface wrappers for existing
Webbles. The visualization components allow for interactive exploration
of data, and selections or groupings of data in one visualization component
are propagated to other components automatically. Interaction is
done through direct manipulation of the visualization results.
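
The plug-in idea can be illustrated with a minimal Python sketch: a host checks that a component exposes a small set of slots before accepting it at runtime, and a selection made anywhere is pushed to every plugged-in component. The slot names `data` and `selection` and the classes `Component` and `Dashboard` are assumptions made for illustration only; they are not the actual Webble interface, which is not specified in this section.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical slot names, chosen for illustration; not the real Webble slots.
REQUIRED_SLOTS: Set[str] = {"data", "selection"}


class Component:
    """Stand-in for a pluggable visualization component with named slots."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.slots: Dict[str, object] = {"data": [], "selection": set()}

    def set_slot(self, slot: str, value: object) -> None:
        self.slots[slot] = value


class Dashboard:
    """Accepts any object exposing the required slots and syncs selections."""

    def __init__(self, data: Iterable[float]) -> None:
        self.data: List[float] = list(data)
        self.components: List[Component] = []

    def plug_in(self, component: Component) -> None:
        # Only the presence of the slots is checked, not the component's type.
        missing = REQUIRED_SLOTS - set(getattr(component, "slots", {}))
        if missing:
            raise ValueError(f"missing slots: {sorted(missing)}")
        component.set_slot("data", self.data)
        self.components.append(component)

    def select(self, indices: Iterable[int]) -> None:
        # Propagate the same selection to every plugged-in component.
        selection = set(indices)
        for component in self.components:
            component.set_slot("selection", set(selection))


dashboard = Dashboard([3.0, 1.5, 4.2])
histogram, city_map = Component("histogram"), Component("map")
dashboard.plug_in(histogram)
dashboard.plug_in(city_map)
dashboard.select({0, 2})  # e.g. the user brushes two bars in the histogram
```

In this sketch the host depends only on the slot names, so a wrapper around existing software would merely need to expose the same two slots to be accepted.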
1 Introduction
Nowadays very large collections of data are common in many areas. Sometimes
the real-world system generating the data is difficult to model, and which
parameters influence the model, and in what way, may not be known in detail.
Thus, even though we can collect lots of data that probably contains
information we are interested in, using this data to predict what we want to
know may still be difficult.
One example from our research is snow plowing and snow removal in Sapporo,
a city with almost two million citizens that gets six meters of snow per year. The
cost of removing snow is around 15,000,000,000 yen per year (150 million
dollars). Using modern information technology, it may be possible to decrease cost,
improve quality, or remove snow in a more environmentally friendly way.
We have data from many sources: weather stations (temperatures, snow fall,
wind, etc.); taxis and private cars regularly recording speed and position; snow
plowing and snow removal records; logs from call centers for snow-related
problems; traffic jam sensors; traffic accident reports; bus traffic logs; social media
like Twitter (are people talking about problems with snow?), and more. A lot
of these data are likely to be related to snow removal, e.g. fewer complaints when
snow removal works well or lower average speeds of cars and buses when snow
is not removed. The system is of course very complex, though, and it is difficult
to model what factors will have what impact on e.g. traffic conditions.
The data is very high dimensional, very sparse (e.g. a few thousand cars with
sensors driving over 100,000 road segments means low coverage), and often of low
quality. There are sensor failures leaving “holes” in the data, manually entered
data (e.g. call center complaints) with input errors, and unreliable data sources,
 