Table 12.1 A sample of real moving object data showing non-constant sampling rate

Id    Timestamp            Location-long  Location-lat
2635  1997-07-24 20:50:00  149.007        63.809
2635  1997-07-24 21:23:35  148.897        63.766
2635  1997-07-27 22:30:23  148.967        63.824
2635  1997-07-31 02:52:48  149.026        63.803
2635  1997-08-03 01:47:04  149.046        63.795

ID-based spatiotemporal data are essentially trajectories. A tracking device is
attached to a moving object. For example, scientists can attach sensors to animals'
bodies and use GPS to track them; cellphone data can reveal an individual person's
movement; and GPS units embedded in cars can track a vehicle's movement. Suppose we
have trajectories of n moving objects {o_1, o_2, ..., o_n}. Each trajectory is
represented as a sequence of points (x_1, y_1, t_1), (x_2, y_2, t_2), ..., (x_m, y_m, t_m),
where (x_i, y_i) is a location (longitude and latitude) and t_i is the time when
location (x_i, y_i) is recorded. Trajectory data could contain a large set of moving
objects, and the tracking time for moving objects could span several years.
Location-based spatiotemporal data are temporal data collected at fixed
locations. The tracking devices (i.e., sensors) are fixed at certain locations. For
example, sensors embedded in the road can track the speed and volume of traffic, and
sensors installed at various locations can track weather information such as
temperature, wind speed, and humidity. A set of properties is associated with each
location (x, y) at time t. We use f(x, y, t, p) to denote the value of property p at
location (x, y) at time t.
In this chapter, we will focus on ID-based spatiotemporal data (i.e., trajectories).
We will mainly discuss the patterns of animal and human movement data.
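
The two kinds of data described above could be held in memory roughly as follows. This is only a minimal sketch: the names Trajectory, PropertyTable, and f's table argument are illustrative assumptions, not structures defined in the chapter, and the temperature reading is a made-up value used only to show the lookup.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Tuple


@dataclass
class Trajectory:
    """ID-based data: one moving object and its time-ordered points."""
    object_id: int
    # Each point is (x, y, t): longitude, latitude, recording time.
    points: List[Tuple[float, float, datetime]] = field(default_factory=list)

    def add(self, x: float, y: float, t: datetime) -> None:
        self.points.append((x, y, t))
        self.points.sort(key=lambda p: p[2])  # keep points ordered by time


# Location-based data: f(x, y, t, p) stored as a lookup table.
# Keying the table by (x, y, t, p) is an assumption made for illustration.
PropertyTable = Dict[Tuple[float, float, datetime, str], float]


def f(table: PropertyTable, x: float, y: float, t: datetime, p: str) -> float:
    """Value of property p at location (x, y) at time t."""
    return table[(x, y, t, p)]


# Two points of object 2635 from Table 12.1, added out of order on purpose.
traj = Trajectory(object_id=2635)
traj.add(148.897, 63.766, datetime(1997, 7, 24, 21, 23, 35))
traj.add(149.007, 63.809, datetime(1997, 7, 24, 20, 50, 0))

# Hypothetical sensor reading, not taken from the source.
readings: PropertyTable = {
    (149.0, 63.8, datetime(1997, 7, 24, 21, 0), "temperature"): 12.5,
}
```

Keeping the points sorted by time means the sequence can be read directly as (x_1, y_1, t_1), ..., (x_m, y_m, t_m), with m varying from object to object.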
2.2 Data Preprocessing
Raw trajectory data are unevenly sampled and may contain long periods of
missing data. Table 12.1 shows a sample of raw trajectory data. As the table shows,
the data are sampled at uneven intervals, and consecutive points can be about three
days apart. Depending on the tracking scenario, the sampling interval of movement
data can vary from seconds to days. For bird tracking, the data may be sampled every
3-5 days in order to save battery and extend the tracking period to several years.
For vehicles, the sampling interval can be as short as a few seconds. For mobile
phone users, a point is reported only when the phone connects to a cellphone tower.
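
To make the unevenness concrete, the gaps between consecutive samples in Table 12.1 can be computed directly. The following is a small sketch using only the Python standard library; the function name sampling_gaps is an illustrative choice, and the timestamps are copied from the table.

```python
from datetime import datetime

# Timestamps of object 2635, copied from Table 12.1.
timestamps = [
    "1997-07-24 20:50:00",
    "1997-07-24 21:23:35",
    "1997-07-27 22:30:23",
    "1997-07-31 02:52:48",
    "1997-08-03 01:47:04",
]


def sampling_gaps(stamps):
    """Return the time gap between each pair of consecutive samples."""
    times = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S") for s in stamps]
    return [later - earlier for earlier, later in zip(times, times[1:])]


gaps = sampling_gaps(timestamps)
for gap in gaps:
    print(gap)
print("shortest:", min(gaps), "longest:", max(gaps))
```

For this sample the gaps range from about 34 minutes to a little over 3 days, which is the non-constant sampling rate the table is meant to illustrate.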
Most trajectory mining methods assume that the data are evenly sampled. A simple
and commonly used preprocessing step is to apply linear interpolation so that the
data become evenly spaced in time. If two consecutive points in a trajectory are
separated by a long time period, linear interpolation may introduce large errors.
For example, suppose one data point of a human trajectory is at home at 9 p.m. on
Monday and the next point is at home at 10 p.m. on Wednesday. If we use 1 h to linearly interpolate