Advances in Multimodal Tracking of Driver Distraction - Digital Signal Processing for In-Vehicle Systems and Safety

From these continuous streams of data, we estimate the derivative of the brake and

gas pedal information. In addition, we estimate the jitter in the steering wheel angle,

since we expect that drivers involved in secondary tasks will produce more “jittery”

behaviors. Vehicle speed is also considered, since it is hypothesized that drivers

tend to reduce the speed of the car when they are engaged in a secondary task.
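
The pedal derivative and steering jitter described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the text does not define jitter precisely, so the `jitter` function below (the spread of the sample-to-sample rate of change) is a hypothetical stand-in, and the function names and the `rate_hz` parameter are assumptions.

```python
from statistics import pstdev


def derivative(samples, rate_hz):
    """Finite-difference derivative of a uniformly sampled signal."""
    return [(b - a) * rate_hz for a, b in zip(samples, samples[1:])]


def jitter(samples, rate_hz):
    """Hypothetical jitter measure (assumption, not from the source):
    population standard deviation of the signal's rate of change."""
    return pstdev(derivative(samples, rate_hz))


# A steadily turning wheel has no jitter; an oscillating one does.
smooth = [0, 1, 2, 3]
shaky = [0, 1, 0, 1, 0]
print(jitter(smooth, rate_hz=10), jitter(shaky, rate_hz=1))
```

Under this assumed definition, a smooth steering ramp scores zero, while rapid back-and-forth corrections raise the score.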

Frontal Video Camera: The camera captures frontal views of the drivers. From

this modality, we estimate head orientation and eye closure count. The head pose is

described by the yaw and pitch angles. Head roll movement is hypothesized to be

less important, given the considered secondary tasks. Therefore it is not included in

the analysis. The eye closure percentage is defined as the percentage of

frames in which the eyelids are lowered below a given threshold. This threshold

is set at the point where the eyes are looking straight at the frontal camera. These

variables are automatically extracted with the AFECT software [30]. Previous

studies have shown that this toolkit is robust across large datasets and different

illumination conditions. Another advantage of this toolkit is that the information is

independently estimated frame by frame. Therefore, the errors do not propagate

across frames. Unfortunately, some information is lost when the head is rotated

beyond a certain degree or when the face is occluded by the driver's hands. The

algorithm produces empty data in those cases.
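
As a rough sketch of the eye closure percentage, the function below counts the frames in which an eyelid-openness value falls below a threshold. The openness scale, the use of `None` to represent frames where the toolkit produces empty data, and the choice to exclude those frames from the denominator are all assumptions made for illustration; the source does not specify how empty frames are handled.

```python
def eye_closure_percentage(eyelid_openness, threshold):
    """Percentage of valid frames whose eyelid openness is below `threshold`.

    `eyelid_openness` holds one value per frame; `None` marks a frame with
    no estimate (assumed representation of the toolkit's empty data).
    Returns None when no frame in the window has a valid estimate.
    """
    valid = [v for v in eyelid_openness if v is not None]
    if not valid:
        return None
    closed = sum(1 for v in valid if v < threshold)
    return 100.0 * closed / len(valid)


frames = [0.9, 0.2, None, 0.1, 0.8]  # hypothetical openness values
print(eye_closure_percentage(frames, threshold=0.5))
```

Because each frame is estimated independently, a missing frame only shrinks the set of valid frames and does not affect its neighbors.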

Microphone Array: The acoustic information is a relevant modality for

secondary tasks characterized by sound or voice activity such as GPS Following,

Phone Talking, Pictures, and Conversation. Here, we estimate the average audio

energy from the microphone that is closest to the driver.
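
One common way to compute an average audio energy is the mean of the squared samples. The sketch below assumes that definition, which the text does not spell out; the function name is hypothetical.

```python
def average_energy(samples):
    """Mean squared amplitude of an audio segment (assumed energy definition)."""
    if not samples:
        return 0.0
    return sum(s * s for s in samples) / len(samples)


print(average_energy([1, -1, 2, -2]))
```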

The proposed monitoring system segments the data into small windows (e.g., 5 s),

from which it extracts relevant features. We estimate the mean and standard

deviation of each of the aforementioned signals, which are used as features. Details of other

preprocessing steps are described in Jain and Busso [5].
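
The windowing step can be illustrated with a short sketch. It assumes non-overlapping windows, the population standard deviation, and that a trailing partial window is discarded; none of these choices is stated in the text, and the function and parameter names are hypothetical.

```python
from statistics import mean, pstdev


def window_features(samples, rate_hz, window_s=5.0):
    """Split a signal into consecutive windows and return (mean, std) pairs.

    Windows do not overlap and an incomplete final window is dropped
    (both are assumptions made for this illustration).
    """
    size = int(round(rate_hz * window_s))
    if size <= 0:
        raise ValueError("window must contain at least one sample")
    features = []
    for start in range(0, len(samples) - size + 1, size):
        window = samples[start:start + size]
        features.append((mean(window), pstdev(window)))
    return features


# Hypothetical speed signal sampled at 1 Hz, split into 2 s windows.
print(window_features([0, 2, 4, 6, 7], rate_hz=1, window_s=2))
```

Each signal (pedal derivatives, steering jitter, speed, head yaw and pitch, audio energy) would contribute one mean and one standard deviation per window.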

After the multimodal features are estimated, we compare their values under task

and normal conditions. Notice that segments of the road have different speed limits

and number of turns. Therefore, the features observed when the driver was engaged

in one task (first lap, Sect. 18.3) are only compared with the data collected when

the driver was not performing any task over the same route segment (second lap,

Sect. 18.3). This approach reduces the variability introduced by the route.

We conducted a statistical analysis to identify features that change their values

when the driver is engaged in secondary tasks. A matched pair hypothesis test is

used to assess whether the differences in the features between each task and the

corresponding normal condition are significant. We used matched pairs instead of

independent samples because we want to compensate for potential driver

variability. For each feature f, we have the following hypothesis test [31]:

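\[
\begin{aligned}
H_0 &: \mu^{f}_{\text{normal}} - \mu^{f}_{\text{task}} = 0 \\
H_1 &: \mu^{f}_{\text{normal}} - \mu^{f}_{\text{task}} \neq 0
\end{aligned}
\tag{18.1}
\]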
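
A matched pair test of this form is often carried out as a paired t-test on the per-driver differences. The sketch below assumes that choice, which the text does not confirm, and computes only the test statistic and degrees of freedom using the standard library. The function name and the sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(normal, task):
    """Paired t statistic for H0: mean(normal - task) = 0.

    `normal` and `task` hold one value of feature f per driver, in the
    same driver order. Returns (t, degrees_of_freedom).
    """
    if len(normal) != len(task):
        raise ValueError("both conditions need one value per driver")
    diffs = [n - t for n, t in zip(normal, task)]
    count = len(diffs)
    if count < 2:
        raise ValueError("at least two paired observations are required")
    spread = stdev(diffs)  # sample standard deviation of the differences
    if spread == 0:
        raise ValueError("differences have zero variance")
    return mean(diffs) / (spread / sqrt(count)), count - 1


# Hypothetical mean speeds for four drivers under each condition.
t_value, dof = paired_t_statistic([5, 7, 9, 6], [4, 5, 6, 5])
print(t_value, dof)
```

The statistic would then be compared against a two-sided critical value of the t distribution with the returned degrees of freedom. If SciPy is available, `scipy.stats.ttest_rel` performs the same paired test and also returns a p-value.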
