Trajectory evaluation and behavioral scoring using JAABA in a noisy system - Emerging Trends in Image Processing, Computer Vision, and Pattern Recognition


between bias and fly number in our estimates (Figure 1). While the raw output overestimates the number of flies on a patch at low fly numbers, it tends to underestimate fly numbers when there are more flies on a patch (Blob bias: est = −0.034, df = 442,173, t = −309.1, P < 0.001). However, TABU does show evidence of a consistent bias towards over-counting, which becomes slightly stronger at high numbers of flies (TABU bias: est = 0.0075375, df = 495,300, t = 71.67, P < 0.001). Application of the TABU algorithm reduces the number of spurious patch joining and leaving events to about 30% of that in the raw blob data (Table 1). However, even for the TABU output, the number of inferred joining and leaving events is still more than twice the actual number, offering potential for improvement through subsequent application of ML.
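To make the notion of spurious joining and leaving events concrete, the following is a minimal sketch, assuming that an event is inferred whenever the per-frame fly count on a patch changes between consecutive frames. The function name `count_events` and both count series are hypothetical illustrations, not the procedure or data used in the study.

```python
def count_events(counts):
    """Count joining and leaving events in a per-frame series of fly counts.

    Assumption: an increase of k flies between consecutive frames is
    k joining events, and a decrease of k flies is k leaving events.
    """
    joins = 0
    leaves = 0
    for prev, curr in zip(counts, counts[1:]):
        delta = curr - prev
        if delta > 0:
            joins += delta
        elif delta < 0:
            leaves += -delta
    return joins, leaves


# Hypothetical ground truth: one fly joins the patch, later one fly leaves.
truth = [2, 2, 2, 3, 3, 3, 3, 2, 2, 2]
# Hypothetical noisy counts for the same frames: flicker adds extra changes.
noisy = [2, 1, 2, 3, 3, 4, 3, 2, 3, 2]

true_joins, true_leaves = count_events(truth)
noisy_joins, noisy_leaves = count_events(noisy)

print("true events: ", true_joins + true_leaves)
print("noisy events:", noisy_joins + noisy_leaves)
```

In this toy series, single-frame miscounts inflate the inferred event total well beyond the true total, which is the kind of excess that trajectory correction is meant to reduce.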

FIGURE 1 Heat map of the distribution of per-fly over- and under-counts (D) as a function of the number of flies on a patch for each frame across five test videos.
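The relationship between counting bias and fly number can be illustrated with a simple least-squares slope. The sketch below assumes that the per-fly miscount is D = (estimated − true) / true, which is our reading of "per-fly over- and under-counts" and may differ from the exact definition used for the figure. The data values and the helper `least_squares_slope` are hypothetical.

```python
def least_squares_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx


# Hypothetical frames: true number of flies on a patch and the estimate.
true_counts = [1, 2, 4, 5]
estimated_counts = [2, 2, 3, 4]

# Assumed definition of the per-fly miscount D (positive = over-count).
per_fly_error = [(est - true) / true
                 for est, true in zip(estimated_counts, true_counts)]

slope = least_squares_slope(true_counts, per_fly_error)
print("slope of D against fly number:", round(slope, 3))
```

A negative slope in this toy example corresponds to over-counting at low fly numbers and under-counting at high fly numbers; a positive slope would indicate over-counting that grows with fly number.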

We now investigate whether application of ML methods to our TABU trajectories can identify miscalled blob counts B_N. Threefold cross-validation model-fit results are shown in Table 2. Here, algorithms were trained using a period of 10K frames. We see that all models have an accuracy above 0.98. The two SVM models rank highly on almost all metrics, while logistic regression ranks poorly on most metrics. While JAABA is not top ranked on any metric, we note that it performs very well overall.
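For readers unfamiliar with threefold cross-validation, the following is a minimal sketch of how labeled frames can be partitioned so that each frame is held out exactly once. It assumes contiguous, unshuffled folds; the text does not state how folds were actually drawn, and `kfold_indices` is a hypothetical helper.

```python
def kfold_indices(n_samples, k=3):
    """Yield (train, test) index lists for k contiguous folds.

    Assumption: folds are contiguous blocks of frames with no shuffling.
    Earlier folds receive one extra sample when n_samples is not
    divisible by k.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Ten hypothetical labeled frames split into three folds.
folds = list(kfold_indices(10, k=3))
for train, test in folds:
    print("train on", train, "test on", test)
```

Each model would be fitted on the training indices and scored on the held-out indices, with the scores then summarized across the three folds.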

Table 2 Performance Measures of ML Algorithms for Multifly Calling on Threefold Cross Validation

Algorithm      Accuracy  Sensitivity  Specificity  Precision  AUC
JAABA          0.994(2)  0.994(3)     0.994(2)                -
GradientBoost  0.988(5)  0.989(5)     0.987(3)     0.987(5)   0.994(4)
Logistic       0.989(4)  0.993(4)     0.984(5)     0.985(3)   0.997(3)
lSVM           0.991(3)  0.996(1)     0.985(4)     0.986(4)   0.998(2)
gSVM           0.995(1)  0.996(2)     0.995(1)                0.998(1)

The accuracy, sensitivity, specificity, precision, and area under the curve (AUC) scores are shown for each algorithm. Ranks among ML methods for each performance score are given in brackets.
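The threshold-based scores in Table 2 follow from the counts of a binary confusion matrix, and the bracketed ranks from ordering the algorithms by score. The sketch below shows both steps under stated assumptions: the confusion counts are hypothetical, the positive class is assumed to be a miscalled (multifly) frame, and the helper names are our own. Only the accuracy values are taken from Table 2.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard scores from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "precision": tp / (tp + fp),     # positive predictive value
    }


def rank_descending(scores):
    """Rank names by score, with 1 as the highest score (ties not handled)."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


# Hypothetical confusion counts for one classifier on one fold.
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
print(metrics)

# Accuracy values as reported in Table 2.
accuracy = {
    "JAABA": 0.994,
    "GradientBoost": 0.988,
    "Logistic": 0.989,
    "lSVM": 0.991,
    "gSVM": 0.995,
}
ranks = rank_descending(accuracy)
print(ranks)
```

The ranks computed from the accuracy column reproduce the bracketed ranks in the table. Columns in which two algorithms share the same rounded score would need the unrounded values to break the tie.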

The critical practical question is whether models trained on one part of a video will be equally effective when applied to later periods of the same video, or to completely new videos. Fly behavior is known to change over time, and varies among different genotypes and in different social contexts. We tested the performance of all algorithms on four videos that were not used in training. These included different genotypes and sex ratios, as well as slightly different lighting and focus, from those the algorithms were trained on. Results are shown in Table 3. The performance of all ML methods dropped slightly under these new conditions. All the ML methods improved upon the trajectory input data from TABU. The performance
