Image Processing Reference
During fission events, if there are fewer flies than blobs, we update fly numbers. Thus, we
arrive at our count of flies. Each blob is also assigned a count of the number of flies it
contains, $B_N$.
4. Update statistics: Each fly is assigned a number of fly-specific statistics. These include a
unique index for each fly ($F_j$); fly area in pixels ($F_P$); and fly area from the fitted ellipse
($F_e = B_A B_B \pi$). Statistics are running averages, updated only when a fly is inferred to be in
a single-fly blob. An error parameter ($F_S$) is also updated to alert us when there is a
mismatch between observed blob properties and the inferred fly state. For instance, if the ratio
between $F_P$ and $F_e$ is much smaller than 0.9, there is a high likelihood the blob contains
multiple flies.
5. Resolve probable errors: For cases where the error deviance $F_S$ has grown too large, we attempt
to reduce mismatch between imputed fly and blob matches by imputing leaving events, for
evaluating group assignment.
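The statistics update in step 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the TABU implementation: the names `FlyStats` and `looks_multifly` are hypothetical, $B_A$ and $B_B$ are assumed to be the semi-axis lengths of the fitted ellipse, and the hard cut at 0.9 simplifies the "much smaller than 0.9" criterion in the text. How $F_S$ itself is accumulated is not specified, so it is left out.

```python
import math
from dataclasses import dataclass


@dataclass
class FlyStats:
    """Running per-fly statistics (hypothetical container, not TABU's own)."""
    index: int                 # F_j, unique fly index
    pixel_area: float = 0.0    # F_P, running mean of blob area in pixels
    ellipse_area: float = 0.0  # F_e, running mean of fitted-ellipse area
    n_updates: int = 0         # number of single-fly observations averaged

    def update(self, blob_pixels, semi_major, semi_minor, flies_in_blob):
        """Update the running means, but only for single-fly blobs."""
        if flies_in_blob != 1:
            return
        ellipse = semi_major * semi_minor * math.pi  # F_e = B_A * B_B * pi
        self.n_updates += 1
        # Incremental mean: new_mean = old_mean + (x - old_mean) / n
        self.pixel_area += (blob_pixels - self.pixel_area) / self.n_updates
        self.ellipse_area += (ellipse - self.ellipse_area) / self.n_updates


def looks_multifly(blob_pixels, semi_major, semi_minor, threshold=0.9):
    """Flag a blob whose pixel area is small relative to its fitted ellipse.

    The threshold value is an assumption taken from the 0.9 quoted in the text.
    """
    ratio = blob_pixels / (semi_major * semi_minor * math.pi)
    return ratio < threshold
```

A blob made of two touching flies tends to have a concave outline, so its fitted ellipse covers more area than its pixels do, which is why a low ratio hints at more than one fly.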
We have found that this method gives us correct fly counts in blobs >85% of the time, but is
subject to several systematic biases (see Section 3). For example, it deals poorly with occlusion
due to mounting, which may last for seconds, and mating, which lasts for up to 20 min. It also
may incorrectly infer several small flies instead of a single large fly. We therefore attempt a
subsequent analysis aimed at correcting these remaining biases using machine learning (ML).
2.1 ML in JAABA and Trajectory Scoring
Once TABU has been applied, the trajectories become compatible with JAABA, allowing us to
conveniently score behavior using its video annotation capabilities. We then fit various ML
algorithms. The first, GentleBoost, is natively implemented within JAABA. The others
(GradientBoost, logistic regression, and Support Vector Machine [SVM] with linear and Gaussian
kernels [lSVM and gSVM]) we implemented ourselves using the Python Scikit-learn [14] package.
For boosting, we use decision stumps as the weak rules, and to ensure fair comparison, default
parameter values were used for all other methods.
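The Scikit-learn models named above could be set up roughly as follows. This is a sketch, not the authors' code: the helper name `build_classifiers` is hypothetical, `max_depth=1` is one reading of "decision stumps as the weak rules" applied to GradientBoost, and using `SVC(kernel="linear")` rather than `LinearSVC` for lSVM is an assumption. Everything else is left at the library defaults, as the text describes.

```python
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC


def build_classifiers():
    """Return the non-JAABA classifiers, keyed by the labels used in the text."""
    return {
        # Depth-1 trees are decision stumps (assumed to apply to GradientBoost).
        "GradientBoost": GradientBoostingClassifier(max_depth=1),
        "logistic": LogisticRegression(),
        "lSVM": SVC(kernel="linear"),  # linear kernel
        "gSVM": SVC(kernel="rbf"),     # Gaussian (RBF) kernel
    }
```

Each model exposes the same `fit` and `predict` methods, so one loop over the dictionary can train and score all of them on identical data.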
Training of ML Algorithms: We used JAABA to calculate a number of internal single-frame
fly statistics, as well as multiframe window features. Window features are normalized to have
mean 0 and variance 1. It is these features that were used for the ML classifiers. Users define
behaviors, and score positive and negative cases for trajectories in the JAABA Graphical User
Interface (GUI), by observation in the video window. Since the ML algorithms are binary
classifiers, we scored instances of behavior as a binary outcome: Multifly = 1 for blobs labeled
as having more than one fly, Multifly = 0 otherwise; Sex = 0 for female (or 1 for male); and
Chase = 1 (or 0 for other behaviors).
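The normalization and binary coding described above can be illustrated with a short sketch. The function names are hypothetical, and the use of the population variance (dividing by the number of frames) is an assumption; JAABA's own normalization may differ in detail.

```python
import math


def zscore_columns(rows):
    """Scale each feature column to mean 0 and variance 1.

    `rows` is a list of per-frame feature vectors of equal length.
    """
    n = len(rows)
    scaled_columns = []
    for col in zip(*rows):
        mean = sum(col) / n
        var = sum((v - mean) ** 2 for v in col) / n  # population variance
        std = math.sqrt(var) or 1.0  # constant column: only center it
        scaled_columns.append([(v - mean) / std for v in col])
    return [list(r) for r in zip(*scaled_columns)]


def encode_multifly(flies_in_blob):
    """Multifly = 1 for blobs with more than one fly, 0 otherwise."""
    return 1 if flies_in_blob > 1 else 0
```

Putting all window features on a common scale matters most for the SVM and logistic regression models, whose fits are sensitive to feature magnitude; tree-based boosting is largely unaffected.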
We then fit ML classifiers using threefold cross-validation analysis in which the training
data uses the manual annotations that we input using JAABA. After fitting, the performance
of each model was evaluated using accuracy, specificity, sensitivity, precision, and area under
the curve. Here, accuracy is defined as the proportion of times that the fly state is correctly
called, for a total number of validation calls. All other performance measures follow the
usual statistical definitions. Sex and Multifly classifiers were trained on 4000 frames from a
single training video, and evaluated on 400 randomly picked frames. The Chase classifier was
trained on 2000 frames, and evaluated on 500. At the same time, using the Multifly classifier
we evaluated the performance of the TABU input trajectories by scoring whether our $B_N$
statistic accurately described blob fly count.
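For reference, the performance measures and the threefold split can be written out as below. This is a sketch using the usual statistical definitions; the function names, the random fold assignment, and the seed are assumptions, not details from the text. The area under the curve is computed with the rank-based (Mann-Whitney) formulation, which equals the area under the ROC curve.

```python
import random


def _ratio(num, den):
    """Safe division: undefined measures are reported as NaN."""
    return num / den if den else float("nan")


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and precision for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),  # correct calls / all calls
        "sensitivity": _ratio(tp, tp + fn),       # true positive rate
        "specificity": _ratio(tn, tn + fp),       # true negative rate
        "precision": _ratio(tp, tp + fp),         # positive predictive value
    }


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))


def threefold_indices(n, seed=0):
    """Yield (train, test) index lists for threefold cross-validation."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::3] for i in range(3)]
    for k in range(3):
        train = [i for j in range(3) if j != k for i in folds[j]]
        yield train, folds[k]
```

Each frame is held out for testing exactly once across the three folds, so every annotation contributes to both fitting and validation.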
Sex classification was performed after trajectory scoring, and incorporated both an ML
classifier (as above) and color information. Because the color scoring was more accurate than the