Published software exists for similar purposes: there are tools that exist to track the positions and orientations of many flies in pretty much any type of arena you are likely to encounter (Ctrax is great for this) and to train classifiers on any type of behavior you have examples of (the powerful JAABA package). But these tools are overkill for my current application -- what I wanted was a small lightweight piece of code that would take in a raw movie of behaving flies, and some parameters to specify where the borders of my arena and regions of interest are, and return a plot showing PI over time for that movie. I know I'm not the first to write a tool that does this, but as far as I know the tools that have been written are not publicly shared, so I am giving a brief writeup in hopes that my work may be useful to someone else out there. Here is my solution.
First, I should say that I am acquiring data in a somewhat unusual format -- the acquisition software saves movies in the "micro fly movie format" (ufmf), which can be read by Ctrax and JAABA but not by Matlab. So, in order to read in my raw movies, I first needed a way to read them into Matlab. A colleague in my lab told me about a FIJI/ImageJ extension that reads ufmf movies, but in order to use this within Matlab one has to set up the MIJ package, and since I'd ideally like to be able to run my code on any Matlab machine without the minimum of extra dependencies to install. Looking around for other ways to read a ufmf into Matlab, I found that JAABA includes a function (ufmf2avi) that converts ufmf videos into avi files. I decided that this seemed like the way to go -- from any machine with Matlab one could easily install the JAABA package and use the ufmf2avi function to convert their movies (or they could use another solutions such as the FIJI/ImageJ converter), and then when they had their avi movies ready to go they could use my tool to compute PI over time. Or, if they acquire their movies into avi or some other format that is easily converted to avi, they would be ready to go. For my own ease of use, I set up a block of code that allows the user to select whether they wish to read in an avi or ufmf movie. If they select ufmf, JAABA's ufmf2avi is used to convert the movie to avi before processing (this requires the user to have JAABA fully installed and included on the Matlab search path). A note on avi: I felt avi was the best default input format for my tool, since it is the most common movie format that is supported by Matlab on all platforms.
Once an avi movie has been selected, the borders of the arena need to be determined. To do this, the first frame of the movie is loaded. For one of my typical movies, it may look like this:
Next, since I designed the tool with circular arenas in mind, imfindcircles is called to look for a large circle that defines the outer edge of the arena area. The user can adjust the range of radii that that are used by imfindcircles. A mask is then made from the found circle. Each frame will ultimately be multiplied by this circular mask to exclude everything outside the border from analysis.
In order to separate the flies from the background of the image, I wrote a small helper function called
flies_detect. This function is based on Matlab help's approach to detecting grains of rice on a nonuniform background, which works beautifully even when your grains of rice have wings (like my flies). Importantly, I am computing a new background image at each frame where a PI is computed, so this can be quite time consuming for long/high-frame-count videos. I've therefore given the user the ability to easily specify how often they want to compute PI. I have played around with computing the background only once for the entire movie, but found recomputing the background for each frame led to significantly better results that to me were worth the extra wait.
flies_detect. This function is based on Matlab help's approach to detecting grains of rice on a nonuniform background, which works beautifully even when your grains of rice have wings (like my flies). Importantly, I am computing a new background image at each frame where a PI is computed, so this can be quite time consuming for long/high-frame-count videos. I've therefore given the user the ability to easily specify how often they want to compute PI. I have played around with computing the background only once for the entire movie, but found recomputing the background for each frame led to significantly better results that to me were worth the extra wait.
After running flies_detect and applying masks for my regions of interest (which are diagonal pairs of quadrants, but this can be modified by the user for different regions/region shapes), the binarized images multiplied by the region masks are displayed, and look like this:
At this point, there was a choice to make between trying to count the individual flies or count the pixels that belong to each fly. Since overlapping flies can make counting flies automatically quite difficult, I was worried that the changing fly count over time would be likely to throw off my PI measurements at certain points in the video, and it would add a step that for PI calculation I don't need. To see this, you just need to understand that the PI is calculated as
PI = (Flies in region 1 - Flies in region 2)/(Total Flies)
which if there is a fairly consistent number of pixels per fly is almost exactly equal to
(Fly pixels in region 1 - Fly pixels in region 2)/(Total Fly pixels).
When the flies overlap, they usually only overlap by a few pixels (flies touching each other -- the arena is not deep enough for one fly to climb fully over another fly). So overlaps should impact the total number of pixels in a region less then they would the fly count, if contiguous regions of brightness were counted as a single fly. Note that although there are a few bright spots that show up that are not flies -- the center of the arena, and sometimes a thin strip at the top and bottom of the arena -- these are symmetric between our two regions of interest and therefore will cancel out in the PI calculation. Therefore I used the pixel-wise PI calculation, although this could be modified by users who need to know the exact number of flies in a given area at each frame.
And voila!
The code is here on my github. Please let me know if this is helpful to you or if you have any suggestions you'd like to share for improving this tool.
Thanks to Daisuke Hattori for sharing and discussing his solution to pixel-based PI calculation, and to Yoshinori Aso and the folks at Janelia for their help in getting my behavior arena up and running.