There are a lot of ways to decide, statistically, what counts as "positive" in flow data. All available methods begin by binning the data - counting the number of events that fall into discrete ranges. This can be visualized in one dimension as a histogram. The X-axis units are bins, and the Y-axis is the number of cells in each bin. FlowJo divides the data into 1024 bins of equal size by default. To test whether two samples have statistically different distributions, or to figure out where "negative" ends and "positive" begins using a control for reference, one must eliminate differences caused simply by having more events in one tube than the other. This can be achieved through "normalization".
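To make the binning step concrete, here is a minimal sketch in Python. The function name `bin_events` and the `lo`/`hi` range parameters are made up for this illustration; this is not FlowJo's code, just equal-width binning as described above.

```python
def bin_events(values, n_bins=1024, lo=0.0, hi=1024.0):
    """Count events in n_bins equal-width bins spanning [lo, hi).

    Hypothetical helper for illustration; events outside the range are ignored.
    """
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if lo <= v < hi:
            # Clamp the index to guard against floating-point rounding at the top edge.
            index = min(int((v - lo) / width), n_bins - 1)
            counts[index] += 1
    return counts


# Tiny example: six events, four bins covering [0, 4).
events = [0.2, 0.7, 1.5, 1.6, 1.9, 3.1]
hist = bin_events(events, n_bins=4, lo=0.0, hi=4.0)
print(hist)  # [2, 3, 0, 1]
```

A tube with twice as many events would give roughly twice the count in every bin, which is why the raw counts have to be normalized before two tubes can be compared.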

The Overton method is one of the original methods applied to cytometric data for calculating "positive" in unimodal distributions. It is popular because it is easy to understand and works reasonably well. The process for normalization in the Overton method is to find the mode (the bin that has the most cells in it) of each tube and divide the data from that tube by that value. This puts the data on roughly the same scale (0 - 100%) while preserving features. The Overton method then subtracts the control data from the comparison tube and counts the number of events that remain per bin, labeling these "positive".
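The steps in the paragraph above can be sketched in a few lines of Python. This follows the description given here rather than any published reference implementation, and the function name, the toy histograms, and the choice to report the result as a fraction of the normalized test histogram are all assumptions made for the example.

```python
def overton_positive_fraction(control, test):
    """Mode-normalize two histograms, subtract control from test bin by bin,
    and return the share of the normalized test histogram that is left over.

    Illustrative sketch only; names and the final ratio are assumptions.
    """
    if len(control) != len(test):
        raise ValueError("histograms must have the same number of bins")
    control_mode, test_mode = max(control), max(test)
    if control_mode == 0 or test_mode == 0:
        raise ValueError("both histograms must contain events")
    control_norm = [c / control_mode for c in control]
    test_norm = [t / test_mode for t in test]
    # Keep only the bins where the test tube exceeds the control.
    residual = [max(t - c, 0.0) for t, c in zip(test_norm, control_norm)]
    return sum(residual) / sum(test_norm)


# Toy data: the control has one peak; the test tube has the same peak
# (with fewer events) plus a second, brighter population.
control = [0, 10, 40, 10, 0, 0, 0, 0]
test = [0, 5, 20, 5, 0, 10, 20, 10]
fraction = overton_positive_fraction(control, test)
print(f"{fraction:.1%}")  # 57.1%
```

In this toy case 40 of the 70 test events sit in the second population, and the subtraction recovers the same 57.1% because both tubes happen to have their mode in the negative peak.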

The SED method has never been published. There is a link at the bottom of this article to a paper written by Bruce Bagwell (the person who created the SED method, among other things) that sums up the algorithms that led to the SED method...but he never actually says specifically what the SED does differently. What we've put in FlowJo is essentially the Enhanced Normalization Subtraction Method (ENS), which is very similar to the SED (it lacks some correction factor). The difference between ENS and Overton is that the normalization is better - the control and test histograms are normalized so that they have the same area. This protects the user from bad normalization due to one outlier bin having a huge number of events in it, usually due to a data artifact. With mode normalization, one bin with a huge number of cells would cause that tube to be divided by a huge number, which would make the two scales *very* different and result in a huge number of events being labelled positive. The ENS also estimates the probability distribution function of the positive population, and aligns it to the data using the point of maximum difference between samples. By estimating the shape of the positive population, the algorithm is less likely to create a poor fit because of noisy data.
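Here is a rough Python sketch of why equal-area normalization is safer than mode normalization when one bin holds an artifact spike. It only covers the normalization and subtraction steps; it does not attempt the positive-population fit or the SED correction factor, and the function names and synthetic histograms are invented for the example.

```python
def normalize_by_mode(hist):
    """Scale so the tallest bin equals 1 (Overton-style)."""
    peak = max(hist)
    return [h / peak for h in hist]


def normalize_by_area(hist):
    """Scale so all bins sum to 1 (the equal-area idea described above)."""
    total = sum(hist)
    return [h / total for h in hist]


def positive_fraction(control_norm, test_norm):
    """Share of the normalized test histogram that exceeds the control."""
    residual = [max(t - c, 0.0) for t, c in zip(test_norm, control_norm)]
    return sum(residual) / sum(test_norm)


# Synthetic triangular histogram: 100 bins, peak of 50 events, 2500 events total.
test = [max(0, 50 - abs(i - 50)) for i in range(100)]
# The control is the same distribution plus an artifact spike of 100 events in bin 0.
control = list(test)
control[0] = 100

by_mode = positive_fraction(normalize_by_mode(control), normalize_by_mode(test))
by_area = positive_fraction(normalize_by_area(control), normalize_by_area(test))
print(f"mode normalization: {by_mode:.1%}")  # about 50%
print(f"area normalization: {by_area:.1%}")  # about 4%
```

The two tubes are identical apart from the spike, so the right answer is close to zero. Mode normalization divides the control by the spike and calls about half the test events positive, while area normalization is off only by the small share of events sitting in the spike.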

So overall what is called the SED method in FlowJo is superior to the Overton method because it factors in some safety precautions for data artifacts. For really nice clean data, there should be almost no difference between the two methods.

Download SED paper by Bagwell.

- Bad Mo'Flow

Send comments and questions to John@treestar.com
