Data Structures

The top-level object in a starfish workflow is the Experiment. It is composed of one or more Field of View objects and a Codebook, which maps detected spots to the entities they target.

Each Field of View consists of a set of Primary Images and, optionally, Auxiliary Images that may contain information on nuclei (often used to seed segmentation) or fiducial beads (often used to enable fine registration).

Both Primary and Auxiliary Images are referenced by slicedimage TileSet objects, which map two-dimensional image tiles stored on disk into a 5-dimensional Image Tensor that labels each (z, y, x) volume with the round and channel that it corresponds to. When loaded into memory, these Image Tensors are stored in ImageStack objects. starfish runs image pre-processing on the ImageStack, which also serves as the input to spot finding.
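
To make the tile-to-tensor mapping concrete, here is a minimal sketch in plain NumPy. It does not use the slicedimage or starfish APIs: the `tiles` dictionary, the dimension sizes, and the (round, channel, z, y, x) axis order are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical dimensions, chosen only for this example.
n_rounds, n_channels, n_z, height, width = 2, 3, 4, 5, 6

# Hypothetical tile index: each 2D (y, x) tile is keyed by the
# (round, channel, z) position it was acquired at. Every tile is filled
# with a constant so its origin is easy to recognise later.
tiles = {
    (r, c, z): np.full((height, width), 100 * r + 10 * c + z, dtype=np.float32)
    for r in range(n_rounds)
    for c in range(n_channels)
    for z in range(n_z)
}

# Assemble the tiles into one 5-dimensional tensor.
# Assumed axis order: (round, channel, z, y, x).
tensor = np.empty((n_rounds, n_channels, n_z, height, width), dtype=np.float32)
for (r, c, z), tile in tiles.items():
    tensor[r, c, z] = tile

# Indexing by round and channel yields the (z, y, x) volume for that pair.
volume = tensor[1, 2]
```

Indexing the first two axes selects a full (z, y, x) volume, which is the unit that per-round, per-channel processing would operate on in this sketch.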

Identified spots are stored in the IntensityTable, which records the intensity of each spot across the rounds and channels in which it is detected. It also holds the gene assigned to each spot once the table is decoded with a Codebook, and the cell assigned to each spot once the table is combined with segmentation results.
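
The idea of decoding spot intensities against a codebook can be sketched in plain Python. This is not the starfish IntensityTable or Codebook API. The data layout, the gene names, and the decoding rule (take the brightest channel in each round and look up the resulting pattern) are assumptions chosen to keep the example small; starfish's own decoders may work differently.

```python
# Hypothetical codebook: gene -> channel expected to be brightest in each round.
codebook = {
    "GENE_A": (0, 1),
    "GENE_B": (1, 0),
    "GENE_C": (2, 2),
}

# Hypothetical spots: intensities[round][channel] for 2 rounds and 3 channels.
spots = [
    {"y": 10, "x": 12, "intensities": [[0.9, 0.1, 0.0], [0.2, 0.8, 0.1]]},
    {"y": 40, "x": 7, "intensities": [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]},
    {"y": 5, "x": 33, "intensities": [[0.3, 0.2, 0.5], [0.6, 0.3, 0.1]]},
]


def brightest_channels(intensities):
    """Return the index of the brightest channel in each round."""
    return tuple(
        max(range(len(row)), key=row.__getitem__) for row in intensities
    )


# Invert the codebook so a channel pattern can be looked up directly.
pattern_to_gene = {pattern: gene for gene, pattern in codebook.items()}


def decode(spot):
    """Return the gene whose pattern matches the spot, or None if none does."""
    return pattern_to_gene.get(brightest_channels(spot["intensities"]))


# Attach the decoded gene to a copy of each spot record.
decoded = [dict(spot, gene=decode(spot)) for spot in spots]
```

Spots whose pattern is absent from the codebook keep a gene of `None`, so they can be filtered out before downstream counting.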

Finally, the IntensityTable can be converted into an ExpressionMatrix by counting the spots detected for each gene in each cell. The ExpressionMatrix provides conversion and serialization for use in single-cell analysis environments such as Seurat and Scanpy.
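
The aggregation from decoded spots to a cell-by-gene matrix can be sketched as follows. This is an illustration in plain Python, not the starfish ExpressionMatrix API; the record layout, gene names, and cell IDs are hypothetical, and the choice to skip undecoded or unassigned spots is an assumption.

```python
from collections import Counter

# Hypothetical decoded spots: each has a gene and a cell assignment.
# None marks a spot that could not be decoded or fell outside every cell.
decoded_spots = [
    {"gene": "GENE_A", "cell_id": 1},
    {"gene": "GENE_A", "cell_id": 1},
    {"gene": "GENE_B", "cell_id": 1},
    {"gene": "GENE_A", "cell_id": 2},
    {"gene": None, "cell_id": 2},
    {"gene": "GENE_B", "cell_id": None},
    {"gene": "GENE_B", "cell_id": 2},
    {"gene": "GENE_B", "cell_id": 2},
]

# Count spots per (cell, gene) pair, skipping incomplete records.
counts = Counter(
    (spot["cell_id"], spot["gene"])
    for spot in decoded_spots
    if spot["gene"] is not None and spot["cell_id"] is not None
)

# Fix the row (cell) and column (gene) order, then build a dense matrix.
cells = sorted({cell for cell, _ in counts})
genes = sorted({gene for _, gene in counts})
matrix = [[counts[(cell, gene)] for gene in genes] for cell in cells]
```

Each row of `matrix` corresponds to an entry in `cells` and each column to an entry in `genes`, which is the cells-by-genes layout that single-cell analysis tools typically expect.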