Working with Starfish Outputs

Starfish’s output formats are serialized as netCDF (.nc) or CSV files. These files are easy to work with in both Python and R.

To work with an IntensityTable in Python, use that class’s open_netcdf method:

import starfish
example_netcdf_file: str = "docs/source/_static/example_data_files/"
intensity_table: starfish.IntensityTable = starfish.IntensityTable.open_netcdf(example_netcdf_file)
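
Once loaded, the table can be sliced and reduced like any labeled array. The sketch below does not read a real Starfish file: it builds a small stand-in with xarray, and the dimension names ("features", "c", "r") and the per-spot "zc" coordinate are assumptions chosen for illustration, so check them against your own table before reusing the selections.

```python
import numpy as np
import xarray as xr

# Stand-in for a loaded intensity table: 4 spots, 3 channels, 2 rounds.
# The dimension names ("features", "c", "r") and the "zc" coordinate are
# assumptions made for this sketch, not read from a Starfish file.
rng = np.random.default_rng(0)
intensities = xr.DataArray(
    rng.random((4, 3, 2)),
    dims=("features", "c", "r"),
    coords={
        "c": [0, 1, 2],
        "r": [0, 1],
        "zc": ("features", [0.0, 0.0, 1.0, 1.0]),
    },
)

# Intensity of every spot in round 0, channel 1.
round0_ch1 = intensities.sel(r=0, c=1)

# Brightest value per spot across all channels and rounds.
max_per_spot = intensities.max(dim=["c", "r"])

# Keep only the spots whose z-coordinate equals 1.0.
plane1 = intensities.where(intensities["zc"] == 1.0, drop=True)

print(round0_ch1.shape, max_per_spot.shape, plane1.sizes["features"])
```

Label-based selection with sel keeps the code readable when the number of rounds or channels changes between experiments.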

In R, the ncdf4 library can open the .nc archive, which is based on HDF5. The archive contains a number of variables, each of which can be accessed by name. Alternative installation instructions can be accessed here:

library(ncdf4)
example_netcdf_file <- "docs/source/_static/example_data_files/"
netcdf_connection <- nc_open(example_netcdf_file)

# access the z-coordinate vector
zc <- ncvar_get(netcdf_connection, "zc")

# access the 3-dimensional data structure containing intensity information
# this variable has a special name, the rest are accessible with the string
# constants you would expect from the starfish python API.
data <- ncvar_get(netcdf_connection, "__xarray_dataarray_variable__")

Working with the decoded table is even simpler: it is stored as a .csv file, which pandas can read in Python and which R can read natively.


import pandas as pd
example_decoded_spots_file: str = "docs/source/_static/example_data_files/decoded.csv"
table: pd.DataFrame = pd.read_csv(example_decoded_spots_file, index_col=0)


example_decoded_spots_file <- "docs/source/_static/example_data_files/decoded.csv"
table <- read.csv(file=example_decoded_spots_file, header=TRUE, sep=',', row.names=1)
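
After loading, the decoded table is an ordinary data frame, so per-target summaries take only a line or two. The following is a minimal sketch in Python that uses a tiny inline CSV in place of a real decoded file. The column names ("target", "x", "y") and the values are hypothetical and may not match the columns in your decoded.csv.

```python
import io

import pandas as pd

# Tiny inline stand-in for a decoded spots file. The column names
# ("target", "x", "y") and the values are assumptions for this sketch.
csv_text = """,target,x,y
0,ACTB,10,12
1,GAPDH,40,8
2,ACTB,15,30
3,ACTB,22,5
"""
table = pd.read_csv(io.StringIO(csv_text), index_col=0)

# Number of decoded spots per target, most abundant first.
spots_per_target = table["target"].value_counts()

# Mean spot position per target.
mean_position = table.groupby("target")[["x", "y"]].mean()

print(spots_per_target)
print(mean_position)
```

To run the same summaries on real output, replace the io.StringIO object with the path to the decoded file.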