
A dataset containing cell lineage metadata and time series of protein expression for a set of mother–daughter cell pairs.

Usage

cell_lineage_data

Format

A named list with two elements:

metadata

A data frame providing cell lineage and timing information. Columns include:

  • cell_id: Unique identifier for each cell.

  • mother_id: Identifier of the mother cell; "root" indicates a founder cell (seeded at the start of the experiment).

  • cell_birth_timepoint: Time index at which the cell is born. This corresponds to the column index in the time series matrices (excluding the first column).
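The metadata columns are enough to recover the lineage structure. The sketch below uses a small made-up data frame with the same three columns (the rows are illustrative, not taken from the dataset) to show how founder cells and mother-to-daughter groupings could be looked up.

```r
# Toy stand-in for cell_lineage_data$metadata; the values are invented
# for illustration and only mimic the documented column layout.
metadata <- data.frame(
  cell_id = c("1-6-1", "1-6-2", "1-6-3", "1-7-1"),
  mother_id = c("root", "1-6-1", "1-6-1", "root"),
  cell_birth_timepoint = c(5, 60, 130, 5),
  stringsAsFactors = FALSE
)

# Founder cells are the rows whose mother_id is "root".
founders <- metadata$cell_id[metadata$mother_id == "root"]

# Group the remaining cells by their mother to get mother-daughter sets.
non_founders <- metadata[metadata$mother_id != "root", ]
daughters_by_mother <- split(non_founders$cell_id, non_founders$mother_id)
```

With the real dataset, replacing the toy data frame with cell_lineage_data$metadata should give the same kind of result.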

time_series

A named list of six protein time series matrices:

  • Cdc10, Stb3, CLB5, Whi5, Xbp1, Tup1

Each matrix has one row per cell and one column per uniformly spaced time point, so each row is the time series of a protein for a single cell. The first column is cell_id, matching the identifiers in metadata. Values before each cell's birth are NA.
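As a minimal sketch of working with this layout, the code below extracts one cell's trajectory and drops the NA values that precede its birth. It runs on a made-up table, and it assumes the identifier is stored in a first column named cell_id as described above; if your copy keeps identifiers elsewhere (for example as row names), the indexing needs adjusting.

```r
# Toy stand-in for one element of cell_lineage_data$time_series.
# Invented values: 2 cells, 6 time points, cell_id in the first column.
clb5 <- data.frame(
  cell_id = c("1-6-1", "1-6-2"),
  t1 = c(0.2, NA), t2 = c(0.4, NA), t3 = c(0.5, NA),
  t4 = c(0.7, 0.1), t5 = c(0.6, 0.3), t6 = c(0.4, 0.5),
  stringsAsFactors = FALSE
)

# Select the row for one cell and drop the identifier column.
cell_row <- clb5[clb5$cell_id == "1-6-2", -1]
trajectory <- as.numeric(unlist(cell_row))

# Position of the first non-NA value, i.e. where measurements begin.
first_observed <- which(!is.na(trajectory))[1]

# Keep only the observed (post-birth) part of the trajectory.
observed <- trajectory[!is.na(trajectory)]
```

Under the description above, first_observed would be expected to line up with the cell's cell_birth_timepoint in metadata, though that is worth checking on the real data rather than assuming.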

Details

Time series data were denoised using functional principal component analysis and interpolated to 5× temporal resolution using local polynomial smoothing. The dataset includes 25 mother cells and 60 daughter cells (85 cells in total), each measured on a common, uniformly spaced grid of 240 time points spanning the experiment (48 time points before interpolation). In the paper, these time series are also referred to as "trajectories".
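To illustrate how a 48-point series maps onto a 240-point grid, here is a rough sketch using base R's loess, which fits local polynomials. It is not the pipeline used to build this dataset: the functional principal component denoising step is omitted, the signal is simulated, and the span and degree values are arbitrary choices for the example.

```r
# Simulated noisy signal on the original 48-point grid (made-up data).
set.seed(1)
t_orig <- 1:48
y_orig <- sin(t_orig / 8) + rnorm(48, sd = 0.1)

# Local quadratic regression; span and degree are illustrative choices,
# not the settings used for the dataset.
fit <- loess(y_orig ~ t_orig, span = 0.3, degree = 2)

# Evaluate the smooth on a grid with 5 times as many points (48 * 5 = 240).
t_fine <- seq(min(t_orig), max(t_orig), length.out = 240)
y_fine <- predict(fit, newdata = data.frame(t_orig = t_fine))
```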

Examples

data(cell_lineage_data)
names(cell_lineage_data)
#> [1] "metadata"    "time_series"
head(cell_lineage_data$metadata)
#>   cell_id mother_id cell_birth_timepoint
#> 1   1-6-1      root                    5
#> 2   1-7-1      root                    5
#> 3   1-7-2      root                    5
#> 4  2-11-1      root                    5
#> 5  2-12-1      root                    5
#> 6  3-15-1      root                    5
dim(cell_lineage_data$time_series$CLB5)
#> [1]  85 240