A dataset containing cell lineage metadata and time series of protein expression for a set of mother–daughter cell pairs.
Format
A named list with two elements:
metadata
A data frame providing cell lineage and timing information. Columns include:
cell_id: Unique identifier for each cell.
mother_id: Identifier of the mother cell; "root" indicates a founder cell (seeded at the start of the experiment).
cell_birth_timepoint: Time index at which the cell is born. This corresponds to the column index in the time series matrices (excluding the first column).
time_series
A named list of six protein time series matrices: Cdc10, Stb3, CLB5, Whi5, Xbp1, Tup1.
Each matrix has rows representing cells and columns representing uniformly spaced time points. In other words, each row is the time series of a protein for a single cell. The first column is cell_id, matching those in metadata. Values before each cell's birth are NA.
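The layout described above can be explored with a few lines of base R. The sketch below builds a small made-up object with the same two-element structure rather than loading the real dataset, so the values, the helper names `daughters_of()` and `trajectory_from_birth()`, and the choice to store cell identifiers as row names are all assumptions for illustration. If `cell_id` is held as an actual first column, the time index would need to be shifted by one.

```r
# Toy stand-in with the documented layout; all values here are made up.
metadata <- data.frame(
  cell_id = c("1-6-1", "1-6-2", "1-6-3"),
  mother_id = c("root", "1-6-1", "1-6-1"),
  cell_birth_timepoint = c(1, 3, 5),
  stringsAsFactors = FALSE
)

n_timepoints <- 6
whi5 <- matrix(
  seq_len(nrow(metadata) * n_timepoints) / 10,
  nrow = nrow(metadata),
  dimnames = list(metadata$cell_id, NULL)  # assumption: ids kept as row names
)

# Set values before each cell's birth to NA, as the documentation describes.
for (i in seq_len(nrow(metadata))) {
  birth <- metadata$cell_birth_timepoint[i]
  if (birth > 1) whi5[i, seq_len(birth - 1)] <- NA
}

toy <- list(metadata = metadata, time_series = list(Whi5 = whi5))

# Hypothetical helper: identifiers of all daughters of a given mother.
daughters_of <- function(x, mother) {
  x$metadata$cell_id[x$metadata$mother_id == mother]
}

# Hypothetical helper: one cell's series for one protein, from birth onward.
trajectory_from_birth <- function(x, protein, cell) {
  birth <- x$metadata$cell_birth_timepoint[x$metadata$cell_id == cell]
  series <- x$time_series[[protein]][cell, ]
  series[birth:length(series)]
}

daughters <- daughters_of(toy, "1-6-1")
traj <- trajectory_from_birth(toy, "Whi5", "1-6-2")
```

Joining on `cell_id` rather than on row position keeps the lookup correct even if the rows of `metadata` and the matrices are ordered differently.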
Details
Time series data were denoised using functional principal component analysis and interpolated to 5× temporal resolution using local polynomial smoothing. The dataset includes 25 mother cells and 60 daughter cells (85 cells in total), each measured on a common, uniformly spaced grid of 240 time points spanning the experiment (48 time points before interpolation). In the paper, these time series are also referred to as "trajectories".
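To give a feel for the interpolation step, here is a minimal sketch of upsampling one series from 48 to 240 points with local polynomial regression. It uses base R's `loess()` as a stand-in and is not the authors' pipeline: the simulated signal, the `span` and `degree` settings, and the choice of `seq(1, 48, length.out = 240)` as the fine grid are all assumptions, and the functional principal component denoising step is omitted.

```r
set.seed(1)

# Simulated coarse series: 48 uniformly spaced time points (made-up signal).
t_coarse <- seq_len(48)
y_coarse <- sin(2 * pi * t_coarse / 24) + rnorm(48, sd = 0.1)

# Local quadratic fit; span and degree are illustrative choices only.
fit <- loess(y ~ t, data = data.frame(t = t_coarse, y = y_coarse),
             span = 0.3, degree = 2)

# Assumed fine grid: 240 uniformly spaced points over the same time range.
t_fine <- seq(1, 48, length.out = 240)
y_fine <- predict(fit, newdata = data.frame(t = t_fine))
```

Because the fine grid stays inside the observed range, `predict()` returns a smoothed value at every point without extrapolating.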
Examples
data(cell_lineage_data)
names(cell_lineage_data)
#> [1] "metadata" "time_series"
head(cell_lineage_data$metadata)
#>   cell_id mother_id cell_birth_timepoint
#> 1   1-6-1      root                    5
#> 2   1-7-1      root                    5
#> 3   1-7-2      root                    5
#> 4  2-11-1      root                    5
#> 5  2-12-1      root                    5
#> 6  3-15-1      root                    5
dim(cell_lineage_data$time_series$CLB5)
#> [1] 85 240