Extracting Probability of Observation Matrix


I am trying to extract the Probability of observation matrix (regions by cells) from a trained model. Is this automatically saved in the AnnData object?

Thank you :slight_smile:


We don't generate or save it by default, but you can generate it with the model's get_accessibility_estimates method.

Some notes on using it:
1 - do you want the estimated probability of accessibility or the probability of observation? By default this method gives the former (i.e., after removing cell-specific and region-specific biases).
If you want the probability of observation, you'll want to re-introduce those biases by setting normalize_cells=True, normalize_regions=True.
2 - the output can be quite large (since the matrix is dense). You can shrink the size by only generating the estimates for some cells / some regions (using the indices and region_indices arguments, respectively); alternatively, you can threshold the output so that estimates below some given threshold are replaced with 0s, and the output matrix is sparse.
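
To make the two notes above concrete, here is a small NumPy sketch. It does not call scvi-tools; the arrays `y`, `cell_factor` and `region_factor` are hypothetical stand-ins for what the model estimates, and the multiplicative form (y_ij * l_i * r_j) is an assumption taken from the follow-up question in this thread rather than from the library's source. It only illustrates what re-introducing the biases, subsetting, and thresholding do to the matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cells, n_regions = 5, 8

# Hypothetical stand-ins (NOT produced by scvi-tools here):
# y[i, j]          bias-corrected accessibility probability of region j in cell i
# cell_factor[i]   cell-specific factor (l_i in the thread's notation)
# region_factor[j] region-specific factor (r_j in the thread's notation)
y = rng.uniform(0.0, 1.0, size=(n_cells, n_regions))
cell_factor = rng.uniform(0.2, 1.0, size=n_cells)
region_factor = rng.uniform(0.2, 1.0, size=n_regions)

# Note 1: re-introduce both biases (assumed multiplicative form).
# Broadcasting scales row i by cell_factor[i] and column j by region_factor[j].
p_obs = y * cell_factor[:, None] * region_factor[None, :]

# Note 2a: keep only some cells and some regions to shrink the output.
cell_idx = np.array([0, 2])
region_idx = np.array([1, 3, 4])
p_sub = p_obs[np.ix_(cell_idx, region_idx)]

# Note 2b: replace estimates below a threshold with 0 so the matrix
# becomes mostly zeros and can be stored in a sparse format.
threshold = 0.1
p_thresholded = np.where(p_obs >= threshold, p_obs, 0.0)

print(p_obs.shape, p_sub.shape)
print("non-zero entries after thresholding:", np.count_nonzero(p_thresholded))
```

A thresholded matrix like `p_thresholded` could then be stored with `scipy.sparse.csr_matrix` to save memory; how much is saved depends on the threshold you pick.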

Thank you for the detailed response! Very helpful :slight_smile:

Hi @Tal_Ashuach @adamgayoso

  1. Just to clarify further: .get_accessibility_estimates(adata) by default
    returns y_ij, and .get_accessibility_estimates(adata, normalize_cells=True, normalize_regions=True) returns y_ij * l_i * r_j?

  2. Can the adata be different from the adata used for training, provided that the object contains all the same columns/slots? I would like to compute accessibility estimates for cells not seen during training.


Yes, we have an extensive validation procedure that will flag any inconsistencies in the new data.