Questions on Shapes + Tables vs single AnnData object

Moving the conversation from a GitHub issue to here.

Original message from asmlgkj (curry_fan) · GitHub

I have a couple of questions regarding the SpatialData outputs and how cell positions are represented in Seurat objects:

SpatialData outputs: When using SpatialData, the results (image and expression) are stored as separate elements. However, this makes it more difficult to interface with downstream tools such as Squidpy, which often rely on each cell (or observation) having an associated spatial coordinate. Similarly, visualization tools that expect an H&E image may fail if the AnnData/Seurat object does not directly contain the image.

What is the recommended way to ensure compatibility with such tools?

Should we be explicitly merging the image and expression layers, or is there a standard workflow for linking them?

Polygon-based cells: For segmented cells represented by polygons, a single representative point is often required for certain downstream analyses (e.g., co-localization, neighbor graphs).

Does Seurat currently use the geometric centroid of each polygon, or some other representative location?

In cases where the centroid falls outside of the polygon (e.g., crescent-shaped or elongated cells), would you recommend using the centroid, or instead an interior point (such as st_point_on_surface in sf)?

I would greatly appreciate your advice on best practices here, as this has important implications for interoperability with Squidpy and other spatial analysis or visualization software.

The visualization packages spatialdata-plot and napari-spatialdata are built to support SpatialData objects. Tools designed to expect an AnnData or Seurat object containing an image will naturally not work with a SpatialData object, or with the AnnData object stored in its table slot. This is expected.

You can still try spatialdata_io.experimental.to_legacy_anndata(), which we developed for these use cases, but the image will be downscaled.

We provide several APIs to help link elements with the table:

  • match_element_to_table(), match_sdata_to_table(), match_table_to_element(), documented here
  • Internally, these functions use the more powerful (but less user-friendly) API: join_spatial_element_table() (docs here)
  • We will also provide a convenient way to filter and match the object at the same time (you can already do this by filtering the table and then calling one of the above, but the new way will be less verbose). You can see a preview in this Pull Request with filter_table_by_query()

You can get the centroid of a polygon (or any element) with get_centroids(), documented here.

This depends on the specific analysis and use case, so I won’t comment on this.
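To make the centroid question concrete, here is a minimal pure-Python sketch (no SpatialData or Seurat involved) showing that the geometric centroid of a concave cell outline can fall outside the polygon, together with one possible fallback that picks a point on a horizontal scanline through the centroid. The function names polygon_centroid, point_in_polygon and interior_point are hypothetical helpers written for this illustration, and the U-shaped outline is made-up data. The scanline heuristic is an assumption of this sketch, not what get_centroids() or Seurat does; in practice you would more likely rely on an existing geometry library, such as st_point_on_surface in sf.

```python
def polygon_centroid(vertices):
    """Area-weighted centroid of a simple polygon (shoelace formula)."""
    area2 = cx = cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    # area2 is twice the signed area, so 6 * area == 3 * area2
    return cx / (3.0 * area2), cy / (3.0 * area2)


def scanline_crossings(y, vertices):
    """Sorted x coordinates where the horizontal line at height y crosses edges."""
    xs = []
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        if (y0 > y) != (y1 > y):  # edge straddles the scanline
            xs.append(x0 + (y - y0) * (x1 - x0) / (y1 - y0))
    return sorted(xs)


def point_in_polygon(point, vertices):
    """Ray casting: an odd number of crossings to the right means inside."""
    x, y = point
    crossings_right = [xc for xc in scanline_crossings(y, vertices) if xc > x]
    return len(crossings_right) % 2 == 1


def interior_point(vertices):
    """Centroid if it lies inside, else the midpoint of the widest
    interior segment on the scanline through the centroid (a heuristic)."""
    c = polygon_centroid(vertices)
    if point_in_polygon(c, vertices):
        return c
    xs = scanline_crossings(c[1], vertices)
    # Consecutive crossing pairs (xs[0], xs[1]), (xs[2], xs[3]), ... are inside.
    left, right = max(zip(xs[0::2], xs[1::2]), key=lambda s: s[1] - s[0])
    return ((left + right) / 2.0, c[1])


# Made-up U-shaped "cell": a 4x4 square with a notch cut out of the top.
u_shape = [(0, 0), (4, 0), (4, 4), (3, 4), (3, 1), (1, 1), (1, 4), (0, 4)]

centroid = polygon_centroid(u_shape)
print("centroid:", centroid, "inside:", point_in_polygon(centroid, u_shape))

fallback = interior_point(u_shape)
print("interior point:", fallback, "inside:", point_in_polygon(fallback, u_shape))
```

For the U-shaped outline the centroid lands in the notch, so any neighbor graph or co-localization analysis built from it would place the cell at a location it does not actually cover. Whether that matters is, as noted above, a question for the specific analysis.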

Thanks, @LucaMarconato, for the response. I currently have AnnData objects (h5ad) from CosMx. Do you think it is best to reconstruct the SpatialData object from scratch, or to use spatialdata_io.experimental.from_legacy_anndata? Does spatialdata_io.experimental.from_legacy_anndata preserve all the obs, var, obsm, etc. data? Thank you for your advice. Ref: spatialdata_io.experimental.from_legacy_anndata — spatialdata-io