Hello,
I’m a biostatistician working on a project with Visium HD data, and I’d like to use spatialdata objects as the backbone of the segmentation process. I’m used to working with R, and my supervisor will probably prefer the statistical analysis to be done in Bioconductor. But I noticed that there are a lot of nice libraries to work with in Python, and I really like that spatialdata can manage all the elements of the experiment more flexibly than the standard SummarizedExperiment of BioC.
What I don’t understand, though, is why the most basic operations are so difficult. For example, I’m trying to filter the bins element and the associated table of transcripts to keep only the bins inside the tissue, and then to annotate the bins so I know which tissue each one belongs to (we have 3 different tissue samples on one slide). So I have my sd object (read with spatialdata_io and rewritten as zarr):
SpatialData object, with associated Zarr store: /.../spe_blocco1_mod001.zarr
├── Images
│ ├── ‘blocco1_cytassist_image’: DataArray[cyx] (3, 3000, 3200)
│ ├── ‘blocco1_full_image’: DataTree[cyx] (3, 22718, 16166), (3, 11359, 8083), (3, 5679, 4041), (3, 2839, 2020), (3, 1419, 1010)
│ ├── ‘blocco1_hires_image’: DataArray[cyx] (3, 6000, 4270)
│ └── ‘blocco1_lowres_image’: DataArray[cyx] (3, 600, 427)
├── Shapes
│ ├── ‘blocco1_intissue’: GeoDataFrame shape: (3, 5) (2D shapes)
│ ├── ‘blocco1_square_002um’: GeoDataFrame shape: (9233739, 1) (2D shapes)
│ └── ‘intissue_002um’: GeoDataFrame shape: (3548786, 5) (2D shapes)
└── Tables
├── ‘intissue’: AnnData (3548786, 1)
└── ‘square_002um’: AnnData (9233739, 32285)
with coordinate systems:
▸ ‘blocco1’, with elements:
blocco1_cytassist_image (Images), blocco1_full_image (Images), blocco1_hires_image (Images), blocco1_lowres_image (Images), blocco1_square_002um (Shapes)
▸ ‘blocco1_downscaled_hires’, with elements:
blocco1_hires_image (Images), blocco1_square_002um (Shapes)
▸ ‘blocco1_downscaled_lowres’, with elements:
blocco1_lowres_image (Images), blocco1_square_002um (Shapes)
▸ ‘global’, with elements:
blocco1_intissue (Shapes), intissue_002um (Shapes)
As you can see, I have already filtered blocco1_square_002um (the bins object) with blocco1_intissue (3 multipolygon objects, one for each of the 3 tissue samples on the slide); the result is intissue_002um. But when I try to filter the square_002um table with the intissue_002um shapes, I get the intissue AnnData with only 1 column.
I tried different things. First I tried to filter the table directly with the blocco1_intissue shape, but that got nowhere because, as the error said, the shapes weren’t annotated by the table (square_002um). So I filtered the bin shapes with the intissue shape and then used the filtered bin shapes to filter the table. That didn’t work either, as you can see from the resulting “intissue” table.
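To make the goal concrete, this is roughly what I’m trying to achieve, sketched on toy data with plain geopandas and pandas instead of spatialdata. All names and coordinates here are made up, a pandas DataFrame stands in for the AnnData table, and I’m assuming the table rows can be matched to the bins through a shared bin id; I also assign each bin to a tissue by its centroid, which is just one possible rule.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import box

# Toy tissue outlines (hypothetical labels and coordinates).
tissues = gpd.GeoDataFrame(
    {"tissue": ["A", "B"]},
    geometry=[box(0, 0, 4, 4), box(10, 0, 14, 4)],
)

# Toy bins: a row of 2x2 squares, indexed by a made-up bin id.
bins = gpd.GeoDataFrame(
    {"bin_id": [f"bin_{i}" for i in range(8)]},
    geometry=[box(x, 0, x + 2, 2) for x in range(0, 16, 2)],
).set_index("bin_id")

# Stand-in for the expression table: one row per bin, same ids as the bins.
counts = pd.DataFrame({"gene_x": range(8)}, index=bins.index)

# Use bin centroids so that each bin falls in at most one tissue.
centroids = bins.copy()
centroids["geometry"] = bins.geometry.centroid

# Spatial join: keep only the bins whose centroid lies within a tissue polygon
# and carry over the tissue label.
joined = gpd.sjoin(centroids, tissues, how="inner", predicate="within")

# Filtered and annotated bins (original square geometries, plus the label).
bins_in_tissue = bins.loc[joined.index].assign(tissue=joined["tissue"])

# Subset the table to the same bins through the shared ids.
counts_in_tissue = counts.loc[bins_in_tissue.index]

print(bins_in_tissue)
print(counts_in_tissue)
```

What I’m missing is the spatialdata way of doing the same thing, so that the shapes and the table stay linked after filtering.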
I tried to use polygon_query, but apparently I need to give it one polygon at a time, and I don’t know how to handle the resulting objects in a loop.
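On plain GeoDataFrames, the loop I have in mind would look roughly like the sketch below: query one polygon at a time, tag the hits with the tissue name, and concatenate the pieces at the end. Here `query_one` is a hypothetical stand-in that I wrote for illustration, not a spatialdata function, and the data is a toy example with points instead of real bins. What I can’t work out is the equivalent when each query returns a whole SpatialData object.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point, box

# Toy tissue polygons keyed by a made-up tissue name.
tissues = {"A": box(0, 0, 4, 4), "B": box(10, 0, 14, 4)}

# Toy bins represented by their centre points, indexed by a made-up bin id.
bins = gpd.GeoDataFrame(
    {"bin_id": [f"bin_{i}" for i in range(8)]},
    geometry=[Point(x, 1) for x in range(1, 16, 2)],
).set_index("bin_id")


def query_one(bins, polygon):
    """Hypothetical single-polygon query: keep the bins inside `polygon`."""
    return bins[bins.geometry.within(polygon)]


# One query per tissue; label each result before collecting it.
parts = []
for name, polygon in tissues.items():
    hits = query_one(bins, polygon).copy()
    hits["tissue"] = name
    parts.append(hits)

# Stack the per-tissue results back into a single annotated GeoDataFrame.
in_tissue = pd.concat(parts)
print(in_tissue)
```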
I looked for examples/tutorials/notebooks, but there don’t seem to be any, apart from some that are either too minimal or probably too advanced for me. I also thought that filtering spatial objects with GeoJSON and the like would be a pretty standard thing to do, so there must be functions for it. Maybe not?
If I don’t find a way to work with spatialdata, I’ll do all the algebra in R and then read the object in Python when it’s ready for the segmentation. But I don’t like this solution; the scverse also seems nice and I’d like to work more with it.
Thanks in advance and sorry for the wall of text,
Valerio