Hi, I’m trying to plot module scores from a gene list over our tissue images. When I use sc.pl.spatial with clusters or gene counts as the color, nothing is drawn outside of the tissue, but spots keep showing up outside of the tissue when I color by the module scores I stored in adata.obsm. Do you think this is an issue with the spatial coordinates? I’m able to select the artifacts in the napari viewer, but can’t remove them from the plot itself. Any help is greatly appreciated!!
Hi @CSMattison, which reader function are you using? If you use visium() from spatialdata-io, the default behavior is to load the filtered data (so no locations outside the tissue), and you can use the parameter counts_file to load the expression data from a specific file.
Hi @LucaMarconato, I’m using squidpy’s reader: adata = sq.read.visium(file_fold, counts_file=f"{library_id}_rawfeaturebcmatrix.h5", load_images=True). I tried to alter it to use spatialdata-io, but it doesn’t seem to recognize the .json file.
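Because the raw (unfiltered) matrix is loaded here, the object contains every barcode on the slide, including spots that are not under the tissue. A minimal sketch of masking those spots before plotting, on toy data: it assumes the reader stored an in_tissue column in adata.obs, and the spot names and score values below are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy stand-in for adata.obs (one row per spot) and for a module-score
# matrix stored in adata.obsm. All values here are invented.
obs = pd.DataFrame(
    {"in_tissue": [0, 1, 1, 0, 1]},
    index=["spot0", "spot1", "spot2", "spot3", "spot4"],
)
module_scores = np.array([[0.9], [0.2], [0.5], [0.7], [0.1]])

# Boolean mask of the spots that lie under the tissue.
mask = (obs["in_tissue"] == 1).to_numpy()

# Apply the same mask to the per-spot table and to the score matrix
# so that rows stay aligned.
obs_tissue = obs.loc[mask]
scores_tissue = module_scores[mask]

print(obs_tissue.index.tolist())
print(scores_tissue.ravel())
```

On a real AnnData object the equivalent would be adata = adata[adata.obs["in_tissue"] == 1].copy(), which subsets X, obs and obsm together.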
Thanks for sharing the details. Is the dataset you are using public? I would like to fix the bug in the spatialdata-io reader, as it will eventually replace the one in squidpy. If not, could you please:
- paste here the output of the tree bash command for your data directory
- (if you know) tell me which version of spaceranger was used to obtain the data.
For the moment, as a workaround you can check the from_legacy_anndata() function shown here to convert the AnnData object that you obtain with sq.read.visium() into a SpatialData object.
I was able to convert the AnnData object to SpatialData with from_legacy_anndata(), but ended up getting the same error. I did fix the original bug in the visium() reader, but now there seems to be a different error with the scale factors.
→ Tissue positions:
output:
   barcode             in_tissue  array_row  array_col  y     x
0  ACGCCTGACACGCGCT-1  0          0          0          1228  1561
1  TACCGATCCAACACTT-1  0          1          1          1419  1671
2  ATTAAAGCGGACGAGC-1  0          0          2          1227  1781
3  GATAAGGGACGATTAG-1  0          1          3          1419  1891
4  GTGCAAATCACCAATA-1  0          0          4          1227  2001
→ Scale factors:
import json
with open(scale_path, "r") as f:
    scalefactors = json.load(f)
print("Loaded scalefactors:", scalefactors)
Output:
Loaded scalefactors: {'tissue_hires_scalef': 0.11547344, 'tissue_lowres_scalef': 0.034642033, 'fiducial_diameter_fullres': 231.08556, 'spot_diameter_fullres': 143.05296000000004}
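For reference, a small worked example of what these scale factors do, using the values printed above and the first row of the tissue positions. It assumes the y/x columns are full-resolution pixel coordinates; the helper name to_image_pixels is hypothetical.

```python
# Values copied from the scalefactors output above.
scalefactors = {
    "tissue_hires_scalef": 0.11547344,
    "tissue_lowres_scalef": 0.034642033,
    "fiducial_diameter_fullres": 231.08556,
    "spot_diameter_fullres": 143.05296000000004,
}


def to_image_pixels(y_fullres, x_fullres, scale):
    """Map full-resolution pixel coordinates onto a downscaled image."""
    return y_fullres * scale, x_fullres * scale


# The spot radius in full-resolution pixels is half the spot diameter.
spot_radius_fullres = scalefactors["spot_diameter_fullres"] / 2.0

# First spot of the tissue positions shown above (y=1228, x=1561).
y_hires, x_hires = to_image_pixels(1228, 1561, scalefactors["tissue_hires_scalef"])
y_lowres, x_lowres = to_image_pixels(1228, 1561, scalefactors["tissue_lowres_scalef"])

print("radius (fullres px):", spot_radius_fullres)
print("hires image px:", (y_hires, x_hires))
print("lowres image px:", (y_lowres, x_lowres))
```

All of these computations need numeric inputs, so a scale factor or coordinate that is read in as text would fail at this stage.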
→ New error:
Error in visium function: ufunc 'points' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
Directory tree:
.
├── AddModuleScores.ipynb
├── CU032-U54-HRA-028-C_metricssummary.csv
├── CU032-U54-HRA-028-C_rawfeaturebcmatrix.h5
├── CU032-U54-HRA-028-C_spatialenrichment.csv
├── CU032-U54-HRA-028-C_websummary.html
├── plots
│ ├── CU032-U54-HRA-028-C_detectedtissueimage.jpg
│ ├── Plot_CU032-U54-HRA-028-C_V1.png
│ ├── Plot_CU032-U54-HRA-028-C_V10.png
│ ├── Plot_CU032-U54-HRA-028-C_V11.png
│ ├── Plot_CU032-U54-HRA-028-C_V15.png
│ ├── Plot_CU032-U54-HRA-028-C_V16.png
│ ├── Plot_CU032-U54-HRA-028-C_V19.png
│ ├── Plot_CU032-U54-HRA-028-C_V2.png
│ ├── Plot_CU032-U54-HRA-028-C_V24.png
│ ├── Plot_CU032-U54-HRA-028-C_V25.png
│ ├── Plot_CU032-U54-HRA-028-C_V26.png
│ ├── Plot_CU032-U54-HRA-028-C_V27.png
│ ├── Plot_CU032-U54-HRA-028-C_V3.png
│ ├── Plot_CU032-U54-HRA-028-C_V31.png
│ ├── Plot_CU032-U54-HRA-028-C_V32.png
│ ├── Plot_CU032-U54-HRA-028-C_V33.png
│ ├── Plot_CU032-U54-HRA-028-C_V35.png
│ ├── Plot_CU032-U54-HRA-028-C_V4.png
│ ├── Plot_CU032-U54-HRA-028-C_V5.png
│ ├── Plot_CU032-U54-HRA-028-C_V6.png
│ ├── Plot_CU032-U54-HRA-028-C_V7.png
│ ├── Plot_CU032-U54-HRA-028-C_V8.png
│ └── Plot_CU032-U54-HRA-028-C_V9.png
├── scalefactors_json.json
├── spatial
│ ├── tissue_hires_image.png
│ ├── tissue_lowres_image.png
│ └── tissue_positions_list.csv
├── tissue_positions.csv
└── umap_CU032-U54-HRA-028-C.png
Visium reader code:
from spatialdata_io import visium
file_fold = f"/Users/courteneymattison/Desktop/CUMC_Projects/sennet_work/spatial_exp_maps/visium_{library_id}"
counts_file_path = f"{file_fold}/{library_id}_rawfeaturebcmatrix.h5"
scale_path = f"{file_fold}/scalefactors_json.json"
tissue_path = f"{file_fold}/tissue_positions.csv"
spatial_data = visium(file_fold,
                      dataset_id=library_id,
                      counts_file=counts_file_path,
                      scalefactors_file=scale_path,
                      tissue_positions_file=tissue_path)
I got the correct output once, but now I’m getting that ufunc error:
SpatialData object with:
├── Shapes
│     └── 'CU032-U54-HRA-028-C': GeoDataFrame shape: (4992, 2) (2D shapes)
└── Tables
      └── 'table': AnnData (4992, 36601)
with coordinate systems:
▸ 'downscaled_hires', with elements:
        CU032-U54-HRA-028-C (Shapes)
▸ 'downscaled_lowres', with elements:
        CU032-U54-HRA-028-C (Shapes)
▸ 'global', with elements:
        CU032-U54-HRA-028-C (Shapes)
Thanks for sharing. Can you try setting
scale_path = f"scalefactors_json.json"
instead of
scale_path = f"{file_fold}/scalefactors_json.json"
?
It should work; if not, please see how the path for the scalefactors file is used: spatialdata-io/src/spatialdata_io/readers/visium.py at 96e4932925144722d8213b1bedb8a80846f0c061 · scverse/spatialdata-io · GitHub
Passing the correct paths to visium() should fix the issue.
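A short sketch of why the relative and the absolute form can end up pointing at the same file. It assumes the reader builds the location by joining the dataset folder with the given name (something like path / scalefactors_file, which is an assumption to verify against the linked source), and the folder name below is hypothetical.

```python
from pathlib import PurePosixPath

# Hypothetical dataset folder, standing in for file_fold.
base = PurePosixPath("/data/visium_sample")

# Joining with a bare file name appends it to the folder.
relative = base / "scalefactors_json.json"

# Joining with an absolute path discards the left-hand side entirely,
# so the result is just the absolute path that was passed in.
absolute = base / "/data/visium_sample/scalefactors_json.json"

print(relative)
print(absolute)
print(relative == absolute)
```

Under that assumption both spellings resolve to the same file, so the choice between them would not change which scale factors are loaded.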
Regarding the problem with from_legacy_anndata(), I can’t identify the cause from the details above. If you want, you could try reporting a reproducible example in the spatialdata-io repository and we could have a look at it, but I would rather focus on fixing the visium() function (if the fix I suggested above is not enough), since from_legacy_anndata() is an experimental function and we don’t recommend its use if alternatives are available.
Changing the path resulted in the same error. I think visium() is able to go into the main directory and find the file either way. Is it possible the error has to do with how coords is being created?
# Load the data using visium() from spatialdata-io
spatial_data = visium(file_fold,
                      dataset_id=library_id,
                      counts_file=counts_file_path,
                      scalefactors_file="scalefactors_json.json",
                      tissue_positions_file=tissue_path)
TypeError Traceback (most recent call last)
Cell In[8], line 2
1 # Load the data using visium() from spatialdata-io
----> 2 spatial_data = visium(file_fold,
3 dataset_id = library_id,
4 counts_file=counts_file_path,
5 scalefactors_file = f"{file_fold}/scalefactors_json.json",
6 tissue_positions_file = tissue_path)
File ~/anaconda3/lib/python3.11/site-packages/spatialdata_io/readers/visium.py:189, in visium(path, dataset_id, counts_file, fullres_image_file, tissue_positions_file, scalefactors_file, imread_kwargs, image_models_kwargs, **kwargs)
    184 transform_hires = Scale(
    185     np.array([scalefactors[VisiumKeys.SCALEFACTORS_HIRES], scalefactors[VisiumKeys.SCALEFACTORS_HIRES]]),
    186     axes=("y", "x"),
    187 )
    188 shapes = {}
--> 189 circles = ShapesModel.parse(
    190     coords,
    191     geometry=0,
    192     radius=scalefactors["spot_diameter_fullres"] / 2.0,
    193     index=adata.obs["spot_id"].copy(),
    194     transformations={
    195         "global": transform_original,
    196         "downscaled_hires": transform_hires,
    197         "downscaled_lowres": transform_lowres,
    198     },
    199 )
    200 shapes[dataset_id] = circles
    201 adata.obs["region"] = dataset_id
File ~/anaconda3/lib/python3.11/functools.py:946, in singledispatchmethod.__get__.<locals>._method(*args, **kwargs)
    944 def _method(*args, **kwargs):
    945     method = self.dispatcher.dispatch(args[0].__class__)
--> 946     return method.__get__(obj, cls)(*args, **kwargs)
File ~/anaconda3/lib/python3.11/site-packages/spatialdata/models/models.py:415, in ShapesModel._(cls, data, geometry, offsets, radius, index, transformations)
    403 @parse.register(np.ndarray)
    404 @classmethod
    405 def _(
   (...)
    412     transformations: MappingToCoordinateSystem_t | None = None,
    413 ) -> GeoDataFrame:
    414     geometry = GeometryType(geometry)
--> 415     data = from_ragged_array(geometry_type=geometry, coords=data, offsets=offsets)
    416     geo_df = GeoDataFrame({"geometry": data})
    417     if GeometryType(geometry).name == "POINT":
File ~/anaconda3/lib/python3.11/site-packages/shapely/_ragged_array.py:441, in from_ragged_array(geometry_type, coords, offsets)
    439 if geometry_type == GeometryType.POINT:
    440     assert offsets is None or len(offsets) == 0
--> 441     return _point_from_flatcoords(coords)
    442 if geometry_type == GeometryType.LINESTRING:
    443     return _linestring_from_flatcoords(coords, *offsets)
File ~/anaconda3/lib/python3.11/site-packages/shapely/_ragged_array.py:304, in _point_from_flatcoords(coords)
    303 def _point_from_flatcoords(coords):
--> 304     result = creation.points(coords)
    306     # Older versions of GEOS (<= 3.9) don't automatically convert NaNs
    307     # to empty points -> do manually
    308     empties = np.isnan(coords).all(axis=1)
File ~/anaconda3/lib/python3.11/site-packages/shapely/decorators.py:77, in multithreading_enabled.<locals>.wrapped(*args, **kwargs)
     75     for arr in array_args:
     76         arr.flags.writeable = False
---> 77     return func(*args, **kwargs)
     78 finally:
     79     for arr, old_flag in zip(array_args, old_flags):
File ~/anaconda3/lib/python3.11/site-packages/shapely/creation.py:74, in points(coords, y, z, indices, out, **kwargs)
     72 coords = _xyz_to_coords(coords, y, z)
     73 if indices is None:
---> 74     return lib.points(coords, out=out, **kwargs)
     75 else:
     76     return simple_geometries_1d(coords, indices, GeometryType.POINT, out=out)
TypeError: ufunc 'points' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
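On the question of how coords is created: one plausible cause, not confirmed in this thread, is that the tissue positions file has a header row but is parsed as if it had none, so the coordinate columns come out as text instead of numbers. A minimal sketch of that effect on a toy CSV; the header names follow the newer Space Ranger layout and the helper load_coords is hypothetical.

```python
import io

import numpy as np
import pandas as pd

# Toy stand-in for a tissue_positions.csv that starts with a header row.
CSV_WITH_HEADER = """barcode,in_tissue,array_row,array_col,pxl_row_in_fullres,pxl_col_in_fullres
ACGCCTGACACGCGCT-1,0,0,0,1228,1561
TACCGATCCAACACTT-1,0,1,1,1419,1671
ATTAAAGCGGACGAGC-1,0,0,2,1227,1781
"""


def load_coords(csv_text, has_header):
    """Return the last two columns (pixel row, pixel column) as an array."""
    df = pd.read_csv(
        io.StringIO(csv_text),
        header=0 if has_header else None,
        index_col=0,
    )
    return df.iloc[:, -2:].to_numpy()


# Parsed without a header: the header strings become a data row,
# so the columns are no longer numeric.
wrong = load_coords(CSV_WITH_HEADER, has_header=False)

# Parsed with the header: the columns are plain integers.
right = load_coords(CSV_WITH_HEADER, has_header=True)

print(wrong.dtype, wrong.shape)
print(right.dtype, right.shape)
print(np.issubdtype(right.dtype, np.number))
```

Printing the dtype of the coordinate array right before the ShapesModel.parse() call would show whether this is what happens with the real file.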
Thanks for the attempt. At least now it’s the same error as for from_legacy_anndata(), so fixing one will fix both.
Which version of shapely and of geopandas are you using? Can you try installing the latest ones? Also, do you have pygeos installed? If so, can you try uninstalling it?
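A quick way to collect that information in one go, as a minimal sketch using only the standard library. The helper name installed_versions is hypothetical, and the package list is simply the set of packages mentioned in this thread.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


report = installed_versions(
    ["shapely", "geopandas", "pygeos", "spatialdata", "spatialdata-io"]
)
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

A line reading "pygeos: not installed" would rule out the pygeos case.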
If the above doesn’t work, I would check if we are in one of the following two cases:
1. I would try running the to_zarr.py for one of our heavily tested datasets, such as this one: spatialdata-sandbox/visium_associated_xenium_io at main · giovp/spatialdata-sandbox · GitHub. If the problem disappears, I would try to “mix and match” the two datasets to see if you find the file that causes the issue. If the problem persists, I would go with point 2 below.
2. There is a problem with the version of spatialdata or spatialdata-io being used, or with the dependencies that are installed. Could you kindly try installing them from main using
pip install git+https://github.com/scverse/spatialdata-io
pip install git+https://github.com/scverse/spatialdata
and then try again? If this doesn’t work, I would try redoing the same in a clean conda env, so that the dependencies are reinstalled.
Please let me know if this sheds some light on the issue.