sc.tl.ingest over-representing rare cell types in spatial transcriptomics data

I am trying to use sc.tl.ingest to predict cell types for my Xenium 5k spatial transcriptomics data.

I have subset the reference dataset so that it only contains the disease types and tissue types present in my STx data. However, when I run sc.tl.ingest, rare cell types are over-represented in the predictions, so I know it is not working correctly. I realize this is likely because my STx data has a smaller gene panel than scRNA-seq, fewer transcripts per cell, and fewer genes detected per cell. Is there a way to improve the prediction accuracy? Any other suggestions?
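
To put a number on "over-represented", one option is to compare the fraction of cells carrying each label in the reference with the fraction among the predictions. Below is a minimal sketch using only the standard library. The helper names `label_fractions` and `enrichment` and the toy label lists are hypothetical and made up for illustration, not real data or part of scanpy.

```python
from collections import Counter


def label_fractions(labels):
    """Return {label: fraction of cells} for an iterable of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def enrichment(ref_labels, pred_labels):
    """For each reference label, return predicted fraction / reference fraction.

    Values well above 1 mean the label is more common among the
    predictions than in the reference.
    """
    ref = label_fractions(ref_labels)
    pred = label_fractions(pred_labels)
    return {label: pred.get(label, 0.0) / frac for label, frac in ref.items()}


# Hypothetical toy labels (assumption, for illustration only)
ref_labels = ["T cell"] * 90 + ["mast cell"] * 10
pred_labels = ["T cell"] * 60 + ["mast cell"] * 40

ratios = enrichment(ref_labels, pred_labels)
for label, ratio in sorted(ratios.items(), key=lambda kv: -kv[1]):
    print(f"{label}: {ratio:.2f}x")
```

With AnnData objects, the two lists would presumably come from the label column in the reference `.obs` and the same column that sc.tl.ingest writes into the query `.obs` (the column name depends on what was passed as `obs`). The true composition of a tissue section can differ from the reference, so a high ratio is only a rough indicator, not proof of mislabeling.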

Thanks!