I am re-analyzing a previously published dataset that was used to build an atlas of cell types in a given tissue. I have re-run the full pipeline starting from the original fastq files, and I have assigned each cell the same label that the authors gave it.
The issue: when I run UMAP and plot the cell type labels in UMAP space, most cells of the same type group together, but a number of cells end up intermingled in “incorrect” territories. Is it acceptable, according to current best practices, to correct the labels of those cells based solely on their UMAP position? Is there any other way to check why this happens? Should I simply assume that the original authors made some mistakes?
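To make the "other way to check" concrete, one idea I have considered is to ignore the 2D UMAP coordinates and instead compare each cell's label with the majority label among its nearest neighbours in a higher-dimensional space (for example the top principal components). Below is a minimal sketch of that idea in plain Python. The function name `knn_label_agreement`, the toy `embedding`, and the value of `k` are all made up for illustration; they do not come from the original paper or from any library, and I am not sure this is the recommended approach.

```python
from collections import Counter
import math


def knn_label_agreement(embedding, labels, k=3):
    """For each cell, return (majority label among its k nearest neighbours,
    fraction of those neighbours that share the cell's own label)."""
    results = []
    for i, point in enumerate(embedding):
        # Euclidean distance from cell i to every other cell, nearest first
        dists = sorted(
            (math.dist(point, other), j)
            for j, other in enumerate(embedding)
            if j != i
        )
        neighbour_labels = [labels[j] for _, j in dists[:k]]
        majority = Counter(neighbour_labels).most_common(1)[0][0]
        agreement = neighbour_labels.count(labels[i]) / k
        results.append((majority, agreement))
    return results


# Hypothetical toy data: rows are cells, columns stand in for principal components
embedding = [
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],  # group near the origin
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1], [5.1, 5.1],  # second, distant group
]
# Cell 3 sits among the "A" cells but carries the label "B"
labels = ["A", "A", "A", "B", "B", "B", "B", "B"]

results = knn_label_agreement(embedding, labels, k=3)

# Cells whose own label disagrees with the majority of their neighbours
flagged = [i for i, (majority, _) in enumerate(results) if majority != labels[i]]
print(flagged)
```

On real data I would presumably run this on the PCA (or batch-corrected) embedding rather than on UMAP coordinates, and treat flagged cells as candidates for closer inspection (marker genes, QC metrics, doublet scores) rather than relabelling them automatically.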
Thank you all!