Hi,
I would like to remove certain genes from my list of highly variable genes generated from sc.pp.highly_variable_genes. Is it enough to assign adata.var.highly_variable[gene] = False? Or is there some other way?
Thanks for any help.
Basically, yes.
I would do:
adata.var.loc[gene_list, "highly_variable"] = False
pandas will complain about adata.var.highly_variable[gene] = False because it is a chained assignment,
and it will stop updating the original DataFrame once Copy-on-Write becomes the default in pandas 3.0, e.g.
In [1]: import pandas as pd
In [2]: df = pd.DataFrame({"a": [True, False, True]})
In [3]: df
Out[3]:
       a
0   True
1  False
2   True
In [4]: df.a[1] = True
<ipython-input-4-6ffeec29c796>:1: FutureWarning: ChainedAssignmentError: behaviour will change in pandas 3.0!
You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy.
A typical example is when you are setting values in a column of a DataFrame, like:
df["col"][row_indexer] = value
Use `df.loc[row_indexer, "col"] = values` instead, to perform the assignment in a single step and ensure this keeps updating the original `df`.
See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
df.a[1] = True
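For completeness, here is a minimal sketch of the single-step assignment on a small stand-in for adata.var. The gene names and the `gene_list` contents are made up for illustration, and a plain pandas DataFrame is used in place of a real AnnData object. It builds a boolean mask with `Index.isin` so that a gene missing from the index is simply skipped; whether you want that, or would rather get an error for a misspelled gene, is a judgment call.

```python
import pandas as pd

# Stand-in for adata.var: gene names as the index and a boolean
# "highly_variable" column (hypothetical data, not from a real dataset).
var = pd.DataFrame(
    {"highly_variable": [True, True, False, True]},
    index=["GeneA", "GeneB", "GeneC", "GeneD"],
)

# Genes to drop from the highly variable set; "GeneX" is deliberately absent
# from the index to show that the mask-based approach ignores it.
gene_list = ["GeneA", "GeneD", "GeneX"]

# Boolean mask over the rows, then one .loc assignment (no chained indexing).
mask = var.index.isin(gene_list)
var.loc[mask, "highly_variable"] = False

# Genes that are still flagged as highly variable.
remaining = var.index[var["highly_variable"]].tolist()
print(remaining)
```

On a real object the same two lines would be applied to adata.var instead of `var`.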