Remove genes from highly variable genes

Hi,

I would like to remove certain genes from my list of highly variable genes generated from sc.pp.highly_variable_genes. Is it enough to assign adata.var.highly_variable[gene] = False? Or is there some other way?

Thanks for any help.

Basically, yes.

I would do:

adata.var.loc[gene_list, "highly_variable"] = False

This avoids the chained assignment adata.var.highly_variable[gene] = False, which pandas warns about and which may not work in a future version, e.g.

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({"a": [True, False, True]})

In [3]: df
Out[3]: 
       a
0   True
1  False
2   True

In [4]: df.a[1] = True
<ipython-input-4-6ffeec29c796>:1: FutureWarning: ChainedAssignmentError: behaviour will change in pandas 3.0!
You are setting values through chained assignment. Currently this works in certain cases, but when using Copy-on-Write (which will become the default behaviour in pandas 3.0) this will never work to update the original DataFrame or Series, because the intermediate object on which we are setting values will behave as a copy.
A typical example is when you are setting values in a column of a DataFrame, like:

df["col"][row_indexer] = value

Use `df.loc[row_indexer, "col"] = values` instead, to perform the assignment in a single step and ensure this keeps updating the original `df`.

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

  df.a[1] = True
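
For completeness, here is a minimal sketch of the .loc approach on a plain pandas DataFrame standing in for adata.var. The gene names and gene_list are made up for illustration, and the intersection step is an added assumption: .loc raises a KeyError if any label in the list is missing from the index, so it can help to filter the list first.

```python
import pandas as pd

# Stand-in for adata.var: the index holds gene names, with one boolean column.
var = pd.DataFrame(
    {"highly_variable": [True, True, False, True]},
    index=["GeneA", "GeneB", "GeneC", "GeneD"],
)

# Hypothetical genes to exclude; "GeneX" is deliberately absent from the index.
gene_list = ["GeneA", "GeneD", "GeneX"]

# Keep only the labels that actually exist, so .loc does not raise KeyError.
present = var.index.intersection(gene_list)

# Single-step assignment: updates the original frame, no chained indexing.
var.loc[present, "highly_variable"] = False

print(var)
```

With a real AnnData object, the same assignment would be applied to adata.var directly, as in the line suggested above.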