Analyze a sharded dataset
import lamindb as ln
import lnschema_bionty as lb
ln.track()
💡 loaded instance: testuser1/test-facs (lamindb 0.54.2)
💡 notebook imports: lamindb==0.54.2 lnschema_bionty==0.31.2 scanpy==1.9.5
💡 Transform(id='zzJzdgJ763Dyz8', name='Analyze a sharded dataset', short_name='facs3', version='0', type=notebook, updated_at=2023-09-27 19:04:03, created_by_id='DzTjkKse')
💡 Run(id='DRCTFuwx8bCnWhtb0XV7', run_at=2023-09-27 19:04:03, transform_id='zzJzdgJ763Dyz8', created_by_id='DzTjkKse')
ln.Dataset.filter().df()
id | name | description | version | hash | reference | reference_type | transform_id | run_id | file_id | initial_version_id | updated_at | created_by_id |
---|---|---|---|---|---|---|---|---|---|---|---|---|
8RZdIbll16NTrAwo7lRL | My versioned FACS dataset | None | 1 | fnzTGHE8BlkiMMIqHLDjyA | None | None | OWuTtS4SAponz8 | 6LxzHJKBOJu5s56VPocZ | 8RZdIbll16NTrAwo7lRL | None | 2023-09-27 19:03:42 | DzTjkKse |
8RZdIbll16NTrAwo7l1h | My versioned FACS dataset | None | 2 | H2N4gXSjQN7Qy3LOcETW | None | None | SmQmhrhigFPLz8 | iKuM6oyOKGVBDvx0YQ7P | None | 8RZdIbll16NTrAwo7lRL | 2023-09-27 19:03:53 | DzTjkKse |
dataset = ln.Dataset.filter(name="My versioned FACS dataset", version="2").one()
adata = dataset.load()
/opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/anndata/_core/anndata.py:1838: UserWarning: Observation names are not unique. To make them unique, call `.obs_names_make_unique`.
utils.warn_names_duplicates("obs")
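To make the query above concrete, here is a minimal pure-Python sketch of selecting exactly one record by name and version. It is an analogy only, not how lamindb implements `filter()` or `.one()`: the `records` list and the `one` helper are hypothetical, with ids and versions copied from the table above.

```python
# Hypothetical stand-in for the two registry rows shown in the table above.
records = [
    {"id": "8RZdIbll16NTrAwo7lRL", "name": "My versioned FACS dataset", "version": "1"},
    {"id": "8RZdIbll16NTrAwo7l1h", "name": "My versioned FACS dataset", "version": "2"},
]


def one(matches):
    """Return the single match, or raise if there are zero or several."""
    if len(matches) != 1:
        raise ValueError(f"expected exactly one record, found {len(matches)}")
    return matches[0]


# Filtering by name alone is ambiguous because both versions share the name.
by_name = [r for r in records if r["name"] == "My versioned FACS dataset"]

# Adding the version narrows the result to a single record.
selected = one([r for r in by_name if r["version"] == "2"])
print(selected["id"])
```

The point of the sketch is that the name alone matches both versions, so the version is what makes the result unique.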
The AnnData has the reference to the individual files in the .obs annotations:
adata.obs.file_id.cat.categories
Index(['8RZdIbll16NTrAwo7lRL', 'RY2suClyjVaJf7WYC0SH'], dtype='object')
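Because every observation carries the id of the file it came from, rows can be traced back to their shard. The sketch below illustrates the idea with the standard library only. The `file_ids` list is a hypothetical stand-in for the `adata.obs.file_id` column, reusing the two ids shown above with made-up row assignments.

```python
from collections import defaultdict

# Hypothetical per-observation file ids, standing in for adata.obs.file_id.
file_ids = [
    "8RZdIbll16NTrAwo7lRL",
    "RY2suClyjVaJf7WYC0SH",
    "8RZdIbll16NTrAwo7lRL",
    "RY2suClyjVaJf7WYC0SH",
    "RY2suClyjVaJf7WYC0SH",
]

# Collect the row positions that belong to each file.
rows_by_file = defaultdict(list)
for row, file_id in enumerate(file_ids):
    rows_by_file[file_id].append(row)

# Number of observations contributed by each file.
counts = {file_id: len(rows) for file_id, rows in rows_by_file.items()}
print(counts)
```

On the loaded object itself, a boolean mask built from `adata.obs.file_id` should give the corresponding subset.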
By default, the intersection of features is used:
adata.var.index
Index(['CD57', 'Cd19', 'Cd4', 'CD8', 'CD3', 'CD27', 'Cd14', 'Ccr7', 'CD127',
'CD28'],
dtype='object')
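Taking the intersection means that only features measured in every file survive the concatenation. The following sketch shows the idea in plain Python. The `intersect_features` helper and the two shard lists are hypothetical: the shared marker names come from the output above, while "CD45RA" and "CD20" are added purely for illustration.

```python
# Hypothetical feature lists for two shards (not the real per-file features).
shard_features = [
    ["CD57", "Cd19", "Cd4", "CD8", "CD3", "CD27", "Cd14", "Ccr7", "CD127", "CD28", "CD45RA"],
    ["CD20", "CD57", "Cd19", "Cd4", "CD8", "CD3", "CD27", "Cd14", "Ccr7", "CD127", "CD28"],
]


def intersect_features(feature_lists):
    """Keep features present in every shard, in the order of the first shard."""
    shared = set(feature_lists[0]).intersection(*feature_lists[1:])
    return [name for name in feature_lists[0] if name in shared]


common = intersect_features(shard_features)
print(common)
```

Features that appear in only one shard ("CD45RA" and "CD20" in this sketch) are dropped, which is why the variable index above contains only the shared markers.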
Let us create a plot:
markers = lb.CellMarker.lookup()
import scanpy as sc
sc.pp.pca(adata)
sc.pl.pca(adata, color=markers.cd14.name, save="_cd14")
filepath = "figures/pca_cd14.pdf"  # path of the figure that scanpy saves
WARNING: saving figure to file figures/pca_cd14.pdf
file = ln.File("./figures/pca_cd14.pdf", description="My result on CD14")
file.save()
file.view_flow()
# clean up test instance
!lamin delete --force test-facs
!rm -r test-facs
💡 deleting instance testuser1/test-facs
✅ deleted instance settings file: /home/runner/.lamin/instance--testuser1--test-facs.env
✅ instance cache deleted
✅ deleted '.lndb' sqlite file
❗ consider manually deleting your stored data: /home/runner/work/lamin-usecases/lamin-usecases/docs/test-facs