Installation

This pipeline for estimating genetic ancestry from scRNA-seq data is built in Snakemake and packaged in a Singularity image that contains all the required software. This should make the pipeline run consistently across different computational systems. We hope that installation and pipeline execution will be relatively painless as well.

Singularity Image

The only thing to note before you download the image is that it is ~6.5 GB, so it will take roughly 15-30 min to download, depending on your internet speed.

To download the singularity image:

wget https://www.dropbox.com/s/nnru32uw1ate2gu/ancestry_prediction_scRNAseq.sif
wget https://www.dropbox.com/s/pz3h9gzhcn2kgob/ancestry_prediction_scRNAseq.sif.md5

Next, confirm that the image downloaded completely by comparing its md5sum with the published md5sum. You can do that by running the following commands:

md5sum ancestry_prediction_scRNAseq.sif > downloaded_ancestry_prediction_scRNAseq.sif.md5
diff -s downloaded_ancestry_prediction_scRNAseq.sif.md5 ancestry_prediction_scRNAseq.sif.md5

If everything was downloaded correctly, that command should report:

Files downloaded_ancestry_prediction_scRNAseq.sif.md5 and ancestry_prediction_scRNAseq.sif.md5 are identical
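
The diff-based check compares whole lines, including the file name recorded in each .md5 file, so it can report a difference when only the recorded name or path differs. A minimal sketch of a checksum-only comparison follows. The function name verify_md5 is hypothetical and not part of the pipeline, and the sketch assumes the .md5 file uses the standard md5sum output format with the hash in the first field.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the pipeline): compare only the MD5 hashes,
# ignoring the file name recorded in the .md5 file.
verify_md5() {
    local image="$1"      # path to the downloaded image
    local md5_file="$2"   # path to the published .md5 file
    local expected actual

    # Assumes standard md5sum format: "<hash>  <file name>"
    expected=$(awk 'NR == 1 { print $1 }' "$md5_file") || return 2
    actual=$(md5sum "$image" | awk '{ print $1 }')

    if [ -n "$expected" ] && [ "$expected" = "$actual" ]; then
        echo "OK: checksums match"
    else
        echo "MISMATCH: expected '$expected', got '$actual'" >&2
        return 1
    fi
}
```

It could then be called as verify_md5 ancestry_prediction_scRNAseq.sif ancestry_prediction_scRNAseq.sif.md5, and a non-zero exit status would indicate that the image should be downloaded again.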

Note

Please note that the Singularity image and this documentation are updated with each release. This means that the most recent documentation may not be fully compatible with the Singularity image that you have.

To check which version of the Singularity image you have, so that you can match it with the documentation, run:

singularity inspect ancestry_prediction_scRNAseq.sif

If you run into any issues downloading the image or running anything from it, you can reach out to us by submitting an issue on GitHub.

Software versions - for the curious

Image build date: 24 July, 2022

| Software Group | Software | Version |
|---|---|---|
| Supporting Software | sinto | 0.8.4 |
| | Crossmap | 0.6.4 |
| | vartrix | v1.1.3 |
| | htslib | v1.13 |
| | samtools | v1.13 |
| | bcftools | v1.13 |
| | freebayes | v1.3.5 |
| R Supporting Packages (R v4.2.1) | argparse | v2.1.6 |
| | ComplexHeatmap | v2.12.0 |
| | data.table | v1.14.2 |
| | vcfR | v1.13.0 |
| | tidyverse | v1.3.2 |
| | cowplot | v1.1.1 |
| | colorspace | v2.0-3 |
| | ggplot2 | v3.3.6 |
| | caret | v6.0-92 |
| | RColorBrewer | v1.1-3 |
| Python Supporting Packages (Python v3.6.8) | argparse | v1.4.0 |
| | pysam | v0.19.1 |
| | pandas | v1.1.5 |
| | scipy | v1.5.4 |
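
To compare the listed versions with what a downloaded image actually contains, a few of the tools can be queried directly. This is a minimal sketch rather than a documented feature of the pipeline: print_tool_versions is a hypothetical helper, and it assumes that the tools are on the PATH inside the image and that each accepts --version, which is true of samtools, bcftools and R in general but has not been checked against this image.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first line of `--version` output for a few
# tools inside the image. Assumes each tool is on the PATH inside the image.
print_tool_versions() {
    local image="$1"   # path to the Singularity image
    local tool version

    for tool in samtools bcftools R; do
        # Keep only the first line, which normally carries the version string.
        version=$(singularity exec "$image" "$tool" --version 2>/dev/null | head -n 1)
        printf '%s: %s\n' "$tool" "${version:-not found}"
    done
}
```

For example, print_tool_versions ancestry_prediction_scRNAseq.sif would print one line per tool, with "not found" for any tool that could not be run.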

Next Steps

The next section, Data Preparation, explains the input files required by this pipeline and their expected formats.

Support

If you have any questions, suggestions or issues with any part of the Ancestry Prediction from scRNA-seq Data Pipeline, feel free to submit an issue or email Drew Neavin (d.neavin @ garvan.org.au).