Very slow paired reads mode for transcriptome  #31

Open
@siddharthab

Hi!

I am trying to make UMICollapse the default tool in one of the popular RNA-seq analysis pipelines (see nf-core/rnaseq#1087).

I am not sure whether this is already covered by #5, but when running on paired reads aligned to the human transcriptome, UMICollapse appears to be about 20x slower than umi-tools. UMICollapse takes between 9 and 10 hours for the BAM files we are considering, whereas umi-tools takes ~30 minutes. The slowness is present in both two-pass and single-pass modes.

I have not gone through how UMICollapse works, so I do not have an opinion on whether this is expected or not. If it is expected, some commentary on this in the README would be appreciated.

I have made some test data available on Google Drive. The BAM file has 44319354 read pairs with 8 bp UMIs.

Thank you for continuing to follow up on your work from a long time ago.
