Pupmapper

Overview

Pupmapper calculates pileup mappability scores for any genome of interest. These scores help identify genomic regions where variant calling from short-read whole-genome sequencing data may be unreliable, such as repeat elements or low-complexity regions where short reads cannot be uniquely aligned.

Pupmapper uses Genmap to compute k-mer mappability scores, then converts them into position-specific pileup mappability values by averaging the mappability of all k-mers that overlap each genomic position.
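The averaging step can be illustrated with a minimal Python sketch. This is not Pupmapper's own code: the function name pileup_mappability, the input layout (one score per k-mer start position, 0-based) and the handling of sequence ends (averaging over however many k-mers exist there) are all assumptions made for illustration.

```python
from typing import List


def pileup_mappability(kmer_scores: List[float], k: int) -> List[float]:
    """Average the scores of all k-mers overlapping each position.

    Assumption: kmer_scores[j] is the mappability of the k-mer that
    starts at 0-based position j, so the sequence has
    len(kmer_scores) + k - 1 positions.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    n_kmers = len(kmer_scores)
    if n_kmers == 0:
        return []
    seq_len = n_kmers + k - 1

    # Prefix sums let us get the sum of any run of k-mer scores in O(1).
    prefix = [0.0]
    for score in kmer_scores:
        prefix.append(prefix[-1] + score)

    pileup = []
    for pos in range(seq_len):
        # A k-mer starting at j covers positions j .. j + k - 1, so the
        # k-mers overlapping pos start between pos - k + 1 and pos,
        # clipped to the range of k-mers that actually exist.
        first = max(0, pos - k + 1)
        last = min(pos, n_kmers - 1)
        total = prefix[last + 1] - prefix[first]
        pileup.append(total / (last - first + 1))
    return pileup


# Toy example: three 2-mers over a 4-base sequence.
toy = pileup_mappability([1.0, 0.5, 1.0], k=2)
print(toy)
```

In the toy example the two interior positions are each covered by one fully mappable k-mer and one k-mer with score 0.5, so their pileup value drops below 1, while the two end positions are covered by a single k-mer each.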

Installation

The easiest install is via conda, which pulls in all dependencies (Genmap, bigtools) automatically:

conda install -c bioconda pupmapper

Or via pip:

pip install pupmapper

Basic usage

pupmapper all -i genome.fasta -o output_dir/ -k 150 -e 0
  • -k — k-mer length (should match your read length)
  • -e — number of allowed mismatches
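When Pupmapper is one step in a larger Python pipeline, the command above can be assembled and launched programmatically. The sketch below uses only the subcommand and flags shown in this section; the helper names build_pupmapper_command and run_pupmapper are hypothetical and are not part of Pupmapper.

```python
import shlex
import subprocess
from typing import List


def build_pupmapper_command(fasta: str, out_dir: str,
                            k: int = 150, e: int = 0) -> List[str]:
    """Build the argument list for 'pupmapper all' (hypothetical helper).

    Only the flags documented above are used: -i, -o, -k and -e.
    """
    if k < 1:
        raise ValueError("k-mer length must be positive")
    if e < 0:
        raise ValueError("number of mismatches cannot be negative")
    return ["pupmapper", "all",
            "-i", fasta,
            "-o", out_dir,
            "-k", str(k),
            "-e", str(e)]


def run_pupmapper(fasta: str, out_dir: str, k: int = 150, e: int = 0) -> None:
    """Run the command; raises CalledProcessError on a non-zero exit."""
    subprocess.run(build_pupmapper_command(fasta, out_dir, k, e), check=True)


# Show the command line that would be executed, without running it.
cmd = build_pupmapper_command("genome.fasta", "output_dir/")
print(shlex.join(cmd))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when file paths contain spaces.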

Optional: provide a GFF annotation file to get mappability summaries per annotated feature.