MEEPTOOLS: A maximum expected error based FASTQ read filtering and trimming toolkit


Abstract in English

Next-generation sequencing technology rapidly produces massive volumes of data, and quality control of these data is essential to any genomic analysis. Here we present MEEPTOOLS, a collection of open-source tools that use the maximum expected error as a percentage of read length (MEEP score) to filter, trim, truncate and assess next-generation DNA sequencing data in FASTQ format. MEEPTOOLS takes a non-traditional approach to read filtering and trimming: it works with the error probabilities of the bases in a read on a non-logarithmic scale, rather than with the logarithmic quality scores directly. This method simultaneously retains more reliable bases and removes more unreliable bases than traditional quality-filtering strategies.
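To make the MEEP score concrete, the following is a minimal Python sketch of one way to compute it from a FASTQ quality string. It is not the MEEPTOOLS implementation. It assumes the standard Phred relation p = 10^(-Q/10), Phred+33 (Sanger) quality encoding, and that the score is the sum of per-base error probabilities divided by read length, times 100. The function names, the `offset` parameter and the `max_meep` threshold are hypothetical and chosen only for illustration.

```python
def meep_score(quality_string, offset=33):
    """Expected errors per 100 bases for one read (illustrative sketch).

    Assumes Phred-scaled qualities encoded as ASCII with the given offset
    (33 for Sanger / Illumina 1.8+); this is an assumption, not a detail
    taken from MEEPTOOLS itself.
    """
    if not quality_string:
        raise ValueError("quality string must not be empty")
    # Convert each quality character to an error probability on a
    # non-logarithmic scale: p = 10^(-Q/10).
    error_probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality_string]
    # Expected number of errors in the read, as a percentage of its length.
    return 100.0 * sum(error_probs) / len(error_probs)


def passes_filter(quality_string, max_meep=1.0, offset=33):
    """Keep a read only if its MEEP score is at or below max_meep.

    The default cutoff of 1.0 is a placeholder, not a recommended value.
    """
    return meep_score(quality_string, offset) <= max_meep
```

Because the probabilities are averaged on a linear scale, a single very poor base raises the score far more than it would lower a mean quality score. For example, a read whose bases all have quality 40 except one with quality 2 has a mean quality close to 40 if the read is long, yet that one base alone contributes about 0.63 expected errors.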

Download