Introduction
pRESTO is a toolkit for processing raw reads from high-throughput sequencing of lymphocyte repertoires.
Dramatic improvements in high-throughput sequencing technologies now enable large-scale characterization of lymphocyte receptor repertoires, defined as the collection of trans-membrane antigen-receptor proteins located on the surface of T and B lymphocytes. The REpertoire Sequencing TOolkit (pRESTO) is a suite of utilities that handle all stages of sequence processing prior to germline segment assignment. pRESTO is designed to handle either single reads or paired-end reads. It includes features for quality control, primer masking, annotation of reads with sequence-embedded barcodes, generation of single-molecule consensus sequences, assembly of paired-end reads and identification of duplicate sequences. Numerous options for sequence sorting, sampling and conversion operations are also included.
AlignSets: Performs multiple alignment of sets of sequences sharing the same annotation.
AssemblePairs: Assembles paired-end reads into a complete sequence.
BuildConsensus: Constructs a consensus sequence from sets of sequences sharing the same annotation.
ClusterSets: Clusters groups of sequences sharing an annotation into sub-clusters.
CollapseSeq: Removes duplicate sequences.
ConvertHeaders: Converts sequence headers into the pRESTO annotation format.
EstimateError: Generates an estimate of the sequencing error rates for a data set using UID read group information.
FilterSeq: Filters sequences to high-quality reads using a variety of criteria.
MaskPrimers: Trims or masks primer and barcode sequences in multiplexed runs and annotates reads accordingly.
PairSeq: Uniformly sorts paired-end read files and copies annotations between mate-pairs.
ParseHeaders: Manipulates sequence annotations.
ParseLog: Converts the log output of pRESTO scripts into data tables.
SplitSeq: Performs sampling, sorting and subsetting of sequence files.
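Several of the utilities above revolve around annotations carried in the sequence header. As a rough illustration only, the sketch below assumes a header layout of the form ID|FIELD=VALUE|FIELD=VALUE and shows how such headers could be parsed, rebuilt, and used to record a duplicate count when identical sequences are collapsed. The function names (parse_header, format_header, collapse_duplicates), the example field PRIMER, and the use of a DUPCOUNT field are assumptions made for this example; this is not pRESTO's own code, and it matches sequences exactly, without handling ambiguous bases or any of the toolkit's options.

```python
def parse_header(header, field_sep="|", value_sep="="):
    """Split 'ID|KEY=VALUE|...' into (sequence_id, {KEY: VALUE}).

    The delimiters are assumptions for this sketch.
    """
    seq_id, *fields = header.lstrip(">").split(field_sep)
    annotations = {}
    for field in fields:
        key, _, value = field.partition(value_sep)
        annotations[key] = value
    return seq_id, annotations


def format_header(seq_id, annotations, field_sep="|", value_sep="="):
    """Rebuild a header string from an ID and an annotation dict."""
    parts = [seq_id] + [f"{k}{value_sep}{v}" for k, v in annotations.items()]
    return field_sep.join(parts)


def collapse_duplicates(records):
    """Collapse exact duplicate sequences.

    records: iterable of (header, sequence) pairs.
    Returns one (header, sequence) pair per unique sequence, keeping the
    header of the first occurrence and adding a hypothetical DUPCOUNT
    field with the number of copies seen.
    """
    counts = {}
    first_header = {}
    for header, seq in records:
        counts[seq] = counts.get(seq, 0) + 1
        first_header.setdefault(seq, header)

    collapsed = []
    for seq, n in counts.items():  # dicts keep insertion order (Python 3.7+)
        seq_id, annotations = parse_header(first_header[seq])
        annotations["DUPCOUNT"] = str(n)
        collapsed.append((format_header(seq_id, annotations), seq))
    return collapsed


# Made-up records for illustration
records = [
    ("SEQ1|PRIMER=IGHM", "ACGT"),
    ("SEQ2|PRIMER=IGHM", "ACGT"),
    ("SEQ3|PRIMER=IGHG", "TTGA"),
]

for header, seq in collapse_duplicates(records):
    print(header, seq)
# SEQ1|PRIMER=IGHM|DUPCOUNT=2 ACGT
# SEQ3|PRIMER=IGHG|DUPCOUNT=1 TTGA
```

Keeping annotations in the header means each processing step can add or read fields without needing a separate metadata file, which is the design idea the sketch tries to convey.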
Citation
pRESTO: a toolkit for processing high-throughput sequencing raw reads of lymphocyte receptor repertoires
Vander Heiden JA*, Yaari G*, Uduman M, Stern JNH, O'Connor KC, Hafler DA, Vigneault F, Kleinstein SH
Bioinformatics 2014; doi: 10.1093/bioinformatics/btu138
Additional Ig Repertoire Tools
Bayesian estimation of antigen-driven selection.
Clonal assignment, lineage reconstruction, diversity analysis, mutation profiling and selection analysis.
Personal genotyping assignment and novel polymorphism detection.
A 5-mer microsequence context model of somatic hypermutation targeting and substitution rates.