Introduction
Change-O is a collection of tools for analyzing immunoglobulin sequences.
Dramatic improvements in high-throughput sequencing technologies now enable large-scale characterization of immunoglobulin (Ig) repertoires, defined as the collection of trans-membrane antigen-receptor proteins located on the surface of T and B lymphocytes. Change-O is a suite of utilities for advanced analysis of Ig sequences following germline segment assignment. It takes output from IMGT/HighV-QUEST and operates on a tab-delimited database file. It includes features for creating a personalized genotype, identifying sequences that derive from a single B cell clone and inferring that clone's lineage tree, analyzing amino acid properties, calculating diversity, generating a model of somatic hypermutation, and quantifying selection pressure. Record sorting, grouping, and sampling operations are also included.
Change-O Utilities
Commandline Tools
The Change-O commandline tools provide a set of utilities for automated processing of Ig repertoire data following germline segment assignment using a tool such as IMGT/HighV-QUEST.
- AnalyzeAa: Analyzes amino acid properties of the CDR3 region.
- CreateGermlines: Reconstructs germline sequences from alignment information.
- DefineClones: Assigns Ig sequences to clonal groups.
- GapRecords: Performs multiple alignment on groups of sequence records.
- MakeDb: Parses germline alignment output from IMGT/HighV-QUEST into a tab-delimited database file for import into other Change-O tools.
- ParseDb: Performs basic database operations on tab-delimited files.
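To illustrate the kind of basic operations that can be run on a tab-delimited database file, the following is a minimal sketch in plain Python. It does not call Change-O itself. The column names (`SEQUENCE_ID`, `V_CALL`, `JUNCTION_LENGTH`), the example records, and the helper functions `read_db`, `select_records`, and `sort_records` are illustrative assumptions, not part of the Change-O interface.

```python
import csv
import io

# Hypothetical tab-delimited records. The column names and values are
# illustrative assumptions, not a specification of the Change-O format.
EXAMPLE_DB = (
    "SEQUENCE_ID\tV_CALL\tJUNCTION_LENGTH\n"
    "seq1\tIGHV1-2*02\t48\n"
    "seq2\tIGHV3-23*01\t36\n"
    "seq3\tIGHV1-2*02\t42\n"
)


def read_db(handle):
    """Read a tab-delimited file into a list of dictionaries, one per record."""
    return list(csv.DictReader(handle, delimiter="\t"))


def select_records(records, field, value):
    """Keep only the records whose field exactly matches value."""
    return [rec for rec in records if rec[field] == value]


def sort_records(records, field, numeric=False):
    """Return the records sorted by field, optionally comparing as numbers."""
    if numeric:
        return sorted(records, key=lambda rec: float(rec[field]))
    return sorted(records, key=lambda rec: rec[field])


records = read_db(io.StringIO(EXAMPLE_DB))
ighv1 = select_records(records, "V_CALL", "IGHV1-2*02")
by_length = sort_records(records, "JUNCTION_LENGTH", numeric=True)

for rec in by_length:
    print(rec["SEQUENCE_ID"], rec["V_CALL"], rec["JUNCTION_LENGTH"])
```

In practice the same functions could be pointed at a file opened with `open(path, newline="")` instead of the in-memory example.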
alakazam R package
- Infers maximum parsimony lineage trees for clonal groups.
- Calculates repertoire-level clonal diversity statistics.
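As an illustration of what a repertoire-level clonal diversity statistic can look like, the sketch below computes the Hill diversity index from a list of clone labels. This is one common family of diversity measures and is shown here as an assumption for illustration; the text above does not state which statistic the package computes. The function name `hill_diversity` and the clone labels are hypothetical.

```python
import math
from collections import Counter


def hill_diversity(clone_ids, q):
    """Hill diversity of order q for a list of clone labels.

    q = 0 counts the clones, q = 1 is the exponential of the Shannon
    entropy, and q = 2 is the inverse Simpson index.
    """
    counts = Counter(clone_ids)
    total = sum(counts.values())
    freqs = [count / total for count in counts.values()]
    if math.isclose(q, 1.0):
        # The general formula is undefined at q = 1, so use its limit.
        return math.exp(-sum(p * math.log(p) for p in freqs))
    return sum(p ** q for p in freqs) ** (1.0 / (1.0 - q))


# Hypothetical clone assignments, one label per sequence.
even = ["c1", "c2", "c3", "c4"]
skewed = ["c1", "c1", "c1", "c2"]

for order in (0, 1, 2):
    print(order, hill_diversity(even, order), hill_diversity(skewed, order))
```

A perfectly even repertoire gives the same value (the number of clones) at every order, while a skewed repertoire gives lower values as `q` increases and dominant clones are weighted more heavily.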
shazam R package
- Calculates nearest neighbor distances for all sequences in a dataset.
- Generates SHM mutability and substitution profiles.
- Performs Bayesian estimation of antigen-driven selection.
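The nearest neighbor distance calculation listed above can be sketched in a few lines. The version below uses plain Hamming distance between equal-length sequences, which is a simplifying assumption chosen for illustration and not necessarily the distance model the package uses. The function names and example sequences are hypothetical.

```python
def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def nearest_neighbor_distances(seqs):
    """Distance from each sequence to its closest other sequence.

    Only sequences of the same length are compared. A sequence with no
    same-length partner gets None.
    """
    result = []
    for i, seq in enumerate(seqs):
        dists = [
            hamming(seq, other)
            for j, other in enumerate(seqs)
            if j != i and len(other) == len(seq)
        ]
        result.append(min(dists) if dists else None)
    return result


# Hypothetical sequences used only to exercise the functions.
example = ["ACGT", "ACGA", "TTTT", "ACG"]
print(nearest_neighbor_distances(example))
```

This brute-force comparison is quadratic in the number of sequences, so it is only suitable for small examples.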
tigger R package
- Infers an individual germline genotype from repertoire data.
- Identifies novel V-region polymorphisms.
Citation
Change-O: a toolkit for analyzing large-scale B cell immunoglobulin repertoire sequencing data.
Gupta NT*, Vander Heiden JA*, Uduman M, Gadala-Maria D, Yaari G, Kleinstein SH.
Bioinformatics 2015; doi: 10.1093/bioinformatics/btv359