AlignSets

usage: AlignSets.py [-h] [--version]  ...

Performs multiple alignment of input sequences by group

optional arguments:
  -h, --help  show this help message and exit
  --version   show program's version number and exit

subcommands:
              Alignment method
    muscle    Align sequence sets using MUSCLE
    offset    Align sequence sets using predefined 5' offset
    table     Create a 5' offset table by primer multiple alignment

output files:
    align-pass
        multiple aligned reads.
    align-fail
        raw reads failing multiple alignment.
    offsets-forward
        5' offset table for input into offset subcommand.
    offsets-reverse
        3' offset table for input into offset subcommand.

output annotation fields:
    None

muscle

usage: AlignSets.py muscle [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta]
                           [--failed] [--log LOG_FILE]
                           [--delim DELIMITER DELIMITER DELIMITER]
                           [--nproc NPROC] [--outdir OUT_DIR]
                           [--outname OUT_NAME] [--bf BARCODE_FIELD] [--div]
                           [--exec MUSCLE_EXEC]

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --log LOG_FILE        Specify to write verbose logging to a file. May not be
                        specified with multiple input files. (default: None)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --nproc NPROC         The number of simultaneous computational processes to
                        execute (CPU cores to utilize). (default: 4)
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  --bf BARCODE_FIELD    The annotation field containing barcode labels for
                        sequence grouping (default: BARCODE)
  --div                 Specify to calculate nucleotide diversity of each set
                        (average pairwise error rate) (default: False)
  --exec MUSCLE_EXEC    The location of the MUSCLE executable (default:
                        /usr/local/bin/muscle)
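
The --div option reports nucleotide diversity as the average pairwise error
rate of each set. The following is a minimal sketch of one way such a value
could be computed, assuming equal-length aligned sequences and counting every
mismatched position as an error. The function names are hypothetical, and the
way AlignSets.py itself treats gaps and N characters may differ.

```python
from itertools import combinations

def pairwise_error_rate(seq_a, seq_b):
    """Fraction of mismatched positions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        return 0.0
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches / len(seq_a)

def nucleotide_diversity(seqs):
    """Average pairwise error rate over all pairs of sequences in a set."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        # A set with fewer than two sequences has no pairs to compare.
        return 0.0
    return sum(pairwise_error_rate(a, b) for a, b in pairs) / len(pairs)

# Two of the three pairs differ at one of four positions.
reads = ["ACGT", "ACGA", "ACGT"]
print(round(nucleotide_diversity(reads), 4))
```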

offset

usage: AlignSets.py offset [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta]
                           [--failed] [--log LOG_FILE]
                           [--delim DELIMITER DELIMITER DELIMITER]
                           [--nproc NPROC] [--outdir OUT_DIR]
                           [--outname OUT_NAME] [--bf BARCODE_FIELD] [--div]
                           [-d OFFSET_TABLE] [--pf PRIMER_FIELD]
                           [--mode {pad,cut}]

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --log LOG_FILE        Specify to write verbose logging to a file. May not be
                        specified with multiple input files. (default: None)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --nproc NPROC         The number of simultaneous computational processes to
                        execute (CPU cores to utilize). (default: 4)
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  --bf BARCODE_FIELD    The annotation field containing barcode labels for
                        sequence grouping (default: BARCODE)
  --div                 Specify to calculate nucleotide diversity of each set
                        (average pairwise error rate) (default: False)
  -d OFFSET_TABLE       The tab delimited file of offset tags and values
                        (default: None)
  --pf PRIMER_FIELD     The primer field to use for offset assignment
                        (default: PRIMER)
  --mode {pad,cut}      Specifies whether to align sequences by padding with
                        gaps or by cutting the 5' sequence to a common start
                        position (default: pad)
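
The two --mode values can be illustrated with a short sketch. It assumes that
the offset of a primer is the number of positions by which its reads start
downstream of the most 5' primer, that "pad" prepends that many gap
characters, and that "cut" trims every read to the start position of the
primer with the largest offset. The function name and data layout are
hypothetical and are not taken from AlignSets.py.

```python
def align_by_offset(reads, offsets, mode="pad"):
    """Align reads using a per-primer 5' offset.

    reads   : list of (primer_label, sequence) tuples
    offsets : dict mapping primer_label -> offset (assumed meaning, see above)
    """
    if mode == "pad":
        # Shift each read right by prepending gap characters.
        return ["-" * offsets[primer] + seq for primer, seq in reads]
    if mode == "cut":
        # Trim each read so all reads begin at the largest offset in use.
        max_offset = max(offsets[primer] for primer, _ in reads)
        return [seq[max_offset - offsets[primer]:] for primer, seq in reads]
    raise ValueError("mode must be 'pad' or 'cut'")

offsets = {"P1": 0, "P2": 2}
reads = [("P1", "AACCGGTT"), ("P2", "CCGGTT")]
print(align_by_offset(reads, offsets, mode="pad"))  # ['AACCGGTT', '--CCGGTT']
print(align_by_offset(reads, offsets, mode="cut"))  # ['CCGGTT', 'CCGGTT']
```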

table

usage: AlignSets.py table [-h] [--failed]
                          [--delim DELIMITER DELIMITER DELIMITER]
                          [--outdir OUT_DIR] [--outname OUT_NAME] -p
                          PRIMER_FILE [--reverse] [--exec MUSCLE_EXEC]

optional arguments:
  -h, --help            show this help message and exit
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  -p PRIMER_FILE        A FASTA or REGEX file containing primer sequences
                        (default: None)
  --reverse             If specified create a 3' offset table instead
                        (default: False)
  --exec MUSCLE_EXEC    The location of the MUSCLE executable (default:
                        /usr/local/bin/muscle)
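
As a rough illustration of what an offset table holds, the sketch below
derives one from primers that have already been multiple aligned. It assumes
the offset of a primer is the number of gap characters at its 5' end (or at
its 3' end for --reverse) and writes one tab-delimited "label, value" line per
primer. Both the counting rule and the function names are assumptions for
illustration; the table subcommand itself runs MUSCLE to produce the
alignment.

```python
def offset_table(aligned_primers, reverse=False):
    """Count end gaps for each aligned primer.

    aligned_primers : dict mapping primer label -> gapped aligned sequence
    reverse         : count gaps at the 3' end instead of the 5' end
    """
    table = {}
    for name, seq in aligned_primers.items():
        stripped = seq.rstrip("-") if reverse else seq.lstrip("-")
        table[name] = len(seq) - len(stripped)
    return table

def format_table(table):
    """Render the table as tab-delimited text, one primer per line."""
    return "\n".join(f"{name}\t{value}" for name, value in table.items())

aligned = {"P1": "AACCGG--", "P2": "--CCGGTT"}
print(format_table(offset_table(aligned)))
print(format_table(offset_table(aligned, reverse=True)))
```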

AssemblePairs

usage: AssemblePairs.py [-h] [--version]  ...

Assembles paired-end reads into a single sequence

optional arguments:
  -h, --help  show this help message and exit
  --version   show program's version number and exit

subcommands:
              Assembly method
    align     Assemble pairs by aligning ends
    join      Assemble pairs by concatenating ends
    reference
              Assemble pairs by aligning reads against a reference database

output files:
    assemble-pass
        successfully assembled reads.
    assemble-fail
        raw reads failing paired-end assembly.

output annotation fields:
    <user defined>
        annotation fields specified by the --1f or --2f arguments.
        annotation fields specified by the --1f or --2f arguments.
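
The sketch below illustrates how annotation fields might be carried from the
head and tail records into an assembled record, using the default --delim
values ('|', '=', ','). The header layout (an identifier followed by
NAME=VALUE blocks), the function names, and the choice to join colliding
values with the value delimiter are assumptions made for illustration.

```python
DELIM = ("|", "=", ",")  # block, field/value, and value-list delimiters

def parse_header(header, delim=DELIM):
    """Split a header into an ID plus a dict of annotation fields."""
    blocks = header.split(delim[0])
    fields = {"ID": blocks[0]}
    for block in blocks[1:]:
        name, _, value = block.partition(delim[1])
        fields[name] = value
    return fields

def format_header(fields, delim=DELIM):
    """Rebuild a header string from an annotation dict."""
    parts = [fields["ID"]]
    parts += [f"{k}{delim[1]}{v}" for k, v in fields.items() if k != "ID"]
    return delim[0].join(parts)

def copy_fields(head, tail, head_fields=(), tail_fields=(), delim=DELIM):
    """Copy the requested head and tail fields into one assembled header."""
    head_ann = parse_header(head, delim)
    tail_ann = parse_header(tail, delim)
    out = {"ID": head_ann["ID"]}
    for name in head_fields:
        if name in head_ann:
            out[name] = head_ann[name]
    for name in tail_fields:
        if name not in tail_ann:
            continue
        if name in out:
            # Assumed behavior: append the tail value to the head value.
            out[name] = delim[2].join([out[name], tail_ann[name]])
        else:
            out[name] = tail_ann[name]
    return format_header(out, delim)

print(copy_fields("R1|PRIMER=VH1|BARCODE=ACGT", "R1|PRIMER=IGHG",
                  head_fields=["BARCODE", "PRIMER"], tail_fields=["PRIMER"]))
# R1|BARCODE=ACGT|PRIMER=VH1,IGHG
```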

align

usage: AssemblePairs.py align [-h] -1 SEQ_FILES_1 [SEQ_FILES_1 ...] -2
                              SEQ_FILES_2 [SEQ_FILES_2 ...] [--fasta]
                              [--failed] [--log LOG_FILE]
                              [--delim DELIMITER DELIMITER DELIMITER]
                              [--nproc NPROC] [--outdir OUT_DIR]
                              [--outname OUT_NAME]
                              [--coord {illumina,solexa,sra,454,presto}]
                              [--rc {head,tail,both}]
                              [--1f HEAD_FIELDS [HEAD_FIELDS ...]]
                              [--2f TAIL_FIELDS [TAIL_FIELDS ...]]
                              [--alpha ALPHA] [--maxerror MAX_ERROR]
                              [--minlen MIN_LEN] [--maxlen MAX_LEN]
                              [--scanrev]

optional arguments:
  -h, --help            show this help message and exit
  -1 SEQ_FILES_1 [SEQ_FILES_1 ...]
                        An ordered list of FASTA/FASTQ files containing
                        head/primary sequences. (default: None)
  -2 SEQ_FILES_2 [SEQ_FILES_2 ...]
                        An ordered list of FASTA/FASTQ files containing
                        tail/secondary sequences. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --log LOG_FILE        Specify to write verbose logging to a file. May not be
                        specified with multiple input files. (default: None)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --nproc NPROC         The number of simultaneous computational processes to
                        execute (CPU cores to utilize). (default: 4)
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  --coord {illumina,solexa,sra,454,presto}
                        The format of the sequence identifier which defines
                        shared coordinate information across paired ends
                        (default: presto)
  --rc {head,tail,both}
                        Specify to reverse complement sequences before
                        stitching (default: None)
  --1f HEAD_FIELDS [HEAD_FIELDS ...]
                        Specify annotation fields to copy from head records
                        into assembled record (default: None)
  --2f TAIL_FIELDS [TAIL_FIELDS ...]
                        Specify annotation fields to copy from tail records
                        into assembled record (default: None)
  --alpha ALPHA         Significance threshold for sequence assembly (default:
                        1e-05)
  --maxerror MAX_ERROR  Maximum allowable error rate (default: 0.3)
  --minlen MIN_LEN      Minimum sequence length to scan for overlap (default:
                        8)
  --maxlen MAX_LEN      Maximum sequence length to scan for overlap (default:
                        1000)
  --scanrev             If specified, scan past the end of the tail sequence
                        to allow the head sequence to overhang the end of the
                        tail sequence. (default: False)
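
To make the roles of --minlen, --maxlen and --maxerror concrete, here is a
simplified sketch of an overlap scan. It assumes the tail read has already
been oriented onto the same strand as the head read, scores each candidate
overlap only by its mismatch rate, and prefers the lowest error (longest
overlap on ties). The significance test controlled by --alpha and the
--scanrev behavior are deliberately omitted, and all names are hypothetical.

```python
def best_overlap(head, tail, min_len=8, max_len=1000, max_error=0.3):
    """Return (overlap_length, error_rate) of the best overlap, or None."""
    best = None
    limit = min(max_len, len(head), len(tail))
    for n in range(min_len, limit + 1):
        # Compare the last n bases of the head with the first n of the tail.
        mismatches = sum(a != b for a, b in zip(head[-n:], tail[:n]))
        error = mismatches / n
        if error > max_error:
            continue
        if best is None or error < best[1] or (error == best[1] and n > best[0]):
            best = (n, error)
    return best

def assemble(head, tail, **kwargs):
    """Merge head and tail at the best overlap; None if no overlap passes."""
    hit = best_overlap(head, tail, **kwargs)
    if hit is None:
        return None
    overlap_len, _ = hit
    return head + tail[overlap_len:]

head = "AAAACCCCGGGGTTTT"
tail = "GGGGTTTTACGTACGT"
print(best_overlap(head, tail))  # (8, 0.0)
print(assemble(head, tail))      # AAAACCCCGGGGTTTTACGTACGT
```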

reference

usage: AssemblePairs.py reference [-h] -1 SEQ_FILES_1 [SEQ_FILES_1 ...] -2
                                  SEQ_FILES_2 [SEQ_FILES_2 ...] [--fasta]
                                  [--failed] [--log LOG_FILE]
                                  [--delim DELIMITER DELIMITER DELIMITER]
                                  [--nproc NPROC] [--outdir OUT_DIR]
                                  [--outname OUT_NAME]
                                  [--coord {illumina,solexa,sra,454,presto}]
                                  [--rc {head,tail,both}]
                                  [--1f HEAD_FIELDS [HEAD_FIELDS ...]]
                                  [--2f TAIL_FIELDS [TAIL_FIELDS ...]] -r
                                  REF_FILE [--minident MIN_IDENT]
                                  [--evalue EVALUE] [--maxhits MAX_HITS]
                                  [--fill] [--exec USEARCH_EXEC]

optional arguments:
  -h, --help            show this help message and exit
  -1 SEQ_FILES_1 [SEQ_FILES_1 ...]
                        An ordered list of FASTA/FASTQ files containing
                        head/primary sequences. (default: None)
  -2 SEQ_FILES_2 [SEQ_FILES_2 ...]
                        An ordered list of FASTA/FASTQ files containing
                        tail/secondary sequences. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --log LOG_FILE        Specify to write verbose logging to a file. May not be
                        specified with multiple input files. (default: None)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --nproc NPROC         The number of simultaneous computational processes to
                        execute (CPU cores to utilize). (default: 4)
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  --coord {illumina,solexa,sra,454,presto}
                        The format of the sequence identifier which defines
                        shared coordinate information across paired ends
                        (default: presto)
  --rc {head,tail,both}
                        Specify to reverse complement sequences before
                        stitching (default: None)
  --1f HEAD_FIELDS [HEAD_FIELDS ...]
                        Specify annotation fields to copy from head records
                        into assembled record (default: None)
  --2f TAIL_FIELDS [TAIL_FIELDS ...]
                        Specify annotation fields to copy from tail records
                        into assembled record (default: None)
  -r REF_FILE           A FASTA file containing the reference sequence
                        database. (default: None)
  --minident MIN_IDENT  Minimum identity of the assembled sequence required to
                        call a valid assembly (between 0 and 1). (default:
                        0.5)
  --evalue EVALUE       Minimum E-value for the ublast reference alignment for
                        both the head and tail sequence. (default: 1e-05)
  --maxhits MAX_HITS    Maximum number of hits from ublast to check for
                        matching head and tail sequence reference alignments.
                        (default: 100)
  --fill                Specify to change the behavior of inserted characters
                        when the head and tail sequences do not overlap. If
                        specified, this will result in insertion of the V
                        region reference sequence instead of a sequence of Ns
                        in the non-overlapping region. Warning: you could end
                        up making chimeric sequences by using this option.
                        (default: False)
  --exec USEARCH_EXEC   The path to the usearch executable file. (default:
                        /usr/local/bin/usearch)

join

usage: AssemblePairs.py join [-h] -1 SEQ_FILES_1 [SEQ_FILES_1 ...] -2
                             SEQ_FILES_2 [SEQ_FILES_2 ...] [--fasta]
                             [--failed] [--log LOG_FILE]
                             [--delim DELIMITER DELIMITER DELIMITER]
                             [--nproc NPROC] [--outdir OUT_DIR]
                             [--outname OUT_NAME]
                             [--coord {illumina,solexa,sra,454,presto}]
                             [--rc {head,tail,both}]
                             [--1f HEAD_FIELDS [HEAD_FIELDS ...]]
                             [--2f TAIL_FIELDS [TAIL_FIELDS ...]] [--gap GAP]

optional arguments:
  -h, --help            show this help message and exit
  -1 SEQ_FILES_1 [SEQ_FILES_1 ...]
                        An ordered list of FASTA/FASTQ files containing
                        head/primary sequences. (default: None)
  -2 SEQ_FILES_2 [SEQ_FILES_2 ...]
                        An ordered list of FASTA/FASTQ files containing
                        tail/secondary sequences. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --log LOG_FILE        Specify to write verbose logging to a file. May not be
                        specified with multiple input files. (default: None)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --nproc NPROC         The number of simultaneous computational processes to
                        execute (CPU cores to utilize). (default: 4)
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  --coord {illumina,solexa,sra,454,presto}
                        The format of the sequence identifier which defines
                        shared coordinate information across paired ends
                        (default: presto)
  --rc {head,tail,both}
                        Specify to reverse complement sequences before
                        stitching (default: None)
  --1f HEAD_FIELDS [HEAD_FIELDS ...]
                        Specify annotation fields to copy from head records
                        into assembled record (default: None)
  --2f TAIL_FIELDS [TAIL_FIELDS ...]
                        Specify annotation fields to copy from tail records
                        into assembled record (default: None)
  --gap GAP             Number of N characters to place between ends (default:
                        0)
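
The join method can be pictured as plain concatenation with an optional run
of N characters between the two ends. The sketch below assumes sequences are
handled as simple strings and that the tail read is reverse complemented
first (as with --rc tail). The helper names are hypothetical, and quality
scores are not handled.

```python
# Translation table for complementing unambiguous DNA bases (N maps to N).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def reverse_complement(seq):
    """Reverse complement a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def join_pair(head, tail, gap=0):
    """Concatenate head and tail with `gap` N characters in between."""
    return head + "N" * gap + tail

# Reverse complement the tail read, then join with a gap of three Ns.
print(join_pair("ACGT", reverse_complement("TCAA"), gap=3))  # ACGTNNNTTGA
```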

BuildConsensus

usage: BuildConsensus.py [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta]
                         [--failed] [--log LOG_FILE]
                         [--delim DELIMITER DELIMITER DELIMITER]
                         [--nproc NPROC] [--outdir OUT_DIR]
                         [--outname OUT_NAME] [--version] [-n MIN_COUNT]
                         [--bf BARCODE_FIELD] [-q MIN_QUAL] [--freq MIN_FREQ]
                         [--maxgap MAX_GAP] [--pf PRIMER_FIELD]
                         [--prcons PRIMER_FREQ]
                         [--cf COPY_FIELDS [COPY_FIELDS ...]]
                         [--act {min,max,sum,set,majority} [{min,max,sum,set,majority} ...]]
                         [--dep]
                         [--maxdiv MAX_DIVERSITY | --maxerror MAX_ERROR]

Builds a consensus sequence for each set of input sequences

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --log LOG_FILE        Specify to write verbose logging to a file. May not be
                        specified with multiple input files. (default: None)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --nproc NPROC         The number of simultaneous computational processes to
                        execute (CPU cores to utilize). (default: 4)
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  --version             show program's version number and exit
  -n MIN_COUNT          The minimum number of sequences needed to define a
                        valid consensus (default: 1)
  --bf BARCODE_FIELD    Position of description barcode field to group
                        sequences by (default: BARCODE)
  -q MIN_QUAL           Consensus quality score cut-off under which an
                        ambiguous character is assigned; does not apply when
                        quality scores are unavailable (default: 0)
  --freq MIN_FREQ       Fraction of character occurrences under which an
                        ambiguous character is assigned. (default: 0.6)
  --maxgap MAX_GAP      If specified, this defines a cut-off for the frequency
                        of allowed gap values for each position. Positions
                        exceeding the threshold are deleted from the
                        consensus. If not defined, positions are always
                        retained. (default: None)
  --pf PRIMER_FIELD     Specifies the field name of the primer annotations
                        (default: None)
  --prcons PRIMER_FREQ  Specify to define a minimum primer frequency required
                        to assign a consensus primer, and filter out sequences
                        with minority primers from the consensus building step
                        (default: None)
  --cf COPY_FIELDS [COPY_FIELDS ...]
                        Specifies a set of additional annotation fields to
                        copy into the consensus sequence annotations.
                        (default: None)
  --act {min,max,sum,set,majority} [{min,max,sum,set,majority} ...]
                        List of actions to take for each copy field which
                        defines how each annotation will be combined into a
                        single value. The actions "min", "max", "sum" perform
                        the corresponding mathematical operation on numeric
                        annotations. The action "set" combines annotations
                        into a comma delimited list of unique values and adds
                        an annotation named <FIELD>_COUNT specifying the count
                        of each item in the set. The action "majority" assigns
                        the most frequent annotation to the consensus
                        annotation and adds an annotation named <FIELD>_FREQ
                        specifying the frequency of the majority value.
                        (default: None)
  --dep                 Specify to calculate consensus quality with a non-
                        independence assumption (default: False)
  --maxdiv MAX_DIVERSITY
                        Specify to calculate the nucleotide diversity of each
                        read group (average pairwise error rate) and remove
                        groups exceeding the given diversity threshold.
                        Diversity is calculated for all positions within the
                        read group, ignoring any character filtering imposed
                        by the -q, --freq and --maxgap arguments. Mutually
                        exclusive with --maxerror. (default: None)
  --maxerror MAX_ERROR  Specify to calculate the error rate of each read group
                        (rate of mismatches from consensus) and remove groups
                        exceeding the given error threshold. The error rate is
                        calculated against the final consensus sequence, which
                        may include masked positions due to the -q and --freq
                        arguments and may have deleted positions due to the
                        --maxgap argument. Mutually exclusive with --maxdiv.
                        (default: None)
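
The interaction between --freq and --maxgap can be sketched with a simple
frequency-based consensus over an aligned read group. This is a minimal
illustration that ignores quality scores (so -q does not apply) and assumes
that character frequencies are computed over all reads in the group. The
function name is hypothetical, and the consensus logic in BuildConsensus.py
may differ in detail.

```python
from collections import Counter

GAP = "-"

def build_consensus(seqs, min_freq=0.6, max_gap=None):
    """Frequency consensus of equal-length aligned sequences."""
    consensus = []
    for column in zip(*seqs):
        counts = Counter(column)
        gap_freq = counts.get(GAP, 0) / len(column)
        if max_gap is not None and gap_freq > max_gap:
            # Position exceeds the gap threshold and is deleted.
            continue
        bases = Counter(c for c in column if c != GAP)
        if not bases:
            consensus.append("N")
            continue
        base, n = bases.most_common(1)[0]
        # Assign an ambiguous character when no base is frequent enough.
        consensus.append(base if n / len(column) >= min_freq else "N")
    return "".join(consensus)

group = ["ACGT-A", "ACGTTA", "ACCT-A", "AGGT-C", "ACGT-A"]
print(build_consensus(group))               # ACGTNA
print(build_consensus(group, max_gap=0.5))  # ACGTA
```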

output files:
    consensus-pass
        consensus reads.
    consensus-fail
        raw reads failing consensus filtering criteria.

output annotation fields:
    PRIMER
        a comma delimited list of unique primer annotations found within the
        barcode read group.
    PRCOUNT
        a comma delimited list of the corresponding counts of unique primer
        annotations.
    PRCONS
        the majority primer within the barcode read group.
    PRFREQ
        the frequency of the majority primer.
    CONSCOUNT
        the count of reads within the barcode read group which contributed to
        the consensus sequence. This is the total size of the read group,
        minus sequences excluded due to user defined filtering criteria.
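
The four primer fields above can be derived from the primer annotations of a
single barcode read group. The sketch below is an illustration under the
assumption that PRIMER and PRCOUNT list unique values in first-seen order;
the function name is hypothetical and the ordering used by BuildConsensus.py
is not stated in this help text.

```python
from collections import Counter

def primer_annotations(primers, delim=","):
    """Summarize the primer labels of one barcode read group."""
    if not primers:
        raise ValueError("read group has no primer annotations")
    counts = Counter(primers)
    names = list(counts)  # unique labels in first-seen order
    majority, n = counts.most_common(1)[0]
    return {
        "PRIMER": delim.join(names),
        "PRCOUNT": delim.join(str(counts[name]) for name in names),
        "PRCONS": majority,
        "PRFREQ": n / len(primers),
    }

print(primer_annotations(["IGHG", "IGHG", "IGHM", "IGHG"]))
# {'PRIMER': 'IGHG,IGHM', 'PRCOUNT': '3,1', 'PRCONS': 'IGHG', 'PRFREQ': 0.75}
```

A --prcons threshold would then be compared against the PRFREQ value to
decide whether the group receives a consensus primer.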

ClusterSets

usage: ClusterSets.py [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta] [--failed]
                      [--log LOG_FILE] [--delim DELIMITER DELIMITER DELIMITER]
                      [--nproc NPROC] [--outdir OUT_DIR] [--outname OUT_NAME]
                      [--version] [-f BARCODE_FIELD] [-k CLUSTER_FIELD]
                      [--id IDENT] [--start SEQ_START] [--end SEQ_END]
                      [--exec USEARCH_EXEC]

Cluster sequences by group

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --log LOG_FILE        Specify to write verbose logging to a file. May not be
                        specified with multiple input files. (default: None)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --nproc NPROC         The number of simultaneous computational processes to
                        execute (CPU cores to utilize). (default: 4)
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  --version             show program's version number and exit
  -f BARCODE_FIELD      The annotation field containing annotations, such as
                        UID barcode, for sequence grouping. (default: BARCODE)
  -k CLUSTER_FIELD      The name of the output annotation field to add with
                        the cluster information for each sequence. (default:
                        CLUSTER)
  --id IDENT            The sequence identity threshold for the usearch
                        algorithm. (default: 0.9)
  --start SEQ_START     The start of the region to be used for clustering.
                        Together with --end, this parameter can be used to
                        specify a subsequence of each read to use in the
                        clustering algorithm. (default: None)
  --end SEQ_END         The end of the region to be used for clustering.
                        (default: None)
  --exec USEARCH_EXEC   The location of the USEARCH executable. (default:
                        /usr/local/bin/usearch)

output files:
    cluster-pass
       clustered reads.
    cluster-fail
       raw reads failing clustering.

output annotation fields:
    CLUSTER
       a numeric cluster identifier defining the within-group cluster.

CollapseSeq

usage: CollapseSeq.py [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta] [--failed]
                      [--log LOG_FILE] [--delim DELIMITER DELIMITER DELIMITER]
                      [--outdir OUT_DIR] [--outname OUT_NAME] [--version]
                      [-n MAX_MISSING] [--uf UNIQ_FIELDS [UNIQ_FIELDS ...]]
                      [--cf COPY_FIELDS [COPY_FIELDS ...]]
                      [--act {min,max,sum,set} [{min,max,sum,set} ...]]
                      [--inner] [--keepmiss]
                      [--maxf MAX_FIELD | --minf MIN_FIELD]

Removes duplicate sequences from FASTA/FASTQ files

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --log LOG_FILE        Specify to write verbose logging to a file. May not be
                        specified with multiple input files. (default: None)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  --version             show program's version number and exit
  -n MAX_MISSING        Maximum number of missing nucleotides to consider for
                        collapsing sequences. A sequence will be considered
                        undetermined if it contains too many missing
                        nucleotides. (default: 0)
  --uf UNIQ_FIELDS [UNIQ_FIELDS ...]
                        Specifies a set of annotation fields that must match
                        for sequences to be considered duplicates (default:
                        None)
  --cf COPY_FIELDS [COPY_FIELDS ...]
                        Specifies a set of annotation fields to copy into the
                        unique sequence output. (default: None)
  --act {min,max,sum,set} [{min,max,sum,set} ...]
                        List of actions to take for each copy field which
                        defines how each annotation will be combined into a
                        single value. The actions "min", "max", "sum" perform
                        the corresponding mathematical operation on numeric
                        annotations. The action "set" collapses annotations
                        into a comma delimited list of unique values.
                        (default: None)
  --inner               If specified, exclude consecutive missing characters
                        at either end of the sequence. (default: False)
  --keepmiss            If specified, sequences with more missing characters
                        than the threshold set by the -n parameter will be
                        written to the unique sequence output file with a
                        DUPCOUNT=1 annotation. If not specified, such
                        sequences will be written to a separate file.
                        (default: False)
  --maxf MAX_FIELD      Specify the field whose maximum value determines the
                        retained sequence; mutually exclusive with --minf.
                        (default: None)
  --minf MIN_FIELD      Specify the field whose minimum value determines the
                        retained sequence; mutually exclusive with --maxf.
                        (default: None)
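
The --act actions can be pictured with a short Python sketch. It only
illustrates the behavior described in the help text and is not pRESTO's own
code: the function name combine_annotations is hypothetical, and sorting the
"set" output is an assumption made to keep the result deterministic.

```python
def combine_annotations(values, action):
    """Combine the values of one copy field across a set of duplicates.

    Hypothetical helper that mirrors the --act descriptions; not pRESTO code.
    """
    if action == "set":
        # Collapse into a comma delimited list of unique values
        # (sorted here only so the output order is predictable).
        return ",".join(sorted(set(values)))

    numbers = [float(v) for v in values]
    if action == "min":
        result = min(numbers)
    elif action == "max":
        result = max(numbers)
    elif action == "sum":
        result = sum(numbers)
    else:
        raise ValueError(f"unknown action: {action}")

    # Render whole numbers without a trailing ".0".
    return str(int(result)) if result.is_integer() else str(result)


print(combine_annotations(["3", "5", "2"], "sum"))
print(combine_annotations(["IGHM", "IGHG", "IGHM"], "set"))
```

One action is expected per field given to --cf, so a call with two copy
fields would pair the first action with the first field and so on.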

output files:
    collapse-unique
        unique sequences. Contains one representative from each set of
        duplicate sequences. The retained representative is determined by
        user defined criteria.
    collapse-duplicate
        raw reads which are duplicates of the sequences retained in the
        collapse-unique file.
    collapse-undetermined
        raw reads which were excluded from consideration due to having too
        many N characters in the sequence.

output annotation fields:
    DUPCOUNT
        total number of sequences within the set of duplicates for each
        retained unique sequence; that is, the copy number of each unique
        sequence within the data file.
    <user defined>
        annotation fields specified by the --cf parameter.
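
As a rough picture of how DUPCOUNT relates to --uf, the sketch below groups
records by exact sequence plus any required matching fields and counts each
group. It is a simplification under stated assumptions: matching is exact
(comparable to -n 0), the first record seen is kept as the representative,
and the function name collapse is hypothetical.

```python
def collapse(records, uniq_fields=()):
    """Collapse (sequence, annotations) records into unique representatives.

    Hypothetical sketch: exact sequence matching only, first record retained.
    """
    groups = {}
    for seq, annotations in records:
        # Duplicates must share the sequence and every field named in uniq_fields.
        key = (seq,) + tuple(annotations.get(f) for f in uniq_fields)
        groups.setdefault(key, []).append((seq, annotations))

    unique = []
    for members in groups.values():
        seq, annotations = members[0]
        merged = dict(annotations)
        merged["DUPCOUNT"] = len(members)  # copy number of this unique sequence
        unique.append((seq, merged))
    return unique


reads = [
    ("ACGT", {"PRIMER": "A"}),
    ("ACGT", {"PRIMER": "A"}),
    ("ACGT", {"PRIMER": "B"}),
    ("TTTT", {"PRIMER": "A"}),
]
for seq, annotations in collapse(reads, uniq_fields=("PRIMER",)):
    print(seq, annotations)
```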

ConvertHeaders

usage: ConvertHeaders.py [-h] [--version]  ...

Converts sequence headers to the pRESTO format

optional arguments:
  -h, --help  show this help message and exit
  --version   show program's version number and exit

subcommands:
              Conversion method
    generic   Converts sequence headers without a known annotation system.
    454       Converts Roche 454 sequence headers.
    genbank   Converts NCBI GenBank and RefSeq sequence headers.
    illumina  Converts Illumina sequence headers.
    imgt      Converts sequence headers output by IMGT/GENE-DB.
    sra       Converts NCBI SRA sequence headers.

output files:
    convert-pass
        reads passing header conversion.
    convert-fail
        raw reads failing header conversion.

output annotation fields:
    <format defined>
        the annotation fields added are specific to the header format of the
        input file.
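
The three default delimiters ('|', '=', ',') suggest how a converted header
can be read back. The sketch below is an illustration only: parse_header is a
hypothetical name, and treating the first block as the sequence identifier
(stored under "ID") is an assumption, not something stated in this help text.

```python
def parse_header(header, delimiter=("|", "=", ",")):
    """Split an annotated header into a dict of field name -> list of values.

    Hypothetical helper; assumes the first block is the sequence identifier.
    """
    block_sep, field_sep, value_sep = delimiter
    blocks = header.split(block_sep)
    fields = {"ID": [blocks[0]]}
    for block in blocks[1:]:
        # Each remaining block is expected to look like NAME=value1,value2
        name, _, value = block.partition(field_sep)
        fields[name] = value.split(value_sep)
    return fields


print(parse_header("SEQ1|PRIMER=IGHM,IGHG|BARCODE=ACGT"))
```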

generic

usage: ConvertHeaders.py generic [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta]
                                 [--failed]
                                 [--delim DELIMITER DELIMITER DELIMITER]
                                 [--outdir OUT_DIR] [--outname OUT_NAME]

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)

454

usage: ConvertHeaders.py 454 [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta]
                             [--failed]
                             [--delim DELIMITER DELIMITER DELIMITER]
                             [--outdir OUT_DIR] [--outname OUT_NAME]

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)

genbank

usage: ConvertHeaders.py genbank [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta]
                                 [--failed]
                                 [--delim DELIMITER DELIMITER DELIMITER]
                                 [--outdir OUT_DIR] [--outname OUT_NAME]

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)

illumina

usage: ConvertHeaders.py illumina [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta]
                                  [--failed]
                                  [--delim DELIMITER DELIMITER DELIMITER]
                                  [--outdir OUT_DIR] [--outname OUT_NAME]

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)

imgt

usage: ConvertHeaders.py imgt [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta]
                              [--failed]
                              [--delim DELIMITER DELIMITER DELIMITER]
                              [--outdir OUT_DIR] [--outname OUT_NAME]
                              [--simple]

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  --simple              If specified, only the allele name, and no other
                        annotations, will appear in the converted sequence
                        header. (default: False)

sra

usage: ConvertHeaders.py sra [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta]
                             [--failed]
                             [--delim DELIMITER DELIMITER DELIMITER]
                             [--outdir OUT_DIR] [--outname OUT_NAME]

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)

EstimateError

usage: EstimateError.py [-h] -s SEQ_FILES [SEQ_FILES ...] [--log LOG_FILE]
                        [--delim DELIMITER DELIMITER DELIMITER]
                        [--nproc NPROC] [--outdir OUT_DIR]
                        [--outname OUT_NAME] [--version] [-f SET_FIELD]
                        [-n MIN_COUNT] [--mode {freq,qual}] [-q MIN_QUAL]
                        [--freq MIN_FREQ] [--maxdiv MAX_DIVERSITY]

Calculates annotation set error rates

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --log LOG_FILE        Specify to write verbose logging to a file. May not be
                        specified with multiple input files. (default: None)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --nproc NPROC         The number of simultaneous computational processes to
                        execute (CPU cores to utilize). (default: 4)
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  --version             show program's version number and exit
  -f SET_FIELD          The name of the annotation field to group sequences by
                        (default: BARCODE)
  -n MIN_COUNT          The minimum number of sequences needed to consider a
                        set (default: 10)
  --mode {freq,qual}    Specifies which method to use to determine the
                        consensus sequence. The "freq" method will determine
                        the consensus by nucleotide frequency at each position
                        and assign the most common value. The "qual" method
                        will weight values by their quality scores to
                        determine the consensus nucleotide at each position.
                        (default: freq)
  -q MIN_QUAL           Consensus quality score cut-off under which an
                        ambiguous character is assigned. (default: 20)
  --freq MIN_FREQ       Fraction of character occurrences under which an
                        ambiguous character is assigned. (default: 0.6)
  --maxdiv MAX_DIVERSITY
                        Specify to calculate the nucleotide diversity of each
                        read group (average pairwise error rate) and exclude
                        groups which exceed the given diversity threshold.
                        (default: None)
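
The "freq" consensus mode and the --freq cut-off can be sketched as follows.
This is a minimal illustration, not pRESTO's implementation: it assumes all
reads in a set have equal length, uses "N" as the ambiguous character, and
the name frequency_consensus is hypothetical.

```python
from collections import Counter


def frequency_consensus(seqs, min_freq=0.6, ambiguous="N"):
    """Build a consensus by taking the most common character per position.

    Positions where the most common character occurs in less than min_freq
    of the reads receive the ambiguous character instead.
    """
    consensus = []
    for column in zip(*seqs):  # assumes equal-length sequences
        base, count = Counter(column).most_common(1)[0]
        if count / len(column) >= min_freq:
            consensus.append(base)
        else:
            consensus.append(ambiguous)
    return "".join(consensus)


reads = ["ACGT", "ACGA", "ACTA"]
print(frequency_consensus(reads))                # ACGA
print(frequency_consensus(reads, min_freq=0.7))  # ACNN
```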

output files:
    error-position
        estimated error by read position.
    error-quality
        estimated error by the quality score assigned within the input file.
    error-nucleotide
        estimated error by nucleotide.
    error-set
        estimated error by barcode read group size.

output fields:
    POSITION
        read position with base zero indexing.
    Q
        Phred quality score.
    OBSERVED
        observed nucleotide value.
    REFERENCE
        consensus nucleotide for the barcode read group.
    SET_COUNT
        barcode read group size.
    REPORTED_Q
        mean Phred quality score reported within the input file for the given
        position, quality score, nucleotide or read group.
    MISMATCHES
        count of observed mismatches from consensus for the given position,
        quality score, nucleotide or read group.
    OBSERVATIONS
        total count of observed values for each position, quality score,
        nucleotide or read group size.
    ERROR
        estimated error rate.
    EMPIRICAL_Q
        estimated error rate converted to a Phred quality score.
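
The relation between ERROR and EMPIRICAL_Q follows the standard Phred
definition, Q = -10 * log10(P). The sketch below shows that conversion. The
helper names are hypothetical, and computing the error rate as MISMATCHES
divided by OBSERVATIONS is an assumption about how the ERROR column is
derived.

```python
import math


def error_rate(mismatches, observations):
    """Assumed definition of ERROR: mismatches per observation."""
    return mismatches / observations


def error_to_phred(error):
    """Convert an error probability to a Phred quality score."""
    # Undefined for error == 0; callers need to handle that case themselves.
    return -10 * math.log10(error)


def phred_to_error(q):
    """Convert a Phred quality score back to an error probability."""
    return 10 ** (-q / 10)


rate = error_rate(mismatches=5, observations=5000)
print(rate, error_to_phred(rate))
```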

FilterSeq

usage: FilterSeq.py [-h] [--version]  ...

Filters sequences in FASTA/FASTQ files

optional arguments:
  -h, --help  show this help message and exit
  --version   show program's version number and exit

subcommands:
              Filtering operation
    length    Sequence length filtering mode
    missing   Missing nucleotide filtering mode
    repeats   Consecutive nucleotide repeating filtering mode
    quality   Quality filtering mode
    maskqual  Character masking mode
    trimqual  Sequence trimming mode

output files:
    <operation>-pass
        reads passing filtering operation and modified accordingly, where
        <operation> is the name of the filtering operation that was run.
    <operation>-fail
        raw reads failing filtering criteria, where <operation> is the name of
        the filtering operation.

output annotation fields:
    None

length

usage: FilterSeq.py length [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta]
                           [--failed] [--log LOG_FILE] [--nproc NPROC]
                           [--outdir OUT_DIR] [--outname OUT_NAME]
                           [-n MIN_LENGTH] [--inner]

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --log LOG_FILE        Specify to write verbose logging to a file. May not be
                        specified with multiple input files. (default: None)
  --nproc NPROC         The number of simultaneous computational processes to
                        execute (CPU cores to utilize). (default: 4)
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  -n MIN_LENGTH         Minimum sequence length to retain. (default: 250)
  --inner               If specified exclude consecutive missing characters at
                        either end of the sequence. (default: False)

missing

usage: FilterSeq.py missing [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta]
                            [--failed] [--log LOG_FILE] [--nproc NPROC]
                            [--outdir OUT_DIR] [--outname OUT_NAME]
                            [-n MAX_MISSING] [--inner]

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --log LOG_FILE        Specify to write verbose logging to a file. May not be
                        specified with multiple input files. (default: None)
  --nproc NPROC         The number of simultaneous computational processes to
                        execute (CPU cores to utilize). (default: 4)
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  -n MAX_MISSING        Threshold for the number of gap or N nucleotides.
                        (default: 10)
  --inner               If specified exclude consecutive missing characters at
                        either end of the sequence. (default: False)
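
The effect of --inner on the missing character count can be sketched in a few
lines. The sketch treats MAX_MISSING as a count of characters (suggested by
the integer default) and takes "N", "-" and "." as the missing characters;
both points are assumptions, and count_missing is a hypothetical name.

```python
def count_missing(seq, inner=False, missing="N-."):
    """Count gap and N characters, optionally ignoring runs at either end."""
    if inner:
        # Drop consecutive missing characters from the head and tail first.
        seq = seq.strip(missing)
    return sum(seq.count(char) for char in missing)


def passes_missing_filter(seq, max_missing=10, inner=False):
    """A read passes when its missing count does not exceed the threshold."""
    return count_missing(seq, inner=inner) <= max_missing


print(count_missing("NNACNGT-"))              # 4
print(count_missing("NNACNGT-", inner=True))  # 1
```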

repeats

usage: FilterSeq.py repeats [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta]
                            [--failed] [--log LOG_FILE] [--nproc NPROC]
                            [--outdir OUT_DIR] [--outname OUT_NAME]
                            [-n MAX_REPEAT] [--missing] [--inner]

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --log LOG_FILE        Specify to write verbose logging to a file. May not be
                        specified with multiple input files. (default: None)
  --nproc NPROC         The number of simultaneous computational processes to
                        execute (CPU cores to utilize). (default: 4)
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  -n MAX_REPEAT         Threshold for the number of consecutive repeating
                        nucleotides.
                        (default: 15)
  --missing             If specified count consecutive gap and N characters
                        in addition to {A,C,G,T}. (default: False)
  --inner               If specified exclude consecutive missing characters at
                        either end of the sequence. (default: False)
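
A consecutive repeat check of the kind this mode describes can be sketched
with itertools.groupby. The sketch assumes MAX_REPEAT is a run length
(suggested by the integer default) and that gap and N characters are "-", "."
and "N"; longest_repeat is a hypothetical name, not a pRESTO function.

```python
from itertools import groupby


def longest_repeat(seq, include_missing=False):
    """Return the length of the longest run of one repeated character.

    Only A, C, G and T are counted unless include_missing is set, in which
    case runs of gap and N characters are counted as well.
    """
    allowed = set("ACGT")
    if include_missing:
        allowed |= set("N-.")
    runs = [len(list(group)) for char, group in groupby(seq) if char in allowed]
    return max(runs, default=0)


print(longest_repeat("AAACCGTTTTT"))                     # 5
print(longest_repeat("ACNNNNGT"))                        # 1
print(longest_repeat("ACNNNNGT", include_missing=True))  # 4
```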

quality

usage: FilterSeq.py quality [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta]
                            [--failed] [--log LOG_FILE] [--nproc NPROC]
                            [--outdir OUT_DIR] [--outname OUT_NAME]
                            [-q MIN_QUAL] [--inner]

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --log LOG_FILE        Specify to write verbose logging to a file. May not be
                        specified with multiple input files. (default: None)
  --nproc NPROC         The number of simultaneous computational processes to
                        execute (CPU cores to utilize). (default: 4)
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  -q MIN_QUAL           Quality score threshold. (default: 20)
  --inner               If specified exclude consecutive missing characters at
                        either end of the sequence. (default: False)

maskqual

usage: FilterSeq.py maskqual [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta]
                             [--failed] [--log LOG_FILE] [--nproc NPROC]
                             [--outdir OUT_DIR] [--outname OUT_NAME]
                             [-q MIN_QUAL]

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --log LOG_FILE        Specify to write verbose logging to a file. May not be
                        specified with multiple input files. (default: None)
  --nproc NPROC         The number of simultaneous computational processes to
                        execute (CPU cores to utilize). (default: 4)
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  -q MIN_QUAL           Quality score threshold. (default: 20)
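
Character masking by quality can be pictured as below. This is a sketch under
assumptions: positions with a Phred score below MIN_QUAL are replaced by "N",
scores are already decoded to integers, and mask_low_quality is a
hypothetical name.

```python
def mask_low_quality(seq, quals, min_qual=20, mask="N"):
    """Replace every character whose quality score is below min_qual."""
    return "".join(
        base if qual >= min_qual else mask
        for base, qual in zip(seq, quals)
    )


print(mask_low_quality("ACGT", [30, 10, 20, 5]))  # ANGN
```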

trimqual

usage: FilterSeq.py trimqual [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta]
                             [--failed] [--log LOG_FILE] [--nproc NPROC]
                             [--outdir OUT_DIR] [--outname OUT_NAME]
                             [-q MIN_QUAL] [--win WINDOW] [--reverse]

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --log LOG_FILE        Specify to write verbose logging to a file. May not be
                        specified with multiple input files. (default: None)
  --nproc NPROC         The number of simultaneous computational processes to
                        execute (CPU cores to utilize). (default: 4)
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  -q MIN_QUAL           Quality score threshold. (default: 20)
  --win WINDOW          Nucleotide window size for moving average calculation.
                        (default: 10)
  --reverse             Specify to trim the head of the sequence rather than
                        the tail. (default: False)
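
One way to read the -q, --win and --reverse options together is a window that
moves along the read and cuts at the first window whose mean quality falls
below the threshold. The sketch below implements that reading with a step of
one position. The exact windowing pRESTO uses is not stated here, so the step
size, the cut position and the name trim_quality are all assumptions.

```python
def trim_quality(seq, quals, min_qual=20, window=10, reverse=False):
    """Trim a read at the first window whose mean quality is below min_qual.

    With reverse=True the head of the read is trimmed instead of the tail.
    """
    if reverse:
        # Work on the reversed read so the same scan trims the head.
        seq, quals = seq[::-1], quals[::-1]

    end = len(seq)
    for start in range(len(seq)):
        scores = quals[start:start + window]
        if sum(scores) / len(scores) < min_qual:
            end = start  # cut at the start of the first failing window
            break

    seq, quals = seq[:end], quals[:end]
    if reverse:
        seq, quals = seq[::-1], quals[::-1]
    return seq, quals


trimmed, _ = trim_quality("ACGTACGTAC", [30] * 6 + [5] * 4, window=2)
print(trimmed)  # ACGTA
```

Note that the cut falls at the start of the failing window, so a high quality
base that shares a window with low quality neighbors is also removed.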

MaskPrimers

usage: MaskPrimers.py [-h] [--version]  ...

Removes primers and annotates sequences with primer and barcode identifiers

optional arguments:
  -h, --help  show this help message and exit
  --version   show program's version number and exit

subcommands:
              Alignment method
    align     Find primer matches using pairwise local alignment
    score     Find primer matches by scoring primers at a fixed position

output files:
    mask-pass
        processed reads with successful primer matches.
    mask-fail
        raw reads failing primer identification.

output annotation fields:
    SEQORIENT
        the orientation of the output sequence. Either F (input) or RC
        (reverse complement of input).
    PRIMER
        name of the best primer match.
    BARCODE
        the sequence preceding the primer match. Only output when the
        --barcode flag is specified.

align

usage: MaskPrimers.py align [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta]
                            [--failed] [--log LOG_FILE]
                            [--delim DELIMITER DELIMITER DELIMITER]
                            [--nproc NPROC] [--outdir OUT_DIR]
                            [--outname OUT_NAME] -p PRIMER_FILE
                            [--mode {cut,mask,trim,tag}]
                            [--maxerror MAX_ERROR] [--revpr] [--barcode]
                            [--maxlen MAX_LEN] [--skiprc]
                            [--gap GAP_PENALTY GAP_PENALTY]

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --log LOG_FILE        Specify to write verbose logging to a file. May not be
                        specified with multiple input files. (default: None)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --nproc NPROC         The number of simultaneous computational processes to
                        execute (CPU cores to utilize). (default: 4)
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  -p PRIMER_FILE        A FASTA or REGEX file containing primer sequences.
                        (default: None)
  --mode {cut,mask,trim,tag}
                        Specifies the action to take with the primer sequence.
                        The "cut" mode will remove both the primer region and
                        the preceding sequence. The "mask" mode will replace
                        the primer region with Ns and remove the preceding
                        sequence. The "trim" mode will remove the region
                        preceding the primer, but leave the primer region
                        intact. The "tag" mode will leave the input sequence
                        unmodified. (default: mask)
  --maxerror MAX_ERROR  Maximum allowable error rate. (default: 0.2)
  --revpr               Specify to match the tail-end of the sequence against
                        the reverse complement of the primers. (default:
                        False)
  --barcode             Specify to encode sequences with barcode sequences
                        (unique molecular identifiers) found preceding the
                        primer region. (default: False)
  --maxlen MAX_LEN      Maximum sequence length to scan for primers. (default:
                        50)
  --skiprc              Specify to prevent checking of sample reverse
                        complement sequences. (default: False)
  --gap GAP_PENALTY GAP_PENALTY
                        A list of two positive values defining the gap open
                        and gap extension penalties for aligning the primers.
                        Note: the error rate is calculated as the percentage
                        of mismatches from the primer sequence with gap
                        penalties reducing the match count accordingly; this
                        may lead to error rates that differ from strict
                        mismatch percentage when gaps are present in the
                        alignment. (default: (1, 1))
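
The four --mode actions can be pictured with a short sketch. This is not the tool's implementation: the function name `apply_mode` and the half-open `start`/`end` coordinates of the matched primer are hypothetical, chosen only to illustrate the behaviour described above for a primer matched at the head of a read.

```python
def apply_mode(seq, start, end, mode="mask"):
    """Return seq after applying a primer action.

    start/end are hypothetical half-open coordinates of the matched
    primer within seq (illustration only).
    """
    if mode == "cut":
        # Remove the primer region and everything before it.
        return seq[end:]
    if mode == "mask":
        # Remove the preceding region and replace the primer with Ns.
        return "N" * (end - start) + seq[end:]
    if mode == "trim":
        # Remove the preceding region but keep the primer.
        return seq[start:]
    if mode == "tag":
        # Leave the sequence unmodified.
        return seq
    raise ValueError(f"unknown mode: {mode}")


# Two leading bases, a 4-base primer "ACGT", then the remaining read.
read = "TTACGTGGGCATCA"
for mode in ("cut", "mask", "trim", "tag"):
    print(mode, apply_mode(read, 2, 6, mode))
```

Only "tag" preserves the original read length; the other three modes shorten the read by at least the length of the region preceding the primer.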

score

usage: MaskPrimers.py score [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta]
                            [--failed] [--log LOG_FILE]
                            [--delim DELIMITER DELIMITER DELIMITER]
                            [--nproc NPROC] [--outdir OUT_DIR]
                            [--outname OUT_NAME] -p PRIMER_FILE
                            [--mode {cut,mask,trim,tag}]
                            [--maxerror MAX_ERROR] [--revpr] [--barcode]
                            [--start START]

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --log LOG_FILE        Specify to write verbose logging to a file. May not be
                        specified with multiple input files. (default: None)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --nproc NPROC         The number of simultaneous computational processes to
                        execute (CPU cores to utilize). (default: 4)
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  -p PRIMER_FILE        A FASTA or REGEX file containing primer sequences.
                        (default: None)
  --mode {cut,mask,trim,tag}
                        Specifies the action to take with the primer sequence.
                        The "cut" mode will remove both the primer region and
                        the preceding sequence. The "mask" mode will replace
                        the primer region with Ns and remove the preceding
                        sequence. The "trim" mode will remove the region
                        preceding the primer, but leave the primer region
                        intact. The "tag" mode will leave the input sequence
                        unmodified. (default: mask)
  --maxerror MAX_ERROR  Maximum allowable error rate. (default: 0.2)
  --revpr               Specify to match the tail-end of the sequence against
                        the reverse complement of the primers. (default:
                        False)
  --barcode             Specify to encode sequences with barcode sequences
                        (unique molecular identifiers) found preceding the
                        primer region. (default: False)
  --start START         The starting position of the primer (default: 0)
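
A fixed-position score can be sketched as follows, assuming the error rate is simply the fraction of mismatched positions between the primer and the read window beginning at --start. That definition and the name `score_primer` are assumptions made for illustration, not the tool's documented algorithm, and ambiguous nucleotide codes are not handled.

```python
def score_primer(seq, primer, start=0):
    """Mismatch fraction between primer and seq[start:start + len(primer)]."""
    window = seq[start:start + len(primer)]
    if len(window) < len(primer):
        # The read is too short to contain the primer at this position.
        return 1.0
    mismatches = sum(a != b for a, b in zip(window, primer))
    return mismatches / len(primer)


max_error = 0.2  # mirrors the --maxerror default
error = score_primer("GGACGAACGT", "ACGTA", start=2)
print(error, error <= max_error)
```

Here the window "ACGAA" differs from the primer "ACGTA" at one of five positions, so the read sits exactly at the default threshold.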

PairSeq

usage: PairSeq.py [-h] -1 SEQ_FILES_1 [SEQ_FILES_1 ...] -2 SEQ_FILES_2
                  [SEQ_FILES_2 ...] [--fasta] [--failed]
                  [--delim DELIMITER DELIMITER DELIMITER] [--outdir OUT_DIR]
                  [--outname OUT_NAME] [--version]
                  [--1f FIELDS_1 [FIELDS_1 ...]]
                  [--2f FIELDS_2 [FIELDS_2 ...]]
                  [--coord {illumina,solexa,sra,454,presto}]

Sorts and matches sequence records with matching coordinates across files

optional arguments:
  -h, --help            show this help message and exit
  -1 SEQ_FILES_1 [SEQ_FILES_1 ...]
                        An ordered list of FASTA/FASTQ files containing
                        head/primary sequences. (default: None)
  -2 SEQ_FILES_2 [SEQ_FILES_2 ...]
                        An ordered list of FASTA/FASTQ files containing
                        tail/secondary sequences. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  --version             show program's version number and exit
  --1f FIELDS_1 [FIELDS_1 ...]
                        The annotation fields to copy from file 1 records into
                        file 2 records. If a copied annotation already exists
                        in a file 2 record, then the annotations copied from
                        file 1 will be added to the front of the existing
                        annotation. (default: None)
  --2f FIELDS_2 [FIELDS_2 ...]
                        The annotation fields to copy from file 2 records into
                        file 1 records. If a copied annotation already exists
                        in a file 1 record, then the annotations copied from
                        file 2 will be added to the end of the existing
                        annotation. (default: None)
  --coord {illumina,solexa,sra,454,presto}
                        The format of the sequence identifier which defines
                        shared coordinate information across mate pairs.
                        (default: presto)

output files:
    pair-pass
        successfully paired reads with modified annotations.
    pair-fail
        raw reads that could not be assigned to a mate-pair.

output annotation fields:
    
        annotation fields specified by the --1f or --2f arguments.
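
The pairing and field-copying behaviour described for --1f and --2f can be sketched in a few lines. This is an illustrative model only: it assumes headers of the form `ID|FIELD=value` (as implied by the default delimiters), treats the text before the first `|` as the shared coordinate, and joins merged values with a comma. The function names are hypothetical.

```python
def parse_header(header, delim=("|", "=", ",")):
    """Split 'ID|F1=v1|F2=v2' into (id, {field: value})."""
    parts = header.split(delim[0])
    fields = dict(p.split(delim[1], 1) for p in parts[1:])
    return parts[0], fields


def copy_fields(src, dst, names, prepend, sep=","):
    """Copy named fields from src into dst, merging with existing values."""
    out = dict(dst)
    for name in names:
        if name not in src:
            continue
        if name in out:
            pair = (src[name], out[name]) if prepend else (out[name], src[name])
            out[name] = sep.join(pair)
        else:
            out[name] = src[name]
    return out


def pair_records(heads_1, heads_2, fields_1=(), fields_2=()):
    """Pair headers sharing an identifier; return (pairs, unpaired ids)."""
    recs_1 = dict(parse_header(h) for h in heads_1)
    recs_2 = dict(parse_header(h) for h in heads_2)
    pairs = {}
    for key in recs_1.keys() & recs_2.keys():
        ann_1, ann_2 = recs_1[key], recs_2[key]
        # --2f: file 2 values go to the end of file 1 annotations.
        new_1 = copy_fields(ann_2, ann_1, fields_2, prepend=False)
        # --1f: file 1 values go to the front of file 2 annotations.
        new_2 = copy_fields(ann_1, ann_2, fields_1, prepend=True)
        pairs[key] = (new_1, new_2)
    unpaired = sorted(recs_1.keys() ^ recs_2.keys())
    return pairs, unpaired


pairs, unpaired = pair_records(
    ["r1|BARCODE=AAAC", "r2|BARCODE=GGTT"],
    ["r1|PRIMER=IGHG|BARCODE=TTTT", "r3|PRIMER=IGHM"],
    fields_1=["BARCODE"],
)
print(pairs["r1"][1])  # file 2 record after copying BARCODE from file 1
print(unpaired)        # identifiers that would land in pair-fail
```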

ParseHeaders

usage: ParseHeaders.py [-h] [--version]  ...

Parses pRESTO annotations in FASTA/FASTQ sequence headers

optional arguments:
  -h, --help  show this help message and exit
  --version   show program's version number and exit

subcommands:
              Annotation operation
    add       Adds field/value pairs to header annotations
    collapse  Collapses header annotations with multiple entries
    copy      Copies header annotation fields
    delete    Deletes fields from header annotations
    expand    Expands annotation fields with multiple values
    rename    Renames header annotation fields
    table     Writes sequence headers to a table

output files:
    reheader-pass
        reads passing annotation operation and modified accordingly.
    reheader-fail
        raw reads failing annotation operation.
    headers
        tab delimited table of the selected annotations.

output annotation fields:
    
        annotation fields specified by the -f argument.

add

usage: ParseHeaders.py add [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta]
                           [--failed] [--delim DELIMITER DELIMITER DELIMITER]
                           [--outdir OUT_DIR] [--outname OUT_NAME] -f FIELDS
                           [FIELDS ...] -u VALUES [VALUES ...]

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  -f FIELDS [FIELDS ...]
                        List of fields to add. (default: None)
  -u VALUES [VALUES ...]
                        List of values to add for each field. (default: None)
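
As a rough illustration of what adding field/value pairs means for a header string, the sketch below appends `FIELD=VALUE` blocks using the default delimiters. The function name `add_annotations` is hypothetical, the one-value-per-field pairing is an assumption drawn from the -f and -u descriptions, and the handling of a field that already exists is not modelled.

```python
def add_annotations(header, fields, values, delim=("|", "=", ",")):
    """Append FIELD=VALUE blocks to a header string."""
    if len(fields) != len(values):
        raise ValueError("each field needs exactly one value")
    blocks = [f"{field}{delim[1]}{value}" for field, value in zip(fields, values)]
    return delim[0].join([header] + blocks)


print(add_annotations("seq1|PRIMER=IGHG", ["SAMPLE", "RUN"], ["S01", "7"]))
```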

collapse

usage: ParseHeaders.py collapse [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta]
                                [--failed]
                                [--delim DELIMITER DELIMITER DELIMITER]
                                [--outdir OUT_DIR] [--outname OUT_NAME] -f
                                FIELDS [FIELDS ...] --act
                                {min,max,sum,first,last,set,cat}
                                [{min,max,sum,first,last,set,cat} ...]

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  -f FIELDS [FIELDS ...]
                        List of fields to collapse. (default: None)
  --act {min,max,sum,first,last,set,cat} [{min,max,sum,first,last,set,cat} ...]
                        List of actions to take for each field defining how
                        each annotation will be combined into a single value.
                        The actions "min", "max", "sum" perform the
                        corresponding mathematical operation on numeric
                        annotations. The actions "first" and "last" choose the
                        value from the corresponding position in the
                        annotation. The action "set" collapses annotations
                        into a comma delimited list of unique values. The
                        action "cat" concatenates the values together into a
                        single string. (default: None)
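
The seven collapse actions can be modelled on a single comma-delimited annotation value. This is a sketch, not the tool's code: the name `collapse` is hypothetical, numeric results are formatted compactly, and "set" is assumed to keep values in order of first appearance.

```python
def collapse(value, action, sep=","):
    """Reduce a multi-valued annotation string to a single value."""
    values = value.split(sep)
    if action in ("min", "max", "sum"):
        nums = [float(v) for v in values]
        result = {"min": min, "max": max, "sum": sum}[action](nums)
        return f"{result:g}"
    if action == "first":
        return values[0]
    if action == "last":
        return values[-1]
    if action == "set":
        # Unique values, keeping first-seen order (an assumption).
        return sep.join(dict.fromkeys(values))
    if action == "cat":
        return "".join(values)
    raise ValueError(f"unknown action: {action}")


for action in ("min", "max", "sum", "first", "last", "set", "cat"):
    print(action, collapse("3,1,3,2", action))
```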

copy

usage: ParseHeaders.py copy [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta]
                            [--failed] [--delim DELIMITER DELIMITER DELIMITER]
                            [--outdir OUT_DIR] [--outname OUT_NAME] -f FIELDS
                            [FIELDS ...] -k NAMES [NAMES ...]
                            [--act {min,max,sum,first,last,set,cat} [{min,max,sum,first,last,set,cat} ...]]

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  -f FIELDS [FIELDS ...]
                        List of fields to copy. (default: None)
  -k NAMES [NAMES ...]  List of names for each copied field. If the new field
                        is already present, the copied field will be merged
                        into the existing field. (default: None)
  --act {min,max,sum,first,last,set,cat} [{min,max,sum,first,last,set,cat} ...]
                        List of collapse actions to take on each new field
                        following the copy operation defining how each
                        annotation will be combined into a single value. The
                        actions "min", "max", "sum" perform the corresponding
                        mathematical operation on numeric annotations. The
                        actions "first" and "last" choose the value from the
                        corresponding position in the annotation. The action
                        "set" collapses annotations into a comma delimited
                        list of unique values. The action "cat" concatenates
                        the values together into a single string. (default:
                        None)
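
The merge behaviour of a copy can be sketched on a parsed annotation dictionary. The name `copy_field` is hypothetical, and the order of merged values (existing value first, copied value second) is an assumption that the help text does not specify.

```python
def copy_field(ann, field, name, sep=","):
    """Copy ann[field] into ann[name], merging if name already exists."""
    out = dict(ann)
    if field not in out:
        return out
    if name in out:
        # Merge order is assumed: existing value, then the copied value.
        out[name] = out[name] + sep + out[field]
    else:
        out[name] = out[field]
    return out


print(copy_field({"UMI": "AAAC"}, "UMI", "BARCODE"))
print(copy_field({"UMI": "AAAC", "BARCODE": "GGTT"}, "UMI", "BARCODE"))
```

A merged field holds several values, which is the situation the optional --act collapse actions are meant to resolve.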

delete

usage: ParseHeaders.py delete [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta]
                              [--failed]
                              [--delim DELIMITER DELIMITER DELIMITER]
                              [--outdir OUT_DIR] [--outname OUT_NAME] -f
                              FIELDS [FIELDS ...]

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  -f FIELDS [FIELDS ...]
                        List of fields to delete. (default: None)

expand

usage: ParseHeaders.py expand [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta]
                              [--failed]
                              [--delim DELIMITER DELIMITER DELIMITER]
                              [--outdir OUT_DIR] [--outname OUT_NAME] -f
                              FIELDS [FIELDS ...] [--sep SEPARATOR]

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  -f FIELDS [FIELDS ...]
                        List of fields to expand. (default: None)
  --sep SEPARATOR       The character separating each value in the fields.
                        (default: ,)

rename

usage: ParseHeaders.py rename [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta]
                              [--failed]
                              [--delim DELIMITER DELIMITER DELIMITER]
                              [--outdir OUT_DIR] [--outname OUT_NAME] -f
                              FIELDS [FIELDS ...] -k NAMES [NAMES ...]
                              [--act {min,max,sum,first,last,set,cat} [{min,max,sum,first,last,set,cat} ...]]

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  -f FIELDS [FIELDS ...]
                        List of fields to rename. (default: None)
  -k NAMES [NAMES ...]  List of new names for each field. If the new field is
                        already present, the renamed field will be merged into
                        the existing field and the old field will be deleted.
                        (default: None)
  --act {min,max,sum,first,last,set,cat} [{min,max,sum,first,last,set,cat} ...]
                        List of collapse actions to take on each new field
                        following the rename operation defining how each
                        annotation will be combined into a single value. The
                        actions "min", "max", "sum" perform the corresponding
                        mathematical operation on numeric annotations. The
                        actions "first" and "last" choose the value from the
                        corresponding position in the annotation. The action
                        "set" collapses annotations into a comma delimited
                        list of unique values. The action "cat" concatenates
                        the values together into a single string. (default:
                        None)

table

usage: ParseHeaders.py table [-h] -s SEQ_FILES [SEQ_FILES ...] [--failed]
                             [--delim DELIMITER DELIMITER DELIMITER]
                             [--outdir OUT_DIR] [--outname OUT_NAME] -f FIELDS
                             [FIELDS ...]

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --failed              If specified create files containing records that fail
                        processing. (default: False)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  -f FIELDS [FIELDS ...]
                        List of fields to collect. The sequence identifier may
                        be specified using the hidden field name "ID".
                        (default: None)
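
A tab-delimited header table, including the hidden "ID" field, might be produced along these lines. The function name `headers_to_table` is hypothetical, and writing an empty cell for a missing annotation is an assumption rather than documented behaviour.

```python
import csv
import io


def headers_to_table(headers, fields, delim=("|", "=", ",")):
    """Render selected header annotations as a tab-delimited table."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(fields)
    for header in headers:
        parts = header.split(delim[0])
        ann = dict(p.split(delim[1], 1) for p in parts[1:])
        # Expose the sequence identifier under the hidden field name "ID".
        ann["ID"] = parts[0]
        writer.writerow([ann.get(field, "") for field in fields])
    return buf.getvalue()


table = headers_to_table(
    ["seq1|PRIMER=IGHG|DUPCOUNT=3", "seq2|PRIMER=IGHM"],
    ["ID", "PRIMER", "DUPCOUNT"],
)
print(table, end="")
```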

ParseLog

usage: ParseLog.py [-h] [--delim DELIMITER DELIMITER DELIMITER]
                   [--outdir OUT_DIR] [--outname OUT_NAME] [--version] -l
                   RECORD_FILES [RECORD_FILES ...] -f FIELDS [FIELDS ...]

Parses records in the console log of pRESTO modules

optional arguments:
  -h, --help            show this help message and exit
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  --version             show program's version number and exit
  -l RECORD_FILES [RECORD_FILES ...]
                        List of log files to parse. (default: None)
  -f FIELDS [FIELDS ...]
                        List of fields to collect. The sequence identifier may
                        be specified using the hidden field name "ID".
                        (default: None)

output files:
    table
        tab delimited table of the selected annotations.

output annotation fields:
    
        annotation fields specified by the -f argument.

SplitSeq

usage: SplitSeq.py [-h] [--version]  ...

Sorts, samples and splits FASTA/FASTQ sequence files

optional arguments:
  -h, --help  show this help message and exit
  --version   show program's version number and exit

subcommands:
              Sequence file operation
    count     Splits sequence files by number of records
    group     Splits sequence files by annotation
    sample    Randomly samples from unpaired sequence files
    samplepair
              Randomly samples from paired-end sequence files
    sort      Sorts sequence files by annotation

output files:
    part
        reads partitioned by count, where  is the partition number.
    -
        reads partitioned by annotation  and .
    under-
        reads partitioned by numeric threshold where the annotation value is
        strictly less than the threshold .
    atleast-
        reads partitioned by numeric threshold where the annotation value is
        greater than or equal to the threshold .
    sorted
        reads sorted by annotation value.
    sorted-part
        reads sorted by annotation value and partitioned by count, where
         is the partition number.
    sample-n
        randomly sampled reads where  is a number specifying the sampling
        instance and  is the number of sampled reads.

output annotation fields:
    None

count

usage: SplitSeq.py count [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta]
                         [--outdir OUT_DIR] [--outname OUT_NAME] -n MAX_COUNT

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  -n MAX_COUNT          Maximum number of sequences in each new file (default:
                        None)
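
Splitting by record count amounts to chunking the input into consecutive partitions of at most MAX_COUNT records. A minimal sketch, with the hypothetical name `split_by_count` and plain list items standing in for sequence records:

```python
def split_by_count(records, max_count):
    """Chunk records into consecutive partitions of at most max_count."""
    if max_count < 1:
        raise ValueError("max_count must be a positive integer")
    return [records[i:i + max_count] for i in range(0, len(records), max_count)]


parts = split_by_count(list("abcdefg"), 3)
for number, part in enumerate(parts, start=1):
    print(number, part)
```

The last partition holds whatever remains, so it may be smaller than the others.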

group

usage: SplitSeq.py group [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta]
                         [--delim DELIMITER DELIMITER DELIMITER]
                         [--outdir OUT_DIR] [--outname OUT_NAME] -f FIELD
                         [--num THRESHOLD]

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  -f FIELD              Annotation field to split sequence files by (default:
                        None)
  --num THRESHOLD       Specify to define the split field as numeric and group
                        sequences by value (default: None)
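
Grouping can be sketched in two flavours: by the distinct values of a field, or, when a numeric threshold is given, into "under" (strictly less than) and "atleast" (greater than or equal to) bins as described in the output files list. The function name `group_records`, the exact group labels, and the choice to skip records lacking the field are all assumptions for illustration.

```python
from collections import defaultdict


def group_records(annotations, field, threshold=None):
    """Map a group label to the record ids falling in that group."""
    groups = defaultdict(list)
    for rec_id, ann in annotations.items():
        if field not in ann:
            continue  # assumed: records without the field are skipped
        if threshold is None:
            label = f"{field}-{ann[field]}"      # hypothetical label format
        elif float(ann[field]) < threshold:
            label = f"under-{threshold:g}"
        else:
            label = f"atleast-{threshold:g}"
        groups[label].append(rec_id)
    return dict(groups)


records = {
    "s1": {"DUPCOUNT": "1", "PRIMER": "IGHG"},
    "s2": {"DUPCOUNT": "2", "PRIMER": "IGHM"},
    "s3": {"DUPCOUNT": "5", "PRIMER": "IGHG"},
}
print(group_records(records, "PRIMER"))
print(group_records(records, "DUPCOUNT", threshold=2))
```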

sample

usage: SplitSeq.py sample [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta]
                          [--delim DELIMITER DELIMITER DELIMITER]
                          [--outdir OUT_DIR] [--outname OUT_NAME] -n MAX_COUNT
                          [MAX_COUNT ...] [-f FIELD] [-u VALUES [VALUES ...]]

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  -n MAX_COUNT [MAX_COUNT ...]
                        Maximum number of sequences to sample from each file
                        (default: None)
  -f FIELD              The annotation field for sampling criteria (default:
                        None)
  -u VALUES [VALUES ...]
                        A list of annotation values that sequences must
                        contain one of; requires the -f argument (default:
                        None)
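
Random sampling of up to MAX_COUNT records can be sketched with the standard library. Sampling without replacement, the cap at the number of available records, and the `seed` parameter are assumptions added for illustration; the name `sample_records` is hypothetical.

```python
import random


def sample_records(records, max_count, seed=None):
    """Draw up to max_count records without replacement."""
    rng = random.Random(seed)
    count = min(max_count, len(records))
    return rng.sample(records, count)


picked = sample_records(list(range(10)), 3, seed=1)
print(picked)
```

Passing a fixed seed makes the draw reproducible, which is convenient when comparing pipeline runs.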

samplepair

usage: SplitSeq.py samplepair [-h] -1 SEQ_FILES_1 [SEQ_FILES_1 ...] -2
                              SEQ_FILES_2 [SEQ_FILES_2 ...] [--fasta]
                              [--delim DELIMITER DELIMITER DELIMITER]
                              [--outdir OUT_DIR] [--outname OUT_NAME] -n
                              MAX_COUNT [MAX_COUNT ...] [-f FIELD]
                              [-u VALUES [VALUES ...]]
                              [--coord {illumina,solexa,sra,454,presto}]

optional arguments:
  -h, --help            show this help message and exit
  -1 SEQ_FILES_1 [SEQ_FILES_1 ...]
                        An ordered list of FASTA/FASTQ files containing
                        head/primary sequences. (default: None)
  -2 SEQ_FILES_2 [SEQ_FILES_2 ...]
                        An ordered list of FASTA/FASTQ files containing
                        tail/secondary sequences. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  -n MAX_COUNT [MAX_COUNT ...]
                        A list of the number of sequences to sample from each
                        file (default: None)
  -f FIELD              The annotation field for sampling criteria (default:
                        None)
  -u VALUES [VALUES ...]
                        A list of annotation values that both paired sequences
                        must contain one of; requires the -f argument
                        (default: None)
  --coord {illumina,solexa,sra,454,presto}
                        The format of the sequence identifier which defines
                        shared coordinate information across paired ends
                        (default: presto)

sort

usage: SplitSeq.py sort [-h] -s SEQ_FILES [SEQ_FILES ...] [--fasta]
                        [--delim DELIMITER DELIMITER DELIMITER]
                        [--outdir OUT_DIR] [--outname OUT_NAME] -f FIELD
                        [-n MAX_COUNT] [--num]

optional arguments:
  -h, --help            show this help message and exit
  -s SEQ_FILES [SEQ_FILES ...]
                        A list of FASTA/FASTQ files containing sequences to
                        process. (default: None)
  --fasta               Specify to force output as FASTA rather than FASTQ.
                        (default: None)
  --delim DELIMITER DELIMITER DELIMITER
                        A list of the three delimiters that separate
                        annotation blocks, field names and values, and values
                        within a field, respectively. (default: ('|', '=',
                        ','))
  --outdir OUT_DIR      Specify to change the output directory to the
                        location specified. The input file directory is used
                        if this is not specified. (default: None)
  --outname OUT_NAME    Changes the prefix of the successfully processed
                        output file to the string specified. May not be
                        specified with multiple input files. (default: None)
  -f FIELD              The annotation field to sort sequences by (default:
                        None)
  -n MAX_COUNT          Maximum number of sequences in each new file (default:
                        None)
  --num                 Specify to define the sort field as numeric rather
                        than textual (default: False)
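
The practical effect of --num is easiest to see on values whose textual and numeric orderings differ. The sketch below sorts record identifiers by one annotation field; the name `sort_ids` and the dictionary layout are hypothetical.

```python
def sort_ids(annotations, field, numeric=False):
    """Return record ids ordered by the value of one annotation field."""
    if numeric:
        def key(rec_id):
            return float(annotations[rec_id][field])
    else:
        def key(rec_id):
            return annotations[rec_id][field]
    return sorted(annotations, key=key)


records = {
    "a": {"DUPCOUNT": "10"},
    "b": {"DUPCOUNT": "9"},
    "c": {"DUPCOUNT": "100"},
}
print(sort_ids(records, "DUPCOUNT"))                # textual: "10" < "100" < "9"
print(sort_ids(records, "DUPCOUNT", numeric=True))  # numeric: 9 < 10 < 100
```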