trimTails {ShortRead}    R Documentation

Trim ends of reads based on nucleotides or qualities

Description

These generic functions remove leading or trailing nucleotides or qualities. trimTailw and trimTails trim low-quality nucleotides from the right end of each read, using a sliding window (trimTailw) or a tally of (successive) nucleotides falling at or below a quality threshold (trimTails). trimEnds takes an alphabet of characters to remove from the left end, the right end, or both.

Usage

trimTailw(object, k, a, halfwidth, ..., ranges=FALSE)

## S4 method for signature 'BStringSet'
trimTailw(object, k, a, halfwidth, ..., alphabet, ranges=FALSE)
## S4 method for signature 'XStringQuality'
trimTailw(object, k, a, halfwidth, ..., ranges=FALSE)

trimTails(object, k, a, successive=FALSE, ..., ranges=FALSE)
## S4 method for signature 'BStringSet'
trimTails(object, k, a, successive=FALSE, ...,
    alphabet, ranges=FALSE)
## S4 method for signature 'XStringQuality'
trimTails(object, k, a, successive=FALSE, ..., ranges=FALSE)

trimEnds(object, a, left=TRUE, right=TRUE, relation=c("<=", "=="),
    ..., ranges=FALSE)
## S4 method for signature 'XStringSet'
trimEnds(object, a, left=TRUE, right=TRUE, relation=c("<=", "=="),
    ..., ranges=FALSE)
## S4 method for signature 'XStringQuality'
trimEnds(object, a, left=TRUE, right=TRUE, relation=c("<=", "=="),
    ..., ranges=FALSE)
## S4 method for signature 'FastqQuality'
trimEnds(object, a, left=TRUE, right=TRUE, relation=c("<=", "=="),
    ..., ranges=FALSE)
## S4 method for signature 'ShortRead'
trimEnds(object, a, left=TRUE, right=TRUE, relation=c("<=", "=="),
    ..., ranges=FALSE)
## S4 method for signature 'ShortReadQ'
trimEnds(object, a, left=TRUE, right=TRUE, relation=c("<=", "=="),
    ..., ranges=FALSE)

Arguments

object

An object for which methods exist (e.g., ShortReadQ and derived classes); see below to discover these methods.

k

integer(1) describing the number of failing letters required to trigger trimming.

a

For trimTails and trimTailw, a character(1) with nchar(a) == 1L giving the letter at or below which a nucleotide is marked as failing.

For trimEnds, a character() with all nchar() == 1L giving the letter(s) at or below which a nucleotide or quality score is marked for removal.

halfwidth

The half width (cycles before or after the current; e.g., a half-width of 5 would span 5 + 1 + 5 cycles) in which qualities are assessed.

successive

logical(1) indicating whether failures can occur anywhere in the sequence, or must be successive. If successive=FALSE, the k'th failing letter and all subsequent letters are removed. If successive=TRUE, the first run of k successive failing letters and all subsequent letters are removed.

left, right

logical(1) indicating whether to trim from the left end and from the right end, respectively.

relation

character(1) selected from the argument values, i.e., “<=” or “==”, indicating whether all letters at or below a, as ordered in alphabet(object), are to be removed, or only exact matches of a.

...

Additional arguments, perhaps used by methods.

alphabet

character() (ordered low to high) letters on which the quality scale is measured. Usually supplied internally (the user does not need to specify it). If missing, it is set to ASCII characters 0-127.

ranges

logical(1) indicating whether the trimmed object (ranges=FALSE) or only the ranges satisfying the trimming condition (ranges=TRUE) should be returned.

Details

trimTailw starts at the left-most nucleotide, tabulating, within a window of 2 * halfwidth + 1 cycles centered on the current nucleotide, the number of cycles whose quality scores fall at or below a. The read is trimmed at the first nucleotide for which this count is >= k. The quality of the first or last nucleotide is used to represent portions of the window that extend beyond the sequence.
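The windowed rule can be sketched in base R for a single quality string. This is an illustration only, not the ShortRead implementation: sketchTrimTailw is a hypothetical helper, letters are assumed to be ordered by ASCII code (the default alphabet described under Arguments), and the triggering nucleotide itself is assumed to be removed along with everything after it.

```r
## Hypothetical helper, not part of ShortRead: sliding-window trim of one read.
## 'qual' is a single quality string; 'a' is a single threshold letter.
sketchTrimTailw <- function(qual, k, a, halfwidth) {
    q <- utf8ToInt(qual)              # letters as ASCII codes (assumed ordering)
    n <- length(q)
    if (n == 0L) return(qual)
    fail <- q <= utf8ToInt(a)         # TRUE where quality is at or below 'a'
    ## pad both ends with the first / last value so every window is full width
    padded <- c(rep(fail[1], halfwidth), fail, rep(fail[n], halfwidth))
    ## failing cycles in the 2 * halfwidth + 1 window centered on each position
    counts <- vapply(seq_len(n), function(i) {
        sum(padded[i:(i + 2L * halfwidth)])
    }, integer(1))
    hit <- which(counts >= k)
    ## assumption: the triggering position and all later positions are dropped
    keep <- if (length(hit)) hit[1] - 1L else n
    substr(qual, 1L, keep)
}

sketchTrimTailw("IIIIIII##I#", k = 2, a = "#", halfwidth = 1)  # "IIIIIII"
sketchTrimTailw("IIIIIII##I#", k = 3, a = "#", halfwidth = 1)  # unchanged
sketchTrimTailw("#IIII", k = 2, a = "#", halfwidth = 1)        # ""
```

The last call shows the effect of padding: the failing first letter also stands in for the missing cycle to its left, so the window at position 1 already contains two failures.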

trimTails starts at the left-most nucleotide and accumulates cycles for which the quality score is at or below a. The read is trimmed at the first location where this number >= k. With successive=TRUE, failing qualities must occur in strict succession.
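The difference between successive=FALSE and successive=TRUE can be sketched in base R. As an illustration only: sketchTrimTails is a hypothetical helper operating on one quality string, and letters are assumed to be ordered by ASCII code. It follows the description of the successive argument above, removing the k'th failing letter onward, or the first run of k failing letters onward.

```r
## Hypothetical helper, not part of ShortRead: tally-based trim of one read.
sketchTrimTails <- function(qual, k, a, successive = FALSE) {
    q <- utf8ToInt(qual)              # letters as ASCII codes (assumed ordering)
    n <- length(q)
    fail <- q <= utf8ToInt(a)         # TRUE where quality is at or below 'a'
    if (successive) {
        ## length of the unbroken run of failures ending at each position
        run <- integer(n)
        for (i in seq_len(n)) {
            prev <- if (i > 1L) run[i - 1L] else 0L
            run[i] <- if (fail[i]) prev + 1L else 0L
        }
        hit <- which(run >= k)
        ## the whole run of k failures is removed, so step back to its start
        keep <- if (length(hit)) hit[1] - k else n
    } else {
        ## failures may be scattered; trim from the k'th one onward
        hit <- which(cumsum(fail) >= k)
        keep <- if (length(hit)) hit[1] - 1L else n
    }
    substr(qual, 1L, keep)
}

sketchTrimTails("I#I#I##I", k = 2, a = "#")                     # "I#I"
sketchTrimTails("I#I#I##I", k = 2, a = "#", successive = TRUE)  # "I#I#I"
```

With scattered failures the second "#" triggers trimming, whereas with successive=TRUE the isolated failures are ignored and only the "##" run near the end triggers it.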

trimEnds examines the left, right, or both ends of object, marking for removal letters that correspond to a and relation. The trimEnds,ShortReadQ-method trims based on quality.
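A base R sketch of end trimming on a single string follows. It is illustrative only: sketchTrimEnds is a hypothetical helper, letters are assumed to be ordered by ASCII code, and for relation="<=" with several letters in a, the highest letter is assumed to act as the threshold.

```r
## Hypothetical helper, not part of ShortRead: trim matching letters from the ends.
sketchTrimEnds <- function(x, a, left = TRUE, right = TRUE,
                           relation = c("<=", "==")) {
    relation <- match.arg(relation)
    codes <- utf8ToInt(x)                        # letters as ASCII codes
    n <- length(codes)
    aCodes <- utf8ToInt(paste(a, collapse = ""))
    ## letters eligible for removal (assumption: max(a) is the "<=" threshold)
    drop <- if (relation == "==") codes %in% aCodes else codes <= max(aCodes)
    ## lengths of the leading / trailing runs of removable letters
    nLead <- if (left) sum(cumprod(drop)) else 0
    nTrail <- if (right) sum(cumprod(rev(drop))) else 0
    start <- nLead + 1
    end <- n - nTrail
    if (start > end) "" else substr(x, start, end)
}

sketchTrimEnds("NNACGTNAN", "N", relation = "==")         # "ACGTNA"
sketchTrimEnds("GCATTAGC", c("G", "C"), relation = "==")  # "ATTA"
sketchTrimEnds("##II5I#", "5")                            # "II5I"
```

Only the runs at the ends are removed; matching letters in the interior, such as the "N" in "ACGTNA" or the "5" in "II5I", are retained.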

Value

An instance of class(object) trimmed to contain only those nucleotides satisfying the trim criterion or, if ranges=TRUE, an IRanges instance defining the ranges that would trim object.
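Returning ranges rather than the trimmed object is useful when the trimming decision is made on one set of strings (e.g., qualities) and applied to another (e.g., the reads). The base R sketch below illustrates the idea under stated assumptions: sketchTrimRange is a hypothetical helper, a data.frame stands in for an IRanges instance, and the trimming rule is the successive=FALSE tally described in Details with letters ordered by ASCII code.

```r
## Hypothetical helper, not part of ShortRead: ranges kept after tail trimming.
sketchTrimRange <- function(qual, k, a) {
    keep <- vapply(qual, function(q) {
        ## position of the k'th letter at or below 'a', if any
        hit <- which(cumsum(utf8ToInt(q) <= utf8ToInt(a)) >= k)
        if (length(hit)) hit[1] - 1L else nchar(q)
    }, integer(1), USE.NAMES = FALSE)
    ## data.frame used here as a stand-in for an IRanges instance
    data.frame(start = rep(1L, length(keep)), end = keep, width = keep)
}

reads <- c("ACGTACGT", "TTGGCCAA")
quals <- c("IIII#I#I", "IIIIIIII")
rng <- sketchTrimRange(quals, k = 2, a = "#")
## apply the ranges computed on the qualities to the reads
substring(reads, rng$start, rng$end)   # "ACGTAC" "TTGGCCAA"
```

This mirrors the pattern in the Examples, where ranges computed with ranges=TRUE are passed to narrow().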

Author(s)

Martin Morgan <mtmorgan@fhcrc.org>

Examples

library(ShortRead)

## list the classes for which trimTails methods are defined
showMethods(trimTails)

sp <- SolexaPath(system.file('extdata', package='ShortRead'))
rfq <- readFastq(analysisPath(sp), pattern="s_1_sequence.txt")

## remove leading / trailing quality scores <= 'I'
trimEnds(rfq, "I")
## remove leading / trailing 'N's
rng <- trimEnds(sread(rfq), "N", relation="==", ranges=TRUE)
narrow(rfq, start(rng), end(rng))
## remove leading / trailing 'G's or 'C's
trimEnds(rfq, c("G", "C"), relation="==")


[Package ShortRead version 1.12.4 Index]