OverlapEncodings-class {IRanges}    R Documentation

OverlapEncodings objects

Description

The OverlapEncodings class is a container for storing the "overlap encodings" returned by the encodeOverlaps function.

Usage

## OverlapEncodings accessors:

## S4 method for signature 'OverlapEncodings'
length(x)
## S4 method for signature 'OverlapEncodings'
Loffset(x)
## S4 method for signature 'OverlapEncodings'
Roffset(x)
## S4 method for signature 'OverlapEncodings'
encoding(x)

## Coercing an OverlapEncodings object:

## S4 method for signature 'OverlapEncodings'
as.data.frame(x, row.names=NULL, optional=FALSE, ...)

Arguments

x

An OverlapEncodings object.

row.names

NULL or a character vector.

optional, ...

Ignored.

Details

Given a query and a subject of the same length, both list-like objects whose top-level elements typically contain multiple ranges (e.g. RangesList objects), the "overlap encoding" of the i-th element in query and the i-th element in subject is a character string describing how the ranges in query[[i]] are qualitatively positioned relative to the ranges in subject[[i]].

The encodeOverlaps function computes those overlap encodings and returns them in an OverlapEncodings object of the same length as query and subject.

OverlapEncodings accessors

In the following code snippets, x is an OverlapEncodings object typically obtained by a call to encodeOverlaps(query, subject).

length(x): Get the number of elements (i.e. encodings) in x. This is equal to length(query) and length(subject).

Loffset(x), Roffset(x): Get the "left-offsets" and "right-offsets" of the encodings, respectively. Both are integer vectors of the same length as x.

Let's denote Qi = query[[i]], Si = subject[[i]], and [q1,q2] the range covered by Qi, i.e. q1 = min(start(Qi)) and q2 = max(end(Qi)). Then Loffset(x)[i] is the number L of ranges at the head of Si that are strictly to the left of all the ranges in Qi, i.e. L is the greatest value such that end(Si)[k] < q1 - 1 for all k in seq_len(L). Similarly, Roffset(x)[i] is the number R of ranges at the tail of Si that are strictly to the right of all the ranges in Qi, i.e. R is the greatest value such that start(Si)[length(Si) + 1 - k] > q2 + 1 for all k in seq_len(R).
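The offset definitions above can be sketched in base R. This is only an illustration of the definitions, not the package's implementation: plain integer vectors stand in for the starts and ends of Qi and Si, and count_offsets is a hypothetical helper name that does not exist in IRanges.

```r
# Sketch of the left/right offset definitions using plain integer vectors.
# 'count_offsets' is a hypothetical helper, not an IRanges function.
count_offsets <- function(q_start, q_end, s_start, s_end) {
  q1 <- min(q_start)  # leftmost position covered by Qi
  q2 <- max(q_end)    # rightmost position covered by Qi

  # L: length of the leading run of subject ranges with end < q1 - 1
  left_ok <- s_end < q1 - 1
  L <- if (all(left_ok)) length(left_ok) else which(!left_ok)[1] - 1L

  # R: length of the trailing run of subject ranges with start > q2 + 1
  right_ok <- rev(s_start > q2 + 1)
  R <- if (all(right_ok)) length(right_ok) else which(!right_ok)[1] - 1L

  c(Loffset = L, Roffset = R)
}

# Made-up data: the query covers [20, 45]; the subject has 5 ranges.
off <- count_offsets(
  q_start = c(20, 40), q_end = c(25, 45),
  s_start = c(1, 10, 22, 50, 60),
  s_end   = c(5, 15, 30, 55, 65)
)
print(off)
```

With this made-up input, the first two subject ranges end before q1 - 1 = 19 and the last two start after q2 + 1 = 46, so both offsets are 2.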

encoding(x): Factor of the same length as x where the i-th element is the encoding obtained by comparing each range in Qi with all the ranges in tSi = Si[(1+L):(length(Si)-R)] (tSi stands for "trimmed Si"). More precisely, here is how this encoding is obtained:

  1. All the ranges in Qi are compared with tSi[1], then with tSi[2], and so on. At each step (one step per range in tSi), comparing all the ranges in Qi with tSi[k] is done with rangeComparisonCodeToLetter(compare(Qi, tSi[k])). So at each step, we end up with a vector of M single letters (where M is length(Qi)).

  2. Each vector obtained previously (1 vector per range in tSi, all of them of length M) is turned into a single string by pasting its individual letters together.

  3. All the strings obtained previously (1 per range in tSi) are pasted together into a single long string and separated by colons (":"). An additional colon is prepended to the long string and another one appended to it.

  4. Finally, the value of M is prepended to the long string. The final string is the encoding.
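Steps 2 to 4 above, together with the trimming of Si, can be sketched in base R as follows. The sketch assumes step 1 has already been done: letters_by_step is a hypothetical input holding one character vector of length M per range in tSi, and the letters used in the example are arbitrary placeholders rather than the result of comparing real ranges. trimmed_index and build_encoding are illustrative names, not IRanges functions.

```r
# Indices of the "trimmed Si" for a subject of n ranges with offsets L and R.
# seq_len() is used so that an empty trim yields integer(0) rather than a
# descending sequence, which (1+L):(n-R) would produce when L + R == n.
trimmed_index <- function(n, L, R) seq_len(n - L - R) + L

# Steps 2-4 of the encoding construction.
# 'letters_by_step': list with one character vector of length M per range
# in tSi (hypothetical stand-in for the output of step 1).
build_encoding <- function(letters_by_step, M) {
  stopifnot(all(lengths(letters_by_step) == M))
  # Step 2: collapse each vector of M letters into a single string
  blocks <- vapply(letters_by_step, paste, character(1), collapse = "")
  # Step 3: join the strings with ":" and add a leading and trailing colon
  long <- paste0(":", paste(blocks, collapse = ":"), ":")
  # Step 4: prepend M
  paste0(M, long)
}

idx <- trimmed_index(n = 5L, L = 2L, R = 2L)
enc <- build_encoding(list(c("j", "m"), c("a", "f")), M = 2L)
print(idx)
print(enc)
```

Here idx is 3 (only the third subject range survives trimming), and enc is "2:jm:af:" for a query with M = 2 ranges compared against two trimmed subject ranges.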

Coercing an OverlapEncodings object

In the following code snippets, x is an OverlapEncodings object.

as.data.frame(x): Return x as a data frame with columns "Loffset", "Roffset" and "encoding".

Author(s)

H. Pages

See Also

encodeOverlaps, compare, RangesList-class

Examples

example(encodeOverlaps)  # to make 'ovenc'

length(ovenc)
Loffset(ovenc)
Roffset(ovenc)
encoding(ovenc)
as.data.frame(ovenc)

[Package IRanges version 1.14.4 Index]