Grouping-class {IRanges}    R Documentation
In this man page, we call "grouping" the action of dividing a collection of NO objects into NG groups (some of which may be empty). The Grouping class and subclasses are containers for representing groupings.
Let's give a formal description of the Grouping core API:
Groups G_i are indexed from 1 to NG (1 <= i <= NG).
Objects O_j are indexed from 1 to NO (1 <= j <= NO).
Every object must belong to one group and only one.
Given that empty groups are allowed, NG can be greater than NO.
Grouping an empty collection of objects (NO = 0) is supported. In that case, all the groups are empty. And only in that case, NG can be zero too (meaning there are no groups).
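The rules above can be illustrated with a small base R sketch. It is not the IRanges implementation: the grouping is modelled here as a plain integer vector, and the names to_group, group_sizes and empty_sizes are assumptions made for illustration only.

```r
# Hypothetical plain-R model of a grouping (not the IRanges implementation):
# to_group[j] holds the index i of the group that object O_j belongs to.
NG <- 5L
to_group <- c(1L, 1L, 3L, 5L, 5L, 5L)
NO <- length(to_group)

# Every object belongs to exactly one group, indexed between 1 and NG.
stopifnot(!anyNA(to_group), all(to_group >= 1L), all(to_group <= NG))

# Number of objects per group; a count of 0 marks an empty group.
group_sizes <- tabulate(to_group, nbins = NG)

# Grouping an empty collection (NO = 0): every group is empty.
empty_sizes <- tabulate(integer(0), nbins = 3L)
```

Because empty groups are allowed, the number of groups can exceed the number of objects, as in the second case.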
If x is a Grouping object:
length(x): Returns the number of groups (NG).
names(x): Returns the names of the groups.
nobj(x): Returns the number of objects (NO). Equivalent to length(togroup(x)).
Going from groups to objects:
x[[i]]: Returns the indices of the objects (the j's) that belong to G_i. The j's are returned in ascending order. This provides the mapping from groups to objects (one-to-many mapping).
grouplength(x, i=NULL): Returns the number of objects in G_i. Works in a vectorized fashion (unlike x[[i]]). grouplength(x) is equivalent to grouplength(x, seq_len(length(x))). If i is not NULL, grouplength(x, i) is equivalent to sapply(i, function(ii) length(x[[ii]])).
members(x, i): Equivalent to x[[i]] if i is a single integer. Otherwise, if i is an integer vector of arbitrary length, it's equivalent to sort(unlist(sapply(i, function(ii) x[[ii]]))).
vmembers(x, L): A version of members that works in a vectorized fashion with respect to the L argument (L must be a list of integer vectors). Returns lapply(L, function(i) members(x, i)).
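The equivalences stated for the group-to-object accessors can be reproduced on an ordinary list. The following is a minimal sketch, assuming the grouping is stored as a list of integer vectors; the functions ending in _sketch are hypothetical stand-ins and not part of IRanges.

```r
# Hypothetical base-R stand-ins for the group-to-object accessors.
NG <- 4L
to_group <- c(2L, 1L, 2L, 4L, 2L)

# groups[[i]] plays the role of x[[i]]: object indices of G_i, ascending.
groups <- unname(split(seq_along(to_group),
                       factor(to_group, levels = seq_len(NG))))

grouplength_sketch <- function(groups, i = NULL) {
  if (is.null(i)) i <- seq_along(groups)
  vapply(i, function(ii) length(groups[[ii]]), integer(1))
}

# Members of several groups are pooled, then sorted.
members_sketch <- function(groups, i) {
  sort(unlist(lapply(i, function(ii) groups[[ii]])))
}

# One members() result per element of the list L.
vmembers_sketch <- function(groups, L) {
  lapply(L, function(i) members_sketch(groups, i))
}
```

For instance, members_sketch(groups, c(4L, 1L)) pools groups 4 and 1 and returns their object indices in ascending order, regardless of the order in which the groups were requested.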
Going from objects to groups:
togroup(x, j=NULL): Returns the index i of the group that O_j belongs to. This provides the mapping from objects to groups (many-to-one mapping). Works in a vectorized fashion. togroup(x) is equivalent to togroup(x, seq_len(nobj(x))): both return the entire mapping in an integer vector of length NO. If j is not NULL, togroup(x, j) is equivalent to y <- togroup(x); y[j].
tofactor(x): Like togroup, except a factor is formed with the level set defined as seq_len(length(x)).
togrouplength(x, j=NULL): Returns the number of objects that belong to the same group as O_j (including O_j itself). Equivalent to grouplength(x, togroup(x, j)).
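The object-to-group accessors reduce to simple indexing once the full mapping is available as an integer vector. A minimal base R sketch under that assumption follows; the _sketch functions and the variables to_group and group_sizes are hypothetical and not part of IRanges.

```r
# Hypothetical base-R stand-ins for the object-to-group accessors.
NG <- 4L
to_group <- c(2L, 1L, 2L, 4L, 2L)              # full object-to-group mapping
group_sizes <- tabulate(to_group, nbins = NG)  # objects per group

togroup_sketch <- function(to_group, j = NULL) {
  if (is.null(j)) to_group else to_group[j]
}

# Same mapping as a factor whose levels cover all NG groups, empty or not.
tofactor_sketch <- function(to_group, NG) {
  factor(to_group, levels = seq_len(NG))
}

# Size of the group each requested object belongs to.
togrouplength_sketch <- function(to_group, group_sizes, j = NULL) {
  group_sizes[togroup_sketch(to_group, j)]
}
```

Note that the factor keeps a level for group 3 even though no object maps to it.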
Given that length, names and [[ are defined for Grouping objects, those objects can be considered List objects. In particular, as.list works out-of-the-box on them.
One important property of any Grouping object x is that unlist(as.list(x)) is always a permutation of seq_len(nobj(x)). This is a direct consequence of the fact that every object in the grouping belongs to one group and only one.
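The permutation property can be checked on any list of integer vectors. The sketch below uses base R only; is_valid_grouping_sketch is a hypothetical helper, not an IRanges function.

```r
# Hypothetical check: do the groups list every object index exactly once?
is_valid_grouping_sketch <- function(groups, NO) {
  flat <- as.integer(unlist(groups))
  length(flat) == NO && identical(sort(flat), seq_len(NO))
}

ok  <- list(c(2L, 5L), integer(0), c(1L, 3L, 4L))  # each index appears once
bad <- list(c(1L, 2L), c(2L, 3L))                  # object 2 is in two groups

is_valid_grouping_sketch(ok, NO = 5L)
is_valid_grouping_sketch(bad, NO = 3L)
```

The second call fails the check because an object assigned to two groups makes the flattened list longer than NO.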
[DOCUMENT ME]
A Partitioning container represents a block-grouping, i.e. a grouping where each group contains objects that are neighbors in the original collection of objects. More formally, a grouping x is a block-grouping iff togroup(x) is sorted in increasing order (not necessarily strictly increasing).
A block-grouping object can also be seen (and manipulated) as a Ranges object where all the ranges are adjacent starting at 1 (i.e. it covers the 1:NO interval with no overlap between the ranges).
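The block-grouping condition and the corresponding adjacent ranges can be derived from the object-to-group mapping alone. This is a minimal base R sketch, assuming the mapping is given as an integer vector; the variable names are illustrative and nothing here calls IRanges.

```r
# Hypothetical mapping for 8 objects in 4 groups (group 3 is empty).
NG <- 4L
to_group <- c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 4L)

# Block-grouping iff the mapping is non-decreasing.
is_block <- !is.unsorted(to_group)

# The same grouping seen as adjacent ranges covering 1:NO.
width <- tabulate(to_group, nbins = NG)
end   <- cumsum(width)
start <- end - width + 1L
```

An empty group shows up as a zero-width range whose start is one past its end.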
Note that a Partitioning object is both a particular type of Grouping object and a particular type of Ranges object. Therefore all the methods that are defined for Grouping and Ranges objects can also be used on a Partitioning object. See ?Ranges for a description of the Ranges API.
The Partitioning class is virtual with 2 concrete subclasses: PartitioningByEnd (only stores the end of the groups, allowing fast mapping from groups to objects), and PartitioningByWidth (only stores the width of the groups).
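The two storage schemes carry the same information, since ends are the cumulative sum of the widths and widths are the successive differences of the ends. The base R sketch below illustrates this and shows how the members of a group could be recovered from the ends alone; members_from_end_sketch is a hypothetical helper and does not reflect how IRanges implements the lookup.

```r
# Ends of 5 partitions (the 3rd one is empty).
end <- c(4L, 7L, 7L, 8L, 15L)

# Converting between the two representations.
width <- diff(c(0L, end))
stopifnot(identical(cumsum(width), end))

# Hypothetical lookup of the object indices in group i, using only 'end'.
members_from_end_sketch <- function(end, i) {
  first <- if (i == 1L) 1L else end[i - 1L] + 1L
  seq_len(end[i] - first + 1L) + first - 1L
}

members_from_end_sketch(end, 2L)
members_from_end_sketch(end, 3L)
```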
H2LGrouping(high2low=integer()): [DOCUMENT ME]
Dups(high2low=integer()): [DOCUMENT ME]
PartitioningByEnd(end=integer(), names=NULL): Returns the PartitioningByEnd object made of the partitions ending at the values specified by end. end must contain sorted non-negative integer values. If the names argument is non-NULL, it is used to name the partitions.
PartitioningByWidth(width=integer(), names=NULL): Returns the PartitioningByWidth object made of the partitions with the widths specified by width. width must contain non-negative integer values. If the names argument is non-NULL, it is used to name the partitions.
Note that these constructors don't recycle their names argument (to remain consistent with what `names<-` does on standard vectors).
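For reference, this is what `names<-` does on a standard vector when given fewer names than elements. The sketch uses base R only and says nothing about how the constructors themselves react to a names argument that is too short.

```r
# `names<-` on a standard vector pads with NA instead of recycling.
v <- 1:3
names(v) <- "a"
names(v)
```

The single name is assigned to the first element and the remaining names are NA.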
H. Pages and P. Aboyoun
List-class, Ranges-class, IRanges-class, successiveIRanges, cumsum, diff
showClass("Grouping")  # shows (some of) the known subclasses

## ---------------------------------------------------------------------
## A. H2LGrouping OBJECTS
## ---------------------------------------------------------------------
high2low <- c(NA, NA, 2, 2, NA, NA, NA, 6, NA, 1, 2, NA, 6, NA, NA, 2)
x <- H2LGrouping(high2low)
x

## The Grouping core API:
length(x)
nobj(x)  # same as 'length(x)' for H2LGrouping objects
x[[1]]
x[[2]]
x[[3]]
x[[4]]
x[[5]]
grouplength(x)  # same as 'unname(sapply(x, length))'
grouplength(x, 5:2)
members(x, 5:2)  # all the members are put together and sorted
togroup(x)
togroup(x, 5:2)
togrouplength(x)  # same as 'grouplength(x, togroup(x))'
togrouplength(x, 5:2)

## The List API:
as.list(x)
sapply(x, length)

## ---------------------------------------------------------------------
## B. Dups OBJECTS
## ---------------------------------------------------------------------
x_dups <- as(x, "Dups")
x_dups
duplicated(x_dups)  # same as 'duplicated(togroup(x_dups))'

### The purpose of a Dups object is to describe the groups of duplicated
### elements in a vector-like object:
x <- c(2, 77, 4, 4, 7, 2, 8, 8, 4, 99)
x_high2low <- high2low(x)
x_high2low  # same length as 'x'
x_dups <- Dups(x_high2low)
x_dups
togroup(x_dups)
duplicated(x_dups)
togrouplength(x_dups)  # frequency for each element
table(x)

## ---------------------------------------------------------------------
## C. Partitioning OBJECTS
## ---------------------------------------------------------------------
x <- PartitioningByEnd(end=c(4, 7, 7, 8, 15), names=LETTERS[1:5])
x  # the 3rd partition is empty

## The Grouping core API:
length(x)
nobj(x)
x[[1]]
x[[2]]
x[[3]]
grouplength(x)  # same as 'unname(sapply(x, length))' and 'width(x)'
togroup(x)
togrouplength(x)  # same as 'grouplength(x, togroup(x))'
names(x)

## The Ranges core API:
start(x)
end(x)
width(x)

## The List API:
as.list(x)
sapply(x, length)

## Replacing the names:
names(x)[3] <- "empty partition"
x

## Coercion to an IRanges object:
as(x, "IRanges")

## Other examples:
PartitioningByEnd(end=c(0, 0, 19), names=LETTERS[1:3])
PartitioningByEnd()  # no partition
PartitioningByEnd(end=integer(9))  # all partitions are empty

## ---------------------------------------------------------------------
## D. RELATIONSHIP BETWEEN Partitioning OBJECTS AND successiveIRanges()
## ---------------------------------------------------------------------
mywidths <- c(4, 3, 0, 1, 7)

## The 3 following calls produce the same ranges:
x1 <- successiveIRanges(mywidths)  # IRanges instance.
x2 <- PartitioningByEnd(end=cumsum(mywidths))  # PartitioningByEnd instance.
x3 <- PartitioningByWidth(width=mywidths)  # PartitioningByWidth instance.
stopifnot(identical(as(x1, "PartitioningByEnd"), x2))
stopifnot(identical(as(x1, "PartitioningByWidth"), x3))