# Martin Maechler: When you think `class(.) == *`, think again!

## Historical relict: R `matrix` is not an `array`

In a recent discussion on the `R-devel` mailing list, in a thread started on July 8 titled "head.matrix can return 1000s of columns – limit to n or add new argument?", Michael Chirico and then Gabe Becker were proposing to generalize the `head()` and `tail()` utility functions, and Gabe noted that current (pre R-4.x.y) `head()` would not treat `array` specially. I replied, noting that R currently typically needs both a `matrix` and an `array` method:

Note however the following historical quirk:

``````sapply(setNames(,1:5),
       function(K) inherits(array(7, dim=1:K), "array"))``````

(As I hope this will change, I explicitly show the current R 3.x.y result rather than evaluating the above R chunk:)

``````     1     2     3     4     5
TRUE FALSE  TRUE  TRUE  TRUE``````

Note that `matrix` objects are not `array`s in that (inheritance) sense, even though, as many useRs may not be aware,

``````identical(
    matrix(47, 2,3), # NB  " n, n+1 " is slightly special
    array (47, 2:3))
## [1] TRUE``````

all matrices can equivalently be constructed by `array(.)`, though slightly more clumsily in the case of `matrix(*, byrow=TRUE)`.
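As a minimal sketch of the `byrow=TRUE` case (the variable names `m` and `a` are only illustrative): `array()` always fills column by column, so one way to reproduce a row-wise fill is to fill the transposed shape and then transpose.

```r
# matrix() filled row by row
m <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)

# array() fills column-wise: fill the transposed 3 x 2 shape, then transpose
a <- t(array(1:6, dim = c(3, 2)))

# Both routes give the same object
same <- identical(m, a)

# is.matrix() and is.array() test the dim attribute rather than class(.)
dims_ok <- is.matrix(a) && is.array(m)
```

Both `same` and `dims_ok` should be `TRUE` here, independently of whether `inherits(m, "array")` is.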

Because of that, base R itself has three functions where the `matrix` and the `array` methods are identical, as I wrote in the post:

> The consequence of that is that currently, “often” `foo.matrix` is just a copy of `foo.array` in the case the latter exists, with `base` examples of foo in {unique, duplicated, anyDuplicated}.

``````for(e in expression(unique, duplicated, anyDuplicated)) { # `e` is a `symbol`
    f.m <- get(paste(e, "matrix", sep="."))
    f.a <- get(paste(e, "array",  sep="."))
    stopifnot(is.function(f.m),
              identical(f.m, f.a))
}``````

## In R 4.0.0, will a `matrix()` be an `"array"`?

In that same post, I also asked:

> Is this something we should consider changing for R 4.0.0 – to have it TRUE also for 2d-arrays aka matrix objects ??

In the meantime, I’ve tentatively answered “yes” to my own question, and started investigating some of the consequences. What I found in overly eager (unit) tests, some even written by myself, reminded me that I had wanted to teach more people about an underlying related issue, where we have seen many useRs use R unsafely:

## If you think `class(.) == *`, think again: Rather `inherits(., *)` … or `is(., *)`

Most non-beginning R users are aware of inheritance between classes, and even more generally that R objects, at least conceptually, are of more than one “kind”. E.g., `pi` is both `"numeric"` and `"double"`, and `1:2` is both `"integer"` and `"numeric"`. They may know that date-time objects come in two forms: The `?DateTimeClasses` (or `?POSIXt`) help page describes `POSIXct` and `POSIXlt` and says

> `"POSIXct"` is more convenient for including in data frames, and `"POSIXlt"` is closer to human-readable forms. A virtual class `"POSIXt"` exists from which both of the classes inherit …

and for example

``````(tm <- Sys.time())
## [1] "2019-12-05 11:47:54 CET"
class(tm)
## [1] "POSIXct" "POSIXt"``````

shows that `class(.)` is of length two here, which breaks an `if (class(x) == "....") ..` call, since `if()` expects a condition of length one.
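The difference can be made concrete with a short sketch (the variable names are illustrative): comparing `class(.)` with `==` yields one logical value per class, whereas `inherits()` always returns a single `TRUE` or `FALSE`.

```r
tm <- Sys.time()

# One comparison result per element of class(tm), so a vector of length two
cmp <- class(tm) == "POSIXct"
n_cmp <- length(cmp)

# inherits() answers with exactly one logical value
is_ct <- inherits(tm, "POSIXct")
is_t  <- inherits(tm, "POSIXt")
is_lt <- inherits(tm, "POSIXlt")
```

Here `is_ct` and `is_t` are both `TRUE`, while `is_lt` is `FALSE`, and each of them is safe to use directly as an `if()` condition.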

### Formal Classes: `S4`

R’s formal class system, called `S4` (implemented mainly in the standard R package `methods`), provides functionality and tools to implement rich class inheritance structures, made use of heavily in package `Matrix`, or in the Bioconductor project with its 1800+ R “software” packages. Bioconductor even builds on core packages providing much-used S4 classes, e.g., Biostrings, S4Vectors, XVector, IRanges, and GenomicRanges. See also Common Bioconductor Methods and Classes.

Within the formal S4 class system, where extension and inheritance are important and often widely used, an expression such as

``if (class(obj) == "matrix")  { ..... }   # *bad* - do not copy !``

is particularly unhelpful, as `obj` could well be of a class that extends matrix, and S4-using programmeRs learn early to rather use

``if (is(obj, "matrix"))  { ..... }        # *good* !!!``

Note that the Bioconductor guidelines for package developers have warned about the misuse of `class(.) == *`; see the section R Code and Best Practices.
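A small sketch of the S4 situation, using two hypothetical classes `"Shape"` and `"Circle"` that exist only for this illustration: the name comparison fails for an object of the extending class, while `is()` follows the inheritance.

```r
library(methods)

# Hypothetical classes, defined only for this illustration
setClass("Shape",  slots = c(name = "character"))
setClass("Circle", contains = "Shape", slots = c(r = "numeric"))

obj <- new("Circle", name = "unit circle", r = 1)

# Compares the class name only: "Circle" is not "Shape"
naive <- any(class(obj) == "Shape")

# Follows the class hierarchy: a "Circle" is a "Shape"
proper <- is(obj, "Shape")
```

With these definitions `naive` is `FALSE` and `proper` is `TRUE`, so code guarded by the name comparison would silently skip `"Circle"` objects.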

### Informal “Classical” Classes: `S3`

R was created as a dialect or implementation of S, see Wikipedia’s R History, and for S, the “White Book” (Chambers & Hastie, 1992) introduced a convenient and relatively simple object orientation (OO), later dubbed `S3` because the white book introduced S version 3 (where the blue book described S version 2, and the green book S version 4, i.e., `S4`).

The white book also introduced formulas, data frames, etc., and in some cases also the idea that some S objects could be particular cases of a given class, and in that sense extend that class. Examples, in R, too, have been multivariate time series (`"mts"`) extending (simple) time series (`"ts"`), or multivariate or generalized linear models (`"mlm"` or `"glm"`) extending normal linear models `"lm"`.
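The `"glm"` case can be checked directly; the following sketch fits a model to the built-in `cars` data set (the choice of data and formula is arbitrary and only serves the illustration).

```r
# A generalized linear model fitted with the default (gaussian) family
fit <- glm(dist ~ speed, data = cars)

# The class vector lists the most specific class first
cls <- class(fit)

# The fit is a "glm" and, by S3 inheritance, also an "lm"
is_glm <- inherits(fit, "glm")
is_lm  <- inherits(fit, "lm")
```

Here `cls` is `c("glm", "lm")`, which is why methods written for `"lm"` objects are also found for `"glm"` fits unless a more specific `"glm"` method exists.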

### The “Workaround”: `class(.)[1]`

So, some more experienced and careful programmers have been replacing `class(x)` by `class(x)[1]` (or `class(x)[1L]`) in such comparisons, e.g., in a good and widely lauded useR! 2018 talk.
In some cases, this is good enough, and it is also what R’s `data.class(.)` function does (among other things), or the (user hidden) `methods:::.class1(.)`.

However, programmeRs should be aware that this is just a workaround and leads to their code working incorrectly in cases where typical S3 inheritance is used: In some situations it is very natural to slightly modify or extend a function `fitme()` whose result is of class `"fitme"`, typically by writing `fitmeMore()`, say, whose value would be of class `c("fMore", "fitme")` such that almost all “fitme” methods would continue to work, but the author of `fitmeMore()` would additionally provide a `print()` method, i.e., provide method function `print.fMore()`.

But if other users work with `class(.)[1]` and have provided code for the case `class(.)[1] == "fitme"`, that code would wrongly not apply to the new `"fMore"` objects.
The only correct solution is to work with `inherits(., "fitme")` as that would apply to all objects it should.
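The scenario can be sketched as follows. The bodies of `fitme()` and `fitmeMore()` and the `summary.fitme()` method are invented for illustration; only the function and class names follow the text.

```r
# Hypothetical fitting function returning an object of class "fitme"
fitme <- function(x) {
  structure(list(center = mean(x)), class = "fitme")
}

# Hypothetical extension: same object plus one component and one more class
fitmeMore <- function(x) {
  res <- fitme(x)
  res$spread <- sd(x)
  class(res) <- c("fMore", class(res))
  res
}

# An existing "fitme" method, which "fMore" objects inherit
summary.fitme <- function(object, ...) object$center

# The additional print() method provided by the author of fitmeMore()
print.fMore <- function(x, ...) {
  cat("<fMore> center:", x$center, " spread:", x$spread, "\n")
  invisible(x)
}

fm <- fitmeMore(c(1, 2, 3, 4))

first_matches <- class(fm)[1] == "fitme"  # the first class is "fMore"
inherited     <- inherits(fm, "fitme")    # looks at the whole class vector
ctr           <- summary(fm)              # dispatches to summary.fitme()
```

In this sketch `first_matches` is `FALSE` although `fm` is, by construction, a `"fitme"` object; `inherits()` and S3 dispatch both treat it as one.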

In a CRAN package that many others depend on, the following line (slightly obfuscated), which should efficiently determine the list entries of a certain class,

``isC <- vapply(args, class, "") == "__my_class__"``

was found (and reported to the package maintainer) to need correction to

``isC <- vapply(args, inherits, TRUE, what = "__my_class__")``
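A sketch of why the first form is fragile, using an illustrative list and `"POSIXct"` in place of the obfuscated class name: `vapply(args, class, "")` requires exactly one string per element and signals an error as soon as one entry has a class vector of length two, while the `inherits()` form returns one logical per entry.

```r
# Illustrative list: the second entry has class c("POSIXct", "POSIXt")
args <- list(1, Sys.time(), "a")

# class() returns two strings for the second entry, which vapply() rejects
by_class <- tryCatch(vapply(args, class, "") == "POSIXct",
                     error = function(e) "error")

# inherits() gives a single TRUE/FALSE per entry, whatever the class length
by_inherits <- vapply(args, inherits, TRUE, what = "POSIXct")
```

Here `by_class` ends up as the string `"error"` from the handler, and `by_inherits` correctly flags only the second entry.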

### Summary:

Instead of `class(x) == "foo"`, you should use `inherits(x, "foo")`
or maybe alternatively `is(x, "foo")`

#### Corollary:

``````switch(class(x)[1],
       "class_1" = { ..... },
       "class_2" = { ..... },
       .......,
       .......,
       "class_10" = { ..... },
       stop(" ... invalid class:", class(x)))``````

may look clean, but it is almost always not good enough, as it is (typically) wrong, e.g., when `class(x)` is `c("class_7", "class_2")`.
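One possible alternative, sketched here under the assumption that only a few classes need to be distinguished, is a chain of `inherits()` tests. The helper `handle()` and its return values are hypothetical; defining an S3 generic with one method per class would be another route.

```r
# Hypothetical object whose class vector has "class_2" in second place
x <- structure(list(), class = c("class_7", "class_2"))

# switch() on the first class only: "class_7" matches no branch
by_switch <- switch(class(x)[1],
                    "class_1" = "one",
                    "class_2" = "two",
                    "unhandled")

# A chain of inherits() tests looks at the whole class vector
handle <- function(x) {
  if (inherits(x, "class_1")) {
    "one"
  } else if (inherits(x, "class_2")) {
    "two"
  } else {
    stop("invalid class: ", paste(class(x), collapse = ", "))
  }
}
by_inherits <- handle(x)
```

In this sketch `by_switch` falls through to `"unhandled"`, whereas `by_inherits` is `"two"`, because `x` does inherit from `"class_2"`.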