There is support for vectors longer than 2^31 - 1 elements on 64-bit
platforms. This applies to raw, logical, integer, double, complex and
character vectors, as well as lists. (Elements of character vectors
remain limited to 2^31 - 1 bytes.)
Use of such vectors is work-in-progress.
Most operations which can sensibly be done with long vectors now work:
others may return the error ‘long vectors not supported yet’. Some of
these fail because they explicitly work with integer indices (e.g.
‘anyDuplicated()’ and ‘match()’), others because other limits (e.g. of
character strings or matrix dimensions) would be exceeded or the
operations would be extremely slow.
‘length()’ returns a double for long vectors, and lengths can be set to
2^31 or more by the replacement function with a double value.
Most aspects of indexing are available. Generally double-valued
indices can be used to access elements beyond 2^31 - 1.
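
The points above about ‘length()’ and double-valued indices can be sketched as follows. This is only an illustration: the ‘run_big’ switch is a hypothetical name introduced here, and the guarded part needs a 64-bit build of R with well over 2 GiB of free memory, so it is disabled by default.

```r
# Largest length an ordinary (non-long) vector can have: 2^31 - 1.
int_max <- .Machine$integer.max

# For ordinary vectors length() is an integer.
small <- raw(10)
len_small_type <- typeof(length(small))

# 2^31 and above cannot be stored in an R integer, so indices beyond
# that point have to be doubles. (int_max + 1L would overflow to NA.)
first_long_index <- int_max + 1

# Hypothetical switch, not part of R: set to TRUE only on a 64-bit
# machine with plenty of free memory.
run_big <- FALSE
if (run_big) {
  x <- raw(2^31)                # about 2 GiB, one element past the limit
  print(typeof(length(x)))      # a double is expected for a long vector
  x[2^31] <- as.raw(1L)         # double-valued index beyond 2^31 - 1
  print(x[2^31])
  length(x) <- 2^31 + 1         # replacement function with a double value;
                                # may temporarily need room for a second copy
  rm(x)
}
```

Raw vectors are used because they take one byte per element, which keeps the memory cost of crossing the 2^31 boundary as low as possible.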
There is some support for matrices and arrays with each dimension less
than 2^31 but total number of elements more than that. Only some
aspects of matrix algebra work for such matrices, often taking a very
long time. In other cases the underlying Fortran code has an unstated
restriction (as was found for complex ‘svd()’).
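
As a rough sketch of when a matrix falls into this category, the element count can be computed in double arithmetic. The dimensions below are arbitrary illustrative values, not taken from the text above.

```r
# Illustrative dimensions: each is far below 2^31.
n_row <- 50000L
n_col <- 50000L

# Multiply as doubles: 50000L * 50000L would overflow integer
# arithmetic and give NA with a warning.
total <- as.double(n_row) * as.double(n_col)

# TRUE means the matrix would have to be stored as a long vector.
is_long <- total > .Machine$integer.max

# Approximate storage for a double matrix of this size, in GiB
# (8 bytes per element), roughly 18.6 GiB.
size_gib <- total * 8 / 2^30
```

The size estimate is a reminder that such matrices are expensive before any algebra is attempted on them.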
‘dist()’ can produce dissimilarity objects for more than 65536 rows
(but for example ‘hclust()’ cannot process such objects).
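
The figure of 65536 rows follows from the size of the stored object: a dissimilarity object for n rows holds the n(n - 1)/2 entries of the lower triangle. A small check of that arithmetic, using a hypothetical helper ‘dist_length()’ that is not part of R:

```r
# Hypothetical helper: number of stored dissimilarities for n rows.
# Called with a double argument so the product cannot overflow.
dist_length <- function(n) n * (n - 1) / 2

fits  <- dist_length(65536)   # 2147450880, still within 2^31 - 1
spill <- dist_length(65537)   # 2147516416, needs a long vector

fits_ok    <- fits  <= .Machine$integer.max
spill_long <- spill >  .Machine$integer.max
```

So 65536 is the largest number of rows whose dissimilarities fit in an ordinary vector.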
‘serialize()’ to a raw vector is no longer limited in size (except by
resources) on 64-bit platforms.
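
For reference, serializing to a raw vector is done by passing ‘connection = NULL’. The sketch below uses a small object so it runs anywhere; the change described above only matters when the result exceeds 2^31 - 1 bytes.

```r
# Serialize a small object to a raw vector instead of a connection.
obj <- list(a = 1:10, b = letters)
bytes <- serialize(obj, connection = NULL)

# The result is an ordinary raw vector that can be turned back
# into the original object.
restored <- unserialize(bytes)
round_trip_ok <- identical(restored, obj)
```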
The C-level function ‘R_alloc’ can now allocate 2^35 or more bytes on
64-bit platforms.
‘agrep()’ and ‘grep()’ will return double vectors of indices for long
vector inputs.
Many calls to ‘.C()’ have been replaced by ‘.Call()’ to allow long
vectors to be supported (now or in the future). Regrettably several
packages had copied the non-API ‘.C()’ calls and so failed.
‘.C()’ and ‘.Fortran()’ do not accept long vector inputs. This is a
precaution as it is very unlikely that existing code will have been
written to handle long vectors (and the R wrappers often assume that
‘length(x)’ is an integer).
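
The parenthetical about R wrappers can be illustrated with a short sketch: a length of 2^31 or more has no integer representation, so a wrapper that coerces the length to integer before passing it to compiled code would pass NA. The helper ‘fits_in_int()’ is a hypothetical name used only for this example.

```r
# A length just past the integer limit, as a long vector would report it.
n_long <- 2^31

# Coercion to integer fails: the result is NA (with a warning,
# suppressed here to keep the output quiet).
n_as_int <- suppressWarnings(as.integer(n_long))

# Hypothetical guard a wrapper could apply before calling compiled
# code that stores lengths in a C int.
fits_in_int <- function(x) length(x) <= .Machine$integer.max
```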
Most of the methods for ‘sort()’ work for long vectors.
‘rank()’, ‘sort.list()’ and ‘order()’ support long vectors (slowly except for radix sorting).
‘sample()’ can do uniform sampling from a long vector.
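
A memory-light way to see this is to draw positions with ‘sample.int()’ from a range larger than 2^31 - 1, which does not require allocating a long vector. This is only a sketch of the index side of the feature, assuming R 3.0.0 or later on a 64-bit platform; the range 2^40 is an arbitrary choice.

```r
set.seed(1)

# Draw 5 distinct positions, without replacement, from 1..2^40.
n <- 2^40
idx <- sample.int(n, 5)

# The positions come back as doubles because they can exceed
# the integer range.
idx_type <- typeof(idx)
```

The same positions could then be used as double-valued indices into a long vector of that length.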
‘setRefClass()’ and ‘getRefClass()’ now return class generator functions, similar to ‘setClass()’, but still with the reference fields and methods as before (suggestion of Romain Francois).
‘browseEnv(html = FALSE)’ would segfault if called from R (not ‘R.app’) on a CRAN-style Mac OS X build of R.
