There are new ‘configure’ options ‘--with-internal-iswxxxx’ and ‘--with-internal-towlower’ which allow the system wide-character classification and case-switching routines to be replaced by internal ones. The internal classification routines have long been used on macOS, AIX (and Windows), but the first option allows them to be deselected there and selected on other platforms. The second option is new in this version of R and is selected by default on macOS.
System versions of these functions are often minimally implemented (sometimes only for ASCII characters) and do not cover the full range of Unicode code points: for example, those on Solaris (and Windows) cover only the Basic Multilingual Plane.
The character-classification functions used (by default) to replace the system ‘iswxxxx’ functions on Windows, macOS and AIX have been updated to Unicode 13.0.0; in particular, many more UTF-8 characters are regarded as printable.
There is a build-time option to replace the system's wide-character ‘wctrans’ C function by tables shipped with R: use ‘configure’ option ‘--with-internal-towlower’ or (on Windows) ‘-DUSE_RI18N_CASE’ in ‘CFLAGS’ when building R. On Windows and Solaris this allows ‘tolower()’ and ‘toupper()’ to work with Unicode characters beyond the Basic Multilingual Plane and is intended to become the default there.
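As an illustration of what case conversion beyond the Basic Multilingual Plane involves, here is a minimal sketch in Python rather than R. Python is used only because its built-in Unicode tables behave the same on every platform; it says nothing about how R's internal tables are implemented. The helper ‘is_bmp’ is a hypothetical name introduced for this example.

```python
# Deseret capital letter long I (U+10400) lies outside the BMP;
# its lowercase counterpart is U+10428.
upper = "\U00010400"
lower = upper.lower()


def is_bmp(ch: str) -> bool:
    """Return True if the single character ch is in the Basic Multilingual Plane."""
    return ord(ch) <= 0xFFFF


# Report the mapping by code point so nothing depends on the console font.
print(f"U+{ord(upper):04X} -> U+{ord(lower):04X}, in BMP: {is_bmp(upper)}")
```

A case-conversion routine limited to the BMP would leave such a character unchanged, which is the behaviour the internal tables are meant to avoid.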
The parser now treats ‘\Unnnnnnnn’ escapes larger than the upper limit for Unicode code points (‘\U10FFFF’) as an error, since they cannot be represented in valid UTF-8.
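The upper limit can be checked with a short Python sketch (again Python, not R, chosen for a platform-independent illustration). The constant ‘MAX_CODE_POINT’ and the function ‘within_unicode_range’ are names assumed for this example only.

```python
MAX_CODE_POINT = 0x10FFFF  # highest code point defined by Unicode


def within_unicode_range(cp: int) -> bool:
    """Return True if cp does not exceed the Unicode upper limit."""
    return 0 <= cp <= MAX_CODE_POINT


# The largest code point still has a (four-byte) UTF-8 encoding.
largest_encoded = chr(MAX_CODE_POINT).encode("utf-8")

# One past the limit cannot even be constructed as a character.
try:
    chr(MAX_CODE_POINT + 1)
    beyond_limit_ok = True
except ValueError:
    beyond_limit_ok = False

print(largest_encoded.hex(), beyond_limit_ok)
```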
Where such escapes are used for outputting non-printable (including unassigned) characters, 6 hex digits are used (rather than 8 with leading zeros). For clarity, braces are used, for example ‘\U{0effff}’.
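The output format described above can be mimicked as follows. This is only a sketch of the 6-digit, braced form in Python; ‘format_escape’ is a hypothetical helper, not a function from R, and it does not reproduce R's rules for deciding when a character is escaped.

```python
def format_escape(cp: int) -> str:
    """Format a code point as a braced escape with 6 lowercase hex digits."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("code point outside the Unicode range")
    # %06x pads with leading zeros to six hex digits.
    return "\\U{%06x}" % cp


print(format_escape(0x0EFFFF))
```

With 6 digits every valid code point fits, since the maximum, 0x10FFFF, needs exactly 6.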
There are warnings (including from the parser) on the use of unpaired surrogate Unicode code points such as ‘\uD834’. (These cannot be converted to valid UTF-8.)
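Why an unpaired surrogate is a problem can be seen in a small Python sketch (used here only as a neutral illustration, not as a description of R's checks). The helper ‘is_surrogate’ is a name assumed for this example.

```python
def is_surrogate(cp: int) -> bool:
    """Return True if cp falls in the UTF-16 surrogate range."""
    return 0xD800 <= cp <= 0xDFFF


# A lone high surrogate has no valid UTF-8 encoding.
lone = "\ud834"
try:
    lone.encode("utf-8")
    encodable = True
except UnicodeEncodeError:
    encodable = False

# Paired with a low surrogate in UTF-16, it denotes one supplementary
# character: D834 DD1E is U+1D11E (musical symbol G clef).
paired = b"\xd8\x34\xdd\x1e".decode("utf-16-be")

print(encodable, f"U+{ord(paired):04X}")
```

Surrogates are meaningful only as halves of a UTF-16 pair, which is why a single one cannot be turned into UTF-8.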
The code for evaluating default (extended) regular expressions now uses the same character-classification functions as the rest of R (they previously differed on Windows, macOS and AIX).