Translating R Messages, R >= 3.0.0

R supports the translation of its messages, including error and warning messages and menu labels. This document is intended for translation teams but starts with a user's view of the process extracted from the R Installation and Administration Manual.

These instructions apply to R 3.x.

The MacOS X GUI also has translatable messages: see the separate section at the end of this document.

Message Domains

Messages are divided into domains, and translations may be available for some or all messages in a domain. R makes use of the following domains.

Dividing up the messages in this way allows R to be extensible: as packages are loaded, their message translation catalogues can be loaded too.

Domains R and RGui are part of the base package, as well as R-base for the interpreted code in that package.

Translations are looked for by domain according to the currently specified language, as specifically as possible, so for example an Austrian (de_AT) translation catalogue will be used in preference to a generic German one (de) for an Austrian user. However, if a specific translation catalogue exists but does not contain a translation, the less specific catalogues are consulted. For example, R has catalogues for en_GB that translate the Americanisms (e.g. gray) in the standard messages into English.

Translations in the right language but the wrong charset can generally be made use of by on-the-fly re-encoding. The LANGUAGE variable can be a colon-separated list, for example se:de, giving a set of languages in decreasing order of preference.
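
The lookup order described above can be sketched in a few lines of shell. This is only an illustration, not R's or gettext's actual code: the function name lookup_order is hypothetical, and charset suffixes and modifiers such as @quot are ignored.

```shell
# Print the order in which catalogues would be tried for a LANGUAGE value.
# Simplified sketch: charset suffixes (.UTF-8) and modifiers (@quot) are ignored.
lookup_order() {
    old_ifs=$IFS
    IFS=:
    for lang in $1; do
        echo "$lang"
        case $lang in
            *_*) echo "${lang%%_*}" ;;   # e.g. de_AT falls back to de
        esac
    done
    IFS=$old_ifs
    echo "(untranslated English)"
}

lookup_order "de_AT:fr"
```

For the value de_AT:fr this lists de_AT, then de, then fr, and finally the untranslated English message.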

If no suitable translation catalogue is found or a particular message is not translated in the selected catalogue, English is used. The translated catalogues are stored as binary files with extension .mo in the translations package. That package contains one directory for each language. Each language directory has a single subdirectory LC_MESSAGES, and within that a file for each domain. So the Japanese translations are installed in the directory

    library/translations/ja/LC_MESSAGES
which currently contains a .mo file for each of 22 domains.
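
To see which domains have a compiled catalogue for a given language, the directory layout just described can be walked with a small shell function. This is a sketch: the function name list_domains is hypothetical, and the usage line assumes that R_HOME points at an R installation.

```shell
# List the domains that have a compiled catalogue for one language.
# $1: the translations package directory, $2: the language code.
list_domains() {
    for mo in "$1/$2/LC_MESSAGES"/*.mo; do
        [ -e "$mo" ] || continue      # nothing installed for this language
        basename "$mo" .mo
    done
}

# Hypothetical usage, assuming R_HOME is set:
# list_domains "$R_HOME/library/translations" ja
```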

(The 'language' en@quot is English with Unicode bidirectional quotation marks for use in a UTF-8 locale.)

The code migration in R 3.0.0 means that packages graphics and utils now have C code (and messages which need translation). You may find that some of the messages were previously translated and can be found at the bottom of the R and R-base catalogues: see the use of compendia below for how to get these copied across.

Translation prerequisites

You need GNU gettext installed: specifically gettext-tools if your system differentiates it from gettext-runtime. (Linux users will not need the development RPM.) The gettext manual is the best reference source. Command-line 32-bit Windows versions of the tools are available from www.stats.ox.ac.uk/pub/Rtools/goodies. Emacs users can use PO mode to help with managing translations: this is described in the gettext manual.

There are some Linux-alike tools developed for the KDE project, notably KBabel.

Windows users can find pre-compiled versions of gettext at any mirror of the GNU archive. The poEdit editor is recommended by the Fedora translators, and comes with the gettext tools needed.

Preparing and installing a translation

Suppose you wanted to prepare a Slovenian translation for the splines package. The ISO 639 code for Slovenian is sl. Go to the src/library/splines/po directory and run (preferably in a Slovenian locale)
    msginit -i R-splines.pot -o R-sl.po
    msginit -i splines.pot -o sl.po
If this does not work for you, just copy the files to the names given and fill in the header.

Now check over the header entries and fill in the msgstr entries in these files. The originals will look like

  # Slovenian translations for R package
  # Slovenski prevodi paketa R.
  # Copyright (C) 2005 THE R'S COPYRIGHT HOLDER
  # This file is distributed under the same license as the R package.
  # Prof Brian Ripley <ripley@stats.ox.ac.uk>, 2005.
  #
  msgid ""
  msgstr ""
  "Project-Id-Version: R 2.1.0\n"
  "Report-Msgid-Bugs-To: bugs@r-project.org\n"
  "POT-Creation-Date: 2005-01-25 17:26+0000\n"
  "PO-Revision-Date: 2005-02-04 08:37+0000\n"
  "Last-Translator: Prof Brian Ripley <ripley@stats.ox.ac.uk>\n"
  "Language-Team: Slovenian\n"
  "MIME-Version: 1.0\n"
  "Content-Type: text/plain; charset=ISO-8859-2\n"
  "Content-Transfer-Encoding: 8bit\n"
  "Plural-Forms: nplurals=4; plural=(n%100==1 ? 0 : n%100==2 ? 1 : n%100==3 || n"
  "%100==4 ? 2 : 3);\n"

  #: splines.c:154
  msgid "'ord' must be a positive integer"
  msgstr ""
If you leave a translation as "" the untranslated message (msgid) will be used.

Some messages have plural forms, e.g.

  msgid        "Warning message:\n"
  msgid_plural "Warning messages:\n"
  msgstr[0]    ""
  msgstr[1]    ""
The Plural-Forms: line tells you what these mean: in languages with just singular and one plural, the first is singular and the second is plural. For languages without plurals, just give one line starting msgstr[0]. Slovenian would need four lines.
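
To see which msgstr[] entry a given count selects, the plural= expression can be evaluated directly, since shell arithmetic uses the same C-like syntax. The sketch below copies the expression from the Slovenian header shown above; the function name plural_index is hypothetical.

```shell
# Evaluate the plural= expression from the header above for a count n.
plural_index() {
    n=$1
    echo $(( $n%100==1 ? 0 : $n%100==2 ? 1 : $n%100==3 || $n%100==4 ? 2 : 3 ))
}

for count in 1 2 3 4 5 101; do
    echo "n=$count uses msgstr[$(plural_index "$count")]"
done
```

For the counts 1, 2, 3, 4, 5 and 101 this selects forms 0, 1, 2, 2, 3 and 0 respectively, which is why four msgstr[] lines are needed.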

Then compile and install the translated catalogues by

    mkdir -p ${R_HOME}/library/translations/sl/LC_MESSAGES
    msgfmt -c --statistics -o ${R_HOME}/library/translations/sl/LC_MESSAGES/R-splines.mo R-sl.po
    msgfmt -c --statistics -o ${R_HOME}/library/translations/sl/LC_MESSAGES/splines.mo sl.po
and when you next install R the translations will be ready for use. Using -c enables a number of consistency checks that have proven useful. Using --statistics gives some details of the coverage of the translations.

The process is the same for any other package, including base (which contains 3 rather than 2 domains).

Translators can choose any suitable encoding, but for RGui-ll.po it is best to use the native encoding for the language on Windows. Otherwise, if possible choose an encoding that the development team will be able to run your language in as a Linux locale: UTF-8 is their first choice.

Translators may find it useful to consult the ISI glossary of statistical terms.

Compendia

Sometimes you will encounter a translation you know you have done for a different domain (which sometimes happens as code is migrated). Then the compendium facility of msgmerge can be very useful. First delete any fuzzy translation which has been generated, then use e.g.
    msgmerge --update ll.po stats.pot -C other.po
where the final .po file is used to pick up existing translations (even those not in use there). More than one compendium file can be supplied.
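
When several compendia are involved, it can help to assemble the command and inspect it before running it. The following sketch only prints the command, with one -C option per compendium; the function name merge_cmd and the file names base.po and graphics.po are hypothetical.

```shell
# Build the msgmerge command for one catalogue and any number of compendia.
# The command is printed rather than run, so it can be checked first.
merge_cmd() {
    po=$1
    pot=$2
    shift 2
    cmd="msgmerge --update $po $pot"
    for compendium in "$@"; do
        cmd="$cmd -C $compendium"
    done
    echo "$cmd"
}

merge_cmd ll.po stats.pot base.po graphics.po
```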

C-like format strings

Some messages contain C-like format strings such as "%s" for use either by gettextf in R or by the C-level error functions.

It is important that these match exactly in the msgid and msgstr lines: mismatches can cause R to crash or nonsense to be output.

The function checkPoFiles in package tools implements a check: please make use of it before finalizing your translations. (It is run as part of the installation procedure for translations: see the next sub-section.)
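
As a rough illustration of what 'matching exactly' means, the sketch below extracts the printf-style conversions from a msgid and a msgstr and compares them in order. It is not a replacement for checkPoFiles: the function names are hypothetical, the pattern is deliberately simple, and grep -o is assumed to be available (it is in GNU and BSD grep).

```shell
# Extract printf-style conversions such as %s, %d or %1$s, one per line.
format_specs() {
    printf '%s\n' "$1" | grep -o '%[0-9.$-]*[a-zA-Z]'
}

# Succeed only if msgid ($1) and msgstr ($2) use the same conversions
# in the same order.
same_formats() {
    [ "$(format_specs "$1")" = "$(format_specs "$2")" ]
}

same_formats "'%s' must be of length %d" "'%s' doit avoir une longueur de %d" \
    && echo "formats match"
same_formats "'%s' must be of length %d" "'%s' doit avoir la bonne longueur" \
    || echo "formats differ"
```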

Installing all translations

If you are working within a copy of the R sources you can do
    cd po
    make update-pkg-po update-RGui
(On Windows, use make -f Makefile.win: it skips the en@quot translations.) This will update the translations for all languages, check them with tools::checkPoFiles, compile all those which pass the checks and install the translations in the translations package. The latter can then be copied to an R installation to test the new translations.

This can be done on a per-package basis via tools::update_pkg_po. (This does work on Windows but skips the en@quot translations when not in a UTF-8 locale.)

Submitting a translation

To submit a translation for inclusion in the R sources please make a tar with all the source files you added, perhaps by (untested)
    tar zcvf sl-po.tar.gz src/library/*/po/*sl.po
and send it to the R core team. It does not help to submit the .mo files!

Sending single files via email has caused a lot of corruption in the past. To be safe, send a binary attachment as .tar.gz or .zip file.
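
Before sending an archive it may be worth confirming that it contains only .po source files and no compiled .mo files. A minimal sketch, with the hypothetical function name only_po_files:

```shell
# Succeed only if every file listed in the archive is a .po file.
# Directory entries (names ending in /) are ignored.
only_po_files() {
    if tar tzf "$1" | grep -v '/$' | grep -qv '\.po$'; then
        return 1
    fi
    return 0
}

# Hypothetical usage:
# only_po_files sl-po.tar.gz && echo "archive looks fine"
```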

Translations of messages in recommended packages should be sent to their maintainers.


The MacOS X GUI

The Mac OS X Cocoa GUI currently uses Apple's Cocoa localization framework, which differs from the gettext approach used by R but is still very similar from a translator's point of view. In the following we assume that the translation is done on a machine running Mac OS X with Apple Developer Tools installed and the Mac-GUI project sources present.

Practically any resource, file or text used in the Mac-GUI can be localized. The first elements to start with are text messages, followed by GUI elements and finally images and supplemental documentation. All those items are located in the "Resources" group of the Mac-GUI project.

Adding support for a new language

Adding a new translation of a resource is done as follows:

There are three basic sets of resources that need to be translated:

Text messages are located in the Localizable.strings resource which is a text file of the form "english"="translation";, for example (from the German translation):

  "Cancel"="Abbrechen";
  "Choose File"="Wählen Sie eine Datei";
The corresponding file is created when a new localization is added. Please use UTF-8 encoding when editing those files externally (in Xcode select Format -> File Encoding -> Unicode (UTF-8)).

GUI elements are located inside NIB files (NeXT Interface Builder files). The first step in adding a localized version of a NIB file is to follow the "Adding a new translation" step above, which will produce a copy of the English original that will have to be translated. One way to translate such a file would be to edit it directly in the Interface Builder, but this is rather tedious and would require some knowledge of Cocoa, so we use a more generic approach. It is possible to generate the same "strings" files as used for text messages and use those to translate GUI elements. To generate such files, run

    update.localization
in the Mac-GUI directory. This script will create a new directory Translated.strings which contains strings files for each NIB file and language, for example: MainMenu.de.strings. This file has the same format as the text messages strings file and should be translated the same way. Before editing the file it is crucial to set the file encoding to UTF-8! Once all necessary strings files are translated, they can be used to translate the NIB file by running the script as follows:
    update.localization -t
Don't forget the -t switch, otherwise your files in Translated.strings will be overwritten! The existing NIB files are updated by this script to reflect any changes in the strings files. Optionally the NIB files can be manually tweaked if necessary, such as if the labels of some GUI elements are longer than the element itself. Such minor changes will be preserved by the update.localization script later.

Other resources are localized by simply editing the copy that was made when using "Add Localization...". There are no automated ways of modifying images or other resource files. Please note that any resources that are no different from the English version don't need to be copied using "Add Localization...". The GUI will automatically use the English version if there is no localized version of the resource.
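
The "english"="translation"; form used by Localizable.strings and by the generated strings files is easy to break when editing by hand, for example by dropping the final semicolon. The sketch below lists lines that do not fit that form. It is a rough check only: the function name check_strings is hypothetical, and comments or escaped quotes inside a string would be reported as well.

```shell
# Print, with line numbers, every non-empty line that is not of the form
#   "english"="translation";
# Spaces around the = sign are allowed. Nothing is printed if all lines fit.
check_strings() {
    grep -n -v -e '^"[^"]*" *= *"[^"]*";$' -e '^$' "$1"
}

# Hypothetical usage:
# check_strings Translated.strings/MainMenu.de.strings
```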

Submitting a Mac-GUI translation

There are basically two ways to perform and submit Mac-GUI translations:

Brian Ripley
2012-07-14