Todo List for R for Duncan Temple Lang

The following is an unordered list of items that I intend to work on that relate to R. (There are others, but I am not admitting to them.) The time frame for these is completely uncertain. I am very keen to hear people's suggestions about which should be done first, etc.
  • Structuring the Event Loop.
    We want to be able to host other systems such as Tcl/Tk, Gtk applications, etc. and allow them to integrate their event sources into our event loop. Similarly, we need to be able to do the reverse when R is embedded in other applications.

    The immediate things to do are

    • Replace the current PolledEvents setup with a linked-list approach managed centrally in the same way that addInputHandler() works.
    • Provide routines in the Tcl/Tk library to implement the Tcl Notifier mechanism so that Tcl event sources are handled by the R event loop. Also, we can do a similar thing for Gtk facilities.
    • Provide implementations of R's addInputHandler(), addPolledEventHandler(), etc. that pass the information to Tcl/Tk, Gtk, etc. when R is embedded in those applications.
    • Add new event sources
      • setReader() for connections,
      • timer events,
      • signals,
      • potentially, assignments, idle tasks, etc.
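
    A minimal sketch of the linked-list idea from the first item above, written in Python only to show the shape of the data structure. The names HandlerNode, PolledEventRegistry, add, remove and poll are hypothetical and do not come from the R sources; a real version would be written in C inside the R event loop.

```python
# Hypothetical sketch: a centrally managed linked list of polled event handlers.
# None of these names are taken from R itself.


class HandlerNode:
    """One entry in the singly linked list of handlers."""

    def __init__(self, callback):
        self.callback = callback
        self.next = None


class PolledEventRegistry:
    """Central owner of the handler list."""

    def __init__(self):
        self.head = None

    def add(self, callback):
        """Append a handler; return its node so it can be removed later."""
        node = HandlerNode(callback)
        if self.head is None:
            self.head = node
        else:
            tail = self.head
            while tail.next is not None:
                tail = tail.next
            tail.next = node
        return node

    def remove(self, node):
        """Unlink a handler; return True if it was registered."""
        prev, cur = None, self.head
        while cur is not None:
            if cur is node:
                if prev is None:
                    self.head = cur.next
                else:
                    prev.next = cur.next
                return True
            prev, cur = cur, cur.next
        return False

    def poll(self):
        """Run every registered handler once, in registration order."""
        count = 0
        cur = self.head
        while cur is not None:
            following = cur.next  # read first so a handler may remove itself
            cur.callback()
            count += 1
            cur = following
        return count


log = []
registry = PolledEventRegistry()
tcl = registry.add(lambda: log.append("tcl"))
gtk = registry.add(lambda: log.append("gtk"))
registry.poll()        # both handlers run
registry.remove(tcl)   # the Tcl source is withdrawn
registry.poll()        # only the Gtk handler runs
```

    Returning the node from add() gives each hosted system a handle it can use to withdraw its own event source without disturbing the others.
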
  • Extending the "database" concept to provide support for S4-style user defined elements of the search path.
    This is needed generally by many of the inter-system interface packages. These include the Perl, Java, Python, JavaScript interfaces, and also packages such as CORBA, RPgSQL, RMySQL, etc. which can richly exploit the concept of a proxy object or a foreign reference. Much of this has been done in the Omegahat language and the setup there can be implemented here.
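
    To make the idea concrete, here is a rough sketch of what a user-defined element of the search path might look like. It is written in Python for illustration; the class ForeignDatabase, its method names and the lookup() helper are assumptions and are not the Omegahat or R interface.

```python
# Hypothetical sketch: a search path whose elements answer a small protocol
# (exists/get/assign/remove/objects) instead of being fixed internal tables.


class ForeignDatabase:
    """A search-path element backed by some other system.

    The dict passed as `backing` stands in for a foreign store such as a
    Perl interpreter or a SQL connection.
    """

    def __init__(self, name, backing):
        self.name = name
        self._backing = backing

    def exists(self, symbol):
        return symbol in self._backing

    def get(self, symbol):
        return self._backing[symbol]

    def assign(self, symbol, value):
        self._backing[symbol] = value

    def remove(self, symbol):
        del self._backing[symbol]

    def objects(self):
        return sorted(self._backing)


def lookup(search_path, symbol):
    """Return the value from the first database that defines `symbol`."""
    for database in search_path:
        if database.exists(symbol):
            return database.get(symbol)
    raise NameError("object '%s' not found" % symbol)


global_env = ForeignDatabase(".GlobalEnv", {"x": 1})
perl_db = ForeignDatabase("perl", {"x": 99, "handle": "<perl reference>"})
path = [global_env, perl_db]

first_x = lookup(path, "x")          # the earlier database masks the later one
proxy = lookup(path, "handle")       # found only in the foreign database
perl_db.assign("y", 2)               # writes go to the foreign store
```
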
  • C-level API for embedding R.
    This includes exceptions, which Robert is implementing for R.
  • Integrating Tom Vogel's mechanism for using a Tk widget as an X11 device canvas.
    This should be relatively easy given the changes motivated by the SNetscape package for using an arbitrary X Window structure.
  • Packages
    I am continuing to develop several packages.
    RGnumeric
      A plugin for Gnumeric that allows R functions to be called from Gnumeric and for these functions to access Gnumeric workbooks, sheets and cells.
    SNetscape
      Fixing the event queue problem associated with the embedded graphics devices is urgent here. Also, using S foreign references needs to be added.
    SXalan (and using libxslt).
    REmbeddedPostgres
      Using R functions within SQL queries that are executed in the server via an embedded R.
    RSHelp
      A documentation mechanism using XML.
    SXMLObjects
      Serializing S (R and S-Plus) objects in XML format so that they can be read and written by other systems (e.g. each other, Matlab, SAS, etc.).
    Rggobi
      An interface for embedding ggobi in R.
    Work on others such as RSPerl, SJava, Python, XML.
  • R as a shared library & shared libraries.
    It is useful for both technical and non-technical reasons to always use the R engine as a shared library.
    • It simplifies providing different front ends (GUIs, command lines, embedding) and makes them uniform;
    • it simplifies some dynamic loading issues;
    • licensing uncertainties may become clearer.
    We can pursue this further, making more than just the Rmath library available as separately usable components of the R engine. For example, we might be able to allow certain parts of the graphics engine and devices to be used in other applications, separately from the rest of R.
  • Integrating an XSLT translator into R
    As opposed to the SXalan package which embeds R in an XSL translator, this would allow us to "out-source" sub-computations for XML operations, such as when processing help files that contain style rather than top-level structural information.
  • Interactive graphics.
    It may be interesting to allow user events on graphics devices to be programmable at the user level, and then to take this to a level of abstraction that makes it platform neutral.
  • An interface generator like SWIG
    Using Lcc, we can parse C code and extract the structure and routine definitions. Using this information, we can automatically generate S and C code that interfaces to these and provide quick and easy access to "arbitrary" C code.
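
    As a toy illustration of the generation step only, the sketch below takes a single, very simple C prototype and emits the text of an S wrapper function. It uses a regular expression in Python rather than Lcc, handles only int and double parameters, and assumes a generated C stub named with an R_ prefix that is reachable through .Call(); the helper names and the R_ naming convention are hypothetical.

```python
import re

# Matches prototypes of the form: <type> <name>(<type> <arg>, ...);
# This is a stand-in for a real C parser and handles only one-word types.
PROTOTYPE = re.compile(r"^\s*(\w+)\s+(\w+)\s*\(([^)]*)\)\s*;\s*$")

# Assumed mapping from C types to S coercion functions.
# Unsupported types raise KeyError in generate_s_wrapper().
COERCE = {"double": "as.numeric", "int": "as.integer"}


def parse_prototype(line):
    """Extract the routine name, return type and parameters from a prototype."""
    match = PROTOTYPE.match(line)
    if match is None:
        raise ValueError("unsupported declaration: %r" % line)
    return_type, name, params_text = match.groups()
    params = []
    for param in params_text.split(","):
        param = param.strip()
        if param and param != "void":
            ctype, pname = param.rsplit(None, 1)
            params.append((ctype, pname))
    return {"name": name, "returns": return_type, "params": params}


def generate_s_wrapper(decl):
    """Emit S source text that coerces the arguments and calls the C stub."""
    formals = ", ".join(pname for _, pname in decl["params"])
    actuals = "".join(
        ", %s(%s)" % (COERCE[ctype], pname) for ctype, pname in decl["params"]
    )
    return '%s <- function(%s) .Call("R_%s"%s)' % (
        decl["name"], formals, decl["name"], actuals)


decl = parse_prototype("double scale(double x, int times);")
wrapper = generate_s_wrapper(decl)
print(wrapper)
```

    A real generator would also have to emit the matching C stub and handle pointers and structures, which is where a proper parser such as Lcc becomes necessary.
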
  • Internal and User-level threads in the S language.
    Luke and I are working on getting concurrency and potentially parallelism in R so that one can (at least appear to) be executing different commands simultaneously. Ideally we will be able to exploit multiple processors in a machine and run certain computations in parallel.

    To do this, we need to

    • provide support for multiple interpreters, which involves removing all the global variables. This is underway.
    • make eval() non-recursive as far as possible. Luke is investigating the extent to which this can be done. It has the potential to let us schedule different interpreters ourselves without the need for operating system threads.
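
    The connection between a non-recursive evaluator and self-managed scheduling can be sketched as follows. This is a Python toy for arithmetic expression trees, not R's evaluator; the Interpreter class, its work and values stacks, and run_round_robin() are all hypothetical names. Because each interpreter keeps its own explicit stack rather than using the C call stack, it can be paused after any step and another interpreter resumed.

```python
import operator

OPS = {"+": operator.add, "*": operator.mul}


class Interpreter:
    """Evaluates nested tuples like ("+", 1, ("*", 2, 3)) one step at a time."""

    def __init__(self, expr):
        self.work = [("eval", expr)]  # explicit stack of pending work
        self.values = []              # stack of intermediate results

    @property
    def done(self):
        return not self.work

    @property
    def result(self):
        return self.values[-1]

    def step(self):
        """Perform one unit of work, then return control to the caller."""
        kind, payload = self.work.pop()
        if kind == "eval":
            if isinstance(payload, tuple):
                op, left, right = payload
                # Pushed in reverse: left is evaluated first, then right,
                # and finally the operator is applied.
                self.work.append(("apply", op))
                self.work.append(("eval", right))
                self.work.append(("eval", left))
            else:
                self.values.append(payload)
        else:  # kind == "apply"
            right = self.values.pop()
            left = self.values.pop()
            self.values.append(OPS[payload](left, right))


def run_round_robin(interpreters):
    """Interleave the interpreters one step at a time; return total steps."""
    steps = 0
    pending = [interp for interp in interpreters if not interp.done]
    while pending:
        for interp in pending:
            interp.step()
            steps += 1
        pending = [interp for interp in pending if not interp.done]
    return steps


first = Interpreter(("+", 1, ("*", 2, 3)))
second = Interpreter(("*", ("+", 1, 2), 4))
total_steps = run_round_robin([first, second])
```
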

    Hopefully we will also be able to provide an interface which supports both process-level threads and distributed computing across machines.

  • Safe Interpreter
    When R is embedded in applications such as a database server or Netscape, security issues arise because one can evaluate arbitrary S code, including system calls, and also load and invoke arbitrary C code. We need to provide a minimal interpreter that prohibits execution of these and potentially other functions. This needs to be configurable for different installation sites, user groups, etc.
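
    One possible shape for the configurable part is a per-site list of prohibited function names that the interpreter consults before every call. The Python sketch below is only an illustration under that assumption; SafeInterpreter, ProhibitedCall and the stand-in functions are hypothetical and nothing here reflects an existing R facility.

```python
class ProhibitedCall(Exception):
    """Raised when the site policy forbids calling a function."""


class SafeInterpreter:
    """Dispatches calls by name, refusing anything the site has prohibited."""

    def __init__(self, functions, prohibited):
        self._functions = dict(functions)
        self._prohibited = frozenset(prohibited)

    def call(self, name, *args):
        if name in self._prohibited:
            raise ProhibitedCall("'%s' is not permitted at this site" % name)
        return self._functions[name](*args)


# Stand-ins for real functions; they only report what they would have done.
functions = {
    "sum": lambda *xs: sum(xs),
    "system": lambda command: "ran " + command,
    "dyn.load": lambda path: "loaded " + path,
}

# Two hypothetical site configurations sharing one function table.
database_site = SafeInterpreter(functions, prohibited={"system", "dyn.load"})
desktop_site = SafeInterpreter(functions, prohibited=set())

total = database_site.call("sum", 1, 2, 3)
try:
    database_site.call("system", "ls")
    blocked = False
except ProhibitedCall:
    blocked = True
```

    A deny list like this is easy to configure but only as safe as its coverage; a site that wants stronger guarantees might prefer to list the permitted functions instead.
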

Duncan Temple Lang <duncan@research.bell-labs.com>
Last modified: Sat Apr 28 09:49:28 EDT 2001