This is a guide to extending R, describing the process of creating R add-on packages, writing R documentation, R's system and foreign language interfaces, and the R API.
The current version of this document is 2.8.0 (2008-10-20).
ISBN 3-900051-11-9
The contributions of Saikat DebRoy (who wrote the first draft of a guide
to using .Call and .External) and of Adrian Trapletti (who
provided information on the C++ interface) are gratefully acknowledged.
Packages provide a mechanism for loading optional code and attached documentation as needed. The R distribution provides several packages.
In the following, we assume that you know the ‘library()’ command,
including its ‘lib.loc’ argument, and we also assume basic
knowledge of the INSTALL utility. Otherwise, please look at R's
help pages
?library
?INSTALL
before reading on.
A computing environment including a number of tools is assumed; the “R Installation and Administration” manual describes what is needed. Under a Unix-alike most of the tools are likely to be present by default, but Microsoft Windows and MacOS X will require careful setup.
Once a source package is created, it must be installed by
the command R CMD INSTALL.
See Add-on-packages, for further details.
Other types of extensions are supported: See Package types.
A package consists of a subdirectory containing a file DESCRIPTION and the subdirectories R, data, demo, exec, inst, man, po, src, and tests (some of which can be missing). The package subdirectory may also contain files INDEX, NAMESPACE, configure, cleanup, LICENSE, LICENCE, COPYING and NEWS. Other files such as README or ChangeLog will be ignored by R, but may be useful to end-users.
The DESCRIPTION and INDEX files are described in the sections below. The NAMESPACE file is described in Package name spaces.
The optional files configure and cleanup are (Bourne shell) script files which are executed before and (provided that option --clean was given) after installation on Unix-alikes, see Configure and cleanup.
The optional file LICENSE/LICENCE or COPYING (where the former names are preferred) contains a copy of the license to the package, e.g. a copy of the GNU public license. Whereas you should feel free to include a license file in your source distribution, please do not arrange to install yet another copy of the GNU COPYING or COPYING.LIB files but refer to the copies on http://www.r-project.org/Licenses/ and included in the R distribution (in directory share/licenses).
For the conventions for files NEWS and ChangeLog in the GNU project see http://www.gnu.org/prep/standards/standards.html#Documentation.
The package subdirectory should be given the same name as the package. Because some file systems (e.g., those on Windows) are not case-sensitive, to maintain portability it is strongly recommended that case distinctions not be used to distinguish different packages. For example, if you have a package named foo, do not also create a package named Foo.
To ensure that file names are valid across file systems and supported
operating system platforms, the ASCII control characters as
well as the characters ‘"’, ‘*’, ‘:’, ‘/’,
‘<’, ‘>’, ‘?’, ‘\’, and ‘|’ are not allowed
in file names. In addition, files with names ‘con’, ‘prn’,
‘aux’, ‘clock$’, ‘nul’, ‘com1’ to ‘com9’, and
‘lpt1’ to ‘lpt9’ after conversion to lower case and
stripping possible “extensions” (e.g., ‘lpt5.foo.bar’), are
disallowed. Also, file names in the same directory must not differ
only by case (see the previous paragraph). In addition, the names of
‘.Rd’ files will be used in URLs and so must be ASCII
and not contain %. For maximal portability filenames should
contain only ASCII characters not excluded already
(that is, the characters A-Za-z0-9._!#$%&+,;=@^(){}'[] with space
excluded, as many utilities do not accept spaces in file paths): non-English
alphabetic characters cannot be guaranteed to be supported in all locales.
It would be good practice to avoid the shell metacharacters (){}'[]$.
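The file-name rules above can be checked mechanically. The following is a minimal sketch in R, not a definitive implementation: the function name is_portable_filename is hypothetical (it is not part of R), and it tests a single file name rather than a path.

```r
# Hypothetical helper (not part of R): TRUE if a single file name
# obeys the restrictions described above.
is_portable_filename <- function(name) {
    # Reject ASCII control characters and " * : / < > ? \ |
    if (grepl("[[:cntrl:]\"*:/<>?\\\\|]", name)) return(FALSE)
    # Reserved device names are compared after conversion to lower
    # case and stripping everything from the first dot onwards.
    stem <- sub("[.].*$", "", tolower(name))
    reserved <- c("con", "prn", "aux", "clock$", "nul",
                  paste("com", 1:9, sep = ""),
                  paste("lpt", 1:9, sep = ""))
    !(stem %in% reserved)
}

is_portable_filename("lpt5.foo.bar")  # FALSE: reserved once extensions are stripped
is_portable_filename("utils.R")       # TRUE
```

The sketch does not check for names in one directory that differ only by case; that would need the full directory listing.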
A source package if possible should not contain binary executable files: they are not portable, and a security risk if they are of the appropriate architecture. R CMD check will warn about them unless they are listed (one filepath per line) in a file BinaryFiles at the top level of the package or bundle.
The R function package.skeleton can help to create the
structure for a new package: see its help page for details.
The DESCRIPTION file contains basic information about the package in the following format:
Package: pkgname
Version: 0.5-1
Date: 2004-01-01
Title: My First Collection of Functions
Author: Joe Developer <Joe.Developer@some.domain.net>, with
  contributions from A. User <A.User@whereever.net>.
Maintainer: Joe Developer <Joe.Developer@some.domain.net>
Depends: R (>= 1.8.0), nlme
Suggests: MASS
Description: A short (one paragraph) description of what
  the package does and why it may be useful.
License: GPL (>= 2)
URL: http://www.r-project.org, http://www.another.url
Continuation lines (for example, for descriptions longer than one line) start with a space or tab. The ‘Package’, ‘Version’, ‘License’, ‘Description’, ‘Title’, ‘Author’, and ‘Maintainer’ fields are mandatory, the remaining fields (‘Date’, ‘Depends’, ‘URL’, ...) are optional.
The DESCRIPTION file should be written entirely in ASCII for maximal portability.
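As an illustration of the format, a DESCRIPTION file can be read back with read.dcf and checked for the mandatory fields. This is a minimal sketch only: the temporary file and the variable names are assumptions made for the example, and the field values are taken from the sample file above.

```r
# Write a small DESCRIPTION file to a temporary location.
path <- tempfile("DESCRIPTION")
writeLines(c(
    "Package: pkgname",
    "Version: 0.5-1",
    "Title: My First Collection of Functions",
    "Author: Joe Developer <Joe.Developer@some.domain.net>",
    "Maintainer: Joe Developer <Joe.Developer@some.domain.net>",
    "Description: A short (one paragraph) description of what",
    "  the package does and why it may be useful.",
    "License: GPL (>= 2)"), path)

# read.dcf() returns a character matrix with one column per field;
# continuation lines are folded into the field they continue.
desc <- read.dcf(path)

mandatory <- c("Package", "Version", "License", "Description",
               "Title", "Author", "Maintainer")
missing_fields <- setdiff(mandatory, colnames(desc))
if (length(missing_fields) > 0)
    stop("missing mandatory fields: ",
         paste(missing_fields, collapse = ", "))
```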
The ‘Package’ and ‘Version’ fields give the name and the version of the package, respectively. The name should consist of letters, numbers, and the dot character and start with a letter. The version is a sequence of at least two (and usually three) non-negative integers separated by single ‘.’ or ‘-’ characters. The canonical form is as shown in the example, and a version such as ‘0.01’ or ‘0.01.0’ will be handled as if it were ‘0.1-0’. (Translation packages are allowed names of the form ‘Translation-ll’.)
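The naming rule and the handling of version strings can be tried out in R. In this sketch is_valid_pkg_name is a hypothetical helper (not part of R) which ignores the special ‘Translation-ll’ names; package_version is the standard R function.

```r
# Hypothetical helper (not part of R): the naming rule stated above.
is_valid_pkg_name <- function(name)
    grepl("^[A-Za-z][A-Za-z0-9.]*$", name)

is_valid_pkg_name("pkgname")   # TRUE
is_valid_pkg_name("2fast")     # FALSE: does not start with a letter

# package_version() parses each component as an integer, so leading
# zeros are dropped and '.' and '-' separators are equivalent.
v <- package_version("0.01.0")
as.character(v)                # "0.1.0"
v == package_version("0.1-0")  # TRUE
```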
The ‘License’ field should specify the license of the package in the following standardized form. Alternatives are indicated via vertical bars. Individual specifications must be one of
GPL-2 GPL-3 LGPL-2 LGPL-2.1 LGPL-3 AGPL-3 Artistic-1.0 Artistic-2.0
as made available via http://www.r-project.org/Licenses/ and contained in subdirectory share/licenses of the R source or home directory.
Examples of standardized specifications include
License: GPL-2
License: GPL (>= 2) | BSD
License: LGPL (>= 2.0, < 3) | Mozilla Public License
License: GPL-2 | file LICENCE
Please note in particular that “Public domain” is not a valid license. It is very important that you include this information! Otherwise, it may not even be legally correct for others to distribute copies of the package.
The ‘Description’ field should give a comprehensive description of what the package does. One can use several (complete) sentences, but only one paragraph.
The ‘Title’ field should give a short description of the package. Some package listings may truncate the title to 65 characters in order to keep the overall size of the listing limited. It should be capitalized, not use any markup, not have any continuation lines, and not end in a period. Older versions of R used a separate file TITLE for giving this information; this is now defunct, and the ‘Title’ field in DESCRIPTION is required.
The ‘Author’ field describes who wrote the package. It is a plain text field intended for human readers, but not for automatic processing (such as extracting the email addresses of all listed contributors).
The ‘Maintainer’ field should give a single name with a valid (RFC 2822) email address in angle brackets (for sending bug reports etc.). It should not end in a period or comma.
The optional ‘Date’ field gives the release date of the current version of the package. It is strongly recommended to use the yyyy-mm-dd format conforming to the ISO standard.
The optional ‘Depends’ field gives a comma-separated list of
package names which this package depends on. The package name may be
optionally followed by a comment in parentheses. The comment should
contain a comparison operator (only ‘>=’ and ‘<=’ were
supported prior to R 2.7.0), whitespace and a valid version number.
(List package names even if they are part of a bundle.) You can
also use the special package name ‘R’ if your package depends on a
certain version of R. E.g., if the package works only with R
version 2.4.0 or newer, include ‘R (>= 2.4.0)’ in the
‘Depends’ field. Both library and the R package checking
facilities use this field, hence it is an error to use improper syntax
or misuse the ‘Depends’ field for comments on other software that
might be needed. Other dependencies (external to the R system)
should be listed in the ‘SystemRequirements’ field or a separate
README file. The R INSTALL facilities check if the
version of R used is recent enough for the package being installed,
and the list of packages which is specified will be attached (after
checking version dependencies) before the current package, both when
library is called and when saving an image of the package's code
or preparing for lazy-loading.
As from R 2.7.0 a package (or ‘R’) can appear more than once in the ‘Depends’ field, but only the first occurrence will be used in earlier versions of R. (Unfortunately all occurrences will be checked, so only ‘>=’ and ‘<=’ can be used.)
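To make the syntax of the ‘Depends’ field concrete, the following sketch splits a field value into names, operators and versions. The function parse_depends is hypothetical (R's own tools perform this parsing internally and may differ in detail); only base R functions are used.

```r
# Hypothetical helper (not part of R): split a 'Depends' field into
# package names, comparison operators and version numbers.
parse_depends <- function(field) {
    entries <- strsplit(field, ",", fixed = TRUE)[[1]]
    entries <- gsub("^[[:space:]]+|[[:space:]]+$", "", entries)
    # name, optionally followed by "(operator whitespace version)"
    pat <- "^([[:alnum:].]+)[[:space:]]*(\\(([<>=]+)[[:space:]]+([0-9.-]+)\\))?$"
    if (!all(grepl(pat, entries))) stop("improper 'Depends' syntax")
    data.frame(name = sub(pat, "\\1", entries),
               op = sub(pat, "\\3", entries),
               version = sub(pat, "\\4", entries),
               stringsAsFactors = FALSE)
}

deps <- parse_depends("R (>= 1.8.0), nlme")

# The requirement on R itself can be compared with the running version.
getRversion() >= package_version(deps$version[deps$name == "R"])
```

Entries without a version comment, such as nlme here, get empty strings in the op and version columns.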
The optional ‘Imports’ field lists packages whose name spaces are imported from but which do not need to be attached. Name spaces accessed by the ‘::’ and ‘:::’ operators must be listed here, or in ‘Suggests’ or ‘Enhances’ (see below). Ideally this field will include all the standard packages, and it is important to include S4-using packages (as their class definitions can change and the DESCRIPTION file is used to decide which packages to re-install when this happens).
The optional ‘Suggests’ field uses the same syntax as ‘Depends’ and lists packages that are not necessarily needed. This includes packages used only in examples or vignettes (see Writing package vignettes), and packages loaded in the body of functions. E.g., suppose an example from package foo uses a dataset from package bar. Then it is not necessary to have bar for routine use of foo, unless one wants to execute the examples: it is nice to have bar, but not necessary.
Finally, the optional ‘Enhances’ field lists packages “enhanced” by the package at hand, e.g., by providing methods for classes from these packages.
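The general rules are
Packages whose name space only is needed to load the package using
library(pkgname) must be listed in the ‘Imports’ field.
Packages that need to be attached to successfully load the package using
library(pkgname) must be listed in the ‘Depends’ field.
All packages that are needed to successfully run R CMD check on
the package must be listed in one of ‘Depends’ or ‘Suggests’
or ‘Imports’.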
In particular, large packages providing “only” data for examples or vignettes should be listed in ‘Suggests’ rather than ‘Depends’ in order to make lean installations possible.
The optional ‘URL’ field may give a list of URLs separated by commas or whitespace, for example the homepage of the author or a page where additional material describing the software can be found. These URLs are converted to active hyperlinks on CRAN.
Base and recommended packages (i.e., packages contained in the R source distribution or available from CRAN and recommended to be included in every binary distribution of R) have a ‘Priority’ field with value ‘base’ or ‘recommended’, respectively. These priorities must not be used by “other” packages.
An optional ‘Collate’ field (or OS-specific variants ‘Collate.OStype’, such as ‘Collate.windows’) can be used for controlling the collation order for the R code files in a package when these are concatenated into a single file upon installation from source. The default is to try collating according to the ‘C’ locale. If present, the collate specification must list all R code files in the package (taking possible OS-specific subdirectories into account, see Package subdirectories) as a whitespace separated list of file paths relative to the R subdirectory. Paths containing white space or quotes need to be quoted. An applicable OS-specific collation field (‘Collate.unix’ or ‘Collate.windows’) will be used instead of ‘Collate’.
The optional ‘LazyLoad’ and ‘LazyData’ fields control whether the R objects and the datasets (respectively) use lazy-loading: set the field's value to ‘yes’ or ‘true’ for lazy-loading and ‘no’ or ‘false’ for no lazy-loading. (Capitalized values are also accepted.)
If the package you are writing uses the methods package, specify ‘LazyLoad: yes’.
The optional ‘ZipData’ field controls whether the automatic Windows build will zip up the data directory or not: set this to ‘no’ if your package will not work with a zipped data directory.
If the DESCRIPTION file is not entirely in ASCII it
should contain an ‘Encoding’ field specifying an encoding. This is
currently used as the encoding of the DESCRIPTION file itself and
of the R and NAMESPACE files, and as the default encoding
of .Rd files. The examples are assumed to be in this encoding
when running R CMD check. As from R 2.8.0 it is used for
the encoding of the CITATION file. Only encoding names
latin1, latin2 and UTF-8 are known to be portable.
(Do not specify an encoding unless one is actually needed: doing so
makes the package less portable.)
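Whether an ‘Encoding’ field is needed at all can be decided by testing for non-ASCII bytes. The sketch below assumes a hypothetical helper is_ascii_file (not part of R) and uses temporary files for illustration.

```r
# Hypothetical helper (not part of R): TRUE if every byte of the file
# is in the ASCII range 0-127.
is_ascii_file <- function(path) {
    bytes <- readBin(path, what = "raw", n = file.info(path)$size)
    all(as.integer(bytes) <= 127L)
}

f <- tempfile()
writeLines("Package: pkgname", f)
is_ascii_file(f)   # TRUE: no 'Encoding' field is needed

g <- tempfile()
writeBin(as.raw(c(0x4a, 0xe9)), g)   # "J" followed by a latin1 e-acute
is_ascii_file(g)   # FALSE
```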
The optional ‘OS_type’ field specifies the OS(es) for which the
package is intended. If present, it should be one of unix or
windows, and indicates that the package should only be installed
on a platform with ‘.Platform$OS.type’ having that value.
The optional ‘Type’ field specifies the type of the package: see Package types.
Note: There should be no ‘Built’ or ‘Packaged’ fields, as these are added by the package management tools.
The optional file INDEX contains a line for each sufficiently
interesting object in the package, giving its name and a description
(functions such as print methods not usually called explicitly might not
be included). Normally this file is missing, and the corresponding
information is automatically generated from the documentation sources
(using Rdindex() from package tools) when installing from
source and when using the package builder (see Checking and building packages).
Rather than editing this file, it is preferable to put customized information about the package into an overview man page (see Documenting packages) and/or a vignette (see Writing package vignettes).
The R subdirectory contains R code files, only. The code
files to be installed must start with an ASCII (lower or upper
case) letter or digit and have one of the extensions .R,
.S, .q, .r, or .s. We recommend using
.R, as this extension seems not to be used by any other software.
It should be possible to read in the files using source(), so
R objects must be created by assignments. Note that there need be no
connection between the name of the file and the R objects created by it.
Ideally, the R code files should only directly assign R objects
and definitely should not call functions with side effects such as
require and options. If computations are required to
create objects these can use code `earlier' in the package (see the
‘Collate’ field) plus, only if lazyloading is used,
functions in the ‘Depends’ packages provided that the objects
created do not depend on those packages except via name space
imports. (Packages without namespaces will work under somewhat less
restrictive assumptions.)
Two exceptions are allowed: if the R subdirectory contains a file
sysdata.rda (a saved image of R objects) this will be
lazy-loaded into the name space/package environment – this is intended
for system datasets that are not intended to be user-accessible via
data. Also, files ending in ‘.in’ will be allowed in the
R directory to allow a configure script to generate
suitable files.
Only ASCII characters (and the control characters tab,
formfeed, LF and CR) should be used in code files. Other characters are
accepted in comments, but then the comments may not be readable in
e.g. a UTF-8 locale. Non-ASCII characters in object names
will normally fail when the package is installed. Any byte will be
allowed in a quoted character string (but \uxxxx
escapes should not be used), but non-ASCII character strings
may not be usable in some locales and may display incorrectly in others.
Various R functions in a package can be used to initialize and clean
up. For packages without a name space, these are .First.lib and
.Last.lib. (See Load hooks, for packages with a name space.)
It is conventional to define these functions in a file called
zzz.R. If .First.lib is defined in a package, it is
called with arguments libname and pkgname after the
package is loaded and attached. (If a package is installed with version
information, the package name includes the version information, e.g.
‘ash_1.0.9’.) A common use is to call library.dynam
inside .First.lib to load compiled code: another use is to call
those functions with side effects. If .Last.lib exists in a
package it is called (with argument the full path to the installed
package) just before the package is detached. It is uncommon to detach
packages and rare to have a .Last.lib function: one use is to
call library.dynam.unload to unload compiled code.
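A zzz.R file following these conventions might look like the sketch below. It assumes a hypothetical package without a name space whose compiled code is in a shared object called ‘mypkg’; that name is an assumption made for the example.

```r
## zzz.R for a hypothetical package without a name space whose
## compiled code is in a shared object called 'mypkg'.
.First.lib <- function(libname, pkgname) {
    # Load the shared object from the installed package.
    library.dynam("mypkg", pkgname, libname)
}

.Last.lib <- function(libpath) {
    # Unload it again just before the package is detached.
    library.dynam.unload("mypkg", libpath)
}
```

Passing pkgname and libname through unchanged means the hook keeps working when the package is installed in a non-default library.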
The man subdirectory should contain (only) documentation files for the objects in the package in R documentation (Rd) format. The documentation filenames must start with an ASCII (lower or upper case) letter or digit and have the extension .Rd (the default) or .rd. Further, the names must be valid in ‘file://’ URLs, which means they must be entirely ASCII and not contain ‘%’. See Writing R documentation files, for more information. Note that all user-level objects in a package should be documented; if a package pkg contains user-level objects which are for “internal” use only, it should provide a file pkg-internal.Rd which documents all such objects, and clearly states that these are not meant to be called by the user. See e.g. the sources for package grid in the R distribution for an example. Note that packages which use internal objects extensively should hide those objects in a name space, when they do not need to be documented (see Package name spaces).
The R and man subdirectories may contain OS-specific subdirectories named unix or windows.
The sources and headers for the compiled code are in src, plus
optionally file Makevars or Makefile. When a package is
installed using R CMD INSTALL, Make is used to control
compilation and linking into a shared object for loading into R.
There are default variables and rules for this (determined when R is
configured and recorded in
R_HOME/etcR_ARCH/Makeconf), providing support for C,
C++, FORTRAN 77, Fortran 9x, Objective C and Objective
C++ with associated extensions .c, .cc or .cpp or
.C, .f, .f90 or .f95, .m, and
.mm or .M, respectively. We recommend using .h for
headers, also for C++ or Fortran 9x include files.
The default rules can be tweaked by setting macros in a file
src/Makevars (see Using Makevars). Note that this mechanism
should be general enough to eliminate the need for a package-specific
src/Makefile. If such a file is to be distributed, considerable
care is needed to make it general enough to work on all R platforms.
It should have an appropriate first target (conventionally called
‘all’) and a (possibly empty) target ‘clean’ which removes all
files generated by Make (to be used by ‘R CMD INSTALL --clean’ and
‘R CMD INSTALL --preclean’). There are platform-specific file
names on Windows: src/Makevars.win takes precedence over
src/Makevars and src/Makefile.win must be used.
The data subdirectory is for additional data files the package
makes available for loading using data(). Currently, data files
can have one of three types as indicated by their extension: plain R
code (.R or .r), tables (.tab, .txt, or
.csv, see ?data for the file formats), or save()
images (.RData or .rda). (All ports of R use the same
binary (XDR) format and can read compressed images. Use images saved
with save(, compress = TRUE), the default, to save space.) Note
that R code should be “self-sufficient” and not make use of extra
functionality provided by the package, so that the data file can also be
used without having to load the package. It is no longer necessary to
provide a 00Index file in the data directory—the
corresponding information is generated automatically from the
documentation sources when installing from source, or when using the
package builder (see Checking and building packages). If your data
files are enormous you can speed up installation by providing a file
datalist in the data subdirectory. This should have one
line per topic that data() will find, in the format ‘foo’ if
data(foo) provides ‘foo’, or ‘foo: bar bah’ if
data(foo) provides ‘bar’ and ‘bah’.
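The lines of a datalist file follow directly from a mapping of topics to the objects they provide. As a minimal sketch, with a hypothetical helper datalist_lines (not part of R) and a made-up topic ‘baz’:

```r
# Hypothetical helper (not part of R): 'provides' is a named list
# mapping each data() topic to the objects it creates.
datalist_lines <- function(provides) {
    topics <- names(provides)
    lines <- character(length(provides))
    for (i in seq_along(provides)) {
        objs <- provides[[i]]
        if (identical(objs, topics[i])) {
            # The topic provides a single object of the same name.
            lines[i] <- topics[i]
        } else {
            lines[i] <- paste(topics[i], ": ",
                              paste(objs, collapse = " "), sep = "")
        }
    }
    lines
}

datalist_lines(list(foo = "foo", baz = c("bar", "bah")))
# "foo"  "baz: bar bah"
```

The result could be written to the data subdirectory with writeLines.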
The demo subdirectory is for R scripts (for running via
demo()) that demonstrate some of the functionality of the
package. Demos may be interactive and are not checked automatically, so
if testing is desired use code in the tests directory. The
script files must start with a (lower or upper case) letter and have one
of the extensions .R or .r. If present, the demo
subdirectory should also have a 00Index file with one line for
each demo, giving its name and a description separated by white
space. (Note that it is not possible to generate this index file
automatically.)
The contents of the inst subdirectory will be copied recursively to the installation directory. Subdirectories of inst should not interfere with those used by R (currently, R, data, demo, exec, libs, man, help, html, latex, R-ex, chtml, and Meta). The copying of the inst happens after src is built so its Makefile can create files to be installed. Note that with the exceptions of INDEX, LICENSE/LICENCE, COPYING and NEWS (from R 2.7.0), information files at the top level of the package will not be installed and so not be known to users of Windows and MacOS X compiled packages (and not seen by those who use R CMD INSTALL or install.packages on the tarball). So any information files you wish an end user to see should be included in inst.
One thing you might like to add to inst is a CITATION file
for use by the citation function. Note that if the named
exceptions also occur in inst, the version in inst will be
that seen in the installed package. If you want NEWS to be
installed by your package in earlier versions of R, you need to
include it in inst.
Subdirectory tests is for additional package-specific test code,
similar to the specific tests that come with the R distribution.
Test code can either be provided directly in a .R file, or via a
.Rin file containing code which in turn creates the corresponding
.R file (e.g., by collecting all function objects in the package
and then calling them with the strangest arguments). The results of
running a .R file are written to a .Rout file. If there
is a corresponding .Rout.save file, these two are compared, with
differences being reported but not causing an error. The directory
tests is copied to the check area, and the tests are run with the
copy as the working directory and with R_LIBS set to ensure that
the copy of the package installed during testing will be found by
library(pkg_name).
Subdirectory exec could contain additional executables the package needs, typically scripts for interpreters such as the shell, Perl, or Tcl. This mechanism is currently used only by a very few packages, and is still experimental.
Subdirectory po is used for files related to localization: see Internationalization.
Sometimes it is convenient to distribute several packages as a bundle. (An example is VR which contains four packages.) The installation procedures on both Unix-alikes and Windows can handle package bundles.
The DESCRIPTION file of a bundle has a ‘Bundle’ field and no ‘Package’ field, as in
Bundle: VR
Priority: recommended
Contains: MASS class nnet spatial
Version: 7.2-36
Date: 2007-08-29
Depends: R (>= 2.4.0), grDevices, graphics, stats, utils
Suggests: lattice, nlme, survival
Author: S original by Venables & Ripley.
  R port by Brian Ripley <ripley@stats.ox.ac.uk>, following earlier
  work by Kurt Hornik and Albrecht Gebhardt.
Maintainer: Brian Ripley <ripley@stats.ox.ac.uk>
BundleDescription: Functions and datasets to support Venables and Ripley,
  'Modern Applied Statistics with S' (4th edition).
License: GPL-2 | GPL-3
URL: http://www.stats.ox.ac.uk/pub/MASS4/
The ‘Contains’ field lists the packages (space separated), which should be contained in separate subdirectories with the names given. During building and installation, packages will be installed in the order specified. Be sure to order this list so that dependencies are met appropriately.
The packages contained in a bundle are standard packages in all respects except that the DESCRIPTION file is replaced by a DESCRIPTION.in file which just contains fields additional to the DESCRIPTION file of the bundle, for example
Package: spatial
Description: Functions for kriging and point pattern analysis.
Title: Functions for Kriging and Point Pattern Analysis
Any files in the package bundle except the DESCRIPTION file and the named packages will be ignored.
The ‘Depends’ field in the bundle's DESCRIPTION file should list the dependencies of all the constituent packages (and similarly for ‘Imports’ and ‘Suggests’), and then DESCRIPTION.in files should not contain these fields.
Note that most of this section is Unix-specific: see the comments later on about the Windows port of R.
If your package needs some system-dependent configuration before
installation you can include an executable (Bourne shell) script
configure in your package which (if present) is executed by
R CMD INSTALL before any other action is performed. This can be
a script created by the Autoconf mechanism, but may also be a script
written by yourself. Use this to detect if any nonstandard libraries
are present such that corresponding code in the package can be disabled
at install time rather than giving error messages when the package is
compiled or used. To summarize, the full power of Autoconf is available
for your extension package (including variable substitution, searching
for libraries, etc.).
Under a Unix-alike only, an executable (Bourne shell) script
cleanup is executed as the last thing by R CMD INSTALL if
option --clean was given, and by R CMD build when
preparing the package for building from its source. It can be used to
clean up the package source tree. In particular, it should remove all
files created by configure.
As an example, suppose we want to use functionality provided by a (C or
FORTRAN) library foo. Using Autoconf, we can create a configure
script which checks for the library, sets variable HAVE_FOO to
TRUE if it was found and to FALSE otherwise, and then
substitutes this value into output files (by replacing instances of
‘@HAVE_FOO@’ in input files with the value of HAVE_FOO).
For example, if a function named bar is to be made available by
linking against library foo (i.e., using -lfoo), one
could use
AC_CHECK_LIB(foo, fun, [HAVE_FOO=TRUE], [HAVE_FOO=FALSE])
AC_SUBST(HAVE_FOO)
......
AC_CONFIG_FILES([foo.R])
AC_OUTPUT
in configure.ac (assuming Autoconf 2.50 or later).
The definition of the respective R function in foo.R.in could be
foo <- function(x) {
    if(!@HAVE_FOO@)
      stop("Sorry, library 'foo' is not available")
    ...
From this file configure creates the actual R source file foo.R looking like
foo <- function(x) {
    if(!FALSE)
      stop("Sorry, library 'foo' is not available")
    ...
if library foo was not found (with the desired functionality).
In this case, the above R code effectively disables the function.
One could also use different file fragments for available and missing functionality, respectively.
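The substitution that configure performs on foo.R.in can be imitated in R to see its effect. This is purely illustrative: configure does the replacement itself, and the vector names used here are assumptions made for the example.

```r
# Contents of a template such as foo.R.in (rest of the body elided).
template <- c("foo <- function(x) {",
              "    if(!@HAVE_FOO@)",
              "      stop(\"Sorry, library 'foo' is not available\")")

# What configure would produce if the library was not found:
# every instance of @HAVE_FOO@ is replaced by the value FALSE.
generated <- gsub("@HAVE_FOO@", "FALSE", template, fixed = TRUE)
generated[2]   # "    if(!FALSE)"
```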
You will very likely need to ensure that the same C compiler and compiler flags are used in the configure tests as when compiling R or your package. Under Unix, you can achieve this by including the following fragment early in configure.ac
: ${R_HOME=`R RHOME`}
if test -z "${R_HOME}"; then
  echo "could not determine R_HOME"
  exit 1
fi
CC=`"${R_HOME}/bin/R" CMD config CC`
CFLAGS=`"${R_HOME}/bin/R" CMD config CFLAGS`
CPPFLAGS=`"${R_HOME}/bin/R" CMD config CPPFLAGS`
(using ‘${R_HOME}/bin/R’ rather than just ‘R’ is necessary
in order to use the `right' version of R when running the script as
part of R CMD INSTALL.)
Note that earlier versions of this document recommended obtaining the
configure information by direct extraction (using grep and sed) from
R_HOME/etcR_ARCH/Makeconf, which only works for
variables recorded there as literals. You can use R CMD config
for getting the value of the basic configuration variables, or the
header and library flags necessary for linking against R, see R
CMD config --help for details. (This works on Windows as from R
2.6.0.)
To check for an external BLAS library using the ACX_BLAS macro
from the official Autoconf Macro Archive, one can simply do
F77=`"${R_HOME}/bin/R" CMD config F77`
AC_PROG_F77
FLIBS=`"${R_HOME}/bin/R" CMD config FLIBS`
ACX_BLAS([], AC_MSG_ERROR([could not find your BLAS library], 1))
Note that FLIBS as determined by R must be used to ensure that
FORTRAN 77 code works on all R platforms. Calls to the Autoconf macro
AC_F77_LIBRARY_LDFLAGS, which would overwrite FLIBS, must
not be used (and hence e.g. removed from ACX_BLAS). (Recent
versions of Autoconf in fact allow an already set FLIBS to
override the test for the FORTRAN linker flags. Also, recent versions
of R can detect external BLAS and LAPACK libraries.)
You should bear in mind that the configure script may well not work on Windows systems (this seems normally to be the case for those generated by Autoconf, although simple shell scripts do work). If your package is to be made publicly available, please give enough information for a user on a non-Unix platform to configure it manually, or provide a configure.win script to be used on that platform. (Optionally, there can be a cleanup.win script as well. Both should be shell scripts to be executed by ash, which is a minimal version of Bourne-style sh.)
In some rare circumstances, the configuration and cleanup scripts need to know the location into which the package is being installed. An example of this is a package that uses C code and creates two shared object/DLLs. Usually, the object that is dynamically loaded by R is linked against the second, dependent, object. On some systems, we can add the location of this dependent object to the object that is dynamically loaded by R. This means that each user does not have to set the value of the LD_LIBRARY_PATH (or equivalent) environment variable, but that the secondary object is automatically resolved. Another example is when a package installs support files that are required at run time, and their location is substituted into an R data structure at installation time. (This happens with the Java Archive files in the SJava package.) The names of the top-level library directory (i.e., specifiable via the ‘-l’ argument) and the directory of the package itself are made available to the installation scripts via the two shell/environment variables R_LIBRARY_DIR and R_PACKAGE_DIR. Additionally, the name of the package (e.g., ‘survival’ or ‘MASS’) being installed is available from the shell variable R_PACKAGE_NAME.
Sometimes writing your own configure script can be avoided by supplying a file Makevars: also one of the most common uses of a configure script is to make Makevars from Makevars.in.
The most common use of a Makevars file is to set additional
preprocessor (for example include paths) flags via PKG_CPPFLAGS,
and additional compiler flags by setting PKG_CFLAGS,
PKG_CXXFLAGS and PKG_FFLAGS, for C, C++, or FORTRAN
respectively (see Creating shared objects).
Also, Makevars can be used to set flags for the linker, for example ‘-L’ and ‘-l’ options.
When writing a Makevars file for a package you intend to distribute, take care to ensure that it is not specific to your compiler: flags such as -O2 -Wall -pedantic are all specific to GCC.
There are some macros which are built whilst configuring the building of R itself, are stored on Unix-alikes in R_HOME/etcR_ARCH/Makeconf and can be used in Makevars. These include
FLIBS
    A macro containing the libraries needed to link FORTRAN code. This
    may need to be included in PKG_LIBS.
BLAS_LIBS
    A macro containing the BLAS libraries used when building R. This
    may need to be included in PKG_LIBS. Beware that if it is empty then
    the R executable will contain all the double-precision and
    double-complex BLAS routines, but no single-precision or complex
    routines. If BLAS_LIBS is included, then FLIBS also needs
    to be, as most BLAS libraries are written in FORTRAN.
LAPACK_LIBS
    A macro containing the LAPACK libraries used when building R. This
    may need to be included in PKG_LIBS. This may point to a dynamic library libRlapack
    which contains all the double-precision LAPACK routines as well as those
    double-complex LAPACK and BLAS routines needed to build R, or it
    may point to an external LAPACK library, or may be empty if an external
    BLAS library also contains LAPACK.
    [There is no guarantee that the LAPACK library will provide more than all the double-precision and double-complex routines, and some do not provide all the auxiliary routines.]
    The macros BLAS_LIBS and FLIBS should always be included
    after LAPACK_LIBS.
SAFE_FFLAGS
    A macro containing flags which are needed to circumvent
    over-optimization of FORTRAN code. Note that this is not an
    additional flag to be used as part of PKG_FFLAGS, but a replacement for
    FFLAGS, and that it is intended for the FORTRAN-77 compiler
    ‘F77’ and not necessarily for the Fortran 90/95 compiler ‘FC’.
    See the example later in this section.
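The ordering rule for these macros can be illustrated with a one-line Makevars. The sketch below writes it from a shell script, as a configure script might; the output path is an assumption so that the fragment can be tried outside a package, and the same line could equally be written into src/Makevars by hand.

```shell
#!/bin/sh
# Sketch: write a Makevars that links against R's LAPACK, BLAS and FORTRAN
# run-time libraries, in that order. The output location is hypothetical;
# a real configure script would write src/Makevars.
makevars="${TMPDIR:-/tmp}/Makevars.example"

# The quoted 'EOF' stops the shell from expanding $(...), which must
# reach make unchanged.
cat > "${makevars}" <<'EOF'
PKG_LIBS = $(LAPACK_LIBS) $(BLAS_LIBS) $(FLIBS)
EOF

cat "${makevars}"
```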
Setting certain macros in Makevars will prevent R CMD SHLIB setting them: in particular if Makevars sets ‘OBJECTS’ it will not be set on the make command line. This can be useful in conjunction with implicit rules to allow other types of source code to be compiled and included in the shared object.
Note that Makevars should not normally contain targets, as it is (except on Windows) included before the default makefile and make is called without an explicit target. To circumvent that, use a suitable phony target before any actual targets: for example fastICA has
SLAMC_FFLAGS=$(R_XTRA_FFLAGS) $(FPICFLAGS) $(SHLIB_FFLAGS) $(SAFE_FFLAGS)

all: $(SHLIB)

slamc.o: slamc.f
	$(F77) $(SLAMC_FFLAGS) -c -o slamc.o slamc.f
to ensure that the LAPACK routines find some constants without infinite looping. The Windows equivalent is
slamc.o: slamc.f
	$(F77) $(SAFE_FFLAGS) -c -o slamc.o slamc.f
More generally, on a Unix-alike one could have something like
.PHONY: all

all: before $(SHLIB) after

before:
	# things that need to be done first, such as creating libraries

after:
	# cleanup needed after 'before'
On Windows, one can add dependencies to the ‘all’ target (which is what will get called), e.g. (based on package rcom)
all: ../inst/tst/bin/rcom_test.exe extraclean

../inst/tst/bin/rcom_test.exe: rcom_test.exe
	$(MKDIR) -p ../inst/tst/bin
	$(CP) $? $@

rcom_test.exe: rcom_test.o
rcom_test-LIBS = -L. -lsupc++ -luuid -lole32 -loleaut32

extraclean:
	$(RM) rcom_test.exe
The added dependencies will be built after the DLL: it is also possible (but not advisable) to have a target ‘all’ with commands (rather than dependencies).
There are two other targets, ‘before’ and ‘after’, which by default have neither dependencies nor commands and so can be overridden in a Makevars.win. See the example in Linking to other packages on Windows.
It may be helpful to give an extended example of using a configure script to create a src/Makevars file: this is based on that in the RODBC package.
The configure.ac file follows: configure is created from this by running autoconf in the top-level package directory (containing configure.ac).
AC_INIT([RODBC], 1.1.8) dnl package name, version

dnl A user-specifiable option
odbc_mgr=""
AC_ARG_WITH([odbc-manager],
            AC_HELP_STRING([--with-odbc-manager=MGR],
                           [specify the ODBC manager, e.g. odbc or iodbc]),
            [odbc_mgr=$withval])

if test "$odbc_mgr" = "odbc" ; then
  AC_PATH_PROGS(ODBC_CONFIG, odbc_config)
fi

dnl Select an optional include path, from a configure option
dnl or from an environment variable.
AC_ARG_WITH([odbc-include],
            AC_HELP_STRING([--with-odbc-include=INCLUDE_PATH],
                           [the location of ODBC header files]),
            [odbc_include_path=$withval])
RODBC_CPPFLAGS="-I."
if test [ -n "$odbc_include_path" ] ; then
  RODBC_CPPFLAGS="-I. -I${odbc_include_path}"
else
  if test [ -n "${ODBC_INCLUDE}" ] ; then
    RODBC_CPPFLAGS="-I. -I${ODBC_INCLUDE}"
  fi
fi

dnl ditto for a library path
AC_ARG_WITH([odbc-lib],
            AC_HELP_STRING([--with-odbc-lib=LIB_PATH],
                           [the location of ODBC libraries]),
            [odbc_lib_path=$withval])
if test [ -n "$odbc_lib_path" ] ; then
  LIBS="-L$odbc_lib_path ${LIBS}"
else
  if test [ -n "${ODBC_LIBS}" ] ; then
    LIBS="-L${ODBC_LIBS} ${LIBS}"
  else
    if test -n "${ODBC_CONFIG}"; then
      odbc_lib_path=`odbc_config --libs | sed s/-lodbc//`
      LIBS="${odbc_lib_path} ${LIBS}"
    fi
  fi
fi

dnl Now find the compiler and compiler flags to use
: ${R_HOME=`R RHOME`}
if test -z "${R_HOME}"; then
  echo "could not determine R_HOME"
  exit 1
fi
CC=`"${R_HOME}/bin/R" CMD config CC`
CPP=`"${R_HOME}/bin/R" CMD config CPP`
CFLAGS=`"${R_HOME}/bin/R" CMD config CFLAGS`
CPPFLAGS=`"${R_HOME}/bin/R" CMD config CPPFLAGS`
AC_PROG_CC
AC_PROG_CPP

if test -n "${ODBC_CONFIG}"; then
  RODBC_CPPFLAGS=`odbc_config --cflags`
fi
CPPFLAGS="${CPPFLAGS} ${RODBC_CPPFLAGS}"

dnl Check the headers can be found
AC_CHECK_HEADERS(sql.h sqlext.h)
if test "${ac_cv_header_sql_h}" = no ||
   test "${ac_cv_header_sqlext_h}" = no; then
  AC_MSG_ERROR("ODBC headers sql.h and sqlext.h not found")
fi

dnl search for a library containing an ODBC function
if test [ -n "${odbc_mgr}" ] ; then
  AC_SEARCH_LIBS(SQLTables, ${odbc_mgr}, ,
      AC_MSG_ERROR("ODBC driver manager ${odbc_mgr} not found"))
else
  AC_SEARCH_LIBS(SQLTables, odbc odbc32 iodbc, ,
      AC_MSG_ERROR("no ODBC driver manager found"))
fi

dnl for 64-bit ODBC need SQL[U]LEN, and it is unclear where they are defined.
AC_CHECK_TYPES([SQLLEN, SQLULEN], , , [# include <sql.h>])
dnl for unixODBC header
AC_CHECK_SIZEOF(long, 4)

dnl substitute RODBC_CPPFLAGS and LIBS
AC_SUBST(RODBC_CPPFLAGS)
AC_SUBST(LIBS)
AC_CONFIG_HEADERS([src/config.h])
dnl and do substitution in the src/Makevars.in and src/config.h
AC_CONFIG_FILES([src/Makevars])
AC_OUTPUT
where src/Makevars.in would be simply
PKG_CPPFLAGS = @RODBC_CPPFLAGS@
PKG_LIBS = @LIBS@
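To illustrate what the substitution step produces, the following sketch replaces the two @...@ placeholders by hand with sed. It is not how autoconf itself performs the substitution, and the values assigned to RODBC_CPPFLAGS and LIBS, like the working directory, are made-up examples.

```shell
#!/bin/sh
# Hand-rolled illustration of the @VAR@ substitution that configure performs
# on src/Makevars.in. The values below are hypothetical.
RODBC_CPPFLAGS="-I. -I/opt/local/include"
LIBS="-L/opt/local/lib -liodbc"

workdir="${TMPDIR:-/tmp}/makevars-demo.$$"
mkdir -p "${workdir}"

# The template, as shown in the text.
cat > "${workdir}/Makevars.in" <<'EOF'
PKG_CPPFLAGS = @RODBC_CPPFLAGS@
PKG_LIBS = @LIBS@
EOF

# '|' is used as the sed delimiter because the values contain '/'.
sed -e "s|@RODBC_CPPFLAGS@|${RODBC_CPPFLAGS}|" \
    -e "s|@LIBS@|${LIBS}|" \
    "${workdir}/Makevars.in" > "${workdir}/Makevars"

cat "${workdir}/Makevars"
```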
A user can then be advised to specify the location of the ODBC driver manager files by options like (lines broken for easier reading)
R CMD INSTALL
--configure-args='--with-odbc-include=/opt/local/include
--with-odbc-lib=/opt/local/lib --with-odbc-manager=iodbc'
RODBC
or by setting the environment variables ODBC_INCLUDE and
ODBC_LIBS.
R currently does not distinguish between FORTRAN 77 and Fortran 90/95
code, and assumes all FORTRAN comes in source files with extension
.f. Commercial Unix systems typically use a F95 compiler, but
only since the release of gcc 4.0.0 in April 2005 have Linux and
other non-commercial OSes had much support for F95. Only with R 2.6.0
did the Windows port adopt a Fortran 90 compiler.
This means that portable packages need to be written in correct FORTRAN 77, which will also be valid Fortran 95. See http://developer.r-project.org/Portability.html for reference resources. In particular, free source form F95 code is not portable.
On some systems an alternative F95 compiler is available: from the
gcc family this might be gfortran or g95.
Configuring R will try to find a compiler which (from its name)
appears to be a Fortran 90/95 compiler, and set it in macro ‘FC’.
Note that it does not check that such a compiler is fully (or even
partially) compliant with Fortran 90/95. Packages making use of
Fortran 90/95 features should use file extension .f90 or
.f95 for the source files: the variable PKG_FCFLAGS
specifies any special flags to be used. There is no guarantee that
compiled Fortran 90/95 code can be mixed with any other type of code,
nor that a build of R will have support for such packages.
Before using these tools, please check that your package can be
installed and loaded. R CMD check will inter alia do
this, but you will get more informative error messages doing the checks
directly.
Using R CMD check, the R package checker, one can test whether
source R packages work correctly. It can be run on one or
more directories, or gzipped package tar
archives with extension
.tar.gz or .tgz. This runs a series of checks, including the following.

The DESCRIPTION file is checked. One check is that all packages mentioned in
library or require calls, or from which the NAMESPACE
file imports, or which are called via :: or :::, are listed
(in ‘Depends’, ‘Imports’, ‘Suggests’ or ‘Contains’):
this is not an exhaustive check of the actual imports.

The package subdirectories are checked. To allow a configure script to generate suitable files, files ending in ‘.in’ will be allowed in the R directory.
A warning is given for directory names that look like R package check directories; many packages have been submitted to CRAN containing these.

The R files are checked for correct calls to library.dynam (with
no extension). In addition, it is checked whether methods have all
arguments of the corresponding generic, and whether the final argument
of replacement functions is called ‘value’. All foreign function
calls (.C, .Fortran, .Call and .External
calls) are tested to see if they have a PACKAGE argument, and if
not, whether the appropriate DLL might be deduced from the name space of
the package. Any other calls are reported. (The check is generous, and
users may want to supplement this by examining the output of
tools::checkFF("mypkg", verbose=TRUE), especially if the
intention was to always use a PACKAGE argument.)

The Rd files are checked for the presence of the mandatory (\name, \alias, \title
and \description) fields. The Rd name and title are checked for
being non-empty, and the keywords found are compared to the standard
ones. There is a check for missing cross-references (links).

It is checked whether all function arguments given in \usage
sections of Rd files are documented in the corresponding
\arguments section.

The examples provided by the package's documentation are run (these are written using
\examples to create executable example code).
Of course, released packages should be able to run at least their own
examples. Each example is run in a `clean' environment (so earlier
examples cannot be assumed to have been run), and with the variables
T and F redefined to generate an error unless they are set
in the example: See Logical vectors.
Use R CMD check --help to obtain more information about the usage of the R package checker. A subset of the checking steps can be selected by adding flags.
You do need to ensure that the package is checked in a suitable locale
if it contains non-ASCII characters. Such packages are likely
to fail some of the checks in a C locale, and R CMD
check will warn if it spots the problem. You should be able to check
any package in a UTF-8 locale (if one is available). Beware that
although a C locale is rarely used at a console, it may be the
default if logging in remotely or for batch jobs.
Using R CMD build, the R package builder, one can build R
packages from their sources (for example, for subsequent release).
Prior to actually building the package in the common gzipped tar file format, a few diagnostic checks and cleanups are performed. In particular, it is tested whether object indices exist and can be assumed to be up-to-date, and C, C++ and FORTRAN source files and relevant make files are tested and converted to LF line-endings if necessary.
Run-time checks of whether the package works correctly should be performed
using R CMD check prior to invoking the build procedure.
To exclude files from being put into the package, one can specify a list
of exclude patterns in file .Rbuildignore in the top-level source
directory. These patterns should be Perl regexps, one per line, to be
matched against the file names relative to the top-level source
directory. In addition, directories from source control
systems, directories with names ending
.Rcheck or Old or old and files GNUMakefile
or with base names starting with ‘.#’, or starting and ending with
‘#’, or ending in ‘~’, ‘.bak’ or ‘.swp’, are
excluded by default. In addition, those files in the R,
demo and man directories which are flagged by R CMD
check as having invalid names will be excluded.
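The following sketch shows what a small .Rbuildignore might look like and which paths its patterns would match. The patterns and file names are hypothetical, and grep -E (POSIX extended regexps) stands in for Perl regexps, which is adequate only for simple patterns such as these.

```shell
#!/bin/sh
# Sketch: a small .Rbuildignore and a check of which paths its patterns match.
# Everything here (directory, patterns, file names) is illustrative.
workdir="${TMPDIR:-/tmp}/rbuildignore-demo.$$"
mkdir -p "${workdir}"

# One pattern per line; quoted 'EOF' keeps the backslash and '$' literal.
cat > "${workdir}/.Rbuildignore" <<'EOF'
^notes/
^TODO$
\.orig$
EOF

# Succeeds if the path (relative to the top-level source directory)
# matches any pattern, i.e. would be left out of the tarball.
is_ignored() {
  printf '%s\n' "$1" | grep -E -q -f "${workdir}/.Rbuildignore"
}

for f in notes/ideas.txt TODO R/foo.R src/foo.c.orig; do
  if is_ignored "$f"; then echo "excluded: $f"; else echo "kept: $f"; fi
done
```

Anchoring the patterns with ^ and $ avoids accidentally excluding files whose names merely contain the same text elsewhere in the path.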
Use R CMD build --help to obtain more information about the usage of the R package builder.
Unless R CMD build is invoked with the --no-vignettes option, it will attempt to rebuild the vignettes (see Writing package vignettes) in the package. To do so it installs the current package/bundle into a temporary library tree, but any dependent packages need to be installed in an available library tree (see the Note: below).
One of the checks that R CMD build runs is for empty source
directories. These are in most cases unintentional, in which case they
should be removed and the build re-run.
It can be useful to run R CMD check --check-subdirs=yes on the
built tarball as a final check on the contents.
R CMD build can also build pre-compiled versions of packages for
binary distributions, but R CMD INSTALL --build is preferred (and
is considerably more flexible). In particular, Windows users are
recommended to use R CMD INSTALL --build and install into the
main library tree (the default) so that HTML links are resolved.
Note: R CMD check and R CMD build run R with --vanilla, so none of the user's startup files are read. If you need R_LIBS set (to find packages in a non-standard library) you will need to set it in the environment.
Note to Windows users: R CMD check and R CMD build need you to have installed the files for building source packages (which is the default), as well as the Windows toolset (see the “R Installation and Administration” manual). You may need to set TMPDIR to point to a suitable writeable directory with a path not containing spaces; use forward slashes for the separators.
In addition to the available command line options, R CMD check
also allows customization by setting (Perl) configuration variables in a
configuration file, the location of which can be specified via the
--rcfile option and defaults to $HOME/.R/check.conf
provided that the environment variable HOME is set.
The following configuration variables are currently available.
$R_check_use_install_log
$R_check_all_non_ISO_C
$R_check_weave_vignettes
$R_check_latex_vignettes
    If true (and $R_check_weave_vignettes is also true),
    latex package vignettes in the process of checking them: this
    will show up Sweave source errors, including missing source
    files. Default: true.
$R_check_subdirs_nocase
$R_check_subdirs_strict
$R_check_force_suggests
$R_check_use_codetools
$R_check_Rd_style
    If true, check whether Rd usage entries for S3 methods use the
    full function name rather than the appropriate \method markup.
    Default: true.
$R_check_Rd_xrefs

Values ‘1’ or a string with lower-cased version ‘"yes"’ or ‘"true"’ can be used for setting the variables to true; similarly, ‘0’ or strings with lower-cased version ‘"no"’ or ‘"false"’ give false.
For example, a configuration file containing
$R_check_use_install_log = "TRUE";
$R_check_weave_vignettes = 0;
results in using install logs and turning off weaving.
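The rule for interpreting these values can be sketched as a small shell function. This is an illustration only: R CMD check implements the rule in Perl, the function name is hypothetical, and since the treatment of other values is not stated here the sketch merely reports them as invalid.

```shell
#!/bin/sh
# Sketch of the truth-value rule for check.conf settings.
# Prints "true", "false" or "invalid" for a given setting.
config_val_to_logical() {
  # Lower-case the value before comparing, so "TRUE", "Yes", ... are accepted.
  v=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$v" in
    1|yes|true) echo true ;;
    0|no|false) echo false ;;
    *)          echo invalid ;;   # assumption: not covered by the stated rule
  esac
}

config_val_to_logical "TRUE"   # the value given to $R_check_use_install_log
config_val_to_logical 0        # the value given to $R_check_weave_vignettes
```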
Future versions of R may enhance this customization mechanism, and
provide a similar scheme for R CMD build.
There are other internal settings that can be changed via environment variables _R_CHECK_*_: see the Perl source code.
In addition to the help files in Rd format, R packages allow the inclusion of documents in arbitrary other formats. The standard location for these is subdirectory inst/doc of a source package; the contents will be copied to subdirectory doc when the package is installed. Pointers from package help indices to the installed documents are automatically created. Documents in inst/doc can be in arbitrary format, but we strongly recommend providing them in PDF format so that users on all platforms can easily read them. To ensure that they can be accessed from a browser, the file names should start with an ASCII letter and consist entirely of ASCII letters, digits, minus signs or underscores.
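A quick way to test candidate file names against this rule is sketched below. It assumes that the rule applies to the name without its extension (for example "intro-2" for "intro-2.pdf"), which the text does not state explicitly; the function name and the sample file names are hypothetical.

```shell
#!/bin/sh
# Sketch: test whether a document name follows the naming rule for inst/doc.
# Force the C locale so that ranges such as A-Z cover ASCII letters only.
LC_ALL=C; export LC_ALL

valid_doc_name() {
  base=${1%.*}                      # strip the last extension, if any
  case "$base" in
    '')               return 1 ;;   # nothing left before the extension
    [!A-Za-z]*)       return 1 ;;   # must start with an ASCII letter
    *[!A-Za-z0-9_-]*) return 1 ;;   # only letters, digits, minus, underscore
    *)                return 0 ;;
  esac
}

for f in intro-2.pdf user_guide.pdf 2nd-draft.pdf "my notes.pdf"; do
  if valid_doc_name "$f"; then echo "ok: $f"; else echo "bad: $f"; fi
done
```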
A special case are documents in Sweave format, which we call
package vignettes. Sweave allows the integration of LaTeX
documents and R code and is contained in package utils which is
part of the base R distribution, see the Sweave help page for
details on the document format. Package vignettes found in directory
inst/doc are tested by R CMD check by executing all R
code chunks they contain to ensure consistency between code and
documentation. Code chunks with option eval=FALSE are not
tested. The R working directory for all vignette tests in R CMD
check is the installed version of the doc
subdirectory. Make sure all files needed by the vignette (data sets,
...) are accessible by either placing them in the inst/doc
hierarchy of the source package, or using calls to system.file().
R CMD build will automatically create PDF versions of the
vignettes for distribution with the package sources. By including the
PDF version in the package sources it is not necessary that the
vignettes can be compiled at install time, i.e., the package author can
use private LaTeX extensions which are only available on his machine.
By default R CMD build will run Sweave on all files in
Sweave format. If no Makefile is found in directory
inst/doc, then texi2dvi --pdf is run on all vignettes.
Whenever a Makefile is found, then R CMD build will try to
run make after the Sweave step, such that PDF manuals
can be created from arbitrary source formats (plain LaTeX files,
...). The Makefile should take care of both creation of PDF
files and cleaning up afterwards, i.e., delete all files that shall not
appear in the final package archive. Note that the make step is
executed independently from the presence of any files in Sweave format.
It is no longer necessary to provide a 00Index.dcf file in the
inst/doc directory—the corresponding information is generated
automatically from the \VignetteIndexEntry statements in all
Sweave files when installing from source, or when using the package
builder (see Checking and building packages). The
\VignetteIndexEntry statement is best placed in a LaTeX comment,
such that no definition of the command is necessary.
At install time an HTML index for all vignettes is automatically
created from the \VignetteIndexEntry statements unless a file
index.html exists in directory inst/doc. This index is
linked into the HTML help system for each package.
CRAN is a network of WWW sites holding the R distributions and contributed code, especially R packages. Users of R are encouraged to join in the collaborative project and to submit their own packages to CRAN.
Before submitting a package mypkg, do run the following steps to test it is complete and will install properly. (Unix procedures only, run from the directory containing mypkg as a subdirectory.)
Run R CMD build to make the release .tar.gz file.
Run R CMD check on the .tar.gz file to check that the
package will install and will run its examples, and that the
documentation is complete and can be processed. If the package contains
code that needs to be compiled, try to enable a reasonable amount of
diagnostic messaging (“warnings”) when compiling, such as
-Wall -pedantic for tools from GCC, the GNU Compiler
Collection. (If R was not configured accordingly, one can achieve
this e.g. via PKG_CFLAGS and related variables.)
Please ensure that you can run through the complete procedure with only
warnings that you understand and have reasons not to eliminate. In
principle, packages must pass R CMD check without warnings to be
admitted to the main CRAN package area.
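The two steps can be written down as a short script. The sketch below only prints the commands rather than running them, so it is safe to try without R installed; the package name and version are hypothetical, and the tarball name follows the package_version.tar.gz form that R CMD build produces.

```shell
#!/bin/sh
# Print the commands for the build and check steps for a given package
# and version. Pipe the output to sh to execute the commands for real.
presubmit_cmds() {
  pkg=$1
  version=$2
  echo "R CMD build ${pkg}"
  echo "R CMD check ${pkg}_${version}.tar.gz"
}

presubmit_cmds mypkg 0.1-0
```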
When all the testing is done, upload the .tar.gz file, using ‘anonymous’ as log-in name and your e-mail address as password, to ftp://CRAN.R-project.org/incoming/ (note: use ‘ftp’ and not ‘sftp’ to connect to this server) and send a message to CRAN@R-project.org about it. The CRAN maintainers will run these tests before putting a submission in the main archive.
Note that CRAN generally does not accept submissions of precompiled binaries, for security reasons.
R has a name space management system for packages. This system allows the package writer to specify which variables in the package should be exported to make them available to package users, and which variables should be imported from other packages.
The current mechanism for specifying a name space for a package is to
place a NAMESPACE file in the top level package directory. This
file contains name space directives describing the imports and
exports of the name space. Additional directives register any shared
objects to be loaded and any S3-style methods that are provided. Note
that although the file looks like R code (and often has R-style
comments) it is not processed as R code. Only very simple
conditional processing of if statements is implemented.
Like other packages, packages with name spaces are loaded and attached
to the search path by calling library. Only the exported
variables are placed in the attached frame. Loading a package that
imports variables from other packages will cause these other packages to
be loaded as well (unless they have already been loaded), but they will
not be placed on the search path by these implicit loads.
Name spaces are sealed once they are loaded. Sealing means that imports and exports cannot be changed and that internal variable bindings cannot be changed. Sealing allows a simpler implementation strategy for the name space mechanism. Sealing also allows code analysis and compilation tools to accurately identify the definition corresponding to a global variable reference in a function body.
Note that adding a name space to a package changes the search strategy. The package name space comes first in the search, then the imports, then the base name space and then the normal search path.
Exports are specified using the export directive in the
NAMESPACE file. A directive of the form
export(f, g)
specifies that the variables f and g are to be exported.
(Note that variable names may be quoted, and reserved words and
non-standard names such as [<-.fractions must be.)
For packages with many variables to export it may be more convenient to
specify the names to export with a regular expression using
exportPattern. The directive
exportPattern("^[^\\.]")
exports all variables that do not start with a period.
A package with a name space implicitly imports the base name space.
Variables exported from other packages with name spaces need to be
imported explicitly using the directives import and
importFrom. The import directive imports all exported
variables from the specified package(s). Thus the directive
import(foo, bar)
specifies that all exported variables in the packages foo and
bar are to be imported. If only some of the exported variables
from a package are needed, then they can be imported using
importFrom. The directive
importFrom(foo, f, g)
specifies that the exported variables f and g of the
package foo are to be imported.
It is possible to export variables from a name space that it has imported from other name spaces.
If a package only needs a few objects from another package it can use a
fully qualified variable reference in the code instead of a formal
import. A fully qualified reference to the function f in package
foo is of the form foo:::f. This is less efficient than a
formal import and also loses the advantage of recording all dependencies
in the NAMESPACE file, so this approach is usually not
recommended. Evaluating foo:::f will cause package foo to
be loaded, but not attached, if it was not loaded already—this can be
an advantage in delaying the loading of a rarely used package.
Using foo:::f allows access to unexported objects: to confine
references to exported objects use foo::f.
The standard method for S3-style UseMethod dispatching might fail
to locate methods defined in a package that is imported but not attached
to the search path. To ensure that these methods are available the
packages defining the methods should ensure that the generics are
imported and register the methods using S3method directives. If
a package defines a function print.foo intended to be used as a
print method for class foo, then the directive
S3method(print, foo)
ensures that the method is registered and available for UseMethod
dispatch. The function print.foo does not need to be exported.
Since the generic print is defined in base it does not need
to be imported explicitly. This mechanism is intended for use with
generics that are defined in a name space. Any methods for a generic
defined in a package that does not use a name space should be exported,
and the package defining and exporting the methods should be attached to
the search path if the methods are to be found.
(Note that function and class names may be quoted, and reserved words
and non-standard names such as [<- and function must
be.)
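Putting the import, export and S3method directives together, a small NAMESPACE file might look like the one written by the sketch below. The package foo, the functions f, g and summarize, and the class fractions are placeholders rather than names from a real package, and the output path is chosen only so the fragment can be tried outside a package.

```shell
#!/bin/sh
# Sketch: write a small NAMESPACE file combining the directives described
# so far. All package, function and class names are hypothetical.
out="${TMPDIR:-/tmp}/NAMESPACE.example"

# Quoted 'EOF' so that the file contents are written exactly as shown.
cat > "${out}" <<'EOF'
# import two variables from package foo
importFrom(foo, f, g)

# export one function by name
export(summarize)

# register print.fractions as the print method for class "fractions";
# print.fractions itself is not exported
S3method(print, fractions)
EOF

cat "${out}"
```

In a real package the file would be named NAMESPACE and placed in the top level package directory.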
There are a number of hooks that apply to packages with name spaces.
See help(".onLoad") for more details.
Packages with name spaces do not use the .First.lib function.
Since loading and attaching are distinct operations when a name space is
used, separate hooks are provided for each. These hook functions are
called .onLoad and .onAttach. They take the same
arguments as .First.lib; they should be defined in the name space
but not exported.
However, packages with name spaces do use the .Last.lib
function. There is also a hook .onUnload which is called when
the name space is unloaded (via a call to unloadNamespace) with
argument the full path to the directory in which the package was
installed. .onUnload should be defined in the name space and not
exported, but .Last.lib does need to be exported.
Packages are not likely to need .onAttach (except perhaps for a
start-up banner); code to set options and load shared objects should be
placed in a .onLoad function, or use made of the useDynLib
directive described next.
There can be one or more useDynLib directives which allow shared
objects that need to be loaded to be specified in the NAMESPACE
file. The directive
useDynLib(foo)
registers the shared object foo for loading with
library.dynam. Loading of registered object(s) occurs after the
package code has been loaded and before running the load hook function.
Packages that would only need a load hook function to load a shared
object can use the useDynLib directive instead.
User-level hooks are also available: see the help on function
setHook.
The useDynLib directive also accepts the names of the native
routines that are to be used in R via the .C, .Call,
.Fortran and .External interface functions. These are given as
additional arguments to the directive, for example,
useDynLib(foo, myRoutine, myOtherRoutine)
By specifying these names in the useDynLib directive, the
native symbols are resolved when the package is loaded and R variables
identifying these symbols are added to the package's name space with
these names. These can be used in the .C, .Call,
.Fortran and .External calls in place of the
name of the routine and the PACKAGE argument.
For instance, we can call the routine myRoutine from R
with the code
.Call(myRoutine, x, y)
rather than
.Call("myRoutine", x, y, PACKAGE = "foo")
There are at least two benefits to this approach. Firstly, the symbol lookup is done just once for each symbol rather than each time the routine is invoked. Secondly, this removes any ambiguity in resolving symbols that might be present in several compiled libraries. In particular, it allows for correctly resolving routines when different versions of the same package are loaded concurrently in the same R session.
In some circumstances, there will already be an R variable in the
package with the same name as a native symbol. For example, we may have
an R function in the package named myRoutine. In this case,
it is necessary to map the native symbol to a different R variable
name. This can be done in the useDynLib directive by using named
arguments. For instance, to map the native symbol name myRoutine
to the R variable myRoutine_sym, we would use
useDynLib(foo, myRoutine_sym = myRoutine, myOtherRoutine)
We could then call that routine from R using the command
.Call(myRoutine_sym, x, y)
Symbols without explicit names are assigned to the R variable with that name.
In some cases, it may be preferable not to create R variables in the
package's name space that identify the native routines. It may be too
costly to compute these for many routines when the package is loaded
if many of these routines are not likely to be used. In this case,
one can still perform the symbol resolution correctly using the DLL,
but do this each time the routine is called. Given a reference to the
DLL as an R variable, say dll, we can call the routine
myRoutine using the expression
.Call(dll$myRoutine, x, y)
The $ operator resolves the routine with the given name in the
DLL using a call to getNativeSymbol. This is the same
computation as above where we resolve the symbol when the package is
loaded. The only difference is that this is done each time in the case
of dll$myRoutine.
In order to use this dynamic approach (e.g., dll$myRoutine), one
needs the reference to the DLL as an R variable in the package. The
DLL can be assigned to a variable by using the variable =
dllName format used above for mapping symbols to R variables. For
example, if we wanted to assign the DLL reference for the DLL
foo in the example above to the variable myDLL, we would
use the following directive in the NAMESPACE file:
myDLL = useDynLib(foo, myRoutine_sym = myRoutine, myOtherRoutine)
Then, the R variable myDLL is in the package's name space and
available for calls such as myDLL$dynRoutine to access routines
that are not explicitly resolved at load time.
If the package has registration information (see Registering native routines), then we can use that directly rather than specifying the
list of symbols again in the useDynLib directive in the
NAMESPACE file. Each routine in the registration information is
specified by giving a name by which the routine is to be specified along
with the address of the routine and any information about the number and
type of the parameters. Using the .registration argument of
useDynLib, we can instruct the name space mechanism to create
R variables for these symbols. For example, suppose we have the
following registration information for a DLL named myDLL:
R_CMethodDef cMethods[] = {
    {"foo", &foo, 4, {REALSXP, INTSXP, STRSXP, LGLSXP}},
    {"bar_sym", &bar, 0},
    {NULL, NULL, 0}
};

R_CallMethodDef callMethods[] = {
    {"R_call_sym", &R_call, 4},
    {"R_version_sym", &R_version, 0},
    {NULL, NULL, 0}
};
Then, the directive in the NAMESPACE file
useDynLib(myDLL, .registration = TRUE)
causes the DLL to be loaded and also for the R variables foo,
bar_sym, R_call_sym and R_version_sym to be
defined in the package's name space.
Note that the names for the R variables are taken from the entry in
the registration information and do not need to be the same as the name
of the native routine. This allows the creator of the registration
information to map the native symbols to non-conflicting variable names
in R, e.g. R_version to R_version_sym for use in an
R function such as
R_version <- function()
{
    .Call(R_version_sym)
}
Using argument .fixes allows an automatic prefix to be added to
the registered symbols, which can be useful when working with an
existing package. For example, package KernSmooth has
useDynLib(KernSmooth, .registration = TRUE, .fixes = "F_")
which makes the R variables corresponding to the FORTRAN symbols
F_bkde and so on, and so avoid clashes with R code in the name
s