This tutorial provides an overview of the PLT Scheme module system. Complete details can be found in PLT MzScheme: Language Manual. If you have DrScheme, search with Help Desk to quickly find details on any particular topic.
(module module-name-identifier implementation-language-name expression-or-definition ...)
The module system works best (for reasons that will become clear) when
each module is declared in its own file, and when the
module-name-identifier matches the name of the file, minus
the directory path and file extension. The
implementation-language-name is usually mzscheme.
For example, the following canvas module declaration would be
placed in a "canvas.scm" file:
|
(module canvas
|
The provide declaration at the end of the module exports the
functions new-canvas, draw-dot!, and
display-canvas. A provide declaration can appear
anywhere within a module, either before or after the exported names
are defined. Any number of provide declarations can appear
in a module, and macros can expand to provide declarations.
The function iterate is private to the canvas
module, because it is not exported by any provide
declaration. Similarly, the canvas structure definition is private.
Since the canvas structure definition is private, code outside the
module cannot manipulate the canvas representation directly.
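The points above can be illustrated with a minimal sketch of what a "canvas.scm" file might contain. This is not the tutorial's original listing: the row-of-strings representation, the argument order of iterate, and the bounds check in draw-dot! are all assumptions made for illustration.

```scheme
;; canvas.scm (illustrative sketch; the representation is an assumption)
(module canvas mzscheme
  ;; Private structure: width, height, and a vector of row strings.
  (define-struct canvas (width height rows))

  ;; Private helper: apply f to 0, 1, ..., n-1 in order.
  (define (iterate f n)
    (let loop ((i 0))
      (when (< i n)
        (f i)
        (loop (+ i 1)))))

  ;; Create a blank canvas of w columns and h rows.
  (define (new-canvas w h)
    (let ((rows (make-vector h #f)))
      (iterate (lambda (y) (vector-set! rows y (make-string w #\space))) h)
      (make-canvas w h rows)))

  ;; Mark position (x, y); positions outside the canvas are ignored.
  (define (draw-dot! c x y)
    (when (and (< -1 x (canvas-width c))
               (< -1 y (canvas-height c)))
      (string-set! (vector-ref (canvas-rows c) y) x #\*)))

  ;; Print the canvas one row per line.
  (define (display-canvas c)
    (iterate (lambda (y)
               (display (vector-ref (canvas-rows c) y))
               (newline))
             (canvas-height c)))

  ;; Only these three names are visible outside the module.
  (provide new-canvas draw-dot! display-canvas))
```

Because make-canvas and canvas-rows are not provided, a client can only build and modify canvases through the three exported functions.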
To use the canvas module in a read-eval-print loop, load it
with require. If "canvas.scm" is in the current
directory, then
> (require "canvas.scm")
loads the module, executes it, and binds the module's exports in the top-level environment.1 The canvas functions are then available for interactive exploration:
> (define c (new-canvas 3 3))
> (draw-dot! c 2 2)
> (display-canvas c)
*
>
A program that uses canvas to draw a face could (and should)
be in its own module. Here's a module for "face.scm":
|
(module face
|
Any number of modules can be listed in a require declaration;
any number of require declarations can appear in a module
body; they can appear anywhere in the module body;2 and macros can expand to
require declarations. Mutually recursive requires
are not allowed (i.e., the module-import graph must be acyclic).
Given the above module, evaluating (require "face.scm") in
the read-eval-print loop will display a face. In other words,
require executes the expressions in a module, as well as
evaluating the module's definitions. However, a module is executed
only once, and only the first time it is required. Evaluating
(require "face.scm") a second time will have no
effect.3
The (require "canvas.scm") declaration in face
imports a module relative to the "face.scm" file, independent of
the current directory at the time that "face.scm" is loaded.
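As a rough illustration of a client module, a "face.scm" file might look like the sketch below. The dot coordinates are invented for this example, and the sketch assumes that "canvas.scm" sits in the same directory and provides new-canvas, draw-dot!, and display-canvas as described above.

```scheme
;; face.scm (illustrative sketch; coordinates are made up)
(module face mzscheme
  ;; The path is resolved relative to this file, not the current directory.
  (require "canvas.scm")

  (define c (new-canvas 7 5))
  (draw-dot! c 2 1)   ; left eye
  (draw-dot! c 4 1)   ; right eye
  (draw-dot! c 3 2)   ; nose
  (draw-dot! c 2 3)   ; mouth, three dots wide
  (draw-dot! c 3 3)
  (draw-dot! c 4 3)

  ;; This expression runs when the module is first required.
  (display-canvas c))
```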
More generally, module filenames can be relative paths using Unix
notation (regardless of the execution platform): "/" is the path
separator, and ".." accesses a parent directory.
The iterate function is somewhat out of place in the
canvas module; it's useful in other places (as we'll see),
and probably should be in its own module. Moreover, if we expect to
write iteration expressions always with an immediate lambda
argument, we might want to use a do-iterate syntactic form,
instead.
We can define a new iteration module (in
"iteration.scm") that contains the definition of the
iterate function and the do-iterate macro:
|
(module iteration
|
Given the iteration module, we can modify canvas to
use the new do-iterate form:
|
(module canvas
|
Macro definitions respect the scope of module declarations,
so the use of iterate in the expansion of
do-iterate always refers to the definition in the
iteration module, even though the iterate is
otherwise private to the iteration module.
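A minimal sketch of such an iteration module follows. The shape of the do-iterate form (loop variable first, then the count, then the body) is an assumption inferred from the examples in this tutorial, not a copy of the original listing.

```scheme
;; iteration.scm (illustrative sketch)
(module iteration mzscheme
  ;; Private: apply f to 0, 1, ..., n-1 in order.
  (define (iterate f n)
    (let loop ((i 0))
      (when (< i n)
        (f i)
        (loop (+ i 1)))))

  ;; (do-iterate id count body ...) runs body with id bound to
  ;; each index below count.
  (define-syntax do-iterate
    (syntax-rules ()
      ((_ id n body ...)
       (iterate (lambda (id) body ...) n))))

  ;; iterate itself is not exported; the macro expansion can still
  ;; refer to it because the template is scoped to this module.
  (provide do-iterate))
```

A client that requires "iteration.scm" could then write (do-iterate y 3 (display y)) without ever seeing iterate.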
A new shapes module can use do-iterate to define
draw-square! and draw-circle! functions:
|
(module shapes
|
At this point, both shapes and canvas use the
iteration module. The iteration module will be
executed only once.
We can use the shapes module in a new face program:
|
(module face2
|
Again, both the face2 and shapes modules use the
canvas module, which means that they access a single canvas
datatype. Consequently, the face2 module can apply the
functions from shapes to the result of new-canvas
from canvas.
Instead of having to import both canvas and shapes,
a programmer writing a shape-drawing program might prefer to import a
single shapes-canvas module. We can implement such a module
by importing both canvas and shapes, and then
re-exporting all of the definitions:
|
(module shapes-canvas
|
The all-from form is one of many possible forms within
provide. Other forms allow renaming on exports, and
exporting all names from an imported module except for designated
exceptions. The require declaration similarly supports a
variety of forms, including a form that prefixes all of the names
imported from a module.
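The re-exporting module and a prefixed import might be sketched as follows. Both modules assume that "canvas.scm" and "shapes.scm" exist as described earlier; the module name prefixed-face and the cv: prefix are hypothetical names chosen for this example.

```scheme
;; shapes-canvas.scm (illustrative sketch)
(module shapes-canvas mzscheme
  (require "canvas.scm" "shapes.scm")
  ;; Re-export every binding imported from each module.
  (provide (all-from "canvas.scm")
           (all-from "shapes.scm")))

;; prefixed-face.scm (hypothetical client, kept in its own file)
(module prefixed-face mzscheme
  ;; Every name imported from canvas gets the cv: prefix.
  (require (prefix cv: "canvas.scm"))
  (define c (cv:new-canvas 3 3))
  (cv:draw-dot! c 1 1)
  (cv:display-canvas c))
```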
A revised face2 module can use shapes-canvas instead
of shapes and canvas:
|
(module face2
|
So far, all of our examples assume that the module-declaring files
exist in a single directory. In fact, module filenames can be
relative paths; for portability, relative module paths always use
Unix notation (regardless of the execution platform): "/" is the
path separator, and ".." accesses a parent directory.
Relative paths allow a programmer to provide other programmers with a group of related modules, but module programmers also need a way to refer to common library modules without hard-coding an exact path to the libraries.
When a module name is of the form (lib file-name collection-name), the module loader finds the directory for
collection-name, and then loads file-name from that
directory. The directory for collection-name is determined
by searching a set of collection directories determined by the PLTCOLLECTS environment variable. The default collection directory
is the "collects" directory of the PLT Scheme distribution.
PLT Scheme is distributed with several library collections. For
example, the "net" collection contains an "smtp.ss" library
that provides functions for sending e-mail through an SMTP server,
and the "head.ss" library provides functions for constructing
e-mail headers. The following program uses the "smtp.ss" and
"head.ss" libraries to send a simple message:
|
(module mail-to-matthew
|
One collection distributed with PLT Scheme is special: the default
collection-name for a lib form is "mzlib".
The "mzlib" libraries include a class system, a
component system, and pattern-matching utilities. See PLT MzLib: Libraries Manual
for more information.
Of course, PLT is not the only supplier of libraries. Other library
suppliers are encouraged to distribute libraries in collection
form. The collection system is hierarchical, so that the SuperSoft
library supplier, for example, might distribute all of its libraries
in sub-collections of a "super-soft" collection. To support
hierarchical collections, the general syntax of a lib
directive is (lib file-name collection-name ...), where
the first collection-name is a root collection (found
through the current collection path), and each successive
collection-name enters a sub-collection.
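The variants of the lib form can be compared side by side in a small sketch. The module name lib-examples is hypothetical, and the "super-soft" paths in the comments are invented to match the SuperSoft example above; only the "list.ss" import is actually used.

```scheme
;; lib-examples.scm (illustrative sketch)
(module lib-examples mzscheme
  ;; No collection name: defaults to "mzlib", so this is the same
  ;; as (lib "list.ss" "mzlib").
  (require (lib "list.ss"))

  ;; One collection name selects a root collection, for example:
  ;;   (require (lib "smtp.ss" "net"))
  ;; Further names enter sub-collections (hypothetical supplier):
  ;;   (require (lib "widgets.ss" "super-soft" "gui"))

  ;; filter comes from the "list.ss" library in "mzlib".
  (display (filter even? '(1 2 3 4)))   ; prints (2 4)
  (newline))
```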
Going back to "shapes-canvas.scm", we can provide even more
convenience to programmers by supporting modules written in a
shapes-canvas-lang language, instead of the mzscheme
language. The first step is to define shapes-canvas-lang as
exporting everything from mzscheme, as well as from
shapes and canvas:
|
(module shapes-canvas-lang
|
The second step is to use "shapes-canvas-lang.scm" in the
language position of a module, which is actually an import that
provides the initial bindings for the module body:
|
(module face2 "shapes-canvas-lang.scm" ; <---- changed ----
  (define c (new-canvas 10 10))
  (draw-circle! c 5 5 5)
  (draw-dot! c 4 4)
  (draw-dot! c 6 4)
  (draw-dot! c 5 5)
  (display-canvas c))
|
The ``language'' import for a module can't be written with
require, because even the require form must be
provided by some language (usually mzscheme).
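A language module of this kind might be sketched as follows, assuming "canvas.scm" and "shapes.scm" exist as described earlier. The key point is that it re-exports all of mzscheme, so a module written in this language still has define, lambda, require, and the rest.

```scheme
;; shapes-canvas-lang.scm (illustrative sketch)
(module shapes-canvas-lang mzscheme
  (require "canvas.scm" "shapes.scm")
  ;; Everything a client module body can use must be provided here,
  ;; including the core forms that come from mzscheme.
  (provide (all-from mzscheme)
           (all-from "canvas.scm")
           (all-from "shapes.scm")))
```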
In fact, the module system does not force any a priori syntactic structure on the module's body, other than the lexical structure of S-expressions. The initial syntax is determined entirely by the initial import, all the more because PLT Scheme's macro system allows an import to define the meaning of application forms, literal constants, and the module body as a whole. Thus, a programmer can create a base language for a module that is quite different from standard Scheme.
For example, the "infotab.ss" module of the "setup"
collection defines a language used by "info.ss" files in other
collections. The "info.ss" file is special: tools such as
DrScheme expect "info.ss" to provide information about a
collection, such as the collection's name and whether the collection
defines a plug-in tool for DrScheme.
Naturally, Scheme would be a convenient language for defining
"info.ss" entries (i.e., better than a mere association list):
(define name "Canvas")
(define help-blurb
(list name " provides functions for generating ASCII art"))
However, arbitrary Scheme code might go wrong (e.g., loop forever) as
DrScheme looks for plug-in tools or help blurbs. To avoid such
problems, DrScheme requires "info.ss" files to contain a module
implemented in the (lib "infotab.ss" "setup") language;
DrScheme won't load an "info.ss" file that has any other shape.
Implementing the "infotab.ss" language requires a more flexible
macro language than syntax-rules. We can at least prevent
loops with syntax-rules, however, because PLT Scheme's macro
expander treats every function call as an implicit use of the
#%app form. To prevent loops in a simple-info
language, we can replace mzscheme's #%app with one
that restricts function calls to direct calls of a few well-behaved
primitives:
|
(module simple-info
|
Then, the following info module can be loaded:
|
(module info "simple-info.scm"
(define name "Canvas")
(define help-blurb
(
|
However, the following bad-info module is syntactically
rejected, because it contains an illegal application:
|
(module bad-info "simple-info.scm"
  (define (loop) (loop))
  (loop))
|
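One way to build a simple-info language along these lines is sketched below. The macro name simple-app and the choice of list and string-append as the permitted primitives are assumptions for illustration; the original listing may differ.

```scheme
;; simple-info.scm (illustrative sketch)
(module simple-info mzscheme
  ;; Replacement for #%app: only direct calls to list or
  ;; string-append match a clause. Any other application, such as
  ;; (loop), matches nothing and is rejected at expansion time.
  (define-syntax simple-app
    (syntax-rules (list string-append)
      ((_ list arg ...) (#%app list arg ...))
      ((_ string-append arg ...) (#%app string-append arg ...))))

  ;; Export all of mzscheme except its #%app, and export simple-app
  ;; under the name #%app so that it governs every function call
  ;; written in a client module.
  (provide (all-from-except mzscheme #%app)
           (rename simple-app #%app)))
```

Inside simple-info itself, #%app still refers to mzscheme's version, so the calls in the macro templates expand normally. Arguments written in a client module are checked by the same rule, since each nested application is expanded through the replacement #%app.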
The do-iterate macro from iteration is convenient
when used properly, but a misuse can trigger a confusing error
message. For example, if we accidentally use a number in place of the
loop variable, we get an error message from lambda:
> (do-iterate 6 i (printf "~a~n" i))
lambda: not an identifier in: 6
The programmer using do-iterate should not have to know that
it is implemented with lambda.
Implementing a macro that provides a better error message requires more
than syntax-rules. In addition to syntax-rules, PLT
Scheme provides syntax-case,4 which combines the
pattern-matching convenience of syntax-rules with the
expressiveness of Scheme. The main difference between
syntax-rules and syntax-case is that a pattern in
syntax-case is followed by arbitrary code to be executed at
expansion time. Within that code, a syntax form explicitly
introduces a template.
Using syntax-case, we can re-implement do-iterate as
follows:
|
(module iteration
|
With the new implementation, we receive an appropriate error message
for the ill-formed do-iterate expression:
> (do-iterate 6 i (printf "~a~n" i))
do-iterate: expected an identifier, found something else in: 6
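A syntax-case version of do-iterate that produces this message might look like the sketch below. The pattern shape (loop variable, count, body) is an assumption, as in the earlier sketches, and the original listing may be organized differently.

```scheme
;; iteration.scm, syntax-case version (illustrative sketch)
(module iteration mzscheme
  ;; Private: apply f to 0, 1, ..., n-1 in order.
  (define (iterate f n)
    (let loop ((i 0))
      (when (< i n)
        (f i)
        (loop (+ i 1)))))

  (define-syntax (do-iterate stx)
    (syntax-case stx ()
      ((_ i n body ...)
       (begin
         ;; This check runs at expansion time, before any template
         ;; is produced.
         (unless (identifier? (syntax i))
           (raise-syntax-error
            #f
            "expected an identifier, found something else"
            stx
            (syntax i)))
         ;; The syntax form introduces the template explicitly.
         (syntax (iterate (lambda (i) body ...) n))))))

  (provide do-iterate))
```

Passing #f as the first argument lets raise-syntax-error take the name do-iterate from the offending expression.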
The error message could be even better if, instead of reporting that
``something else'' was found, the error message reported specifically
that a number was found. Such detailed reporting would be too much
work for just do-iterate, but if we already had an
identify function that describes common syntactic elements,
then we'd use it. Suppose that we do have such a function, in its own
module:
|
(module identify
|
We can use identify to create an especially clear
error message:
(unless (identifier? (syntax i))
  (raise-syntax-error
   #f
   (format "expected an identifier, found ~a" (identify (syntax i)))
   stx
   (syntax i)))
To make identify available to our macro implementation, we do
not use require. A require declaration imports
functions into a module that are to be used when the module is
executed, which implies that the required module is also executed at
run time. In the case of identify, we want the
identify module to be executed at expansion time, when
do-iterate expressions are expanded (instead of run time,
when the iterate function is executed).
This phase distinction is important to compilers and syntax checkers,
which need to distinguish code that must be executed to expand a
program from code that must not be executed until the program is
run. Since module enforces the phase distinction,
module-based code that runs in an interpreter will also compile
reliably, without extra instructions to the compiler.5
Instead of require, functions imported for use at expansion
time must be imported with require-for-syntax, as shown in
the following revision of the iteration module:
|
(module iteration
|
With this version of do-iterate, the error message is
especially informative:
> (do-iterate 6 i (printf "~a~n" i))
do-iterate: expected an identifier, found a number in: 6
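The two modules involved might be sketched as follows, each in its own file. The cases handled by identify and its wording are assumptions for illustration; only the "a number" case is confirmed by the transcript above.

```scheme
;; identify.scm (illustrative sketch)
(module identify mzscheme
  ;; Describe a syntax object in words.
  (define (identify stx)
    (let ((v (syntax-e stx)))
      (cond
        ((symbol? v) "an identifier")
        ((number? v) "a number")
        ((string? v) "a string")
        ((boolean? v) "a boolean")
        (else "something else"))))
  (provide identify))

;; iteration.scm (illustrative sketch)
(module iteration mzscheme
  ;; identify is needed while expanding do-iterate, not while
  ;; running iterate, so it is imported for expansion time.
  (require-for-syntax "identify.scm")

  (define (iterate f n)
    (let loop ((i 0))
      (when (< i n)
        (f i)
        (loop (+ i 1)))))

  (define-syntax (do-iterate stx)
    (syntax-case stx ()
      ((_ i n body ...)
       (begin
         (unless (identifier? (syntax i))
           (raise-syntax-error
            #f
            (format "expected an identifier, found ~a"
                    (identify (syntax i)))
            stx
            (syntax i)))
         (syntax (iterate (lambda (i) body ...) n))))))

  (provide do-iterate))
```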
A module is allowed to import another module at both expansion time
and run time. Indeed, the mzscheme language inserts
(require-for-syntax mzscheme) into the body of any module
using the mzscheme language, which is why
do-iterate's implementation can use functions like
identifier? and raise-syntax-error. Consequently,
the mzscheme module is always executed in both phases for
any module in the mzscheme language.
For each phase in which a module is executed, the module is instantiated afresh. No state is shared across distinct instantiations of a module in distinct phases, even if the phases happen to be executed in a single run of the PLT Scheme system. Enforcing the separation ensures that a program will continue to run when it is compiled today, then executed tomorrow on a different machine.
Beware: the original syntax-case macro system is somewhat
complex. When mixed with module scope and module phases, it becomes
considerably more complex. Furthermore, require-for-syntax
chains can create nested phases to an arbitrary depth. See
PLT MzScheme: Language Manual for details.
The module form is for namespace management. It constrains
the scope of definitions, and it declares explicitly the constructs
and bindings used to implement a module's code.
The module form does not support abstractions over a
set of definitions, at least not directly. For example, although the
face2 module's implementation is independent of the
implementation details of new-canvas and
draw-circle!, face2 nevertheless works only with the
implementation that originates from the canvas and
shapes modules specifically.
To define a face component that can be linked to any canvas
implementation, we can use the unit/sig form, which is
defined by "unitsig.ss" in the "mzlib" collection:
|
(module face-unit
|
The face-unit module exports a face@ component that
must be linked to an implementation of the shapes-canvas^
signature. (The characters @ and ^ are not
special; the @ in face@ and the ^ in
shapes-canvas^ are merely naming conventions for identifiers
bound to components and signatures, respectively.)
The shapes-canvas^ signature is defined in its own module:
|
(module shapes-canvas-sig
|
The signature-defining module shapes-canvas-sig is used by
both face-unit and by any module that defines an
implementation of the signature. Only the signature module is needed
to compile face@.
Unlike modules, units can have mutual dependencies: A@ can
import from B@, while unit B@ simultaneously
imports from A@. So, another potential reason to use
unit/sig (in addition to module) is to implement
separate program fragments that have mutual dependencies.
Although module and unit/sig organize code at
roughly the same granularity, the features needed in a namespace
management system are quite different from those needed in a
component system. More generally, module provides an
expansion and compilation foundation on which new programming
constructs can be built, including constructs for implementing
components.
PLT MzScheme: Language Manual provides complete details on the module system, including information about dynamically loading modules, the effect of redeclaring modules, and the way that modules interact with the top-level environment.
1 Even better, DrScheme provides a module language that expects a single module declaration in the definitions window. Clicking Execute makes the module's definitions available in the interactions window. In addition, the module's private definitions are made available in the interactions window, since DrScheme's module language is intended for debugging modules.
2 Although
require declarations can appear anywhere in a module,
they normally appear at the beginning. The order can be significant;
if a macro expands to a definition, a require
declaration, or a provide declaration, the macro must
be imported before its use.
3 It's possible to use the load procedure on
"face.scm", which would declare a module named face in
the top-level environment, but the face module would not be
executed; after face is declared, (require face)
would execute it. Nevertheless, using (require "face.scm")
is preferable to (load "face.scm") followed by
(require face), because (require "face.scm") does
not declare a module named face in the top-level
environment, where it might conflict with modules defined by other
programmers. Instead, (require "face.scm") invents a
top-level name for the module based on the full path of the source
file, so that module declarations from different programmers never
collide.
4 Dybvig, Hieb, and Bruggeman, ``Syntactic abstraction in Scheme'' in Lisp and Symbolic Computation, December 1993.
5 In
particular, programmers need not annotate programs with fragile
eval-when or begin-elaboration-time annotations.