At this point we are ready to put everything together and write the most
useful parentheses-addition program. The first improvement to our program
is that it uses an empty default CONVERSION-TABLE; the second
improvement is that the shell script accepts a command-line argument: a
file that contains the table.
Naturally we will specify the table as a data file with parentheses, so that the script can obtain it with read, e.g.,
(("," " comma ") (";" " semi ") ("\\\\" "\\\\\\\\"))
Thus, assuming that this table is in a file called table, we want to type
> ./addparens -f table < pre1958-grades.dat
and get the following output:
( ( Adam 78 comma 88 comma 69 comma 66) ( Brad 88 comma 87 comma 86 comma 22) ( Carr 99 comma 88 comma 88 comma 90) ( Dave 77 comma 78 comma 77 comma 78) ( Fawn 90 comma 89 comma 81 comma 60) ( Gege 67 comma 78 comma 81 comma 85) )
That is, the program adds parentheses and performs all the substitutions specified in table. In this case, the output contains " comma " instead of "," and no other conversions apply.
To add this new functionality, we modify the script since the extension concerns a command-line argument:
#!/bin/sh
string=? ; exec mzscheme -g -l mzlib.ss -r $0 "$@"
(load "addparens3.ss")
(define MinusF "-f")
(cond
  ((= (vector-length argv) 0) (void))
  ((and (= (vector-length argv) 2)
        ;; now we know that two arguments were passed on the command line
        (string=? (vector-ref argv 0) MinusF))
   (set! CONVERSION-TABLE (call-with-input-file (vector-ref argv 1) read)))
  (else (error 'addparens "bad format")))
(add-parens-to-file)
The revised script acts as before if no command-line arguments are present. If the command line specifies a table via "-f <filename>", the script must change the default conversion table to the one in the specified file. In all other cases, we signal an error.
The Scheme function call-with-input-file is the only new element
in the revised script. Since the table is specified via a command-line
argument as a filename, we cannot use plain read to get hold of
the table. Instead we connect the file with read via the function
call-with-input-file. Roughly speaking,
(call-with-input-file "file.dat" read)
makes read take its data from "file.dat" instead of the
standard input.
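To make the connection concrete, here is a small self-contained sketch that is not part of the addparens script itself. It writes a table to a scratch file and then reads it back with call-with-input-file. The file name table-demo.tmp and the names demo-file, demo-table, and demo-table-2 are made up for this illustration, and the sketch assumes a Scheme that provides file-exists? and delete-file, as MzScheme does.

```scheme
;; Hypothetical scratch file, used only for this demonstration.
(define demo-file "table-demo.tmp")

;; Start from a clean slate so that writing the file cannot fail
;; because it already exists.
(cond ((file-exists? demo-file) (delete-file demo-file)))

;; Write one S-expression, the table, into the file.
(call-with-output-file demo-file
  (lambda (port)
    (write '(("," " comma ") (";" " semi ")) port)))

;; call-with-input-file opens the file, hands the resulting port to
;; read, closes the port, and returns whatever read produced.
(define demo-table (call-with-input-file demo-file read))

;; Passing read directly is shorthand for this explicit version.
(define demo-table-2
  (call-with-input-file demo-file
    (lambda (port) (read port))))

;; Remove the scratch file again.
(delete-file demo-file)
```

Both demo-table and demo-table-2 hold the list that was written to the file, because read consumes exactly one parenthesized datum from the port it is given.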
Exercise 3.0.4. Add the -t option to our script. With the -t option, a script user can specify a single entry of the conversion table on the command line. For example,
> ./addparens -f table < pre1958-grades.dat
could be replaced with
> ./addparens -t "," " comma " < pre1958-grades.dat
and would still produce the same output as above.
Last, but not least, we can also start worrying about the validity of
arguments to our programs, especially if we want to make them available to
our friends. Consider the addparens program. First, we should check
that the command-line arguments are of the proper shape. At the moment, we
only know that the first of two arguments is "-f". We should also
check that the second one is the name of an existing file. To this end we
can use
(file-exists? (vector-ref argv 1))
The Scheme function file-exists? consumes a string and returns
#t if, and only if, the string names an existing file.
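One way to combine this check with the existing tests is sketched below. The helper name minus-f-with-file? is an invention for this illustration and does not appear in the script; it takes a vector shaped like argv so that it can be tried out without a command line.

```scheme
;; minus-f-with-file? : vector-of-strings -> boolean
;; (hypothetical helper, not part of the original script)
;; Is args of the shape #("-f" <name-of-an-existing-file>)?
(define (minus-f-with-file? args)
  (and (= (vector-length args) 2)
       ;; from here on we know that there are exactly two arguments
       (string=? (vector-ref args 0) "-f")
       ;; the second argument must name a file that exists
       (file-exists? (vector-ref args 1))))
```

In the script, (minus-f-with-file? argv) could take the place of the and expression in the second cond clause. A missing file then falls through to the else clause and produces the "bad format" error, rather than a failure inside call-with-input-file.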
Second, we should also check that the file specifies a table, i.e., a list
of pairs of strings. The change to the program is again straightforward:
see figure 14. Instead of just assigning the value
of (call-with-input-file "file.dat" read) to
CONVERSION-TABLE, we first check that it is a list and that each
element in the list is a list with two strings.
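The check described here might look roughly as follows. This is a sketch under the stated description only and is not necessarily the code of figure 14; the names table-entry?, conversion-table?, and read-conversion-table are assumptions made for this illustration.

```scheme
;; table-entry? : any -> boolean
;; Is x a list that contains exactly two strings?
(define (table-entry? x)
  (and (list? x)
       (= (length x) 2)
       (string? (car x))
       (string? (cadr x))))

;; conversion-table? : any -> boolean
;; Is x a list whose elements are all table entries?
(define (conversion-table? x)
  (and (list? x)
       (let loop ((entries x))
         (or (null? entries)
             (and (table-entry? (car entries))
                  (loop (cdr entries)))))))

;; read-conversion-table : string -> conversion table
;; Reads one datum from the named file and returns it if it is a
;; well-formed table; otherwise signals an error.
(define (read-conversion-table filename)
  (let ((candidate (call-with-input-file filename read)))
    (if (conversion-table? candidate)
        candidate
        (error 'addparens "file does not contain a conversion table"))))
```

With these definitions, the assignment in the script could become (set! CONVERSION-TABLE (read-conversion-table (vector-ref argv 1))). An empty file is rejected as well, because read then produces an end-of-file object, which is not a list.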
In general, we should develop error-checking code for all data that enters
our computation through read. For many problems, this step is easy
because we can get away with the simple input techniques we have developed
here. On occasion we may need to process more complex forms of text. In
that case, we must study more general parsing techniques as typically
taught in compiler courses.