Sunday, August 15, 2010

Tcl and Astronomy

SAOTk and TclXPA: Tcl/Tk Extensions for Astronomy

W. Joye, E. Mandel

Smithsonian Astrophysical Observatory, Cambridge, MA 02138

Abstract:

SAOTk is an integrated set of Tcl/Tk canvas widgets for astronomical imaging and data visualization. The widget set is composed of the Frame, Panner, Magnifier, and Colorbar widgets. In addition to "classical" support for imaging FITS data, manipulating colormaps, region marking, coordinate readout (including WCS), etc., SAOTk widgets also support arbitrary image scaling and rotation, advanced PostScript printing, truecolor graphics support, and image mosaics.

TclXPA is a Tcl package that implements both the client and server aspects of the XPA messaging system within Tcl. TclXPA allows a Tcl program to exchange data and commands with Tcl programs, Xt programs, or Unix programs.

SAOTk and TclXPA are being used in a wide variety of astronomical applications. These applications include general visualization and analysis, real-time instrumentation and calibration, and interactive modeling. These Tcl extensions can be utilized in any standard Tcl/Tk environment to build custom data analysis and visualization applications.

1. Introduction

A common denominator among astronomical software is the need to display and manipulate FITS data. In many applications, such as proposal submission tools, survey and catalog tools, mission planning, instrument calibration, or general analysis, users need to visualize FITS data. But to develop high-quality astronomical visualization software, one almost must make a career of it. The developer must become an expert on world coordinate systems, graphics hardware, 2D and 3D computer graphics concepts, the FITS file standard, legacy software and file formats, and interprocess communications.

One of the strengths of Tcl/Tk is the concept of comprehensive, reusable, industrial-strength widgets. Examples include the Canvas and Text widgets. Because such widgets are complete and self-contained, they can be integrated into fully-featured applications with minimal buy-in. The SAO/HEAD Research and Development group has developed a set of comprehensive, reusable, industrial-strength Tk widgets for astronomical imaging and data visualization, called SAOTk. We also have developed interprocess communications support that encompasses the Tcl environment, called TclXPA.

2. Goals

The SAOTk and TclXPA projects have had several goals in mind during development. The first goal is minimal buy-in. This means that there should be no installation required. No auxiliary files, libraries, or scripts should be needed. The widgets and libraries should be easy to use and encourage rapid prototyping. Finally, it should be possible to build complete, stand-alone applications using these tools.

The second goal of these projects is support for a wide variety of hardware platforms. Newer machines provide fast CPUs, vast amounts of memory, and truecolor graphics. Older equipment consists of slower CPUs, limited memory, and pseudocolor graphics. The widget set must allow users to take advantage of the new capabilities, while not excluding the older, more limited equipment.

The third goal is cross-platform compatibility. The widget set must run on a wide variety of software platforms, including Solaris, Linux, and Windows.

Finally, the fourth goal is support for very large data sets. With a number of new mosaic CCD ground-based instruments and space-based observatories coming on-line in the near future, data sets of 0.5 GB to 1 GB and larger are becoming common.

3. SAOTk, TclXPA and Tcl/Tk

When designing SAOTk and TclXPA, it was decided that no alterations of Tcl/Tk should be necessary. These packages use the existing Tcl/Tk API. The SAOTk widgets are implemented as Tk Canvas widgets because the latter provides support for both image data and line graphics, flexible layout management and event processing. The TclXPA package is implemented as Tcl commands. Both packages conform to Tcl/Tk package initialization procedures.

In applications built with SAOTk, TclXPA, and Tcl/Tk, Tk is used to define the GUI, while SAOTk and TclXPA (along with other widget sets, such as BLT and TkTable) provide the functionality. Stand-alone applications are created by use of ET or Mktclapp.

4. SAOTk: Tk Imaging Widgets

The SAOTk widget set consists of four widgets: Frame, Colorbar, Magnifier, and Panner. SAOTk provides support for 8-bit pseudocolor and 8-bit, 16-bit, and 24-bit truecolor environments.

The Frame widget supports display of FITS images, FITS binary tables, FITS mosaic images, and raw data arrays. The FITS keywords BLANK, BZERO, and BSCALE are supported. Data may be accessed via the file system, standard I/O, or Tcl Channels. Memory management via mmap, allocated memory, or shared memory is also supported.
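The role of the BLANK, BZERO, and BSCALE keywords can be sketched in a few lines of Python. The FITS standard defines the physical value of an integer pixel as BZERO + BSCALE × stored value, with BLANK naming the stored integer that marks an undefined pixel. The function name and the use of None for undefined pixels are illustrative choices and are not taken from SAOTk.

```python
from typing import Iterable, List, Optional


def scale_pixels(stored: Iterable[int],
                 bscale: float = 1.0,
                 bzero: float = 0.0,
                 blank: Optional[int] = None) -> List[Optional[float]]:
    """Convert stored FITS integer pixels to physical values.

    physical = BZERO + BSCALE * stored. Pixels equal to BLANK are
    undefined and are returned as None (an illustrative choice).
    """
    result: List[Optional[float]] = []
    for value in stored:
        if blank is not None and value == blank:
            result.append(None)
        else:
            result.append(bzero + bscale * value)
    return result
```

For example, unsigned 16-bit data is conventionally stored as signed integers with BZERO = 32768, so `scale_pixels([-32768, 0, 32767], bzero=32768.0)` returns `[0.0, 32768.0, 65535.0]`.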

The Frame widget allows arbitrary rotation and zooming of the displayed data. Render time is based on the size of the window, not the size of the data. Scaling, byte-swapping, and binning are all implemented on the fly.
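One way to obtain a render time that depends on the window rather than on the data is to loop over the window pixels and map each one back into the data array. The Python sketch below illustrates that idea with nearest-neighbor sampling. It is not SAOTk's actual implementation, and every name in it is hypothetical.

```python
import math
from typing import List, Sequence


def render(data: Sequence[Sequence[float]],
           win_w: int, win_h: int,
           zoom: float = 1.0,
           angle_deg: float = 0.0,
           fill: float = 0.0) -> List[List[float]]:
    """Fill a win_w x win_h window from `data` with zoom and rotation.

    Each window pixel is visited exactly once, so the cost is
    O(win_w * win_h) no matter how large `data` is.
    """
    rows, cols = len(data), len(data[0])
    cos_a = math.cos(math.radians(angle_deg))
    sin_a = math.sin(math.radians(angle_deg))
    # Centers of the window and of the data array.
    wcx, wcy = (win_w - 1) / 2.0, (win_h - 1) / 2.0
    dcx, dcy = (cols - 1) / 2.0, (rows - 1) / 2.0

    out = []
    for wy in range(win_h):
        line = []
        for wx in range(win_w):
            # Inverse transform: undo the zoom, then undo the rotation.
            dx, dy = (wx - wcx) / zoom, (wy - wcy) / zoom
            sx = cos_a * dx + sin_a * dy + dcx
            sy = -sin_a * dx + cos_a * dy + dcy
            ix, iy = int(round(sx)), int(round(sy))
            inside = 0 <= ix < cols and 0 <= iy < rows
            line.append(data[iy][ix] if inside else fill)
        out.append(line)
    return out
```

Because only the pixels that land in the window are ever read, a 512 x 512 view of a mosaic costs the same to draw whether the underlying array holds a million pixels or a billion.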

Support for world, image, physical, and IRAF mosaic coordinate systems is provided. The widget supports standard region shapes (circle, ellipse, box, etc.), region properties (color, font, text, etc.) and region file formats (SAOtng, PROS, SAOimage). True PostScript printing is provided with both level 1 and level 2 drivers, with options such as paper size and print resolution.

The Colorbar widget manages the color environment. It has 20 built-in colormaps, and support for external colormap formats from SAOtng, SAOimage, and RTD/Skycat. The Panner widget provides panning functionality using a panner bounding box, along with an image/wcs compass. The Magnifier widget provides a close up view of the current cursor position and can be configured with a pixel cursor.

5. TclXPA: Public Access to Data and Algorithms

The XPA messaging system provides seamless communication between many kinds of Unix programs, including Tcl/Tk programs and X programs. It also provides an easy way for users to communicate with XPA-enabled programs by executing XPA client commands in the shell or by utilizing such commands in scripts. Because XPA works both at the programming level and the shell level, it is a powerful tool for unifying any analysis environment: users and programmers have great flexibility in choosing the best level or levels at which to access XPA services, and client access can be extended or modified easily at any time.

A program becomes an XPA-enabled server by defining named points of public access through which data and commands can be exchanged with other client programs (and users). Using standard TCP sockets as a transport mechanism, XPA supports both single-point and broadcast messaging to and from these servers. It supports direct communication between clients and servers, or indirect communication via an intermediate message bus emulation program. Host-based access control is implemented, as is the ability to communicate with XPA servers across a network.
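The ideas in this paragraph (named access points, single-point versus broadcast messaging, and host-based access control) can be modeled in a few lines of Python. This is an in-process model only: it leaves out the TCP transport entirely, it does not use the XPA library, and all class and function names are hypothetical.

```python
from typing import Callable, Dict, Iterable, List, Set


class AccessPointServer:
    """In-process stand-in for a server that exposes named access points."""

    def __init__(self, allowed_hosts: Set[str]) -> None:
        self.allowed_hosts = allowed_hosts
        self._points: Dict[str, Callable[[str], str]] = {}

    def register(self, name: str, handler: Callable[[str], str]) -> None:
        """Define a named access point served by `handler`."""
        self._points[name] = handler

    def has(self, name: str) -> bool:
        return name in self._points

    def request(self, host: str, name: str, payload: str = "") -> str:
        """Deliver one request, applying host-based access control first."""
        if host not in self.allowed_hosts:
            raise PermissionError(f"host {host!r} is not allowed")
        if name not in self._points:
            raise KeyError(f"no access point named {name!r}")
        return self._points[name](payload)


def broadcast(servers: Iterable[AccessPointServer], host: str,
              name: str, payload: str = "") -> List[str]:
    """Send the same request to every server that defines `name`."""
    return [s.request(host, name, payload) for s in servers if s.has(name)]
```

A single-point message is a call to `request` on one server, while `broadcast` delivers the same message to every server that has registered the named access point.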

XPA client and server subroutines from the standard XPA library can be used to add C-based XPA support to Tcl/Tk programs. Once an XPA service has been defined, it can be added to the Tcl event loop (either vwait() or the Tk event loop) simply by calling the XPATclAddInput() routine before entering the loop. However, the TclXPA package goes beyond support for C-based XPA calls within Tcl. This package allows one to write XPA servers and to make XPA client calls within the Tcl environment, using the Tcl language directly. When the TclXPA package is loaded, Tcl versions of all XPA routines are available for communication with other XPA-enabled programs, including X, Tcl, and Unix programs. The TclXPA interface has been designed to match the Unix XPA interface as nearly as possible, so that programmers can use one standard interface for interprocess communication.

6. Applications

One of the first applications to be developed using the SAOTk and TclXPA packages is DS9. DS9 is an astronomical imaging and data visualization application. DS9 supports FITS images and binary tables, multiple frame buffers, region manipulation, many scale algorithms and colormaps, and easy communication with external analysis tasks. It is highly configurable and extensible.

DS9 is a stand-alone application. It requires no installation or support files. Versions of DS9 currently exist for Sun Solaris, Linux, and Windows. All versions and platforms support a consistent set of GUI and functional capabilities.

DS9 supports advanced features such as multiple frame buffers, mosaic images, tiling, blinking, geometric markers, colormap manipulation, scaling, arbitrary zoom, rotation, pan, and a variety of coordinate systems. DS9 also supports FTP and HTTP access.

The GUI for DS9 is user-configurable. GUI elements such as the coordinate display, panner, magnifier, horizontal and vertical graphs, button bar, and colorbar can be configured via menus or the command line.

7. Conclusion

More information about the SAOTk widgets, DS9, and XPA (including our latest software offerings) can be found on the World Wide Web at:
http://hea-www.harvard.edu/RD/

Acknowledgments

This work was performed under a grant from NASA's Applied Information System Research Program (NAG5-3996), with additional support from the AXAF Science Center (NAS8-39073).

Tcl and Nanotechnology

NanoDesign: Concepts and Software for a Nanotechnology Based on Functionalized Fullerenes

Al Globus, MRJ, Inc. at NASA Ames Research Center and Richard Jaffe, NASA Ames Research Center.


Introduction

Eric Drexler [Drexler 92a] has proposed a hypothetical nanotechnology based on diamond, and there is informed speculation that this technology could have tremendous aerospace applications [Globus 96b]. Unfortunately, no one knows how to build diamondoid components in the laboratory. To gain the benefits of nanotechnology, a more accessible chemical basis is needed. We have chosen to investigate fullerene nanotechnology and develop software to support this work. Software development is at a very early stage. This paper is a status report, not an exposition of finished work.

Fullerene Nanotechnology

A nanotechnology based on fullerenes has been suggested by others. C60 and other cage-like fullerenes provide points, carbon nanotubes provide lines, and these can -- in principle -- be combined to create three dimensional objects. Since fullerenes can be functionalized by a wide variety of molecular fragments [Dresselhaus 96], a wide array of objects with many properties may be created. One measure of the accessibility of fullerenes is the number of patents that have been issued. Another is this email advertisement I received in September 1996 selling fullerenes by the gram.

The first systems we have investigated are various gears built out of single walled carbon nanotubes with o-benzyne groups attached to form the teeth. [Thess 96] has demonstrated a 70% yield in carbon nanotube production so the tube should be synthetically accessible, although [Thess 96] generated (10,10) tubes whereas most of our simulations use (14,0) tubes. [Hoke 92] has shown that benzyne reacts with C60 to form a 1-2 bond between six membered rings and quantum calculations [Jaffe 96a] suggest that a similar reaction should take place on carbon nanotubes, although 1-4 bonds are slightly preferred. Adding aromatic rings to the tube should give us relatively stiff molecular gear teeth, and this has proved to be the case [Han 96].
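To give a feel for the difference between the (10,10) tubes produced experimentally and the (14,0) tubes used in the simulations, the ideal tube diameter can be computed from the chiral indices. The Python sketch below uses the standard geometric formula for a rolled-up graphene sheet and assumes a carbon-carbon bond length of 0.142 nm. That bond length is an approximation and not a value taken from the NanoDesign force field.

```python
import math

CC_BOND_NM = 0.142  # approximate carbon-carbon bond length in graphene (assumed)


def nanotube_diameter_nm(n: int, m: int, bond_nm: float = CC_BOND_NM) -> float:
    """Ideal diameter of an (n, m) single-walled carbon nanotube.

    d = a * sqrt(n^2 + n*m + m^2) / pi, where a = sqrt(3) * bond length
    is the graphene lattice constant.
    """
    lattice = math.sqrt(3.0) * bond_nm
    return lattice * math.sqrt(n * n + n * m + m * m) / math.pi
```

Under these assumptions a (14,0) tube is about 1.10 nm across and a (10,10) tube about 1.36 nm, so the simulated tubes are somewhat narrower than the ones reported in [Thess 96].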

A typical gear configuration.

Using the NanoDesign design and simulation software described below, [Han 96] has shown that -- assuming you believe the force field -- a number of gear and gear/shaft systems will function mechanically in a vacuum. These simulations used a software thermostat and motor, but there is reason to believe that physical implementations of these functions can be provided. Preliminary simulations suggest that cooling is possible using an inert atmosphere. Experimental evidence (Sunny Bains reports in Science, volume 273, 5 July 1996, p. 36 on upcoming papers) and simulation [Tuzun 95] suggest that lasers may be used to rotate the gears. The tube is functionalized with positive and negative charges in appropriate locations and the lasers are used to create a rotating electric field.

Design Software

The simple molecular machines simulated so far can be easily designed and modeled using ad hoc software and molecule development. However, to design complex systems such as the molecular assembler/replicators envisioned by the NASA Ames Computational Molecular Nanotechnology Project [Globus 96b], a more sophisticated software architecture will be needed. The current NanoDesign software architecture is a set of c++ classes with a tcl front end for interactive molecular gear design. Simulation is via a parallelized FORTRAN program which reads files produced by the design system. We envision a future architecture centered around an object oriented database of molecular machine components and systems with distributed access via CORBA from a user interface based on a WWW universal client.

Current Software Architecture

Current NanoDesign software architecture.

The current system consists of a parallelized FORTRAN program to simulate hydrocarbon systems. Supramolecular conformations come from xyz files (the force field does not require a bond network in the input) produced by a c++ and tcl program using the tcl_c++ interface generator. The software also creates FORTRAN files with indices into an array of atoms indicating where each component (e.g., gear teeth) begins and ends. The user creates tcl files with tcl functions to create and modify c++ objects. For example, this tcl fragment creates a buckytube:

# create a buckytube
set tube [aBuckytube]
# it will be a (14,0) tube
$tube setRingCircumference 14
# make it 21 rings long
$tube setRingLength 21
# set the FORTRAN variable name for the tube
$tube setVariableName "tube"
# tell c++ to create the tube
$tube build
# write the conformation into a file
$tube writeXyz "tube.xyz"
# write the FORTRAN declarations and index assignments into a file
$tube writeFORTRANVariables "tube.f"

See here for details on the FORTRAN output.
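The two outputs described above, an xyz file of atom positions and a set of indices marking where each component begins and ends, can be sketched in Python as follows. The xyz layout (atom count, comment line, then one atom per line) is the common convention for that format. The function names and the 1-based index ranges are assumptions made for illustration and are not the actual NanoDesign output.

```python
from typing import Dict, List, Tuple

Atom = Tuple[str, float, float, float]  # element symbol and x, y, z


def assemble(components: Dict[str, List[Atom]]):
    """Concatenate components and record a 1-based index range for each."""
    atoms: List[Atom] = []
    ranges: Dict[str, Tuple[int, int]] = {}
    for name, part in components.items():
        first = len(atoms) + 1          # Fortran arrays start at 1 by default
        atoms.extend(part)
        ranges[name] = (first, len(atoms))
    return atoms, ranges


def to_xyz(atoms: List[Atom], comment: str = "") -> str:
    """Format atoms as xyz text: count line, comment line, one atom per line."""
    lines = [str(len(atoms)), comment]
    lines += [f"{el} {x:.4f} {y:.4f} {z:.4f}" for el, x, y, z in atoms]
    return "\n".join(lines) + "\n"
```

A simulation that receives both pieces can then address, say, all the atoms of one gear tooth as a contiguous slice of its atom array.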

tcl_c++

C++ was chosen for molecular design for its object oriented properties and high performance. However, c++ is a compiled language so changes to the code take a bit of time. This is inconvenient when designing molecular systems; an interpreted language would be better. Tcl is meant to be used as an embedded interpreted command language in c and c++ programs. Tcl [Ousterhout 94] is a full featured language with loops, procedures, variables, conditionals, expressions and other capabilities of procedural computer languages. C++ programs can add new tcl functions to any tcl interpreter linked in. Thus, tcl gives us an interpreted interface to the c++ class library so molecules can be designed at interactive rates. Note that both Cerius2 and Insight/Discover commercial computational chemistry packages use tcl for their command language.

The Visualization Toolkit project [Schroeder 96] discovered that a tcl interface to a large c++ class library can require substantial programmer effort to write the glue that allows tcl to control c++ classes. The vtk project avoided this by writing a partial c++ header file parser that reads the c++ header file for a class and automatically generates the tcl interface code. We wanted more control over which c++ member functions were tcl accessible, so the tcl_c++ system requires a file for each c++ class to define which member functions, variables, and constants are tcl accessible. This file is read by a tcl interpreter with tcl procedures defined to generate c++ code to allow another tcl interpreter to control the c++ class in question. Fortunately, although tcl_c++ itself was hard to program, it is easy and convenient for a programmer to use. For details of tcl_c++ see here.
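The approach of generating glue code from a per-class description can be illustrated with a small Python analogue. Here a specification lists the methods that should be reachable from the command language, a generator turns it into source code for a dispatcher, and only the listed methods can be called through that dispatcher. This is a sketch of the idea only: the real tcl_c++ system generates c++ code from its own file format, and every name below is hypothetical.

```python
from typing import Dict, List


def generate_glue(spec: Dict[str, object]) -> str:
    """Return Python source for a dispatcher exposing only the listed methods."""
    cls = spec["class"]
    methods: List[str] = list(spec["methods"])
    lines = [
        f"def dispatch_{cls}(obj, method, *args):",
        f"    allowed = {methods!r}",
        "    if method not in allowed:",
        f"        raise AttributeError('{cls} does not expose ' + method)",
        "    return getattr(obj, method)(*args)",
    ]
    return "\n".join(lines) + "\n"


class Buckytube:
    """Hypothetical stand-in for a design class."""

    def __init__(self) -> None:
        self.ring_length = 0

    def setRingLength(self, n: int) -> None:
        self.ring_length = n

    def internalRebuild(self) -> None:   # deliberately not exposed
        pass


# Generate the glue once, then load it, much as generated code is compiled in.
namespace: Dict[str, object] = {}
exec(generate_glue({"class": "Buckytube", "methods": ["setRingLength"]}), namespace)
dispatch = namespace["dispatch_Buckytube"]
```

The explicit list is what gives the control mentioned above: `internalRebuild` exists on the class but cannot be reached through the generated dispatcher.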

Proposed Future Software Architecture

Future distributed NanoDesign software architecture. Note that each box may represent many instances distributed onto almost any machine.

The current NanoDesign molecular design software appears to the user as an interpreted language based on tcl. This is very effective for design of simple parts and systems. To design and computationally test complex replicators will require a more sophisticated system similar to the mechanical CAD systems available in the commercial marketplace. Furthermore, it would be of substantial practical advantage if the design team could be geographically dispersed. Therefore, we are investigating a software architecture based on a universal client (for example, a WWW browser), CORBA distributed objects, an object oriented database, and encapsulated computational chemistry legacy software. We are also interested in using command language fragments to control remote objects. Software that communicates this way is sometimes called agents [Genesereth 94].

Universal Client

With the advent of modern WWW browsers implementing languages such as Java and JavaScript, it is possible to write applications using these browsers as the user interface. This saves development time since most user interface functionality comes free, integration with the WWW is trivial, and the better browsers run on a wide variety of platforms so portability is almost free. VRML can be used for 3D graphics and plug-ins such as the recently announced Biosym/MSI, Inc. molecule browser provide crucial functionality without much work.

Recently, Netscape, Inc. announced that the netscape WWW browser would be made CORBA (see below) compliant offering a standard way to communicate between application code loaded by the browser and databases and computational chemistry software resident on servers and supercomputers. Previously, only the stateless http protocol was available to web browsers. Hopefully, other companies in the extremely competitive WWW browser market will follow suit.

These developments suggest that a single program can function as the user interface for a wide variety of applications, including computational nanotechnology. These applications load software (e.g. Java applets and JavaScript) into the browser when the user requests it. The applications then communicate with databases and remote objects (such as encapsulated legacy software) to meet user needs.

CORBA (Common Object Request Broker Architecture)

The universal browser is of little use in developing complex molecular machines if it cannot communicate with databases of components and systems and invoke high performance codes on fast machines to do the analysis. CORBA, a distributed object standard developed by the OMG (Object Management Group), provides a means for distributed objects -- for example the universal browser application, a database containing an evolving molecular machine design, and simulation codes -- to communicate and control each other. The simplest description of CORBA is that each object is represented by an interface described by the CORBA IDL (interface description language). Operations and data defined in the IDL may be accessed by other CORBA objects on the network. System software (called ORBs -- object request brokers) is responsible for communicating between objects whether they be on the same machine or widely distributed. See [Siegel 96] for a description of CORBA.

Object Oriented Database

To develop complex molecular machines, databases of components and processes as well as complex databases describing individual systems will be required. Object oriented databases appear to be better than relational databases for design systems for products such as aircraft and molecular machines.

Encapsulated Computational Chemistry Legacy Software

Like most research centers, NASA Ames has a number of very capable codes that do not fit the object model. However, it is often possible to create a c++ object that 'encapsulates' the legacy software. That is, the c++ object has methods that reformat their parameters, execute the legacy software, reformat the result and return it. When the legacy software does IO, the encapsulating object must intervene between the legacy software and the CORBA system. This technique allows existing codes to operate within an object oriented framework with minimal modification.
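The three steps of encapsulation (reformat the parameters, execute the legacy software, reformat the result) can be sketched in Python, with a tiny stand-in program playing the role of the legacy code. The stand-in, the class name, and the text formats are all invented for this example. The CORBA layer is left out.

```python
import subprocess
import sys
from typing import List

# Stand-in for a legacy code: reads numbers from stdin, one per line,
# and prints "SUM = <value>" in a fixed text format.
LEGACY_SOURCE = (
    "import sys\n"
    "values = [float(line) for line in sys.stdin if line.strip()]\n"
    "print('SUM = %12.4f' % sum(values))\n"
)


class LegacySum:
    """Object wrapper around the legacy program (hypothetical example)."""

    def __init__(self, command: List[str]) -> None:
        self.command = command

    def total(self, values: List[float]) -> float:
        # 1. Reformat the parameters into the legacy input format.
        text = "".join(f"{v}\n" for v in values)
        # 2. Execute the legacy software, handling its I/O on its behalf.
        done = subprocess.run(self.command, input=text, text=True,
                              capture_output=True, check=True)
        # 3. Reformat the result and return it.
        return float(done.stdout.split("=")[1])


wrapper = LegacySum([sys.executable, "-c", LEGACY_SOURCE])
```

Callers see only `wrapper.total([...])`. The fact that a separate program with its own text formats does the work is hidden behind the method.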

Agent Style Communication

In this context, agent software means software components that communicate by sending programs to each other. When each component is controlled by a command language, this is relatively easy to implement. Thus, a user interface component could control the tcl/c++ design software by writing a tcl command file and sending it to the design software for execution. This approach to software is powerful but not yet well understood.
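A minimal Python sketch of this style of communication follows: one component builds a small script as text, and the receiving component executes it line by line with its own command interpreter. The command set (`set` and `get`) and the class name are hypothetical and far simpler than a real tcl interpreter.

```python
import shlex
from typing import Callable, Dict, List


class ScriptedComponent:
    """A component controlled entirely by small command scripts."""

    def __init__(self) -> None:
        self.state: Dict[str, str] = {}
        self.commands: Dict[str, Callable[..., str]] = {
            "set": self._set,
            "get": self._get,
        }

    def _set(self, name: str, value: str) -> str:
        self.state[name] = value
        return value

    def _get(self, name: str) -> str:
        return self.state[name]

    def execute(self, script: str) -> List[str]:
        """Run a received script line by line and return each result."""
        results = []
        for line in script.splitlines():
            words = shlex.split(line, comments=True)
            if not words:
                continue                      # blank line or comment
            results.append(self.commands[words[0]](*words[1:]))
        return results
```

The sender needs no compiled-in knowledge of the receiver beyond its command language, which is what makes the approach flexible and also what makes it hard to reason about.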

Conclusions

The NanoDesign software is intended to design and test fullerene based hypothetical molecular machines and components. The system is in an early stage of development. Presently, tcl provides an interpreted interface, c++ objects represent design components, and a parallelized FORTRAN program simulates the machine. In the future, an architecture based on distributed objects is envisioned. A key requirement for this vision is a standard set of interfaces to various computational chemistry capabilities (e.g., force fields, integrators, etc.). A standard set of interfaces would allow vendors to supply small, high quality components to a distributed system. If you're interested in helping establish these standards, please contact the author at globus@nas.nasa.gov.

Acknowledgments

I would like to thank my colleagues at the Molecular Engineering Laboratory at the University of California at Santa Cruz, led by Dr. Todd Wipke, for many fruitful discussions and an environment tremendously conducive to molecular design.

Sunday, August 8, 2010

TCL and Fortran

FTCL

The Ftcl library offers an easy-to-use set of subroutines and functions, callable from Fortran to run Tcl commands within a Fortran program or to create an extension in Fortran useable by Tcl programs. This manual page describes the use of these routines.

INITIALISATION

The library can be used in two ways:  
  • Interacting with Tcl from within a Fortran program
  • Creating an extension (package of functions, commands and so on) to Tcl, written in Fortran
In the first case, you will need to call the routine ftcl_start to create a Tcl interpreter and initialise it. During this initialisation phase, the usual start-up sequence is run, which means the Tcl routines will look for a useable file "init.tcl" and a script library.

In the second case, you need to provide a routine called package_init that takes care of registering the commands this package provides and registering the name of the package. Tcl will be already running in this case.

New Tcl commands are registered via the routine ftcl_make_command in each case.

OVERVIEW OF ROUTINES

The Ftcl library contains the following functions and subroutines:

use ftcl
    The Ftcl library is to be accessed via the Ftcl module. If you are using FORTRAN 77, then you will have to use the specific routines.

call ftcl_get( varname, value )
    Get the value of the Tcl variable whose name is stored in varname.

    character(len=*) varname
        Name of the Tcl variable to get.
    integer/real/double precision/logical/character(len=*) value
        Variable that will get the value, cast to whatever type the variable is.

call ftcl_get_arg( iarg, value )
    Get the value of the iarg'th argument. Used in Fortran-based command routines (i.e. the ones registered via ftcl_make_command).

    integer/real/double precision/logical/character(len=*) value
        Variable that will get the value of the argument, cast to whatever type the variable is.

call ftcl_get_array( varname, values )
    Get the values of the Tcl array whose name is stored in varname.

    character(len=*) varname
        Name of the Tcl variable to get.
    integer/real/double precision/logical/character(len=*), dimension(:) values
        One-dimensional Fortran array that will get the values, cast to whatever type the Fortran array is. Note: The index into the Tcl array is assumed to be an ordinary number. The dimension on the Fortran side determines which elements are filled.

call ftcl_main_loop
    Enter the Tcl event loop. Useful when creating GUIs. It will not return.

call ftcl_make_command( procedure, cmdname )
    Register a Fortran routine to serve as a Tcl command.

    subroutine procedure
        The Fortran routine in question. The interface is this:

        subroutine procedure( cmdname, noargs, ierror )
            character(len=*) :: cmdname
            integer :: noargs
            integer :: ierror
        end subroutine procedure

        where:
        cmdname is the name of the command that was used to call the routine
        noargs is the number of arguments that was given
        ierror can be used to indicate success or an error (0 means success)

call ftcl_provide_package( pkgname, version, error )
    Register a package by name and version number. To be used from within a package initialisation routine (package_init).

    character(len=*) pkgname
        Name of the package.
    character(len=*) version
        A version string like "1.0" or "2.2.1".
    integer error
        Variable that indicates whether the package was successfully registered (0) or not (1).
call ftcl_put( varname, value )
    Transfer the value to the Tcl variable whose name is stored in varname.

    character(len=*) varname
        Name of the Tcl variable to set.
    integer/real/double precision/logical/character(len=*) value
        Variable holding the value to transfer; it may be of any of the listed types. Note: if the Tcl variable is a (Tcl) array, then this may also be a one-dimensional array. The index into the Tcl array is assumed to be an ordinary number. The dimension on the Fortran side determines which elements are filled.

call ftcl_put_array( varname, values )
    Transfer the values in the one-dimensional Fortran array to the Tcl variable whose name is stored in varname. Note: The index into the Tcl array will be an ordinary number. The dimension on the Fortran side determines which elements are filled.

    character(len=*) varname
        Name of the Tcl variable to set.
    integer/real/double precision/logical/character(len=*), dimension(:) values
        One-dimensional Fortran array holding the values to transfer; it may be of any of the listed types.

call ftcl_script( script )
    Run the script contained in the variable "script".

    character(len=*) script
        String containing the Tcl script to be run.

call ftcl_set_result( value )
    Set the result of the command - used within command routines.

    integer/real/double precision/logical/character(len=*) value
        The value to be set as the result of the command. This is the value that Tcl returns.

call ftcl_start( filename )
    Start-up routine for Ftcl. The filename argument should ideally point to the executable, so that the init.tcl file can be found.

    character(len=*) filename
        Full name of the executable or another string that enables Tcl to find the file init.tcl and the script library. (See section on deployment)

EXTENSIONS AND PACKAGES
Tcl itself provides a concise mechanism to load shared libraries or DLLs (the terminology depends on the operating system; we will use "DLL" to mean either). This allows you to extend any Tcl shell (tclsh, wish or a custom interpreter) with your own commands. Ftcl makes it possible to create such extensions in Fortran:
You should write a Fortran routine with the fixed name package_init. Since the routine must be callable from C, it cannot reside in a module. This routine has the task of initialising the package:
  • Register one or more commands (via ftcl_make_command)
  • Do whatever is required for the package to become properly initialised
  • Register the package by name and version (via ftcl_provide_package)
The routine takes one argument, an integer that on return indicates whether everything was okay or whether there was an error. Here is a sketch:
subroutine package_init( error )
    use mypkg
    integer :: error
    call ftcl_provide_package( "mypkg", "1.0", error )
    if ( error .ne. 0 ) return
    call ftcl_make_command( ... )
    ...
end subroutine package_init
Registering the package properly means that Tcl's package mechanism knows about it. It will not attempt to load the DLL again on the next "package require" command.  
You need to write a package index file, called pkgIndex.tcl, that loads the DLL. This takes the following form:
package ifneeded Mypkg version [list load [file join $dir mypkg.dll] Ftclpkg]
For your convenience, this can be done via the script mkpackage.tcl in the tools directory. There are three arguments, in the given order:
The name of the package (in the example: Mypkg). This name is case-sensitive  
The version, a string like "1.0" or "2.2.1"  
The name of the actual DLL (without the extension). This name is case-sensitive on some operating systems. The proper extension is added by the script.  
The reason that the argument "Ftclpkg" appears, is that this argument is used by Tcl to construct the name of the initialisation routine for the package. By keeping that name fixed we can avoid any C programming, however simple.  
All that is needed now is to build the DLL with the proper compile and link commands. The section DEPLOYMENT has more to say about possible issues with runtime libraries, so please consult that as well.
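To make the shape of the package index line concrete, here is a small Python sketch that builds it from the three arguments described above (package name, version, and DLL name without extension). It is not the mkpackage.tcl script itself. The function names, the platform test, and the choice of file extensions are assumptions for illustration. The one fixed point is that Tcl's load command takes the file name first and the package name, here "Ftclpkg", second.

```python
def shared_library_extension(platform: str) -> str:
    """Pick the usual shared-library suffix for a sys.platform-style name."""
    if platform.startswith("win"):
        return ".dll"
    if platform == "darwin":
        return ".dylib"
    return ".so"


def package_index_line(package: str, version: str, library: str,
                       platform: str) -> str:
    """Build the `package ifneeded` line for a Fortran-based Ftcl package."""
    filename = library + shared_library_extension(platform)
    return (f"package ifneeded {package} {version} "
            f"[list load [file join $dir {filename}] Ftclpkg]")
```

Calling `package_index_line("Mypkg", "1.0", "mypkg", "win32")` gives `package ifneeded Mypkg 1.0 [list load [file join $dir mypkg.dll] Ftclpkg]`, which is the single line a pkgIndex.tcl file for this package needs.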

Friday, August 6, 2010

Tcl Packages

Following are the Tcl packages for Windows

gdi - Direct access to Windows GDI
iocpsock - Improved sockets for Windows NT
printer - Windows printer contexts
resolver - Asynchronous name resolution
storage - Access and manipulate Microsoft "Structured Storage" files
tcom - Access and implement COM objects
tkribbon - Windows Ribbon
twapi - Tcl Windows API extension
winico - Extended icon handling for Windows
winutils - A random collection of stuff for Win32 Tcl
wmf - Windows metafile handling

Cross-language interoperability


combat - CORBA scripting with Tcl
critcl - C Runtime in Tcl
dbus - Interface to D-BUS
femtol - Tcl in FORTRAN 77
ffidl - Foreign Function Interface
ftcl - A library to combine Fortran and Tcl
memcached - Tcl interface to memcached caching service
tclae - Apple Event Manager interface
tclblend - Integrating Tcl and Java
tcljs - JavaScript interpreter
tclmatlab - Tcl/Matlab integration
tclpython - Call Python from Tcl
webservices - Web Services for Tcl


adbsql - A(nother) database engine for Tcl/Tk
aejaks - Tcl interface to Echo2 web framework
critcl - C Runtime in Tcl
cups - Tcl interface to CUPS
curl - Tcl binding to libcurl, multiprotocol file transfer library
dbi - Generic Tcl interface to SQL databases
dbus - Interface to D-BUS
festtcl - Speech synthesis via festival
gnocl - Tcl meets GTK+ and Gnome
iaxclient - Voice over IP using IAX Protocol
memcached - Tcl interface to memcached caching service
milter - Tcl interface to Sendmail's mail filtering API
mysqltcl - Tcl interface to MySQL database
oratcl - Tcl interface to Oracle database
pgintcl - Pure-Tcl interface to PostgreSQL
pgtcl - Tcl interface to PostgreSQL database
qtcl - Tcl meets Qt and KDE
readline - Interactive command-line editing
snodbc - ODBC bindings
tclgd - Interface to gd-2 graphics library
tclmagick - Image manipulation
tclodbc - Open Data Base Connectivity driver
tclodbc-new - ODBC interface
tclogl - OpenGL bindings
tileqt - Tk/Tile integration with Qt/KDE
twapi - Tcl Windows API extension
uno - Interface to OpenOffice.org
wtcl - Tcl module for Apache web server
xosd - Bindings for libxosd


festtcl - Speech synthesis via festival
image - Image management
img - Additional image formats
pdf4tcl - Generate PDF files
pixane - Advanced image processing
snack - Audio processing toolkit
tclgd - Interface to gd-2 graphics library
tclmagick - Image manipulation
tkpng - PNG support
tkvideo - A video widget for use with Windows.


adbsql - A(nother) database engine for Tcl/Tk
dbi - Generic Tcl interface to SQL databases
fbsql - Tcl interface to MySQL database
memcached - Tcl interface to memcached caching service
metakit - Embedded database library
mysqltcl - Tcl interface to MySQL database
oratcl - Tcl interface to Oracle database
pgintcl - Pure-Tcl interface to PostgreSQL
pgtcl - Tcl interface to PostgreSQL database
ral - Relational Algebra package
raloo - Relation oriented programming system
snodbc - ODBC bindings
speedtables - High-performance, memory-resident database
sqlite - Lightweight SQL database
sybtcl - Interface to Sybase databases
tcldb - Unified database API
tcldbi - Access to the GNU libdbi generic database interface
tclodbc - Open Data Base Connectivity driver
tclodbc-new - ODBC interface
tdbc - Tcl Database Connectivity
tgdbm - Store key/value pairs in portable files using GDBM

Networking

amazons3 - Tcl interface to Amazon S3 web service
ceptcl - Communication Endpoints
combat - CORBA scripting with Tcl
cups - Tcl interface to CUPS
curl - Tcl binding to libcurl, multiprotocol file transfer library
irdasock - IrDA (infrared) socket support
ldap - LDAP (Lightweight Directory Access Protocol) bindings
milter - Tcl interface to Sendmail's mail filtering API
osc - Open Sound Control interface
pcap - Packet capture
radclient - RADIUS client library
resolver - Asynchronous name resolution
soap - Remote procedure call using SOAP or XML-RPC over HTTP
tls - Secure Sockets Layer
tnm - Network management
udp - UDP socket extension
webservices - Web Services for Tcl
wub - An HTTP server

Tcl on Mars

Tcl and Concurrent Object-Oriented Flight Software: Tcl on Mars

David E. Smyth
david@devvax.jpl.nasa.gov
Mars Pathfinder Flight Software Team
Jet Propulsion Laboratory
MIDCOM Corporation

Abstract

Mars Pathfinder is the first of a new class of “better, faster, cheaper” interplanetary spacecraft missions being developed for NASA at Caltech’s Jet Propulsion Laboratory. This spacecraft will be launched during the Mars launch window in December 1996 and will land on Mars in July 1997. The lander carries a small 10kg 6-wheeled robotic Rover that will roam the surface and take samples of soil and rocks.

The flight software for Mars Pathfinder has a concurrent object-oriented architecture, using many concepts adapted from the Actor school of thought, as espoused by Carl Hewitt, Gul Agha, and others. The flight computer, a 22 MIPS radiation-hardened derivative of the PowerPC, has 128 megabytes of RAM and uses the VxWorks real-time POSIXish operating system. Subsystems (each of which is an object, and contains and/or controls other objects) execute in one or more threads. Each object is event and message driven.

This paper describes the current early development effort, where we are using Tcl and its object oriented extension itcl, combined with tclX, blt, and tk, as the language for inter-object messages, for the monitor and control environment, and for the initial implementation of several flight software responsibilities. As the system develops, the flight software may remain as Tcl, or it may evolve into C. The similarity between Tcl and C makes the translation of objects from Tcl to C reasonably straightforward.

Overview

The Mars Pathfinder flight software is being developed as concurrent object-oriented software. Objects with extensive collaborations are grouped into subsystems. Each subsystem instantiates its own Tcl interpreter, and provides its set of commands that can then be used to invoke the methods of the various objects within the subsystem. Each subsystem has a “well known” socket through which it reads Tcl scripts. Most subsystems also use other sockets, timers, and interrupt handlers to interact with spacecraft devices such as the camera, transmitter, and Rover. In addition, a central interpreter allows commands to be distributed to all subsystems. For historical purposes, we call this central interpreter the “Sequencer” or “Sequence Engine” and scripts interpreted by this central interpreter “Sequences.” This terminology is common in the spacecraft industry, but is left over from the days of yore, when spacecraft were more like Rube Goldberg machines than the interplanetary robots they are today.

Spacecraft Sequencing using Tcl

In the spirit of “better, cheaper, faster,” Tcl has been proposed as the sequencing and command language for the Mars Pathfinder flight software. The specific reasons include:

1. It is already defined. No effort, cost, nor schedule are required to define, document, develop flight software, adapt ground software (SEQ, SEQTRAN and CMD) and test a sequencing and command language.

=> Faster & Cheaper

2. Tcl defines only these simple but general concepts:
• A simple syntax: Tcl consists of statements, each statement having the form:
keyword
• a couple of mechanisms to group parameters (curly braces and quotes)
• the concept that “statements” can return string values
• a way to indicate one wants the result of a statement (brackets)
• global and local variables, and a way to indicate one wants the value of a variable (dollar sign)
• The itcl extension also provides a simple and useful object model.

Tcl itself defines nothing else. Typically, Tcl “registers” procedures to implement what one would normally consider the “reserved” words of the language, including case, for, while, and if. Even set (for assignment) and proc (for defining Tcl procedures) are registered procedures: they are not part of the syntax of the language. Therefore, we can tailor Tcl to include only those capabilities we really want aboard the spacecraft. All superfluous keywords can be deleted to save memory, or can be added later for additional capability.

=> Elegance is Better

3. Since Tcl is all ASCII, we don't have to also define an ASCII to binary translation for sequences: the SEQTRAN ground software system does not need to be adapted for Mars Pathfinder, as it does with other JPL missions. The CMD ground software system can provide the data envelope required by DSN with little or no project specific adaptation.

=> Cheaper

4. Tcl is scalable: from 11k bytes for the basic kernel to 200k bytes for a full UNIX shell.

=> Smaller is Better

In the most desirable situation, where we never need any logic nor variables in sequences, we only need the absolute “core” capabilities of Tcl. Tcl is designed and implemented such that we can fly a Tcl consisting only of two small files:

File          Lines of C    Text (bytes compiled)    Data (bytes compiled)
tclBasic.c    609           4384                     1568
tclParse.c    631           4688                     568
Total         1240          9172                     2136

These files do the following (extracted from the C source header):
/*
* tclBasic.c --
*
* Contains the basic facilities
* for TCL command interpretation,
* including interpreter creation
* and deletion, command creation
* and deletion, and command
* parsing and execution.
*/
/*
* tclParse.c --
*
* This file contains a collection
* of procedures that are used to
* parse Tcl commands or parts of
* commands (like quoted strings or
* nested sub-commands).
*/
5. Tcl allows us (the Mars Pathfinder Flight Software Team) to act confident in claiming that all we need to provide are the high level commands which don't need any logic, because we know that we have in our back pocket the ability to upload arbitrarily complex scripts!

=> CYA is Better

6. If we discover, next year, that we need some more capabilities for our sequencing system, like logic and event triggers, then there is no cost: Tcl already does this. We simply leave these capabilities installed.

=> CYA is Better
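
Reason 2 above lists the handful of concepts Tcl defines and notes that even keywords such as while are registered procedures that can be deleted. The following is a minimal sketch of both points, not flight code: the variable names and values are illustrative, and a child interpreter created with interp create is assumed here as a stand-in for the on-board interpreter.

```tcl
# --- The core concepts ---------------------------------------------------
# A statement is a keyword followed by its parameters.
set target Rover

# Quotes group words into one parameter; $ substitutes a variable's value.
set greeting "Hello, $target"

# Curly braces also group, but suppress substitution.
set literal {Hello, $target}

# Brackets substitute the string result of a nested statement.
set length [string length $greeting]

puts "$greeting ($length characters)"
puts $literal

# --- Tailoring the command set -------------------------------------------
# Assumption: a child interpreter stands in for the flight interpreter.
set flight [interp create]

# "while" is an ordinary registered command, so it can be removed;
# renaming a command to the empty string deletes it.
$flight eval {rename while {}}

# The deleted keyword is now simply an unknown command...
set failed [catch {$flight eval {while {0} {}}} message]
puts "while still installed: [expr {!$failed}]"

# ...while everything that was left installed still works.
puts [$flight eval {expr {6 * 7}}]
```

Deleting a command this way is reversible in spirit: the same interpreter could later be given the command back by registering a new implementation under the old name.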

Tcl as Object Communication Language

Besides ground control of the spacecraft as a system, it is also necessary to control individual software components (threads or Subsystems). Tcl, passed via message queues (specifically, VxWorks pipes), is proposed as the mechanism used to control the flight software components. Therefore:

• Sequences consist of Tcl scripts.
• High level commands are names of on-board Tcl scripts.
• On-board software components pass Tcl scripts to each other to implement inter-object messaging.

The advantages of the proposed approach include:

1. We will need some form of scripting system in order to do testing, especially regression testing.

Tcl has been proven to be effective as a scripting language for regression testing of multi-threaded software, and the Mars Pathfinder Flight Software will be multi-threaded. By using Tcl as the mechanism for all inter-software communication (i.e., between the ground and the spacecraft, and between the software subsystems), we get a free mechanism for running regression tests. This also means that all development and testing use the “standard” or flight interfaces, thereby better demonstrating capabilities at an earlier date.

=> Better, Faster & Cheaper

2. Using Tcl for our inter-subsystem communication means we don’t need to define and implement something new -- this is identical to the sequencing problem.

=> Faster & Cheaper

3. Using Tcl for our inter-subsystem communication makes it easy to pass such information across message queues. The ASCII representation insulates components from each other’s completeness (a command can be safely passed even if the recipient has not yet implemented it) and many data type concerns (numeric types can be changed to any other numeric type without interface concerns). This allows different parts of the flight software to evolve at different rates -- and it certainly will!

=> Faster Development is Better

4. Using Tcl for our inter-subsystem communication makes it easy to perform unit testing: a unit test includes the Tcl scripts (commands) the unit expects to get from other subsystems, and generates Tcl scripts (commands) which it should be sending to other subsystems. Tcl is standard text, so the output is easy to validate.

=> Better Testing

5. Using Tcl as the external control for any given subsystem makes it trivial to pass "low level commands" via the same ground environment. We will want to do this during development, and we might want to have this capability for contingencies. Using Tcl adds significant flexibility at zero cost during development and operations.

=> Better Flexibility

6. Using Tcl as the external control for any given subsystem enables us to provide behaviors as scripts. I agree that this is very optional: it is always possible to provide different behaviors all in C, and select which one is active via a switch. However, implementing independent, primitive behaviors in C and then implementing higher level, complex behaviors in Tcl can make it easy to upload “software” with fewer “political” ramifications: we don’t need to uplink binaries, only textual scripts (i.e., Sequences).

=> Better Flexibility

7. Tcl and C are very similar languages in look and feel: therefore, we don't need to sweat the limits: what is done in C, and what is done in Tcl (i.e., via Sequences). We can easily move this “limit” over time without big impacts. We can start out writing many things in Tcl, and evolve the system until very little is in Tcl.

=> CYA is Better
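
The inter-subsystem messaging described in points 3 and 4 can be sketched in a few lines. This is only an illustration under stated assumptions: each subsystem is modelled as a child interpreter rather than a thread fed by a VxWorks pipe, and the names make_subsystem, record_message, deliver, send_to, take_image, and set_filter are hypothetical.

```tcl
# Assumption: each subsystem is modelled as a child interpreter; in the
# design described above the scripts would arrive over message queues.
proc make_subsystem {name} {
    set sub [interp create]
    # Outgoing messages are recorded as text so a unit test can inspect them.
    interp alias $sub send_to {} record_message $name
    return $sub
}

proc record_message {from to script} {
    global outbox
    lappend outbox [list $from $to $script]
}

# Deliver a script to a subsystem; a command the recipient has not yet
# implemented is reported back as text instead of being fatal.
proc deliver {sub script} {
    if {[catch {$sub eval $script} result]} {
        return "rejected: $result"
    }
    return $result
}

set outbox {}
set camera [make_subsystem camera]

# The camera subsystem registers the commands it implements so far.
$camera eval {
    proc take_image {exposure_ms} {
        send_to telemetry [list store_image frame1 $exposure_ms]
        return ok
    }
}

puts [deliver $camera {take_image 250}]
puts [deliver $camera {set_filter red}]   ;# not implemented yet
puts $outbox
```

Because both the incoming commands and the recorded outgoing messages are plain strings, a unit test for one subsystem only needs to compare text.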

The NIH Factor: Why Tcl?

There are many different possibilities for languages. In the past, JPL has developed languages for commanding and sequencing spacecraft as a matter of course. Many of the advantages of using Tcl can also be achieved by developing a special purpose language here at JPL. The question of why, then, we should use Tcl has arisen. The reasons for Tcl include:

1. Developing languages is an iterative process: they often start out elegantly and devolve into a nightmare (e.g., C++), or they start out being ugly and slowly evolve into something reasonably nice (Fortran90). This process is both slow and expensive. Tcl has been tested by fire literally around the world. People like it, even if they have no vested interest in the language. That is quite rare (consider Ada and C++ as examples). I doubt that we will trivially invent something better. Yet using Tcl is trivial.

2. We can say that we are done with the sequencing language already. We can honestly say "yes" to any question anyone asks about its capability ("Can you compile it?" Or "Can you write loops?" Or "Can you do things based on time?" Or "Can you utilize the VxWorks capabilities?" Or ...).

MVC and Tcl

MVC, or Model-View-Controller, is a useful paradigm for software. For spacecraft flight software, it is especially useful. First, let’s establish the meaning of MVC. Any given software object can be considered to have three aspects: the Model aspect is the innate characteristics and behavior; the View aspect is that which is visible about the object, how it looks, how it shows its innate characteristics to the outside world (people or other objects); the Control aspect is that which allows the object to be manipulated, technically, how the object’s methods are invoked.

A graphical user interface is ideal for viewing and controlling an object. Often, an MVC programmer implements mechanisms to view and control an object using various mechanisms provided by a GUI environment, and then implements the “model” or innate object code using some general purpose programming language which can be easily connected to the GUI environment.

On Mars Pathfinder, we start out by implementing objects using the itcl object-oriented extension to Tcl. The view and controller aspects are then directly coded, also in Tcl, using the X Window Widgets provided by Tk. A single object, for example, the Packet Buffer, contains the model, view, and control aspects all jumbled together.

This set of itcl objects can be easily executed on our development workstations, and demonstrated to “customers,” in our case, managers, the scientists, spacecraft systems engineers, and the spacecraft operations staff. Working with these itcl objects, we can evolve the behavior and collaborations between the objects rapidly and conveniently. Once the objects seem to work pretty well, then we can start evolving them into flight software.

The first step in the evolution is to revisit the objects, and rip each one in two, resulting in one object which provides the “Model” or innate capability of the object, and another which provides both the View and Control aspects. The “Model” object is the prototype for the actual flight software, while the “View-Control” object remains workstation software.

The View-Control object provides a GUI for the flight software. It is used to control and monitor the software during testing. We expect that the same software will continue to be useful for monitoring the software during flight. However, the 20 minute light-time delay between Mars and Earth will probably prevent us from using these objects to control the flight software during the actual mission.

One of the primary reasons Tcl was chosen for this task was that local and remote transfers of control require only trivial transformations of Tcl code. Therefore, ripping the objects apart so some method invocations may be across the net is usually very simple. Nevertheless, some changes to the methods do occur, because state data must now be transferred between the two parts: in the all-in-one object, the View widgets can often directly display information as it is updated by the Model methods. Once split, the Model methods must explicitly transfer the changed information back to the View-Control part in order to display data.

Initially, we can run the Model object on a flight-like test system running the VxWorks operating system somewhere over the net. The View-Control software running on the development workstations can use “expect” to rlogin to a VxWorks host and start a Tcl interpreter and load the Model objects as flight software.

Another primary reason for using Tcl is that Tcl and C are reasonably similar in structure. The object model supported by itcl is easily implemented in C, using the guidelines from the author’s paper on Object-Oriented Programming in C. Therefore, we can start by running the Model objects as itcl objects under a Tcl shell running on the VxWorks target. We can then evolve the itcl methods into C code, if needed, as needed.
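
The split between Model and View-Control can be illustrated with a small sketch. It rests on several assumptions: plain Tcl procedures are used instead of itcl classes, a child interpreter stands in for the remote VxWorks target, and the names view_update, report_state, and store_packet are hypothetical.

```tcl
# View-Control side: keeps the latest values reported by the Model.
# In a real GUI these values would drive Tk widgets.
proc view_update {field value} {
    set ::display($field) $value
}

# Model side: lives in its own interpreter, standing in for the target.
set model [interp create]

# The only path from Model back to View-Control is an explicit message.
interp alias $model report_state {} view_update

$model eval {
    set count 0
    proc store_packet {bytes} {
        global count
        incr count
        # Once split, the Model must send changed state back explicitly.
        report_state packets $count
        report_state last_size [string length $bytes]
    }
}

# Control: invoke a Model method by sending it a Tcl script.
$model eval {store_packet "abcdef"}
puts "packets=$::display(packets) last_size=$::display(last_size)"
```

The point of the sketch is the alias: before the split the view could read the counter directly, whereas afterwards every displayed value has to cross an explicit interface.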
Summary

Using Tcl and MVC concepts during prototyping and development allows us to provide rapid prototypes for requirement refinement, while giving us a simple path for evolving the actual flight software directly from the prototype. We get the user interfaces we need to monitor the spacecraft, and a testing environment, essentially for free. Using Tcl gives us more capability than we hope we need for sequencing, but it also gives us necessary capability for testing, and desirable capability for inter-subsystem communication.

Tcl and CGI

Abstract

CGI scripts enable dynamic generation of HTML pages. This paper describes how to write CGI scripts using Tcl. Many people use Tcl for this purpose already but in an ad hoc way and without realizing many of the more nonobvious benefits. This paper reviews these benefits and provides a framework and examples. Canonical solutions to HTML quoting problems are presented. This paper also discusses using Tcl for the generation of different formats from the same document. As an example, FAQ generation in both text and HTML are described.

Keywords: CGI; FAQ; HTML generation; Tcl; World Wide Web

Introduction

CGI scripts enable dynamic generation of HTML pages [BLee]. Specifically, CGI scripts generate HTML in response to requests for Web pages. For example, a static Web page containing the date might look like this:

The date is Mon Mar 4 12:50:10 EST 1996.

This page was constructed by manually running the date command and pasting its output in the page. The page will show that same date each time it is requested, until the file is manually rewritten with a different date.

Using a CGI script, it is possible to dynamically generate the date. Each time the file is requested, it will show the current date. This script (and all others in this paper) are written in Tcl [Ouster].

puts "Content-type: text/html\n"

puts "
The date is [exec date]."

The first puts command identifies how the browser should treat the remainder of the data - in this case, as text to be interpreted as HTML. For all but esoteric uses, this same first line will be required in every CGI script.

CGI scripts have many advantages over statically written HTML. For example, CGI scripts can automatically adapt to changes in the environment, such as the date in the previous example. CGI scripts can run programs, include and process data, and do just about anything else that can be done in traditional programs.

CGI scripts are particularly worthwhile in handling Web forms. Web forms allow users to enter data into a page and then send the results to a Web server for processing. The Web form itself does not have to be generated by a CGI script. However, data entered by a user may require a customized response. Therefore, a dynamically generated response via a CGI script is appropriate. Since the response may produce another form, it is common to generate forms dynamically as well as their responses.

CGI Scripts Are Just a Subset of Dynamic HTML Generation

CGI scripts are a special case of generated HTML. Generated HTML means that another program produced the HTML. There can be a payoff in programmatic generation even if it is not demanded by the CGI environment. I will describe this idea further later in the paper.

Simply embedding HTML in Tcl scripts does not in itself provide any payoff. For instance, consider the preparation of a page describing various types of widgets, such as button widgets, dial widgets, etc. Ignoring the body paragraphs, the headers could be generated as follows:

puts "<h3>Button Widgets</h3>"

puts "<h3>Dial Widgets</h3>"

Much of this is redundant and suggests the use of a procedure such as this one:

proc h3 {header} {
    puts "<h3>$header</h3>"
}

Now the script can be rewritten:

h3 "Button Widget"

h3 "Dial Widget"

Notice that you no longer have to worry about adding closing tags such as /h3 or putting them in the right place. Also, changing the heading level is isolated to one place in each line.

Using a procedure name specifically tied to an HTML tag has drawbacks. For example, consider code that has level 3 headings for both Widgets and Packages. Now suppose you decide to change just the Widgets to level 2. You would have to look at each h3 instance and manually decide whether it is a Widget or a Package.

In order to change groups of headers that are related, it is helpful to use a logical name rather than one specifically tied to an HTML tag. This can be done by defining an application-specific procedure such as one for widget headers:

proc widget_header {heading} {
    h2 "$heading Widget"
}

proc package_header {heading} {
    h3 "$heading Package"
}

The script can then be written:

widget_header "Button"

widget_header "Dial"

package_header "Object"

Now all the widget header formats are defined in one place - the widget_header procedure. This includes not only the header level, but any additional formatting. Here, the word "Widget" is automatically appended, but you can imagine other formatting such as adding hyperlinks, rules, and images.

This style of scripting makes up for a deficiency of HTML: HTML lacks the ability to define application-specific tags.
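
Putting the tag-level and application-level procedures together gives a complete, runnable sketch. The helper heading_html is an addition made here for illustration (it is not part of the paper's code) so that the markup is built in one place.

```tcl
# Tag-level helper: builds one complete heading element, closing tag included.
# (heading_html is a hypothetical helper introduced for this sketch.)
proc heading_html {level text} {
    return "<h$level>$text</h$level>"
}

proc h2 {header} { puts [heading_html 2 $header] }
proc h3 {header} { puts [heading_html 3 $header] }

# Application-level helpers: the only place that knows which level
# and which suffix each kind of header uses.
proc widget_header {heading}  { h2 "$heading Widget" }
proc package_header {heading} { h3 "$heading Package" }

widget_header "Button"
widget_header "Dial"
package_header "Object"
```

Moving every widget header to another level now means editing the single call inside widget_header.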

Form Generation

The idea of logical tags is equally useful for generation of Web forms. For example, consider generation of an entry box. Naively rendered in Tcl, a 10-character entry box might look this way:

puts "<input name=\"username\" size=10>"

This is fine if there is only one place in your code which requires a username. If you have several, it is more convenient to place this in a procedure. Dumping this all into a procedure simplifies things a little, but enough additional attributes on the input tag can quickly render the new procedure impenetrable. Applying the same technique shown earlier suggests two procedures: text and username. text is the application-independent HTML interface. username is the application-specific interface. An example definition for username is shown below. Remember that this is specific to a particular application. In this case, a literal prompt is shown (the HTML markup for this would be defined in yet another procedure), followed by the 10-character entry box containing some default value.

proc username {name defvalue} {

prompt "Username"

text $name $defvalue 10

}

When the form is filled out, the user's new value will be provided as the value for the variable named by the first parameter, stored here in "name". Later in this paper, I'll go into this in more detail.

A good definition for text is relatively ugly because it must do the hard work of adding quotes around each value at the same time as doing the value substitutions. This is a good demonstration of something you want to write as few times as possible - once, ideally. In contrast, you could have hundreds of application-specific text boxes. Those procedures are trivial to write and make all forms consistent. In the example above, each call to username would always look identical.

proc text {name defvalue {size 50}} {
    puts "<input name=\"$name\" value=\"$defvalue\" size=$size>"
}

Once all these procedures exist, the actual code to add a username entry to a form is trivial:

username new_user $user

Many refinements can be made. For example, it is common to use Tcl variables to mirror the form variables. The rewrite in Figure 1 tests whether the named form variable is also a Tcl variable. If so, the value is used as the default for the entry.

If username called this procedure, the second argument could be omitted if the variable name was identical to the first argument. For example:

username User

An explicit value can be supplied in this way:

username User=don

And arbitrary tags can be added as follows:

username User=don size=10 \
maxlength=5
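
The calling conventions just described (a bare name, name=value, and extra attributes) can be sketched as follows. This is not the code from Figure 1; it is a minimal illustration that assumes the mirrored variable is a global, returns the markup instead of printing it, and omits the value quoting discussed later. The name text_html is hypothetical.

```tcl
# Builds an <input> element from "name" or "name=value" plus extra attributes.
proc text_html {nameval args} {
    # Split "User=don" into name, optional "=", and explicit value.
    regexp {^([^=]*)(=?)(.*)$} $nameval -> name sep value
    if {$sep eq ""} {
        # No explicit value: fall back to a global Tcl variable of that
        # name, if one exists (assumption: form variables mirror globals).
        upvar #0 $name var
        set value ""
        if {[info exists var]} {
            set value $var
        }
    }
    set html "<input name=\"$name\" value=\"$value\""
    # Remaining arguments are passed through as additional attributes.
    foreach attribute $args {
        append html " $attribute"
    }
    return "$html>"
}

set ::User don
puts [text_html User]
puts [text_html User=libes size=10 maxlength=5]
```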

Many other procedures are required for a full implementation. Here are two more which will be used in the remainder of the paper. The procedure "p" starts a new paragraph and prints out its argument. The procedure "put" prints its argument with no terminating newline. And puts, of course, can be called directly.

proc p {s} {
    puts "<p>$s"
}

proc put {s} {

puts -nonewline "$s"

}

Inline Directives

Some HTML tags affect characters rather than complete elements. For example, a word can be made bold by surrounding it with <b> and </b>. As before, redundancy can be eliminated by using a procedure:

proc bold {s} {
    puts "<b>$s</b>"
}

Unlike the earlier examples, it is not desirable to have character-based procedures call puts directly. Otherwise, scripts end up looking like this:

put "I often use "

bold "Tcl"

put "to program."

These character-based procedures can be made more readable by having them return their results like this:

proc bold {s} {
    return "<b>$s</b>"
}

Using these inline directives, scripts become much more readable:

p "I often use [bold Tcl] to program."

Explicit use of a procedure such as bold shares the same drawbacks as explicit use of procedures such as h2 and h3. If you later decide to change a subset of some uses, you must examine all of them. By using logical procedure names, that trap is avoided. For example, suppose that you want hostnames to always appear the same way. But there is no hostname directive in HTML. So you could arbitrarily choose bold and write:

proc hostname {s} {

return [bold $s]

}

An example using this is:

p "You may ftp the files from [hostname $ftphost] or [hostname $ftpbackuphost]."

If you later decide to change the appearance of hostnames to, say, italics, it is now very easy to do so. Simply change the one-line definition of the hostname procedure.
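
A self-contained sketch of these inline directives follows. The italic procedure, the paragraph_html helper, and the host name are assumptions added for illustration; paragraph_html returns its markup so the result can be inspected as well as printed.

```tcl
# Tag-level inline directives return their markup rather than printing it.
proc bold {s}   { return "<b>$s</b>" }
proc italic {s} { return "<i>$s</i>" }

# Logical directive: the single place that decides how hostnames look.
# Switching to italics means changing [bold $s] to [italic $s] here only.
proc hostname {s} { return [bold $s] }

# Paragraph helper (hypothetical name) that returns the markup.
proc paragraph_html {s} { return "<p>$s" }

set ftphost ftp.example.org   ;# hypothetical host name
puts [paragraph_html "You may ftp the files from [hostname $ftphost]."]
```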

URLs

URLs have a great deal of redundancy in them, so using procedures can provide dramatic benefits in readability and maintainability. Similarly to the previous section, hyperlinks can be treated as inline directives. By pre-storing all URLs, generation of a URL then just requires a reference to the appropriate one. While separate variables can be used for each URL, a single array (_cgi_link) provides all URL tags with their own namespace. This namespace is managed with a procedure called link. For example, suppose that you want to produce the following display in the browser:

I am married to Don Libes who works in the Manufacturing Collaboration Technologies Group at NIST.

Using the link procedure, with appropriate link definitions, the scripting to produce this is simple:

p "I am married to [link Libes] who works in the [link MCTG] at [link NIST]."

This expands to a sizeable chunk of HTML:

I am married to Don Libes who works in the Manufacturing Collaboration Technologies Group at NIST.

Needless to say, working on such raw text is the bane of HTML page maintainers. Yet HTML has no provisions itself for reducing this complexity.1

The link procedure is shown in Figure 2. It returns the formatted link given the tag name as its first argument. The second argument, if given, declares a name to be displayed by the browser. The third argument is the URL.

Links can be defined by handcoding the complete absolute URL. However, it is much simpler to create a few helper variables to further minimize redundancy. Figure 3 shows how to refer to several of my colleagues whose home pages all exist in the same staff directory.

If the location of any one staff member's page changes, only one line needs to be changed. More importantly, if the directory for the MSID staff pages changes, only one line needs to be changed. MSID_STAFF is dependent on another variable that defines the hostname. The hostname is stored in a separate variable because 1) it is likely to change and 2) there are other links that depend on it.
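
A minimal sketch of such a link table is shown below. It follows the argument order described above but is not the code from Figure 2, and the host, directory, and page names are hypothetical placeholders rather than the real URLs.

```tcl
# Sketch of a link table: tag -> formatted hyperlink, kept in one array.
proc link {tag args} {
    global _cgi_link
    if {[llength $args] == 2} {
        # Definition form: link tag display-name url
        set display [lindex $args 0]
        set url     [lindex $args 1]
        set _cgi_link($tag) "<a href=\"$url\">$display</a>"
    }
    return $_cgi_link($tag)
}

# Hypothetical helper variables: a host, and a staff directory built on it.
set HOST  http://www.example.org
set STAFF $HOST/staff

# Two tags with different display names pointing at the same URL.
link Libes "Don Libes" $STAFF/libes
link Don   "Don"       $STAFF/libes

puts "I am married to [link Libes]."
```

If the staff directory moves, only the line that sets STAFF changes; every page that uses the tags picks up the new location when it is regenerated.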

Figure 4 shows some examples of hosts.

There are no restrictions on tag names or display names. For example, sometimes it is useful to display "Don". Sometimes, the more formal "Don Libes" is appropriate. This is done by defining two links with different names but pointing to the same URL. This is shown in Figure 5.

Similarly, there are no restrictions on the tag names themselves. Consider the link definitions in Figure 6. These are used in paragraphs such as this one:

p "You can ftp Expect from ftp.cme.nist.gov as [link Expect.Z] or [link Expect.gz]"

A browser shows this as:

You can ftp Expect from ftp.cme.nist.gov as pub/expect/expect.tar.Z or ...gz.

Having link dependencies localized to one place greatly aids maintenance and testing. For example, if you have a set of pages that use the definitions (i.e., by sourcing them), editing that one file automatically updates all of the other pages the next time they are regenerated. This is useful for testing groups of pages on a different server, such as a test server before moving them over to a production location. Even smaller moves can benefit. For example, it is common to move directories around or create new directories and just move some of the files around.

Quoting

HTML values must be quoted at different times and in different ways. Unfortunately, the standards are hard to read so most people guess instead. However, intuitively figuring out the quoting rules is tricky because simple cases don't require quoting and many browsers handle various error cases differently. It can be very difficult to deduce what is correct when your own browser accepts erroneous code. This section presents procedures for handling quoting.

CGI Arguments

CGI scripts can receive input from either forms or URLs. For example, in a URL specification such as http://www.nist.gov/expect?help=input+foo, anything to the right of the question mark becomes input to the CGI script (which conversely is to the left of the question mark).

Various peculiar translations must be performed on the raw input to restore it to the original values supplied by the user. For example, the user-supplied string "foo bar" is changed to "foo+bar". This is undone by the first regsub in unquote_input (shown in Figure 7). The remaining conversions are rather interesting but understanding them is beyond the scope of this paper.
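
For readers who want to see the decoding spelled out, here is a simple sketch. It is not the regsub-based code of Figure 7: it assumes only the two standard URL encodings ("+" for a space and %XX for a hex-encoded character) and uses a plain character loop. The name unquote_input_sketch is hypothetical.

```tcl
# Decode URL-encoded input: "+" is a space, %XX is a hex-encoded character.
proc unquote_input_sketch {buf} {
    # Spaces first: a literal "+" arrives as %2B and is restored below.
    set buf [string map {+ " "} $buf]
    set out ""
    set n [string length $buf]
    for {set i 0} {$i < $n} {incr i} {
        set c   [string index $buf $i]
        set hex [string range $buf [expr {$i + 1}] [expr {$i + 2}]]
        if {$c eq "%" && [regexp {^[0-9A-Fa-f]{2}$} $hex]} {
            # Convert the two hex digits to the character they encode.
            append out [format %c 0x$hex]
            incr i 2
        } else {
            append out $c
        }
    }
    return $out
}

puts [unquote_input_sketch "foo+bar"]
puts [unquote_input_sketch "a%2Bb+is+100%25+sure"]
```

Decoding in a single left-to-right pass avoids re-interpreting a "%" that was itself produced by decoding %25.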

The converse procedure to unquote_input is shown below. This transformation is usually done automatically by Web browsers. However, it can be useful if your CGI script needs to send a URL through some other means such as an advertisement on TV.

proc quote_url {in} {

regsub -all " " $in "+" in

regsub -all "%" $in "%25" in

return $in

}

In theory, this procedure should perform additional character translations. However, you should avoid generating such characters since receiving URLs outside of a browser requires hand-treatment by users. In these situations, all bizarre character sequences should be avoided. For the purposes of testing (feeding input back), additional translation is also unnecessary since any other unquoted characters will be passed untouched.

Suppressing HTML Interpretation
In most contexts, strings which contain text that looks like HTML will be interpreted as HTML. For example, if you want to display a literal string such as "<A>", it must be encoded so that the "<" is not turned into a hyperlink specification. Other special characters must be similarly protected. This can be done using quote_html, shown below:

proc quote_html {s} {
    # ampersand must be done first!
    regsub -all {&} $s {\&amp;} s
    regsub -all {"} $s {\&quot;} s
    regsub -all {<} $s {\&lt;} s
    regsub -all {>} $s {\&gt;} s
    return $s
}

This can be used to simplify other procedures. Adding explicit double quotes before returning the final value allows simplification of many other procedures. Assuming this new procedure is called dquote_html, consider the earlier text entry procedure which had the code fragment

value=\"$defvalue\"

This could be rewritten:

value=[dquote_html $defvalue]
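The paper names dquote_html but does not show its body. A minimal sketch, assuming it does nothing more than wrap the result of quote_html in double quotes, might look like this (the defvalue contents are made up for the example):

```tcl
# quote_html, with the HTML entities written out in full.
proc quote_html {s} {
    # ampersand must be done first!
    regsub -all {&} $s {\&amp;} s
    regsub -all {"} $s {\&quot;} s
    regsub -all {<} $s {\&lt;} s
    regsub -all {>} $s {\&gt;} s
    return $s
}

# Assumed definition: quote the value, then surround it with double quotes.
proc dquote_html {s} {
    return "\"[quote_html $s]\""
}

set defvalue {Tom & "Jerry"}
puts "<input name=foo value=[dquote_html $defvalue]>"
```

Because the quotes are added by the procedure, the caller no longer needs backslashes in the surrounding string.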

Argument Cracking
As described earlier, input strings to a CGI script are encoded by the browser. Besides the transformations described already, the browser also packs all variable values together in the form variable1=value1&variable2=value2&variableN=valueN.

The input procedure (Figure 8) splits the input back into its specific variable/value pairs leaving them in a global array called _cgi_var. Any variable ending with the string "List" causes its value to be treated as a Tcl list. This allows, for example, multiple elements of a listbox to be extractable as individual elements.

If the procedure is run in the CGI environment (i.e., via an HTTPD server), input is automatically read from the environment. If not run from the CGI environment (i.e., via the command line), the argument is used as input. This is very useful for testing. An explicit argument obviates the need for using a real form page to drive the script and means it is easily run from the command line or a debugger.

If the global variable _cgi(debug) is set to 1, the procedure prints the input string before doing anything else. This is useful because it may then be cut and pasted into the procedure argument for debugging purposes, as was just mentioned.
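The behavior just described can be sketched as follows. This is not the procedure of Figure 8, which is not reproduced here; the names unquote_sketch and input_sketch are made up, only GET-style input from the QUERY_STRING environment variable is considered, and the decoding covers just "+" and %XX.

```tcl
# Hypothetical sketch of argument cracking (not the paper's Figure 8).

# Undo the browser's encoding: "+" back to space, %XX back to a character.
proc unquote_sketch {s} {
    set s [string map {+ " "} $s]
    set out ""
    while {[regexp -indices {%[0-9a-fA-F][0-9a-fA-F]} $s idx]} {
        set first [lindex $idx 0]
        set last  [lindex $idx 1]
        append out [string range $s 0 [expr {$first - 1}]]
        scan [string range $s [expr {$first + 1}] $last] %x code
        append out [format %c $code]
        set s [string range $s [expr {$last + 1}] end]
    }
    return $out$s
}

proc input_sketch {{raw ""}} {
    global env _cgi _cgi_var
    # Under a web server the query string comes from the environment;
    # otherwise the argument is used, which is handy for testing.
    if {[info exists env(REQUEST_METHOD)] && [info exists env(QUERY_STRING)]} {
        set raw $env(QUERY_STRING)
    }
    if {[info exists _cgi(debug)] && $_cgi(debug)} {
        puts "Input: $raw"
    }
    foreach pair [split $raw &] {
        if {![regexp {^([^=]*)=(.*)$} $pair -> name value]} continue
        set name  [unquote_sketch $name]
        set value [unquote_sketch $value]
        if {[string match *List $name]} {
            lappend _cgi_var($name) $value    ;# "...List" names collect a Tcl list
        } else {
            set _cgi_var($name) $value
        }
    }
}

input_sketch "name=Don+Libes&colorList=red&colorList=dark%20blue"
puts $_cgi_var(name)
puts [llength $_cgi_var(colorList)]
```

Splitting on "&" before decoding matters: a value that itself contains an encoded "&" (%26) must not be mistaken for a separator.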

Import/Export

Variables are not automatically entered into separate global variables or the env array because that would open a security hole. Instead, variables must be explicitly requested. Several procedures simplify this. The procedure most commonly used is "import".

import is called for each variable defined from the invoking form. For example, if a form used an entry with "name=foo", the command "import foo" would define foo as a Tcl variable with the value contained in the entry. The command import_cookie is a variation that obtains the value from a cookie variable - a mechanism that allows client-side caching of variables.

proc import {name} {
    upvar $name var
    upvar #0 _cgi_uservar($name) val

    set var $val
}
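A short usage sketch follows. It assumes the form's values have already been placed in _cgi_uservar; the array contents and the greet procedure are made up for illustration.

```tcl
# import as defined in the text.
proc import {name} {
    upvar $name var
    upvar #0 _cgi_uservar($name) val

    set var $val
}

# Made-up input for illustration; normally filled in when the input is read.
set _cgi_uservar(foo) "bar baz"

proc greet {} {
    import foo                ;# creates the local variable foo
    return "foo is $foo"
}

puts [greet]
```

Note that importing a variable the form did not supply raises an ordinary Tcl error, so nothing is defined behind the script's back.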

Form variables are automatically exported to the called CGI script. It is sometimes necessary to export other variables. This must be done explicitly. Figure 9 shows the export procedure, which exports the named variable. Similar to the text procedure, if the first argument is in the form "var=value", the variable is exported with the given value. Otherwise, the variable is treated as a Tcl variable and its value is used.
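The two calling conventions can be sketched as below. This is not the procedure of Figure 9. In particular, carrying the value in a hidden form field is an assumption of this sketch, as is the name export_sketch, and the tag is returned rather than printed so that it can be inspected.

```tcl
proc quote_html {s} {
    regsub -all {&} $s {\&amp;} s    ;# ampersand must be done first
    regsub -all {"} $s {\&quot;} s
    regsub -all {<} $s {\&lt;} s
    regsub -all {>} $s {\&gt;} s
    return $s
}

# Hypothetical sketch of export (not the paper's Figure 9).
proc export_sketch {nameval} {
    if {![regexp {^([^=]*)=(.*)$} $nameval -> name value]} {
        # Plain name: use the value of the caller's Tcl variable.
        set name $nameval
        upvar 1 $name var
        set value $var
    }
    # Assumption: the value travels to the next script as a hidden field.
    return "<input type=hidden name=\"$name\" value=\"[quote_html $value]\">"
}

set color blue
puts [export_sketch color]
puts [export_sketch "mode=a<b"]
```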

Error Handling

The CGI environment makes no special provisions for errors. Thus, error processing requires explicit handling by the application programmer. If none is made, any error messages produced (e.g., by the Tcl interpreter) are sent on to the client browser. These are rarely meaningful to the user. Even worse, they can be misinterpreted as HTML in which case the result is incomprehensible even to the script creator.

The procedure in Figure 10 provides a framework to evaluate the body of the CGI script, to automatically catch errors, and attempt to do something useful. The two arguments, head and body, are blocks of Tcl commands which create the head and body of an HTML form. An example is shown later.

If the global value _cgi(debug) is 1, the script error is formatted and printed to the screen so that it is readable. If debug is 0, a simple message is printed saying that an error occurred and that the "diagnostics are being emailed to the service system administrator". At the same time, mail is sent to the service administrator. The mail includes everything about the environment that is necessary to reproduce the problem including the error, the script name, and the input.

The implementation shown here is skeletal. In the actual definition, a variety of other interesting problems are handled. For instance, cookie definitions must appear in the output before any HTML. However, cookies are more easily generated as one of the final results in a script. This and other problems are solved by the full implementation; however, the details are beyond the scope of this paper.
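The overall shape of such an error-catching framework can be sketched with catch and uplevel. This is a simplified stand-in for Figure 10, not a copy of it: the name cgi_eval_sketch is made up, the mail step is reduced to a comment, and a return value is added only so that the outcome can be checked.

```tcl
proc quote_html {s} {
    regsub -all {&} $s {\&amp;} s    ;# ampersand must be done first
    regsub -all {"} $s {\&quot;} s
    regsub -all {<} $s {\&lt;} s
    regsub -all {>} $s {\&gt;} s
    return $s
}

# Hypothetical sketch: evaluate head and body in the caller's scope and
# trap any error. Returns 0 on success and 1 if an error was caught.
proc cgi_eval_sketch {head body} {
    global _cgi errorInfo
    if {![catch {
        uplevel 1 $head
        uplevel 1 $body
    } msg]} {
        return 0
    }
    set trace $errorInfo
    if {[info exists _cgi(debug)] && $_cgi(debug)} {
        # Quote the trace so it cannot be misread as HTML.
        puts "<pre>[quote_html $trace]</pre>"
    } else {
        puts "An error occurred; diagnostics are being emailed to the service system administrator."
        # A real implementation would mail $trace, the script name and the input here.
    }
    return 1
}

set _cgi(debug) 1
cgi_eval_sketch {
    puts "<title>Example</title>"
} {
    error "something went wrong"
}
```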

Using the procedures defined, CGI scripts become very simple. They all start out by sourcing the CGI support routines. Then cgi_eval is called with arguments to create the head and body. The head generates titles, link colors, etc., while the body is responsible for importing, exporting, and generating text and graphical elements as has already been described. A skeletal example is shown in Figure 11.

The title procedure (not shown) produces all of the usual HTML boilerplate including titles, backgrounds, etc. A form procedure simplifies the calling conventions for establishing any forms. This is not difficult. However, of critical importance is noting that a form is in progress. Because some browsers won't show anything if a form hasn't been ended (i.e., "</form>"), the error handler must prematurely close the form if an unexpected error occurs. Saving this information is done with a simple global variable. The form procedure is shown in Figure 12.
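The bookkeeping involved is small. The sketch below is not Figure 12; the names form_sketch, close_form_if_open and _cgi(form_in_progress), the POST method, and the action URL are all assumptions chosen for illustration.

```tcl
# Hypothetical form procedure that records whether a form is open.
proc form_sketch {action body} {
    global _cgi
    puts "<form action=\"$action\" method=post>"
    set _cgi(form_in_progress) 1
    uplevel 1 $body              ;# an error here leaves the flag set
    puts "</form>"
    set _cgi(form_in_progress) 0
}

# What an error handler could call before printing its own message.
proc close_form_if_open {} {
    global _cgi
    if {[info exists _cgi(form_in_progress)] && $_cgi(form_in_progress)} {
        puts "</form>"
        set _cgi(form_in_progress) 0
    }
}

if {[catch {form_sketch "/cgi-bin/example.cgi" {error "oops"}}]} {
    close_form_if_open
    puts "An error occurred."
}
```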

Many other utilities are necessary such as procedures for each type of form element. Space prevents inclusion of them. Several other miscellaneous utilities complete the basic implementation of the procedures that appear in this paper. A few are mentioned here to give a flavor for what is necessary:

cgi Converts a form name to a complete URL.

mail_start Generates headers and writes them to a new file representing a mail message to be sent.

mail_add Writes a new line to the temporary mail file.

mail_end Appends a signature to the temporary mail file, sends it, and deletes the file.

cgi_body_start Generates the <body> tag and handles user requests such as backgrounds and various color options. cgi_body_end is analogous.

All of the procedures described so far can be invoked with "cgi_" prepended (if they do not already begin that way). In practice, CGI scripts are generally quite short so this isn't often useful, and writing things like "cgi_h2" is particularly irritating. However, conflicts with other namespaces can occasionally make such prefixes a necessary evil.

Several procedures are expected to be redefined by the user. Here are two examples that appear in the body procedure earlier.

app_body_start Application-supplied procedure, typically for writing initial images or headers common to all pages.

app_body_end Application-supplied procedure, typically for writing signature lines, last-update-by, etc.

FAQ generation

Earlier I mentioned that CGI scripts are just a subset of HTML generation. As an example, consider the task of building an FAQ in HTML. There is no benefit to dynamically generating an FAQ - it rarely changes. However, an FAQ has some of the same problems as I described earlier. For example, it can include many links which must be kept current.

Another reason that it makes sense to think about generating HTML for an FAQ is that an FAQ is highly stylized. For example, an FAQ always has a set of questions. These questions are then repeated but with answers. Written manually, you would have to literally repeat the questions and create the links. If a new question was added or an old one deleted, you would have to carefully make sure that both entries were handled identically.

Intuitively, this could be automated using two loops. First, the questions and answers would be defined. Then the first loop would print the questions. The second loop would print the questions (again) interspersed with the answers. In pseudocode:

define QAs                ;# pseudocode!

foreach qa $QAs {
    print_question $qa
}

foreach qa $QAs {
    print_question $qa
    print_answer $qa
}

It suffices to store the questions and answers in an array. The following code numbers each pair and stores question N in qa(N,q) and the corresponding answer in qa(N,a). At the same time, the question is printed out. Thus, there is no need for the first loop in the earlier pseudocode.

proc question {q a} {
    global index qa

    incr index

    set qa($index,q) $q
    set qa($index,a) $a

    # print the question as a list item linked to its answer
    puts "<li><a href=\"#q$index\">$q</a>"
}

Each question is printed as a list item that links to its corresponding answer at #qN. When the question/answer pairs are later printed, each answer carries an A NAME tag defining the #qN target.

The source for an example question/answer definition is shown in Figure 13.

The question is now only stated once and it is always paired with the answer. This simplifies maintenance.

Notice that the answer is not simply a string. The answer is Tcl code. This makes it possible to use all of the techniques mentioned earlier. For example, the example above uses p to generate new paragraphs and link to generate hyperlinks.

The code is evaluated by passing the answer to eval whenever it is needed. An answer procedure does this and generates the hyperlink target at the same time.

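proc answer {i} {
    global qa

    # the anchor is the target of the question's #qN link;
    # the paragraph and list markup around it is approximate
    puts "<p>"
    puts "<a name=\"q$i\">"
    puts "<li>$qa($i,q)"
    puts "</a><p>"
    eval $qa($i,a)
}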

For example, "answer 0" would produce the beginning of the output from the earlier question. Rendered by a browser, the output would begin like this:

  • I keep hearing about Expect. So what is it?

Expect is a tool primarily for automating interactive . . .

The answer procedure itself is called from a loop in another procedure called answers (Figure 14). An answer_header procedure prints out a header if one has been associated with the current question. This provides a way of breaking the FAQ into sections. A matching procedure (question_header) defines and prints the headers as they are encountered.

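proc answer_header {i} {
    global qa

    h3 "$qa($i,h)"
}

proc question_header {h} {
    global index qa

    set qa($index,h) $h
    # close the question list, print the header, reopen the list
    # (list tags assumed; the original markup did not survive)
    puts "</ul>"
    h3 $h
    puts "<ul>"
}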
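Putting the pieces together, the loop that drives answer might look like the sketch below. It is not Figure 14: the name answers_sketch, the starting value of index (chosen so that the first question is number 0), the placement of the header before its answer, the simplified markup, and the sample questions are all assumptions. The helpers h3 and p are reduced to one-line stand-ins.

```tcl
# Stand-ins for the paper's helpers, reduced to the minimum.
proc h3 {s} {puts "<h3>$s</h3>"}
proc p {} {puts "<p>"}

set index -1        ;# assumed start, so that the first question is number 0

proc question {q a} {
    global index qa
    incr index
    set qa($index,q) $q
    set qa($index,a) $a
    puts "<li><a href=\"#q$index\">$q</a>"
}

proc answer {i} {
    global qa
    puts "<a name=\"q$i\"></a>"      ;# target of the question's link
    puts "<b>$qa($i,q)</b><p>"
    eval $qa($i,a)                   ;# the answer is Tcl code
}

# Hypothetical answers loop: one pass over everything question stored.
proc answers_sketch {} {
    global index qa
    for {set i 0} {$i <= $index} {incr i} {
        if {[info exists qa($i,h)]} {h3 $qa($i,h)}
        answer $i
    }
}

question "What is Tcl?" {
    puts "A scripting language."
    p
}
question "Is it free?" {
    puts "Yes."
}
answers_sketch
```

Because the questions were stored when they were first printed, the second pass needs no list of its own; the array and the counter are enough.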

Translation to Other Formats

Another benefit of using logical tags is that different output formats can be generated by changing the application-specific procedures. For instance, suppose a horizontal rule is produced using the hr command. Obviously this can be defined as "puts <hr>". It is easily changed to produce text using the following procedure:

proc hr {} {
    puts ============================
}

Here are analogous definitions for h1 and h2. Others are similar.

proc h1 {s} {
    puts ""
    puts "*"
    puts "* $s"
    puts "*"
    puts ""
}

proc h2 {s} {
    puts "*** $s ***"
}

For example, with this new definition, "h1 Questions" reasonably simulates a level 1 header using only text as:

*

* Questions

*

The ability to generate the FAQ in different forms is convenient. For example, it means that people can read the FAQ without having an HTML browser.

The generation of different formats is simplified by avoiding use of explicit HTML tags and instead using logical procedure names. A particular output format can be produced merely by providing an appropriate set of procedure definitions. Although I have not done so, it should be possible to adapt the framework and ideas shown here to produce output in such formats as TEX, MIF, and others. Even without translation, avoiding explicit HTML is a good idea for the reasons mentioned earlier - maintenance and readability.

A Translation Framework

Translation is further simplified by separating the application-specific definitions from the content of the particular document. For example, multiple FAQs could reuse the same set of FAQ support definitions. Each FAQ would start by loading the FAQ definitions by means of a source command appropriate to the desired output:

source FAQdriver.$argv

A driver for each output format defines the procedures to produce the FAQ in that particular format. For example, FAQdriver.html would begin:

# driver.html - Tcl to HTML procs
proc hr {} {puts "<hr>"}

FAQdriver.text would start similarly:

# driver.text - Tcl to text procs
proc hr {} {puts ===================}

If short enough, all of the different definitions can be maintained as a single file which simply uses a switch to define the appropriate definitions.

switch $argv {
    html {
        proc emphasis {s} {
            puts "<em>$s</em>"
        }
        . . .
    }
    text {
        proc emphasis {s} {puts "*$s*"}
        . . .
    }
}

In either case, output generation is then accomplished by executing the document with the argument describing the desired format. For example, assuming the FAQ source is stored in ExpectFAQ, HTML is generated from the command line as:

% ExpectFAQ html

Text output is generated as:

% ExpectFAQ text
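One weakness of a bare switch is that a mistyped format silently defines nothing. The sketch below wraps the same idea in a procedure with a default branch; the name load_driver and the error message are made up for this example, and only two procedures are defined per format.

```tcl
# Hypothetical wrapper around the switch shown earlier.
proc load_driver {format} {
    switch -- $format {
        html {
            proc hr {} {puts "<hr>"}
            proc emphasis {s} {puts "<em>$s</em>"}
        }
        text {
            proc hr {} {puts ===================}
            proc emphasis {s} {puts "*$s*"}
        }
        default {
            error "unknown output format \"$format\" (expected html or text)"
        }
    }
}

# A document would normally pass its command-line argument, for example
# "load_driver [lindex $argv 0]"; the format is fixed here for the demo.
load_driver text
emphasis "important"
hr
```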

Experiences

The techniques described in this paper have been used successfully in building several projects consisting of large numbers of pages including the NIST Application Protocol Information Base [Lubell] and the NIST Identifier Collaboration Service [Libes95]. In addition, they have been used to construct and maintain several FAQs including the Expect FAQ [Libes96].

Readers interested in comparative strategies to CGI generation should consult the Yahoo database [Yahoo] which lists CGI libraries for dozens of languages, often with multiple entries for each. Readers should also explore alternative strategies to CGI, such as the Tcl-based server-side programming demonstrated by Audience1 [Sah] and NeoScript [Lehen] which elegantly solve problems that CGI alone cannot address adequately.

The other aspect of this paper, dynamic document generation, is also an area rich in development. Various attempts are being made to solve this in other ways including SGML and its extensions and alternatives. Good discussion of these can be found in [Harman].

Concluding Notes

This paper has shown the benefits of generating HTML from Tcl scripts. CGI scripts are an obvious use of this. However, even static documents benefit by increasing readability and improving maintainability.

Traditionally, Perl has been the language of choice for CGI scripting. However, use of Tcl for CGI scripting has increased significantly. Part of this is simply due to the number of people who already know Tcl. But Tcl brings with it many beneficial attributes: Tcl is a simple language to learn. Its portability is excellent, it is robust, and it has no significant startup overhead. And of course it is easily embeddable in other applications making it that much easier to leverage ongoing development in languages such as C and C++.

These are all characteristics that make Tcl very attractive for CGI scripting. However, Tcl does not have a history of use for CGI scripting and there is little documentation to help beginners get started. Hopefully, this paper will make it easier for more people to get started writing CGI scripts in Tcl.

Availability

The CGI library described is available at http://www.cme.nist.gov/pub/expect/cgi.tcl.tar.Z. The FAQ library described can be retrieved from the Expect FAQ itself [Libes96]. This software is in the public domain. NIST and I would appreciate credit if you use this software.

Acknowledgments

Thanks to Josh Lubell, John Buckman, Mark Williamson, Steve Ray, and the Tcl `96 program committee for valuable suggestions on this paper.

References

[BLee] Berners-Lee, T., and Connolly, D., "Hypertext Markup Language - 2.0", RFC 1866, HTML Working Group, IETF, Corporation for National Research Initiatives, URL: http://www.w3.org/pub/WWW/MarkUp/html-spec/html-spec_toc.html, September 22, 1995.

[Harman] Harman, D., "Overview of the Third Text REtrieval Conference (TREC-3)", NIST Special Publication 500-225, NIST, Gaithersburg, MD, April 1995.

[Lehen] Lehenbauer, K., "NeoScript", URL: http://www.NeoSoft.com/neoscript/, 1996.

[Libes95] Libes, D., "NIST Identifier Collaboration Service", URL: http://www-i.cme.nist.gov/cgi-bin/ns/src/welcome.cgi, National Institute of Standards and Technology, 1995.

[Libes96] Libes, D., "Expect FAQ", URL: http://www.cme.nist.gov/pub/expect/FAQ.html, National Institute of Standards and Technology, 1996.

[Lubell] Lubell, J., "NIST Application Protocol Information Base", URL: http://www-i.cme.nist.gov/proj/apde/www/apib.htm, National Institute of Standards and Technology, 1996.

[Ouster] Ousterhout, J., "Tcl and the Tk Toolkit", Addison-Wesley Publishing Co., 1994.

[Sah] Sah, A., Brown, K., and Brewer, E., "Programming the Internet from the Server-Side with Tcl and Audience1", Tcl/Tk Workshop 96, Monterey, CA, July 10-13, 1996.

[Yahoo] "Yahoo!", URL: http://www.yahoo.com/Computers_and_Internet/Internet/World_Wide_Web/CGI___Common_Gateway_Interface/, April, 1996.


Writing CGI scripts in Tcl

--------------------------------------------------------------------------------

Don Libes

National Institute of Standards and Technology

libes@nist.gov

Reprinted from The Proceedings of the Fourth Annual Tcl/Tk Workshop `96, Monterey, CA, July 10-13, 1996.



--------------------------------------------------------------------------------

1 It is tempting to think that relative URLs can simplify this, but relative URLs only apply to URLs that are, well, relative. In this example, the URLs point to a different host than the one where the referring page lives. Even if this isn't the case, I avoid relative URLs because they prevent other people from copying the raw HTML and pasting it into their own page (again, on another site) without substantial effort in first making the URLs absolute.