Tcl 7.5 - A Good Read

This article was originally written for a session at the 2012 SPA conference entitled "A Good Read". Unlike the Radio 4 programme of the same name, we were reading code not books. Each participant presented some code that they thought was worth reading and explained why.

My choice was Tcl 7.5. Not the language, which I find rather clunky, but the C implementation of the interpreter and intrinsic library.

I first used Tcl 6 early in my career to build a tool to script the management of network switches. At the time, Tcl couldn't dynamically load new commands into an interpreter. You embedded Tcl as a library in your own program and added commands to the interpreter that called back into your own code. To learn how to embed and extend the interpreter I had to read Tcl's C source.

It was probably the first time I'd had to read a non-trivial C program. And, when I think of some of the C I've seen since, I was very lucky that this was the first C I was exposed to. Tcl showed me that it's possible to write clean, modular, maintainable code in C and how to do so. I continued to use Tcl and Tk, Tcl's GUI toolkit, on other projects and saw how well the code evolved over time.

By version 7.5, the version I talked about at SPA, the code base had grown quite a lot but was still just as easy to read. You can still download the Tcl 7.5 source code if you want to judge for yourself.

So, what did I learn from Tcl about how to modularise a large C codebase?

Be Disciplined About Namespace Management

C doesn't have namespaces: all top-level identifiers are either static, meaning private to a compilation unit, or part of a single global namespace. C programs rely on naming conventions to avoid collisions. The typical convention is for a library or module to use a (hopefully) unique prefix for all its identifiers.

Tcl is very strict about its use of prefixes and naming conventions. Types and functions are in CamelCase, macros and constants in UPPER_CASE_WITH_UNDERSCORES. Static functions don't need a prefix. Identifiers that are part of the published API are prefixed with Tcl_ (with an underscore). Identifiers that are shared internally between Tcl's modules but are not part of the published API are prefixed with Tcl (no underscore). Macros and constants are prefixed with TCL_.

For example, from generic/tcl.h:

typedef struct Tcl_Interp {
    ...
} Tcl_Interp;

EXTERN Tcl_Interp *Tcl_CreateInterp(void);

Note: in the code samples I've not shown macros that allow the code to compile as ANSI C or K&R C. They are not needed these days.

Clearly Document Published APIs

The Tcl APIs are all really well documented and commented. Every function has a block comment that describes its behaviour, parameters, return value, side effects, and how it reports errors. Even static functions that are not part of the API are commented. Each type is described and each field of a struct type is followed by a comment describing what it holds. Each constant and enum value has a comment describing what it represents and where it is used.

For example, from generic/tclInterp.c:

/*
 *----------------------------------------------------------------------
 *
 * Tcl_CreateInterp --
 *
 *      Create a new TCL command interpreter.
 *
 * Results:
 *      The return value is a token for the interpreter, which may be
 *      used in calls to procedures like Tcl_CreateCmd, Tcl_Eval, or
 *      Tcl_DeleteInterp.
 *
 * Side effects:
 *      The command interpreter is initialized with an empty variable
 *      table and the built-in commands.
 *
 *----------------------------------------------------------------------
 */

Tcl_Interp *Tcl_CreateInterp() ...

The documentation is in the C files, not in the generic/tcl.h header file, where you'd first look for it, but Tcl has Unix manual pages for all the Tcl API functions, so that's not such a problem.

Avoid Static and Global State

Tcl avoids static and global state. When looking through the Tcl code to prepare the talk, I found only a handful of static variables in tclAsync.c and tclEvent.c to manage the per-process event queue and not a single global variable.

Instead, a program using Tcl must instantiate abstract data types by calling API functions and pass pointers to those instances to the Tcl APIs. For example, before you can interpret any Tcl code, you must create a Tcl_Interp (interpreter) by calling Tcl_CreateInterp(). This returns a pointer to a Tcl_Interp that is then passed to functions that define native Tcl commands or evaluate Tcl code.
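To make the pattern concrete, here is a minimal sketch of my own rather than code from Tcl. The Ex_Counter type and the Ex_ functions are hypothetical names invented for illustration. All state lives in an instance that the caller creates and passes to every function, so two instances never interfere with each other.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical example type (not part of Tcl): all state lives in an
 * instance, never in a static or global variable. */
typedef struct Ex_Counter {
    int next;                   /* Next value to hand out. */
} Ex_Counter;

/* Allocate a counter that starts at 'first'.  Returns NULL if memory
 * is exhausted. */
Ex_Counter *Ex_CreateCounter(int first)
{
    Ex_Counter *counterPtr = malloc(sizeof(Ex_Counter));
    if (counterPtr != NULL) {
        counterPtr->next = first;
    }
    return counterPtr;
}

/* Return the current value and advance this instance only. */
int Ex_NextValue(Ex_Counter *counterPtr)
{
    assert(counterPtr != NULL);
    return counterPtr->next++;
}

/* Release an instance created by Ex_CreateCounter. */
void Ex_DeleteCounter(Ex_Counter *counterPtr)
{
    free(counterPtr);
}
```

A caller that creates one counter starting at 1 and another starting at 100 gets two independent sequences, which is the property that lets each thread own its own instance without locking.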

The lack of global state paid off later when the Tcl team added support for multithreading and sandboxed execution.

Tcl doesn't need a global interpreter lock. Instead, each thread has its own Tcl_Interp instance and interpreters in different threads communicate by message passing. It helps that Tcl has extensive support for event-driven I/O, thanks to the influence of its Tk GUI toolkit.

To sandbox untrusted code, Tcl creates supervisor relationships between interpreters. An interpreter that receives some untrusted code creates a subordinate interpreter to run it. A subordinate interpreter initially has none of the standard commands that can do unsafe activities, such as access system resources or make network connections. The supervisor interpreter defines commands in the subordinate that have the same API as the standard library but call back to the supervisor to be checked against the supervisor's security policies. That moves the problem of security out of the language runtime and makes it scriptable.

Keep Data Types Abstract

When using the Tcl API, many of the data types you use are entirely abstract. You instantiate and manipulate them only through API functions, and do not rely on their precise definition or read and write their fields.

Some data types do expose public fields for performance reasons, although most of those use macros to provide a uniform programming interface and shield source code from future changes to the type.

For example, from generic/tcl.h:

/*
 * Data structures defined opaquely in this module.  The definitions
 * below just provide dummy types.  A few fields are made visible in
 * Tcl_Interp structures, namely those for returning string values.
 * Note:  any change to the Tcl_Interp definition below must be mirrored
 * in the "real" definition in tclInt.h.
 */

typedef struct Tcl_Interp {
    char *result;               /* Points to result string returned by last
                                 * command. */
    void (*freeProc)(char *blockPtr);
                                /* Zero means result is statically allocated.
                                 * TCL_DYNAMIC means result was allocated with
                                 * ckalloc and should be freed with ckfree.
                                 * Other values give address of procedure
                                 * to invoke to free the result.  Must be
                                 * freed by Tcl_Eval before executing next
                                 * command. */
    int errorLine;              /* When TCL_ERROR is returned, this gives
                                 * the line number within the command where
                                 * the error occurred (1 means first line). */
} Tcl_Interp;

typedef struct Tcl_Command_ *Tcl_Command;

...

EXTERN Tcl_Interp *    Tcl_CreateInterp(void);
EXTERN void            Tcl_DeleteInterp(Tcl_Interp *interp);
EXTERN int             Tcl_Eval(Tcl_Interp *interp, char *cmd);
EXTERN int             Tcl_EvalFile(Tcl_Interp *interp, char *fileName);
EXTERN Tcl_Command     Tcl_CreateCommand(Tcl_Interp *interp,
                            char *cmdName, Tcl_CmdProc *proc,
                            ClientData clientData,
                            Tcl_CmdDeleteProc *deleteProc);
...
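The same idea can be sketched outside Tcl. The following is my own illustration, not Tcl code: Ex_Stack and its functions are hypothetical, and the comments mark where the header and implementation file would be split in a real project. Client code sees only a pointer to an incomplete struct, in the style of Tcl_Command above, so it cannot read or write the fields.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- Would live in the public header (hypothetical ex.h) ---- */

typedef struct Ex_Stack_ *Ex_Stack;     /* Opaque handle: clients never
                                         * see the fields. */

Ex_Stack Ex_CreateStack(void);
int      Ex_Push(Ex_Stack stack, int value);
int      Ex_Pop(Ex_Stack stack, int *valuePtr);
int      Ex_Depth(Ex_Stack stack);
void     Ex_DeleteStack(Ex_Stack stack);

/* ---- Would live in the implementation (hypothetical exStack.c) ---- */

struct Ex_Stack_ {
    int *items;                 /* Heap-allocated array of values. */
    int count;                  /* Number of values currently stored. */
    int capacity;               /* Allocated length of items. */
};

Ex_Stack Ex_CreateStack(void)
{
    Ex_Stack stack = malloc(sizeof(struct Ex_Stack_));
    if (stack != NULL) {
        stack->items = NULL;
        stack->count = 0;
        stack->capacity = 0;
    }
    return stack;
}

/* Push a value.  Returns 1 on success, 0 if memory ran out. */
int Ex_Push(Ex_Stack stack, int value)
{
    assert(stack != NULL);
    if (stack->count == stack->capacity) {
        int newCapacity = (stack->capacity == 0) ? 8 : stack->capacity * 2;
        int *newItems = realloc(stack->items,
                (size_t) newCapacity * sizeof(int));
        if (newItems == NULL) {
            return 0;
        }
        stack->items = newItems;
        stack->capacity = newCapacity;
    }
    stack->items[stack->count++] = value;
    return 1;
}

/* Pop into *valuePtr.  Returns 1 on success, 0 if the stack is empty. */
int Ex_Pop(Ex_Stack stack, int *valuePtr)
{
    assert(stack != NULL && valuePtr != NULL);
    if (stack->count == 0) {
        return 0;
    }
    *valuePtr = stack->items[--stack->count];
    return 1;
}

/* Report how many values are stored without exposing the field. */
int Ex_Depth(Ex_Stack stack)
{
    assert(stack != NULL);
    return stack->count;
}

void Ex_DeleteStack(Ex_Stack stack)
{
    if (stack != NULL) {
        free(stack->items);
        free(stack);
    }
}
```

Because callers only ever hold an Ex_Stack handle, the array could later be replaced by a linked list without recompiling any client code.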

Use Polymorphism Judiciously

Tcl also has a few object-oriented types that let the interpreter treat different things in a uniform way and let the user plug application-specific implementations into the interpreter and runtime. For example, different I/O channel types — files, pipes, TCP/IP sockets, etc. — have the same polymorphic interface so that Tcl's I/O commands and event loop work with all channel types and the programmer can add new channel types if necessary.

In Tcl, polymorphic objects are implemented by structs that contain a pointer to a function table, a pointer to some class-specific instance data, and "base class" fields that are common to all implementations of the type. The functions in the table take a pointer to the object as their first parameter.

This gives you one level of single-inheritance, but that's enough for user-defined plugins.

typedef struct Tcl_ChannelType {
    char *typeName;                         /* The name of the channel type in Tcl
                                             * commands. This storage is owned by
                                             * channel type. */
    Tcl_DriverBlockModeProc *blockModeProc; /* Set blocking mode for the
                                             * raw channel. May be NULL. */
    Tcl_DriverCloseProc *closeProc;         /* Procedure to call to close
                                             * the channel. */
    Tcl_DriverInputProc *inputProc;         /* Procedure to call for input
                                             * on channel. */
    Tcl_DriverOutputProc *outputProc;       /* Procedure to call for output
                                             * on channel. */
    ...
} Tcl_ChannelType;

EXTERN Tcl_Channel Tcl_CreateChannel(Tcl_ChannelType *typePtr, 
                                     char *chanName,
                                     Tcl_File inFile, 
                                     Tcl_File outFile,
                                     ClientData instanceData);

EXTERN int Tcl_Read(Tcl_Channel chan, char *bufPtr, int toRead);
EXTERN int Tcl_Write(Tcl_Channel chan, char *s, int slen);
EXTERN int Tcl_Close(Tcl_Interp *interp, Tcl_Channel chan);
...
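Stripped of the channel details, the technique looks roughly like the sketch below. It is my own simplified illustration, not how Tcl's channels are actually implemented: the Ex_Sink type, its single write procedure and the two implementations are all hypothetical. Each object holds a pointer to a function table, a pointer to implementation-specific data, and one common "base class" field.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical "sink" type, loosely modelled on the pattern above. */
typedef struct Ex_Sink Ex_Sink;

typedef int (Ex_SinkWriteProc)(Ex_Sink *sinkPtr, const char *buf, int len);

typedef struct Ex_SinkType {
    const char *typeName;           /* Name of this implementation. */
    Ex_SinkWriteProc *writeProc;    /* Called to consume len bytes of buf. */
} Ex_SinkType;

struct Ex_Sink {
    const Ex_SinkType *typePtr;     /* Function table for this object. */
    void *instanceData;             /* Implementation-specific state. */
    long totalWritten;              /* "Base class" field common to all
                                     * sinks. */
};

/* Uniform entry point: dispatches through the function table and
 * maintains the common field. */
int Ex_Write(Ex_Sink *sinkPtr, const char *buf, int len)
{
    int written;

    assert(sinkPtr != NULL && sinkPtr->typePtr != NULL && len >= 0);
    written = sinkPtr->typePtr->writeProc(sinkPtr, buf, len);
    if (written > 0) {
        sinkPtr->totalWritten += written;
    }
    return written;
}

/* Implementation 1: discards its input. */
static int NullWrite(Ex_Sink *sinkPtr, const char *buf, int len)
{
    (void) sinkPtr;
    (void) buf;
    return len;
}

const Ex_SinkType exNullSinkType = { "null", NullWrite };

/* Implementation 2: appends into a fixed-size buffer. */
typedef struct BufferState {
    char data[64];                  /* Bytes written so far. */
    int used;                       /* Number of valid bytes in data. */
} BufferState;

static int BufferWrite(Ex_Sink *sinkPtr, const char *buf, int len)
{
    BufferState *statePtr = sinkPtr->instanceData;
    int space = (int) sizeof(statePtr->data) - statePtr->used;

    if (len > space) {
        len = space;                /* Short write when the buffer is full. */
    }
    memcpy(statePtr->data + statePtr->used, buf, (size_t) len);
    statePtr->used += len;
    return len;
}

const Ex_SinkType exBufferSinkType = { "buffer", BufferWrite };
```

Code that calls Ex_Write never needs to know which implementation it is talking to, and a new kind of sink can be added by supplying another Ex_SinkType table.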

Prefer Modularisation Over Conditional Compilation

Tcl 7.5 was the first version that supported platforms other than Unix. The Tcl API now provided platform-independent APIs for I/O and other operating system services, and for dynamically loading native extensions into the interpreter.

Unlike a lot of C code I've seen since, Tcl did not implement platform independence by conditional compilation. Instead, the public APIs were split into platform-independent and platform-dependent functions. The platform-independent functions interface with the Tcl interpreter and call the platform-dependent functions to interface with the operating system.

There are implementations of the platform-dependent function for each platform that requires a different implementation. For example, I/O functions are implemented in the source files unix/tclUnixChan.c for POSIX, win/tclWinChan.c for Win32 and mac/tclMacChan.c for MacOS.

At the time, Unix had no standard API for loading dynamic libraries, so functions to load extensions are implemented for many different Unix variants. As well as win/tclWin32DLL.c and mac/tclMacLoad.c, loading is implemented in unix/tclLoadAix.c, unix/tclLoadAout.c, unix/tclLoadDl.c, unix/tclLoadDl2.c, unix/tclLoadDld.c, unix/tclLoadNext.c, unix/tclLoadOSF.c, unix/tclLoadShl.c!

The appropriate platform-dependent functions are compiled and linked into the Tcl interpreter by the Makefile. This approach, supported by the clear naming conventions, makes the code much easier to understand than if platform-dependent code is intermingled with platform-independent code and switched on or off with #ifdefs.
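Here is a rough single-file sketch of that layout. It is my own example, not taken from Tcl: the function names and file names in the comments are hypothetical. In a real project each marked section would be a separate file, and the Makefile would choose which platform file to compile.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* ---- Internal header (hypothetical exInt.h) ----
 * Declared once; defined by exactly one platform file per build. */
char ExPathSeparator(void);

/* ---- Platform-independent file (hypothetical generic/exPath.c) ---- */

/* Join dir and name into buf using the platform's separator.
 * Returns 1 on success, 0 if buf is too small. */
int Ex_JoinPath(const char *dir, const char *name, char *buf, size_t bufSize)
{
    int needed;

    assert(dir != NULL && name != NULL && buf != NULL);
    needed = snprintf(buf, bufSize, "%s%c%s", dir, ExPathSeparator(), name);
    return needed >= 0 && (size_t) needed < bufSize;
}

/* ---- Platform file (hypothetical unix/exUnixPath.c) ----
 * A hypothetical win/exWinPath.c would define the same function
 * returning '\\', and the Makefile would compile one file or the
 * other.  Neither file contains an #ifdef. */
char ExPathSeparator(void)
{
    return '/';
}
```

With the Unix file linked in, joining "usr" and "lib" produces "usr/lib". The generic code never mentions a platform by name, so adding a new platform means adding a file rather than editing existing ones.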

The Best Way to Modularise a Large C Codebase is Don't Have a Large C Codebase!

The addition of dynamic loading of native libraries taught me a new way to modularise C code — as small Tcl extensions glued together by Tcl code.

Each Tcl extension acts as an independent component that provides and requires services. An extension provides services by defining new commands that can be executed by a script. It uses services provided by other components by running small callback scripts. Unfortunately, because Tcl 7.5 has no closures or objects, all callbacks run in the global scope and must interact through global variables.

Each Tcl extension is small, so complexity does not grow to the point that the limitations of the C language become overwhelming. For example, namespace pollution is less of an issue because Tcl calls into the extension through a small interface of well-known function names. You can use the linker to limit the functions that are exported from the extension's dynamic library to a known set, hiding any others inside the library. This gives you three levels of visibility rather than the two you normally get with C: functions that are private to a compilation unit, functions that are visible between compilation units of the extension but not visible to other extensions, and the functions published by the extension for use by Tcl and other extensions.
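As a hedged illustration of the three levels, the sketch below uses the symbol visibility attribute supported by GCC and Clang. That is an assumption on my part about a present-day toolchain, not what Tcl 7.5 itself did; linker export lists achieve the same effect. The Ex prefix and all function names are hypothetical, and on other compilers the macros expand to nothing so the code still builds.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: GCC or Clang on a platform with symbol visibility
 * support.  Elsewhere both macros expand to nothing. */
#if defined(__GNUC__) && !defined(_WIN32)
#   define EX_EXPORT   __attribute__((visibility("default")))
#   define EX_INTERNAL __attribute__((visibility("hidden")))
#else
#   define EX_EXPORT
#   define EX_INTERNAL
#endif

/* Level 1: private to this compilation unit, so no prefix is needed. */
static int Square(int x)
{
    return x * x;
}

/* Level 2: shared between the extension's own source files but hidden
 * from other libraries.  Prefix without an underscore. */
EX_INTERNAL int ExSumOfSquares(const int *values, int count)
{
    int i, sum = 0;

    assert(values != NULL || count == 0);
    for (i = 0; i < count; i++) {
        sum += Square(values[i]);
    }
    return sum;
}

/* Level 3: published by the extension.  Prefix with an underscore. */
EX_EXPORT int Ex_SquaredLength(int x, int y)
{
    int values[2];

    values[0] = x;
    values[1] = y;
    return ExSumOfSquares(values, 2);
}
```

When this is built as a shared library with a supporting toolchain, only Ex_SquaredLength should appear in the library's exported symbols, even though ExSumOfSquares can still be called from the extension's other source files.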

Tcl embodies a very useful principle: C doesn't have a good module mechanism, so keep C components small and glue them together via some framework that implements a module mechanism for you.

When C components are used to extend an interpreted language, the resulting architecture is known as Alternate Hard and Soft Layers. Unix shell filters also follow this principle but use streams, rather than an interpreted language, to glue small C components together.

Mistaeks I Hav Made