Teyjus - An Implementation of Lambda Prolog

What exactly is Teyjus?
About the Name of the System
Overview of the Lambda Prolog Abstract Machine
Overview of the Teyjus System
Related Systems and Implementations
The People Behind Teyjus

What is Teyjus?

The Teyjus system is intended to be an efficient and (eventually) robust implementation of the language Lambda Prolog that was designed by Dale Miller and Gopalan Nadathur. Lambda Prolog is a logic programming language that is based on the intuitionistic theory of higher-order hereditary Harrop formulas. This class of formulas enriches that of Horn clauses---the logical basis of Prolog---with the possibilities of quantifying over function and (certain occurrences of) predicate variables, of explicitly representing binding in terms and of using a fuller complement of connectives and quantifiers. By systematically exploiting the new features in the underlying logic, Lambda Prolog provides support at the programming level for capabilities such as higher-order programming, polymorphic typing, scoping over names and procedures, modular programming, abstract data types and the use of lambda terms as data structures.

Much research has been conducted in recent years towards fully understanding the usefulness of the additions to logic programming embodied in Lambda Prolog. Two aspects that have received special attention are the availability of lambda terms for representing objects and of a suitable set of primitives for manipulating such representations. These features enable Lambda Prolog to support the notion of higher-order abstract syntax, a profitable way to view the syntax of objects whose structures involve binding. Several detailed studies have, in fact, indicated the utility of this language in building symbolic systems that manipulate objects such as formulas, programs, proofs and types. Motivated by such applications, Nadathur and colleagues have invested extensive effort in understanding the special implementation challenges posed by the new features of the language and in devising methods for meeting these well. An outcome of this research has been the description of an abstract machine for lambda Prolog. The Teyjus system is based on these ideas and includes a portable, C-based, emulator for the abstract machine and a compiler for translating lambda Prolog programs into byte code that can be run using a realization of the abstract machine.

About the Name of the System

Teyjus is a word in Sanskrit that, amongst other things, means sharpness, edge, brilliance, splendour, vitality, strength, and majesty. Many of these are properties that the system aspires to in the long haul. As with all words from a different language, and especially one with a rich variety of sounds not present in English, there is a question of what the `right' transliteration for the name is. One found commonly in Sanskrit to English dictionaries is tég-as. However, we found it difficult to steer others to the right pronounciation using this way of writing the name and, of course, we would like everyone to pronounce the name correctly! The way we have chosen to write the name might make the task easier. Here is some help: try saying `Tey' like you would `Day' except for replacing the first letter with `D', and `jus' like you would `just' but for leaving out the last letter. This should get you close; unfortunately, there is no sound in English exactly like the one the first `T' is intended to produce.

If you are genuinely interested in getting the pronounciation right, you may click here to hear it spoken.

Overview of the Lambda Prolog Abstract Machine

An integral part of the implementation scheme is an abstract machine that is capable of realizing the operations that arise in typical Lambda Prolog programs efficiently. As with other logic programming languages, unification and backtracking are intrinsic to Lambda Prolog and the Warren Abstract Machine (WAM) provides a basic structure for treating these aspects well. However, an extensive embellishment of this framework is needed for realizing the following additional features satisfactorily:

The language contains primitives that can alter the name space and the definitions of procedures in the course of execution. This means, in particular, that unification has to pay attention to changing signatures and that the solution to each (sub)goal has to be relativized to a specific program context.
In contrast to other languages, lambda terms are used in Lambda Prolog as data structures. A representation must therefore be provided for these terms that permits their structures to be examined and compared in addition to supporting reduction operations efficiently.
Higher-order unification is used in an intrinsic way in the language. This operation has a branching character that must be supported. Further, it is sometimes preferable to delay the solution to unification problems and so a good method is needed for transmitting these across computation steps.
In addition to having a role in determining program correctness, types are relevant to the dynamic behavior of programs. A sensible scheme must therefore be included for carrying these along at run-time.
Programming in the language is done relative to modules. In realizing this feature, it is necessary to support certain operations for composing different modules. Moreover, a mechanism must be provided for periodically adding and removing code depending on which modules are in use.

The abstract machine that has been developed includes devices for treating all these aspects well. The solution to the problem of changing signatures is based on an elegant scheme for tagging constants and variables and using these tags in unification. To realize changing program contexts, a fast method has been designed for adding and removing code that is capable also of dealing with backtracking. The code that needs to be added may sometimes contain global variables and this possibility has been dealt with by an adaptation to logic programming of the idea of a closure. To facilitate a sensible representation of lambda terms, a new notation has been designed for these terms that utilizes the scheme of de Bruijn for eliminating names and that additionally supports an incremental calculation of reduction substitutions. This notation has then been deployed systematically in the low-level steps contained in the abstract machine. A method has been devised for the purpose of representing suspended unification problems that has several interesting features. For example, based on the observation that new such problems arise out of incremental changes to old ones, the scheme is designed to support sharing while still making it possible to rapidly reinstate a previous unification problem upon backtracking. The treatment of higher-order unification itself includes compilation and prioritizes deterministic computations while delaying the truly non-deterministic parts. This approach makes it possible, for instance, to realize a generalization of first-order unification in a completely deterministic fashion. The technique that is adopted for propagating types at run-time uses information already present during compilation to reduce substantially the effort to be expended dynamically. Using this approach, virtually no new computation is required relative to a monomorphic subset of the language that is similar to Prolog. Finally, towards supporting modular programming, a method has been designed for realizing module interactions that permits separate compilation. The actual addition and removal of modules of code can be achieved by methods developed for handling scoping over program clauses. However, these methods have been embellished by efficient devices for determining if a block of code will be redundant in a given context and also by mechanisms for realizing information hiding.

Overview of the Teyjus System

An implementation of Lambda Prolog on stock hardware based on the the present ideas envisages four software subsystems: a compiler, a loader, an emulator for the abstract machine and a user interface. The function of the compiler is to process any given module of Lambda Prolog code, to certify its internal consistency and to ensure that it satisfies a promise determined by an associated signature and, finally, to translate it into a byte code form consisting of a `header' part relevant to realizing module interactions and a `body' containing sequences of instructions that can be run on the abstract machine. The purpose of the loader is to read in byte code files for modules that need to be used, to resolve names and absolute addresses using the information in the header parts of these files and to eventually produce a structure consisting of a block of code together with information for linking this code into a program context when needed. The emulator provides the capability of executing such code after it has been linked. Finally, the user interface allows for a flexibility in the compilation, loading and use of modules in an interactive session.

The Teyjus system embodies all the above components and comprises about 50,000 lines of C code. The functionality outlined above is realized in its entirety in a development environment. Also supported is the use of the compiler on the one hand and the loader and emulator on the other in standalone mode. The system architecture actually makes byte code files fully portable. Thus, Lambda Prolog modules can be distributed in byte code form, to be executed later using only the loader/emulator. Finally, the system includes a disassembler for the purpose of viewing the results of compilation.

Related Systems and Implementations

The main strength of Lambda Prolog is in the capabilities it offers for representing and manipulating objects whose structures incorporate binding. Particular examples of such objects are formulas, proofs and programs. Several systems have been built in the past for reasoning about such objects. Nuprl, Coq and Isabelle are some recent systems that exploit ideas similar to those in Lambda Prolog for representing formulas and proofs. These systems fall under the rubric of logical frameworks, in that they provide mechanisms for specifying a variety of logics and for constructing proof development systems relative to these. More details about such systems can be found in the page on logical frameworks maintained by Frank Pfenning. In contrast to such systems, Lambda Prolog is a programming language, albeit with special capabilities. It is, in this sense, more closely related to the language Elf that is based on a theory of dependent types.

Lambda Prolog has received several implementations in the past. All but one of these implementations have been in the form of interpreters, the most recent one being the Terzo interpreter written in Standard ML. The only other implementation to adopt a compilation approach is Prolog/MALI. There are, however, significant differences between Teyjus and Prolog/MALI. The core of the latter implementation is a memory management system called MALI that offers functionalities relevant to realizing logic programming languages. To a first approximation, Lambda Prolog programs are compiled to C programs that execute MALI commands for creating and interpreting goal structure. Certain inefficiencies, such as the copying of entire goal structures, appear to be inherent to this approach as opposed to the abstract machine based approach used in Teyjus. There are also differences in the treatment of specific aspects such as the representation and runtime manipulation of types, the representation of terms, the realization of reduction, and the implementation of the scoping primitives. Finally, Prolog/MALI appears to support a different notion of modularity.

The People Involved in the Project

The Teyjus system is the result of many years of research and development and has benefitted from the involvement of several people over time. The overall project has been directed by Gopalan Nadathur. The abstract machine, loader and compilation methods that underlie this implementation of Lambda Prolog have been designed by Gopalan Nadathur. Bharat Jayaraman and Keehang Kwon contributed valuable ideas to the structure of an earlier version of the abstract machine. The implementation of these three components, and of the concomitant higher-order unification procedure, in the present system is due to Gopalan Nadathur. A preliminary version of the loader was implemented by Guanshan Tong. Dustin Mitchell has modified the low-level representation of Lambda Prolog types and terms towards realizing portability and is also responsible for the implementation of the changed scheme. The disassembler, the front-end for the system and the implementation of many builtins, especially those dealing with I/O, are due to him. Finally, he has also undertaken the rationalization and (re)organization of innumerable components of the project and its documentation. Shyan-Ming Perng provided initial implementations of many of the arithmetic and logical builtins and also of routines for reading and writing terms. Emil Ong is undertaking the porting of the system to Windows based platforms. Lyn Headley contributed to the implementation of the builtins.

An indispensable component of the release of a system of significant size such as Teyjus is its testing. Dale Miller has patiently used the system from its first, rough and ready, days, has helped pin-point several bugs and has also volunteered several suggestions towards improving its functionality. Others who have provided valuable feedback of this kind are Amy Felty, Chuck Liang, Frank Pfenning and Jeremie Wajs.

Last updated by gopalan@cs.umn.edu on August 12, 2003