© Jrgen Steensgaard-Madsen, Copenhagen, 2006
Deep- and surface structure                                                  

An archetypal example of splitting a language into levels is the Lisp family of languages. The notion of S-expressions and their use to represent application of operations form a common nucleus. Along with rules for writing constants, for quoting (and perhaps unquoting) S-expressions, and the decision to depend on dynamic type checks, it constitutes a deep structure of all members. A surface structure fixes the details about which operations are predefined, how derived operations may be defined, as well as rules for finding the meaning of a name.
Another example is the core of Tcl/Tk that allows developers a disciplined way to add new operations and statement forms to a new language, one of which is Tcl/Tk. We do not aim to provide anything near to a satisfactory overview, so others probably exist.
Libraries for use with C could of course be considered as extensions of the language, but C contains too many details to qualify as a deep structure in our opinion, but readers should not expect a clear demarcation between deep- and surface structure, but a formal treatment may help.
A kind of core problems for the programming language community is to define functions of type:
L:(G × S)  P  I  O
where the letters vary over types of grammars, semantics, programs, inputs and outputs.
An interpreter for a given language is a function of the following kind
J : P  I  O
which may be obtained through specialisation: J may be the image of a function of type (G ×S)  (P I  O) for given grammar gJ :G and a semantics sJ :S. The function can be implemented with tools like yacc and lex.
The grammar and semantics of some core problems may be split into a deep and a surface structure such that
L: (GD × SD) (GT × ST)  P  I  O
For a given (gD , sD ) : GD × SD there will be a D = L(gD , sD )
D : (GT × ST)  P  I  O
Tools like yacc and lex may be used to construct a program for L and in such a way that an interpreter can be implemented as J = D(gT, sT) by D 's implementation.
Expertise in programming language design and implementation is required to identify an appropriate deep structure. Experts in application domains should use D as an implementation tool for a given surface structure and consequently need not be concerned much with language properties that are supported fully by D.
With a good split, a language description may have some pleasing properties, as for instance extensibility defined as
D (g1,T, s1,T) D (g1,T g2,T, s1,T s2,T)
D (g2,T, s2,T) D (g1,T g2,T, s1,T s2,T)
Informally this means that language contributions can be heaped on top of one another to form an entire language. The operator on semantics denotes union of sets, on grammars union of maps when domains are disjoint. If domains of maps are not disjoint, denotes overwrite and the operation is not commutative.
Our program dulce is such a D implemented with the tools flex and bison. The major part of dulce is reused in every interpreter and is later referenced as the spine.


Demo language
Implementation tool

Interpreter construction

Deep- and surface structures
Design issues
Abstract type operatorss
Description of operator
Name introduction
Data abstraction and control

File translated from TEX by TTH, version 3.33.
On 18 Oct 2006, 16:47.
SourceForge.net Logo