To Andrew and all other interested parties,
Great description of your pattern calculus extensions. I must admit
though that I don't yet totally see the gains here. Let me explain.
You started off with a definition of what a term T may be in your
calculus:
T ::= C | V | (T T) | (T -> T | T) | (Fix T.T)
Well, it turns out that all but the last two productions (which I'll
come back to) are the fundamental elements of McCarthy's original Lisp.
In his 1960 paper, "Recursive Functions of Symbolic Expressions and
Their Computation by Machine, Part I", he described every evaluating
element in a Lisp program as an expression, rather than a term. An
expression could be a literal, a symbol, a variable (represented as a
symbol), a function of the form (lambda (p1...pN) e), or a special
form that protected evaluation of its arguments (allowing for flow
control and homoiconicity of code and data).
There were, in fact, only 7 elementary forms in this first Lisp: cons,
car, cdr, atom, eq, cond, and quote.
The first five are functions, and the last two are special forms,
since they protect the evaluation of their arguments.
These 7 Lisp "axioms" along with the lambda construct (for functions)
and the list (actually expression tree) representation of all
non-atomic forms, give you all you need to define the fundamental
self-evaluation operation, eval, which can then run any program
expressed in this language. For those sufficiently interested in how
this can be done, I recommend Paul Graham's excellent 2002 paper "The
Roots of Lisp," which you can read here:
You can find the complete code in Common Lisp for a Lisp eval based
purely on these axioms here:
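In the meantime, here is a minimal sketch of the idea in Python rather than Lisp, just to keep this self-contained: s-expressions encoded as nested Python lists, symbols as strings. The name lisp_eval and this particular encoding are mine, not McCarthy's, and plenty of corner cases are skipped.

```python
# A minimal sketch of a McCarthy-style eval in Python: s-expressions are
# nested Python lists, symbols are strings, and [] plays the role of nil.
# Hypothetical encoding; see Graham's paper for the real Common Lisp one.

def lisp_eval(e, env):
    if isinstance(e, str):                        # a symbol: look it up
        return env[e]
    op = e[0]
    if op == 'quote':                             # special form: protect its argument
        return e[1]
    if op == 'atom':
        v = lisp_eval(e[1], env)
        return 't' if not isinstance(v, list) else []
    if op == 'eq':
        a, b = lisp_eval(e[1], env), lisp_eval(e[2], env)
        return 't' if a == b and not isinstance(a, list) else []
    if op == 'car':
        return lisp_eval(e[1], env)[0]
    if op == 'cdr':
        return lisp_eval(e[1], env)[1:]
    if op == 'cons':
        return [lisp_eval(e[1], env)] + lisp_eval(e[2], env)
    if op == 'cond':                              # special form: clauses tried lazily
        for test, result in e[1:]:
            if lisp_eval(test, env) == 't':
                return lisp_eval(result, env)
        return []
    if isinstance(op, list) and op[0] == 'lambda':  # ((lambda (p...) body) args...)
        params, body = op[1], op[2]
        args = [lisp_eval(a, env) for a in e[1:]]
        return lisp_eval(body, {**env, **dict(zip(params, args))})
    return lisp_eval([env[op]] + e[1:], env)      # named function: look it up and retry

print(lisp_eval([['lambda', ['x'], ['cons', 'x', ['quote', ['b']]]],
                 ['quote', 'a']], {}))            # ['a', 'b']
```

Note that cond and quote are handled before their arguments are evaluated, which is exactly what makes them special forms.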
Alright, what about those whacky fixpoints then? Do we get those in
Lisp too, or is this a fundamentally new contribution of Andrew's?
A fixpoint x, for those not in the know, is a value present in both
the domain and range of a function f, such that the function
application f(x) = x. One way to imagine fixpoints, for those who are
visually inclined, is as the intersection points between the function f
and the line y = x in the Cartesian plane. When fixpoints are computed
for higher-order functions (that is, functions which take in and
return functions), a fixpoint of a function f now becomes the function
g, such that f(g) = g.
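A quick numerical illustration, in Python just for concreteness: repeatedly applying cos converges to its unique fixpoint, the point where the cosine curve crosses the line y = x.

```python
import math

# A fixpoint of f is an x with f(x) == x: visually, a point where the
# curve y = f(x) crosses the line y = x. For f = cos, simply iterating
# the function converges to its unique fixpoint (the "Dottie number").
x = 1.0
for _ in range(100):
    x = math.cos(x)

print(round(x, 6))                  # 0.739085
assert abs(math.cos(x) - x) < 1e-9  # cos(x) really does equal x here
```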
Okay, why do we care about this in a discussion of programming
language theory? Well, a function which computes the fixpoints of
another function is called a fixpoint combinator, and through the use
of these (the easiest to grasp of which is Haskell Curry's Y
Combinator -- look it up, or just look at the little picture on
Skalka's door), we can define recursive functions using nothing more
than lambda abstractions.
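The textbook Y combinator loops forever under strict evaluation, so here is its eta-expanded cousin, the Z combinator, sketched in Python (a strict language, like ML); it is the same trick adapted for strict evaluation, and it gives us recursion from nothing but lambdas.

```python
# The Z combinator: Curry's Y, eta-expanded so it terminates under
# strict evaluation.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# Factorial with no explicit recursion: the body only ever calls the
# "self" handle that Z passes in.
fact = Z(lambda self: lambda n: 1 if n == 0 else n * self(n - 1))

print(fact(5))                  # 120
```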
Recursion is needed to get all the interesting loop-type work (and
just about everything else) done in purely functional programming
languages (which otherwise lack mutable state and therefore -- yep you
got it -- loops), so fixpoint combinators are pretty darned useful
after all, it turns out.
With that said, given that Lisp gives us the lambda construct, we know
we can define not only the Y Combinator for the untyped lambda
calculus (Lisp is essentially an implementation of the untyped lambda
calculus, for folks not in the know) but also the lazy evaluation semantics that
make its use fun and productive. Voila! So we've got recursion and
our language is actually useful. Okay, moving on.
So the only things that are actually new here are the pattern extensions
(T -> T | T), which form the core of Andrew's email after all.
"The pattern calculus(*) is essentially a higher-order term-rewriting
system. This means that any term (block of code) may be matched and
'rewritten' as (made to evaluate to) a different term. This is made
possible by first generalizing lambda abstraction to pattern-matching,
and then allowing any term to be a pattern."
In order for this to work, you must add to your system the ability to
pattern match on the structure/type of an expression without first
evaluating it. This is something not seen in most programming
languages but is the fundamental characteristic of the Lisp macro. So
here, you've reiterated an old idea but couched it in the language of
a calculus of terms. What I don't see is the mechanism by which your
language actually blocks this evaluation.
You say this will be implemented over a variant of ML, but ML dialects use
strict evaluation rules which force the evaluation of all expressions
passed as arguments to functions BEFORE the function is applied to
them. Thus, when applying ((EXPAND-M M) P), both M and P will be
evaluated BEFORE they make it into the pattern matching construct.
Since in a Lisp macro you want to match not the post-evaluation
structure of P but its pre-evaluation structure, I fail to see how you
are getting around this restriction (and it would be quite something
if you could match on a function application as you've outlined in
EXPAND since an application is not a terminal form). Perhaps, you've
just left something out of your description.
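Here is the strictness problem in miniature, sketched in Python (which is strict like ML), along with the argumentless-lambda escape hatch. The helper names are mine and purely illustrative.

```python
# Python, like ML, uses strict evaluation: every argument is reduced
# before the function is applied, so both branches below get evaluated
# even though only one is used.
calls = []

def trace(name):
    calls.append(name)      # record that this argument was evaluated
    return name

def pick(flag, a, b):
    return a if flag else b

pick(True, trace('a'), trace('b'))
print(calls)                # ['a', 'b'] -- both were evaluated first

# The workaround: wrap each argument in an argumentless lambda (a
# thunk), and apply it after the fact only where needed.
calls.clear()

def pick_lazy(flag, a, b):
    return a() if flag else b()

pick_lazy(True, lambda: trace('a'), lambda: trace('b'))
print(calls)                # ['a'] -- the unused branch was never evaluated
```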
However, even if I were to grant you the evaluation blocking mechanism
(with something on the order of the Lisp quote expression or an
argumentless lambda wrapper with a post-hoc null application), then
you have just expressed a static mechanism for replacing one
expression with another via structural rearrangement. That is great,
but it's more on the order of a C macro than a Lisp one.
In Lisp, I can use the entire language to write an arbitrary program
(macro) to perform code transformations, including lexical bindings,
partial evaluation of some inputs but not others (and functions of
those evaluated results), recursions, and so on. All of this works,
of course, because the code is just writing a new expression tree in
list form, and then asking the system to evaluate its result. This is
the fundamental purpose of homoiconicity in Lisp (the fact that code
and data are both represented as lists), because writing programs to
write code is as easy as writing those which write data. They all
just generate lists, and the eval function which was defined in the
references above using the 7 axiomatic Lisp expressions is able to
evaluate lists. No extra magic involved.
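To make that concrete, here is a hypothetical 'unless' macro sketched in Python rather than Lisp: a program is just a nested list, and the macro is an ordinary function from expression tree to expression tree.

```python
# Code as data: a "macro" is an ordinary function on expression trees.
# This hypothetical expander rewrites (unless test body) into the
# primitive cond form, recursing so it works anywhere in the tree.
def expand_unless(e):
    if not isinstance(e, list):
        return e                                 # atoms pass through
    e = [expand_unless(sub) for sub in e]        # expand subexpressions first
    if e and e[0] == 'unless':
        _, test, body = e
        return ['cond', [['not', test], body]]   # (cond ((not test) body))
    return e

prog = ['unless', ['atom', 'x'], ['car', 'x']]
print(expand_unless(prog))
# ['cond', [['not', ['atom', 'x']], ['car', 'x']]]
```

The result is still just a list, so the same eval that runs ordinary code can run the expanded program; no extra machinery is involved.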
The thing is, the macro system that you are describing must be applied
manually in your code to each P expression which is
macro-transformable by M (which is inconvenient but not theoretically
a problem). Moreover, it seems that your macros are less intelligible
because they combine multiple transformations into one definition,
with the appropriate transformation being selected by the shape of the
input P to that macro.
The purpose of macros in the Lisp family of languages is to enable the
dynamic extension of flow control structures (including exception
handling, stack unwinding, and resource management) and to create
Domain Specific Languages that let program code grow higher-level
abstractions and semantics to better fit particular problem domains.
To this end, a Lisp macro usually performs a single specific
transformation rather than a set of transformations as you are
describing, but again this is more related to the practical side of
using macros than the theoretical question of whether you can actually
simulate them using your pattern calculus and the MetaML partial
evaluation semantics.
I do see that the partial evaluation work does look promising for
replicating backquote and escape, but it requires you to represent
your code in a new shape, using angle brackets. When it comes to
homoiconicity, these semantics really don't succeed too well. You've
got different syntactic shapes for functions,
patterns, fixpoints, variables, literals, and now unevaluated or
partially evaluated code. In Lisp, I've got atoms and lists (of atoms
and lists). That's homoiconicity at its cleanest. There are no
arrows, dots, pipes, equals signs or anything else in the language.
When your EXPAND applications start generating terms that look like:
let EXPAND-M =
(M -> Fix EXPAND.
( (F G ) -> ((EXPAND F) (EXPAND G))
| (F -> G ) -> ((EXPAND F) -> (EXPAND G))
| (Fix F.G) -> (Fix F. (EXPAND G))
| F -> (M F)))
the syntactic complexity is sure to multiply very quickly. So again,
I'm just not seeing your homoiconicity here, except for the VERY
high-level abstraction that "everything is a term." Come on. Every
term type has a different shape. I'm really not seeing it.
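For concreteness, the rewriting behavior of that term is easy enough to model; here is a rough Python sketch using my own (hypothetical) tagged-tuple encoding of terms, not Andrew's notation.

```python
# A rough model of EXPAND-M: terms are ('app', F, G) for an application,
# ('ext', F, G) for an extension F -> G, ('fix', F, G) for Fix F.G, and
# anything else is treated as a constant or variable (the default case).
def expand_m(m):
    def expand(t):
        if isinstance(t, tuple):
            tag = t[0]
            if tag in ('app', 'ext'):
                return (tag, expand(t[1]), expand(t[2]))
            if tag == 'fix':                 # (Fix F.G) -> Fix F.(EXPAND G)
                return ('fix', t[1], expand(t[2]))
        return m(t)                          # final case: F -> (M F)
    return expand

# A macro that rewrites the constant 'X' to 'Y' and leaves all else alone
m = lambda t: 'Y' if t == 'X' else t
print(expand_m(m)(('app', 'X', ('fix', 'f', 'X'))))
# ('app', 'Y', ('fix', 'f', 'Y'))
```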
And that, I think, is more than enough for now. I've got to get home
and do some actual school work, but this conversation has been fun and
thought-provoking. Thanks for all the hard work you put into this,
Andrew. It will be great to see where you go with this stuff from here.
Keep on hackin' in the free world,
~Gary (your friendly neighborhood Lisp geek)
P.S. Variable capture problems are easily avoided by both Scheme's
hygienic macros and Clojure's reader-macro-based gensym system. This
is a largely solved problem.
Andrew Cooper writes:
> Gary et al,
> First off, this is all meant in good fun. Please don't take "girly" and the like
> as serious insults--I was just being colorful. In fact, I'm somewhat girly myself.
> Second, I wasn't ranting: If one extends the pattern calculus(*) with the
> partial evaluation semantics of MetaML, then the result is indeed the basis for
> "a non-Lisp language which allows programmers to easily modify the evaluation
> semantics of arbitrary code blocks." What's more, I will show that such a
> language intrinsically provides both:
> 1) a more expressive macro system, as well as
> 2) fundamentally greater homoiconicity,
> than is feasible in any Lisp (without cheating and implementing such a language
> as a DSL).
> (*) = extended with matching on cases
> Let's start with an overview of the pattern calculus and how it can (with a
> little help from MetaML's semantics) subsume Lisp's macro facility:
> The pattern calculus(*) is essentially a higher-order term-rewriting system.
> This means that any term (block of code) may be matched and "rewritten" as (made
> to evaluate to) a different term. This is made possible by first generalizing
> lambda abstraction to pattern-matching, and then allowing any term to be a
> pattern. The term syntax is as follows:
> T ::= C | V | (T T) | (T -> T | T) | (Fix T.T)
> That is, a term is either a term constant C, a term variable V, the application
> F X of a term F to a term X, an abstraction P -> S | D (read: "at P use S else
> D"), or the introduction of a fixpoint. The particular form of abstraction
> employed is called an "extension." The first term, P, is the "pattern," the
> second term, S, is the "specialization," and the third term, D, is the
> "default". *Any term may be a pattern, including extensions.*
> The only important reduction rule, similar to the untyped lambda calculus, is
> the application of a term Q to an extension: "(P -> S | D) Q". To evaluate this,
> we first check whether terms P and Q "match". The semantics for this are fairly
> intuitive: Constants match, variables match (and bind) terms of the
> corresponding type, applications match if their arguments do, and likewise for
> abstractions. (There's also some magic for free variables, and for matching
> variables with variables, but you don't need to know.) On match, the result is
> S, with appropriate substitutions according to any subterms of Q which were
> matched by, and bound to, variables in P. If the match fails, then the result is
> the application of D to Q. Canonically, a special term constant FAIL is
> included, which consumes a single argument and then evaluates to itself,
> propagating the match failure upwards (N.B. one can match FAIL, since it is just
> a term like any other).
> NOTE: From here on, the notation P -> S is shorthand for P -> S | FAIL.
> Now, let's examine a lovely function called EXPAND-M:
> let EXPAND-M =
> (M -> Fix EXPAND.
> ( (F G ) -> ((EXPAND F) (EXPAND G))
> | (F -> G ) -> ((EXPAND F) -> (EXPAND G))
> | (Fix F.G) -> (Fix F. (EXPAND G))
> | F -> (M F)))
> where M, F and G are all term variables.
> The "let" construct isn't actually in the calculus, and it may be understood as
> syntactic sugar. (However, as a side note, see that (X -> Y) Z is functionally
> equivalent to "let X = Z in Y"). More importantly, however, what does EXPAND-M
> do? Let's step through it:
> EXPAND-M takes a single parameter M. Inference on the final case shows that F is
> a function of universally quantified type (i.e. M : forall a.a -> a, where "a"
> is a type variable). Upon being applied to such a parameter, it reduces to:
> Fix EXPAND.
> ( (F G ) -> (M ((EXPAND F) (EXPAND G)))
> | (F -> G ) -> (M ((EXPAND F) -> (EXPAND G)))
> | (Fix F.G) -> (M (Fix F. (EXPAND G)))
> | F -> (M F)))
> where M is no longer a variable (having been replaced by the argument to
> EXPAND-M). This resulting term is a function which is typed similarly to M
> (EXPAND : forall b.b -> b), as are each of its "cases" (the extension subterms).
> Briefly put, this function takes an argument of *any type* and recursively
> applies M to *each* of its subterms. The first case deconstructs applications,
> the second extensions, the third fixpoints, and the fourth constants and
> variables (terms with free variables are well-typed in my calculus--I'll spare
> you the details). As long as the argument to this function is strongly
> normalizing (that is, not something stupid like Fix T.CONS T NIL), then this
> function is strongly normalizing (terminates), as well.
> Here's the point: In general, what does ((EXPAND-M M) P) signify? Well, if we
> call the term M a "macro" and the term P a "program", then what we have just
> described is *macro expansion*. You doubt? Let M = X -> Y | (Z -> Z). This
> function replaces terms of form X with terms of form Y, and leaves all other
> terms unchanged.
> Now, you're feeling a little antsy in your pantsy because I didn't mention
> anything about Lisp's backquote mechanism. More formally, this would be called a
> "multi-stage computation" mechanism. That's where the MetaML semantics come in.
> They're really quite straightforward: they provide constructors similar to
> backquote and escape (comma), as well as a "run" construct (oddly enough). They
> also provide type judgments for delayed computations. For example, "<1 + 2>" is
> of type "code of int" and reduces to itself, whereas "<1 + ~(5 - 3)>", also of
> type "code of int", reduces to "<1 + 2>". There's also some goodness for
> variables. It should be obvious that this machinery can be plopped down
> alongside that of the pattern calculus with only a minimum of fuss.
> Deep magic:
> I have just described type-safe, higher-order macro expansion with partial
> application. What special machinery was necessary for this great wonder? Nothing
> more than the union of abstraction over patterns(*) and multi-stage computation.
> These "macros" don't need to be headed by a constructor. No, not a class
> constructor--a type constructor (more specifically, a non-reducing applicative
> term constant)! Show me a macro in Lisp that you don't need to "call".
> A term-rewriting term is essentially a first-class macro, and EXPAND-M describes
> macro expansion. Pure patterns and higher-order rewriting mean there need be no
> difference between symbol and term, or between macro and function. This brings
> about a literally unprecedented level of homoiconicity (disregarding, of course,
> machine code). Terms are not lists! Terms are terms, no matter how you arrange them.
> Peace in the Middle East,
> Quoting Gary Johnson <[log in to unmask]>:
> > Howdy again, folks,
> > It's great to hear everyone's perspectives on the dynamic vs. static
> > typing debate. At this point though, I do want to officially state
> > that I'm very much in agreement with Peter regarding his last email.
> > We do all want our programs to be correct, and different problem
> > domains call for different language semantics. The "bondage and
> > discipline" description of statically typed languages was also not
> > meant as an insult, merely a descriptor. That description has been
> > floating around in the Lisp community for quite some time, and I was
> > merely choosing what was to me a well-known idiom.
> > Finally, I really should clarify something here. In my initial
> > email, my commentary was coming from a position of concern that too
> > much emphasis was being placed on classes and OO-paradigms in
> > today's schools and businesses. Object orientation, as a
> > sub-paradigm of the imperative programming style, is very useful for
> > representing problems in terms of actors and messages. Some problem
> > domains greatly benefit from this: i.e. agent-based simulations or
> > graphics programs. Others, such as theorem provers, compilers, and
> > most numerically-intensive (i.e. scientific computing) algorithms,
> > are IMHO typically made far less understandable when couched in the
> > terminology of OO languages.
> > Pointing people to this link was meant to encourage folks
> > (esp. those students whose main or sole experience with programming
> > is in the object-oriented sphere) to look outside of this one
> > paradigm and consider their options more fully. It was not meant to
> > fan the holy war between dynamic and static typing although that is
> > where we ended up ultimately going.
> > So, in conclusion, I'm simply advocating for coding abstractions
> > that are more powerful through being more generalizable and
> > therefore reusable. This is the key principle of modular design
> > (which is by far the most important tool in scaling up an
> > application's code size). I am arguing, as is Perlis, that although
> > the creation of new data types through class abstractions may seem
> > to be good modular practice, it often creates less generalizable
> > code and may therefore be harder to understand, extend, and reuse in
> > the future.
> > And when Andrew shows me a non-Lisp language which allows
> > programmers to easily modify the evaluation semantics of arbitrary
> > code blocks (and therefore implement any new flow control, exception
> > handling, data abstraction, encapsulation regime,
> > domain-specific-language, or self-rewriting code) as needed for
> > their problem domain (thereby improving both code modularity and
> > adaptiveness), then I might consider listening to his rants about
> > his new type system's comparative strengths to dynamically-typed
> > homoiconic languages.
> > Until then, he can keep his trolling "girly" comments on reddit,
> > where they might be considered appropriate language by the reader
> > base. Like Hacker News, this list is meant for polite, intelligent
> > dialogue, not backbiting or name calling.
> > And that's all I have to say about that.
> > Hack on,
> > ~Gary
> > Peter C. Chapin writes:
> > > On Sat, 21 Mar 2009, Jacob Beauregard wrote:
> > >
> > > > I'd actually be less concerned about what a programming language
> > > > prevents me from doing than what a programming language helps me do.
> > >
> > > This is the crux of the matter, I think. While we all want programming
> > > languages that make it easier for us to write programs, what a language
> > > disallows is probably as important as what it allows. Ultimately this is
> > > because what we really want are programming languages that help us write
> > > *correct* programs.
> > >
> > > Whenever one considers a programming language feature two questions should
> > > be asked: How does the feature help me to express complex ideas in a
> > > simple way? What kinds of potential problems might the feature introduce?
> > > Assuming we are talking about languages that are well designed in their
> > > domains, languages that let you "do anything to everything" provide a
> > > powerful response to the first question, and "Bondage and discipline"
> > > languages provide a powerful response to the second.
> > >
> > > There is, of course, no "best" in all cases... which is why this debate
> > > has raged on for as long as there has been programming.
> > >
> > > Peter