Progress Update (July 24, 2020)
As part of working with the garage door up, I've promised to share what I've been working on. I last updated on June 23. Unfortunately, progress has been limited, primarily since Tessa and I went on a half-vacation / half-working trip, which spanned most of two weeks. Then, I got back and started engaging with buying a house, so I've been a bit distracted.
What's in a Term?
Despite that, I did manage to get substantial work done on a large-ish task, refactoring the structure of terms to be a bit more general. I haven't quite decided whether this is a good idea or not (leaning yes), hopefully this post helps to clarify my thoughts.
The core of the change is this diff in AbstractSyntax.mli:
--- a/src/AbstractSyntax.mli +++ b/src/AbstractSyntax.mli @@ -14,21 +14,18 @@ type sort = | SortAp of sort_name * sort list (** A higher-kinded sort can be applied *) | SortVar of string +(** A sort can be starred to indicate it's repeated, or not. *) +type starred = Starred | Unstarred + +(** Represents a place where a sort can go in a valence. *) +type sort_slot = sort * starred + (** A valence represents the sort of an argument (to an operator), as well as the number and sorts of the variables bound within it *) -type valence = - | FixedValence of sort list * sort (** A fixed valence is known a priori *) - | VariableValence of sort * sort - (** A variable valence binds a number of variables not known a priori. All must be - of the same sort. *) +type valence = Valence of sort_slot list * sort_slot (** An arity specifies the arguments to an operator *) -type arity = - | FixedArity of valence list - (** A fixed arity operator always has the same number of children *) - | VariableArity of sort - (** A variable arity operator has a variable number of children (all of the same - sort (non-binding valence)) *) +type arity = valence list type operator_def = OperatorDef of string * arity (** An operator is defined by its tag and arity *)
Note that the resulting code is simpler than before. We're generalizing two things here, which I'll show by example.
The old way
First, let's talk about the old way of doing things. It was the case that you had two choices when it comes to valence: fixed or variable valence.
example := | fixed(a. b. a) // fixed valence | variable(a*. a) // variable valence
An operator with fixed valence always binds the same number of variables, for
fixed(a. b. a) (this binds
b then uses
a as the body). An
operator with variable valence binds, well, a variable number of... variables.
variable(pattern(a1; a2). a1) binds two variables, then uses one
of them as the body. In a fixed binding position, you introduce exactly one
variable for every binding position. But in a variable binding position, you
use a pattern instead (patterns, by their nature, can bind any number of
A variable valence operator can only have one binding position, meaning I
couldn't have written
variable(a*. b*. a), though
variable(a*. a) or
variable(b*.a ) would be okay. I'm not sure what my justification was for
this restriction. I think this is something I was confused about.
It was also the case that you had two choices when it comes to arity: fixed or variable arity.
example := | fixed(a; b; c) // fixed arity | variable(a*) // variable arity
A fixed arity operator always has the same number of children, for example
fixed(a; b; c). A variable arity operator has a variable number of
children, for example
A variable arity operator can only have one child sort, meaning I'm not allowed
variable(a*; b*). Why? Because we use
; to separate subterms, so
there's no way to tell when the
as end and the
The new way
That probably seemed a bit more complicated than necessary. In my change I removed both of the restrictions, so now you can write:
example := | variable_valence(a*. b*. a) // variable valence with multiple binding positions | variable_arity(a*; b*) // variable arity in multiple slots | both(a*. b*. a*)
A few example terms:
variable_valence(a_pattern(a1; a2). b_pattern(). a1) variable_arity(a1, a2; b1, b2) both(a_pattern(a1; a2). b_pattern(). a1, a2)
Notice that we've now introduced commas (before the only separators were
;). Previously you had to use
; to separate children in a
variable-arity operator, but now we use
;s to separate "slots" (each of which
holds repeated terms of the same sort) and
,s to separate terms within a
variable_arity(;) is now a valid term. This raises the question in
my mind of whether we should specify 1-or-more repetition (
+) in addition to
the 0-or-more repetition (
*) that we currently have. Happily, this is
strictly a generalization, so we can add it later.
Overall I think I'm happy with this new system. My main hesitation is this
nagging feeling that people will find this system overly complicated but in a
new way. I think you'd be justified in asking what the deal is with
However, as I'm in the mood for rethinking things, I've begun to think about the purpose of the abstract syntax system. That's what I plan to talk about next time.