c(list) -> cat.
where list is a list of pieces of lexical information and cat is the CUF type for syntactic categories. The lexicon itself maps words (CUF constants) to such lists of lexical information:
l(afs) -> list.
In strictly nonlexicalized grammars, so-called phrase structure rule-based grammars, syntactic categories are atomic symbols, also called 'nonterminals'. However, in (partly) lexicalized grammars, syntactic categories are complex structures. Hence, a distinction between the (complex) syntactic categories and the (atomic) nonterminals arises. In LexGram, one can think of a syntactic category as the description of a partial syntax tree, which states the expectations of a word (the lexical head of the tree) with respect to its neighbors in a string. For this reason, cat is declared as having the two features root and leaves.
cat ::
root : root,
leaves : list. % list(goal)
The value of root is the root of the partial syntax tree, and the value of leaves is the list of constraints on the neighbors of the lexical head, i.e. of a word. Note that, in LexGram, nothing can be said about the internal shape of a partial syntax tree. The system will translate a cat-term into a binary tree which reflects the structure of the list of leaves. An example tree will be given below.
The root value, of type root, is predefined as a pair of a nonterminal (feature syn) and a (partial) semantic structure (feature sem). The specification of the types nonterminal and sem is left to the grammar writer. You can either define the type nonterminal as an enumeration type of CUF atoms or, for example, specify an agreement feature for it, i.e. let nonterminals be complex structures. The value of the feature frml is taken as the 'result value' of the semantic structure. You can declare additional features for the sem-type, e.g. a lambda-feature, if you want the sem-type to render conventional lambda-expressions.
root ::
syn : nonterminal,
sem : sem.
sem ::
frml : top.
The categories chosen for the words of the input string are the premises from which the start symbol of the grammar, i.e. the initial goal, is to be derived. Similarly, the elements of a list of leaves are goals which are to be derived from substrings of the input string plus temporarily assumed hypotheses or traces. It is convenient to classify categories according to whether they correspond to words, traces, or derived_phrases:
cat = word | trace | derived_phrase.
word_goal < word.
trace_goal < trace.
derived_phrase_goal < derived_phrase.
goal = word_goal ; trace_goal ; derived_phrase_goal.
Since goals are categories, they not only put constraints on the root of an expected neighbor tree, i.e. a subtree, but also specify whether that subtree is saturated, i.e. its leaves-value equals the empty list [], or unsaturated, i.e. it has a non-empty leaves-value. Non-empty leaves-values are required to deal with control verbs and adjuncts. Furthermore, a goal should contain the following information:
goal ::
dir : direction,
slash : list,
constraints : constraints.
direction = {left, right}.
constraints ::
leftcorner : cat,
completion_state : afs.
The feature dir is of type direction, i.e. it shows the relative position of this argument with respect to the head. The feature slash is of type list; it contains the traces and premises of the proof of this argument.
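The declarations above can be mirrored in a small Python sketch, which may help when reading the feature structures that follow. It is only an illustration of the shape of the data, not part of LexGram: the class names Root, Category and Goal and the method is_saturated are hypothetical, and plain Python objects do not model CUF typing or unification.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Root:
    syn: str                       # nonterminal, e.g. "s" or "np"
    sem: Optional[object] = None   # partial semantic structure, left open here


@dataclass
class Category:
    root: Root
    leaves: List["Goal"] = field(default_factory=list)

    def is_saturated(self) -> bool:
        # A category is saturated when it expects no further neighbors.
        return not self.leaves


@dataclass
class Goal(Category):
    # A goal is a category plus direction, slash list and constraints.
    dir: str = "right"                                   # "left" or "right"
    slash: List[Category] = field(default_factory=list)  # traces and premises
    constraints: Optional[dict] = None                   # left unspecified


# A head of category s expecting one np to its right and one np to its left.
np_right = Goal(root=Root("np"), dir="right")
np_left = Goal(root=Root("np"), dir="left")
eats = Category(root=Root("s"), leaves=[np_right, np_left])
```

In this sketch the head is unsaturated because its leaves list is non-empty, while each np goal is saturated.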
The feature constraints has a somewhat experimental status in LexGram. It makes it possible to express certain constraints on derivations which cannot be stated in terms of the leaves and slash values. Currently, the leftcorner-feature can be used to constrain the leftcorner of a (sub-)derivation. For example, this can be used to prohibit 'empty scrambling' (spurious leftward movement) by constraining the leftcorner of a verb phrase to be a word. The value of the feature completion_state will be instantiated (to the CUF atom done) once the derivation of a goal has been successfully carried out. For example, this makes it easier to formulate wait declarations for semantic evaluation routines.
A lexical entry for the two-place verb eats and the corresponding syntactic category could now look as follows.
l(eats) := [verb,v2].
c([verb,v2]) := verb(v2).
verb(v2) := ( root : syn : s &
              leaves : [ ( dir : right &
                           root : syn : np &
                           leaves : [] &
                           slash : [] ),
                         ( dir : left &
                           root : syn : np &
                           leaves : [] &
                           slash : [] ) ] ).
However, to keep a grammar independent of changes to the actual data structures, we suggest using the built-in constructor sorts for the cat-type and its related types, which are defined as follows.
category(Root,Leaves) :=
    root : Root &
    leaves : Leaves.
goal(Dir,Root,Leaves,Slash,Constraints) :=
    category(Root,Leaves) &
    dir : Dir &
    slash : Slash &
    constraints : Constraints.
cons_root(Syn,Sem) :=
    syn : Syn &
    sem : Sem.
cons_sem(Frml) :=
    frml : Frml.
cons_constraints(Leftcorner,CompletionState) :=
    leftcorner : Leftcorner &
    completion_state : CompletionState.
result_sem(top) -> cat.
result_sem(Sem) :=
    category(
        cons_root(
            _RootSyn,
            frml : Sem ),
        _Leaves ).
Based on these templates, the previous category definition can be rewritten as
verb(v2) := category( cons_root(s,_),
                      [ goal(right,cons_root(np,_),[],[],_),
                        goal(left,cons_root(np,_),[],[],_) ] ).
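The point of the constructor sorts is that a grammar names its parts by position and leaves the feature layout to one place. The following Python sketch illustrates this under loud assumptions: feature structures are approximated by plain dictionaries, the anonymous variable _ is approximated by None, and the Python functions category, goal and cons_root are hypothetical stand-ins for the CUF sorts of the same names. No unification is modeled.

```python
ANY = None  # stand-in for the anonymous variable "_" (an assumption of this sketch)


def category(root, leaves):
    return {"root": root, "leaves": leaves}


def goal(direction, root, leaves, slash, constraints):
    # A goal extends a category with dir, slash and constraints.
    fs = category(root, leaves)
    fs.update({"dir": direction, "slash": slash, "constraints": constraints})
    return fs


def cons_root(syn, sem):
    return {"syn": syn, "sem": sem}


# The category for "eats", built only through constructors.
verb_v2 = category(
    cons_root("s", ANY),
    [
        goal("right", cons_root("np", ANY), [], [], ANY),
        goal("left", cons_root("np", ANY), [], [], ANY),
    ],
)

# The same structure written out feature by feature.
explicit = {
    "root": {"syn": "s", "sem": None},
    "leaves": [
        {"root": {"syn": "np", "sem": None}, "leaves": [],
         "dir": "right", "slash": [], "constraints": None},
        {"root": {"syn": "np", "sem": None}, "leaves": [],
         "dir": "left", "slash": [], "constraints": None},
    ],
}
```

Both spellings yield the same structure, so if the layout of a goal changed, only the constructor definitions would need to be adapted, not every lexical category that uses them.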