HORATIO

 

A middle-sized NLP application in Prolog

 

 

 

Archibald Michiels

University of Liège, Liège

 

1994


Table of Contents

1. Foreword....................... 5

2. Introduction....................... 9

3. The Lexicon....................... 15

3.1. Use of Macros........... 17

3.2. Double Analysis........... 18

3.3. Verb Classes........... 20

3.4. Raising and Control........... 21

3.5. Multi-Word Units........... 21

3.5.1. Insertion 22

3.5.2. Extraction 23

3.5.3. Proformation 23

3.6. Inflectional Morphology........... 26

4. The Grammar: Interpreting the Lexicon....................... 27

4.1. General Strategy........... 28

Arguments and Modifiers 28

4.2. The S Level........... 32

4.2.1. Declarative Clauses 32

4.2.1.1. Subject / Operator Agreement Rules 35

4.2.1.2. Priority among Person Features 35

4.2.1.3. Passives 36

4.2.1.3.1. Parsing 36

4.2.1.3.2. Generation 38

4.2.1.4. The Assignment of the Subject Role 39

4.2.1.4.1. Control 40

4.2.1.4.1.1. Parsing 40

4.2.1.4.1.2. Generation 41

4.2.1.4.2. Raising 43

4.2.1.4.2.1. Parsing 43

4.2.1.4.2.1.1. Subject-to-Object Raising 43

4.2.1.4.2.1.2. Subject-to-Subject Raising 44

4.2.1.4.2.2. Generation 45

4.2.1.4.2.2.1. Subject-to-Object Raising 45

4.2.1.4.2.2.2. Subject-to-Subject Raising 45

4.2.1.4.3. Extraposition 47

4.2.1.4.3.1. Parsing 47

4.2.1.4.3.2. Generation 48

4.2.2. Non-Declarative Main Clauses 48

4.2.2.1. Yes-No Questions 48

4.2.2.1.1. Parsing 48

4.2.2.1.2. Generation 51

4.2.2.2. WH Questions 52

4.2.2.2.1. Parsing 52

4.2.2.2.2. Generation 54

4.3. The Phrase Level........... 55

4.3.1. Noun Phrases 55

4.3.1.1. Parsing 55

4.3.1.2. Generation 59

4.3.2. Adjective Phrases 60

4.3.3. Prepositional Phrases 62

4.3.4. Verb Phrases 65

4.3.4.1. Parsing 65

4.3.4.1.1. The Auxiliary Group 65

4.3.4.1.2. Coordinated Verb Phrases 67

4.3.4.2. Generation 68

4.3.4.2.1. The Auxiliary Group 69

4.3.4.2.2. The Main Verb 74

4.4. Modifiers........... 76

4.5. Canonical Order and Athematic Arguments........... 79

5. Semantics....................... 82

5.1. GF Level........... 82

5.2. Inheritance........... 84

5.3. Percolation........... 85

5.4. From Semantic to Lexical Classes........... 86

5.5. Performing the Checks........... 87

5.5.1. Sfok 87

5.5.2. Checksem 89

6. Parsing Issues....................... 91

6.1. Dealing with Flexible Word Order........... 91

6.2. Computing Preferences........... 92

6.3. Hard Coordination........... 94

6.4. Long Distance Dependencies........... 99

7. Generation Issues....................... 107

7.1. Freezing the Variables in the Parses........... 107

7.2. The Cycle........... 108

Appendix A. Non-Standard Arity Prolog Predicates....................... 116

Execution Control........... 116

String Manipulation........... 116

File Operations........... 117

Appendix B. File Organization and Compilation Directives....................... 118

File Organization........... 118

moratio.cmd 118

morgen.cmd 119

Arity Prolog Compilation Directives........... 119

Appendix C. Input and Output....................... 121

Input 121

Output 122

Appendix D. Selective Lexicon Downloading....................... 123

Appendix E. Importing Lexical Entries from Ldoce....................... 125

Introduction........... 125

Description of the Liège Ldoce Data Base........... 125

Ldoce Data Base Design........... 126

Lemma Data Base: Coword 126

Definition Data Base: Codefi 127

Idiom Data Base: Coidio 128

Example Data Base: Coexam 128

Grammatical Code Data Base: Codcod 129

Believe in Ldoce 130

The Importation Process from Ldoce into Horatio's Lexicon........... 132

Intransitive Verbs 133

Mono-Transitive Verbs 136

Prepositional Verbs 138

Transitive Phrasal Verbs 141

Verbs Taking an Object and an Object Complement 143

Ditransitive Verbs 147

Verbs Taking an Object and a That-Clause 149

Verbs Taking an Object and a Prepositional Object 151

Appendix F. Sample Parses....................... 155

Appendix G. Test Suites....................... 164

For Analysis........... 164

Designed for Horatio 164

Based on Flickinger et al. 1987 167

For Generation........... 174

Appendix H. Prolog Predicates....................... 178

Index to Prolog Predicates....................... 195

References....................... 198

1. Foreword

            There now exist quite a number of books on Prolog and even quite a few on Prolog and NLP (Natural Language Processing). Among the latter the following stand out: Pereira and Shieber 1987, Gazdar and Mellish 1989, Gal et al. 1991, Covington 1993 and the chapter on natural language in Walker et al. 1987 (by Michael McCord, whose approach is the one followed here; see also McCord 1982, as well as McCord 1989a, 1989b and 1990 for new developments).

            However, these books remain at an introductory level. More specifically, they discuss only short introductory programs. None of them describes a medium-sized application and gives its full code.

            This is precisely what this book does. horatio (and its generation counterpart horgen) is a middle-sized application in the field of NLP, more precisely natural language analysis[1] and generation. The full source code, with numerous comments, is to be found on the disk distributed with the book[2].

            No pseudo-code is given. No piece of code is presented in a simplified form. Cited code always corresponds exactly to the runnable code to be found on the companion disk. This may lead to the reader feeling somewhat swamped under barely digestible code. However, if he persists in his study of horatio, allowing some areas to remain unexplained for a while, he will have mastered not only the general design decisions, but also the interactions between the various components of the grammar and the coding idioms and mechanisms in full detail. The step from elementary to advanced NLP applications in Prolog is hard because of the need to manage the interactions between various 'solutions', which are all beautifully simple when looked at in isolation but display the irritating tendency not to mesh with their neighbours as readily as one would wish.

            horatio remains a 'toy' system in that it is oriented towards the teaching of Prolog for NLP rather than any real life application. It is geared to the parsing and generation of 'linguistic' rather than 'real' sentences (in the sense of Tomita 1991, i.e. sentences made up for the purpose of testing linguistic hypotheses rather than utterances occurring in actual text).

            The emphasis throughout this book is on grammar writing rather than on the writing of grammar interpreters or compilers as extensions to standard Prolog. The grammars here are interpreted within the strict top-down regime of Prolog; although they belong to the family of definite clause grammars (dcg's), the dcg notation is not used.

            This position needs to be explained and argued for. Recently a lot of research has been geared towards extensions of Prolog for NLP, to be either interpreted or compiled into standard Prolog (cf. inter alia Definite Clause Grammars (Pereira and Warren 1980), Extraposition Grammars (Pereira 1981), Gapping Grammars (Dahl and Abramson 1984), Dislog (Saint-Dizier et al. 1990), Restriction Grammar (Hirschman and Dowding 1990), CLE (The Core Language Engine, Alshawi et al. 1992)). These extensions aim at enhancing the following desirable properties for NLP systems:

     clean separation between the grammar and the parsing algorithm;

     expressive power of the grammar, leading to a compact grammar that directly embodies linguistic generalizations (for instance in the treatment of linguistic discontinuities such as the relation between a trace and its filler);

     automatic construction of the parse tree.

            The dcg notation is clearly a first step in this direction: the linguist need no longer be concerned with the difference list technique that takes care of progression in the input word list. The distinction between grammar predicates and standard Prolog predicates is emphasized notationally.

            However, these extensions to Prolog are not 'free'. The price to pay is on the one hand some loss of control over the parsing process and on the other much harder debugging. The first point can be illustrated with reference to the dcg notation. Since the progression in the input list is added when the dcg grammar is interpreted or compiled into standard Prolog, it is no longer possible to state in the head of the clause that there is no progression in the input list. A body needs to be written. Compare:

gap(...,Input,Input).

with

gap(...) --> [].

This may seem a matter of no great import. However, when horatio is rewritten in dcg notation, a considerable loss of efficiency is incurred. Parsing time increases by about 50% on the standard test suite.
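            The contrast can be made concrete with a minimal sketch in standard Edinburgh-style Prolog. The predicate names gap_plain, gap_dcg, det_plain and det_dcg, and the np_gap and det(the) labels, are hypothetical and are not taken from horatio; the sketch assumes a Prolog system that translates --> rules when the file is loaded.

```prolog
% Hand-written difference-list style: the pair Input/Input in the head
% states that no word is consumed, so the clause needs no body.
gap_plain(np_gap, Input, Input).

% dcg style: the two list arguments are added by the translator,
% so "no progression" has to be written as an empty body.
gap_dcg(np_gap) --> [].

% A word-consuming rule in both styles, for comparison.
det_plain(det(the), [the|Rest], Rest).
det_dcg(det(the)) --> [the].
```

Both gap_plain(G, [the,library], Rest) and phrase(gap_dcg(G), [the,library], Rest) leave Rest bound to the whole input list. The difference lies in how the clause is written and, depending on the translator, in the extra unification step that the dcg version may carry in its body.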

            One of the lessons to be learnt from the Eurotra project is the importance of efficiency for testing and debugging purposes. If the system needs half an hour to provide a sentence with a parse, or to generate a surface form from some deeper representation, the grammar writer will be very wary of experimenting with alternative approaches. He will keep his test suite as short as possible. He will not run the full test suite after so-called 'minimal', 'housekeeping' changes to the grammar. He will invent tricks to attempt to improve efficiency, even if they are detrimental to the readability of the code. As a result, the grammar will be brittle, not properly debugged and even harder to debug than if efficiency had been a primary concern right from the start.

            Debugging is known to be hard in Prolog, on account of nondeterminism and non-permanent variable instantiation. It gets much harder when what is debugged is not the grammar the linguist has written but its extended form, i.e. its translation into standard Prolog. Published documentation on the extensions to Prolog mentioned above does not discuss debugging facilities. We have preferred to restrict ourselves to standard Prolog: what is debugged is then the grammar as written by the linguist.

            Efforts have been made to keep the code of horatio reasonably clean and maintainable: no use has been made of program-modifying predicates (assert, retract, recorda,...) and the grammar itself is devoid of cuts. The whole system relies on standard Prolog unification, which is a clean, monotonic process. Control of execution in the system is restricted to testing whether variables are instantiated at the point a call is made to a predicate.

            Our parsing algorithm is top-down, left-to-right and depth-first. This is of course the parsing algorithm that Prolog itself uses, its 'native' parsing algorithm as it were. It is not necessarily the most or the least efficient: this depends on the nature of the grammar, and on the inputs it is meant to account for. Well-known bottom-up parsers (with top-down oracles implemented by a link predicate) have been designed in Prolog, for instance the left-corner parser of Pereira and Shieber 1987 and BUP (Matsumoto et al. 1983).

            The programs in this book are written in Arity Prolog[3]. Use is made of Arity Prolog extensions to standard Edinburgh Prolog, mainly in the area of input and output (cf. such predicates as concat, create, open, read_line, etc.). The syntax and semantics of these predicates will be briefly explained in the body of the text and in comments to the source code, as well as in a short appendix (see Appendix A, p. 116). The reader who wants a full specification is referred to the Arity Prolog documentation or to Marcus 1986.

            It should be made clear that horatio is not "tied" to Arity Prolog. It is easily convertible to standard Edinburgh Prolog notation, and as a matter of fact horatio also exists in a Yap Prolog version running on Sun. Arity Prolog has been selected because it is both reasonably fast and available on PC platforms (DOS, Windows and OS/2).

2. Introduction

            horatio is a parser for a subset of English based on a definite clause grammar belonging to the slot grammar framework (cf. the work of Michael McCord and associates; cf. e.g. McCord 1987[4], a presentation of the framework in half-tutorial fashion). It is written in Arity Prolog (Version 5.1 for DOS) and runs on a 386/486 PC under DOS or OS/2 (for OS/2, version 6 of Arity Prolog has been used).

            All parses shown in this book were produced on an IBM Model 70, with 4 megabytes of main memory and a 120-megabyte hard disk. The operating system is DOS 6.0.

            On an Intel 486 DX2 PC clone (66 MHz) the standard horatio test suite (horsuite, the 156 sentences to be found in Appendix G, p. 164, from "they failed" to "I decided what to tell her I believed her to like") takes less than 9 minutes of real time (user time, not cpu time) to parse. The average parsing time for a sentence in the test suite is therefore about three seconds. Parsing here includes writing the raw (horsuite.ter) and pretty-printed (horsuite.lst) parses to disk files (horsuite.lst: 159,702 bytes). The generator horgen is considerably faster than the parser. Compiling and linking the grammar to make the executable file horatio.exe (executing moratio.bat, the DOS counterpart of moratio.cmd to be found on page 118) takes less than one minute under DOS 6.0 on the specified machine.

            A first question that we need to tackle concerns the nature of parsing. Obviously the nature and depth of the parses produced is a crucial issue. Parsing ranges from tagging (the association of forms with grammatical tags reflecting Part of Speech (POS)) to deep analysis, looking for the semantic invariant behind different phrasings.

            The level chosen here is the one that is deemed to be adequate for translation between English and a related language, such as French. In terms of depth the type of parse produced is not very different from that found in the IS (Interface Structure) of the EEC Eurotra project, with which the author was associated[5]. The backbone remains syntactic.

            In order to give an idea of the type of parses produced by horatio, we shall look at the parse returned by the system for the following sentence: The workshop is believed to have taken place in the library I wanted her to go to.

            It will be seen that the parse is uncontroversial. Any application that needs to rely on a linguistic analysis of the sentences it is confronted with (i.e. an application such as machine translation, for which template matching or keyword search, however refined, are not good enough) will at least have to be able to retrieve the information provided by the horatio parse. I tend to agree with McCord, who writes: "It also appears reasonable to use syntactic analysis (embodying some semantic choices, such as word sense disambiguation) in machine translation systems." (in McCord 1987, p. 325)

            We shall look at the parse in its pretty-printed format. Indentation from the left margin reflects depth of embedding: the further we are from the left margin, the deeper we are in the postulated structure. Items at the same distance from the margin are supposed to display the same level of embedding.

            Here is the parse (other parses can be found in Appendix F, p. 155):

 28
  clause
   pred_arg_mod_structure
   prop(vce: passive,asp: none,mod: none,tns: present)
    predicate(believe_1,agr(en_passive))
     object
      clause
       pred_arg_mod_structure
       prop(vce: active,asp: [perfect],mod: none,tns: present)
        predicate(take_place_1,agr(en_active))
         subject
          nounphrase
          index(_0508)
          agr(third,sing)
            det(the)
            noun(workshop_1,agr(sing))
         pp_arg
          prepphrase
          index(_09EC)
          prep(in)
           np_arg_of_prep
            nounphrase
            index(_09F4)
            agr(third,sing)
              det(the)
              noun(library_1,agr(sing))
             relative_clause
              clause
               pred_arg_mod_structure
               prop(vce: active,asp: none,mod: none,tns: past)
                predicate(want_1,agr(finite,past,sing,first))
                 subject
                  nounphrase
                  index(_0C28)
                  agr(first,sing)
                  ppro(first,sing,_0CAC)
                 object
                  clause
                   pred_arg_mod_structure
                   prop(vce: active,asp: none,mod: none,tns: present)
                    predicate(go_1,agr(infinitive))
                     subject
                      nounphrase
                      index(_0E10)
                      agr(third,sing)
                      ppro(third,sing,fem)
                     pp_arg
                      prepphrase
                      index(_1010)
                      prep(to)
                       np_arg_of_prep
                        nounphrase
                        index(_09F4)
                        agr(third,_1074)

            The first line of the returned parse is the preference (28). In the case of multiple parses, the one with the highest preference index is to be preferred. The mechanisms used in the computation of the preference index are discussed on page 92 and following.

            The parse is best conceived of as a set of clause parses each headed by a clause header of the following form:

             clause
               pred_arg_mod_structure

This means that the parser has found a clause and that it is going to display its structure in terms of its predicate, the arguments pertaining to that predicate and the clause modifiers, if any (the latter are not tied to the lexically-determined argument structure opened up by the predicate).

            We then have a line devoted to the properties of the clause: voice (active/passive), aspect (none/perfect/progressive), modality (none/modal aux), and tense (present/past). Have taken place yields the following prop line:

       prop(vce: active,asp: [perfect],mod: none,tns: present)
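            Since the prop line is an ordinary Prolog term, its features can be read off by unification. The following is a minimal sketch, not part of horatio: the helper names memb and prop_value are hypothetical, and the sketch assumes that : is available as an infix operator, as it is in standard Prolog systems.

```prolog
% memb/2 is defined locally so the sketch does not depend on a list library.
memb(X, [X|_]).
memb(X, [_|T]) :- memb(X, T).

% prop_value(+Prop, ?Key, ?Value): Value is the value filed under Key
% in a term of the form prop(vce: V, asp: A, mod: M, tns: T).
prop_value(Prop, Key, Value) :-
    Prop =.. [prop|Pairs],
    memb(Key:Value, Pairs).
```

With the prop line for have taken place, the query prop_value(prop(vce: active,asp: [perfect],mod: none,tns: present), asp, A) binds A to [perfect].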

            The predicate has its own property line, made up of the lexeme (with reading number) and of an agreement structure. Multi-word units are recognized as such, even if their component parts are not adjacent to each other in the input string (see the relevant section on page 21). The predicate line for wanted is the following:

                predicate(want_1,agr(finite,past,sing,first))

The values sing and first (person) are obviously not computed on the basis of wanted, but on the basis of the surface subject I.

            We then get the list of arguments, in canonical order. Unspecified arguments (such as the subject of believe) are left out. The relationships between the four clauses as displayed by the parse are the following:

clause 1
    predicate: believe
    args:      subject: unspecified
               object:  clausal (clause 2)

clause 2
    predicate: take_place
    args:      subject: workshop
               pp_arg:  in library (index X)
                        np modifier: rel clause (clause 3)

clause 3
    predicate: want
    args:      subject: I
               object:  clausal (clause 4)

clause 4
    predicate: go
    args:      subject: she
               pp_arg:  to library (index X)

            Prepositional phrases and noun phrases bear an index that is used for coindexing. In the sample parse, the missing np governed by the preposition to is coindexed with the np the library: both carry index(_09F4). Such coindexing is crucial for the treatment of gapping and long distance dependencies (see page 99).
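            Coindexing of this kind amounts to two terms sharing one and the same Prolog variable. The sketch below illustrates the idea only; the term shapes np(index(I), Words) and gap(index(I)) and the predicate names coindexed and phrase_index are hypothetical and do not reproduce horatio's own structures.

```prolog
% Two phrases are coindexed when their index slots hold the very same
% variable (== tests identity without binding anything).
coindexed(Phrase1, Phrase2) :-
    phrase_index(Phrase1, I1),
    phrase_index(Phrase2, I2),
    I1 == I2.

% Hypothetical term shapes for a full np and for a missing np.
phrase_index(np(index(I), _Words), I).
phrase_index(gap(index(I)), I).
```

The query Filler = np(index(X), [the,library]), Gap = gap(index(X)), coindexed(Filler, Gap) succeeds, whereas the same query with two distinct index variables fails.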

            Noun phrases also display an agreement structure. For her we find the following two lines:

                      agr(third,sing)
                      ppro(third,sing,fem)

They indicate that we have a personal pronoun whose gender is feminine, number singular and person third. The agreement structures are part of the information that the horatio parses keep about surface structure to make it possible for the generator horgen to retrieve the surface forms from the raw Prolog terms corresponding to the parses.

            However, the adequacy of this type of parsing for translation purposes is not proven: the reader is given a program that parses and generates, not one that translates. On a more positive note, the structures arrived at are presumably usable for purposes other than translation from and into a related language.

            We claim that the real touchstone in horatio is the ability to disambiguate between the various readings of the lexical items belonging to the string to be parsed. Such reading assignment can be seen as one of the central tasks of any parsing system geared towards high quality translation. But of course this is not a rigorous test, because there is no way to decide on the number of readings an item has: the granularity depends on the purposes set for the lexicon in the system, as it does on the size of the dictionary and the targeted audience in lexicographical practice.

            In the last instance the best way for the reader to decide whether he is interested in what horatio can do is to look at the sample parses and the test suites provided, and then at the mechanisms and strategies put to work in the parser, to assess their degree of generality and reusability.

            We make no claim as to the originality of the solutions provided by horatio to parsing problems (although we would be entitled to do so for the treatment of multi-word units and hard coordination). As has already been stressed, it is the interaction between the components that proves the most difficult to manage in applications that go beyond the 'toy' stage.

            It should be noted that the generator horgen provides a minimal check on the parses produced by horatio. It guarantees that the parses produced keep enough information for generating back the strings the parser worked on. This minimal ability is crucial in machine translation, although it is of less interest for other endeavours, such as the development of a natural language front end to a data base.

3. The Lexicon

            A main principle of horatio is that information which belongs to the lexicon should be stored in the lexicon. A prime example is frame information, i.e. information on the syntactic (and/or semantic) environment a given item can or must fit into. The lexical entries themselves contain the relevant frames; they do not refer to information stored elsewhere. Consider the entries for ALLOW in horatio:

m_verb(verbtr,allow_1,allow,allow,allow,allows,allowing,
        allowed,allowed,allowed,trans,abstract,
        [np(oblig,posprec(1,Wnp),object,abstract)]).

/* the facts allow the explanation */

m_verb(vthat,allow_2,allow,allow,allow,allows,allowing,
        allowed,allowed,allowed,trans,human,
        [s(oblig,posprec(1,Precs),object)]).

/* she allows that he is good */

m_verb(vio,allow_3,allow,allow,allow,allows,allowing,
        allowed,allowed,allowed,trans,human,
        [np(oblig,posprec(2,Wnp1),object,thing),
         io(oblig,posprec(1,W2),indirect_object,human,_)]).

/* the teacher allows the boys money for books */

m_verb(vinf,allow_4,allow,allow,allow,allows,allowing,
        allowed,allowed,allowed,trans,_,
        [np(oblig,posprec(1,Wnp),surf_object,_),
         np_vp(oblig,to_inf,object)]).

/* they allowed him to teach linguistics */

m_verb(vobjadv,allow_5,allow,allow,allow,allows,allowing,
        allowed,allowed,allowed,trans,human,
        [np(oblig,posprec(1,Wnp),object,human),
         pp(oblig,posprec(1,Wpp),pp_arg,_,direction,_)]).

/* he allowed the girl into the library */

m_verb(vtrprep,allow_for_1,allow,allow,allow,allows,allowing,
        allowed,allowed,allowed,trans,human,
        [pp(oblig,posprec(1,Wpp),pp_arg,_,_,for)]).

/* he allowed for the oversimplifications */

m_verb(vtrprep,allow_for_1,allow,allow,allow,allows,allowing,
        allowed,allowed,allowed,trans,human,
        [string(oblig,posprec(1,0),[for]),
         np(oblig,posprec(2,Wnp),object,_)]).

/* he allowed for the oversimplifications */

(The existence of two m_verb clauses for the same reading of ALLOW is explained below, p. 19.)

            The arguments appear in a list which is the last argument of the predicate m_verb, which acts as a macro-clause. The first argument is the class the predicate belongs to; the second is the lexeme value, including the reading number; positions 3 to 10 take care of inflectional morphology; position 11 is the value for the transitivity feature; position 12 is a semantic restriction on the deep subject[6]. Each element of the argument list opens with the value for the optionality feature, either oblig(atory) or opt(ional). The posprec structure, which is used to establish linear precedence, is discussed below, p. 91. The nature of a given argument in the lexical predicate's argument list is of course given by the functor of the structure (such as string, np, pp, etc. in the entries for ALLOW). A feature that recurs across argument types is the one for surface or deep gf (grammatical function).
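            The positional layout just described can be illustrated with a minimal sketch. The two facts are the entries for allow_1 and allow_3 as cited above (13-argument form), with variables given a leading underscore only to avoid singleton warnings; the helpers verb_frame and arg_types are hypothetical and are not part of horatio.

```prolog
m_verb(verbtr,allow_1,allow,allow,allow,allows,allowing,
        allowed,allowed,allowed,trans,abstract,
        [np(oblig,posprec(1,_Wnp),object,abstract)]).
m_verb(vio,allow_3,allow,allow,allow,allows,allowing,
        allowed,allowed,allowed,trans,human,
        [np(oblig,posprec(2,_Wnp1),object,thing),
         io(oblig,posprec(1,_W2),indirect_object,human,_)]).

% verb_frame/5 (hypothetical): picks out positions 1, 2, 11, 12 and 13,
% skipping the eight morphology slots in positions 3 to 10.
verb_frame(Lex, Class, Trans, Semsubj, Args) :-
    m_verb(Class,Lex,_,_,_,_,_,_,_,_,Trans,Semsubj,Args).

% arg_types/2 (hypothetical): the functor and the optionality value
% of each element of an argument list.
arg_types([], []).
arg_types([Arg|Args], [Type-Opt|Rest]) :-
    Arg =.. [Type,Opt|_],
    arg_types(Args, Rest).
```

The query verb_frame(allow_3, Class, Trans, Semsubj, Args), arg_types(Args, Types) yields Class = vio, Trans = trans, Semsubj = human and Types = [np-oblig, io-oblig].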

            The advantage of putting lexical information in the lexicon is obvious: additions, changes or enhancements in the argument structure of lexical predicates (whether individual predicates or whole classes) do not entail changes in the grammar.

            An alleged disadvantage is the size of the lexicon, which very soon grows rather bulky. However, this disadvantage is not a real one because lexical entries need not be produced as such by the linguist or lexicographer; they can result from the expansion of macro-clauses, either within or outside Prolog. Besides, lexical entries can be imported from a machine-readable dictionary (MRD), as in the importation from ldoce (The Longman Dictionary of Contemporary English) to horatio, discussed in Appendix E. The task of the linguist or lexicographer is then reduced to selecting retrieval criteria and checking and expanding the resulting entries. A string manipulation language such as AWK is an ideal tool for performing the necessary format transformations.

            As for consultation, at least for languages such as English, it is not necessary to load the whole lexicon into Prolog. Selective downloading can easily be achieved by a simple AWK program, such as getvoc.awk (see Appendix D, page 123).

3.1. Use of Macros

            In horatio macro-expansion is done in Prolog. Consider an m_verb clause and one of its expansion clauses:

Macro-clause

m_verb(verbtr,_,allow_1,allow,allow,allow,allows,allowing,
        allowed,allowed,allowed,trans,abstract,
        [np(oblig,posprec(1,Wnp),object,abstract)]).

/* the facts allow the explanation */

Expansion for third person singular present tense inflectional form

verb([Vs|X],X,Class,[predicate(Lex,agr(finite,
            present,sing,third))],
        finite,present,sing,thirdsg,Semsubj,Args):-
m_verb(Class,Part,Lex,_,_,_,Vs,_,_,_,_,_,Semsubj,Args).

            The verb clause is responsible for progression in the input list. The word that is to be read in must be a third person singular present tense form. This form is read off the macro-clause, where it occupies a certain position in the predicate's argument list. The agr functor is filled in the verb clause and appears as part of the structure that is returned in the parse tree. Other relevant information is transferred from the macro-clause to its expansions by unification (such as for instance the argument list Args).

            By having as many verb clauses as is warranted by the inflectional paradigm associated with English verbs we manage to account - in a fairly economical fashion - for all possible forms for the verbs our lexicon includes.
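            The way the expansion clause consumes a word can be tried out in isolation. The sketch below repeats the macro-clause and the expansion clause cited above (with Wnp and Part renamed _Wnp and _Part only to avoid singleton warnings) and adds a hypothetical wrapper, read_verb, which is not part of horatio.

```prolog
m_verb(verbtr,_,allow_1,allow,allow,allow,allows,allowing,
        allowed,allowed,allowed,trans,abstract,
        [np(oblig,posprec(1,_Wnp),object,abstract)]).

verb([Vs|X],X,Class,[predicate(Lex,agr(finite,
            present,sing,third))],
        finite,present,sing,thirdsg,Semsubj,Args):-
    m_verb(Class,_Part,Lex,_,_,_,Vs,_,_,_,_,_,Semsubj,Args).

% read_verb/4 (hypothetical wrapper): reads a third person singular
% present tense verb off the front of Words.
read_verb(Words, Rest, Tree, Args) :-
    verb(Words, Rest, _Class, Tree,
         finite, present, sing, thirdsg, _Semsubj, Args).
```

The query read_verb([allows,the,explanation], Rest, Tree, Args) binds Rest to [the,explanation] and Tree to [predicate(allow_1,agr(finite,present,sing,third))]. With [allow,the,explanation] as input the query fails, because allow does not occupy the third person singular position of the macro-clause.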

            It is of course possible to generate the macro-clauses themselves, at least partially. In fact, it is even possible to import them from a computerized dictionary such as LDOCE, once it has been converted to data base format, as it has been at the University of Liège. The interested reader is referred to Appendix E for a full discussion, including the complete code of the AWK programs that take care of the necessary reformatting operations.

3.2. Double Analysis

            Quirk et al. 1985 and Bresnan 1981 argue cogently that some English lexical constructions can be parsed in two ways. Such a double analysis is necessary to account for the syntactic manipulations that these constructions admit of.

            A case in point for English is the verb+preposition combination, as in LOOK AT. We can conceive of LOOK AT as a transitive verb like any other, or we can conceive of it as the verb LOOK governing a prepositional phrase headed by AT. Schematically:

1) LOOK AT + NP
2) LOOK + PP (AT)

            The following sentences illustrate two of the syntactic manipulations (WH-movement and passivization) that lead one to postulate the need for a double analysis. Others can be found in Quirk et al. 1985 and Bresnan 1981.

1: What are you looking at?
    The man he was looking at ...
    The problem has been looked at from every angle
2: The text at which we have been looking for too long ...

            Pulman in Alshawi et al. 1992 (p. 74) points out that if take advantage of is treated as a complex V only one passive can be derived in the GPSG meta-rule treatment of the passive, because advantage will not be available as an NP node for the meta-rule to apply to. Consequently, only the first of the following two passive S's will be generated:

Kim was taken advantage of.
Advantage was taken of Kim.

This leads Pulman to reject the GPSG treatment. But the problem disappears if a double analysis is provided, evidence for which is precisely the availability of two passives.

            It should be noted that the need for double analysis of some lexical constructions is not limited to English. Consider AVOIR L'AIR in French. We need to assign the following two analyses:

1) AVOIR L'AIR + ADJ
2) AVOIR + NP (AIR + ADJ)

on account of the two ways in which agreement can be made (either with AIR or with the subject of the whole phrase AVOIR L'AIR):

Elle a l'air idiote.
Elle a l'air idiot.

            In horatio the lexicon file lexatio2.ari holds two macro-clauses for prepositional verbs such as LOOK AT. The first caters for the analysis in which the preposition belongs to the prepositional phrase rather than to the verb (analysis 2 in our account). The arglist contains a prepositional phrase specified in terms of the preposition heading it (AT in the case of LOOK AT):

m_verb(vtrprep,_,look_at_1,look,look,look,looks,looking,
        looked,looked,looked,trans,living,
        [pp(oblig,posprec(1,Wpp),pp_arg,_,_,at)]).

/* they were looking at her
    the girl at whom they had been looking */

            The second macro-clause identifies AT as a particle to be appended immediately to the right of the verb LOOK (second argument of the macro-clause). The arglist opens with a string (AT) and further contains the np playing the object role:

m_verb(vtrprep,part0:at,look_at_1,look,look,look,looks,looking,
        looked,looked,looked,trans,living,
        [string(oblig,posprec(1,0),[at]),
         np(oblig,posprec(2,Wnp),object,_)
]).

/* they were looking at her
    whom are they looking at ? */

            It should be noted that in both m_verb clauses the lexeme value is the same, viz. look_at_1. We are dealing with the same lexical item.

            In the analysis of sentences such as They were looking at her, both m_verb clauses will succeed, and two parses will be returned. Such redundancy is not felt to be a negative feature, as the relationship between verb and preposition is truly indeterminate in such cases.
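            The effect of a double lexical analysis can be illustrated with a deliberately reduced sketch. The predicates toy_verb, toy_np, toy_satisfy, toy_args and toy_parse are hypothetical names introduced for this illustration only; they are not part of horatio, and the arglists are stripped of everything but the argument type:

```prolog
:- use_module(library(lists)).

% Toy lexicon (hypothetical): two arglists for the same lexeme,
% mirroring the two m_verb clauses for LOOK AT.
toy_verb(look_at_1, [pp(at)]).
toy_verb(look_at_1, [string([at]), np]).

% Toy noun phrase: a single pronoun.
toy_np([her|P], P, np(her)).

% One clause per argument type.
toy_satisfy(pp(Prep), [Prep|P0], P1, pp(Prep, NP)) :-
    toy_np(P0, P1, NP).
toy_satisfy(string(S), P0, P1, string(S)) :-   % tree kept here for visibility only
    append(S, P1, P0).
toy_satisfy(np, P0, P1, NP) :-
    toy_np(P0, P1, NP).

% Satisfy the arguments from left to right.
toy_args([], P, P, []).
toy_args([A|As], P0, P2, [T|Ts]) :-
    toy_satisfy(A, P0, P1, T),
    toy_args(As, P1, P2, Ts).

% Both lexical clauses are tried on backtracking.
toy_parse(Words, Lexeme, Trees) :-
    toy_verb(Lexeme, Args),
    toy_args(Args, Words, [], Trees).

% ?- findall(T, toy_parse([at,her], look_at_1, T), Ts).
% Ts = [[pp(at,np(her))], [string([at]),np(her)]].
```

Both clauses of toy_verb succeed on the same word list, so two analyses are returned for the same lexeme, which is the behaviour described above for They were looking at her.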

            Parsing an S such as the problem was paid attention to relies in horatio on the availability of a lexical entry for pay attention to in which attention to is simply a string appended to pay, the only argument being the np inside the to-phrase, and therefore the only candidate for the subject role in the passive clause:

m_verb(vtrphrprep,part0:'attention to',
pay_attention_to_1_a,pay,pay,pay,pays,paying,
      paid,paid,paid,trans,human,
      [string(oblig,posprec(1,0),[attention,to]),
      np(oblig,posprec(2,Wnp),object,_)]).

            On the other hand parsing the other passive (attention was paid to the problem) will take advantage of the np node whose head is the word attention:

m_verb(vobjfreepp,_,pay_attention_1,pay,pay,pay,pays,paying,
      paid,paid,paid,trans,human,
      [np(oblig,posprec(1,Wnp),object,attention),
      pp(oblig,posprec(1,Wpp),pp_arg,_,_,to)]).

Note that here we have a pp argument governed by the preposition to, which is necessary to account for such relative clauses as to which he had paid great attention.

3.3. Verb Classes

            In horatio the lexical clauses for verbs (m_verb clauses) have as first argument the class the verb belongs to. Such a piece of information is used as a handle, useful when we wish to have a quick and easy way of ascertaining that a given verb is appropriate for the operation we want to perform. For instance, it is very handy to be able to pick out quickly raising or extraposition verbs, or copula verbs.

            However, we do not use the sharing of verb class as a necessary or sufficient condition for the two verbs to be coordinated in a verb phrase. We shall see that we cannot readily dispense with a double parsing of the remaining word list, once as arglist of the first verb, and once as arglist of the second (see the section on hard coordination, p. 94).

            The most important part played by the verb class is to provide an entry point for consistency checks and template determination in a lexicographer's workbench, or in the importation process of lexical material from a machine-readable dictionary.

3.4. Raising and Control

            The treatment of raising and control in horatio is based on the distinction between surface and deep grammatical functions. The main principle is that a subject will go on playing the subject role until a potential new subject is found.

            I should point out that I disagree with McCord as to the potential subject status of the indirect object. He claims (cf. McCord 1987, p. 346) that the slot frame for PROMISE should have an iobj slot, not an obj slot, and that the indirect object (iobj) cannot play the part of new subject in complement verb phrases, whereas obj (direct objects) and pobj (prepositional objects) can. Therefore Bill is able to play the part of subject of the complement vp (to see Mary and to find Mary) in the first two of the following sentences (because Bill is obj in the first and pobj in the second), but not in the third, where Bill is iobj:

            John wants Bill to see Mary.
            John depended on Bill to find Mary.
            John promised Bill to see Mary.

            I hold PROMISE to be a real exception, to be marked as such in the lexicon. Compare the following sentences, which show that PROMISE and TEACH behave the same way, syntactically:

I promised him to teach linguistics.
I taught him to teach linguistics.
What did I promise him ?
What did I teach him ?
Who did I promise to teach linguistics ?
Who did I teach to teach linguistics ?
I promised it to him.
I taught it to him.

            The distinction between the two arguments cannot be captured by syntactic tests. Him (Who) is iobj (indirect object) in both cases, but cannot fill the new subject slot (i.e. that of subject for any vp to the right) in the case of PROMISE, whereas it can (and does) in the case of TEACH and other verbs governing an indirect object.

3.5. Multi-Word Units

            In horatio multi-word units (mwus) are dealt with according to the degree of morphological, syntactic and lexical frozenness that they exhibit.

            Certain pieces of structure develop ties (a degree of internal cohesiveness) that go beyond what the grammar predicts, or have meanings (and often translations) that are non-compositional with respect to the grammar being used, or, as is often the case, display both these characteristics at one and the same time. We call them mwus.

            Mwus illustrate the non-givenness of the lexicon. More than single word units, they are theoretical constructs. Their recognition, and the structure that they are assigned, should result from their behaviour in discourse, more precisely from their potential for manipulation. The main principle adhered to in horatio is that mwus should be assigned as little structure as their behaviour warrants. It is this amount of assigned structure which determines the appropriate techniques to be used for the recognition of mwus from their manifestation in discourse. To give just one example: in order to recognize the mwu TAKE PLACE we look for a morphological form of the verb TAKE immediately followed by the string P-L-A-C-E; we do not look for an object NP whose realization is the noun PLACE; we do not look for the noun PLACE either. Consequently, the entry for TAKE PLACE runs as follows:

m_verb(vidiomintr,part0:place,take_place_1,take,take,take,takes,taking,
        took,took,taken,intrans,abstract,
        [string(oblig,posprec(1,0),[place]),
         pp(oblig,posprec(2,Wpp),pp_arg,_,location,_)]).

/* the workshop took place in the university */

            Mwus also illustrate the arbitrariness of the grammar-lexis distinction. In horatio there is no linguistically motivated border between syntax and lexis. We can choose to say that unit clauses (a Prolog concept) make up the dictionary of the system, but then the term dictionary is no longer used in a sense that is relevant to linguistic theory.

            In order to assess the degree of internal cohesion of mwus we explore three classes of manipulation:

3.5.1. Insertion

Insertion of material into the lexical unit; compare:

PLAY A ROLE ---> PLAY AN important ROLE

SET FIRE TO ---> * SET dangerous FIRE TO

            This type of insertion (insertion of modifiers attached to elements belonging to a piece of the mwu) should be distinguished from:

a) interruption of the mwu by foreign material:

            he PAID, if I may say so, ATTENTION TO the problem
            * the match TOOK, if I may say so, PLACE in the library

b) insertion into the mwu of one or several of its arguments:

            he TOOK the problems INTO ACCOUNT (insertion of the object 'the problems' into the mwu TAKE INTO ACCOUNT)

3.5.2. Extraction

            Extraction of an element from its position within the canonical representation of the lexical unit; this basic manipulation subsumes all standard transformations effecting movement or deletion; compare:

PAY ATTENTION TO ---> attention was paid to every single detail

MAKE A FOOL OF ---> * a fool was made of the new head

3.5.3. Proformation

            Replacement of a node in the mwu by a suitable pro-form: personal or indefinite pronoun for NP, so for S, do so for (certain classes of) VP, there for PPs functioning as place adjunct, etc. Compare:

PLAY A ROLE ---> play it again

PAY ATTENTION TO ---> * pay some again / * don't pay any to him

            In horatio we distinguish (in a hierarchy from frozen to open):

a) completely frozen mwus

            A standard example is the adverb BY AND LARGE.

            These mwus have no internal structure. They should be regarded as objects of type string, with their various elements bound by the adjacency operator (i.e. white space). In particular, there is no reason whatsoever for trying to assign a part of speech to any of the constitutive elements: for example, BY is not a preposition here (or whatever else for that matter: it is no more than the sequence of letters B‑Y) and LARGE is not an adjective.

b) mwus that allow only inflectional morphology variation (in one or several of their constituents)

            Examples in horatio are TAKE PLACE and SHOOT THE BREEZE, in which TAKE and SHOOT can be inflected. Only the complete configurations are assigned structures. There is no reason to assign any structure to PLACE, which is simply the sequence of letters P‑L‑A‑C‑E. It does not behave as an NP, so is not an NP. It does not behave as a noun, so is not a noun. We have already given the entry for TAKE PLACE. Here is the one for SHOOT THE BREEZE:

m_verb(vidiomintr,part0:'the breeze',
              shoot_the_breeze_1,shoot,shoot,shoot,shoots,shooting,
        shot,shot,shot,intrans,human,
        [string(oblig,posprec(1,0),[the,breeze])]).

c) mwus which can be interrupted by one or several of their arguments.

            An example in horatio is TAKE INTO ACCOUNT. TAKE can be inflected. TAKE and INTO ACCOUNT can be separated by the object of the mwu:

he took the problems into account
he took into account the problems that she had seen

(the relevant feature for the position of the object is its weight). Here is the entry for TAKE INTO ACCOUNT:

m_verb(vobjfixedpp,part1:'into account',
         take_into_account_1,take,take,take,takes,taking,
        took,took,taken,trans,human,
        [string(oblig,posprec(1,3),[into,account]),
         np(oblig,posprec(1,Wnp),object,_)]).

d) collocations: these are mwus whose elements are free to behave as the normal (i.e. with respect to a given grammar) structure assignment predicts. An example in horatio is TAKE MEASURE, where both TAKE and the NP whose head is the noun MEASURE behave as predicted by the 'normal' structure assignment:

VP [ V [TAKE] NP [ ... Head N [MEASURE]]]

            The link between TAKE and MEASURE is collocational, i.e. TAKE is the preferred verb to express what it expresses here. The implementation of such a lexical affinity in horatio is achieved through a feature on the object of TAKE, namely [measure], a feature which is assigned to the noun MEASURE under one of its readings. Such features can be regarded as hyperspecialised semantic features, i.e. it is hypothesized that they will not be needed alongside semantic features, and that consequently they can share the same slot. The entry for TAKE MEASURE looks like this:

m_verb(verbtr,_,take_measure_1,take,take,take,takes,taking,
        took,took,taken,trans,human,
        [np(oblig,posprec(1,Wnp),object,measure)]).

            horatio also has the corresponding entry for the noun MEASURE when used in the TAKE MEASURE collocation:

m_noun(measure_1,measure,measures,[measure],[]).

            In connection with the implementation of mwus it should be noted that when we satisfy (i.e. match against the input word list) a fixed string, we return no parse tree, as the fixed string is included in the predicate's lexical entry (as in look_down_on_1) as well as in the predicate's arglist:

satisfy(P0,P1,[],0,Posprec,Rel,Intrel,[],
      string(Type,Posprec,String),_,_):-
append(String, P1, P0).

            The String appended to the remaining list should yield the input list. In the lexical entry, String is a list, as in the following element:

string(oblig,posprec(1,0),[down])

which is part of the arglist in the entry for look down on:

m_verb(vtrphrprep,part0:down,look_down_on_1,look,look,look,looks,looking,
      looked,looked,looked,trans,human,
      [string(oblig,posprec(1,0),[down]),
      pp(oblig,posprec(2,Wpp),pp_arg,_,_,on)]).

/* the teacher looked down on his students */
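
            The matching of a fixed string reduces to a single call to append. A minimal sketch, in which match_string is a hypothetical helper name (horatio performs the same call inside the satisfy clause given above):

```prolog
:- use_module(library(lists)).

% match_string(+String, +P0, -P1)  (hypothetical helper)
% String is the fixed word list taken from the lexical entry,
% P0 the input word list, P1 what remains once String has been
% consumed from the front of P0.
match_string(String, P0, P1) :-
    append(String, P1, P0).

% ?- match_string([down], [down,on,his,students], Rest).
% Rest = [on,his,students].
%
% ?- match_string([place], [in,the,library], Rest).
% false.
```

Because append is called with its first and third arguments instantiated, the call is deterministic: it either strips the string from the front of the word list or fails.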


3.6. Inflectional Morphology

            horatio works with a full form dictionary (of course, morphological variants can be looked up -irregular forms- or generated -regular forms- cf. Appendix E); verbs have nine morphological variants:

am       first person singular, present tense
is       third person singular, present tense
was      first person singular, past tense
was      third person singular, past tense
were     second person or plural, past tense
are      second person or plural, present tense
being    ING form
been     past participle
be       infinitive

Note: even BE does not have nine, but eight, distinct forms; the assignment of nine forms is grammar-internal.
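
            The difference between the nine grammar-internal slots and the eight distinct forms of BE can be checked mechanically. In the sketch below, nine_forms and distinct_count are hypothetical names used for illustration; the list simply repeats the nine slots in the order in which they are given above:

```prolog
% nine_forms(?Lexeme, ?Forms): the nine slots, in the order of the
% list above (hypothetical storage format, for illustration only).
nine_forms(be, [am, is, was, was, were, are, being, been, be]).

% distinct_count(+Lexeme, -N): number of distinct forms among the
% nine slots; sort/2 removes duplicates.
distinct_count(Lexeme, N) :-
    nine_forms(Lexeme, Forms),
    sort(Forms, Set),
    length(Set, N).

% ?- distinct_count(be, N).
% N = 8.
```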

4. The Grammar: Interpreting the Lexicon

            horatio is a lexicon-driven parser. The main task of the grammar component is to interpret the information contained in the lexical predicates of the system.

            What does interpreting the lexicon actually mean? Consider the arglist for the second reading of ALLOW, repeated here for convenience:

[s(oblig,posprec(1,Precs),object)]

            This argument is interpreted by a clause for the predicate satisfy[7], which recursively calls the grammar for the parsing of a sentence (job of the xsentence predicate):

satisfy(P0,P1,Gap,Prefgen,posprec(Pos,4),
             Rel,Intrel,[Function,Tree],
          s(Type,posprec(Pos,4),Function),
          subject(SUBJ,Semsubj),
          subject(SUBJ,Semsubj)):-

         xsentence(P0,P1,Gap,Pref,Tree,finite,Person,Number,Voice),
         Prefgen is Pref + 4.

            The same process is used for other argument types; for instance, to satisfy an np arg, a call on the nounphrase predicate is made; information can be read off the lexicon when necessary, as in this case information on the function to be filled by the np. The argument list is traversed non-deterministically. The checking of linear precedence is discussed below (see p. 91).

4.1. General Strategy

Arguments and Modifiers

            Arguments are lexically determined. They are either obligatory or optional. In horatio, each lexical argbearer (argument bearer) has an arglist (argument list) associated with it in the lexicon (the arglist is always the last argument of a lexical predicate).

            Modifiers are not associated with lexical items, but with syntactic classes. They are always optional.

            When an argbearer participates in a syntactic construction, its obligatory arguments must be satisfied in the construction. In addition, they must appear in a sequence which satisfies the precedence relation: each argument must satisfy the precede predicate with respect to the argument which follows it in the left to right order of the word list to be parsed.

            Let us take the example of an arglist associated with a verb. The verb CONSIDER can take, under one of its readings, an arglist consisting of an object and an object complement:

m_verb(vcomp,_,consider_2,consider,consider,consider,considers,
      considering,
        considered,considered,considered,trans,human,
        [np(oblig,posprec(1,Wnp1),object,_),
         np(oblig,posprec(2,Wnp2),object_attribute,_)
]).

/* he considered the claim she made an oversimplification */

            This arglist will be passed on to the predicate arglist when the verb phrase which has consider as main verb is parsed by the predicate verbphrase:

verbphrase(P1,P3,subject(SUBJ,Semsubj),Gap,Pref,
               [pred_arg_mod_structure,
               prop(vce:V,asp:A,mod:Modality,tns:Tense),
               VERB,SParse],
             Rel,Intrel,Type,Tense,aspect(Aspect),Modality,
             Number,Person,Voice,nsubject(NSUBJ,Nsem)):-
    verb(P1,P2,Class,VERB,Type,Tense,
          Number,Person,Semsubj,Args) ,
    arglist(P2,P3,Gap,Status,Pref,Preclist,Rel,
               Intrel,Voice,Parse,vp,
               Args,Func,subject(SUBJ,Semsubj),
               nsubject(NSUBJ,Nsem),
               Class),
     (nonvar(Aspect); var(Aspect),A = none ),
     (Aspect = [] , A = none; Aspect \= [],A = Aspect ),
     (nonvar(Modality); var(Modality),Modality = none ),
     (nonvar(Tense); var(Tense), Tense = present ),
     (nonvar(Voice), V = Voice; var(Voice),V = active ),
     append([NSUBJ],Parse,AParse),
     insort(AParse,SParse).

            The predicate arglist will call reog to deal with possible subject changes induced by passivization and other subject-changing transformations and then will call satisfylist with the new, reorganized, arglist:

arglist(P0,P2,Gaps,ArgOrModFound,
              Pref,Posprec1,Rel,
              Intrel,Voice,Parse,NpOrVp,List,
              Func,subject(SUBJ,Semsubj),
              nsubject(NSUBJ,Nsem),
              Class):-
              reog(Voice,Class,subject(SUBJ,Semsubj),List,
                   nsubject(NSUBJ,Nsem),Nlist,Func),
              satisfylist(P0,P2,Gaps,ArgOrModFound,Pref,Posprec1,
                  Rel,Intrel,Voice,Parse,NpOrVp,Nlist,
                  Func,subject(NSUBJ,Nsem)).

            The predicate satisfylist[8] will non-deterministically pick an element out of the arglist and try to satisfy it by calling the predicate satisfy; it will then go on to try and satisfy the remainder of the arglist, making sure that the precede relation between the satisfied argument and the remainder is satisfied:

satisfylist(P0,P2,Gaps,ArgOrModFound,
              Pref,Posprec1,Rel,
              Intrel,Voice,Parse,NpOrVp,List,
              Func,subject(SUBJ,Sem)):-
              append(Gap1,Gap2,Gaps),
              pick(List,Elem,Tail),

/* pick[9] is non-deterministic selection of an element from a list:
List is the list to select from
Elem is the selected element
Tail is List from which the selected element has been deleted */

              satisfy(P0,P1,Gap1,Pref1,Posprec1,
                      Rel,Intrel,Parse1,
                      Elem,subject(SUBJ,Sem),
                      subject(NEXTSUBJ,Nsem)),
             ArgOrModFound = 1,
              satisfylist(P1,P2,Gap2,ArgOrModFound,
                          Pref2,Posprec2,Rel,Intrel,Voice,
                          Parse2,NpOrVp,Tail,
                          Func,subject(NEXTSUBJ,Nsem)),
              precede(Posprec1,Posprec2),
              accu(Pref,[Pref1,Pref2]),
              append([Parse1],Parse2,Parse).
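
            A minimal sketch of pick, assuming the usual definition of non-deterministic selection from a list (it then behaves like select/3 with the arguments reordered); it is given as an illustration and is not necessarily horatio's own definition:

```prolog
% pick(+List, -Elem, -Tail): Elem is some element of List, Tail is
% List without that element (assumed definition).
pick([Head|Tail], Head, Tail).
pick([Head|Tail], Elem, [Head|Rest]) :-
    pick(Tail, Elem, Rest).

% On backtracking every element is selected in turn:
% ?- findall(E-T, pick([a,b,c], E, T), L).
% L = [a-[b,c], b-[a,c], c-[a,b]].
```

Because every element can be selected first, the arglist is not tied to the order in which the arguments are listed in the lexicon; it is the precede check that filters out the orders that are not acceptable.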

            The predicate satisfy is defined in different ways according to the argument that it is passed. In the case of CONSIDER, it will be passed two nps, and the following clause will be triggered:

satisfy(P0,P1,Gap,Pref,posprec(Pos,Prec),Rel,Intrel,
                [Function,Rest],
                np(Type,posprec(Pos,Prec),
                Function,Semvp),
                subject(SUBJ,Semsubj),
                subject([subject,Rest],Semvp)):-

    nsubject(Function),
    xnounphrase(P0,P1,Gap,index(J),
           Prefnp,Prec,Rel,Intrel,
            Function,
            [Function,Rest],
            Number,Person,Sem),
    sfok(Semvp,Sem),
    Pref is Prefnp + 4.

            Satisfy calls the xnounphrase predicate to parse the argument np. It will be called twice, as the two elements in the arglist are both nps (but they do not fill the same function, the first being the object, and the second the complement, unless weight considerations have disturbed the canonical arg order).

            How is the difference between obligatory and optional arguments accounted for? We have seen that satisfylist calls itself recursively. When it cannot succeed by parsing more of the input, it is allowed to succeed doing nothing, provided the arglist no longer contains obligatory args, i.e. all the remaining args are optional:

satisfylist(P0,P0,Gap,_,0,_,_,_,_,[],_,List,Func,_):-
                                                      allopt(List).

            Allopt checks that all the args are optional. It does so by looking at their Type, which is always the first argument of the term representing the arg (whose functor is np, pp, s, etc.):

allopt([Head|Tail]):-
                      arg(1,Head,opt),
                      allopt(Tail).

allopt([]).
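
            The behaviour of allopt can be tried out in isolation. The sketch below repeats the two clauses given above; the arglists used in the queries are made up for the illustration and do not come from the horatio lexicon:

```prolog
% allopt(+Arglist): succeeds if every remaining arg is optional,
% i.e. has opt as the first argument of its term.
allopt([]).
allopt([Head|Tail]) :-
    arg(1, Head, opt),
    allopt(Tail).

% ?- allopt([pp(opt,posprec(2,_),pp_arg,_,_,to)]).
% true.
%
% ?- allopt([pp(opt,posprec(2,_),pp_arg,_,_,to),
%            np(oblig,posprec(1,_),object,_)]).
% false.
```

The empty arglist is trivially all-optional, which is what lets satisfylist stop once every argument has been consumed.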

            Modifiers are parsed by the predicate modifier. The crucial difference is that their parsing does not affect the arglist:

satisfylist(P0,P2,Gaps,ArgOrModFound,
              Pref,Posprec1,Rel,
              Intrel,Voice,Parse,NpOrVp,List,
              Func,subject(SUBJ,Sem)):-
              append(Gap1,Gap2,Gaps),
              modifier(P0,P1,NpOrVp,Gap1,
                       Prefmod,Posprec1,
                       Rel,Intrel,Parse1,
                       subject(SUBJ,Sem)),
              ArgOrModFound = 1,
              satisfylist(P1,P2,Gap2,ArgOrModFound,
                          Preflist,Posprec2,
                          Rel,Intrel,Voice,
                          Parse2,NpOrVp,List,
                          Func,subject(SUBJ,Sem)),
              precede(Posprec1,Posprec2),
              accu(Pref,[Prefmod,Preflist]),
              append([Parse1],Parse2,Parse).

            The predicate modifier will parse modifier pps for both nps and vps, and will also parse ing-phrases and en-phrases as np modifiers. In the case of pps it will call on the xprepphrase predicate, and in the case of ing and en-phrases on the xverbphrase predicate. Here is, for example, the definition of the predicate modifier for the parsing of modifier pps within vps:

modifier(P0,P1,vp,Gap,Prefgen,posprec(1,Precpp),
         Rel,Intrel,Tree,subject(SUBJ,Sem)):-
         xprepphrase(P0,P1,Gap,index(J),npindex(I),Pref,Precpp,
                    Prepform,Rel,
                    Intrel,vp_modifier,
                   Tree,PPsem,PPsemnp),
         modppvp(Prepform),
         Prefgen is Pref + 2.

4.2. The S Level

            At the highest level, we have the parse predicate, with three clauses: one for declarative sentences, a second for yes-no questions and a third for wh-questions (imperatives are not covered). This highest level is the only one at which the Preference value is included in the parse tree.

parse(P0,[],[Preference,Tree]):-
      xsentence(P0,[],[],Preference,Tree,finite,Person,Number,Voice).

            Note that the S parsed by xsentence must be finite and that the whole string must have been traversed. The first arg of xsentence, P0, is the input word list and the second arg is the remaining word list to be traversed. It is here set to the empty list ([]). The third argument of xsentence is the gap specification. It is also set to the empty list: a main declarative clause cannot feature any gap.

parse(P0,[],[Preference,Tree]):-
      yesnoquestion(P0,[],Preference,Tree).

parse(P0,[],[Preference,Tree]):-
      whquestion(P0,[],Preference,Tree).

4.2.1. Declarative Clauses

            The main predicate here is xsentence. In horatio a predicate name beginning with an x, such as here xsentence, is used to parse a phrase that can, but need not, result from the coordination of two phrases of the type indicated by the predicate name without the x. xsentence will take care of both simple and coordinated S's. In the case of coordinated S's, xsentence will make a call on c_sentence and then recursively call itself.

xsentence(P0,P2,[],Prefs,[and_sentence,S1,S2],Type,Person,
            Number,Voice):-
inlist(and,P0),
c_sentence(P0,[and|P1],[],Pref1,S1,Type,Person,Number,
            Voice),
xsentence(P1,P2,[],Pref2,S2,Type,Person2,Number2,Voice2),
accu(Prefs,[Pref1,Pref2]).

            In the case of a simplex S (non-coordinated main clause), xsentence will simply make a call on c_sentence:

xsentence(A,B,C,D,E,F,G,H,I):- c_sentence(A,B,C,D,E,F,G,H,I).
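
            The x-predicate pattern can be seen at work in a reduced sketch. The predicates c_item and x_item are hypothetical stand-ins for c_sentence and xsentence, a "phrase" being here a single proper name; memberchk is used where horatio calls inlist, on the assumption that inlist is a plain membership test:

```prolog
% c_item(+P0, -P1, -Tree): toy simple phrase, one name from a
% closed list (hypothetical).
c_item([Word|P], P, Word) :-
    memberchk(Word, [john, mary, bill]).

% x_item(+P0, -P2, -Tree): simple or coordinated phrase.
x_item(P0, P2, [and_item, I1, I2]) :-
    memberchk(and, P0),               % cheap check before any parsing
    c_item(P0, [and|P1], I1),         % first conjunct, followed by and
    x_item(P1, P2, I2).               % remaining conjunct(s)
x_item(P0, P1, Item) :-
    c_item(P0, P1, Item).

% ?- x_item([john,and,mary,and,bill], [], T).
% T = [and_item,john,[and_item,mary,bill]].
```

The recursive call on the right conjunct yields a right-branching structure when more than two phrases are coordinated.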

            c_sentence parses main clauses that can, but need not, be flanked by an adverbial subordinate clause on either side. Here is the code for a main clause preceded by an adverbial subordinate clause:

c_sentence(P0,P2,[],Preftot,[sentence,Adverbs,[S]],
                              finite,Person,Number,Voice):-
adverb_sentence(P0,P1,[],Prefsub,Adverbs,finite,Person1,Number1,Voice1),
sentence(P1,P2,[],Prefmain,S,finite,Person,Number,Voice),
accu(Preftot,[Prefsub,Prefmain]).

            The adverb_sentence predicate parses an adverbial subordinate clause by finding a subordinator and parsing an S:

adverb_sentence(P0,P2,[],Pref,[SUB,S],finite,Person,Number,Voice):-
sub(P0,P1,SUB,_),
/* subordinating conjunction; last position in argument list is currently unused */
sentence(P1,P2,[],Pref,S,finite,Person,Number,Voice).

            Finally, a c_sentence can have no adverbial subordinate clause, but consist of a main clause only:

c_sentence(A,B,C,D,E,F,G,H,I):- sentence(A,B,C,D,E,F,G,H,I).

We can now look at the sentence predicate itself.

sentence(P0,P2,Gaps,Prefs,[clause,VP],Type,Personvp,Number,Voice):-

append(Gapnp,Gapvp,Gaps),

/* it is more efficient to do the appending of the gaplists now, because there are cases where we know that the result of the appending must be the empty list */

/* the subject np */

xnounphrase(P0,P1,Gapnp,index(I),Prefnp,Weight,Rel1,
                        Intrel1,subject,SUBJ,Number,Personnp,Semsubjnp) ,
var(Rel1),

/* the Rel variables must be uninstantiated; it is the conjunction of an antecedent and a relative clause that releases, i.e. uninstantiates the Rel var; see the treatment of relative clauses below, p. 99 */

/* Subject-verb agreement */

agree(Personnp,Number,Personvp),

/* the first two arguments of agree come from the np, the third from the vp */

/* the vp */

/* the parse tree corresponding to the subject will be included in the parse tree returned by the xverbphrase predicate */

xverbphrase(P1,P2,subject(SUBJ,Semsubjvp),
                        Gapvp,Prefvp,VP,Rel2,
                        Intrel2,Type,Tense,aspect(Aspect),Modality,
                        Number,Personvp,
                        Voice,
                        nsubject(NSUBJ,Nsemsubjvp)),

/* Nsemsubjvp records the semantic restriction on the subject if the verbphrase has been found to be passive; the Voice variable is left uninstantiated in the active */

/* nsubject potentially changes the deep subject (passives, for instance); the semantic check (sfok) must therefore be between Nsemsubjvp and the Semsubjnp list */

var(Rel2),

/* semantic check */
sfok(Nsemsubjvp,Semsubjnp),

/* computing the preference index */
accu(Prefs,[Prefnp,Prefvp,4]).

4.2.1.1. Subject / Operator Agreement Rules

            Agreement between the subject np and the verb phrase is checked by the predicate agree, whose code is the following:

agree(first,sing,firstsg).
agree(first,plural,other).
agree(second,sing,other).
agree(second,plural,other).
agree(third,sing,thirdsg).
agree(third,plural,other).

            The first two arguments come from the subject np: Person and Number. The third argument comes from the verb phrase. The value firstsg is necessary for the forms was and am of BE; the value thirdsg is used to capture the agreement feature of was and of the present tense third person of most verbs. The value other is a catchall value.
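
            The way the agree facts link subject and verb form can be illustrated with the present tense of BE. The six agree facts are those given above; be_present and agreeing_form are hypothetical names, and the small feature table is an assumption made for the illustration, not an extract from the horatio lexicon:

```prolog
agree(first,sing,firstsg).
agree(first,plural,other).
agree(second,sing,other).
agree(second,plural,other).
agree(third,sing,thirdsg).
agree(third,plural,other).

% be_present(?Form, ?Feature): toy table (hypothetical) pairing
% the present tense forms of BE with their agreement feature.
be_present(am,  firstsg).
be_present(is,  thirdsg).
be_present(are, other).

% agreeing_form(+Person, +Number, -Form): the form of BE that
% agrees with a subject np of the given person and number.
agreeing_form(Person, Number, Form) :-
    agree(Person, Number, Feature),
    be_present(Form, Feature).

% ?- agreeing_form(first, sing, F).
% F = am.
% ?- agreeing_form(third, plural, F).
% F = are.
```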

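            A call to the agree predicate is made at the S level, in the definition of the sentence predicate, repeated below; the relevant pieces are the call to agree and the Personnp, Number and Personvp variables that it shares with the calls to xnounphrase and xverbphrase: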

sentence(P0,P2,Gaps,Prefs,[clause,VP],Type,Personvp,
                  Number,Voice):-
append(Gapnp,Gapvp,Gaps),
xnounphrase(P0,P1,Gapnp,index(I),Prefnp,Weight,Rel1,
                 Intrel1,subject,SUBJ,Number,Personnp,Semsubjnp) ,
var(Rel1),
agree(Personnp,Number,Personvp),
xverbphrase(P1,P2,subject(SUBJ,Semsubjvp),
                 Gapvp,Prefvp,VP,Rel2,
                 Intrel2,Type,Tense,aspect(Aspect),Modality,
                 Number,Personvp,
                 Voice,
                 nsubject(NSUBJ,Nsemsubjvp)),
var(Rel2),
sfok(Nsemsubjvp,Semsubjnp),
accu(Prefs,[Prefnp,Prefvp,4]).

4.2.1.2. Priority among Person Features

            When dealing with coordinated nps we need to call on the priority predicate to determine the person of the np resulting from the coordination. For instance she and I is first person, not third. The code for the priority predicate reflects the person hierarchy: first has priority over second and third, and second over third:

priority(first,second,first):- !.
priority(first,third,first):- !.
priority(third,first,first):- !.
priority(second,first,first):- !.
priority(third,second,second):- !.
priority(second,third,second):- !.
priority(X,X,X).

            The first two arguments are the person values of the first and second member of the coordination; the third arg is the resulting person value, the one associated with the coordination as a whole. Six clauses are needed for the unequal pairs because we need to cater for the two positions that the winning value can have in the coordination; each of them can end in a cut, since no other clause need be tried once a pair has matched. The last clause deals with the case where the two coordinated nps share the person value.
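
            The person hierarchy can also be stated once and for all as a ranking. The sketch below repeats the priority clauses given above and sets beside them an alternative formulation; rank and priority_by_rank are hypothetical names, and the alternative is offered as a possible design, not as horatio's code:

```prolog
priority(first,second,first):- !.
priority(first,third,first):- !.
priority(third,first,first):- !.
priority(second,first,first):- !.
priority(third,second,second):- !.
priority(second,third,second):- !.
priority(X,X,X).

% Alternative (hypothetical): the lower the rank, the higher the
% priority of the person value.
rank(first,  1).
rank(second, 2).
rank(third,  3).

priority_by_rank(P1, P2, P) :-
    rank(P1, R1),
    rank(P2, R2),
    (   R1 =< R2
    ->  P = P1
    ;   P = P2
    ).

% ?- priority(third, first, P).          % she and I
% P = first.
% ?- priority_by_rank(third, first, P).
% P = first.
```

The two formulations give the same result for every pair of person values; the ranking version avoids enumerating both positions of the winning value, at the cost of being slightly less direct to read.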

            The predicate priority is called in the parsing of coordinated nps:

xnounphrase(P0,P2,[],index(I),Pref,Weight,Rel,Intrel,
                  Function,
                 [Function,[and_nounphrase,Rest1,Rest2]],
                 plural,Person,Sem):-
     inlist(and,P0),
     nounphrase(P0,[and|P1],[],_,Pref1,Weight1,Rel,
                Intrel,Function,
                [Function,Rest1],
                Number1,Person1,Sem),
     xnounphrase(P1,P2,[],_,Pref2,Weight2,Rel,
                 Intrel,Function,
                 [Function,Rest2],
                 Number2,Person2,Sem2),
     priority(Person1,Person2,Person),
     accu(Pref,[Pref1,Pref2]),
     accu(Weightaccu,[Weight1,Weight2]),
     Weight is (Weightaccu/2)+1.

4.2.1.3. Passives

4.2.1.3.1. Parsing

            Passivization induces changes in the arglist associated with the passivized verb: the active subject is left out of the arglist, or demoted to head of a by-phrase; one of the other arguments is promoted to subject.

            In horatio it is the predicate reog (for reorganization of the arglist) which takes care of accounting for these changes, i.e. maintaining the relation between the arglist as expressed in the lexicon and the arguments as occurring in the passivized clause.

            It should be noted that in horatio the voice value is left uninstantiated when the clause is active, and set to passive when the clause is passive. Here is the relevant definition of the reog predicate:

reog(passive,Class,subject([SFunc,Rest],Semsubj),List,
               nsubject(NSUBJ,Nsem),Nlist,Func):-
      pick(List,Elem,Remainder),
      Elem =.. Elemlist,

/* tree to list conversion: each element in the arglist is a tree whose root indicates its nature: np, pp, s, etc... */

      Elemlist = [Nature,Type,Posprec,Func,Nsem|_],
      psubject(Func),

/* Func points to a function in the active S that can be promoted to subject of passive; this is checked by psubject  */

      NSUBJ=[Func,Rest],

/* note that the deep function is preserved in the parse tree */

      append(Remainder,
             [byphrase(opt,posprec(_,3),subject,Semsubj)],
            Nlist).

/* the subject position opened up for the active vp yields a by-phrase position to be appended to the remaining arglist */

            The code for psubject enumerates the various functions that can fill in the subject role in a passivized S:

psubject(indirect_object).
psubject(object).
psubject(surf_object).

            The last one will be used in the parsing of such sentences as She is expected to teach, where the subject (she) results from a promotion to surface subject of the main clause of a surf_object  (They expect her to teach). This surface object results from the promotion of the subject of the embedded clause (via subject-to-object raising), which is the real (i.e. deep), clausal object (she to teach).
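
            The core of the passive reorganization (select a promotable argument, remove it from the arglist, append an optional by-phrase) can be isolated in a stripped-down sketch. The predicate promote is a hypothetical simplification of reog: it ignores the subject and nsubject structures and the semantic restrictions, and uses select/3 in place of pick:

```prolog
:- use_module(library(lists)).

psubject(indirect_object).
psubject(object).
psubject(surf_object).

% promote(+Arglist, -Func, -NewArglist)  (hypothetical)
% Func is the deep function of the argument promoted to subject;
% NewArglist is the arglist without it, plus an optional by-phrase.
promote(List, Func, Nlist) :-
    select(Elem, List, Remainder),
    Elem =.. [_Nature, _Type, _Posprec, Func|_],   % term to list
    psubject(Func),
    append(Remainder,
           [byphrase(opt, posprec(_,3), subject, _)],
           Nlist).

% With the arglist of consider_2:
% ?- promote([np(oblig,posprec(1,_),object,_),
%             np(oblig,posprec(2,_),object_attribute,_)], F, N).
% F = object,
% N = [np(oblig,posprec(2,_),object_attribute,_),
%      byphrase(opt,posprec(_,3),subject,_)].
```

Only the object can be promoted here, since object_attribute is not listed by psubject; this corresponds to the claim she made was considered an oversimplification.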

            Reog is always called in the definition of the arglist predicate (passage from arglist to satisfylist), but it is allowed to succeed doing nothing in the case of active clauses to which neither raising nor extraposition applies:

reog(Voice,Class,subject(SUBJ,Semsubj),List,
              nsubject(SUBJ,Semsubj),List,Func):-
       Class \= sraising,
       Class \= extrapos,
       var(Voice),!.

            The code for arglist includes a call to reog, which, if appropriate, will reassign the subject and modify the argument list:

arglist(P0,P2,Gaps,ArgOrModFound,
              Pref,Posprec1,Rel,
              Intrel,Voice,Parse,NpOrVp,List,
              Func,subject(SUBJ,Semsubj),
              nsubject(NSUBJ,Nsem),
              Class):-
              reog(Voice,Class,subject(SUBJ,Semsubj),List,
                   nsubject(NSUBJ,Nsem),Nlist,Func),
              satisfylist(P0,P2,Gaps,ArgOrModFound,Pref,Posprec1,
                  Rel,Intrel,Voice,Parse,NpOrVp,Nlist,
                  Func,subject(NSUBJ,Nsem)).

4.2.1.3.2. Generation

            Passive in generation is the inverse of passive in analysis. Analysis yields parse trees which exhibit deep grammatical functions, and the surface relations must be re-established before the strings corresponding to the surface phrases can be generated. The whole reorganization process is in the hands of the predicate prepgen, which adheres to the transformational cycle (see the section on the cycle on page 108), and prepgen calls on passive where appropriate. Passive has a fair number of clauses on account of the various geometries of the trees it is supposed to work on: they may contain indirect objects, direct objects, subjects, in various orders. Passive has a clause for each configuration. It is not necessary to examine them all here.

            First, let us look at the clause for passive which applies when voice is active. Obviously enough, we do not want passive to fail in such cases, but rather to succeed trivially, i.e. leaving the tree untouched:

passive([H1,[pred_arg_mod_structure,prop(vce:active,B,C,D)|R1]|R2],
             [H1,[pred_arg_mod_structure,prop(vce:active,B,C,D)|R1]|R2] ):-
second_header(H1),
!.

            Second_header(H1) is only a check on the environment, namely on the clause header.

            As an example of a clause for passive that actually does something, let us look at the one for clauses with an indirect object that can be promoted to subject of the passive clause:

passive([H1,[pred_arg_mod_structure,prop(vce:passive,B,C,D),Pred1,
    [[Subject,S],
     [object|Robject],
     [indirect_object|Rio]|Otherargs]]],
 
     [H1,[pred_arg_mod_structure,prop(vce:passive,B,C,D),Pred1,
    [[subject_pass|Rio],[object|Robject],
     [pp_arg,[prepphrase,index(I),prep(by),[np_arg_of_prep,
             S]]]
|Otherargs]]]):-

subject_active(Subject),
second_header(H1) .

            The actual work gets done in the head of the clause, the body only containing checks on the environment, namely a check on the clause header (H1) and on the subject of the active clause.

            The reorganization work consists in demoting the active subject to by-phrase in the passive clause. The by-phrase is created as a pp_arg, with preposition BY governing an np_arg_of_prep corresponding to the body of the subject in the active clause. The second transformation concerns the status of the active indirect object. Its body is not changed, but its function is turned to subject_pass, i.e. subject of a passive S.

            Note that we need to take care of the case where the by-phrase has an np arg that is uninstantiated, so that we do not generate a surface by-phrase with an empty np inside. The following clause for gen does the job:

gen([pp_arg,[prepphrase,index(I1),prep(by),
      [np_arg_of_prep,[nounphrase,index(I2)|VAR]]]],[]):-
var(VAR).

The second arg of gen, which houses the generated string, is set here to the empty list ([]).

4.2.1.4. The Assignment of the Subject Role

            Control relations (which determine the assignment of one of the predicate's arguments to the subject slot of a nonfinite complement clause, such as an infinitive or ing clause) are computed while syntactically parsing the S. They are not taken out of the syntactic component and assigned to semantic interpretation rules, as they are in CLE (see Alshawi et al. 1992, section 5.3.2, VP control phenomena, p. 101 and following).

4.2.1.4.1. Control

            In Horatio, control takes care of the coindexing of a controlling argument with the argument it controls, which is always the subject of a subordinate clause which belongs to the same arglist as the controller.

            In generation, the task consists in ghosting the controlled subject, i.e. depriving it of lexical material, so that nothing is generated.

4.2.1.4.1.1. Parsing

            Consider the case of verbs such as WANT constructed with an np followed by a to-infinitive. The control relation is between the surface object of WANT and the subject of the infinitive. In a sentence such as The woman wants the teacher to teach, the teacher will be assigned as running subject[10] (i.e. subject of any vp further to the right) when it is parsed to satisfy the np argument in the arglist of WANT:

m_verb(vinf,_,want_1,want,want,want,wants,wanting,
        wanted,wanted,wanted,trans,living,
        [np(opt,posprec(1,Wnp),subject_inf,_),
         np_vp(oblig,to_inf,object)]).

            The running subject assignment will be done by the satisfy predicate:

satisfy(P0,P1,Gap,Pref,posprec(Pos,Prec),Rel,Intrel,
                [Function,Rest],
                np(Type,posprec(Pos,Prec),
                Function,Semvp),
                subject(SUBJ,Semsubj),
                subject([subject,Rest],Semvp)):-
    nsubject(Function),
    xnounphrase(P0,P1,Gap,index(J),
           Prefnp,Prec,Rel,Intrel,
            Function,
            [Function,Rest],
            Number,Person,Sem),
    sfok(Semvp,Sem),
    Pref is Prefnp + 4.

            The first subject functor in the satisfy argument list points to the running subject on entering and the second to the running subject on exiting the procedure. The parse tree returned by the xnounphrase is unified with the first argument of the second subject functor, except for the function, which is set to subject in the second subject functor.

            Nsubject checks that the function is one that can yield the running subject. It is defined by the following code:

nsubject(subject_inf).
nsubject(subject_ing).
nsubject(surf_subject).
nsubject(object).
nsubject(surf_object).
nsubject(indirect_object).

            Now that the running subject has been set, it can be assigned as subject to the to-infinitive. This is done when the to-infinitive argument is satisfied by a call to the satisfy predicate:

satisfy([to|P0],P1,Gap,Prefgen,Posprec,Rel,Intrel,
                [Function,[clause,Tree]],
                np_vp(Type,to_inf,Function),
                subject([Sfunction,Treesubj],Semsubj1),
                subject([Sfunction,Treesubj],Semsubj1)):-
  xverbphrase(P0,P1,subject([subject,Treesubj],Semsubj2),
              Gap,Pref,Tree,Rel,
              Intrel,infinitive,Tense,
              aspect(Aspect),Modality,
              Number,Person,Voice,
              nsubject(NSUBJ,Nsem)),
  checksem(Nsem,Semsubj1),
  Prefgen is Pref + 4.
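
            The threading of the running subject through the arglist can be illustrated in isolation. This is a minimal sketch, not horatio's code: the demo_ predicates, the np(Function,Tree) and to_inf(Verb) argument terms and the four-argument interface are assumptions made for the example, and word positions, gaps, preferences and semantic checks are all omitted.

```prolog
% Functions whose np can become the running subject (a subset of nsubject).
demo_nsubject(subject_inf).
demo_nsubject(surf_object).
demo_nsubject(object).

% demo_satisfy(Arg, SubjIn, SubjOut, Parse)
% An np argument replaces the running subject with its own tree.
demo_satisfy(np(Func, Tree), subject(_Old), subject(Tree), [Func, Tree]) :-
    demo_nsubject(Func).
% A to-infinitive takes the running subject as its subject and passes
% the running subject on unchanged.
demo_satisfy(to_inf(Verb), subject(Tree), subject(Tree),
             [object, [clause, [subject, Tree], Verb]]).

% demo_satisfylist(Args, SubjIn, SubjOut, Parses): satisfy the arguments
% from left to right, threading the running subject through them.
demo_satisfylist([], Subj, Subj, []).
demo_satisfylist([Arg|Args], Subj0, Subj, [Parse|Parses]) :-
    demo_satisfy(Arg, Subj0, Subj1, Parse),
    demo_satisfylist(Args, Subj1, Subj, Parses).
```

            Called with the arguments [np(subject_inf,[the,teacher]),to_inf(teach)] and the incoming subject subject([the,woman]), demo_satisfylist builds an embedded clause whose subject is [the,teacher] rather than [the,woman].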

4.2.1.4.1.2. Generation

            In generation, the controlled subject needs to be ghosted, i.e. deprived of lexical material, so that it does not appear in the generated string. We need several clauses for the predicate control, depending on the structural positions of the controller and of the clause with the controlled subject. Let us examine the one for controller as first argument and clause containing the controlled subject as second:

control([H2,
[pred_arg_mod_structure,Prop1,[predicate(Pred1,AgrPred1)],
    [[Controller,[nounphrase,index(I),AgrNP|Rest1]],
    [H1,[clause,[pred_arg_mod_structure,Prop2,[predicate(Pred,agr(Agr))],
    [[Subject,[nounphrase,index(I),AgrNP|Restsubj]]|Otherargs]]]]|R1]]],

[H2,[pred_arg_mod_structure,Prop1,[predicate(Pred1,AgrPred1)],
    [[Controller,[nounphrase,index(I),AgrNP|Rest1]],
    [H1,[clause,
[pred_arg_mod_structure,Propnew2,[predicate(Pred,agr(Agr))],
    [[Subject,[nounphrase,index(I),AgrNP|VAR]]|Otherargs]]]]|R1]]]):-

nonvar(I),
allsubject(Subject),
cv(Pred1,Requires),
second_header(H2),
first_header(H1),
controller(Controller),
nonfinite(Agr),
Prop2 = prop(Voice,Aspect,Mod,Tns),
Propnew2 = prop(Voice,Aspect,Mod,tns:Requires).

            We first check that the index is instantiated. We do not want control to be responsible for instantiation through unification.

            This is followed by a series of other checks on the environment of the rule. We make sure that the variable Subject refers to a subject function (the code for allsubject is given on page 113), and that the clause headers are as expected (see page 110). We also check that Controller points to a function that can control. The code for the controller predicate is the following:

controller(subject).
controller(object).
controller(subject_inf).
controller(subject_pass).
controller(indirect_object).

            The cv clause makes a call on the lexicon (macro clause m_verb) and checks that the verb class is that of a control verb with the help of the predicate cvclass. The predicate cvclass also gives the nature of what will remain from the controlled clause: it will be a vp, and cvclass indicates whether it will be gerundive or infinitive. The value returned by cvclass comes to occupy the position for the tense value, which does not apply to non-finite clauses. This mechanism is further explained on page 114.

            The last check is on the agreement feature of the predicate in the controlled clause. The clause must be non-finite, as control does not apply to finite clauses (pronominalization does instead).

            The ghosting job consists in replacing Restsubj, the body of the subject in the parse tree, by VAR, an uninstantiated variable.

            The generator has a clause which ensures that ghosted nps (and other ghosted elements) do not generate any output. It simply stipulates that uninstantiated variables generate the empty list, and this empty list disappears in the list appending and flattening processes which complete generation:

gen(X,[]):- var(X), !.
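
            The effect of ghosting on the output string can be reproduced with a toy generator. The sketch below is an assumption-laden simplification: demo_gen and demo_gen_append are hypothetical names, trees are plain nested lists of words, and appending is done on the fly instead of in a separate flattening pass.

```prolog
% demo_gen_append(Xs, Ys, Zs): Zs is Xs followed by Ys.
demo_gen_append([], Ys, Ys).
demo_gen_append([X|Xs], Ys, [X|Zs]) :- demo_gen_append(Xs, Ys, Zs).

% demo_gen(Tree, Words): Words is the list of words generated from Tree.
% An uninstantiated variable (a ghosted element) generates nothing.
demo_gen(X, []) :- var(X), !.
demo_gen([], []) :- !.
demo_gen([H|T], Words) :- !,
    demo_gen(H, Hwords),
    demo_gen(T, Twords),
    demo_gen_append(Hwords, Twords, Words).
% Any other atomic term is taken to be a word.
demo_gen(Word, [Word]) :- atomic(Word).
```

            The var test has to come first: without it, an unbound variable would unify with the list patterns of the later clauses and become instantiated. With it, both a ghosted element, as in [the,woman,wants,[the,teacher],[_,to,teach]], and an open list tail, as in [np|_], simply contribute nothing to the word list.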

4.2.1.4.2. Raising

            We need to distinguish Subject-to-Object raising and Subject-to-Subject raising. The first case is exemplified by the relation between the string I believe John to teach linguistics and the parse produced, revealing that the surface object is the deep subject of the subordinate clause (something linearizable as I believe [John teach linguistics]). The second case is exemplified by the surface string John seems to teach linguistics and its "source" (in transformational terms - merely metaphorical here, since horatio does not assume the existence of transformations, but is only interested in revealing relations): [John teach linguistics] seems.

4.2.1.4.2.1. Parsing

4.2.1.4.2.1.1. Subject-to-Object Raising

            In the lexicon subject-to-object raising verbs belong to the vinf class. Consider the entry for BELIEVE under the relevant reading:

m_verb(vinf,_,believe_1,believe,believe,believe,believes,believing,
        believed,believed,believed,trans,human,
        [np(oblig,posprec(1,Wnp),surf_object,_),
         np_vp(oblig,to_inf,object)
]).

/* he believes him to teach linguistics */

            When the arglist is satisfied, the np (such as him in the example) will be assigned surf_object as function. This function is an athematic one, and the np will therefore not appear in the parse tree, on account of the following defining clause for the predicate drop: drop([surf_object,Rest]).

            However, this np will have been made running subject by the satisfy predicate, since surf_object is one of the functions accepted by nsubject: nsubject(surf_object).

satisfy(P0,P1,Gap,Pref,posprec(Pos,Prec),Rel,Intrel,
                [Function,Rest],
                np(Type,posprec(Pos,Prec),
                Function,Semvp),
                subject(SUBJ,Semsubj),
                subject([subject,Rest],Semvp)
):-
    nsubject(Function),
    xnounphrase(P0,P1,Gap,index(J),
           Prefnp,Prec,Rel,Intrel,
            Function,
            [Function,Rest],
            Number,Person,Sem),
    sfok(Semvp,Sem),
    Pref is Prefnp + 4.

            When the np_vp construction in the arglist is satisfied, it is assigned the running subject as subject:

satisfy([to|P0],P1,Gap,Prefgen,Posprec,Rel,Intrel,
                [Function,[clause,Tree]],
                np_vp(Type,to_inf,Function),
                subject([Sfunction,Treesubj],Semsubj1),
                subject([Sfunction,Treesubj],Semsubj1)
):-
  xverbphrase(P0,P1,subject([subject,Treesubj],Semsubj2),
              Gap,Pref,Tree,Rel,
              Intrel,infinitive,Tense,
              aspect(Aspect),Modality,
              Number,Person,Voice,
              nsubject(NSUBJ,Nsem)),
  checksem(Nsem,Semsubj1),
  Prefgen is Pref + 4.

            The function of the whole clause is read off the lexicon (object). We therefore end up with the clausal object we need: him to teach, where him is assigned subject role.

4.2.1.4.2.1.2. Subject-to-Subject Raising

            Subject-to-Subject Raising verbs are assigned the sraising class. Consider the entry for SEEM under the relevant reading in the lexicon:

m_verb(sraising,_,seem_1,seem,seem,seem,seems,seeming,
        seemed,seemed,seemed,intrans,_,
        [vp(oblig,subject)]).
/* he seems to have taught linguistics */

            The vp structure in the arglist will be satisfied by the following defining clause for the predicate satisfy:

satisfy([to|P0],P1,Gap,Prefgen,Posprec,Rel,Intrel,
                [Function,[clause,Tree]],
                vp(Type,Function),
                subject([Sfunction,Treesubj],Semsubj1),
                subject([Sfunction,Treesubj],Semsubj1)):-
  xverbphrase(P0,P1,subject([subject,Treesubj],Semsubj2),
              Gap,Pref,Tree,Rel,
              Intrel,infinitive,Tense,
              aspect(Aspect),Modality,
              Number,Person,Voice,
              nsubject(NSUBJ,Semsubj1)),
  Prefgen is Pref + 2.

            The whole infinitive clause is assigned the subject role (read off the lexical entry). Here too the running subject (in this case the subject of SEEM) is assigned as subject of the infinitive clause. But it also needs to be made athematic in the higher clause, so that it does not appear as subject in the parse tree, the real subject being clausal. This demotion is accomplished by reog:

reog(Voice,Class,subject([subject,Rest],_),List,
        nsubject([surf_subject,Rest],_),List,Func):-
          ( Class = sraising; Class = extrapos ),
          var(Voice),!.

4.2.1.4.2.2. Generation

4.2.1.4.2.2.1. Subject-to-Object Raising

            Subject-to-Object raising in generation is accomplished by the predicate oraising. It is examined in detail in the section on the cycle, where the sentence we follow through the cycle is one where subject-to-object raising applies, namely I believe him to have been killed. The reader is referred to page 112 and following.

4.2.1.4.2.2.2. Subject-to-Subject Raising

            In generation terms Subject-to-Subject raising, as in we tend to like students, applies to subordinate clauses in subject function. Their own subject is promoted to subject of the main clause, leaving a subjectless subordinate clause which will be non-finite (infinitive).

            The predicate taking care of subject-to-subject raising in generation is sraising. It has a clause which succeeds trivially (returns its first arg in its second arg) in those cases where the transformation is not applicable, i.e. where the predicate of the main clause is not of the appropriate class:

sraising([clause,[pred_arg_mod_structure,Prop1,
          [predicate(Notraising,Agr1)]|R1]|R2],
          [clause,[pred_arg_mod_structure,Prop1,
          [predicate(Notraising,Agr1)]|R1]|R2]):- not(sraise(Notraising,_)), !.

            The predicate sraise calls on the lexicon to check whether the verb is a subject-to-subject raising one; the appropriate verb class is the sraising one:

sraise(Sraisingverb,to):-
m_verb(sraising,_,Sraisingverb,_,_,_,_,_,_,_,_,_,_,_).

            Let us now turn to the case where subject-to-subject raising is applicable. The defining clause is as follows:

sraising([clause,
[pred_arg_mod_structure,Prop1,[predicate(Sraising,Agr1)],
    [[Subject1,[clause,[pred_arg_mod_structure,Prop2,Pred,
    [[Subject2|Rest]|Otherargs]]]]]]],

[clause,[pred_arg_mod_structure,Prop1,[predicate(Sraising,Agr2)],
    [[Subject2|Rest],
[Subject1,[clause,[pred_arg_mod_structure,Propnew2,Pred,
    [Otherargs]]]]]]]):-

sraise(Sraising,Requires),
allsubject(Subject1),
allsubject(Subject2),
 ( Agr1 = agr(Type,Tns,Number,Person), Agr2 = agr(Type,Tns,_,_);
   Agr2 = Agr1 ),
Prop2 = prop(Voice,Aspect,Mod,tns:present),
Propnew2 = prop(Voice,Aspect,Mod,tns:Requires).

            Apart from the manipulation of the syntactic relations performed in the head of the clause, the following points should be noted:

1) The agreement functor of the main predicate is changed. If it carries Number and Person values (i.e. it is the 4-place agr functor), we do not keep them: the corresponding slots of the new functor (Agr2) are left uninstantiated. If it does not carry them (the agr functor is not the 4-place one), we simply copy the old agr functor (Agr1) into the new one (Agr2). The reason why we cannot keep the Person and Number values is obvious: agreement will have to be made with the new subject, the promoted one.

2) To simplify generation we stipulate that Subject-to-Subject raising cannot apply to subordinate clauses that are not present tense. If tense is past, extraposition should be allowed to apply, not subject-to-subject raising. Compare:

(a)        it appears that he interviewed browne
(b)        he appears to be interviewing browne

In this way we do not need to introduce a perfect value in the aspect slot of Propnew2 when tense is non-present in Prop2 (to generate he appears to have interviewed browne as variant of (a)). Therefore in Prop2 (the Property field of the subordinate clause), we check that the tense value is present before overwriting it with the non-finite value returned by the sraise predicate.
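
            The two points above can be isolated from the tree manipulation. The sketch below uses hypothetical helper names (demo_new_agr, demo_new_prop); unlike the disjunction in the sraising clause, it uses a cut to commit to the 4-place case, which is a choice made for the example only.

```prolog
% demo_new_agr(Agr1, Agr2): a 4-place agr functor keeps its first two
% fields and gets fresh, uninstantiated Number and Person slots;
% any other agreement term is copied unchanged.
demo_new_agr(agr(Type, Tns, _, _), agr(Type, Tns, _, _)) :- !.
demo_new_agr(Agr, Agr).

% demo_new_prop(Prop2, Requires, Propnew2): succeeds only if the
% subordinate clause is present tense; the tense slot is then
% overwritten with the non-finite value Requires.
demo_new_prop(prop(Voice, Aspect, Mod, tns:present), Requires,
              prop(Voice, Aspect, Mod, tns:Requires)).
```

            With a past-tense property field, demo_new_prop fails, which in the sketch stands for leaving the clause to extraposition.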

4.2.1.4.3. Extraposition

            Extraposition is the end-placing of a subject clause and the filling of the vacated subject position with place-filler IT. It is the relation between the surface string It seems that John teaches linguistics and [John teach linguistics] seems, which can also be related to the surface string John seems to teach linguistics by raising.

4.2.1.4.3.1. Parsing

            In analysis the surface subject needs to be demoted. This is accomplished by the reog clause that we have looked at in the preceding section.

            In the lexicon extraposition verbs are assigned the extrapos class:

m_verb(extrapos,_,seem_1,seem,seem,seem,seems,seeming,
        seemed,seemed,seemed,intrans,it,
        [s(oblig,posprec(1,W),subject)]).
/* it seems that he has taught linguistics [11] */

            The arglist contains an S that is assigned the subject role. It is satisfied by the following defining clause of the predicate satisfy:

satisfy(P0,P1,Gap,Prefgen,posprec(Pos,4),
             Rel,Intrel,[Function,Tree],
          s(Type,posprec(Pos,4),Function),
          subject(SUBJ,Semsubj),
          subject(SUBJ,Semsubj)):-
         xsentence(P0,P1,Gap,Pref,Tree,finite,Person,Number,Voice),
         Prefgen is Pref + 4.

4.2.1.4.3.2. Generation

            In generation we need to put the place-filler IT back into the parse, so that it gets generated as surface subject of SEEM. This is accomplished by the predicate extrapos:

extrapos([clause,[pred_arg_mod_structure,Prop1,
[predicate(Extrapos,Agr1)],
    [[subject,[clause|Rest]]|Otherargs]]],
[clause,[pred_arg_mod_structure,Prop1,
[predicate(Extrapos,Agr1)],
    [[subject,[nounphrase,index($index$),agr(third,sing),
                                      ppro(third,sing,neuter)]],
     [object,[clause|Rest]]|Otherargs]]]):-
expos(Extrapos,Requires).

            The body of the clause simply checks that the verb belongs to the right class. The clausal subject is transferred to the object argument (it will be generated as if it were an object, which it is positionally), and a new subject is created with the appropriate agreement functor (agr(third,sing)) and the appropriate body (ppro(third,sing,neuter)).
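
            Reduced to the argument list alone, the transformation can be sketched as follows. The names demo_expos and demo_extrapos are hypothetical, the verb list is an assumption standing in for the lexicon lookup, and the index field of the new subject is omitted.

```prolog
% Verbs allowing extraposition (assumed stand-in for the lexicon check).
demo_expos(seem).
demo_expos(appear).

% demo_extrapos(Verb, Args0, Args): the clausal subject is moved to an
% object slot and place-filler IT is inserted as the new subject.
demo_extrapos(Verb,
              [[subject, [clause|Rest]]|Otherargs],
              [[subject, [nounphrase, agr(third, sing),
                                      ppro(third, sing, neuter)]],
               [object, [clause|Rest]]|Otherargs]) :-
    demo_expos(Verb).
```

            As in the full clause, all the work is done by unification in the head; the body only checks the verb class.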

4.2.2. Non-Declarative Main Clauses

4.2.2.1. Yes-No Questions

            The basic format of a YES-NO question is the following:

Operator - Sentence with incomplete verb group

            The operator is the first auxiliary of the verb group if the latter includes any auxiliary, or DO otherwise. The operator must be finite.

4.2.2.1.1. Parsing

            In horatio we check that the operator and the remainder of the verb group build up a licit verb group (each auxiliary has a Required value, which indicates the Type of the verb form that can be found to its right).

yesnoquestion(P1,P3,Prefs,[yes_no_question,
          [clause,
               [P,prop(vce:V,asp:A2,mod:Modality,tns:Tense) |Rest]]
    ]):-
     xaux(P1,P2,Modality,finite,Required,Number,Person,Tense),
     checkaux3(Asp,Required,V),
     xsentence(P2,P3,
              [],
              Prefs,
              [clause,
               [P,prop(vce:V,asp:A,mod:M,tns:T) |Rest]],
              Required,Person,Number,V),

     (nonvar(Modality); var(Modality),Modality = none ),

/* we have a value none in case there is no modal auxiliary */

     myappend(Asp,A,Aspect),

/* myappend appends the aspect value contributed by the auxiliary parsed by xaux to the aspect value returned by the parsing of the S (xsentence); the code for myappend is to be found on page 66 */

     (nonvar(Aspect); var(Aspect),A2 = none ),
     (Aspect = [] , A2 = none; Aspect \= [],A2 = Aspect ).

/* the returned value is none if Aspect is uninstantiated or the empty list */

            The value for Voice is computed by checkaux3, whose code is the following (further discussed on page 65):

checkaux3([perfect],en_active,_).

checkaux3([progressive],ing,_).

checkaux3([],Required,_):-
                                     Required \= en_passive,
                                     Required \= en_active,
                                     Required \= ing.

checkaux3([],en_passive,passive).

            The first argument is the contribution the auxiliary makes to the Aspect list; the second argument is the Required feature; the third argument is Voice, which is to be left uninstantiated if voice is active.

            Looking at auxiliaries in the lexicon, we find both a Required (fifth argument) and a Type (fourth argument) value:

aux([does|X],X,_,finite,infinitive,sing,thirdsg,present).
aux([been|X],X,_,en_active,ing,_,_,_).

            Does is finite and requires an infinitive; been can be the active past participle and requires an ing-form:

Does he know you ?
He has been reading a book.
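
            The way Type and Required values chain through a verb group can be sketched with a reduced lexicon. In the sketch, demo_aux keeps only the form, Type and Required fields; the entries for does and been follow the ones shown above, while the entry for has and the demo_verb facts are assumptions added for the example.

```prolog
% demo_aux(Form, Type, Required).
demo_aux(does, finite,    infinitive).
demo_aux(has,  finite,    en_active).   % assumed entry
demo_aux(been, en_active, ing).

% demo_verb(Form, Type): main verb forms (assumed entries).
demo_verb(know,    infinitive).
demo_verb(reading, ing).

% demo_group(Forms, Type): Forms is a licit verb group whose first
% element is of type Type; each auxiliary licenses, through its
% Required value, the type of the form found to its right.
demo_group([Verb], Type) :-
    demo_verb(Verb, Type).
demo_group([Aux|Rest], Type) :-
    demo_aux(Aux, Type, Required),
    demo_group(Rest, Required).
```

            The goals demo_group([does,know],finite) and demo_group([has,been,reading],finite) succeed, whereas demo_group([does,reading],finite) fails, because does requires an infinitive to its right.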

            The treatment of YES-NO questions needs to be different when the question opens with a non-auxiliary, i.e. copula BE. We do not have a verb group in the remaining S, and we need to look at the whole YES-NO question as having the following structure:

Copula - Subject - Copula's arglist

Is         the teacher            a genius ?
Are      they                 aware of the problems ?

We therefore have the following piece of code in horatio:

yesnoquestion(P1,P4,Prefint,[yes_no_question,
          [clause,
               [pred_arg_mod_structure,
                prop(vce:active,asp:none,mod:none,tns:Tense),
          VERB,[SUBJ,Parse]]]
    ]):-
    verb(P1,P2,cop,VERB,finite,Tense,
            Number,Personvp,Semsubj,Args) ,

/* cop is verb class of non-auxiliary BE (copula verb ) */

    xnounphrase(P2,P3,[],index(I),Prefnp,Weight,Rel1,
                 Intrel1,subject,SUBJ,Number,Personnp,Semsubjnp),
    agree(Personnp,Number,Personvp),
    arglist(P3,P4,[],Status,Prefarg,Preclist,Rel,
               Intrel,Voice,[Parse],vp,
               Args,Func,subject(SUBJ,Semsubj),
            nsubject(NSUBJ,NSemsubj), cop),
     accu(Prefint,[Prefnp,Prefarg,4]).

            Note that we must perform the agreement check between the copula and the subject np (call to the agree predicate). The arglist of the copula is parsed by a call to the arglist predicate. As far as the verbal group properties are concerned, we know that we have no aspectual or modality values (there would be an auxiliary in that case), and that voice is active.
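
            The Copula - Subject - Copula's arglist pattern, with its agreement check, can be sketched on a toy scale. The sketch is written in DCG notation rather than with the explicit position arguments (P1, P2, ...) used in horatio; all demo_ names and the miniature lexicon are assumptions, and agreement is reduced to number.

```prolog
% Miniature lexicon (assumed): copula forms and subjects carry a number.
demo_copula(is,  sing) --> [is].
demo_copula(are, plur) --> [are].

demo_subject([the,teacher], sing) --> [the, teacher].
demo_subject([they],        plur) --> [they].

demo_complement([a,genius])              --> [a, genius].
demo_complement([aware,of,the,problems]) --> [aware, of, the, problems].

% Copula, then subject agreeing in number, then the copula's arglist.
demo_copula_question([yes_no_question, Cop, Subj, Comp]) -->
    demo_copula(Cop, Number),
    demo_subject(Subj, Number),
    demo_complement(Comp).
```

            The shared Number variable plays the part of the call to agree: phrase(demo_copula_question(T),[is,the,teacher,a,genius]) succeeds, while the same string opening with are is rejected.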

4.2.2.1.2. Generation

            When we have copula BE as main verb, we need to generate the copula, generate the subject, and then generate the remainder, i.e. the copula's arglist and modifiers. The generator therefore looks into the parse tree to make sure that the predicate is BE. It also needs to isolate the first element in the arglist, and make sure that it is the subject. Note that we know that voice is active, and that there is no auxiliary for either aspect or modality:

gen([yes_no_question,[clause,[pred_arg_mod_structure,
     prop(vce:active,asp:none,mod:none,tns:Tns),
     [predicate(be,Agr)],
     [[Subject,[nounphrase,I,agr(Person,Number)|Restsubj]]|Otherargs]]]],
      Sentence):-

subject(Subject),
gen([Subject,[nounphrase,I,agr(Person,Number)|Restsubj]],Gensubject),
gen([predicate(be,Agr)],Vform,Part),
append([Part],Otherargs,Args),
insort(Args,Sortedargs),
gen(Sortedargs,Genargs),
append([Vform],[Gensubject],List1),
append(List1,[Genargs],Sentence).

            In all other cases we do have one or more auxiliaries. We need to isolate the first auxiliary, because it must appear in front, before the subject. The generation of auxiliaries is carried out by the genyesno predicate, whose code is examined in the section on the verb phrase (see p.65 and following). It returns a list of auxiliaries, and we append the first in front of the clause, insert the subject, and then append the remaining auxiliaries (there may be none), the verb, and finally the remaining args and modifiers:

gen([yes_no_question,[clause,[pred_arg_mod_structure,
     Properties,[predicate(Predicate,Agr)],
     [[Subject,[nounphrase,I,agr(Person,Number)|Restsubj]]|Otherargs]]]],
      Sentence):-

subject(Subject),
gen([Subject,[nounphrase,I,agr(Person,Number)|Restsubj]],Gensubject),
genyesno(Properties,Agr,Person,Number,[First|Others]),
gen([predicate(Predicate,Agr)],Vform,Part),
append([Part],Otherargs,Args),
insort(Args,Sortedargs),
gen(Sortedargs,Genargs),
append([First],[Gensubject],List1),
append(Others,[Vform],List2),
append(List2,[Genargs],List3),
append(List1,List3,Sentence).

4.2.2.2. WH Questions

            They are made up of a wh-group and a clause with a gap which can be filled by that wh-group.

4.2.2.2.1. Parsing

            We need to distinguish two cases:

a) the wh-group is the subject

            We parse the sentence as a wh-group followed by a sentence which lacks a subject. The link between the two parts is provided by coindexing.

/* NP wh in subject position; the S does or does not feature an auxiliary:

who knows the teacher ? 
who has seen the teacher ?
who is the teacher ? */

whquestion(P0,P2,Prefint,[wh_question,NP,S]):-
 nounphrase(P0,P1,[],index(I),Prefnp,Weight,
                Semrelint,int,Functioninint,
                NP,sing,third,Semwholenp),

/* this np must have int as 8th arg, i.e. it must be or contain an interrogative element */

     nonvar(Semrelint),
     xsentence(P1,P2,
                 [gap(_,npgap,Npsem,_,Functioninint,_,index(I),_)],
                 Prefs,S,finite,thirdsg,sing,Voice),

/* only singular subject is catered for here: * who teach linguistics ? */

     nonvar(Functioninint),
/* the gap is a real one */

     Functioninint = subject,
/* the function of the missing element is subject */

     NP \=  [_,[_,_,_,[interrogative(whom)]]],
/* we can't have whom as subject */

     sfok(Npsem,Semwholenp),
     accu(Prefint,[Prefnp,Prefs]).

b) the wh-group is not the subject

Examples:

  A book about whom has he read ?
  Whom has the teacher seen ?

            Here we need to be able to look into the parse trees, to account for the auxiliary which is detached from the remainder of the verb phrase, being separated from it by the subject. The relevant piece of code is the following:

whquestion(P0,P3,Prefint,[wh_question,NP,
          [clause,
          [P,prop(vce:V,asp:A2,mod:Modality,tns:Tense) |Rest]]
    ]):-

     nounphrase(P0,P1,[],index(I),Prefnp,Weight,
                Semrelint,int,Functioninint,
                NP,Number,third,Semwholenp),
     nonvar(Semrelint),
     xaux(P1,P2,Modality,finite,Required,Numbers,Person,Tense),

/* the auxiliary contributes the modality and tense features */

     checkaux3(Asp,Required,V),
     xsentence(P2,P3,
                 [gap(_,npgap,Npsem,_,Functioninint,_,index(I),_)],
              Prefs,
                [clause,
                 [P,prop(vce:V,asp:A,mod:M,tns:T) |Rest]],
                 Required,Person,Numbers,V),
     (nonvar(Modality); var(Modality),Modality = none ),
     myappend(Asp,A,Aspect),
     (nonvar(Aspect); var(Aspect),A2 = none ),
     (Aspect = [] , A2 = none; Aspect \= [],A2 = Aspect ),
     nonvar(Functioninint),
     Functioninint \= subject,
     sfok(Npsem,Semwholenp),
     accu(Prefint,[Prefnp,Prefs]).

            Note the sharing of the function Functioninint between the opening wh-group and the gap in the S. Note also the semantic check between wh-group and gap (call to sfok).

            PP gaps receive a similar treatment.

4.2.2.2.2. Generation

            In the case of a subject WH-group, we generate the WH-group and then generate the remainder as a normal clause (recall that it is a gapped clause, since it lacks a subject):

generate([wh_question,Whgroup,Clause],Sentence):-
Whgroup = [subject|Rest],
gen(Whgroup,Whgroupgen),
generate(Clause,Ynogen),
append([Whgroupgen],[Ynogen], Sentence).

            Note the distinction between the two generation predicates. Gen generates directly, whereas generate first makes sure that control and other function-changing processes (raising, passivization, etc.) have been undone (see the section on the cycle, p. 108) by calling prepgen:

generate(Tree,Sentence):-
    prepgen(Tree,Treeprep),
    gen(Treeprep,List),
    flatten(List,Sentence).

            When the WH-group is not the subject, we generate the WH-group and then the remainder as a YES-NO question. Generating the clausal part as a YES-NO question ensures that the clause begins with an auxiliary, and that DO is inserted if no auxiliary is present.

generate([wh_question,Whgroup,Clause],Sentence):-
Whgroup \= [subject|Rest],
gen(Whgroup,Whgroupgen),
generate([yes_no_question,Clause],Ynogen),
append([Whgroupgen],[Ynogen], Sentence).

4.3. The Phrase Level

4.3.1. Noun Phrases

4.3.1.1. Parsing

            In the parsing of noun phrases we have to pay attention to the problem of structural weight, as this is a determining element in the position of nps with respect to the verb, each other and other phrasal elements such as pps.

            The lightest nps are of course np gaps. They are assigned a weight of zero. Next in or­der of increasing weight are personal, relative and interrogative pronouns, which are as­signed a weight of 1, which ensures that they stay close to the verb. Core nps get a weight of 2, provided the noun or adjective they contain does not have an instantiated arglist. In the lat­ter case they get a weight of 3, like other heavy nps, such as those that include a relative clause.

            From the preceding paragraph it is clear that the process which computes weight must be sensitive to whether the arglist of a constituent argbearer has been satisfied. This is possi­ble because the predicate satisfylist returns in its fourth argument a Status, i.e. an indi­cation whether any argument has been satisfied or a modifier found (Status is set to 1; other­wise it is left uninstantiated).

            In the parsing of core nps we include an ifthenelse: if the Status variable returned by satisfylist is uninstantiated, we assign 2 as weight, otherwise we assign 3. Here is the relevant piece of code (the case envisaged here is that of an np which does not include adjective pre- or postmodification and satisfylist therefore applies to the head noun's arglist):

corenounphrase(P0,P3,Gap,index(I),Pref,Weight,Rel,Intrel,
            Function,
                      [Function,[nounphrase,index(I),
                             agr(third,Number),DET,N,Parse]],
                    Number,third,Sem):-
     determiner(P0,P1,DET,Number,Rel1,Intrel1) ,
     xnoun(P1,P2,N,Number,Sem,Arglist),
     satisfylist(P2,P3,Gap,Status,Pref,Prec,Rel2,Intrel2,_,
                 Parse,np,Arglist,Func,
                 subject([subject,[nounphrase,index(I),
                          agr(third,Number),DET,N]],
               Verbsem)),
     sfok(Verbsem,Sem),
     ( var(Rel1),var(Intrel1),Rel=Rel2,Intrel=Intrel2;
      nonvar(Rel1),nonvar(Intrel1),Rel=Rel1,Intrel=Intrel1 ),
     ifthenelse(var(Status),Weight=2,Weight=3).

            The ifthenelse(Condition, Yescase, Nocase) predicate is an Arity Prolog extension to standard Edinburgh Prolog; if missing in a Prolog implementation, it can readily be simulated by the following disjunction:

( var(Status), Weight = 2; nonvar(Status), Weight = 3 )
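            In Prologs that provide the standard if-then-else construct ( Condition -> Then ; Else ), the predicate can also be defined once and for all. The following is a minimal sketch, on the assumption that ifthenelse commits to the first solution of its condition; np_weight is a hypothetical helper name introduced only to illustrate the weight assignment.

```prolog
% Portable stand-in for ifthenelse/3, built on the standard
% if-then-else construct. Assumption: the condition is tried once.
ifthenelse(Condition, Yescase, Nocase) :-
    (   call(Condition)
    ->  call(Yescase)
    ;   call(Nocase)
    ).

% np_weight/2 is a hypothetical helper, not part of the grammar:
% weight 2 if Status is still unbound, weight 3 otherwise.
np_weight(Status, Weight) :-
    ifthenelse(var(Status), Weight = 2, Weight = 3).
```

            With this definition np_weight(_, W) binds W to 2, whereas np_weight(1, W) binds it to 3.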

            The weight value is passed to the posprec functor and exploited by the precede relation to determine the relative order of arguments and modifiers. See the relevant section, p. 91.

            Coordinated nps present two problems:

1) computing the weight of the resulting coordination: heuristically it has been decided to add the weights of the constituent nps, divide the result by 2 and add 1.

2) determining the person of the resulting np: this question is discussed on p. 35.
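
            The weight heuristic given under 1) can be sketched as follows. The predicate name coord_weight is hypothetical, and we assume a Prolog in which / applied to two integers may yield a non-integer result.

```prolog
% coord_weight(+W1, +W2, -W): hypothetical helper.
% W is the sum of the two conjunct weights, divided by 2, plus 1.
coord_weight(W1, W2, W) :-
    W is (W1 + W2) / 2 + 1.
```

            Coordinating a core np (weight 2) with a heavy np (weight 3) thus gives a weight above that of either conjunct, which is the intended effect: a coordination is never lighter than its heaviest member.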

            Np gaps are discussed on p. 99.

            Most of the information necessary for the parsing of pronouns is simply read off the lexicon. Of course, the function played by the pronoun cannot be determined ahead of parsing, and it is not read off the lexicon, although unification provides a check on the possible function value stored in the lexicon and the one assigned by the parsing process.

            Let us take personal pronouns first. The features copied from the lexicon are those returned by the call to the lexical predicate pp (Tree, Person, Number, Gender, Function and Sem):

nounphrase(P0,P1,[],index(I),0,1,Rel,Intrel,
              Function,
              [Function,
      [nounphrase,index(I),agr(Person,Number),Tree]],
               Number,Person,Sem):-
    pp(P0,P1,Tree,Person,Number,Gender,Function,Sem).

            Note that Gender would also have to be percolated to the noun phrase level if we wanted to deal with reflexive pronouns. In the parsing of

She behaved herself.

we would have to check that gender is the same in the subject and the reflexive object.

            Relative and interrogative pronouns have two arguments that we need to go into a little, namely the seventh and the eighth.

            Relative pronouns use the 7th position to store the semantic value of the pronoun. If no relative is present in a noun phrase, the 7th position of the nounphrase predicate will be left uninstantiated. We cannot therefore use an uninstantiated variable to mean that the relative does not set any semantic restriction, and we use the value norestriction instead. The predicate performing the semantic check (sfok) knows how to deal with that value (see p. 87).

            The code for noun phrases consisting of a relative pronoun is the following:

nounphrase(P0,P1,[],index(I),
                 0,1,Semnp,rel(I),Function,
                 [Function,
            [nounphrase,index(I),agr(third,Number),Tree]],
                 Number,third,[Semnp]):-
relative(P0,P1,Tree,Semnp,Sempp,np).

            Note that the Semnp feature is inserted in a list when passed to the noun phrase level, because at that level np semantics must be a list, as noun semantics is a list.

            The relative predicate is a lexical one; the last three arguments are as follows:

Semnp: semantic features associated with the relative (which has np value); the string norestriction is used to indicate that no semantic feature is assigned, e.g.

relative([who|X],X,[relative(who)],human,_,np).

vs

relative([that|X],X,[relative(that)],norestriction,_,np).

Sempp: semantic features associated with the relative (which has pp value), e.g.

relative([why|X],X,[relative(why)],norestriction,[reason],pp).

The last argument stipulates whether the relative pronoun stands for an np or an adverbial (np and pp values).
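
            By way of illustration, the three lexical clauses just quoted can be queried as follows. The helper np_relative is hypothetical; it simply selects the relatives that stand for an np and returns their semantic value.

```prolog
% Lexical clauses quoted in the text.
relative([who|X],X,[relative(who)],human,_,np).
relative([that|X],X,[relative(that)],norestriction,_,np).
relative([why|X],X,[relative(why)],norestriction,[reason],pp).

% np_relative(+Words, -Rest, -Semnp): hypothetical helper.
% Succeeds if Words begins with a relative that has np value.
np_relative(Words, Rest, Semnp) :-
    relative(Words, Rest, _Tree, Semnp, _Sempp, np).
```

            The goal np_relative([who,likes,mary],Rest,Sem) succeeds with Sem bound to human, whereas np_relative([why,he,left],Rest,Sem) fails, since why has pp value.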

            The eighth position is used to keep relative and interrogative pronouns apart. In the case of interrogatives, it is set to the string value int. In the case of relatives it is used to enable coindexing of the relative pronoun when the latter does not constitute the whole np but is embedded somewhere in it, as in a book about whom. In such a case we need an index for the whole np and one for relative whom. The latter is provided by the rel(I) functor, where I is meant to carry an index.

            Here is the code for interrogative pronouns:

nounphrase(P0,P1,[],index(I),
                 0,1,Semnp,int,Function,
                 [Function,
      [nounphrase,index(I),agr(third,Number),Tree]],
                 Number,third,[Semnp]):-
 interrogative(P0,P1,Tree,Semnp,Sempp,np).

            The parsing of nps containing a relative clause must ensure that at the higher level the seventh and eighth positions are left uninstantiated, and that the relative clause (in particular, the gap within it) and the antecedent share what they must share. Note that the function is NOT shared between the antecedent and the gap in the relative clause: we can have an antecedent subject coindexed with an object gap in the relative clause, as in The teacher (subject) you like (object gap) teaches mathematics. The code is the following:

/* Weight is 3, i.e. the resulting np is heavy */
  nounphrase(P0,P2,[],index(I),
           Preftop,3,_,_,Function,
           [Function,Newrest],
           Number,third,Sem):-
 corenounphrase(P0,P1,[],index(I),
                  Pref,Weight,Rel,Intrel,Function,
                  [Function,Rest],
                  Number,third,NounSem),
 ( var(Intrel); Intrel \= int ),      /* not an interrogative */

 var(Rel),
/* the antecedent itself must feature a free Rel variable */

 xrelclause(P1,P2,Prefrel,RELCL,
               [gap(_,_,Npsem,_,_,_,index(I),_)],
               Personinrel,Funcinrel,NounSem,Number),
 sfok(Npsem,NounSem),
 append(Rest,[RELCL],Newrest),
 accu(Preftop,[Pref,Prefrel]).

4.3.1.2. Generation

            Generating nps is considerably simpler than parsing them. We consider three cases. The first is that of np gaps and of nps that have been ghosted (deprived of lexical material) by the generator in its treatment of control relations (see p. 41). Such nps are represented by a parse tree where the variable named NP in the following piece of code is an uninstantiated variable. The generator produces an empty list, which will disappear in the list appending and flattening operations at the end of the generation process.

gen([Function,[nounphrase,index(I),agr(P,N)|NP]],[]):-  var(NP),!.
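
            How the empty list disappears can be seen from a minimal flattening predicate. The sketch below is only an assumption about what flatten does; the names flatten_sketch and app are hypothetical.

```prolog
% app/3: local copy of list concatenation, so that the sketch does
% not depend on any list library.
app([], L, L).
app([H|T], L, [H|R]) :- app(T, L, R).

% flatten_sketch(+Nested, -Flat): hypothetical stand-in for flatten.
% Empty lists contribute nothing; any non-list item becomes a word.
flatten_sketch([], []) :- !.
flatten_sketch([H|T], Flat) :- !,
    flatten_sketch(H, FlatH),
    flatten_sketch(T, FlatT),
    app(FlatH, FlatT, Flat).
flatten_sketch(Word, [Word]).
```

            If a ghosted subject has generated [] and the rest of the clause [likes,[the,girl]], the goal flatten_sketch([[],[likes,[the,girl]]],S) binds S to [likes,the,girl].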

            The second case is that of personal pronouns. They are retrieved from the lexicon on the basis of their person, number, gender and function. Note that these values appear both inside and outside the ppro functor. We give below the rule in the generator and the lexical clause for object her:

gen([Function,[nounphrase,index(I),
      agr(Person,Number),ppro(Person,Number,Gender)]],PPform):-
    pp([PPform|_],_,ppro(Person,Number,Gender),
       Person,Number,Gender,Function,_).

pp([her|X],X,ppro(third,sing,fem),third,sing,fem,object,[human]).
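
            The following self-contained fragment shows the rule at work. The clause for subject she is an assumption added for the sake of the contrast; only the clause for object her is quoted from the lexicon.

```prolog
% Lexical clause quoted in the text.
pp([her|X],X,ppro(third,sing,fem),third,sing,fem,object,[human]).
% Assumed companion clause for the subject form (illustration only).
pp([she|X],X,ppro(third,sing,fem),third,sing,fem,subject,[human]).

% Generator rule for personal pronouns, as given above.
gen([Function,[nounphrase,index(_I),
      agr(Person,Number),ppro(Person,Number,Gender)]],PPform):-
    pp([PPform|_],_,ppro(Person,Number,Gender),
       Person,Number,Gender,Function,_).
```

            The same ppro tree yields her when Function is object and she when it is subject, e.g. gen([object,[nounphrase,index(1),agr(third,sing),ppro(third,sing,fem)]],F) binds F to her.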

            The third case is the default case. We skip the information pieces that precede the lexical material in the parse tree, and generate from that material:

gen([Function,[nounphrase,index(I),agr(P,N)|NP]],Gennp):-
    NP \= ppro(_,_,_),
    gen(NP,Gennp).

            The call on gen will unify with its defining clause for generating the various elements of a list:

gen([H|T],[Hgen|Tgen]):-
     gen(H,Hgen),
     genlist(T,Tgen).

genlist([],[]).

genlist([H|T],[Hgen|Tgen]):-
        gen(H,Hgen),
        genlist(T,Tgen).

            Gen will then generate the determiner (if any), adjective (if any), the noun and its arguments and modifiers (if any). To do so, it will call on the gen clauses for leaves, such as the following for the determiner:

gen([det(Det),index(I)],Det).
gen([det(zero)],[]).
gen([det(Det)],Det):- Det \= zero.

4.3.2. Adjective Phrases

            An adjective modifying a noun generally precedes that noun; however, if the adjective has an arglist that is partly or totally satisfied, the adjective follows the noun. Contrast:

Each participant was asked an easy question.
* Each participant was asked a question easy.
A question easy for you to answer is not necessarily an easy question.
* An easy for you to answer question
[12]

            In HORATIO we use the Status variable returned by satisfylist to check whether the arglist of the adjective has been partly or totally satisfied. If the adjective precedes the noun, the adjective's arglist must be unsatisfied (the noun's arglist can of course be satisfied: an easy book on linguistics). If the adjective follows, the noun's arglist cannot be satisfied, and the adjective's must be (* a book easy for you to read on linguistics; * a book easy on linguistics).

            Here is the code for the case where the adjective precedes the noun it modifies:

corenounphrase(P0,P4,Gap,index(I),Pref,Weight,Rel,Intrel,Function,
                            [Function,
                  [nounphrase,index(I),agr(third,Number),
                                                        DET,ADJ,N,Parse]],
                      Number,third,Sem):-
     determiner(P0,P1,DET,Number,Rel1,Intrel1) ,
     xadjphrase(P1,P2,[],subject(SUBJ,Semsubjadj),
                Prefadj,_,modifier,Statadj,
                ADJ,Semadj),
     var(Statadj),
     xnoun(P2,P3,N,Number,Sem,Arglist),
     sfok(Semadj,Sem),
     satisfylist(P3,P4,Gap,Status,Pref,Prec,Rel2,Intrel2,_,
                 Parse,np,Arglist,Func,
                 subject([subject,[nounphrase,index(I),agr(third,Number),
                          DET,ADJ,N]],
                 Vpsem)),
      sfok(Vpsem,Sem),
      ( var(Rel1),var(Intrel1),Rel=Rel2,Intrel=Intrel2;
      nonvar(Rel1),nonvar(Intrel1),Rel=Rel1,Intrel=Intrel1 ),
     ifthenelse(var(Status),Weight=2,Weight=3).

            Phrases such as an easy man to please cannot be parsed by the mechanisms explained above: the adjective's arglist is separated from the adjective by the noun. In order to parse them, we need to isolate the adjective classes for which this construction is possible. In HORATIO such adjectives are recognized by looking at their arglist, which is [to_vp(oblig,object)], i.e. the noun modified by the adjective is the object of the infinitive in the to-phrase. The code is the following:

corenounphrase(P0,P4,[],index(I),Pref,3,Rel,Intrel,Function,
                      [Function,[nounphrase,index(I),agr(third,Number),
                          DET,ADJ,N,Parse]],
                      Number,third,Sem):-
     determiner(P0,P1,DET,Number,Rel1,Intrel1) ,
     adj(P1,P2,ADJ,_,[to_vp(oblig,object)]),
     xnoun(P2,P3,N,Number,Sem,Arglist),
     satisfy(P3,P4,[],Pref,Prec,Rel2,Intrel2,
              Parse,to_vp(oblig,object),
              subject([_,[nounphrase,index(I),agr(third,Number),
                      DET,ADJ,N]],Semsubj),
              subject([_,[nounphrase,index(I),agr(third,Number),
                      DET,ADJ,N]],Semsubj)),
     ( var(Rel1),var(Intrel1),Rel=Rel2,Intrel=Intrel2;
      nonvar(Rel1),nonvar(Intrel1),Rel=Rel1,Intrel=Intrel1 ).

            Note that the subject functor passed to the satisfy predicate will not be used for subject control of the to-phrase; in the absence of a FOR NP the subject of the infinitive is not syntactically retrievable. The structure passed in the subject functor is used to coindex the gap that the to-phrase needs to display, namely an object gap (object is not to be taken strictly here; it simply means an oblique function). The defining clause for the satisfy predicate as used here is the following:

satisfy([to|P0],P1,[],Prefgen,Posprec,Rel,Intrel,
                 [adj_arg,Tree],
                 to_vp(Type,object),
                 subject(SUBJ,Semsubj),
                 subject(SUBJ,Semsubj)):-

  SUBJ=[X,[Y,index(I)|Remainder]],
/* the "subject" of the adj (index(I)) is an object gap in the vp */

  xverbphrase(P0,P1,
               subject(NP,Sem), /* no subject control */
     [gap(Object,npgap,Semsubj,_,Func,_,index(I),_)],
     Pref,Tree,Rel,
     Intrel,toinfinitive,Tense,
     aspect(Aspect),Modality,
     Number,Person,Voice,nsubject(NNP,Nsem)),
     nonvar(Func),
     ( Func=object; Func=indirect_object; Func=np_arg_of_prep ),

/* these are the possible functions for the gap:
      hard to see; hard to give a book to; hard to borrow from */

     Prefgen is Pref + 3.

            The code for the parsing of an adjective phrase does not require much explanation:

adjphrase(P0,P2,Gap,subject(SUBJ,Sem),Pref,3,
            Function,Status,
                 [Function,[adjectivephrase,Adj,Parse]],Semadj):-
  adj(P0,P1,Adj,Semadj,Arglist),
  satisfylist(P1,P2,Gap,Status,Pref,Prec,Rel2,_,_,
                 Parse,_,Arglist,Func,
                 subject(SUBJ,Semadj)).

            Note only that we need to have the subject functor for cases such as reluctant to go, where we need the subject (either the modified noun or the subject in predicative constructions) to process the TO phrase.

4.3.3. Prepositional Phrases

            Prepositional phrases can appear as arguments of nouns. Nouns may then determine either the semantics of the whole prepositional phrase, or the exact preposition to be used. The noun HYPOTHESIS is of the first type; it specifies that its optional argument pp must be a topic. We have the following clause in the lexicon:

m_noun(hypothesis_1,hypothesis,[abstract],[pp(opt,n:topic),s(opt)]).

            The noun trouble is of the second type. It specifies that its optional argument pp must be headed by the preposition with:

m_noun(trouble_1,trouble,[abstract],[pp(opt,prep:with)]).
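
            The two kinds of specification can be told apart by inspecting the arglist. The following is a minimal sketch; pp_spec, pp_spec_kind and memb are hypothetical helper names, and only the two m_noun clauses come from the lexicon.

```prolog
% Lexical clauses quoted in the text.
m_noun(hypothesis_1,hypothesis,[abstract],[pp(opt,n:topic),s(opt)]).
m_noun(trouble_1,trouble,[abstract],[pp(opt,prep:with)]).

% memb/2: local list membership, to avoid relying on a list library.
memb(X, [X|_]).
memb(X, [_|T]) :- memb(X, T).

% pp_spec(+Noun, -Kind, -Value): hypothetical helper.
% Kind is semantics if the noun constrains the meaning of its pp,
% preposition if it imposes a particular preposition.
pp_spec(Noun, Kind, Value) :-
    m_noun(_, Noun, _, Arglist),
    memb(pp(_, Spec), Arglist),
    pp_spec_kind(Spec, Kind, Value).

pp_spec_kind(n:Sem, semantics, Sem).
pp_spec_kind(prep:Prep, preposition, Prep).
```

            Thus pp_spec(hypothesis,K,V) binds K to semantics and V to topic, whereas pp_spec(trouble,K,V) binds K to preposition and V to with.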

            Accordingly, prepositional phrases in the grammar must have these two specifications at the highest level. A number of other pieces of information are also needed, among which two indices, one for the whole pp and one for the np inside. We need to be able to coindex the np inside the pp with an antecedent:

The man about whom I have read a book is a teacher.

But we also need an index for the whole pp, to be referred to in the gapped argument of book.

We use the following code to parse "normal" (i.e. neither gapped nor coordinated) pps:

prepphrase(P0,P2,Gap,index(J),
            npindex(I),Prefnp,3,PREP,Rel,Intrel,Function,
                  [Function,[prepphrase,index(J),prep(PREP),NP]],
                  Sempp,Semnp):-
     prep(P0,P1,[prep(PREP)],Sempp) ,
     nounphrase(P1,P2,Gap,index(I),Prefnp,Precnp,Rel,
                Intrel,np_arg_of_prep,NP,Number,Person,Semnp).

            We note that the semantics of the embedded np is also percolated to pp level. It is needed in the parsing of by-phrases, where it must be checked that the np within the pp has the appropriate semantics for the subject role. The check is performed in the following piece of code:

satisfy(P0,P1,Gap,Pref,posprec(_,3),Rel,Intrel,
                [subject,[nounphrase,index(I)|X]],
                byphrase(opt,posprec(_,3),subject,Semsubj),
                subject(SUBJ,Sem),subject(SUBJ,Sem)):-
   xprepphrase(P0,P1,Gap,index(J),npindex(I),
                 Prefpp,Precpp,by,Rel,Intrel,
                    subject,
           [subject,[prepphrase,index(J),prep(by),
                    [np_arg_of_prep,[nounphrase,index(I)|X]]]],
                PPseminpp,Semsubjinpp),
       sfok(Semsubj,Semsubjinpp),           
         Pref is Prefpp + 4.

            Gapped pps are discussed on page 99 and following.

            We distinguish two cases of coordinated pps. In the first one two full pps are coordinated, as in with the teacher and for the teacher. In the second, the first pp is reduced to a preposition, the np being gapped, as in with and for the teacher. An alternative solution would be to parse a coordinated preposition, rather than a whole pp, in the second case.

            The first case is dealt with by the following code:

xprepphrase(P0,P2,[],index(I),npindex(I2),
                  Pref,Weight,PREP,Rel,Intrel,Function,
                  [Function,[and_prepphrase,PP1,PP2]],
                  Sempp,Semnp):-
  inlist(and,P0),
  prepphrase(P0,[and|P1],[],index(I),
            npindex(I2),Pref1,Weight1,PREP,
             Rel,Intrel,Function,
             PP1,Sempp,Semnp),
  xprepphrase(P1,P2,[],index(J),
                    npindex(J2),Pref2,Weight2,Prep2,
             Rel,Intrel,Function,
             PP2,Sempp2,Semnp2),
  accu(Pref,[Pref1,Pref2]),
  accu(Weightaccu,[Weight1,Weight2]),
  Weight is (Weightaccu/2) + 1.

            Note that the two pps must share the Rel and Function values: if one contains a relative, so must the other: about whom and for whom; * about whom and for the teacher.

            The second case is taken care of by the following code. Note that we must look into the returned parse trees to restore the missing np to the first conjunct:

xprepphrase(P0,P2,[],index(I),npindex(I2),
                        Pref,Weight,PREP,
            Rel,Intrel,Function,
                  [Function,[and_prepphrase,
                            [prepphrase,index(J),prep(PREP),NP],
                            [prepphrase,index(I),prep(Prep2),NP]
                  ]],
                  Sempp,Semnp):-
  inlist(and,P0),
  prep(P0,[and|P1],[prep(PREP)],Sempp),
  prepphrase(P1,P2,[],index(I),npindex(I2),
             Pref,Weight,Prep2,
             Rel,Intrel,Function,
             [Function,[prepphrase,index(I),prep(Prep2),NP]],
             Sempp2,Semnp).

4.3.4. Verb Phrases

4.3.4.1. Parsing

4.3.4.1.1. The Auxiliary Group

            We first discuss the treatment of the auxiliary group. We need to take into account the well-known precedence relations within the group. To do so, we assign two features to each auxiliary, one being its type (finite, ing, en-passive, etc.), the other the type that it requires the next auxiliary (or main verb) to be. In the lexicon, we therefore have such clauses as:

aux([may|X],X,may,finite,infinitive,_,_,present).
aux([is|X],X,_,finite,ing,sing,thirdsg,present).
aux([is|X],X,_,finite,en_passive,sing,thirdsg,present).
aux([have|X],X,_,infinitive,en_active,_,_,_).

            The fourth argument of the lexical predicate aux is the Type of the auxiliary, the fifth one is the Required feature, i.e. the type of the next verb form to its right. The third clause in the sample applies to BE as passive auxiliary, the fourth to HAVE as perfect auxiliary.
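
            The way the Type and Required features chain together can be shown in isolation. In the sketch below the first four aux clauses are those quoted above; the clauses for been and being are assumptions added so that a longer chain can be checked, and aux_chain and aux_chain_rest are hypothetical predicates, not part of the grammar.

```prolog
% Lexical clauses quoted in the text.
aux([may|X],X,may,finite,infinitive,_,_,present).
aux([is|X],X,_,finite,ing,sing,thirdsg,present).
aux([is|X],X,_,finite,en_passive,sing,thirdsg,present).
aux([have|X],X,_,infinitive,en_active,_,_,_).
% Assumed clauses (illustration only).
aux([been|X],X,_,en_active,ing,_,_,_).
aux([been|X],X,_,en_active,en_passive,_,_,_).
aux([being|X],X,_,ing,en_passive,_,_,_).

% aux_chain(+Words, +Type, -Required, -Rest): consume auxiliaries
% from Words; the first must be of type Type, and each one must be
% of the type required by its left neighbour. Required is the type
% the last auxiliary imposes on the main verb found in Rest.
aux_chain(Words, Type, Required, Rest) :-
    aux(Words, Words1, _, Type, Required1, _, _, _),
    aux_chain_rest(Words1, Required1, Required, Rest).

% Either the chain goes on, or it stops here.
aux_chain_rest(Words, Type, Required, Rest) :-
    aux_chain(Words, Type, Required, Rest).
aux_chain_rest(Words, Required, Required, Words).
```

            The first solution of aux_chain([may,have,been,being,interviewed],finite,Req,Rest) consumes the four auxiliaries and binds Req to en_passive and Rest to [interviewed].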

            We also need to specify what each auxiliary contributes to the property field of the clause. Such a property field has the following form:

prop(vce:Voice, asp:Aspect, mod:Modality,tns:Tense),

where vce, asp, mod and tns are markers, and the capitalized items are variables. Voice ranges over two values: passive and uninstantiated (represented by an uninstantiated variable). Tense ranges over past and present, and is set by the first element of the verb group, be it an auxiliary or the main verb. Aspect is a feature list, whose possible members are progressive and perfect. Modality is also a feature list, whose possible members are the various modal auxiliaries (shall, will, may, can, etc.; the treatment is admittedly very superficial). We also make use of the special value none for Modality and Aspect; it is equivalent to an empty feature list.

            We define the predicate checkaux3 to specify the contribution made by each auxiliary to the aspect and voice fields. The first argument is the aspect feature list, the second the Required feature of the auxiliary, and the third the value for Voice. The code is the following:

checkaux3([perfect],en_active,_).
checkaux3([progressive],ing,_).
checkaux3([],Required,_):-
                             Required \= en_passive,
                             Required \= en_active,
                             Required \= ing.
checkaux3([],en_passive,passive).

            Note that Voice is left as the anonymous variable when the voice is not passive, i.e. when it is active.

            Since we do not know the number of auxiliaries that a given verb phrase will have, we define the verbphrase predicate as recursive: it calls the xverbphrase predicate (recall that the x prefix indicates that the predicate can be coordinated). We call a special version of the append predicate (myappend) to build up the aspect value. Myappend is able to deal with the none value. Here is the relevant piece of code:

verbphrase(P0,P2,subject(SUBJ,Semsubj),Gap,Preflist,
              VG,
              Rel,Intrel,Type,Tense,
              aspect(A1),Modality,
              Number,Person,Voice,
              nsubject(NSUBJ,Nsem)):-
        xaux(P0,P1,Modality,Type,Required,Number,Person,Tense),
        checkaux3(A2,Required,Voice),
        myappend(A1,A2,A),
        xverbphrase(P1,P2,subject(SUBJ,Semsubj),Gap,Preflist,
                   VG,Rel,Intrel,
                   Required,Tense,aspect(A),Modality,
                   Number,Person,Voice,
                   nsubject(NSUBJ,Nsem)).

Myappend is defined by the following clauses:

/* the value none - either isolated or as first element of a one-element list - receives the same treatment as the empty list */

myappend([],X,X).
myappend(X,none,X):-!.
myappend(X,[none],X):-!.
myappend(X,Y,Y):-   var(X),  !.
myappend([Head|L1],L2,[Head|L3]):- myappend(L1,L2,L3).
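
            The interplay of checkaux3 and myappend can be tried out separately from the parser. In the sketch below both predicates are copied from the text; collect_aspect is a hypothetical driver which walks through a list of Required features, as they would be met from left to right in the auxiliary group, and accumulates the aspect value.

```prolog
checkaux3([perfect],en_active,_).
checkaux3([progressive],ing,_).
checkaux3([],Required,_):-
    Required \= en_passive,
    Required \= en_active,
    Required \= ing.
checkaux3([],en_passive,passive).

myappend([],X,X).
myappend(X,none,X):-!.
myappend(X,[none],X):-!.
myappend(X,Y,Y):- var(X), !.
myappend([Head|L1],L2,[Head|L3]):- myappend(L1,L2,L3).

% collect_aspect(+Requireds, +Sofar, -Aspect): hypothetical driver.
collect_aspect([], Aspect, Aspect).
collect_aspect([Required|Rest], Sofar, Aspect) :-
    checkaux3(Contribution, Required, _Voice),
    myappend(Sofar, Contribution, Sofar1),
    collect_aspect(Rest, Sofar1, Aspect).
```

            For may have been working the Required features are infinitive, en_active and ing; the first solution of collect_aspect([infinitive,en_active,ing],[],A) binds A to [perfect,progressive]. Likewise, the first solution of myappend([perfect],none,A) binds A to [perfect].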

4.3.4.1.2. Coordinated Verb Phrases

            In coordinated verb phrases we make sure that the arglist is appropriate to both conjuncts. For instance, in

He looked at and liked the girl

the phrase the girl is the arglist of both LOOK AT and LIKE.

            The check is implemented by feeding the arglist predicate the same word list. In our example, the girl is parsed as arglist of LOOK AT and then reparsed as arglist of LIKE. An additional problem arises out of the presence of the preposition or particle, which remains close to its verb, instead of joining the arglist. Indeed, we do not have

* He looked and liked at the girl

We must therefore make sure that the particle is found to the immediate right of the first coordinated verb. The following piece of code takes care of the example:

/* 1  he looked at and liked the girl */
verbphrase(P0,P4,subject(SUBJ,Semsubj2),Gap,Pref,
           [pred_arg_mod_structure,
            prop(vce:V,asp:A,mod:Modality,tns:Tense),
            [and_predicate,VERB1,VERB2],SParse],
            Rel,Intrel,Type,Tense,aspect(Aspect),Modality,
            Number,Person,Voice,nsubject(NSUBJ,Nsem2)):-
    verb(P0,P1,Class1,VERB1,Type,Tense,Number,
          Person,Semsubj1,Args1) ,
    Args1 = [string(X,Y,Z)|Rest],
/* X is Status, Y is posprec and Z is the string itself in list format, e.g. [away,with] */
    append(Z,[and|P2],P1),
/* the string + and + the remainder is what follows the verb */
    verb(P2,P3,Class2,VERB2,Type,Tense2,Number,Person,Semsubj2,Args2),
    arglist(P3,P4,Gap,Status1,Pref1,Preclist1,Rel,
               Intrel,Voice,Parse1,vp,
               Rest,Func,subject(SUBJ,Semsubj1),
               nsubject(NSUBJ,Nsem1),
               Class1),
    arglist(P3,P4,Gap,Status2,Pref,Preclist,Rel,
               Intrel,Voice,Parse2,vp,
               Args2,Func,subject(SUBJ,Semsubj2),
               nsubject(NSUBJ,Nsem2),
               Class2),
/* note P3,P4 twice: the same list must satisfy both the remaining args of the first verb and those of the second */
     (nonvar(Aspect); var(Aspect),A = none ),
     (Aspect = [] , A = none; Aspect \= [],A = Aspect ),
     (nonvar(Modality); var(Modality),Modality = none ),
     (nonvar(Tense); var(Tense), Tense = present ),
     (nonvar(Voice), V = Voice; var(Voice),V = active ),
     append([NSUBJ],Parse2,AParse2),
     insort(AParse2,SParse),  /* to get canonical word order */
     Status1 = Status2. /* the two Status values must unify */

            Note that the verb class (Class1 and Class2) is passed as an argument to arglist, but cannot be used as the basis on which to determine whether the two verbs can be coordinated: we need to parse the remaining word list. It would be counter-productive to have to assign the same word class to two verbs merely on the basis of the observation that the two verbs can be coordinated. In our example we do not want to assign LOOK AT and LIKE to the same word class, as this would go against the spirit of consistency checks and template sharing that we might want to place on the lexicon.
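
            The goal append(Z,[and|P2],P1) in the code above does the work of locating the particle. Taken in isolation, it can be sketched as follows; split_particle is a hypothetical wrapper, and app is a local copy of append so that the fragment is self-contained.

```prolog
% app/3: local copy of list concatenation.
app([], L, L).
app([H|T], L, [H|R]) :- app(T, L, R).

% split_particle(+Particle, +AfterVerb, -AfterAnd): hypothetical
% wrapper. Succeeds if the words following the first verb consist
% of Particle, then and, then AfterAnd.
split_particle(Particle, AfterVerb, AfterAnd) :-
    app(Particle, [and|AfterAnd], AfterVerb).
```

            After looked has been parsed in He looked at and liked the girl, split_particle([at],[at,and,liked,the,girl],P2) binds P2 to [liked,the,girl]. In the starred example the goal split_particle([at],[and,liked,at,the,girl],P2) fails, because the particle is not immediately to the right of the first verb.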

4.3.4.2. Generation

            Generating the verbal group divides naturally into two tasks: generation of the auxiliary group (whose membership may be nil) and generation of the main verb.

4.3.4.2.1. The Auxiliary Group

            We shall first consider the generation of the auxiliary group. The gen predicate dedicated to this task needs to have access to the Property field (voice, aspect, modality and tense), and to agreement values of the subject, i.e. Person (NP1 in the code cited below) and Number (NP2). It returns a list of auxiliaries.

            This gen predicate must also be sensitive to the required auxiliary order in the clause: modality, aspect (perfect), aspect (progressive) and voice, as in:

He may have been being interviewed.

            Once we have generated a candidate auxiliary list, we check that the first auxiliary agrees in person and number with the subject. If it does not, we have to backtrack, but we can skip the part between snips ([! !]) (snips are a useful Arity Prolog extension to standard Prolog: the predicates between the two snips are skipped on backtracking[13]). Here attempting to redo the list appending would be useless, and attempting to redo the list flattening would lead to trouble.

            The relevant piece of code is the following:

gen(prop(vce:Voice,asp:Asp,mod:Mod,tns:Tns),NP1,NP2,Alist2):-

gen(Mod,Tns,Modaux,Req1),           /* Modality */
genasp1(Asp,Aspaux1,Req1,Req2),       /* Aspect: perfect */
genasp2(Asp,Aspaux2,Req2,Req3),       /* Aspect: progressive */
gen(Voice,Voiceaux,Req3,Req4),        /* Voice */
/* the Required feature of the preceding Aux is found in last position but one in the next, i.e. the one housing its Type */
[! append([Modaux],[Aspaux1],List1),
append([Aspaux2],[Voiceaux],List2),
append(List1,List2,List3),
flatten(List3,Auxlist),
first(Auxlist,Auxfirst) !] ,
getagr(Auxfirst,Auxfirstagree,Tns,To),
/* the last argument indicates whether a TO is needed in front of the auxiliary list */
[! append(To,Auxlist,Alist),
flatten(Alist,Alist2) !] ,
agree(NP1,NP2,Auxfirstagree).

            Various predicates need to be explained. Let us begin with the ones generating the auxiliaries.

            Modal auxiliaries are very easy to generate. We call the gen predicate with four arguments: the first is the parse tree of the auxiliary (it reduces to a simple string indicating the lexemic value of the auxiliary: both might and may will yield may, and differ as to tense); the second is tense, the third will house the generated string and the fourth is the type that the auxiliary requires of the next verb form. In the case of modal auxiliaries, this value is always infinitive (the modal itself being always finite).

            We need to cater for the value none, which generates the empty list:

gen(none,_,[],_).

            If the parse tree is not none, we get the appropriate form from the lexicon:

gen(Mod,Tns,Modaux,Req1):-
                       aux([Modaux|_],_,Mod,_,Req1,_,_,Tns).
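
            The two clauses can be tested against a small lexicon. In the sketch below the predicate is renamed genmod (a hypothetical name) so that it can be loaded on its own; the clause for may is quoted from the lexicon, while the clause for might is an assumption based on the remark that might and may share the lexemic value may and differ as to tense.

```prolog
% Lexical clause quoted in the text.
aux([may|X],X,may,finite,infinitive,_,_,present).
% Assumed clause for the past form (illustration only).
aux([might|X],X,may,finite,infinitive,_,_,past).

% genmod/4: the two gen clauses for modality, under another name.
genmod(none,_,[],_).
genmod(Mod,Tns,Modaux,Req1):-
    aux([Modaux|_],_,Mod,_,Req1,_,_,Tns).
```

            The goal genmod(may,past,Form,Req) binds Form to might and Req to infinitive, which is the type the next element of the verb group must have; genmod(none,present,Form,Req) binds Form to [] and leaves Req uninstantiated.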

            We have two predicates for the generation of aspectual auxiliaries, genasp1 and genasp2. This is due to the fact that the perfect auxiliary (HAVE) needs to be generated before the progressive auxiliary (BE). Of course, neither need be present.

            Genasp1 looks for the value perfect in the aspect value list. It may be the only value in the field (there is no progressive), or it may be the first of two, or the second of two. We therefore have three clauses, which we could easily reduce to one by testing if perfect is a member of the list passed as first argument. It is this new version of the predicate that we give here:

genasp1(Asplist,Auxform,Req1,en_active):-
       inlist(perfect,Asplist),
       aux([Auxform|X],X,have,Req1,en_active,_,_,_).

            Note that Req1 points to the type that aspectual have is required to have (for instance, infinitive after a modal auxiliary), and that en_active is the type it requires of the next auxiliary in the chain, and will therefore appear in the preceding slot in the next call in the generation of the auxiliary list, as in this fragment:

genasp1(Asp,Aspaux1,Req1,Req2),  /* Aspect: perfect */
genasp2(Asp,Aspaux2,Req2,Req3),  /* Aspect: progressive */

            We also need a clause for the case where there is no perfect auxiliary:

genasp1(_,[],Req,Req).

            Note the anonymous variable in the first slot, which makes it clear that rule ordering is relevant here.
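
            The effect of the ordering can be observed with a small lexicon. In the sketch below the two genasp1 clauses are those given above; the aux clauses carry have in their third slot and include a finite form has, both of which are assumptions made for the illustration, and inlist is assumed to be plain list membership with the element as first argument.

```prolog
% Assumed lexical clauses for the perfect auxiliary.
aux([have|X],X,have,infinitive,en_active,_,_,_).
aux([has|X],X,have,finite,en_active,sing,thirdsg,present).

% inlist(?Element, ?List): assumed to be list membership.
inlist(X,[X|_]).
inlist(X,[_|T]):- inlist(X,T).

genasp1(Asplist,Auxform,Req1,en_active):-
    inlist(perfect,Asplist),
    aux([Auxform|X],X,have,Req1,en_active,_,_,_).
genasp1(_,[],Req,Req).
```

            The first solution of genasp1([perfect,progressive],Form,infinitive,Req) binds Form to have and Req to en_active. With the aspect list [progressive] the first clause fails and the second applies: Form is [] and the required type is passed on unchanged. Since the second clause matches any aspect value, it would also apply on backtracking when perfect is present, which is why it must come last.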

            The generation of the progressive auxiliary runs parallel, and will not be discussed here.

            Generating the voice auxiliary is an easy task. If voice is active, we do not generate anything. If it is passive, we look in the lexicon for the appropriate form of BE:

gen(active,[],Req,Req).

gen(passive,Auxform,Req1,en_passive):-
        aux([Auxform|X],X,be,Req1,en_passive,_,_,_).

            The next predicate that deserves our attention is getagr, whose job is to get the agreement feature of the first auxiliary in the auxiliary list, since this is the only auxiliary that needs to agree with the subject, being the only finite one.

            We must bear in mind that we have used the tense feature to indicate restrictions on the whole verb group in dealing with raising and control, for instance to indicate that the whole group must be an infinitive or ing group. Consider the following definition of oraising, the predicate taking care of subject-to-object raising:

oraising([H1,[pred_arg_mod_structure,Prop1,
         [predicate(Pred1,AgrPred1)],
    [[object,[clause,
[pred_arg_mod_structure,Prop2,[predicate(P,agr(Agr))],
                     [[Subject|Rest]|Otherargs]]]]|R1]]],
[H1,[pred_arg_mod_structure,Prop1,
        [predicate(Pred1,AgrPred1)],
    [[object|Rest],
     [object,[clause,
[pred_arg_mod_structure,Propnew2,[predicate(P,agr(Agr))],
                     [Otherargs]]]]|R1]]]):-

allsubject(Subject),
second_header(H1) ,
oraise(Pred1,Requires),
nonfinite(Agr),
Prop2 = prop(Voice,Aspect,Mod,Tns),
Propnew2 = prop(Voice,Aspect,Mod,tns:Requires).

            The oraise predicate makes a direct call on the lexicon, as we shall see. The call oraise(Pred1,Requires) checks that Pred1 is a subject-to-object raising predicate, but it also determines the nature of the verbal group left after the subject of the lower clause has been promoted to object of the higher one.

             The code for oraise is the following:

oraise(Oraisingverb,to):-
m_verb(vinf,_,Oraisingverb,_,_,_,_,_,_,_,_,_,_,
        [np(_,_,surf_object,_),
         np_vp(oblig,to_inf,object)]).

oraise(Oraisingverb,to):-
m_verb(vinf,_,Oraisingverb,_,_,_,_,_,_,_,_,_,_,
        [np(_,_,subject_inf,_),
         np_vp(oblig,to_inf,object)]).
  
oraise(Oraisingverb,ing):-
m_verb(ving,_,Oraisingverb,_,_,_,_,_,_,_,_,_,_,
        [np(_,_,subject_ing,_),
         np_vp(oblig,ing,object)]).

            From this we see that the second argument of oraise houses a property of the remaining non-finite clause after the promotion of its subject to the higher clause: it is either infinitive (bare infinitive or infinitive preceded by TO) or gerundive.

            In the code for oraising, we see that this property overwrites the tense value in Prop2 to yield Propnew2, the property list associated with the lower clause. In dealing with the agreement feature of the first auxiliary in the auxiliary list, we must therefore also take into account any restriction that the tense feature slot may have been made to bear.

            Tense occupies the third slot of getagr. The first is the auxiliary and the second the agreement property of the auxiliary (the one that needs to concord with the Person and Number values percolated from the subject; it is read off the lexicon). The fourth slot is a list, which will either be left empty or will house the infinitive particle TO.

            Of course, it may be the case that there is no first auxiliary, simply because the auxiliary list is empty. In that case, getagr returns noaux as auxiliary agreement feature. The agree predicate will have to be able to deal with this value.

            Here is the code for getagr:

/* there is an aux */

getagr(Aux,Auxagree,Tns,[]):-
       Tns \== to,
       Tns \== bare,
       Tns \== ing,
       aux([Aux|X],X,_,finite,_,_,Auxagree,Tns).

getagr(Aux,Auxagree,to,[to]):-
       aux([Aux|X],X,_,infinitive,_,_,Auxagree,_).

getagr(Aux,Auxagree,bare,[]):-
       aux([Aux|X],X,_,infinitive,_,_,Auxagree,_).

getagr(Aux,Auxagree,ing,[]):-
       aux([Aux|X],X,_,ing,_,_,Auxagree,_).

/* there is no aux */

getagr([],noaux,Tns,[]):-
       Tns \== to,
       Tns \== bare,
       Tns \== ing,!.

getagr([],noaux,to,[to]).
getagr([],noaux,bare,[]).
getagr([],noaux,ing,[]).

            Note that if Tns really houses a tense value, the auxiliary needs to be finite. Note also that we do not really need four clauses when there is no auxiliary (two would do), but we have chosen to preserve code parallelism.

            The code for agree is similar to the one used in analysis, but we need a clause for noaux, which we place at the end of the clause packet, and which succeeds no matter how the first two arguments (percolated from the subject) are instantiated:

/* agree(Personnp,Number,Personvp) */

agree(first,sing,firstsg).
agree(first,plural,other).
agree(second,sing,other).
agree(second,plural,other).
agree(third,sing,thirdsg).
agree(third,plural,other).
agree(_,_,noaux).
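
            The interplay between getagr and agree can be tried out in isolation. The following is only a sketch: the aux facts are a toy lexicon invented for the illustration (the real horatio entries are not reproduced here), and first_aux_ok is a hypothetical helper, not a horatio predicate. The getagr and agree clauses are the ones given above.

```prolog
% Toy auxiliary lexicon (ASSUMED entries, for illustration only).
% aux(ListIn, ListOut, Lexeme, Finiteness, _, _, Agreement, Tense)
aux([has|X],    X, have, finite,     _, _, thirdsg, present).
aux([have|X],   X, have, finite,     _, _, other,   present).
aux([have|X],   X, have, finite,     _, _, firstsg, present).
aux([have|X],   X, have, infinitive, _, _, none,    _).
aux([having|X], X, have, ing,        _, _, none,    _).

/* there is an aux */
getagr(Aux,Auxagree,Tns,[]):-
       Tns \== to, Tns \== bare, Tns \== ing,
       aux([Aux|X],X,_,finite,_,_,Auxagree,Tns).
getagr(Aux,Auxagree,to,[to]):-
       aux([Aux|X],X,_,infinitive,_,_,Auxagree,_).
getagr(Aux,Auxagree,bare,[]):-
       aux([Aux|X],X,_,infinitive,_,_,Auxagree,_).
getagr(Aux,Auxagree,ing,[]):-
       aux([Aux|X],X,_,ing,_,_,Auxagree,_).
/* there is no aux */
getagr([],noaux,Tns,[]):-
       Tns \== to, Tns \== bare, Tns \== ing,!.
getagr([],noaux,to,[to]).
getagr([],noaux,bare,[]).
getagr([],noaux,ing,[]).

/* agree(Personnp,Number,Personvp) */
agree(first,sing,firstsg).
agree(first,plural,other).
agree(second,sing,other).
agree(second,plural,other).
agree(third,sing,thirdsg).
agree(third,plural,other).
agree(_,_,noaux).

% first_aux_ok(+Aux,+Tns,+Person,+Number): hypothetical helper.
% Succeeds if the first auxiliary (or its absence) is compatible
% with the person and number percolated from the subject.
first_aux_ok(Aux,Tns,Person,Number):-
       getagr(Aux,Auxagree,Tns,_To),
       agree(Person,Number,Auxagree).
```

            With this toy lexicon, first_aux_ok(has,present,third,sing) succeeds and first_aux_ok(have,present,third,sing) fails, while an empty auxiliary list is accepted for any subject thanks to the noaux clause of agree.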

            In YES-NO questions we use a special predicate to generate the auxiliary list (genyesno). The reason is that a form of DO needs to be generated in YES-NO questions when voice is active and there is no modal or aspectual auxiliary (and recall that what follows a non-subject WH-group is also generated as a YES-NO question). We call on the lexicon to provide us with a finite form of DO and then call on agree to check whether it is in agreement with the subject:

genyesno(prop(vce:active,asp:none,mod:none,tns:Tns),
          Person,Number,[Aux|[]]):-
   aux([Aux|_],_,do,finite,_,_,Agraux,Tns),
   agree(Person,Number,Agraux),!.

            Note that the form of DO is returned as first element of a list (with empty tail), in conformity with the pattern used in the generation of auxiliaries in declarative clauses. If we do not need to generate a form of DO, genyesno can simply fall back on the version of the gen predicate as used in declarative clauses:

genyesno(prop(vce:Voice,asp:Asp,mod:Mod,tns:Tns),Person,Number,Auxlist):-
         gen(prop(vce:Voice,asp:Asp,mod:Mod,tns:Tns),Person,Number,Auxlist).

4.3.4.2.2. The Main Verb

            The generation of the last element of the verb group, i.e. the main verb, is in most cases simply a matter of retrieving the verb form on the basis of the lexeme value and the agreement functor. However, we also need to call on the macro m_verb clause to retrieve the particle that should accompany the verb, since it is amalgamated to the lexeme in the parse tree returned by horatio. For instance, He looked it up will have a parse tree where up is no longer an independent element, but appears only in the lexeme value, which will be look_up_1, i.e. the first (and at present only) reading of a lexeme LOOK UP. It would also be possible to retrieve the particle by a string operation on the lexeme value in the parse tree, but this is felt to be too tightly tied to the string structure of the lexeme value to be a reasonable way of getting at the particle. We prefer to have it as a separate argument in the lexicon, even if it will often be left uninstantiated.

            The default clause for the generation of the main verb runs as follows:

gen([predicate(Lex,Agr)],Vform,P):-
verb([Vform|_],_,_,[predicate(Lex,Agr)],_,_,_,_,_,_),
m_verb(Class,Part,Lex,_,_,_,_,_,_,_,_,_,_,_),
( var(Part),P=[]; nonvar(Part),P=[Part] ).

            If Part is left uninstantiated in the lexicon, we return an empty list as particle value (last argument of the gen predicate); otherwise, we return the particle. As a matter of fact, the particle as read off the lexicon is more than the string value of the particle. It is a couple made of a particle type and a string value. And particle is to be understood in an extended sense, as covering various types of fixed strings associated with verbs. Consider the m_verb clause for LOOK UP:

m_verb(vphr,part1:up,look_up_1,look,look,look,looks,looking,
        looked,looked,looked,trans,human,
        [part(oblig,posprec(1,2),up),
         np(oblig,posprec(1,Wnp),object,abstract)]).

/* she looked it up */
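
            The var/nonvar test on the particle slot can be isolated in a small sketch. The two m_verb clauses are copied from the lexicon as quoted in this document (LOOK UP above, READ in the section on sfok), with the unused weight variable renamed to avoid singleton warnings; particle_list is a hypothetical helper introduced only for this illustration.

```prolog
% Lexical entries as quoted in the text.
m_verb(vphr,part1:up,look_up_1,look,look,look,looks,looking,
        looked,looked,looked,trans,human,
        [part(oblig,posprec(1,2),up),
         np(oblig,posprec(1,_Wnp),object,abstract)]).

m_verb(verbtr,_,read_1,read,read,read,reads,reading,
        read,read,read,trans,human,
        [np(opt,posprec(1,_Wnp),object,abstract)]).

% particle_list(+Lex,-P): hypothetical helper.
% P is [] when the particle slot is left uninstantiated in the
% lexicon, and [Type:String] otherwise.
particle_list(Lex,P):-
        m_verb(_Class,Part,Lex,_,_,_,_,_,_,_,_,_,_,_),
        ( var(Part) -> P = [] ; P = [Part] ).
```

            The query particle_list(look_up_1,P) yields the couple [part1:up], whereas particle_list(read_1,P) yields the empty list.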

            In the couple part1:up, the first element, the particle type, is used to determine the position of the particle with respect to other elements, such as the verb's arguments. This is accomplished in the standard way, by associating a certain weight with particles and other constituents.

            The relevant code is that of the poids predicate (poids is French for weight):

poids([part0:_],0).
poids([part1:_],2).

poids([F1,[nounphrase,_,_,ppro(_,_,_)]],1):-
       ( F1 = object; F1 = indirect_object; F1 = indirect_object_2).
/* personal pronouns filling these functions must stay close to the verb */

poids([F1,R1],W):- assoc(F1,W).
/* general case: we read the value in the assoc table */

            Part0 particles are given the lightest possible weight, and so will have to remain attached to the verb. For instance, we will assign part0 as particle type to the string place in the mwu take place. We have no problem regarding place as a particle, in an extended sense at least, where the word particle refers to fixed string elements. In the section on mwu's (see page 21) we argued that place in take place should not be considered an np or even a noun, but should be regarded as a string, from which viewpoint its resistance to morphological variation and syntactic manipulation is readily explained.

            Part1 particles are given a weight of 2, and can therefore follow very light nps such as personal pronouns, which receive a weight of 1. This explains why we generate He looked it up, and not * He looked up it.

            The assoc table will be further discussed on p. 80. The generation version differs slightly from the analysis version. Here is the code for generation:

assoc(subject,1).
assoc(subject_pass,1).
assoc(subject_inf,1).
assoc(object,3).
assoc(indirect_object,2).
assoc(indirect_object_2,2).
assoc(cplt_s,6).
assoc(subject_attribute,4).
assoc(object_attribute,4).
assoc(pp_arg,3).
assoc(vp_modifier,5).

            Two special cases precede the default case in main verb generation. They deal with the TO that must follow bareinf verbs when used in the passive voice (They saw him take it vs. He was seen to take it), and with the TO that must precede the verb when its agreement functor registers the need for it (value toinfinitive). The code for these cases runs as follows:

gen([predicate(Lex,agr(en_passive))],Genpred,P):-
verb([Vform|_],_,_,[predicate(Lex,agr(en_passive))],_,_,_,_,_,_),
m_verb(vbareinf,Part,Lex,_,_,_,_,_,_,_,_,_,_,_),
( var(Part),P=[]; nonvar(Part),P=[Part] ),
append([Vform],[to],Genpred).


gen([predicate(Lex,agr(toinfinitive))],Genpred,P):-
verb([Vform|_],_,_,[predicate(Lex,agr(toinfinitive))],_,_,_,_,_,_),
m_verb(Class,Part,Lex,_,_,_,_,_,_,_,_,_,_,_),
( var(Part),P=[]; nonvar(Part),P=[Part] ),
append([to],[Vform],Genpred).

4.4. Modifiers

            In horatio modifiers can take the form of prepositional phrases. The main difference from argument pps lies in the preference value assigned to the phrase: arguments are preferred to modifiers, according to the densest-match-first principle.

            As to the internal structure of modifier pps, we distinguish between prepositions that can head modifier pps assigned to nps and those that can head modifier pps assigned to vps. Of course, some prepositions can head both np and vp modifier pps. The predicates modppvp and modppnp are used to implement the distinction. We have the following definitions for these predicates:

/* availability of preps in np and vp modifiers */

/* vp */

modppvp(on).
modppvp(in).
modppvp(at).
modppvp(with).
modppvp(for).
modppvp(about).

/* np */

modppnp(of). /* a book of importance */
modppnp(on). /* a book on each table */
modppnp(in). /* a bird in the tree */
modppnp(with). /* a book with a black cover */
modppnp(for).  /* a book for mary  */
modppnp(about). /* a book about the economic situation in Great-Britain */
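
            Since the two tables are plain facts, the prepositions that can head both np and vp modifier pps can be read off by a simple conjunction. The helper np_and_vp_prep below is hypothetical and serves only to illustrate the point; the facts are those listed above.

```prolog
/* vp */
modppvp(on).
modppvp(in).
modppvp(at).
modppvp(with).
modppvp(for).
modppvp(about).

/* np */
modppnp(of).
modppnp(on).
modppnp(in).
modppnp(with).
modppnp(for).
modppnp(about).

% np_and_vp_prep(?Prep): hypothetical helper.
% Prep can head a modifier pp attached to an np as well as to a vp.
np_and_vp_prep(Prep):- modppnp(Prep), modppvp(Prep).
```

            With the tables as given, of is restricted to np modifiers and at to vp modifiers; all the other prepositions listed can head both.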

            Modifier pps are parsed by calls to the xprepphrase predicate. The reader is referred to the section on prepositional phrases, p. 62. We give here the code for np modifier pps:

modifier(P0,P1,np,Gap,Prefgen,posprec(1,Precpp),
             Rel,Intrel,Tree,subject(SUBJ,Sem)):-
         xprepphrase(P0,P1,Gap,index(J),npindex(I),Pref,Precpp,
                     Prepform,Rel,
                    Intrel,np_modifier,
                   Tree,PPsem,PPsemnp),
         modppnp(Prepform),
         Prefgen is Pref + 1.

            Np modifiers can also be clauses, either ING-clauses or EN-clauses (passives). Examples are:

the man reading a book is good
the book read by the man is good

            These modifier clauses are parsed by calls to the xverbphrase predicate. The head noun must be passed to the ING or EN clause, where it will play the part of subject (active, ING-clause) or object (passive, EN-clause). Here is the code for modifier ING clauses:

modifier(P0,P1,np,Gap,Prefgen,posprec(1,_),Rel,Intrel,
           [np_modifier,Tree],
           subject([Function,[Cat,Index|Rest]],Sem)):-

    xverbphrase(P0,P1,subject([Function,[Cat,Index|VAR]],Semsubj),Gap,
      Pref,Tree,Rel,Intrel,
                ing,Tense,
                aspect(Aspect),Modality,
                Number,Person,
               Voice,nsubject(NSUBJ,Nsem)),
    sfok(Semsubj,Sem),
    Prefgen is Pref + 1.

            The third argument to the modifier clause indicates that such modifiers are restricted to np modifiers. Note also that the body of the subject NP (Rest) is ghosted in the tree returned by the call to the xverbphrase predicate. Only the index is retained, to show coindexing. A semantic check is performed between the restriction placed on the subject of the verb phrase and the semantic feature list percolated from the head noun to its ING-modifier via the subject functor.

            Such percolation is accomplished by a call to the satisfylist predicate. Remember that this predicate will also parse modifiers (the difference being that in the case of a modifier the noun's arglist is left untouched), so that modifiers have access to the subject functor:

corenounphrase(P0,P3,Gap,index(I),Pref,Weight,Rel,Intrel,
            Function,
                      [Function,[nounphrase,index(I),
                             agr(third,Number),DET,N,Parse]],
                    Number,third,Sem):-
     determiner(P0,P1,DET,Number,Rel1,Intrel1) ,
     xnoun(P1,P2,N,Number,Sem,Arglist),
     satisfylist(P2,P3,Gap,Status,Pref,Prec,Rel2,Intrel2,_,
                 Parse,np,Arglist,Func,
                 subject([subject,[nounphrase,index(I),
                          agr(third,Number),DET,N]],
               Verbsem)),
     sfok(Verbsem,Sem),
     ( var(Rel1),var(Intrel1),Rel=Rel2,Intrel=Intrel2;
      nonvar(Rel1),nonvar(Intrel1),Rel=Rel1,Intrel=Intrel1 ),
     ifthenelse(var(Status),Weight=2,Weight=3).

            In the case of an EN-clause as modifier, we likewise use the subject functor in the clause. The clause is passive, and therefore the passive subject will come to occupy object position in the tree when passivization is undone in the EN-clause. The semantic check is therefore between the Sem feature list passed to the subject functor in the modifier clause and the new subject returned by the passive clause (nsubject functor; this would house the restriction on the object of READ, in our example).

modifier(P0,P1,np,Gap,Prefgen,posprec(1,_),Rel,Intrel,
             [np_modifier,Tree],
             subject([Function,[Cat,Index|Rest]],Sem)):-
    xverbphrase(P0,P1,
      subject([Function,[Cat,Index|VAR]],Semsubj),
            Gap,Pref,Tree,Rel,Intrel,
             en_passive,Tense,
             aspect(Aspect),Modality,Number,Person,
             passive,nsubject(NSUBJ,Nsem)),
    sfok(Nsem,Sem),
    Prefgen is Pref + 1.

            In generation, modifier EN-clauses need a slightly special treatment: we must make sure that we do not generate the passive auxiliary. In fact, we DO generate it when we generate the passive clause, but we drop it from the string we return at the highest level, i.e. as second argument of the gen predicate:

/* written by students */

gen([[np_modifier,[pred_arg_mod_structure,prop(vce:passive,B,C,D)|R]]], Tail):-
    prepgen([clause,[pred_arg_mod_structure,prop(vce:passive,B,C,D)|R]], Prepmod),
    gen(Prepmod,Mod),
    flatten(Mod,[H|Tail]).

/* we return only the Tail; the Head (H) is the auxiliary, which does not appear in the modifier: written by students, * is written by students */
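
            The effect of the final flatten call can be illustrated independently of the rest of the generator. In the sketch below, flat is a locally defined stand-in for flatten (so that the example does not depend on a particular Prolog library), drop_aux is a hypothetical helper, and the nested list is an invented example of what the generation of a passive clause might return.

```prolog
% flat(+Nested,-Flat): local stand-in for flatten/2.
% Assumes the leaves are atoms, not unbound variables.
flat([],[]).
flat([H|T],F):-
       ( H == [] ; H = [_|_] ), !,
       flat(H,FH),
       flat(T,FT),
       append(FH,FT,F).
flat([H|T],[H|FT]):- flat(T,FT).

% drop_aux(+Nested,-Tail): hypothetical helper.
% Flattens the generated passive clause and discards its first
% word, i.e. the passive auxiliary.
drop_aux(Nested,Tail):- flat(Nested,[_Aux|Tail]).
```

            For instance, the query drop_aux([[was],[written,[by,students]]],T) binds T to [written,by,students].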

4.5. Canonical Order and Athematic Arguments

            In the parse trees returned by horatio the arguments appear in a canonical order, and athematic arguments are excluded. This is achieved by a number of predicates. We begin by considering insort, which is simply an insertion sort that we perform on the arglist. Insort calls insert, which itself calls before. Insort also calls drop, which decides whether the argument is athematic and should be dropped.

insort([H|T],S):- drop(H),
                            insort(T,S).

insort([H|T],S):- not(drop(H)),
                            insort(T,L),
                            insert(H,L,S).

insort([],[]).

            If the head of the list to be sorted is an athematic element, it is dropped and insort calls itself recursively on the tail. If the head is a thematic argument, the tail is sorted and then the head is inserted at its proper place by insert, whose code is the following:

insert(X,[H|T],[H|L]):-
                 before(H,X),
                 !,
                 insert(X,T,L).

insert(X,L,[X|L]).

            The element is inserted in the tail of the list if the head of the list comes before the element to be inserted; otherwise, the new element is inserted in front. The predicate before determines whether its first argument should come before its second argument. The decision is taken on the basis of the argument's function, which is always the head of the list representing it in the parse:

before([F1,R1],[F2,R2]):-
                      assoc(F1,Rank1),
                      assoc(F2,Rank2),
                      Rank1<Rank2.

            The predicate assoc associates a given rank with each thematic function:

assoc(subject,1).
assoc(object,2).
assoc(indirect_object,3).
assoc(indirect_object_2,3).
assoc(cplt_s,4).
assoc(subject_attribute,5).
assoc(object_attribute,6).
assoc(pp_arg,7).
assoc(vp_modifier,8).

            It remains to look at the list of athematic functions, the ones that appear as arguments of the drop predicate:

drop([surf_subject,Rest]).
drop([surf_object,Rest]).
drop([subject_inf,Rest]).
drop([subject_ing,Rest]).

5. Semantics

            horatio is very poor on semantic features. The values are organized in a hierarchy by means of Prolog clauses very much like the well-known parent/ancestor predicates. The semantic checks are of course performed by unification. It should be noted that nouns have lists of semantic features, whereas verbs place restrictions by means of single semantic features (one per argument, including the subject). The use of lists for nouns enables the lexicographer to maintain the principle of one macro-clause per reading. Consider the entries for COMPUTER and BOOK:

m_noun(computer_1,computer, computers,[thing,human],[]).
m_noun(book_1,book, books,[thing,abstract],[pp(opt,n:topic)]).

            These entries allow us to deal with sentences such as the following where the verb or another predicative element places restrictions on the np whose head is a realization of the lexeme COMPUTER:

1) The computer thinks that the problem is hard to solve.
2) The computer should be repaired.

(if we accept that THINK requires a [+HUMAN] subject, and REPAIR a [+THING] object)

5.1. GF Level

            It is obvious that semantic checks should not be applied at surface level, but at the GF level, where the grammatical functions corresponding to the specifications of the item's arglist have been retrieved.

            A case in point is passives. In a sentence such as The book has been read by the teacher, passivization has demoted the subject to by-phrase status, and promoted the object to subject status. In the lexicon, however, the semantic restrictions imposed by the verb read are placed on the arguments as they appear before passivization reorganizes the arglist (to express things in chronological terms, a harmless metaphor so long as we keep in mind that for horatio passives and actives are simply related, without any priority being given to either).

            Let us track the subject of the passive clause. The predicate reog (whose behaviour has been studied in the section on passives, p. 36) puts the object into the slots of the nsubject functor, which is passed on to the arglist predicate and ends up in the verbphrase predicate:

arglist(P0,P2,Gaps,ArgOrModFound,
              Pref,Posprec1,Rel,
              Intrel,Voice,Parse,NpOrVp,List,
              Func,subject(SUBJ,Semsubj),
              nsubject(NSUBJ,Nsem),
              Class):-
              reog(Voice,Class,subject(SUBJ,Semsubj),List,
                   nsubject(NSUBJ,Nsem),Nlist,Func),
              satisfylist(P0,P2,Gaps,ArgOrModFound,Pref,Posprec1,
                  Rel,Intrel,Voice,Parse,NpOrVp,Nlist,
                  Func,subject(NSUBJ,Nsem)).

  verbphrase(P1,P3,subject(SUBJ,Semsubj),Gap,Pref,
               [pred_arg_mod_structure,
               prop(vce:V,asp:A,mod:Modality,tns:Tense),
               VERB,SParse],
             Rel,Intrel,Type,Tense,aspect(Aspect),Modality,
             Number,Person,Voice,nsubject(NSUBJ,Nsem)):-
    verb(P1,P2,Class,VERB,Type,Tense,
          Number,Person,Semsubj,Args) ,
    arglist(P2,P3,Gap,Status,Pref,Preclist,Rel,
               Intrel,Voice,Parse,vp,
               Args,Func,subject(SUBJ,Semsubj),
               nsubject(NSUBJ,Nsem),
               Class),
     (nonvar(Aspect); var(Aspect),A = none ),
     (Aspect = [] , A = none; Aspect \= [],A = Aspect ),
     (nonvar(Modality); var(Modality),Modality = none ),
     (nonvar(Tense); var(Tense), Tense = present ),
     (nonvar(Voice), V = Voice; var(Voice),V = active ),
     append([NSUBJ],Parse,AParse),
     insort(AParse,SParse).

            At sentence level, the check is performed between the subject np and the semantic restriction specified by the nsubject functor:

sentence(P0,P2,Gaps,Prefs,[clause,VP],Type,Personvp,
                  Number,Voice):-
append(Gapnp,Gapvp,Gaps),
xnounphrase(P0,P1,Gapnp,index(I),Prefnp,Weight,Rel1,
                 Intrel1,subject,SUBJ,Number,Personnp,Semsubjnp) ,
var(Rel1),
agree(Personnp,Number,Personvp),
xverbphrase(P1,P2,subject(SUBJ,Semsubjvp),
                 Gapvp,Prefvp,VP,Rel2,
                 Intrel2,Type,Tense,aspect(Aspect),Modality,
                 Number,Personvp,
                 Voice,
                 nsubject(NSUBJ,Nsemsubjvp)),
var(Rel2),
sfok(Nsemsubjvp,Semsubjnp),
accu(Prefs,[Prefnp,Prefvp,4]).

5.2. Inheritance

            In horatio semantic features are organized hierarchically into classes, subclasses, sub-subclasses, etc. A subclass is meant to be able to satisfy a semantic requirement expressed in terms of its parent class.

            The class/subclass relation is expressed by the predicate up1. Its first argument is the subclass, its second the parent class. We find:

up1(animal,living).
up1(human,living).

            These two clauses are meant to convey the information that animals and humans are both living entities. More formally, that the class animal is a subclass of the class living, and that the class human is also a subclass of the class living.

            The hierarchy is traversed upwards by the predicate up, whose definition is similar to that of the ancestor predicate, to be found in introductory tutorials on Prolog. The parent predicate's role is similarly played by the up1 predicate:

up(X,Y):- up1(X,Y).
up(X,Y):- up1(X,Z),up(Z,Y).
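
            A quick way of seeing the transitive behaviour of up is to add one more level to the hierarchy. In the sketch below, the clause up1(living,concrete) is a hypothetical addition made for the sake of the example (the class concrete is an assumption, not necessarily part of horatio's hierarchy); the other clauses are those given above.

```prolog
up1(animal,living).
up1(human,living).
up1(living,concrete).   % ASSUMED extra level, for illustration only

up(X,Y):- up1(X,Y).
up(X,Y):- up1(X,Z),up(Z,Y).

% ?- up(human,Y).
% Y = living ;
% Y = concrete
```

            The first solution comes from the direct up1 link, the second from climbing one level further through the recursive clause.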

            When a semantic restriction is checked, the up predicate is called to see whether the feature list contains a subclass of the class we are looking for, in which case the semantic requirement is satisfied:

sfok(Sem,Semlist):- up(Sem1,Sem),inlist(Sem1,Semlist).

5.3. Percolation

            Percolation can be defined as the copying up or down the tree of feature values. In Prolog this copying is of course done by unification. For instance, we need to be able to refer to the head noun's semantic feature list at the level of the whole noun phrase. Similarly, we sometimes need to refer to the semantic feature list of a noun phrase within a prepositional phrase.

            Let us look at the first of these two examples of feature percolation. In the relevant defining clause for the corenounphrase predicate the two occurrences of the semantic feature list appear in bold type:

corenounphrase(P0,P3,Gap,index(I),Pref,Weight,Rel,Intrel,
            Function,
                      [Function,[nounphrase,index(I),
                             agr(third,Number),DET,N,Parse]],
                    Number,third,Sem):-
     determiner(P0,P1,DET,Number,Rel1,Intrel1) ,
     xnoun(P1,P2,N,Number,Sem,Arglist),
     satisfylist(P2,P3,Gap,Status,Pref,Prec,Rel2,Intrel2,_,
                 Parse,np,Arglist,Func,
                 subject([subject,[nounphrase,index(I),
                          agr(third,Number),DET,N]],
               Verbsem)),
     sfok(Verbsem,Sem),
     ( var(Rel1),var(Intrel1),Rel=Rel2,Intrel=Intrel2;
      nonvar(Rel1),nonvar(Intrel1),Rel=Rel1,Intrel=Intrel1 ),
     ifthenelse(var(Status),Weight=2,Weight=3).

            The xnoun predicate is itself defined in terms of the predicate noun:

xnoun(V1,V2,V3,V4,V5,V6):- noun(V1,V2,V3,V4,V5,V6).

            The predicate noun itself is a macro-expansion of the lexical predicate m_noun:

noun([Singular|X],X,[noun(Lex,agr(sing))],sing,Sem,Arglist):-
m_noun(Lex,Singular,Plural,Sem,Arglist).

            Here is an example of an m_noun clause, with the semantic list (fourth argument) in bold type:

m_noun(explanation_1,explanation,explanations,[abstract],[]).
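
            The macro-expansion can be exercised on this entry. The singular clause of noun and the definition of xnoun are those given above (with the unused Plural variable marked by an underscore); the plural clause is an assumed counterpart written by analogy, since it is not quoted in this section.

```prolog
m_noun(explanation_1,explanation,explanations,[abstract],[]).

% Singular clause, as given in the text.
noun([Singular|X],X,[noun(Lex,agr(sing))],sing,Sem,Arglist):-
        m_noun(Lex,Singular,_Plural,Sem,Arglist).

% Plural clause: ASSUMED by analogy with the singular one.
noun([Plural|X],X,[noun(Lex,agr(plural))],plural,Sem,Arglist):-
        m_noun(Lex,_Singular,Plural,Sem,Arglist).

xnoun(V1,V2,V3,V4,V5,V6):- noun(V1,V2,V3,V4,V5,V6).

% ?- xnoun([explanation,helps],Rest,N,Number,Sem,Args).
% Rest = [helps], N = [noun(explanation_1,agr(sing))],
% Number = sing, Sem = [abstract], Args = []
```

            The semantic feature list [abstract] is simply unified with the fifth argument of xnoun, which is how it becomes available at the level of the whole noun phrase.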

5.4. From Semantic to Lexical Classes

            Consider the phrase PAY ATTENTION TO. Attention is modifiable, and the np whose head it is plays a functional role, namely that of object of the verb PAY:

You should pay more attention to the problems he has mentioned.
Too much attention has been paid to these pseudo-problems.

            Nevertheless PAY ATTENTION TO is a multi-word unit: the preposition is lexically determined and the sense of PAY is not assignable without considering the object.

            The solution adopted for such mwu's in horatio is to use the slot reserved for semantic restrictions to code lexical restrictions. We have one entry for PAY (where the lexeme value is the phrase pay_attention_1) where we specify that the object must bear the semantic feature attention. We also have a reading of attention which bears the required feature in its semantic feature list:

m_verb(vobjfreepp,_,pay_attention_1,pay,pay,pay,pays,paying,
        paid,paid,paid,trans,human,
        [np(oblig,posprec(1,Wnp),object,attention),
         pp(oblig,posprec(1,Wpp),pp_arg,_,_,to)]).

/* they should pay attention to the problem he has seen */


m_noun(attention_1,attention,[attention],[]).

            By the side of the entry for pay_attention_1 given above, we need another one, where attention to is parsed as a particle attached to the verb. This is achieved by including the string value attention to in the arglist. This entry is needed to account for passives such as The problem was paid attention to, where the object is not attention, but the problem, i.e. the object argument in the arglist for this second entry (cf. the section on double analysis, p. 19):

m_verb(vtrphrprep,part0:'attention to', pay_attention_to_1_a,
        pay,pay,pay,pays,paying,
        paid,paid,paid,trans,human,
        [string(oblig,posprec(1,0),[attention,to]),
         np(oblig,posprec(2,Wnp),object,_)]).

/* the problem should be paid attention to */

            We would also need another entry for ATTENTION to account for its uses outside the mwu PAY ATTENTION TO. The following would be appropriate:

m_noun(attention_2,attention,[abstract],[]).

            Note that here abstract is a true semantic feature, not a lexical one.

5.5. Performing the Checks

            SFOK is used to check that a semantic restriction placed by a verb or an adjective is satisfied by the noun phrase filling the verb's argslot or by the noun modified by the adjective.

            CHECKSEM is used when the two values are both semantic restrictions. They are checked for compatibility, i.e. it is ascertained whether they meet somewhere by going up the semantic trees they belong to.

5.5.1. Sfok

            To understand sfok, it is necessary to recall that nouns have semantic feature lists, whereas verbs and adjectives place semantic restrictions by means of single semantic features. For instance, the noun BOOK has the following entry in the lexicon:

m_noun(book_1,book,books,[thing,abstract],[pp(opt,n:topic)]).

where [thing,abstract] is its semantic feature list.

            On the other hand, the verb READ has the following entry:

m_verb(verbtr,_,read_1,read,read,read,reads,reading,
        read,read,read,trans,human,
        [np(opt,posprec(1,Wnp),object,abstract)]).

/* she was reading */

where human is the semantic restriction on its subject, and abstract on its object.

            To check whether a semantic restriction is satisfied, we try to find the feature embodying the semantic restriction placed by the verb on one of its arguments among the semantic feature list associated with the np or pp filling the argument position. The necessary list traversal is accomplished by inlist, which is basically a deterministic version of the member predicate:

inlist(X,X):- !.

inlist(Sem,[norestriction]):- !.

inlist(Sem,[Sem|X]):- !.

inlist(Sem,[_|X]):- inlist(Sem,X).

            The first clause takes care of the cases where one of the two arguments, or both, is an uninstantiated variable. In such a case the test must succeed.

            The second clause takes care of the special semantic feature list [norestriction] which is used with certain relative and interrogative pronouns. Here too the test succeeds, no matter what the semantic restriction placed by the verb or the adjective is.

            The third and fourth clauses code a deterministic version of the well-known member predicate.

            Sfok similarly opens with a number of clauses taking care of the special cases:

sfok(Sem,Sem):- !. 

/* this clause is used to assign a value to a variable; necessary when the NP is a gap; the semantic restriction in the VP will be projected onto the gap; it also takes care of cases where both arguments are variables, e.g. checking the semantic features of an optional argument that is not realized in the S to be parsed */

sfok(norestriction,X):- !.

sfok(X,norestriction):- !.

sfok([],X):- !.

            The last two clauses call inlist to accomplish the list traversal of the semantic feature list associated with the np or pp argument. The last one calls up, whose role, as we have seen, is to permit inheritance:

sfok(Sem,Semlist):- inlist(Sem,Semlist),!.

sfok(Sem,Semlist):- up(Sem1,Sem) ,
                                 inlist(Sem1,Semlist).
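
            The whole clause packet can be tested on the features used in this chapter. The sketch below assembles the clauses for inlist, sfok, up and up1 as given in the text (unused variables are written with an underscore); no other predicate is assumed, and the hierarchy contains only the two up1 clauses quoted earlier.

```prolog
up1(animal,living).
up1(human,living).

up(X,Y):- up1(X,Y).
up(X,Y):- up1(X,Z),up(Z,Y).

inlist(X,X):- !.
inlist(_,[norestriction]):- !.
inlist(Sem,[Sem|_]):- !.
inlist(Sem,[_|X]):- inlist(Sem,X).

sfok(Sem,Sem):- !.
sfok(norestriction,_):- !.
sfok(_,norestriction):- !.
sfok([],_):- !.
sfok(Sem,Semlist):- inlist(Sem,Semlist),!.
sfok(Sem,Semlist):- up(Sem1,Sem),
                    inlist(Sem1,Semlist).

% ?- sfok(abstract,[thing,abstract]).   % succeeds: direct match
% ?- sfok(living,[human]).              % succeeds: human is a subclass
% ?- sfok(human,[thing,abstract]).      % fails: no match, no subclass
```

            The second query shows inheritance at work: a restriction stated in terms of living is satisfied by a noun whose feature list contains human.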

5.5.2. Checksem

            As already said, we use checksem instead of sfok when we have two semantic restrictions, and want to know if they are compatible. This will be the case in the analysis of such sentences as the man wants to read the book, where what is passed on to the infinitive clause to read the book is the restriction that wants places on its subject. Consider the following piece of code:

satisfy([to|P0],P1,Gap,Prefgen,Posprec,Rel,Intrel,
                [Function,[clause,Tree]],
                np_vp(Type,to_inf,Function),
                subject([Sfunction,Treesubj],Semsubj1),
                subject([Sfunction,Treesubj],Semsubj1)):-
  xverbphrase(P0,P1,subject([subject,Treesubj],Semsubj2),
              Gap,Pref,Tree,Rel,
              Intrel,infinitive,Tense,
              aspect(Aspect),Modality,
              Number,Person,Voice,
              nsubject(NSUBJ,Nsem)),
  checksem(Nsem,Semsubj1),
  Prefgen is Pref + 4.

            The xverbphrase predicate will put its semantic requirement on its subject in the Nsem variable in the nsubject functor (remember that it can be submitted to such argument reshuffling transformations as passive). The Semsubj1 variable in the subject functor of the satisfy predicate records the semantic restriction on the subject of the verb that has already been parsed, wants in our case. We need to verify whether the two semantic restrictions are compatible, and we use checksem for this purpose.

            The code for checksem is the following.

            If one or both arguments are uninstantiated variables, or if the two arguments are unifiable, checksem succeeds:

checksem(X,X):- !.

            It may be the case that the second argument of checksem points to a list of features (this may happen through unification of semantic features), in which case we simply use sfok:

checksem(X,Y):- sfok(X,Y), !.

            If both arguments refer to semantic restrictions, they must be compatible, i.e. must meet - be identical - somewhere on the up path in the hierarchy:

checksem(X,Y):- up(X,X1),
                            checksem(X1,Y),
                            !.
/* X is more restrictive than Y */


checksem(X,Y):- up(Y,Y1),
                             checksem(X,Y1).

 /* Y is more restrictive than X */
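
            The behaviour of checksem can be observed on a minimal hierarchy. The sketch below combines the four checksem clauses with the sfok, inlist and up clauses given earlier; the hierarchy is limited to the two up1 clauses quoted in the section on inheritance, so that abstract, which has no parent here, acts as a separate root. That last point is an assumption of the sketch, not a claim about the full horatio hierarchy.

```prolog
up1(animal,living).
up1(human,living).

up(X,Y):- up1(X,Y).
up(X,Y):- up1(X,Z),up(Z,Y).

inlist(X,X):- !.
inlist(_,[norestriction]):- !.
inlist(Sem,[Sem|_]):- !.
inlist(Sem,[_|X]):- inlist(Sem,X).

sfok(Sem,Sem):- !.
sfok(norestriction,_):- !.
sfok(_,norestriction):- !.
sfok([],_):- !.
sfok(Sem,Semlist):- inlist(Sem,Semlist),!.
sfok(Sem,Semlist):- up(Sem1,Sem), inlist(Sem1,Semlist).

checksem(X,X):- !.
checksem(X,Y):- sfok(X,Y), !.
checksem(X,Y):- up(X,X1), checksem(X1,Y), !.
checksem(X,Y):- up(Y,Y1), checksem(X,Y1).

% ?- checksem(human,living).    % succeeds: human climbs up to living
% ?- checksem(human,abstract).  % fails: the two never meet
```

            Note that, with these clauses, checksem(human,animal) also succeeds, since both restrictions lead up to living; compatibility in this sense is weaker than identity.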

            Note that in the case of subject-to-subject raising we do not use checksem, because such raising verbs (such as SEEM) do not place restrictions on their subjects. We simply pass the restriction placed by the infinitive on its subject to the higher level (by percolation), and leave the sentence predicate to make the appropriate call on sfok:

satisfy([to|P0],P1,Gap,Prefgen,Posprec,Rel,Intrel,
                [Function,[clause,Tree]],
                vp(Type,Function),
                subject([Sfunction,Treesubj],Semsubj1),
                subject([Sfunction,Treesubj],Semsubj1)):-
  xverbphrase(P0,P1,subject([subject,Treesubj],Semsubj2),
              Gap,Pref,Tree,Rel,
              Intrel,infinitive,Tense,
              aspect(Aspect),Modality,
              Number,Person,Voice,
              nsubject(NSUBJ,Semsubj1)),
  Prefgen is Pref + 2.

6. Parsing Issues

6.1. Dealing with Flexible Word Order

            English is characterized by a word order that is neither wholly free nor entirely fixed. We shall therefore call it flexible. The first task is to isolate the relevant factors determining word order. For the cases dealt with in horatio, these are two: argument canonical order as specified in an item's arglist, and structural weight.

            Structural weight has priority over argument order, although the exact weighting of these two factors is not easy to determine, and a good deal of it is heuristics, i.e. trying various possibilities and seeing how well they fit the data.

            Argument order as determined in the lexical clause specifying the arglist is undoubtedly a factor. Consider the phrases built around the skeleton consider x y. If x and y have the same structural weight, x needs to refer to the object and y to the object's complement, not the other way round. In an interpretation where motorists is the object and criminals the complement, we can have

I consider motorists criminals

but not

* I consider criminals motorists

            However, if the object is weightier than the complement, it will follow:

I consider criminals the motorists who drink and drive.

            Sticking to the argument order specified in the rule would yield a much less acceptable sentence, also on account of the unintended object link between drive and criminals:

? I consider the motorists who drink and drive criminals.

            In horatio, we use the posprec structure to house both the argument order and to leave room for the weight, to be filled when the argument is actually parsed. Consider again the entry for ALLOW, third reading, repeated here for convenience:

m_verb(vio,allow_3,allow,allow,allow,allows,allowing,
        allowed,allowed,allowed,trans,human,
        [np(oblig,posprec(2,Wnp1),object,thing),
         io(oblig,posprec(1,W2),indirect_object,human,_)]).

            The first value in posprec can be determined in the argument list itself. Here, it is meant to convey the information that the indirect object should precede the direct object. The second value in posprec is left as a variable in the argument list. It will be instantiated when the actual object np and io (indirect object) are parsed. It reflects the weight of the element. For instance, a prepositional phrase will be assigned a certain weight, nps will differ in the weight they are assigned (for instance, an np consisting of a personal pronoun is very light, an np consisting of a head followed by a relative clause is heavy, etc.).

            As already said, priority is given to the second value over the first in posprec, to allow for end placing of heavy elements. The relative importance to be assigned to the two values of the posprec structure is a matter for investigation[14].
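            One simple way of giving weight priority over canonical position is to sort the parsed arguments on the pair (weight, position). The sketch below does just that. It is only an illustration: the predicate surface_order, the arg wrapper and the numeric weights are assumptions, and a strict lexicographic ordering is only one of the weightings that could be tried.

```prolog
% surface_order(+Args, -Names)
% Args is a list of arg(Name, posprec(Pos, Weight)) terms (hypothetical format).
% Lighter arguments come first; canonical position decides between equal weights.
surface_order(Args, Names) :-
    add_keys(Args, Keyed),
    keysort(Keyed, Sorted),          % standard order on k(Weight, Pos)
    strip_keys(Sorted, Names).

add_keys([], []).
add_keys([arg(Name, posprec(Pos, Weight))|T], [k(Weight, Pos)-Name|KT]) :-
    add_keys(T, KT).

strip_keys([], []).
strip_keys([_-Name|T], [Name|NT]) :-
    strip_keys(T, NT).

% Equal weights: canonical order wins (I consider motorists criminals).
% ?- surface_order([arg(complement, posprec(2,2)), arg(object, posprec(1,2))], L).
% Heavy object: it follows the complement
% (I consider criminals the motorists who drink and drive).
% ?- surface_order([arg(object, posprec(1,3)), arg(complement, posprec(2,2))], L).
```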

            The approach sketched here does not run into the problems that confront the CLE (Core Language Engine) approach (see Pulman 1992, in his section on subcategorization, p. 62 and foll.). He points out that the GPSG approach, as well as the CLE approach, does not "allow for the possibility of optional modifiers like PP or AdvP to appear between elements on a subcat list". To remedy this problem CLE has "a version of the subcategorization schema that splits the list of complements and allows a modifying structure to intervene". In Alshawi et al. 1992 we are given no further information on this splitting scheme. I believe that we should look for a solution that takes into account both arg order and weight. As has been said already, the crux lies in the weighting of these two factors.

6.2. Computing Preferences

            In horatio, each parse is assigned a preference, expressed by a positive number. The greater the number, the higher the preference. The preference of a parse is a function of the preferences assigned to its constituents. The preference mechanism is built around the time-honoured principle of the redundancy of natural language, i.e. the densest match - best match principle. In practice, it means that arguments are preferred to modifiers.

            Consider the following two predicate definitions. They both deal with the parsing of prepositional phrases. In the first case we have a pp as argument, in the second case as modifier.

satisfy(P0,P1,Gap,Prefgen,posprec(Pos,Precpp),Rel,Intrel,Tree,
                pp(Type,posprec(Pos,Precpp),
                   Function,PPsemnp,PPsemvp,Prepform),
               subject(SUBJ,Semsubj),
               subject(SUBJ,Semsubj)):-
         xprepphrase(P0,P1,Gap,index(J),npindex(I),
                     Prefpp,Precpp,Prepform,
                    Rel,Intrel,Function,Tree,
                    PPsem,PPsemnp),
         sfok(PPsemvp,PPsem),
         Prefgen is Prefpp + 4.


modifier(P0,P1,vp,Gap,Prefgen,posprec(1,Precpp),
         Rel,Intrel,Tree,subject(SUBJ,Sem)):-
         xprepphrase(P0,P1,Gap,index(J),npindex(I),Prefpp,Precpp,
                    Prepform,Rel,
                    Intrel,vp_modifier,
                   Tree,PPsem,PPsemnp),
         modppvp(Prepform),
         Prefgen is Prefpp + 2.

            Prefgen records the preference assigned to the phrase. In the first case (argument), it is the preference returned by the xprepphrase predicate plus 4. In the second case (modifier), the added value is only 2.
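            The effect of the two increments can be shown in isolation. In the sketch below, phrase_pref and best_parse are hypothetical helpers, not horatio predicates; we also assume, for the sake of the illustration, that competing readings are compared simply by picking the one with the greatest preference value.

```prolog
% Increment added to the preference returned for the pp itself.
phrase_pref(argument, PrefPP, Pref) :- Pref is PrefPP + 4.
phrase_pref(modifier, PrefPP, Pref) :- Pref is PrefPP + 2.

% best_parse(+Candidates, -Best)
% Candidates is a non-empty list of Pref-Reading pairs.
best_parse([First|Rest], Best) :-
    best_parse(Rest, First, Best).

best_parse([], Best, Best).
best_parse([P-R|Rest], P0-_, Best) :-
    P > P0, !,
    best_parse(Rest, P-R, Best).
best_parse([_|Rest], Best0, Best) :-
    best_parse(Rest, Best0, Best).

% ?- phrase_pref(argument, 1, A), phrase_pref(modifier, 1, M),
%    best_parse([M-modifier_reading, A-argument_reading], Best).
% The argument reading is selected, since 1 + 4 > 1 + 2.
```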

            Consider now how the preference value is computed at the highest level, i.e. for whole sentences. Here is the relevant definition of the sentence predicate:

sentence(P0,P2,Gaps,Prefs,[clause,VP],Type,Personvp,
                  Number,Voice):-
append(Gapnp,Gapvp,Gaps),
xnounphrase(P0,P1,Gapnp,index(I),Prefnp,Weight,Rel1,
                 Intrel1,subject,SUBJ,Number,Personnp,Semsubjnp) ,
var(Rel1),
agree(Personnp,Number,Personvp),
xverbphrase(P1,P2,subject(SUBJ,Semsubjvp),
                 Gapvp,Prefvp,VP,Rel2,
                 Intrel2,Type,Tense,aspect(Aspect),Modality,
                 Number,Personvp,
                 Voice,
                 nsubject(NSUBJ,Nsemsubjvp)),
var(Rel2),
sfok(Nsemsubjvp,Semsubjnp),
accu(Prefs,[Prefnp,Prefvp,4]).

            The preference value for the whole S (Prefs) results from the accumulation (accu predicate) of the preference values of the subject np (Prefnp) and the verb phrase (Prefvp). But an S is more than the concatenation of an np and a vp. The np must be able to function as subject of the vp (in this sentence pattern) and this is ensured by the grammatical and semantic agreement checks (agree and sfok predicates). This link is the reason for the addition of the value 4 to the accumulated np and vp preference values. How do we find out that 4 is the appropriate value to add here? The computing of preference values, like that of precedence values, is very much a question of heuristics, i.e. finding out by experiment which values give the best results.

            Accu is a simple predicate which returns in its first argument the sum of the values listed in its second argument, which must be a list. The mode is the following:

accu(-Totalvalue,+Valuelist).

            Accu is defined as follows:

accu(0,[]).

accu(Res,[Head|Tail]):-
                           nonvar(Head),
                           accu(Respart,Tail),
                           Res is Head + Respart.

accu(Res,[Head|Tail]):-
                           var(Head),
                           accu(Res,Tail).

            The first clause is the non-recursive case: an empty value list yields a total value of zero.

            The second clause calls accu recursively on the tail of the value list, and adds the value of the head to the result, provided the head is instantiated.

            The third clause takes care of uninstantiated heads: they are simply dropped, and accu is called on the tail of the list to return the final result.
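            For comparison, the same behaviour can be obtained with an accumulator, which makes the recursion a tail call. This variant is offered as a sketch only; the name accu_acc is ours and the predicate is not part of horatio.

```prolog
% accu_acc(-Totalvalue, +Valuelist)
% Same mode as accu: sums the instantiated values, skips the variables.
accu_acc(Total, Values) :-
    accu_acc(Values, 0, Total).

accu_acc([], Total, Total).
accu_acc([Head|Tail], Acc0, Total) :-
    (   var(Head)
    ->  Acc = Acc0                 % uninstantiated head: dropped
    ;   Acc is Acc0 + Head         % instantiated head: added to the running sum
    ),
    accu_acc(Tail, Acc, Total).

% ?- accu_acc(Prefs, [3, _, 4, 4]).    the unbound second element is ignored
% ?- accu_acc(Prefs, []).              an empty list yields zero
```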

6.3. Hard Coordination

            horatio can deal with certain types of so-called hard coordination. The reader is referred to the last two sample parses (see Appendix F, page 155). The main principle is that only like elements can be conjoined, but the problem consists in determining the conditions that must be fulfilled for two elements to be regarded as similar enough to be coordinated. Here the argument list plays a crucial part, and its availability in the lexical entries themselves is a crucial condition for the strategies developed here to work.

            In order to exemplify the approach taken in horatio, we shall look at coordination at the sentence level. We first deal with the easiest case: the two clauses do not display any gap. The only requirement is that they be of the same Type, either finite or non-finite.

            For each linguistic phrase (from the clause level down) that can be coordinated, we define a super-predicate (by convention its name should begin with x), itself defined as a call to the simpler predicate, the parsing of a conjunction, and a recursive call on the complex predicate (the rules are therefore right-recursive). Of course, the second defining clause of the complex predicate defines it as a call on the simpler predicate, to get out of the recursion. In this way, we avoid the well-known problem of left-recursive rules (e.g. np --> np coord np) for a top-down parser such as the one used in horatio.

            At the clause level, we therefore have the following piece of code, where xsentence is the complex predicate and c_sentence the simpler one. Note that the gap value (third argument) is set to the empty list: the two clauses do not feature any gap.

xsentence(P0,P2,[],Prefs,[and_sentence,S1,S2],Type,Person,
            Number,Voice):-
inlist(and,P0),
c_sentence(P0,[and|P1],[],Pref1,S1,Type,Person,Number,
            Voice),
xsentence(P1,P2,[],Pref2,S2,Type,Person2,Number2,Voice2),
accu(Prefs,[Pref1,Pref2]).

/* getting out of the recursion */
 xsentence(A,B,C,D,E,F,G,H,I):-
   c_sentence(A,B,C,D,E,F,G,H,I).

            We also note the simplistic call on inlist. It checks that the word AND is a member of the remaining word list[15]. This call is for efficiency only, and would have to be revised or dropped in a version of horatio that attempted to deal with other coordinating conjunctions than AND (recall that the comma is the coordinating conjunction of choice when the coordination has more than two members).
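            The right-recursive scheme can be tried out on a much smaller scale. The following sketch applies it to bare proper names; the three-word lexicon, the tree labels and the has_word predicate (a stand-in for inlist, whose actual definition may differ) are assumptions made for the illustration.

```prolog
% has_word(+Word, +Words): deterministic membership test.
has_word(Word, [Word|_]) :- !.
has_word(Word, [_|Rest]) :- has_word(Word, Rest).

% Simple (non-coordinated) noun phrases over a toy lexicon.
np([john|P], P, np(john)).
np([mary|P], P, np(mary)).
np([bill|P], P, np(bill)).

% Complex predicate: simple np, AND, recursive call on the complex predicate.
xnp(P0, P2, [and_np, NP1, NP2]) :-
    has_word(and, P0),
    np(P0, [and|P1], NP1),
    xnp(P1, P2, NP2).
% Getting out of the recursion.
xnp(P0, P1, NP) :-
    np(P0, P1, NP).

% ?- xnp([john, and, mary, and, bill], [], Tree).
% Tree = [and_np, np(john), [and_np, np(mary), np(bill)]]
```

            Because the simple predicate is called first and must consume at least one word, the recursive call always works on a shorter word list, and the parser terminates.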

            We next consider the case where we have a gap. It needs to be shared by the two conjoined clauses. Consider:

The linguistics which he thinks he teaches and she likes ...

            The part following which (he thinks he teaches and she likes) is a conjunction of two S's sharing the same gap, namely the relative pronoun. Sharing the gap implies here that the gap is of the same type (np or pp) and points to the same antecedent (and also, of course, has the same index). We therefore have the following piece of code:

xsentence(P0,P2,
           [gap(_,_,_,_,FunctioninS2,_,index(I),_)],
            Prefs,[and_sentence,S1,S2],Type,Person,Number,Voice):-
  inlist(and,P0),
  c_sentence(P0,[and|P1],
             [gap(Ref,Gaptype,_,_,FunctioninS1,_,index(I),_)],
            Pref1,S1,Type,Person,Number,Voice),
  nonvar(FunctioninS1),
  xsentence(P1,P2,
           [gap(Ref,Gaptype,_,_,FunctioninS2,_,index(I),_)],
            Pref2,S2,Type,Person2,Number2,Voice2),
  nonvar(FunctioninS2),
  accu(Prefs,[Pref1,Pref2]).

            Note here the calls on nonvar, to make sure that the gaps are real. Function assignment can only happen in the clause itself: if the function value is not left uninstantiated, it is evidence that a piece of structure was missing, since a function was assigned to its place-holder (the gap).

            The next case we consider is that of a gap that is filled somewhere to the right (the gaps in the previous case were filled on the left, as in interrogative or relative clauses). An example is

John likes and Mary reads the book

which we will analyse as two gapped S's (1: John likes GAP; 2: Mary reads GAP) followed by the gap filler (the book).

            Here we require that the gap fill the same function in the two coordinated S's (the object function in the example), and we check that the gap filler is semantically OK for both gaps. The gap functor has two positions for semantic restrictions (one for np gaps and one for pp gaps) and we carry out the check between the appropriate slot in the gap functor and the semantic feature list associated with the noun phrase or prepositional phrase filling the gap. Here is the code in the case of np gaps:

xsentence(P0,P3,[],Prefs,[and_sentence,S1,S2,NP],Type,
            Person,Number,Voice):-
  inlist(and,P0),
  sentence(P0,[and|P1],
       [gap(NP,npgap,Npsem1,_,FunctioninS1,_,index(I),_)],
       Pref1,S1,Type,Person,Number,Voice),
  nonvar(FunctioninS1),
  sentence(P1,P2,
       [gap(NP,npgap,Npsem2,_,FunctioninS2,_,index(I),_)],
       Pref2,S2,Type,Person2,Number2,Voice2),
  nonvar(FunctioninS2),
  FunctioninS1=FunctioninS2,
  xnounphrase(P2,P3,[],index(I),Pref,Weight,Rel,
            Intrel,FunctioninS1,NP,Numbernp,Personnp,Sem),
  sfok(Npsem1,Sem),
  sfok(Npsem2,Sem),
  accu(Prefs,[Pref1,Pref2]).

            Note that at the highest level the clause resulting from the coordination of the two gapped clauses and the gap filler must feature an empty list as gap value. This is in contrast with the preceding case where the gap was percolated to the resulting clause.

            The most complex case is that of clauses which are gapped in their main verb, such as

Mary teaches linguistics and John mathematics.

            Matsumoto 1991 leaves such cases to be dealt with by a device which does not belong to the main parsing scheme, the Extra-Grammatical Sentence Module. He writes (page 10): "Because the Extra-Grammatical Sentence Module is activated only when no sentence structure is found, the gap type of coordination handling is never tried if there is at least one possible parse obtained from the input". One of the sentences he discusses is the following:

John will cook the meals today and Barbara tomorrow (missing vp in the second S).

If the 'cannibalistic' reading is not rejected, it will preclude finding the 'gapped' reading. Matsumoto is aware of this danger (p.11), although he has not seen that the interpretation of one of his example sentences falls prey to it.

            The treatment of such gapped S's in horatio crucially depends on the availability of frame information in the lexicon. As a matter of fact, we need to go into the structure returned by the parsing of the first clause, get at the Class and Lex value for the verb, call the corresponding macro-clause (m_verb) to get at the Args required by the predicate, and attempt to parse what follows the subject of the second clause with a call to the arglist predicate, to which the required Args and Class values are passed. The m_verb clause also yields the semantic restriction on the subject and we use it to make sure that the subject of the second clause is semantically appropriate to its (gapped) verb. The following is the commented code for this more complex case:

xsentence(P0,P4,[],Prefs,
  [and_sentence,
   [clause,  [pred_arg_mod_structure,
               prop(vce:V,asp:A,mod:Modality,tns:Tense),
               [predicate(Lex,Agr)],
               SParse]],
   [clause,  [pred_arg_mod_structure,
               prop(vce:V,asp:A,mod:Modality,tns:Tense),
               [predicate(Lex,_)],
               SParse2]]
   ],
   Type,Person,Number,V):-
  inlist(and,P0),
  xnounphrase(P0,P1,[],index(I),Prefnp1,Weight1,Rel1,
                 Intrel1,subject,SUBJ,Number,Personnp,Semsubjnp) ,
/* e.g. Mary */

  var(Rel1),
  agree(Personnp,Number,Person),
  verbphrase(P1,[and|P2],subject(SUBJ,Semsubj),[],Prefvp1,
               [pred_arg_mod_structure,
               prop(vce:V,asp:A,mod:Modality,tns:Tense),
               [predicate(Lex,Agr)],
              SParse],
             Relvp1,Intrelvp1,Type,Tens,aspect(Aspect),Mod,
             Number,Person,Voice,nsubject(NSUBJ,Nsem)),
/* e.g. teaches linguistics */

/* P2,P3: subject of the second S, e.g. John */

  xnounphrase(P2,P3,[],index(I2),Prefnp2,Weight2,Rel2,
            Intrel2,subject,NP,Number2,Personnp2,Sem),
  var(Rel2),
  m_verb(Class,_,Lex,
         _,_,_,_,_,_,_,_,_,
         Semsubj,Args),

/* P3,P4: argument list in the second S */
/* e.g. mathematics */

  arglist(P3,P4,[],Status,Prefvp2,Preclist,Relvp2,
               Intrelvp2,Voice,Parse2,vp,
               Args,Func,subject(NP,Semsubj),
               nsubject(NSUBJ2,Nsem),
               Class),
  nonvar(Status),
  append([NSUBJ2],Parse2,AParse2),

/* the append predicate is used to append the subject at the beginning of the parse tree produced by arglist, from which the subject is excluded */

  insort(AParse2,SParse2),
  sfok(Nsem,Sem),
  accu(Prefs,[Prefnp1,Prefvp1,Prefnp2,Prefvp2]).

A Note on Generation

            The generation of coordinate structures is straightforward: we generate the first conjunct, append the coordinating conjunction AND, and then generate the second conjunct (recall that horatio knows only about one coordinator, AND). But there is one case that is slightly more complex. It is due to the fact that sometimes the parser factors out a piece of information, such as for instance the Property field (modality, aspect, voice, tense) in the case of coordinated YES-NO questions. We therefore need to redistribute the factored out information. We first make sure that factorisation has taken place by checking that the factored out element is uninstantiated at the places where it will have to be redistributed. The redistribution is accomplished by the following piece of code:

gen([yes_no_question,[clause,[and_pred_arg_mod_structure,Properties,
[pred_arg_mod_structure,Vprop1|Rest1],
[pred_arg_mod_structure,Vprop2|Rest2]]]],
Sentence):-
    var(Vprop1),
    var(Vprop2),
    gen([yes_no_question,[clause,[pred_arg_mod_structure,
     Properties|Rest1]]],S1),
    gen([yes_no_question,[clause,[pred_arg_mod_structure,
     Properties|Rest2]]],S2),
    append(S1,[and],S1and),
    append(S1and,S2,Sentence).

            Such redistribution also applies to the Function value in the case of coordinated nps:

gen([Function,[and_nounphrase,NP1,NP2]],Gennp):-
    gen([Function,NP1],NP1gen),
    gen([Function,NP2],NP2gen),
    append([NP1gen],[and],L1),
    append(L1,[NP2gen],Gennp).
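            The redistribution step can be reproduced on a drastically simplified tree format. In the sketch below, the and_s and s labels, the prop(Tense) field and the form facts are all assumptions made for the illustration; only the pattern (check with var, then reinsert the shared field in each conjunct) mirrors the code above. We assume that append/3 is available, as it is in most Prolog systems.

```prolog
% Hypothetical verb forms: form(Lemma, Tense, Word).
form(sing,  past,    sang).
form(dance, past,    danced).
form(sing,  present, sings).
form(dance, present, dances).

% Coordination with a factored-out property field (Props).
gen([and_s, Props, [s, P1|Rest1], [s, P2|Rest2]], Sentence) :-
    var(P1),                         % factorisation has taken place:
    var(P2),                         % both property slots are still empty
    gen([s, Props|Rest1], S1),       % redistribute Props to the first conjunct
    gen([s, Props|Rest2], S2),       % and to the second
    append(S1, [and|S2], Sentence).
% Simple clause: subject followed by the inflected verb.
gen([s, prop(Tense), Subject, Verb], [Subject, Word]) :-
    form(Verb, Tense, Word).

% ?- gen([and_s, prop(past), [s, _, john, sing], [s, _, mary, dance]], W).
% W = [john, sang, and, mary, danced]
```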

6.4. Long Distance Dependencies

            The treatment of long distance dependencies is similar to that suggested in GPSG (Generalized Phrase Structure Grammar) and MLG[16] (Modular Logic Grammar). Gap threading is used (via APPEND, not via difference list techniques). In horatio we use two types of gaps: np gaps and pp gaps. Consider:

The house in which he lives is nice.
The house he lives in is nice.

            In the first relative clause (he lives) we have a pp gap filled in by the relative prepositional phrase in which (a relative prepositional phrase is a prepositional phrase which contains a relative np). In the second relative clause (he lives in) we have an np gap filled by the zero relative pronoun. Coindexing will ensure that the np within the pp in the first case is coindexed with the antecedent the house; in the second case the zero relative will likewise be coindexed with the antecedent the house.

            Np gaps come into existence when an argbearer misses an np argument in the word list it is attempting to parse, i.e. cannot satisfy it as a regular np and calls on the relevant clause in the nounphrase predicate, which returns something without consuming anything in the input list (hence the P0,P0 values as input and output lists in the code quoted below). When the gap is parsed, the only value that is set is that of the function played by the gap. This is the reason why we often check whether this value is instantiated: if it is, we can be sure that the argbearer has missed one of its arguments. All the other values may come to be instantiated by unification, as will become clear when we discuss the percolation of the features in the gap functor. Here is the clause for the parsing of noun phrase gaps:

nounphrase(P0,P0,
        [gap(Antecedent,npgap,Npsem,_,Function,_,index(I),_)],
             index(I),0,0,Rel,Intrel,Function,
             [Function,
              [nounphrase,index(I),agr(third,Number),NP]],
             Number,third,Npsem).

            The zero values are the Weight and Preference values. Note that in the parse tree of a gapped np, only the function, index and agreement value are kept; it should be noted that NP is an uninstantiated variable, not a copy of the antecedent.
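            Gap threading with append, and the fact that the gap clause sets nothing but the function, can be observed in a toy grammar. The sketch below is an illustration under heavy simplification: the gap functor is reduced to its Function slot, and the predicates s, vp, np, noun and verb as well as the lexicon are assumptions, not horatio code. We assume that append/3 is available.

```prolog
noun(he).
noun(she).
noun(linguistics).

verb([likes|P],   P, likes).
verb([teaches|P], P, teaches).

% np(P0, P1, Gaps, Function, Tree)
np([Word|P], P, [], Function, np(Function, Word)) :-
    noun(Word).
% Gap: consumes no input (P,P) and instantiates the Function slot.
np(P, P, [gap(Function)], Function, np(Function, gap)).

vp(P0, P2, Gaps, vp(Verb, Object)) :-
    verb(P0, P1, Verb),
    np(P1, P2, Gaps, object, Object).

% Gaps must be a list of known length when s is called,
% so that append/3 only enumerates its finitely many splits.
s(P0, P2, Gaps, s(Subject, VP)) :-
    append(GapsNP, GapsVP, Gaps),
    np(P0, P1, GapsNP, subject, Subject),
    vp(P1, P2, GapsVP, VP).

% ?- s([he, likes], [], [gap(F)], T).              F = object
% ?- s([likes, linguistics], [], [gap(F)], T).     F = subject
% ?- s([he, likes, linguistics], [], [gap(F)], T). fails: there is no real gap
```

            After a successful call, nonvar(F) holds, which is the test used throughout to make sure that a gap has really been found.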

            The gap functor holds the following features:

Antecedent: this feature can hold a full copy of the antecedent of the gap;

gap type: npgap or ppgap;

Npsem: semantic restriction on the np;

PPsem: semantic restriction on the pp;

Function: function filled by the gap in clause structure;

Prep : preposition, if any;

Index: we keep an index, so that we do not need a full copy of the antecedent to show the coindexing relation;

NPindex: indexes the np within a pp; necessary for such pps as about whom, where the np whom must be coindexed with its antecedent.

             A pp gap is similar. Here is the relevant clause for the prepphrase predicate:

prepphrase(P0,P0,
     [gap(Antecedent,ppgap,Semnp,Sempp,Function,Prep,index(I),npindex(J))],
     index(I),npindex(J),0,0,Prep,Rel,Intrel,Function,
     [Function,[prepphrase,index(I),_,_]],
      Sempp,Semnp).      

            Note that the two index values are kept distinct. Only the index value for the whole pp is preserved in the parse tree.

            If, apart from Function, the values in the gap functor do not come from the parsing of the missing np or pp, where do they come from? In order to show the instantiation of the values carried out by unification (of which percolation is a special case, being unification up or down a tree), we shall examine the treatment of relative clauses.

            The passing of values is to be found at two places:

1) in a defining clause for nounphrase which accounts for nps governing a relative clause, we percolate the features from the antecedent to the gap functor carried by the relative clause predicate;

2) in the defining clauses for relative clauses, we percolate the values from the gap functor in the relative clause to the xsentence predicate that parses the body of the relative clause. It follows that the xsentence predicate must also bear a gap functor (the gap may come from the subject or from an argument in the arglist of an argbearer that is a constituent of the S).

            Two things need to be kept in mind. The first is that even if we like to think of a direction in the percolation process, it has none. It is a relation between values, not even a process, and certainly not a copy from one place to another. We have already seen that the Function value could not be assigned before the parsing of the gap.

            The second is that in the process just outlined, it is the gap functor carried by the relative clause predicate which provides the link between the antecedent and the sentence (with a missing arg) which makes up the body of the relative.

            Let us consider first the nounphrase clause which takes care of nps governing a relative clause. We have already looked at it (in the section on nps, see p.55), but this time we shall concentrate our attention on the unification pattern between antecedent and gap functor in the relative clause predicate. Here is the code:

nounphrase(P0,P2,[],index(I),
           Preftop,3,_,_,Function,
           [Function,Newrest],
           Number,third,Sem):-
 corenounphrase(P0,P1,[],index(I),
                  Pref,Weight,Rel,Intrel,Function,
                  [Function,Rest],
                  Number,third,NounSem),
 ( var(Intrel); Intrel \= int ),      /* not an interrogative */
 var(Rel),
 xrelclause(P1,P2,Prefrel,RELCL,
               [gap(_,_,Npsem,_,_,_,index(I),_)],
               Personinrel,Funcinrel,NounSem,Number),
 sfok(Npsem,NounSem),
 append(Rest,[RELCL],Newrest),
 accu(Preftop,[Pref,Prefrel]).

            The shared information covers the index, which is the means of indicating coindexing relationships in horatio. Note that the semantic feature list of the antecedent np (NounSem) is not copied to the gap functor. The latter will hold a semantic restriction (Npsem), and the call to sfok will make sure that the antecedent fits the semantic restriction placed in the relative clause.

            We also see that Number and NounSem are transferred from the antecedent to the relative clause predicate (xrelclause), but are not part of the gap functor. We shall understand the rationale behind such transfer when we look at the xrelclause predicate.

            Let us take one type of relative clause to follow the fate of the percolated features: we shall look at relative clauses which feature a relative pronoun (non-zero relative) and an np gap. We should bear in mind that the last two features of the relclause predicate are copied from the antecedent: antecedent semantic feature list and antecedent number, in that order.

            The code is the following:

/*   the man whom the woman likes
   the man the book about whom the woman likes */

relclause(P0,P2,Prefrel,[relative_clause,NP,S],
               [gap(_,npgap,Semrel,_,_,_,index(J),_)],
               Personrel,Functioninrel,NounSem,Numberant):-
     nounphrase(P0,P1,[],index(I),Prefnp,Weight,Semrel,
                rel(J),Functioninrel,
                NP,Number,third,Semwholenp),

/* bear in mind that the structure spanned by nounphrase is NOT the antecedent, but the relative pronoun (e.g. which) or the noun phrase containing a relative element (e.g. a book about whom) */

/* I and J can be the same, but need not: in a book about whom you have read I would refer to book, J to whom */

     nonvar(Semrel), /* the np must have a rel feature */
     sentence(P1,P2,
                 [gap(_,npgap,Npsem,_,Functioninrel,_,index(I),_)],
                 Prefs,S,finite,Personrel,Numberrel,Voice),
     nonvar(Functioninrel),
     ( Weight \= 1 , sfok(Npsem,Semwholenp),
       ifthen((Functioninrel=subject),
            agree(third,Number,Personrel))
;
       Weight = 1 , sfok(Npsem,NounSem),
        ifthen((Functioninrel=subject),
            agree(third,Numberant,Personrel))) ,
     accu(Prefrel,[Prefnp,Prefs]).

/* checking the Weight value is a slightly roundabout way of getting to know whether we have a relative pronoun alone (such as whom - Weight is 1) or an np containing a relative (such as a book about whom - Weight is heavier than 1). */

            The only feature that is directly shared between the gap functor in the relclause predicate and that in the sentence predicate is the gap type: here both gaps need to be np gaps. However, the sharing of values is more extensive if one considers the features in the nounphrase predicate (parsing the relative pronoun, or the noun phrase containing a relative pronoun). First of all, an index is shared: the index of the relative element within the noun phrase (J) is to be found at the relclause level. Second, the Semrel at the relclause level is shared by the nounphrase predicate. Semrel will be the same as Semwholenp when the noun phrase consists only of a relative pronoun. The relevant piece of code for the parsing of relative pronouns is repeated below:

nounphrase(P0,P1,[],index(I),
                 0,1,Semnp,rel(I),Function,
                 [Function, [nounphrase,index(I),agr(third,Number),Tree]],
                 Number,third,[Semnp]):-
 relative(P0,P1,Tree,Semnp,Sempp,np).

            We see that here the rel index and the index for the whole np are also shared. The semantic feature is shared, but inserted as a one-element list to conform to np semantics, which is always in list format.

            At the level of the sentence predicate, the semantic restriction on the gap is expressed in the Npsem feature. If we have a noun phrase that consists of a relative pronoun only (the man whom he likes), its weight will be 1, and the semantic check can be carried out between the semantic restriction placed by the sentence predicate on the gap (Npsem) and the semantics of the antecedent itself, which is housed in the last argument but one of the relclause predicate (NounSem). If the noun phrase contains a relative pronoun, but is not limited to such a pronoun (as in the man a book about whom I have read), its weight will be different from 1, and the semantic test is to be carried out with the semantics of the whole np as second argument (Semwholenp).

            If the noun phrase consists of a relative pronoun only (Weight equals 1), and that relative is subject in the relative clause, we need to check subject-verb agreement in the relative clause. We can do so because the number of the antecedent has been percolated as last argument of the relclause predicate (Numberant), and the person feature is percolated to sentence level in the sentence predicate (Personrel). The relative pronoun is always third person[17], so we have a check where the first two arguments (the np ones) are third and Numberant (the number of the antecedent) and the third argument (the verbal one) is Personrel carried by the sentence predicate.

            On the other hand, if the subject of the relative clause is a whole np containing a relative pronoun (Weight is different from 1), as in The genius a book about whom has just been published, we need to check agreement with the second argument representing the number of the whole np, not that of the relative pronoun within it. We therefore have

agree(third, Number, Personrel),

where Number is the number of the whole np, returned by the nounphrase predicate.
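            The choice made on the basis of Weight can be summed up in a small selector which, given the values available in relclause, returns the semantic feature list and the number against which the checks are to be run. The predicate rel_checks is hypothetical and only restates the disjunction in the relclause code; the feature values in the queries are invented.

```prolog
% rel_checks(+Weight, +Semwholenp, +Number, +NounSem, +Numberant,
%            -SemToCheck, -NumberToCheck)
% Weight 1: relative pronoun alone, so the antecedent supplies both values.
rel_checks(1, _Semwholenp, _Number, NounSem, Numberant, NounSem, Numberant) :- !.
% Any other weight: np containing a relative, the whole np supplies them.
rel_checks(_Weight, Semwholenp, Number, _NounSem, _Numberant, Semwholenp, Number).

% the men whom she likes: checks run against the antecedent
% ?- rel_checks(1, [human], Number, [human], plural, Sem, Num).
% the men a book about whom has been published: checks run against a book
% ?- rel_checks(3, [thing], sing, [human], plural, Sem, Num).
```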

            To illustrate pp gaps we shall look at interrogative clauses. We take as illustration finite interrogative clauses featuring a pp gap, as in

He knew in whose library she lived.

            We analyse the interrogative clause as a prepositional phrase followed by a sentence featuring a pp gap, and we rely on unification to relate the pp with the pp gap. The code is the following:

intclause(P0,P2,Prefint,[interrogative_clause,PP,S]):-
     xprepphrase(P0,P1,[],index(I),npindex(J),Prefpp,
            Weight,Prep,
                 Semrelint,int,Function,
                 PP,PPsem,Semnp),
     nonvar(Semrelint),
     sentence(P1,P2,
                 [gap(PP,ppgap,Npsem,PPsem,Function, P_Prep,index(I),npindex(J))],
                 Prefs,S,finite,Personint,Numberint,Voice),
     nonvar(Function),
     accu(Prefint,[Prefpp,Prefs]).

            In order to understand the need for two indices in pps (index and npindex), we return to the treatment of relative clauses similar to the interrogative clause we have just looked at. Consider

the man in whose library she lived

We need to establish a link between whose and the man. This link will be carried by the rel(X) functor within the pp.

            First, we must recall the structure of the lexical clause accounting for relative whose:

 determiner([whose|X],X,[det(whose),index(I)],_,norestriction,rel(I)).

            The last argument (rel(I)) has a variable position for an index - I - which will be instantiated so as to keep track of the relation between whose and its antecedent. Note that this index is also returned in the parse tree associated with determiner whose.

            Second, note that the rel(I) functor is percolated from the determiner to the noun phrase level when the np is parsed. It is called Intrel1 at the determiner level in the piece of code below, and Intrel at the np level.

corenounphrase(P0,P3,Gap,index(I),Pref,Weight,Rel,Intrel,
            Function,
                      [Function,[nounphrase,index(I),
                             agr(third,Number),DET,N,Parse]],
                    Number,third,Sem):-
     determiner(P0,P1,DET,Number,Rel1,Intrel1) ,
     xnoun(P1,P2,N,Number,Sem,Arglist),
     satisfylist(P2,P3,Gap,Status,Pref,Prec,Rel2,Intrel2,_,
                 Parse,np,Arglist,Func,
                 subject([subject,[nounphrase,index(I),
                          agr(third,Number),DET,N]],
               Verbsem)),
     sfok(Verbsem,Sem),
     ( var(Rel1),var(Intrel1),Rel=Rel2,Intrel=Intrel2;
      nonvar(Rel1),nonvar(Intrel1),Rel=Rel1,Intrel=Intrel1 ),
     ifthenelse(var(Status),Weight=2,Weight=3).

            From the np it will be percolated to the pp:

prepphrase(P0,P2,Gap,index(J),
            npindex(I),Prefnp,3,PREP,Rel,Intrel,Function,
                  [Function,[prepphrase,index(J),prep(PREP),NP]],
                  Sempp,Semnp):-
     prep(P0,P1,[prep(PREP)],Sempp) ,
     nounphrase(P1,P2,Gap,index(I),Prefnp,Precnp,Rel,
                Intrel,np_arg_of_prep,NP,Number,Person,Semnp).

            We can now look at the defining clause for relative clauses where the gap is a pp gap:

relclause(P0,P2,Prefrel,[relative_clause,PP,S],
               [gap(_,ppgap,Semrel,PPsem,_,Prep,index(J),npindex(I2))],
               Personrel,Function,NounSem,Numberant):-
     xprepphrase(P0,P1,[],index(I),npindex(I2),
            Prefpp,Weight,Prep,
                  Semrel,rel(J),
                  Function,
                  PP,PPsem,Semnp),
     nonvar(Semrel),
     sentence(P1,P2,
                 [gap(PP,ppgap,Npsem,PPsem,Function,P_Prep, index(I),npindex(I2))],
                 Prefs,S,finite,Personrel,Numberrel,Voice),
     nonvar(Function),
     accu(Prefrel,[Prefpp,Prefs]).

            The index of the rel functor at the pp level must be the same as that of the index functor in the relclause gap, i.e. point to the antecedent. In our example sentence, I would refer to the whole pp (in whose library), I2 would refer to the np within the pp (whose library), and J to the antecedent (the man).

7. Generation Issues

            There is some research in generation with unification grammars which tends towards the ideal of reversibility: the same grammar is used in analysis and generation (cf. Isabelle et al. 1988; Dymetman and Isabelle 1990). By same grammar is meant here the same code embodying the grammar, not just the same linguistic concepts.

            horgen, the generation counterpart of horatio, does not use the same grammar. The reason is that parses are supposed to be fully unambiguous, so generation ought to be much simpler than analysis, which is faced with potentially and actually ambiguous inputs.

            In this piece of research generation is used mainly as a test, to make sure that the surface text is retrievable from the parses produced by the analyzer. Such a property of parses is crucial in machine translation.

            In horgen generation is not backtrackable: only one surface string is produced per parse. The non-backtrackability of generation implies that the clause order of the generation predicates is significant, and that the model "exceptions first, general case last" can be used without necessarily making use of the cut.
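            The point about clause order can be illustrated with a minimal sketch in standard Prolog. The predicates plural/2 and gen_plural/2 and their toy data are hypothetical and are not taken from horgen; once/1 merely stands in for whatever keeps the generator from backtracking.

```prolog
% Exceptions first, general case last. No clause contains a cut.
plural(mouse, mice).
plural(child, children).
plural(Noun, Plural) :- atom_concat(Noun, s, Plural).

% The caller accepts only the first solution, so the general clause
% is never tried for a noun that has an exceptional form.
gen_plural(Noun, Plural) :- once(plural(Noun, Plural)).
```

With the second argument unbound, gen_plural(mouse, P) yields only the exceptional form, while a noun without an exception falls through to the general clause.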

7.1. Freezing the Variables in the Parses

            The parses produced by horatio account for coindexing by means of shared variables in the index(VAR) functors. We need to convert them to strings before we start generating from them, to prevent generation from running wild. The conversion of variable to string is achieved by a small script applied to the parses before feeding them to horgen.

            This script is a Kornshell script executable under the MKS implementation of Unix utilities for OS/2 and MS-DOS. It runs as follows:

sed -E -e "s/index\(\_(....)\)/index\(\$\1\$\)/g"  $1 > tmp.tmp
if [ -s tmp.tmp ]
 then
     mv tmp.tmp $1
 else
     echo "Error !!!"
fi

            The script calls sed, the UNIX stream editor. The command performs a global (g) substitution (s) on the source pattern index(_....), where the dots stand for alphanumeric characters. The target pattern includes the four alphanumeric characters (the digit 1 in the target pattern refers to the first parenthesized expression in the source pattern). They appear between dollar signs, the string delimiter in Arity Prolog. The variable marker (the underscore) in the source pattern is not copied to the target pattern. The net result is to transform variables into strings. An example is given below in the section on the cycle.
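            The same freezing step could in principle be carried out inside Prolog instead of by an external script. The following is a minimal sketch in standard Prolog, not the method used for horgen: freeze_indices/1 and the generated names idx1, idx2, ... are hypothetical, and atoms are used where Arity Prolog would use dollar-delimited strings.

```prolog
% freeze_indices(+Term): bind every unbound variable that is the argument
% of an index/1 functor to a fresh atom (idx1, idx2, ...).
freeze_indices(Term) :-
    freeze_indices(Term, 0, _).

freeze_indices(Term, N, N) :-
    var(Term), !.                   % variables outside index/1 stay free
freeze_indices(index(V), N0, N) :-
    var(V), !,
    N is N0 + 1,
    number_codes(N, Codes),
    atom_codes(Suffix, Codes),
    atom_concat(idx, Suffix, V).
freeze_indices(Term, N0, N) :-
    compound(Term), !,
    Term =.. [_|Args],
    freeze_args(Args, N0, N).
freeze_indices(_, N, N).            % atoms and numbers

freeze_args([], N, N).
freeze_args([A|As], N0, N) :-
    freeze_indices(A, N0, N1),
    freeze_args(As, N1, N).
```

Because binding is done by unification, two index functors that share a variable receive the same atom, so coindexing survives the freezing step.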

7.2. The Cycle

            horatio produces parses in which the gf's (grammatical functions) are deep, i.e. correspond to the specifications in the arglist of the predicate, the argument bearer. To retrieve the surface string, this work must be undone. Since the days of standard transformational grammar, it has been known that the control relations (i.e. the control of the subjects of subordinate clauses by higher nps in argument functions), and transformations such as passive and raising, apply cyclically, to the more deeply embedded clauses first.

            Even if we do not wish to stick to a transformational treatment, this insight of TG is still crucial. To retrieve the surface string, we will need to undo the work done by horatio in cyclic fashion. Hence the title of the appropriate section in the horgen code: The Cycle.
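            The cyclic principle (most deeply embedded clauses first) can be sketched on a deliberately simplified tree format. Everything in this sketch is hypothetical: clause(Voice, Args), the Role-Filler pairs and the predicates cycle/2 and rewrite/2 are illustrative only and do not reproduce the horgen data structures.

```prolog
% cycle(+Tree, -Result): process the embedded clauses first, then the
% clause that contains them.
cycle(clause(Voice, Args0), Result) :- !,
    cycle_args(Args0, Args),
    rewrite(clause(Voice, Args), Result).
cycle(Leaf, Leaf).

cycle_args([], []).
cycle_args([Role-Filler0|Rest0], [Role-Filler|Rest]) :-
    cycle(Filler0, Filler),
    cycle_args(Rest0, Rest).

% Toy rule: in a passive clause the deep object becomes the surface subject.
rewrite(clause(passive, [object-O|Rest]), clause(passive, [subject-O|Rest])) :- !.
rewrite(Clause, Clause).
```

The rule is applied to the embedded passive clause before the containing active clause is looked at, which is the order the cycle requires.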

            Consider the parse corresponding to the surface string:

I believe him to have been killed.

            Schematically, it looks like this:

I believe [ VAR has killed him ].

            We first need to apply the passive predicate in generation mode, whose job is to move the object into subject position and to create a by-phrase for the subject (here this by-phrase will not lead to the generation of anything, since it has no lexical material in the np slot). We obtain something along the lines of:

I believe [ he has been killed]

            We now apply subject to object raising: the subject of the subordinate clause is raised to object position in the main clause. At the same time the main verb of the subordinate clause is turned into an infinitive with TO:

I believe him [ he to have been killed ]

            Control will now take care of ghosting (i.e. depriving of lexical content) the controlled subject.

I believe him [ VAR to have been killed ]

            Generating the VAR will yield the empty list, which will later disappear by the list flattening and appending operations which complete the generation process:

I believe him to have been killed.

            We will now consider the generation of the resulting sentence in somewhat more detail, as a way of illustrating the cycle principle in action.

            We start from what the parser gives us. The pretty-printed parse looks like this:

 12
  clause
   pred_arg_mod_structure
   prop(vce: active,asp: none,mod: none,tns: present)
    predicate(believe_1,agr(finite,present,sing,first))
     subject
      nounphrase
      index(_0398)
      agr(first,sing)
      ppro(first,sing,_041C)
     object
      clause
       pred_arg_mod_structure
       prop(vce: passive,asp: [perfect],mod: none,tns: present)
        predicate(kill_1,agr(en_passive))
         object
          nounphrase
          index(_0580)
          agr(third,sing)
          ppro(third,sing,masc)

            We see that the object of believe is clausal, and that kill has an object and no subject (passive has been undone; but the agr functor and the property list record that we have a passive clause).

            Of course, the generator does not work on pretty-printed objects. Here is the raw object produced by the parser. It is a Prolog term (carriage returns have nonetheless been added to improve readability):

[12,[clause,[pred_arg_mod_structure,
prop(vce: active,asp: none,mod: none, tns: present),
[predicate(believe_1,agr(finite,present,sing,first))],
[[subject,[nounphrase,index(_0398),agr(first,sing),ppro(first,sing,_041C)]],
[object,[clause,[pred_arg_mod_structure,
prop(vce: passive,asp: [perfect],mod: none,tns: present), [predicate(kill_1,agr(en_passive))],
[[object,[nounphrase,index(_0580),
agr(third,sing),ppro(third,sing,masc)]]]]]]]]]].

            We note that two of the three uninstantiated variables appear as arguments of the index functor. These are the ones that we need to freeze. We do so by converting them into strings, as explained in the preceding section. The result is as follows:

[12,[clause,[pred_arg_mod_structure,
prop(vce: active,asp: none,mod: none,tns: present),
[predicate(believe_1,agr(finite,present,sing,first))],
[[subject,[nounphrase,index($0398$),agr(first,sing),ppro(first,sing,_041C)]],
[object,[clause,[pred_arg_mod_structure,
prop(vce: passive,asp: [perfect],mod: none,tns: present),
[predicate(kill_1,agr(en_passive))],
[[object,[nounphrase,index($0580$),
agr(third,sing),ppro(third,sing,masc)]]]]]]]]]].

            We have two clauses, with the controlled clause in second position in the arglist of the controlling clause. The following defining clause for prepgen therefore applies:

prepgen([H2_cl1,[pred_arg_mod_structure,Prop1,Pred1,[Firstarg_cl1,
            [H1_cl2,[clause|Rcl2]|R2]|R1]]],
Res4):-
second_header(H2_cl1),
first_header(H1_cl2),
passive([clause|Rcl2],Res1),
control([H2_cl1,[pred_arg_mod_structure,Prop1,Pred1,[Firstarg_cl1,
        [H1_cl2,Res1|R2]|R1]]],Res2),
oraising(Res2,Res3),
passive(Res3,Res4).

            The first job carried out by prepgen is to check the clause headers, to make sure that we are in the right environment to apply the processes defined in passive, control, and oraising. Let us look at the check second_header. It will be instantiated in the following way when the call is made:

second_header(clause).

            The call will succeed because it is a fact recorded in the packet of clauses for the predicate second_header, listed below:

second_header(clause).
second_header(np_modifier).
second_header(adj_arg).
second_header(arg).
second_header(noun_arg).

            The next check, first_header, will consist of the following call

first_header(object).

and will also succeed, for a similar reason.

            Passive will then try to apply in the controlled clause. The instantiation is the following when the call is made:

passive([clause,[pred_arg_mod_structure, prop(vce:passive, asp:[perfect], mod:none, tns:present), [predicate(kill_1,agr(en_passive))],
[[object,[nounphrase, index($0580$),agr(third,sing),ppro(third,sing,masc)]]]]],_2888).

_2888 will get instantiated to the following term:

[clause,[pred_arg_mod_structure, prop(vce:passive, asp:[perfect], mod:none, tns:present), [predicate(kill_1,agr(en_passive))],
[[subject,[nounphrase, index($0580$),agr(third,sing),ppro(third,sing,masc)]]]]]

            We see that the (deep) object has been turned into a (surface) subject. The job was performed by the following clause for passive:

passive([H1,[pred_arg_mod_structure,prop(vce:passive,B,C,D),Pred1,
    [[object,O]|Otherargs]]],
[H1,[pred_arg_mod_structure,prop(vce:passive,B,C,D),Pred1,
    [[subject,O]|Otherargs]]]):- second_header(H1).

            Note that the job is carried out in the head of the clause, the body consisting only of a header check.

            The next predicate to be called is control, which gets the whole structure as argument, but with the result of the call to passive instead of the original controlled clause. The instantiation when the call is made is the following:

control([clause,[pred_arg_mod_structure,
prop(vce:active,asp:none, mod:none,tns:present),
[predicate(believe_1,agr(finite,present,sing,first))],
[[subject, [nounphrase,index($0398$),agr(first,sing), ppro(first,sing,_041C)]],
[object,[clause,[pred_arg_mod_structure,
prop(vce:passive, asp: [perfect],mod:none, tns:present),
[predicate(kill_1,agr(en_passive))],
[[subject, [nounphrase, index($0580$), agr(third,sing), ppro(third,sing,masc) ]]]]]]]]],_28A8).

            As a result of the call, _28A8 is instantiated to the following term:

[clause,[pred_arg_mod_structure,
prop(vce:active,asp:none, mod:none,tns:present),
[predicate(believe_1,agr(finite,present,sing,first))],
[[subject, [nounphrase,index($0398$),agr(first,sing), ppro(first,sing,_041C)]],
[object,[clause,[pred_arg_mod_structure,
prop(vce:passive, asp: [perfect],mod:none, tns:present),
[predicate(kill_1,agr(en_passive))],
[[subject, [nounphrase, index($0580$), agr(third,sing), ppro(third,sing,masc) ]]]]]]]]]

i.e. the same term! Control had no work to do, because the controller does not occur in the controlled clause, and so does not need to be ghosted. Control succeeded without doing anything, thanks to the following catch-all among its defining clauses:

control(X,X).

            The next predicate to be applied is oraising. Since control has not modified the structure, we get the following call:

oraising([clause,[pred_arg_mod_structure,
prop(vce:active,asp:none, mod:none,tns:present),
[predicate(believe_1,agr(finite,present,sing,first))],
[[subject, [nounphrase,index($0398$),agr(first,sing), ppro(first,sing,_041C)]],
[object,[clause,[pred_arg_mod_structure,
prop(vce:passive, asp: [perfect],mod:none, tns:present),
[predicate(kill_1,agr(en_passive))],
[[subject, [nounphrase, index($0580$), agr(third,sing), ppro(third,sing,masc) ]]]]]]]]],_2910).

            Oraising does change the structure. The defining clause that applies is the following:

oraising([H1,[pred_arg_mod_structure,Prop1,
         [predicate(Pred1,AgrPred1)],
    [Arg1,
     [object,[clause,
[pred_arg_mod_structure,Prop2,[predicate(P,agr(Agr))],
                     [[Subject|Rest]|Otherargs]]]]|R]]],

[H1,
[pred_arg_mod_structure,Prop1,
        [predicate(Pred1,AgrPred1)],
    [Arg1,
     [object|Rest],
     [object,[clause,
[pred_arg_mod_structure,Propnew2,[predicate(P,agr(Agr))],
                     [Otherargs]]]]|R]]]):-

allsubject(Subject),
second_header(H1) ,
oraise(Pred1,Requires),
nonfinite(Agr),
Prop2 = prop(Voice,Aspect,Mod,Tns),
Propnew2 = prop(Voice,Aspect,Mod,tns:Requires).

            Allsubject is, like second_header, a check on the environment of the rule. It requires that the variable named Subject cover a subject function. The code for allsubject is the following:

allsubject(S):- subject(S),!.
allsubject(S):- subject_active(S).

subject(subject).
subject(subject_pass).
subject_active(subject).
subject_active(subject_inf).

            The call succeeds in the present case since Subject is instantiated to subject.

            Next, the oraise predicate checks that the verb of the main clause belongs to the right class, and at the same time it instantiates the Requires feature, which will tell us whether what will remain of the controlled clause should be infinitive or gerundive. The call here is the following:

oraise(believe_1,X),

which succeeds and instantiates X to to. Oraise calls on the macro m_verb clause:

oraise(Oraisingverb,to):-
m_verb(vinf,_,Oraisingverb,_,_,_,_,_,_,_,_,_,_,
        [np(_,_,surf_object,_),
         np_vp(oblig,to_inf,object)]).

            The m_verb responsible for the success of oraise is the lexical clause for the relevant reading of BELIEVE, namely:

m_verb(vinf,_,believe_1,believe,believe,believe,believes,believing,
        believed,believed,believed,trans,human,
        [np(oblig,posprec(1,Wnp),surf_object,_),
         np_vp(oblig,to_inf,object)]).

/* he believes him to teach linguistics */

            Next is the check on the agreement value of the predicate in the controlled clause; it cannot be a finite clause, to which subject to object raising could not apply. In this case the call is:

nonfinite(en_passive)

which succeeds.

            Oraising then proceeds to copy the Property field of the controlled clause to a new variable, but replaces the value for tense (of no application in non-finite clauses) with the Requires feature, i.e. to, which will indicate to the relevant clause of the generator that an infinitive with to vp should be generated as the remaining vp in the controlled clause.

            Let us now look at the structure returned by oraising. We see that the subject of the controlled clause is moved to the controlling clause, and its function changed to object. The controlled clause is returned as second object, but now misses its subject. Finally, the new property field is assigned to the controlled clause, which is thereby untensed. In the case of our example sentence, the arglist for the matrix clause has the following two objects:

[object,[nounphrase,index($0580$), agr(third,sing), ppro(third, sing, masc)]],
[object,[clause,[pred_arg_mod_structure,
prop(vce:passive,asp:[perfect], mod:none, tns:to),
[predicate(kill_1, agr(en_passive))],[[]]]]]

            Returning to prepgen, we find that there is another call to passive, this time on the matrix clause. This call will succeed trivially, since passive does not apply there:

passive([H1,[pred_arg_mod_structure,prop(vce:active,B,C,D)|R1]|R2],
             [H1,[pred_arg_mod_structure,prop(vce:active,B,C,D)|R1]|R2] ):-
second_header(H1),
!.
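            The whole walk-through can be replayed with the following sketch, which restates the clauses quoted above in standard Prolog. Several points are assumptions rather than horgen code: the frozen indices are written as atoms (i0398, i0580) instead of Arity strings, the leading parse number is dropped, first_header/1 is reduced to the single fact implied by the call shown earlier, allsubject/1 is flattened into facts, and oraise/2 and nonfinite/1 are hypothetical one-fact stand-ins for the lexicon lookup and the finiteness check.

```prolog
second_header(clause).
first_header(object).            % assumed from the call shown in the text
allsubject(subject).
allsubject(subject_pass).
allsubject(subject_inf).
oraise(believe_1, to).           % stand-in for the m_verb lookup
nonfinite(en_passive).           % stand-in for the finiteness check

passive([H,[pred_arg_mod_structure,prop(vce:passive,B,C,D),Pred,[[object,O]|Args]]],
        [H,[pred_arg_mod_structure,prop(vce:passive,B,C,D),Pred,[[subject,O]|Args]]]) :-
    second_header(H).
passive([H,[pred_arg_mod_structure,prop(vce:active,B,C,D)|R1]|R2],
        [H,[pred_arg_mod_structure,prop(vce:active,B,C,D)|R1]|R2]) :-
    second_header(H), !.

control(X, X).

oraising([H,[pred_arg_mod_structure,Prop1,[predicate(Pred1,Agr1)],
             [Arg1,[object,[clause,[pred_arg_mod_structure,Prop2,
                                    [predicate(P,agr(Agr))],
                                    [[Subject|Rest]|Others]]]]|R]]],
         [H,[pred_arg_mod_structure,Prop1,[predicate(Pred1,Agr1)],
             [Arg1,[object|Rest],
                   [object,[clause,[pred_arg_mod_structure,Prop3,
                                    [predicate(P,agr(Agr))],
                                    [Others]]]]|R]]]) :-
    allsubject(Subject), second_header(H),
    oraise(Pred1, Requires), nonfinite(Agr),
    Prop2 = prop(Voice,Aspect,Mod,_),
    Prop3 = prop(Voice,Aspect,Mod,tns:Requires).

prepgen([H2,[pred_arg_mod_structure,Prop1,Pred1,
             [Arg1,[H1,[clause|Rcl2]|R2]|R1]]], Res4) :-
    second_header(H2), first_header(H1),
    passive([clause|Rcl2], Res1),
    control([H2,[pred_arg_mod_structure,Prop1,Pred1,[Arg1,[H1,Res1|R2]|R1]]], Res2),
    oraising(Res2, Res3),
    passive(Res3, Res4).

% The frozen parse of "I believe him to have been killed".
example([clause,[pred_arg_mod_structure,
    prop(vce:active,asp:none,mod:none,tns:present),
    [predicate(believe_1,agr(finite,present,sing,first))],
    [[subject,[nounphrase,index(i0398),agr(first,sing),ppro(first,sing,_)]],
     [object,[clause,[pred_arg_mod_structure,
        prop(vce:passive,asp:[perfect],mod:none,tns:present),
        [predicate(kill_1,agr(en_passive))],
        [[object,[nounphrase,index(i0580),agr(third,sing),
                  ppro(third,sing,masc)]]]]]]]]]).
```

Calling example(T), prepgen(T, Out) returns a matrix arglist with three members: the original subject, the raised np as object, and the subjectless clause whose tense slot now holds to.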

            Prepgen has now finished, and the generator will take the results it has delivered and attempt to generate a string from them:

generate(Tree,Sentence):-
      prepgen(Tree,Treeprep),
      gen(Treeprep,List),
      flatten(List,Sentence).

Sentence will be instantiated to [i,believe,him,to,have,been,killed] and output as:

i believe him to have been killed.
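            The final flattening step, in which the empty lists produced for ghosted elements disappear, can be sketched as follows. This is an illustrative definition in standard Prolog, not the flatten used in horgen: flatten_all/2 and join/3 are hypothetical names, and the input is assumed to contain no unbound variables.

```prolog
% flatten_all(+Nested, -Flat): Flat holds the words of Nested in order;
% empty lists contribute nothing.
flatten_all([], []) :- !.
flatten_all([H|T], Flat) :- !,
    flatten_all(H, FlatH),
    flatten_all(T, FlatT),
    join(FlatH, FlatT, Flat).
flatten_all(Word, [Word]).

% join(+Xs, +Ys, -Zs): Zs is Xs followed by Ys.
join([], L, L).
join([X|Xs], L, [X|Ys]) :- join(Xs, L, Ys).
```

A nested list such as [[i],[believe,[him],[[],[to,[have,been],killed]]]] flattens to the seven words of the example sentence, the inner [] leaving no trace.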

Appendix A. Non-Standard Arity Prolog Predicates

            This appendix briefly describes the pre-defined Prolog predicates that are specific to Arity Prolog (or at least non-standard) and have been made use of in horatio and/or horgen. Note that + in front of an argument indicates that it should be instantiated when the call to the predicate is made; a - sign indicates that the variable should still be uninstantiated.

Execution Control

[! P1, P2, ...!]: the snip symbol ([!!]) isolates code (a set of goals, the Ps here) to be skipped in the event of backtracking. Example from horatio on page 69.

abort(1): with 1 as argument, abort returns to the operating system

ifthenelse(Condition,YesAction,NoAction): If Condition succeeds, YesAction is attempted; otherwise, NoAction is. Ifthenelse can be simulated by a disjunction of goals: (Condition, YesAction; not(Condition), NoAction). Example from horatio on page 56.
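In a Prolog without this built-in, a comparable predicate can be defined with the standard if-then-else construct. The definition below is a sketch under that assumption and is not Arity's own implementation; weight/2 is a hypothetical wrapper around the call used in corenounphrase. Unlike the disjunction given above, this version evaluates Condition only once and commits to its first solution.

```prolog
% ifthenelse(+Condition, +YesAction, +NoAction)
ifthenelse(Condition, YesAction, NoAction) :-
    (   call(Condition)
    ->  call(YesAction)
    ;   call(NoAction)
    ).

% Wrapper around the call that appears in corenounphrase.
weight(Status, Weight) :-
    ifthenelse(var(Status), Weight = 2, Weight = 3).
```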

String Manipulation

concat(+String1, +String2, -String3): String3 results from the concatenation of String1 and String2. Example from horatio:

     concat(Output,$.ter$,Outterm),
     concat(Output,$.lst$,Outlist),

used to produce the names of the output files: .ter (raw Prolog terms) and .lst (pretty-printed terms)
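In standard Prolog the nearest equivalent is atom_concat/3, which works on atoms rather than on Arity strings. The following sketch assumes the base name is an atom; output_names/3 is a hypothetical helper, not a horatio predicate.

```prolog
% output_names(+Base, -TermFile, -ListFile)
output_names(Base, TermFile, ListFile) :-
    atom_concat(Base, '.ter', TermFile),   % raw Prolog terms
    atom_concat(Base, '.lst', ListFile).   % pretty-printed terms
```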

File Operations

create(-Handle, +Filename): a file named Filename is created and associated with handle Handle. Example from horatio:

     dealwith(Input,_):-
                         Input=stdin,
                         create(HandleIn,'bidon'),
                         close(HandleIn),!.

open(-Handle, +Filename, +Mode): Mode is r for reading, w for writing or a for appending; a file with name Filename is opened and associated with handle Handle, for the operation specified in Mode. Example from horatio:

     dealwith(FileIn,HandleIn):-
                         open(HandleIn,FileIn,r).

close(+Handle): the file associated with handle Handle is closed. Example from horatio:

     dealwith(Input,_):-
                         Input=stdin,
                         create(HandleIn,'bidon'),
                         close(HandleIn),!.

read_line(+Handle, -String): Handle 0 is standard input; String is read as a one-line string from the input stream associated with Handle (the end of the string is indicated by a carriage return). Read_line offers minimal editing facilities (backspacing for deletion). Example from horatio:

     start(Outterm,Outlist,_,stdin):-
                 open(Y,Outterm,w),
                 open(Z,Outlist,w),
                 repeat,
                 nl,
                 write($Please key in your sentence or stop. to stop$),
                 nl, 
                 read_line(0,S),
                 open(HandleIn,'bidon',w),  /* for writing */
                 write(HandleIn,S),
                 close(HandleIn),
                 open(Handlenew,'bidon',r),    /* for reading */
                 getsentence(Sentence,Handlenew),
                 close(Handlenew),
                 nl,
                 write(Sentence),
                 process(Sentence,Y,Z).

Appendix B. File Organization and Compilation Directives

File Organization

            Both the parser (horatio.exe) and the generator (horgen.exe) result from compiling and linking several source files. The compiling and linking is achieved by calls to the Arity Prolog Compiler and the Microsoft Linker by two .BAT or .CMD files, i.e. batch files for MS-DOS or for OS/2. Except for the file name extension the files are the same for DOS and OS/2.

Moratio.cmd

            Moratio.cmd is an OS/2 batch file used to generate and link the object modules, to create horatio.exe, the executable file for the parser; horatio.ari, lexatio1.ari, lexatio2.ari, lexatio3.ari are text files containing the source code of the program; horatio.ari is the grammar proper, lexatiox.ari are the files holding the lexicon; the full commented source is available on the companion disk.

apc horatio,,/n

/* compiling the grammar, generating the data base */

/* APC.EXE is the Arity Prolog compiler */

/* horatio.ari contains the grammar for the parser */

/* it is compiled and produces horatio.idb as data base */

/*  /n indicates that a new Prolog data base needs to be created; its name defaults to that of the source file, but the extension is .idb instead of .ari, the default file extension for Arity Prolog source code */

/* the object file's name defaults to horatio.obj, since the second argument is left empty (hence the two commas with nothing in between) */

apc lexatio1,,horatio

/* compiling the lexicon (part 1), adding to the same data base (last argument) */

apc lexatio2,,horatio /* idem (part 2) */
apc lexatio3,,horatio /* idem (part 3) */
link code lexio1 lexio2 lexio3 horatio lexatio1 lexatio2 lexatio3, horatio,, arity doscalls,,

/* linking the code modules; horatio.obj, lexatio1.obj, lexatio2.obj, lexatio3.obj result from the compiling process carried out in the preceding lines by the Arity Prolog compiler. Code is part of the Arity Prolog delivery and is to be used as the first object module for all Arity Prolog applications; lexio1 to 3 are cloned (with the utility clone.exe, also part of the Arity Prolog delivery) so as to be able to make use of far Prolog. Arity and doscalls are the libraries. Under DOS there is no need for the doscalls library. The second argument is the name of the .exe file; here it is horatio.exe */

Morgen.cmd

/* similar to MORATIO.CMD; the first lexicon file (lexgen1.ari) differs slightly from the one used in analysis, namely lexatio1.ari; the executable file for the generator is called horgen.exe */

apc horgen,,/n
apc lexgen1,,horgen
apc lexatio2,,horgen
apc lexatio3,,horgen
link code lexio1 lexio2 lexio3 horgen lexgen1 lexatio2 lexatio3, horgen,, arity doscalls,,

Arity Prolog Compilation Directives

            When several files are meant to be compiled and linked together to yield a single executable file, the first code file to be compiled needs a declaration for a predicate called main, of zero arity and to be declared public. We therefore find in the file horatio.ari:

:- public main/0.

            The main predicate is itself defined by a call to the go predicate, which provides the GO step.

            All lexical clauses in both the parser and the generator are declared external and far. For instance, in horatio.ari, we have compilation directives such as

:- extrn verb/10:far.

            This means that the predicate verb is to be found in an external file (in casu lexatio2.ari), has arity 10 (needs 10 arguments), and is to be compiled as far[18] code.

            In the file lexatio2.ari, we find the following compilation directive

:- public verb/10:far.

making the predicate verb public, i.e. callable from other modules.

            Besides, each module containing far code needs to have its own segment declaration. For instance, in lexatio2.ari we find

:- segment(lexio2).

Lexio2.obj is obtained by an appropriate call on the clone.exe program, part of the Arity package, namely

clone lexio2

Appendix C. Input and Output

Input

            In horatio input is either from the keyboard (standard input, stdin in horatio code), or from a text file. In the latter case, the text must be pure ASCII, with one sentence per line, the end of the file being marked by a line containing only the word stop.

            In the former case, the input is read by the predicate read_line (which provides minimal editing facilities, such as the use of backspace to delete characters already entered at the keyboard) and saved into a dummy file called bidon. Then the dummy file is opened and input proceeds in the way described in the previous paragraph.

            The read_line predicate is an Arity Prolog extension to standard Prolog. It takes two arguments: the first is the read handle, which is 0 in the case of standard input. The second is the string to be read from the input stream associated with the handle. The string is read as a single line.

            The next job in input is to convert the string to a word list. This is achieved by a standard string to word list converter program. We use the one given in Bratko 1990.
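            As an illustration of what such a converter does, here is a minimal sketch in standard Prolog. It is not the program from Bratko 1990: line_words/2 and its helpers are hypothetical, the input line is assumed to be an atom, and only spaces, tabs and full stops are treated as separators.

```prolog
% line_words(+Line, -Words): Line is an atom; Words is a list of atoms.
line_words(Line, Words) :-
    atom_codes(Line, Codes),
    words(Codes, Words).

words([], []) :- !.
words([C|Cs], Words) :-
    separator(C), !,                 % skip separators between words
    words(Cs, Words).
words(Codes, [Word|Words]) :-
    take_word(Codes, WordCodes, Rest),
    atom_codes(Word, WordCodes),
    words(Rest, Words).

% take_word(+Codes, -WordCodes, -Rest): split off the leading word.
take_word([C|Cs], [C|Ws], Rest) :-
    \+ separator(C), !,
    take_word(Cs, Ws, Rest).
take_word(Rest, [], Rest).

separator(32).   % space
separator(46).   % full stop
separator(9).    % tab
```

Under these assumptions the end-of-file marker line stop. is converted to the word list [stop], which is the list the input loop tests for.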

            The resulting word list is then submitted to the parsing process.

            It should be noted that in horatio input is within a loop. We keep reading in sentences until we hit the word list [stop], resulting from the end-of-file marker stop. When we do, we call the abort predicate to return to the operating system.

            In parsing we need to get at all the possible parses in the case of ambiguous input. This is why we fail at the end of the parsing process for a given sentence, whereas in generation we succeed (the generator does not backtrack once it has produced a string corresponding to a given parse).
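            The contrast between the two regimes can be sketched as follows. The toy parse/2 facts and the predicate names show_all_parses/1 and first_parse/2 are hypothetical; the sketch only illustrates the failure-driven loop on the parsing side and the commitment to a single result on the generation side.

```prolog
% Toy data: one ambiguous word list with two parses.
parse([flying, planes], [np, [adj(flying), n(planes)]]).
parse([flying, planes], [vp, [v(flying), n(planes)]]).

% Parsing regime: print a parse, then fail, forcing backtracking into
% parse/2 until no parse is left.
show_all_parses(Sentence) :-
    parse(Sentence, Parse),
    write(Parse), nl,
    fail.
show_all_parses(_).                  % reached once all parses are exhausted

% Generation regime: succeed with the first result and do not look further.
first_parse(Sentence, Parse) :-
    once(parse(Sentence, Parse)).
```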

            In the generator horgen input is never from the keyboard, as the expected input is a parse. Such parses are read in with the standard Prolog read predicate, which is able to read any well-formed Prolog term.

Output

            In horatio we need to produce two types of output:

1) raw objects: these are the Prolog terms which result from the parsing process. They are feedable to the generator horgen (after a transformation into strings of the uninstantiated variables in the index functors - this transformation is described above, on page 107). Such parses go into a file with a .ter extension (for terms). The file name is created by concatenating the extension .ter to the base file name elicited from the user. We make sure that the parses end with a dot, so that they can be read in by the read predicate in the generator.

2) pretty-printed objects: these are meant to be examined by the grammar-designer for debugging and other purposes. They are collected into a file which gets an .lst extension. The pretty-printer used is elementary and standard. It is a slight adaptation of the one to be found in Clocksin and Mellish 1981 (the adaptation concerns the pretty-printing of uninstantiated variables).

            horgen produces strings as output. It is a trivial extension to add the capability of capitalizing the first word of the output, as well as proper names occurring in the string. It is equally trivial to ensure that questions end with a question mark. These extensions are left as exercises for the reader.

Appendix D. Selective Lexicon Downloading

Getvoc.awk

BEGIN {RS="."}

{ if (FILENAME==ARGV[1])
        for (i=1;i<=NF;i++)  x[tolower($i)]= tolower($i)
}

{ if (FILENAME==ARGV[2]) {  RS= "@";FS=",";
                            for (i=2;i<=10;i++)
                                      {
                                      if ( tolower($i) in x &&  !($0 in y))
                                                   {print  > "voc.ari"
                                                    y[$0]="in"
                                                    }
                                      }
                         }
}

            The first argument is the text to be parsed, the second the lexicon to be searched. In the first file (the text to be parsed) we use a period (.) as record separator (RS is set to dot). In the second file (the lexicon to be used) we separate records with an @, which we therefore assign as new value for RS.

            The individual words are saved into the x array, and the lexicon is then searched for matches in the lexeme and morphological variant fields. The selected entries are saved to a file, voc.ari; the second condition in the last IF clause prevents repetitions in the voc.ari file. A batch procedure will then take care of the compilation of voc.ari, and of the linking of the resulting object code with that produced for the grammar files, in order to build the required executable. Under OS/2, the batch file is the following:

echo %1. %1.  >filin
awkl -f getvoc.awk %1 lex.ari
copy lexdcl.ari+voc.ari lexsel.ari
apc lexsel,,horatio
link code lexio1 lexio2 horatio lexatio1 lexsel , horex,,arity doscalls,,
horex < filin

(%1 stands for the text file; awkl invokes the large model of awk in the MKS OS/2 implementation[19]; lex.ari houses the complete horatio lexicon; lexdcl.ari contains the necessary compiler declarations for Arity Prolog; apc calls the Arity Prolog compiler; the linker (link) produces the executable file horex, which is fed the name of the text file twice, so that it can produce a %1.lst and a %1.ter file, the latter containing the raw objects, and the former the pretty-printed ones).
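            The selection logic of getvoc.awk (keep a lexicon entry if one of its forms occurs in the text) can also be expressed in Prolog. The sketch below is a restatement for illustration only, not part of the horatio tool chain: the entry/2 record format, the toy lexicon and the predicates select_entries/3, shares_word/2 and memb/2 are all hypothetical.

```prolog
% entry(Lexeme, Forms): hypothetical stand-in for a lexicon record.
lexicon([entry(believe_1, [believe, believes, believing, believed]),
         entry(kill_1,    [kill, kills, killing, killed]),
         entry(teach_1,   [teach, teaches, teaching, taught])]).

% select_entries(+Entries, +Words, -Selected): keep the entries that have
% at least one form among Words; each entry is kept at most once.
select_entries([], _, []).
select_entries([Entry|Entries], Words, Selected) :-
    Entry = entry(_, Forms),
    (   shares_word(Forms, Words)
    ->  Selected = [Entry|Rest]
    ;   Selected = Rest
    ),
    select_entries(Entries, Words, Rest).

shares_word(Forms, Words) :-
    memb(Form, Forms),
    memb(Form, Words), !.

memb(X, [X|_]).
memb(X, [_|T]) :- memb(X, T).
```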

Appendix E. Importing Lexical Entries from Ldoce

Introduction

            It should be stated from the outset that a certain familiarity with at least the published version of LDOCE is presupposed in this appendix. A whole book (namely Boguraev and Briscoe 1989) has recently been devoted to computer applications of this dictionary data base, and the reader is referred to the first four chapters of that book, and the references given there, for more information on LDOCE, particularly on its grammar coding system. The published version of LDOCE contains explanatory prefatory material and a table of the grammar codes used in the dictionary, but does not explain the semantic or field codes used in the computer tape.

            There is now a lot of interest in the reuse of available lexical resources. However, much of the available literature on the topic is speculative and programmatic in nature. This appendix, on the contrary, is unashamedly pragmatic. It reports on an experiment that has been carried out, going down to the level of specifying the queries on the data base and the awk[20] programs used to reformat the imported data.

            The application programs for retrieving material from the lexical data base and the awk programs all run on a PC under MS-DOS.

Description of the Liège Ldoce Data Base

            LDOCE in Liège is in relational data base format. The retrieval software is made up of a series of application programs written in Clipper, whose data definition and query language can be defined as an enhanced dBase language. The next section looks at the major tables and their fields. The data base design and the transformation of the LDOCE computer tape into the relational tables was carried out by Jacques Jansen. The application programs were written by the author.

LDOCE Data Base Design

Lemma Data Base: Coword

| Field Name | Type | Nature | Width |
|---|---|---|---|
| ENTRYKEY | Char | shared by all LDOCE-derived dbf as a link to this dbf | 4 |
| HEADCLAS | Char | simple, compound or run-on: S, C, R | 1 |
| DEFINUM | Char | for run-on entries: link to the relevant definition number in the main entry | 2 |
| VARISUF | Char | flag: is there a variable suffix? (e.g. ic/ical, ise/ize) | 1 |
| FLAGCV | Char | included in LDOCE's controlled vocabulary? | 1 |
| FLAGIF | Char | does final consonant redoubling apply (red -> redder)? | 1 |
| POS | Char | part of speech: up to five different POS, each coded in one byte | 5 |
| NOTE | Char | reference to grammatical note | 3 |
| WORD | Char | lemma | 34 |

Structure for ncoword.dbf

Ncoword is similar to coword, but also includes the lexicographical WEIGHT of the lemma, measured in terms of the number of definitions, examples, different grammatical codes and idiomatic structures associated with the entry.


 

| Field Name | Type | Nature | Width |
|---|---|---|---|
| DEFNU | Num | number of definitions associated with the entry | 2 |
| IDIONU | Num | number of idioms associated with the entry | 2 |
| EXNU | Num | number of examples associated with the entry | 3 |
| GRNU | Num | number of grammatical codes associated with the entry | 3 |

Definition Data Base: Codefi

| Field Name | Type | Nature | Width |
|---|---|---|---|
| ENTRYKEY | Char | cf. coword | 4 |
| HEADCLAS | Char | I identifies an idiomatic structure | 1 |
| DEFINUM | Char | sequential number of the definition | 2 |
| DEFILET | Char | small letter associated with a given definum: a, b, c, ... | 1 |
| DEFIMAT | Char | field codes: a four-byte field; codes the subject matter(s) (1 or 2) to which the lexical item under the given definition belongs | 4 |
| DEFISEM | Char | semantic codes: a ten-byte field; codes such properties as the semantic requirements to be placed on the deep subject (byte 5) and objects of verbs (bytes 10 and 8); also codes inherent semantic properties for nouns and selection restrictions for adjectives (byte 5) | 10 |
| LINECNT | Char | sequential value of definition line (each defiline is 76 char long, and several may be needed to cover a single definition; the @ sign identifies the first line) | 1 |
| DEFILINE | Char | text of one line of a given definition | 76 |
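
The byte positions named in the DEFISEM description can be made concrete with a short Python sketch. It is an illustration only: the function name and the dictionary keys are assumptions of the sketch, not part of the Liège software, and the three glosses (J, H, T) are the only ones spelled out later in this appendix, so every other code is passed through unchanged.

```python
# Glosses for the three codes explained in this appendix; any other
# character is returned as it stands (this is not a full LDOCE code table).
GLOSS = {"J": "movable and solid", "H": "human", "T": "abstract"}


def read_defisem(defisem: str) -> dict:
    """Pick out the bytes of a ten-byte DEFISEM field that carry restrictions."""
    if len(defisem) != 10:
        raise ValueError("DEFISEM is a ten-byte field")

    def byte(position: int) -> str:
        code = defisem[position - 1]      # positions are 1-based in the text
        return GLOSS.get(code, code)

    # Byte 5: deep subject; bytes 8 and 10: objects of verbs.
    return {"byte5": byte(5), "byte8": byte(8), "byte10": byte(10)}


print(read_defisem("....H....T"))
```

For the code `....H....T` the sketch reports a human subject in byte 5, nothing in byte 8, and an abstract restriction in byte 10.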

Idiom Data Base: Coidio

| Field Name | Type | Nature | Width |
|---|---|---|---|
| ENTRYKEY | Char | cf. coword | 4 |
| HEADCLAS | Char | I identifies idioms or idiomatic structures | 1 |
| DEFINUM | Char | cf. codefi | 2 |
| VARISUF | Char | cf. coword | 1 |
| IDIOM | Char | text of the idiom | 60 |

Example Data Base: Coexam

Examples are associated with definitions and idioms

| Field Name | Type | Nature | Width |
|---|---|---|---|
| ENTRYKEY | Char | cf. coword | 4 |
| HEADCLAS | Char | cf. codefi | 1 |
| DEFINUM | Char | cf. codefi | 2 |
| DEFILET | Char | cf. codefi | 1 |
| EXAMNUM | Char | sequential number of the example (associated with a given word sense or idiom) | 2 |
| DCODNUM | Char | hand-coded: relation between example and grammatical code | 2 |
| LINECNT | Char | cf. codefi | 1 |
| EXAMLINE | Char | line of example text | 76 |


Grammatical Code Data Base: Codcod

| Field Name | Type | Nature | Width |
|---|---|---|---|
| ENTRYKEY | Char | cf. coword | 4 |
| HEADCLAS | Char | cf. codefi | 1 |
| DEFINUM | Char | cf. codefi | 2 |
| DCODNUM | Char | sequential number of grammatical code associated with a given definition | 2 |
| LEFTLINK | Char | type of link: optional, 'esp.', obligatory; this field codes the cohesiveness of the link between the lexical item and its lexical left environment | 1 |
| LEFTTYPO | Char | type of font: italic, bold, ...: this field can be used to retrieve the nature of the coded lexical left environment; for instance, prepositions will be coded italic and adverbial particles bold | 1 |
| LEFTCTX | Char | word or word group specified as lexical left environment | 3 |
| GRAMCODE | Char | three-byte grammatical code; consult the LDOCE Table of Codes (in the printed form of the dictionary) | 3 |
| RGHTLINK | Char | cf. leftlink | 1 |
| RGHTTYPO | Char | cf. lefttypo | 1 |
| RGHTCTX | Char | cf. leftctx | 3 |
| GRAMCOMM | Char | grammatical comment (partly formalized) | 3 |

            It might be useful to look at the representation of a given lexical item in the Liège LDOCE data base, so that the reader can build for himself a more concrete picture of what the data base looks like. I have taken the entry for the verb believe, because it is short and illustrates the results of the decompaction process that has been applied to the LDOCE grammar fields, a process that needed to be carried out if the data base was to prove usable for the retrieval of grammatical information, as illustrated in this appendix. The need for a decompaction procedure is also discussed in Michiels 1982.

Believe in LDOCE

believe / pron /  v [Wv6]  1 [I0]  to have a firm religious faith  2  [T1] to consider to be true or honest: to believe someone |  to believe someone's reports  3 [T5a,b;V3;X (to be) 1, (to be) 7] to hold as an opinion; suppose:  I believe he has come. | He has come, I believe.|  "Has he come?" "I believe so" |  I believe him to have done it. |  I believe him (to be) honest  -- see unbelief (USAGE)

In the Liège data bases:

In COWORD

| Field | Value |
|---|---|
| Entrykey | B@ZV |
| Headclas | S |
| Flagcv | 1 |
| Flagif | 0 |
| Pos | v |
| Word | believe |

In CODCOD

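| Entrykey | Headclas | Definum | Dcodnum | Gramcode |
|---|---|---|---|---|
| B@ZV | S | 1 | 1 | I0 |
| B@ZV | S | 2 | 1 | T1 |
| B@ZV | S | 3 | 1 | T5a |
| B@ZV | S | 3 | 2 | T5b |
| B@ZV | S | 3 | 3 | V3 |
| B@ZV | S | 3 | 4 | X1e |
| B@ZV | S | 3 | 5 | X7e |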
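
The effect of decompaction on the believe entry can be imitated with a few lines of Python. This is a minimal sketch that covers only the patterns visible in this one entry (a letter or number inherited after a comma, and "(to be)" rendered as a trailing e, as in X1e); the function name is hypothetical, and the procedure actually applied to the LDOCE tape is not reproduced here.

```python
import re


def decompact(grammar_field: str) -> list:
    """Expand a compacted LDOCE grammar field into one code per item."""
    codes = []
    for group in grammar_field.strip().strip("[]").split(";"):
        letter = number = ""                  # inherited within one group
        for item in group.split(","):
            to_be = "(to be)" in item
            bare = item.replace("(to be)", "").replace(" ", "")
            match = re.fullmatch(r"([A-Z]?)(\d*)([a-z]?)", bare)
            if match is None:
                raise ValueError(f"pattern not covered by this sketch: {item!r}")
            letter = match.group(1) or letter   # e.g. the T of "T5a" carries over to "b"
            number = match.group(2) or number   # e.g. the 5 of "T5a" carries over to "b"
            codes.append(letter + number + match.group(3) + ("e" if to_be else ""))
    return codes


print(decompact("[T5a,b;V3;X (to be) 1, (to be) 7]"))
```

Applied to the grammar field of sense 3 of believe, the sketch yields the five codes T5a, T5b, V3, X1e and X7e listed in the Gramcode column.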


In CODEFI

| Entrykey | Headclas | Definum | Defimat | Defisem | Linecnt | Defiline |
|---|---|---|---|---|---|---|
| B@ZV | S | 1 | RLRN[21] | ....H.....[22] | @ | to have a firm religious faith |
| B@ZV | S | 2 | .... | ....H....X[23] | @ | to consider to be true or honest |
| B@ZV | S | 3 | .... | ....H....T[24] | @ | to hold as an opinion, suppose |

In COEXAM

(first three examples only)

| Entrykey | Headclas | Definum | Examnum | Dcodnum | Linecnt | Examline |
|---|---|---|---|---|---|---|
| B@ZV | S | 2 | 1 | 1 | @ | to believe someone |
| B@ZV | S | 2 | 2 | 1 | @ | to believe someone's reports |
| B@ZV | S | 3 | 1 | | @ | I believe he has come |

            A good deal of criticism has been leveled recently against the use of the relational data base model for the implementation of lexical data bases. It is pointed out that the restrictions of the relational model (fixed number of fields, fixed field length) make it extremely difficult to implement a lexical data base in such a model. For instance, lexical items have different numbers of homographs; homographs have different numbers of associated definitions; definitions have different numbers of associated examples; definitions and examples are of varying length, from a single word to a full sentence. However, it is in the very essence of the relational model to work with a series of tables, rather than a single one. Consequently, the fact that one definition has one example, whereas the following has six, is not really a problem if the definition table and the example table are distinct tables related by one or several common fields, as they are in the Liège LDOCE data base.
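
The point about distinct tables related by common fields can be illustrated with a small sketch. It uses Python's sqlite3 module as a stand-in for the Clipper/dBase files (an assumption made purely for illustration), keeps only a few of the fields of codefi and coexam, and loads the believe rows shown above, including only the first three examples.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE codefi (entrykey TEXT, definum TEXT, defiline TEXT);
    CREATE TABLE coexam (entrykey TEXT, definum TEXT, examnum TEXT, examline TEXT);
""")
con.executemany("INSERT INTO codefi VALUES (?, ?, ?)", [
    ("B@ZV", "1", "to have a firm religious faith"),
    ("B@ZV", "2", "to consider to be true or honest"),
    ("B@ZV", "3", "to hold as an opinion, suppose"),
])
con.executemany("INSERT INTO coexam VALUES (?, ?, ?, ?)", [
    ("B@ZV", "2", "1", "to believe someone"),
    ("B@ZV", "2", "2", "to believe someone's reports"),
    ("B@ZV", "3", "1", "I believe he has come"),
])

# One row per definition, however many examples it has: the varying number
# of examples lives in the example table, linked by entrykey and definum.
examples_per_definition = con.execute("""
    SELECT d.definum, COUNT(e.examnum)
    FROM codefi AS d
    LEFT JOIN coexam AS e
           ON e.entrykey = d.entrykey AND e.definum = d.definum
    GROUP BY d.entrykey, d.definum
    ORDER BY d.definum
""").fetchall()
print(examples_per_definition)
```

Neither table needs a variable number of fields: a definition with no example simply has no matching row in the example table.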

            Clearly, efficiency considerations make the relational model unusable for real time retrieval in NLP tasks such as machine translation. But then again, the full power of the relational model is not necessarily needed in such cases, as retrieval is most often to be carried out on the basis of the morphological variant or lemma, and not on the basis of the complex queries on grammatical, semantic and field code information that the relational model can handle. The power of the latter, however, can be put to good use in a lexicographer's workbench.

The Importation Process from LDOCE into Horatio's Lexicon

            In its present state of development, horatio works with 18 verb classes. The formats of the classes selected for the importation experiment are briefly described below. For each verb class, the filter used on the Liège LDOCE data base is described, as well as the raw data from the Liège data base and the AWK program used to convert them to the format used by horatio.

            An essential feature of horatio is that it goes down to the level of reading, i.e. a selected interpretation of a lexical item according to properties of its environment. Such a level of delicacy is crucial for most NLP tasks, machine translation being a prime example. As can be guessed from the sample entries reproduced in this appendix, the importation from LDOCE concerns mainly entries beginning with a, and no effort has been made to select entries on the basis of frequency, lexicographical weight or membership of a domain-related sublanguage (lexicographical weight could have been assessed on the basis of the information contained in ncoword and domain-relatedness captured by means of the subject field codes in defimat in codefi).

            We shall concentrate on showing how the m_verb clauses for various verb classes (indicated by the first argument of the m_verb clause) can be derived from LDOCE by a three-step process:

a) obtain the relevant material from LDOCE (this is accomplished by EXPORT, a Clipper application which enables the linguist to select lexical entries - at the reading level - according to entry-level, semantic and grammatical properties; irregular inflectional morphology is taken care of by retrieval from a specialized data base; regular inflectional morphology is generated by a procedure belonging to EXPORT)

b) reformat the material obtained in a) by means of the appropriate AWK program

c) hand-check the results (the assignment of LDOCE semantic codes is often debatable; inflectional variants also have to be checked)

Intransitive Verbs

A. FILTERS ON LDOCE

on ENTRIES (coword)

            'v' $ pos .and. headclas ="S"

(the POS field must contain "v", i.e. the entry is a verb; the $ operator tests whether its left operand is included in its right operand; headclas must be "S", i.e. the entry is a simple one, not a compound or multi-word unit ("C") or a run-on entry ("R"))

on SEMANTICS (codefi)

            headclas # "I"

Headclas must be different from (# operator)  "I", i.e. must not be an idiom.

on GRAMMAR (codcod)

            gramcode= "I0 " .and. rghtlink = " "

Gramcode is a three-byte code; rghtlink gives the nature of the link with a lexical element attached to the item on the right. By specifying an empty rghtlink we make sure that we do not select prepositional or phrasal verbs (for which see below, p. 138 and 141).
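
How the three filters narrow the selection down to readings can be pictured with the following Python sketch. It is not the EXPORT application: the records are plain dictionaries, only the B@ZV (believe) rows come from this appendix, the X001 rows are invented so that the filters have something to reject, and the way the three conditions are combined (on entrykey and definum) is an assumption of the sketch. Python's `in` plays the part of the `$` operator and `!=` that of `#`.

```python
coword = [
    {"entrykey": "B@ZV", "headclas": "S", "pos": "v"},
    {"entrykey": "X001", "headclas": "C", "pos": "y"},        # hypothetical compound
]
codefi = [
    {"entrykey": "B@ZV", "headclas": "S", "definum": "1"},
    {"entrykey": "B@ZV", "headclas": "S", "definum": "2"},
    {"entrykey": "X001", "headclas": "I", "definum": "1"},    # hypothetical idiom
]
codcod = [
    {"entrykey": "B@ZV", "definum": "1", "gramcode": "I0 ", "rghtlink": " "},
    {"entrykey": "B@ZV", "definum": "2", "gramcode": "T1 ", "rghtlink": " "},
    {"entrykey": "X001", "definum": "1", "gramcode": "I0 ", "rghtlink": " "},
]


def select_readings(gramcode):
    """Return (entrykey, definum) pairs that pass all three filters."""
    # Filter on ENTRIES: 'v' $ pos .and. headclas = "S"
    entries = {w["entrykey"] for w in coword
               if "v" in w["pos"] and w["headclas"] == "S"}
    # Filter on SEMANTICS: headclas # "I"
    senses = {(d["entrykey"], d["definum"]) for d in codefi
              if d["headclas"] != "I"}
    # Filter on GRAMMAR: three-byte gramcode and empty rghtlink
    return [(c["entrykey"], c["definum"]) for c in codcod
            if c["gramcode"] == gramcode and c["rghtlink"] == " "
            and c["entrykey"] in entries
            and (c["entrykey"], c["definum"]) in senses]


print(select_readings("I0 "))
```

With the intransitive code the sketch keeps only the first reading of believe; asking for `"T1 "` instead keeps the second.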

B. RAW IMPORTED MATERIAL

            As pointed out above, the application programs include a rather elementary generator of inflectional variants; this generator has access to a data base of irregular forms and uses the flag for consonant redoubling of coword (FLAGIF field - see above, p. 126).
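
A generator of this kind might look as follows in Python. The spelling rules below are assumptions made for the sake of illustration (the actual EXPORT procedure is not reproduced in this appendix); the `doubles_final` argument stands for the FLAGIF flag and the `irregular` dictionary for the data base of irregular forms. Only single-word lemmas are handled.

```python
def inflect(lemma, doubles_final=False, irregular=None):
    """Return (Vs, Ving, Ved, Ven) for a single-word verb lemma."""
    if irregular and lemma in irregular:
        return irregular[lemma]              # irregular forms take priority

    consonant_y = (len(lemma) > 1 and lemma.endswith("y")
                   and lemma[-2] not in "aeiou")

    # Third person singular
    if lemma.endswith(("s", "x", "z", "ch", "sh")):
        vs = lemma + "es"
    elif consonant_y:
        vs = lemma[:-1] + "ies"
    else:
        vs = lemma + "s"

    # -ing form and past forms
    if lemma.endswith("e"):
        keeps_e = lemma.endswith(("ee", "oe", "ye"))
        ving = (lemma if keeps_e else lemma[:-1]) + "ing"
        ved = lemma + "d"
    elif doubles_final:                      # FLAGIF set: red -> redder, abet -> abetted
        ving = lemma + lemma[-1] + "ing"
        ved = lemma + lemma[-1] + "ed"
    elif consonant_y:
        ving = lemma + "ing"
        ved = lemma[:-1] + "ied"
    else:
        ving = lemma + "ing"
        ved = lemma + "ed"
    return (vs, ving, ved, ved)


print(inflect("abate"))
print(inflect("abet", doubles_final=True))
```

Even in so small a sketch some spellings remain out of reach of the rules, which is one reason why the inflectional variants have to be hand-checked in step c).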

(Rnu = LDOCE reading number)

LEX                  Vs        Ving      Ved    Ven    Pos  Gr   Sem    Rnu            
abate                abates abating abated abated v     I0  ....T.....  1
DEF   (of winds, storms, disease, pain, etc.) to become less strong; decrease
EX    The ship waited till the storm abated before sailing out to sea

abrade               abrades abrading abraded abraded v     I0  ....Z....5  1a
DEF   (esp. of skin) to wear away by hard rubbing; to cause (esp. skin) to wear away by hard rubbing

accelerate           accelerates accelerating accelerated accelerated v     I0  ....Z....Z  1       
DEF   to (cause to) move faster

accept               accepts accepting accepted accepted v     I0  ....H....Z  1       
DEF   to take or receive (something offered or given), esp. willingly; receive with favour
EX    I cannot accept your gift. He asked her to marry him and she accepted (him)

C. AWK PROGRAM TO CONVERT RAW MATERIAL TO HORATIO FORMAT

            The if-clauses test the value of the fifth byte in the 10-byte field allocated to the semantic codes; this byte houses the restrictions on the subject: J codes "Movable and Solid", deemed to roughly correspond to the feature "thing" in horatio; H is "human" and T is "abstract"; absence of semantic restriction on the subject is coded in horatio by means of Prolog's anonymous variable (_).

BEGIN {nu=1}

# nu is the entry counter

{  semsubj="_"

   # semsubj is assigned the anonymous variable unless J, H or T is found at the appropriate place in
   # the semantic info field

   if (substr($8,5,1)=="J") semsubj = "thing"
   if (substr($8,5,1)=="H") semsubj = "human"
   if (substr($8,5,1)=="T") semsubj = "abstract"
  
   if (($1 ~ /DEF/ || $1 ~ /EX/) && NF>1)
          printf("/* %s */\n",$0)

   # the first part of the IF clause deals with definition and example lines; these appear as comments
   # in the Prolog clauses

      else
          { if (NF>1)
          {printf("\n/* NU %d */\n",nu)  
          printf("m_verb(verbintr,%s_%s,%s,%s,%s,%s,%s,\n%s,%s,%s,intrans,%s,\n[]).\n",
          $1,$9,$1,$1,$1,$2,$3,$4,$4,$5,semsubj)

                # $1 = Lex
                # $9 = Rnu
                # $2 = Vs
                # $3 = Ving
                # $4 = Ved
                # $5 = Ven

          nu++}}
   }

D. SAMPLE ENTRIES IMPORTED FROM LDOCE

/* NU 1 */
m_verb(verbintr,abate_1,abate,abate,abate,abates,abating,
abated,abated,abated,intrans,abstract,
[]).
/* DEF   (of winds, storms, disease, pain, etc.) to become less strong; decrease */
/* EX    The ship waited till the storm abated before sailing out to sea */

Note

/* The T feature transcribed as abstract is made more precise by the list of typical subjects headed by OF; such lists are best read as lists of thesauric heads; they are hard to make use of */

/* NU 2 */
m_verb(verbintr,abrade_1a,abrade,abrade,abrade,abrades,abrading,
abraded,abraded,abraded,intrans,_,
[]).
/* DEF   (esp. of skin) to wear away by hard rubbing; to cause (esp. skin) to wear away by hard rubbing */

Note

/* The Z value transcribed as the anonymous variable is hardly what we need: the selectional restriction is very strong, but is once again buried in the OF list  */

/* NU 3 */
m_verb(verbintr,accelerate_1,accelerate,accelerate,accelerate,accelerates,accelerating,
accelerated,accelerated,accelerated,intrans,_,
[]).
/* DEF   to (cause to) move faster */

/* NU 4 */
m_verb(verbintr,accept_1,accept,accept,accept,accepts,accepting,
accepted,accepted,accepted,intrans,human,
[]).
/* DEF   to take or receive (something offered or given), esp. willingly; receive with favour */
/* EX    I cannot accept your gift. He asked her to marry him and she accepted (him) */

            The main problem with intransitive verbs is how to distinguish them from transitive verbs that can have a zero (understood) object, either definite or indefinite. 

Mono-Transitive Verbs

A. FILTERS ON LDOCE

ENTRY LEVEL            'v' $ pos .and. headclas = "S"

SEMANTICS              headclas # "I"

GRAMMAR                gramcode = "T1 " .and. rghtlink = " "

B. RAW IMPORTED MATERIAL

abandon              abandons abandoning abandoned abandoned v     T1  ....H....T  1       
DEF   to leave completely and for ever; desert
EX    The sailors abandoned the burning ship

abandon              abandons abandoning abandoned abandoned v     T1  ..D.H....H  2       
DEF   to leave (a relation or friend) in a thoughtless or cruel way
EX    He abandoned his wife and went away with all their money

abandon              abandons abandoning abandoned abandoned v     T1  ....H....T  3       
DEF   to give up, esp. without finishing
EX    The search was abandoned when night came, even though the child had not been found

abate                abates abating abated abated v     T1  ....T....T  2       
DEF   lit to make less
EX    His pride was not abated by his many mistakes

abate                abates abating abated abated v     T1  ....H....T  3        
DEF   law to bring to an end (esp. in the phr. abate a nuisance)

 C. AWK PROGRAM

             Note that byte 10 in defisem (field 8) codes the semantic restriction on the object of a transitive verb.

BEGIN {nu=1}
{  semsubj="_"
   if (substr($8,5,1)=="J") semsubj = "thing"
   if (substr($8,5,1)=="H") semsubj = "human"
   if (substr($8,5,1)=="T") semsubj = "abstract"
  
   semobj = "_"
   if (substr($8,10,1)=="J") semobj = "thing"
   if (substr($8,10,1)=="H") semobj = "human"
   if (substr($8,10,1)=="T") semobj = "abstract"
  
   if (($1 ~ /DEF/ || $1 ~ /EX/) && NF>1)
          printf("/* %s */\n",$0)
      else
          { if (NF>1)
          {printf("\n/* NU %d */\n",nu)  
          printf("m_verb(verbtr,%s_%s,%s,%s,%s,%s,%s,\n%s,%s,%s,trans,%s,\n[np(oblig,posprec(1,Wnp),object,%s)]).\n",$1,$9,$1,$1,$1,$2,$3,$4,$4,$5,semsubj,semobj)
          nu++}}
   }

D. SAMPLE ENTRIES IMPORTED FROM LDOCE

/* NU 1 */
m_verb(verbtr,abandon_1,abandon,abandon,abandon,abandons,abandoning,
abandoned,abandoned,abandoned,trans,human,
[np(oblig,posprec(1,Wnp),object,abstract)]).
/* DEF   to leave completely and for ever; desert */
/* EX    The sailors abandoned the burning ship */

Note

/* The T value giving rise to the abstract value results from a miscoding on the part of the LDOCE lexicographers */

/* NU 2 */
m_verb(verbtr,abandon_2,abandon,abandon,abandon,abandons,abandoning,
abandoned,abandoned,abandoned,trans,human,
[np(oblig,posprec(1,Wnp),object,human)]).
/* DEF   to leave (a relation or friend) in a thoughtless or cruel way */
/* EX    He abandoned his wife and went away with all their money */

/* NU 3 */
m_verb(verbtr,abandon_3,abandon,abandon,abandon,abandons,abandoning,
abandoned,abandoned,abandoned,trans,human,
[np(oblig,posprec(1,Wnp),object,abstract)]).
/* DEF   to give up, esp. without finishing */
/* EX    The search was abandoned when night came, even though the child had not been found */

/* NU 6 */
m_verb(verbtr,abate_2,abate,abate,abate,abates,abating,
abated,abated,abated,trans,abstract,
[np(oblig,posprec(1,Wnp),object,abstract)]).
/* DEF   lit to make less */
/* EX    His pride was not abated by his many mistakes */

/* NU 7 */
m_verb(verbtr,abate_3,abate,abate,abate,abates,abating,
abated,abated,abated,trans,human,
[np(oblig,posprec(1,Wnp),object,abstract)]).
/* DEF   law to bring to an end (esp. in the phr. abate a nuisance) */

Note
/* The T feature is again much too general */

Prepositional Verbs

Examples: look at, listen to: strongly bound preposition

A. FILTERS ON LDOCE

ENTRY          'y' $ pos .and. headclas ="C"

('y' is the POS assigned to prepositional verbs; headclas is now "C", i.e. compound (more than one word))

SEM                headclas # "I"

GRAMMAR   substr(gramcode,1,2) ="T1"

B. RAW IMPORTED MATERIAL

abide by             abides abiding abided abided y     T1  ....H....T  1
DEF   to be faithful to; obey (laws, agreements, etc.)
EX    If you join the club you must abide by its rules
abide by             abides abiding abided abided y     T1  ....H....T  2       
DEF   to wait for or accept
EX    You must abide by the results of your mistakes
account for          accounts accounting accounted accounted y     T1  ....H...YT  1  -I to
DEF   to give a statement showing how money or goods left in one's care have been dealt with
EX    He has to account to the chairman for all the money he spends
account for          accounts accounting accounted accounted y     T1  ....H....T  2  -I to
DEF   to give an explanation or reason for
EX    He could not account for his foolish mistake
account for          accounts accounting accounted accounted y     T1  ..I.H....O  3       
DEF   infml to kill, shoot, or catch
EX    I think I accounted for 3 of the attackers

C. AWK PROGRAM

This program generates two clauses for each selected reading, in accordance with Quirk et al.'s concept of multiple analysis for such verbs, to which the reader is referred (see Quirk et al. 1985, §§ 2.61, 16.5 and here p. 19).

BEGIN {nu=1}
{  semsubj="_"
   if (substr($9,5,1)=="J") semsubj = "thing"
   if (substr($9,5,1)=="H") semsubj = "human"
   if (substr($9,5,1)=="T") semsubj = "abstract"
  
   semobj = "_"
   if (substr($9,10,1)=="J") semobj = "thing"
   if (substr($9,10,1)=="H") semobj = "human"
   if (substr($9,10,1)=="T") semobj = "abstract"
  
   if (($1 ~ /DEF/ || $1 ~ /EX/) && NF>1)
          printf("/* %s */\n",$0)
      else
          { if (NF>1)
          {printf("\n/* NU %d */\n",nu)  
          printf("m_verb(vtrprep,%s_%s_%s,%s,%s,%s,%s,%s,\n%s,%s,%s,trans,%s,\n[pp(oblig,posprec(1,Wpp),pp_arg,%s,_,%s)]).\n",
          $1,$2,$10,$1,$1,$1,$3,$4,$5,$5,$6,semsubj,semobj,$2)
        
          # $2 is the preposition; the second clause analyses it as a string followed by an object NP
          printf("m_verb(vtrprep,%s_%s_%s,%s,%s,%s,%s,%s,\n%s,%s,%s,trans,%s,\n[string(oblig,posprec(1,0),[%s]),\nnp(oblig,posprec(2,Wnp),object,%s)]).\n",
          $1,$2,$10,$1,$1,$1,$3,$4,$5,$5,$6,semsubj,$2,semobj)
          nu++}}
   }
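
The field layout assumed by the program above (the preposition shifts every field one place to the right, so that the semantic codes sit in field 9 and the reading number in field 10) can be cross-checked with a small Python sketch. The function name is hypothetical and the sketch handles entry lines only, not DEF or EX lines; it returns the two clauses of the double analysis as strings.

```python
CODES = {"J": "thing", "H": "human", "T": "abstract"}


def prepositional_clauses(raw_line):
    """Build the pp-argument clause and the string + np clause for one reading."""
    f = raw_line.split()
    verb, prep, vs, ving, ved, ven = f[0:6]
    sem, rnu = f[8], f[9]                     # awk's $9 and $10
    semsubj = CODES.get(sem[4], "_")          # byte 5: subject
    semobj = CODES.get(sem[9], "_")           # byte 10: object
    head = (f"m_verb(vtrprep,{verb}_{prep}_{rnu},{verb},{verb},{verb},{vs},{ving},\n"
            f"{ved},{ved},{ven},trans,{semsubj},\n")
    as_pp = head + f"[pp(oblig,posprec(1,Wpp),pp_arg,{semobj},_,{prep})])."
    as_np = head + (f"[string(oblig,posprec(1,0),[{prep}]),\n"
                    f"np(oblig,posprec(2,Wnp),object,{semobj})]).")
    return as_pp, as_np


line = "abide by             abides abiding abided abided y     T1  ....H....T  1"
for clause in prepositional_clauses(line):
    print(clause)
```

The first clause treats by plus its noun phrase as a prepositional phrase argument; the second treats by as a fixed string followed by an object noun phrase.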

D. SAMPLE ENTRIES IMPORTED FROM LDOCE

/* NU 1 */
m_verb(vtrprep,abide_by_1,abide,abide,abide,abides,abiding,
abided,abided,abided,trans,human,
[pp(oblig,posprec(1,Wpp),pp_arg,abstract,_,by)]).
m_verb(vtrprep,abide_by_1,abide,abide,abide,abides,abiding,
abided,abided,abided,trans,human,
[string(oblig,posprec(1,0),[by]),
np(oblig,posprec(2,Wnp),object,abstract)]).
/* DEF   to be faithful to; obey (laws, agreements, etc.) */
/* EX    If you join the club you must abide by its rules */

Note
/* Here too the specification of the thesauric type of the object is much more precise than the T code */

/* NU 2 */
m_verb(vtrprep,abide_by_2,abide,abide,abide,abides,abiding,
abided,abided,abided,trans,human,
[pp(oblig,posprec(1,Wpp),pp_arg,abstract,_,by)]).
m_verb(vtrprep,abide_by_2,abide,abide,abide,abides,abiding,
abided,abided,abided,trans,human,
[string(oblig,posprec(1,0),[by]),
np(oblig,posprec(2,Wnp),object,abstract)]).
/* DEF   to wait for or accept */
/* EX    You must abide by the results of your mistakes */

/* NU 5 */
m_verb(vtrprep,account_for_1,account,account,account,accounts,accounting,
accounted,accounted,accounted,trans,human,
[pp(oblig,posprec(1,Wpp),pp_arg,abstract,_,for)]).
m_verb(vtrprep,account_for_1,account,account,account,accounts,accounting,
accounted,accounted,accounted,trans,human,
[string(oblig,posprec(1,0),[for]),
np(oblig,posprec(2,Wnp),object,abstract)]).
/* DEF   to give a statement showing how money or goods left in one's care have been dealt with */
/* EX    He has to account to the chairman for all the money he spends */

/* NU 6 */
m_verb(vtrprep,account_for_2,account,account,account,accounts,accounting,
accounted,accounted,accounted,trans,human,
[pp(oblig,posprec(1,Wpp),pp_arg,abstract,_,for)]).
m_verb(vtrprep,account_for_2,account,account,account,accounts,accounting,
accounted,accounted,accounted,trans,human,
[string(oblig,posprec(1,0),[for]),
np(oblig,posprec(2,Wnp),object,abstract)]).
/* DEF   to give an explanation or reason for */
/* EX    He could not account for his foolish mistake */

/* NU 7 */
m_verb(vtrprep,account_for_3,account,account,account,accounts,accounting,
accounted,accounted,accounted,trans,human,
[pp(oblig,posprec(1,Wpp),pp_arg,_,_,for)]).
m_verb(vtrprep,account_for_3,account,account,account,accounts,accounting,
accounted,accounted,accounted,trans,human,
[string(oblig,posprec(1,0),[for]),
np(oblig,posprec(2,Wnp),object,_)]).
/* DEF   infml to kill, shoot, or catch */
/* EX    I think I accounted for 3 of the attackers */

Transitive Phrasal Verbs

Examples: look up, put off: transitive phrasal verbs: look it up, *look up it  

A. FILTERS ON LDOCE

ENTRY            'z' $ pos .and. headclas = "C"

('z' is the pos associated with phrasal verbs in LDOCE)

SEM                headclas # "I"

GRAMMAR            substr(gramcode,1,2) ="T1"

B. RAW IMPORTED MATERIAL

act out              acts acting acted acted z     T1  ....H....T  1       
DEF   to express (thoughts, unconscious fears, etc.) in actions and behaviour rather than in words
add up               adds adding added added z     T1  ....H....T  2       
DEF   to add (numbers) together to get a total

C. AWK PROGRAM

BEGIN {nu=1}
{  semsubj="_"
   if (substr($9,5,1)=="J") semsubj = "thing"
   if (substr($9,5,1)=="H") semsubj = "human"
   if (substr($9,5,1)=="T") semsubj = "abstract"
  
   semobj = "_"
   if (substr($9,10,1)=="J") semobj = "thing"
   if (substr($9,10,1)=="H") semobj = "human"
   if (substr($9,10,1)=="T") semobj = "abstract"
  
   if (($1 ~ /DEF/ || $1 ~ /EX/) && NF>1)
          printf("/* %s */\n",$0)
      else
          { if (NF>1)
          {printf("\n/* NU %d */\n",nu)  
          printf("m_verb(vphr,%s_%s_%s,%s,%s,%s,%s,%s,\n%s,%s,%s,trans,%s,\n[part(oblig,posprec(1,2),[%s]),\nnp(oblig,posprec(1,Wnp),object,%s)]).\n",
          $1,$2,$10,$1,$1,$1,$3,$4,$5,$5,$6,semsubj,$2,semobj)
          nu++}}
   }

D. SAMPLE ENTRIES IMPORTED FROM LDOCE

/* NU 1 */
m_verb(vphr,act_out_1,act,act,act,acts,acting,
acted,acted,acted,trans,human,
[part(oblig,posprec(1,2),[out]),
np(oblig,posprec(1,Wnp),object,abstract)]).
/* DEF   to express (thoughts, unconscious fears, etc.) in actions and behaviour rather than in words */

/* NU 2 */
m_verb(vphr,add_up_2,add,add,add,adds,adding,
added,added,added,trans,human,
[part(oblig,posprec(1,2),[up]),
np(oblig,posprec(1,Wnp),object,abstract)]).
/* DEF   to add (numbers) together to get a total */

Note
/* In both these entries the bracketed material in the definition is again much more precise than the semantic code; the two cases are different, however: in act_out, the list is open-ended and has exemplificatory value only; in add_up, the one-member list is best read as head of a thesauric class */

Verbs Taking an Object and an Object Complement

A. Object complement is a noun phrase

Example: consider:  he considers the teacher a genius

A. FILTERS ON LDOCE

ENTRY            'v' $ pos .and. headclas = "S"

SEM                headclas # "I"

GRAMMAR            gramcode="X1 "  .or. gramcode="X1e"

(in the Liège LDOCE data base, X1e corresponds to X (to be) 1 in the printed version of LDOCE)

B. RAW IMPORTED MATERIAL

acclaim              acclaims acclaiming acclaimed acclaimed v     X1  ....H..T.H  2       
DEF   to declare to be or publicly recognize as, esp. with loud shouts of approval or praise
EX    They acclaimed him as the best writer of the year. They acclaimed her their leader
account              accounts accounting accounted accounted v     X1  ....H....Z  1       
DEF   to consider
EX    He was accounted a wise man. He accounted himself lucky to be alive
acknowledge          acknowledges acknowledging acknowledged acknowledged v     X1e ....H....H  2       
DEF   to recognize, accept, or admit (as)
EX    He was acknowledged to be the best player. He was acknowledged as their leader. They acknowledged themselves (to be) defeated

C. AWK PROGRAM

BEGIN {nu=1}
{  semsubj="_"
   if (substr($8,5,1)=="J") semsubj = "thing"
   if (substr($8,5,1)=="H") semsubj = "human"
   if (substr($8,5,1)=="T") semsubj = "abstract"
  
   semobj = "_"
   if (substr($8,10,1)=="J") semobj = "thing"
   if (substr($8,10,1)=="H") semobj = "human"
   if (substr($8,10,1)=="T") semobj = "abstract"

 # semattr houses restrictions on the object complement
   semattr = "_"
   if (substr($8,8,1)=="J") semattr = "thing"
   if (substr($8,8,1)=="H") semattr = "human"
   if (substr($8,8,1)=="T") semattr = "abstract"
  
   if (($1 ~ /DEF/ || $1 ~ /EX/) && NF>1)
          printf("/* %s */\n",$0)
      else
          { if (NF>1)
          {printf("\n/* NU %d */\n",nu)  
          printf("m_verb(vcomp,%s_%s,%s,%s,%s,%s,%s,\n%s,%s,%s,trans,%s,\n[np(oblig,posprec(1,Wnp),object,%s),\nnp(oblig,posprec(2,Wnp2),object_attribute,%s)]).\n",
          $1,$9,$1,$1,$1,$2,$3,$4,$4,$5,semsubj,semobj,semattr)
          nu++}}
   }

D. SAMPLE ENTRIES IMPORTED FROM LDOCE

/* NU 1 */
m_verb(vcomp,acclaim_2,acclaim,acclaim,acclaim,acclaims,acclaiming,
acclaimed,acclaimed,acclaimed,trans,human,
[np(oblig,posprec(1,Wnp),object,human),
np(oblig,posprec(2,Wnp2),object_attribute,abstract)]).
/* DEF   to declare to be or publicly recognize as, esp. with loud shouts of approval or praise */
/* EX    They acclaimed him as the best writer of the year. They acclaimed her their leader */

/* NU 2 */
m_verb(vcomp,account_1,account,account,account,accounts,accounting,
accounted,accounted,accounted,trans,human,
[np(oblig,posprec(1,Wnp),object,_),
np(oblig,posprec(2,Wnp2),object_attribute,_)]).
/* DEF   to consider */
/* EX    He was accounted a wise man. He accounted himself lucky to be alive */

/* NU 3 */
m_verb(vcomp,acknowledge_2,acknowledge,acknowledge,acknowledge,acknowledges,acknowledging,
acknowledged,acknowledged,acknowledged,trans,human,
[np(oblig,posprec(1,Wnp),object,human),
np(oblig,posprec(2,Wnp2),object_attribute,_)]).
/* DEF   to recognize, accept, or admit (as) */
/* EX    He was acknowledged to be the best player. He was acknowledged as their leader. They acknowledged themselves (to be) defeated */

Note
/* The [hu] feature on the object is too restrictive: The problem was acknowledged one of the hardest in the field  */

B. Object complement is an adjective phrase

Example: consider:  he considers the teacher very intelligent

A. FILTERS ON LDOCE

ENTRY            'v' $ pos .and. headclas = "S"

SEM                headclas # "I"

GRAMMAR            gramcode="X7 " .or.  gramcode="X7e"

(in the Liège LDOCE data base, X7e corresponds to X (to be) 7 in the printed version of LDOCE)

B. RAW IMPORTED MATERIAL

account              accounts accounting accounted accounted v     X7  ....H....Z  1       
DEF   to consider
EX    He was accounted a wise man. He accounted himself lucky to be alive
acknowledge          acknowledges acknowledging acknowledged acknowledged v     X7e ....H....H  2       
DEF   to recognize, accept, or admit (as)
EX    He was acknowledged to be the best player. He was acknowledged as their leader. They acknowledged themselves (to be) defeated

C. AWK PROGRAM

BEGIN {nu=1}
{  semsubj="_"
   if (substr($8,5,1)=="J") semsubj = "thing"
   if (substr($8,5,1)=="H") semsubj = "human"
   if (substr($8,5,1)=="T") semsubj = "abstract"
  
   semobj = "_"
   if (substr($8,10,1)=="J") semobj = "thing"
   if (substr($8,10,1)=="H") semobj = "human"
   if (substr($8,10,1)=="T") semobj = "abstract"

   semattr = "_"
   if (substr($8,8,1)=="J") semattr = "thing"
   if (substr($8,8,1)=="H") semattr = "human"
   if (substr($8,8,1)=="T") semattr = "abstract"
  
   if (($1 ~ /DEF/ || $1 ~ /EX/) && NF>1)
          printf("/* %s */\n",$0)
      else
          { if (NF>1)
          {printf("\n/* NU %d */\n",nu)  
          printf("m_verb(vcomp,%s_%s,%s,%s,%s,%s,%s,\n%s,%s,%s,trans,%s,\n[np(oblig,posprec(1,Wnp),object,%s),\nadjp(oblig,posprec(2,W),object_attribute,%s)]).\n",
          $1,$9,$1,$1,$1,$2,$3,$4,$4,$5,semsubj,semobj,semattr)
          nu++}}
   }

D. SAMPLE ENTRIES IMPORTED FROM LDOCE

/* NU 1 */
m_verb(vcomp,account_1,account,account,account,accounts,accounting,
accounted,accounted,accounted,trans,human,
[np(oblig,posprec(1,Wnp),object,_),
adjp(oblig,posprec(2,W),object_attribute,_)]).
/* DEF   to consider */
/* EX    He was accounted a wise man. He accounted himself lucky to be alive */

/* NU 2 */
m_verb(vcomp,acknowledge_2,acknowledge,acknowledge,acknowledge,acknowledges,acknowledging,
acknowledged,acknowledged,acknowledged,trans,human,
[np(oblig,posprec(1,Wnp),object,human),
adjp(oblig,posprec(2,W),object_attribute,_)]).
/* DEF   to recognize, accept, or admit (as) */
/* EX    He was acknowledged to be the best player. He was acknowledged as their leader. They acknowledged themselves (to be) defeated */

Ditransitive Verbs

A. FILTERS ON LDOCE

ENTRY            'v' $ pos .and. headclas = "S"

SEM                headclas # "I"

GRAMMAR            gramcode="D1 "

B. RAW IMPORTED MATERIAL

accord               accords according accorded accorded v     D1  ..F.H..T.H  2  -I to
DEF   fml to give; allow
EX    He was accorded permission to use the library
afford               affords affording afforded afforded v     D1  ...BZ....T  3       
DEF   fml & lit to provide with; supply with; give
EX    The tree afforded us shelter from the rain
allocate             allocates allocating allocated allocated v     D1  ....H..TAU  2       
DEF   to give as a share
EX    We allocated the society some money
allocate             allocates allocating allocated allocated v     D1  ....H...AT  3       
DEF   to set apart for somebody or some purpose
EX    That space has already been allocated for building a new hospital

C. AWK PROGRAM

            Paul Procter's memorandum to the LDOCE editors dated 16/9/1974 specifies that byte 10 of what is for us defisem should be used to place semantic restrictions on the first object or nominal complement of a ditransitive verb. Byte 8 should be used for the second object. However, there is some ambiguity in these specifications, in so far as the indirect object can be positionally the first object, but conceptually the second. As a matter of fact, the LDOCE lexicographers have been rather inconsistent in their coding of semantic restrictions on vio verbs. The awk program caters for the most frequent type of coding, i.e. byte 8 for the direct object and byte 10 for the indirect one.

BEGIN {nu=1}
{  semsubj="_"
   if (substr($8,5,1)=="J") semsubj = "thing"
   if (substr($8,5,1)=="H") semsubj = "human"
   if (substr($8,5,1)=="T") semsubj = "abstract"
  
   semiobj = "_"
   if (substr($8,10,1)=="J") semiobj = "thing"
   if (substr($8,10,1)=="H") semiobj = "human"
   if (substr($8,10,1)=="T") semiobj = "abstract"

   semobj = "_"
   if (substr($8,8,1)=="J") semobj = "thing"
   if (substr($8,8,1)=="H") semobj = "human"
   if (substr($8,8,1)=="T") semobj = "abstract"
  
   if (($1 ~ /DEF/ || $1 ~ /EX/) && NF>1)
          printf("/* %s */\n",$0)
      else
          { if (NF>1)
          {printf("\n/* NU %d */\n",nu)  
          printf("m_verb(vio,%s_%s,%s,%s,%s,%s,%s,\n%s,%s,%s,trans,%s,\n[np(oblig,posprec(2,Wnp1),object,%s),\nio(opt,posprec(1,W2),indirect_object,%s,_)]).\n",
          $1,$9,$1,$1,$1,$2,$3,$4,$4,$5,semsubj,semobj,semiobj)
          nu++}}
   }
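
The three blocks of if-clauses above all apply the same mapping from an LDOCE semantic code letter to a label, at byte 5 (subject), byte 8 (direct object) and byte 10 (indirect object) of the semantic field. A minimal sketch of that mapping factored into one function is given here. The names decode_sem and sem_label are illustrative and not part of Horatio; the shell wrapper is only there so that the awk fragment can be run on its own, and an awk with user-defined functions (POSIX awk, nawk, gawk) is assumed.

```sh
# Sketch only: decode_sem and sem_label are hypothetical names.
decode_sem() {
  awk '
  # Map one LDOCE semantic code letter to the label used in the m_verb entries.
  # Any other letter (or ".") leaves the restriction open, written "_".
  function sem_label(code) {
    if (code == "J") return "thing"
    if (code == "H") return "human"
    if (code == "T") return "abstract"
    return "_"
  }
  # Headword lines carry the ten-byte semantic field in $8 and the sense number in $9.
  # DEF and EX lines, and empty lines, are skipped.
  $1 != "DEF" && $1 != "EX" && NF > 1 {
    printf("%s_%s subject=%s object=%s indirect_object=%s\n", $1, $9,
           sem_label(substr($8, 5, 1)),
           sem_label(substr($8, 8, 1)),
           sem_label(substr($8, 10, 1)))
  }'
}
```

Fed the raw accord line from section B, decode_sem prints accord_2 subject=human object=abstract indirect_object=human, which agrees with the NU 1 entry in section D.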

D. SAMPLE ENTRIES IMPORTED FROM LDOCE

/* NU 1 */
m_verb(vio,accord_2,accord,accord,accord,accords,according,
accorded,accorded,accorded,trans,human,
[np(oblig,posprec(2,Wnp1),object,abstract),
io(opt,posprec(1,W2),indirect_object,human,_)]).
/* DEF   fml to give; allow */
/* EX    He was accorded permission to use the library */

/* NU 2 */
m_verb(vio,afford_3,afford,afford,afford,affords,affording,
afforded,afforded,afforded,trans,_,
[np(oblig,posprec(2,Wnp1),object,_),
io(opt,posprec(1,W2),indirect_object,abstract,_)]).
/* DEF   fml & lit to provide with; supply with; give */
/* EX    The tree afforded us shelter from the rain */

Note
/* The T feature is misplaced: it should be placed on the direct object, not the indirect one */

/* NU 3 */
m_verb(vio,allocate_2,allocate,allocate,allocate,allocates,allocating,
allocated,allocated,allocated,trans,human,
[np(oblig,posprec(2,Wnp1),object,abstract),
io(opt,posprec(1,W2),indirect_object,_,_)]).
/* DEF   to give as a share */
/* EX    We allocated the society some money */

/* NU 4 */
m_verb(vio,allocate_3,allocate,allocate,allocate,allocates,allocating,
allocated,allocated,allocated,trans,human,
[np(oblig,posprec(2,Wnp1),object,_),
io(opt,posprec(1,W2),indirect_object,abstract,_)]).
/* DEF   to set apart for somebody or some purpose */
/* EX    That space has already been allocated for building a new hospital */

Note
/* Again, T is misplaced */

Verbs Taking an Object and a That-Clause

Example: tell:  he told the teacher that he had taught linguistics

A. FILTERS ON LDOCE

ENTRY            'v' $ pos .and. headclas = "S"

SEM                headclas # "I"

GRAMMAR            substr(gramcode,1,2) = "D5"

B. RAW IMPORTED MATERIAL

advise               advises advising advised advised v     D5  ....H..T.X  1       
DEF   to tell (somebody) what one thinks should be done; give advice to (somebody)
EX    I advise waiting till the proper time. I will do as you advise. I advised her that she should wait. I advised her where to stay. I advise you to leave now

advise               advises advising advised advised v     D5  ..F.H..T.X  2       
DEF   fml to give notice to; inform
EX    I have advised her that we are coming. Will you advise us (of) when the bags should arrive?

assure               assures assuring assured assured v     D5a ....H..Z.H  1       
DEF   to try to cause to believe or trust in something; promise; try to persuade
EX    I assure you that this medicine cannot harm you. He assured us of his ability to work
assure               assures assuring assured assured v     D5a ....H..Z.H  2       
DEF   to make (oneself) sure or certain
EX    Before going to bed she assured herself that the door was locked

C. AWK PROGRAM

BEGIN {nu=1}
{  semsubj="_"
   if (substr($8,5,1)=="J") semsubj = "thing"
   if (substr($8,5,1)=="H") semsubj = "human"
   if (substr($8,5,1)=="T") semsubj = "abstract"
  
   semiobj = "_"
   if (substr($8,10,1)=="J") semiobj = "thing"
   if (substr($8,10,1)=="H") semiobj = "human"
   if (substr($8,10,1)=="T") semiobj = "abstract"

  
   if (($1 ~ /DEF/ || $1 ~ /EX/) && NF>1)
          printf("/* %s */\n",$0)
      else
          { if (NF>1)
          {printf("\n/* NU %d */\n",nu)  
          printf("m_verb(viothat,%s_%s,%s,%s,%s,%s,%s,\n%s,%s,%s,trans,%s,\n[io(opt,posprec(1,W2),indirect_object,%s,np),\ns(oblig,posprec(2,W),object)]).\n",
          $1,$9,$1,$1,$1,$2,$3,$4,$4,$5,semsubj,semiobj)
          nu++}}
   }
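
Since every entry is produced from a single printf template, a slip in the template is reproduced in each generated clause. The following is a rough sketch of a bracket-balance check that could be run over the generated file before it is consulted. The name check_balance is illustrative, not part of Horatio; the check only counts parentheses and square brackets per clause and assumes, as in the samples, that each clause ends in ")." and that comment lines start with "/*".

```sh
# Sketch only: check_balance is a hypothetical helper.
check_balance() {
  awk '
  # Comment lines such as /* DEF ... */ are not part of a clause and are skipped.
  /^\/\*/ { next }
  {
    text = $0
    opens  = gsub(/\(/, "", text)   # gsub returns the number of "(" removed
    closes = gsub(/\)/, "", text)
    depth += opens - closes
    text = $0
    bopens  = gsub(/\[/, "", text)
    bcloses = gsub(/\]/, "", text)
    bdepth += bopens - bcloses
    # A line ending in ")." closes the current clause.
    if ($0 ~ /\)\.[ \t]*$/) {
      nclause++
      if (depth != 0 || bdepth != 0) {
        printf("clause %d ending at line %d is unbalanced\n", nclause, NR)
        bad++
      }
      depth = 0; bdepth = 0
    }
  }
  END { printf("%d clause(s) checked, %d unbalanced\n", nclause, bad) }'
}
```

The function reads the generated entries on standard input and ends with a one-line summary; it is a counting check only and does not replace loading the file into Prolog.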

D. SAMPLE ENTRIES IMPORTED FROM LDOCE

/* NU 1 */
m_verb(viothat,advise_1,advise,advise,advise,advises,advising,
advised,advised,advised,trans,human,
[io(opt,posprec(1,W2),indirect_object,_,np),
s(oblig,posprec(2,W),object)]).
/* DEF   to tell (somebody) what one thinks should be done; give advice to (somebody) */
/* EX    I advise waiting till the proper time. I will do as you advise. I advised her that she should wait. I advised her where to stay. I advise you to leave now */

/* NU 2 */
m_verb(viothat,advise_2,advise,advise,advise,advises,advising,
advised,advised,advised,trans,human,
[io(opt,posprec(1,W2),indirect_object,_,np),
s(oblig,posprec(2,W),object)]).
/* DEF   fml to give notice to; inform */
/* EX    I have advised her that we are coming. Will you advise us (of) when the bags should arrive? */

/* NU 3 */
m_verb(viothat,assure_1,assure,assure,assure,assures,assuring,
assured,assured,assured,trans,human,
[io(opt,posprec(1,W2),indirect_object,human,np),
s(oblig,posprec(2,W),object)]).
/* DEF   to try to cause to believe or trust in something; promise; try to persuade */
/* EX    I assure you that this medicine cannot harm you. He assured us of his ability to work */

/* NU 4 */
m_verb(viothat,assure_2,assure,assure,assure,assures,assuring,
assured,assured,assured,trans,human,
[io(opt,posprec(1,W2),indirect_object,human,np),
s(oblig,posprec(2,W),object)]).
/* DEF   to make (oneself) sure or certain */
/* EX    Before going to bed she assured herself that the door was locked */

Note
/* This second reading should be restricted to assure oneself */

Verbs Taking an Object and a Prepositional Object

A. FILTERS ON LDOCE

ENTRY            'v' $ pos .and. headclas = "S"

SEM                headclas # "I"

GRAMMAR            substr(gramcode,1,2) = "T1" .and. righttypo="I"

(If righttypo = "I" (italics), there is a lexical item in the right-hand context and it is a preposition: recall that, in the rghtctx field, particles are set in bold and prepositions in italics.)

B. RAW IMPORTED MATERIAL

abandon              abandons abandoning abandoned abandoned v     T1  ....H....H  4  -I to
DEF   to give (oneself) up completely to a feeling, desire, etc.
EX    He abandoned himself to grief. abandoned behaviour
abet                 abets abetting abetted abetted v     T1  ....H....X  1  -I in
DEF   to encourage or give help to (a crime or criminal)
EX    He abetted the thief in robbing the bank
absent               absents absenting absented absented v     T1  ..F.H....H  1  -I fro
DEF   to keep (oneself) away
EX    He absented himself from the meeting
absolve              absolves absolving absolved absolved v     T1  ....X....H  2  -I fro
DEF   to free (someone) from fulfilling a promise or a duty, or from having to suffer for wrongdoing
abstract             abstracts abstracting abstracted abstracted v     T1  ....H....C  1  -I fro
DEF   tech to remove by drawing out gently; separate
abstract             abstracts abstracting abstracted abstracted v     T1  ..E.H....C  2  -I fro
DEF   euph to steal
acclaim              acclaims acclaiming acclaimed acclaimed v     T1  ....H..T.H  2  -I as
DEF   to declare to be or publicly recognize as, esp. with loud shouts of approval or praise
EX    They acclaimed him as the best writer of the year. They acclaimed her their leader

C. AWK PROGRAM

Here the semantic restrictions on the direct object are in byte 10; byte 8 codes restrictions on the prepositional object.

BEGIN {nu=1}
{  semsubj="_"
   if (substr($8,5,1)=="J") semsubj = "thing"
   if (substr($8,5,1)=="H") semsubj = "human"
   if (substr($8,5,1)=="T") semsubj = "abstract"
  
   semobj = "_"
   if (substr($8,10,1)=="J") semobj = "thing"
   if (substr($8,10,1)=="H") semobj = "human"
   if (substr($8,10,1)=="T") semobj = "abstract"

  # sempobj houses semantic restrictions on the prepositional object
   sempobj = "_"
   if (substr($8,8,1)=="J") sempobj = "thing"
   if (substr($8,8,1)=="H") sempobj = "human"
   if (substr($8,8,1)=="T") sempobj = "abstract"

   # the following if-clauses expand the abbreviations used for prepositions in the right context field
   if ($11 == "fro") $11="from"
   if ($11 == "wit") $11="with"
   if ($11 == "til") $11="till"
   if ($11 == "unt") $11="until"
   if ($11 == "int") $11="into"
   if ($11 == "amo") $11="among"
   if ($11 == "bet") $11="between"
   if ($11 == "ouf") $11="out_of"
   if ($11 == "uon") $11="upon"
  
   if (($1 ~ /DEF/ || $1 ~ /EX/) && NF>1)
          printf("/* %s */\n",$0)
      else
          { if (NF>1)
          {printf("\n/* NU %d */\n",nu)  
          printf("m_verb(vobjfreepp,%s_%s,%s,%s,%s,%s,%s,\n%s,%s,%s,trans,%s,\n[np(oblig,posprec(1,Wnp),object,%s),\npp(oblig,posprec(1,Wpp),pp_arg,%s,_,%s)]).\n",
          $1,$9,$1,$1,$1,$2,$3,$4,$4,$5,semsubj,semobj,sempobj,$11)
          nu++}}
   }
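
The chain of if-clauses that expands the truncated prepositions of the right context field could equally be written as a table lookup, which keeps the abbreviations in one place. The sketch below is an alternative formulation, not the program actually used; expand_prep is an illustrative name, and the table holds only the nine abbreviations listed in the program above.

```sh
# Sketch only: expand_prep is a hypothetical helper.
expand_prep() {
  awk '
  BEGIN {
    # Three-letter abbreviations taken from the if-clauses of the program above.
    table = "fro:from wit:with til:till unt:until int:into"
    table = table " amo:among bet:between ouf:out_of uon:upon"
    n = split(table, pairs, " ")
    for (i = 1; i <= n; i++) {
      split(pairs[i], kv, ":")
      full[kv[1]] = kv[2]
    }
  }
  # Prepositions that are not in the table (to, in, as, ...) pass through unchanged.
  { print (($1 in full) ? full[$1] : $1) }'
}
```

With one preposition per input line, fro, in and ouf come out as from, in and out_of. In the program itself the same array could be consulted on $11 before the printf.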

D. SAMPLE ENTRIES IMPORTED FROM LDOCE

/* NU 1 */
m_verb(vobjfreepp,abandon_4,abandon,abandon,abandon,abandons,abandoning,
abandoned,abandoned,abandoned,trans,human,
[np(oblig,posprec(1,Wnp),object,human),
pp(oblig,posprec(1,Wpp),pp_arg,_,_,to)]).
/* DEF   to give (oneself) up completely to a feeling, desire, etc. */
/* EX    He abandoned himself to grief. abandoned behaviour */

Note
/* The adjective abandoned is best dealt with as a separate entry; the entry here should be abandon oneself to */

/* NU 2 */
m_verb(vobjfreepp,abet_1,abet,abet,abet,abets,abetting,
abetted,abetted,abetted,trans,human,
[np(oblig,posprec(1,Wnp),object,_),
pp(oblig,posprec(1,Wpp),pp_arg,_,_,in)]).
/* DEF   to encourage or give help to (a crime or criminal) */
/* EX    He abetted the thief in robbing the bank */

/* NU 3 */
m_verb(vobjfreepp,absent_1,absent,absent,absent,absents,absenting,
absented,absented,absented,trans,human,
[np(oblig,posprec(1,Wnp),object,human),
pp(oblig,posprec(1,Wpp),pp_arg,_,_,from)]).
/* DEF   to keep (oneself) away */
/* EX    He absented himself from the meeting */

Note
/* The entry should read absent oneself from; He absented her from the meeting is jocular at best */

/* NU 4 */
m_verb(vobjfreepp,absolve_2,absolve,absolve,absolve,absolves,absolving,
absolved,absolved,absolved,trans,_,
[np(oblig,posprec(1,Wnp),object,human),
pp(oblig,posprec(1,Wpp),pp_arg,_,_,from)]).
/* DEF   to free (someone) from fulfilling a promise or a duty, or from having to suffer for wrongdoing */

/* NU 5 */
m_verb(vobjfreepp,abstract_1,abstract,abstract,abstract,abstracts,abstracting,
abstracted,abstracted,abstracted,trans,human,
[np(oblig,posprec(1,Wnp),object,_),
pp(oblig,posprec(1,Wpp),pp_arg,_,_,from)]).
/* DEF   tech to remove by drawing out gently; separate */

/* NU 6 */
m_verb(vobjfreepp,abstract_2,abstract,abstract,abstract,abstracts,abstracting,
abstracted,abstracted,abstracted,trans,human,
[np(oblig,posprec(1,Wnp),object,_),
pp(oblig,posprec(1,Wpp),pp_arg,_,_,from)]).
/* DEF   euph to steal */

/* NU 7 */
m_verb(vobjfreepp,acclaim_2,acclaim,acclaim,acclaim,acclaims,acclaiming,
acclaimed,acclaimed,acclaimed,trans,human,
[np(oblig,posprec(1,Wnp),object,human),
pp(oblig,posprec(1,Wpp),pp_arg,abstract,_,as)]).
/* DEF   to declare to be or publicly recognize as, esp. with loud shouts of approval or praise */
/* EX    They acclaimed him as the best writer of the year. They acclaimed her their leader */

Appendix F. Sample Parses

            The backbone of the parses is the predicate-argument-modifier model. Properties of the clause (voice, aspect, modality, tense) are recorded in the prop structure. The arguments are given in canonical order according to their deep gf. Athematic elements do not appear in the parses. The sharing of the variable within the index structure is used to indicate coindexing.

1. He was eager to back down.

/* control  */
/* mwu */
Parse:
[he,was,eager,to,back,down]

 7                          /* preference index */
  clause
   pred_arg_mod_structure
   prop(vce: active,asp: none,mod: none,tns: past) 
              /* Voice, Aspect, Modality, Tense */
    predicate(be,agr(finite,past,sing,third))
     subject
      nounphrase
      index(_02B0)       /* shared: used to indicate coreference */
      ppro(third,sing,masc)
     subject_attribute
      adjectivephrase
       adjective(eager_1)
        adj_arg
         pred_arg_mod_structure
         prop(vce: active,asp: none,mod: none,tns: present)
          predicate(back_down_1,agr(infinitive))
           subject
            nounphrase
            index(_02B0)       /* shared: used to indicate coreference */
            ppro(third,sing,masc)

2. They took the problems he had seen into account.

/* mwu */
/* linear precedence */
/* long distance dependencies */
Parse:
[they,took,the,problems,he,had,seen,into,account]

 16
  clause
   pred_arg_mod_structure
   prop(vce: active,asp: none,mod: none,tns: past)
    predicate(take_into_account_1,agr(finite,past,plural_or_second))/* mwu */
     subject
      nounphrase
      index(_0360)
      ppro(third,plural)
     object
      nounphrase
      index(_0544)
       det(the)
       noun(problem_1,agr(plural))
       relative_clause
        clause
         pred_arg_mod_structure
         prop(vce: active,asp: [perfect],mod: none,tns: past)
          predicate(see_1,agr(en_active))
           subject
            nounphrase
            index(_06BC)
            ppro(third,sing,masc)
           object
            gapped_nounphrase
            index(_0544)

3. John wants to appear to be loved by Mary.

/* control and raising */
/* passive */
Parse:
[john,wants,to,appear,to,be,loved,by,mary]

 14
  clause
   pred_arg_mod_structure
   prop(vce: active,asp: none,mod: none,tns: present)
    predicate(want_1,agr(finite,present,sing,third))
     subject
      nounphrase
      index(_0334)
       noun(john,agr(sing))
     object          /* the object of WANT is clausal */
      clause
       pred_arg_mod_structure
       prop(vce: active,asp: none,mod: none,tns: present)
        predicate(appear_1,agr(infinitive))
         subject              /* the subject of APPEAR is clausal */
          clause
           pred_arg_mod_structure
           prop(vce: passive,asp: none,mod: none,tns: present)
            predicate(love_1,agr(en_passive))
             subject              /* passive undone */
              nounphrase
              index(_08AC)
               noun(mary,agr(sing))
             object
              nounphrase
              index(_0334)
               noun(john,agr(sing))

4. John appears to want to be loved by Mary.

/* control and raising */
/* passive */
Parse:
[john,appears,to,want,to,be,loved,by,mary]

 14
  clause
   pred_arg_mod_structure
   prop(vce: active,asp: none,mod: none,tns: present)
    predicate(appear_1,agr(finite,present,sing,third))
     subject
      clause         /* the subject of APPEAR is clausal */
       pred_arg_mod_structure
       prop(vce: active,asp: none,mod: none,tns: present)
        predicate(want_1,agr(infinitive))
         subject
          nounphrase
          index(_0334)
           noun(john,agr(sing))
         object
          clause        /* the object of WANT is clausal */
           pred_arg_mod_structure
           prop(vce: passive,asp: none,mod: none,tns: present)
            predicate(love_1,agr(en_passive))
             subject       /* passive is undone */
              nounphrase
              index(_08AC)
               noun(mary,agr(sing))
             object
              nounphrase
              index(_0334)
               noun(john,agr(sing))

5. The genius a book about whom he has read teaches mathematics.

/* long distance dependencies */
Parse:
[the,genius,a,book,about,whom,he,has,read,teaches,mathematics]

 19
  clause
   pred_arg_mod_structure
   prop(vce: active,asp: none,mod: none,tns: present)
    predicate(teach_1,agr(finite,present,sing,third))
     subject
      nounphrase
      index(_03C8)                /* coref 1 */
       det(the)
       noun(genius_1,agr(sing))
       relative_clause
        object
         nounphrase
         index(_051C)                /* coref 2 */
          det(a)
          noun(book_1,agr(sing))
           np_arg_of_prep
            prepphrase
            index(_0668)
            prep(about)
             np_arg_of_prep
              nounphrase
              index(_03C8)         /* coref 1 */
               relative(whom)
        clause
         pred_arg_mod_structure
         prop(vce: active,asp: [perfect],mod: none,tns: present)
          predicate(read_1,agr(en_active))
           subject
            nounphrase
            index(_075C)
            ppro(third,sing,masc)
           object
            gapped_nounphrase
            index(_051C)                /* coref 2 */
     object
      nounphrase
      index(_0B68)
       det(zero)
       noun(mathematics_1,agr(sing_uncountable))

6. I am reading in the library a book the students want me to read.

/* linear precedence */
/* control */
/* long distance dependencies */
Parse:
[i,am,reading,in,the,library,a,book,the,students,want,me,to,read]

 26
  clause
   pred_arg_mod_structure
   prop(vce: active,asp: [progressive],mod: none,tns: present)
    predicate(read_1,agr(ing))
     subject
      nounphrase
      index(_03E0)
      ppro(first,sing)
     object
      nounphrase
      index(_06E0)
       det(a)
       noun(book_1,agr(sing))
       relative_clause
        clause
         pred_arg_mod_structure
         prop(vce: active,asp: none,mod: none,tns: present)
          predicate(want_1,agr(finite,present,plural_or_second))
           subject
            nounphrase
            index(_0860)
             det(the)
             noun(student_1,agr(plural))
           object
            clause
             pred_arg_mod_structure
             prop(vce: active,asp: none,mod: none,tns: present)
              predicate(read_1,agr(infinitive))
               subject
                nounphrase
                index(_0A98)
                ppro(first,sing)
               object
                gapped_nounphrase
                index(_06E0)
     vp_modifier
      prepphrase
      index(_056C)
      prep(in)
       np_arg_of_prep
        nounphrase
        index(_05C0)
         det(the)
         noun(library_1,agr(sing))

7. The teacher has been given a book and the students a library.

/* gapping */
/* passive */
Parse:
[the,teacher,has,been,given,a,book,and,the,students,a,library]

 8
  and_sentence
   clause
    pred_arg_mod_structure
    prop(vce: passive,asp: [perfect],mod: none,tns: present)
     predicate(give_1,agr(en_passive))
      object
       nounphrase
       index(_075C)
        det(a)
        noun(book_1,agr(sing))
      indirect_object
       nounphrase
       index(_049C)
        det(the)
        noun(teacher_1,agr(sing))
   clause
    pred_arg_mod_structure
    prop(vce: passive,asp: [perfect],mod: none,tns: present)
       /* gapped v */
     predicate(give_1,_0494)
      object
       nounphrase
       index(_0A7C)
        det(a)
        noun(library_1,agr(sing))
      indirect_object
       nounphrase
       index(_04CC)
        det(the)
        noun(student_1,agr(plural))

8. She likes the books that I have written and you have put into the library.

/* coordination with across-the-board deletions */
Parse:
[she,likes,the,books,that,i,have,written,and,you,have,put,into,the,library]

 28
  clause
   pred_arg_mod_structure
   prop(vce: active,asp: none,mod: none,tns: present)
    predicate(like_1,agr(finite,present,sing,third))
     subject
      nounphrase
      index(_0434)
      ppro(third,sing,fem)
     object
      nounphrase
      index(_05EC)                /* coref */
       det(the)
       noun(book_1,agr(plural))
       and_relative_clause
        relative_clause
         object
          nounphrase
          index(_05EC)                /* coref */
           relative(that)
         clause
          pred_arg_mod_structure
          prop(vce: active,asp: [perfect],mod: none,tns: present)
           predicate(write_1,agr(en_active))
            subject
             nounphrase
             index(_0844)
             ppro(first,sing)
            object
             gapped_nounphrase
             index(_05EC)         /* coref */
        relative_clause
         clause
          pred_arg_mod_structure
          prop(vce: active,asp: [perfect],mod: none,tns: present)
           predicate(put_1,agr(en_active))
            subject
             nounphrase
             index(_0B2C)
             ppro(second)
            object
             gapped_nounphrase
             index(_05EC)                       /* coref */
            pp_arg
             prepphrase
             index(_0D74)
             prep(into)
              np_arg_of_prep
               nounphrase
               index(_0DC8)
                det(the)
                noun(library_1,agr(sing))

Appendix G. Test Suites

For Analysis

Designed for Horatio

They failed.
He was eager to back down.
Do the facts allow the explanation he gave to the students ?
They should back up the teacher.
They should back the good teachers up.
They should back up the teacher they like.
The teacher should have been backed up.
She must allow that John is a good teacher.
She must allow John is a bad teacher.
You must allow for the oversimplifications he has made.
The teacher allows the boys money for books.
He told her that he loved Mary.
She told him what to see.
John has alienated the students from the teacher.
He allowed the students into the library.
The students he had allowed into the library were reading books.
They are teachers.
He is reluctant to go into the library.
The problem is that she knows him.
We have been in the library.
He has become a good teacher.
The books belong in the library.
The girl went to the library.
He brought the books he had liked to the library.
He brought to the library the books he liked.
He considers the claim she has made an oversimplification.
They declared the claim valid.
They will decide where to go.
They did away with the bad teachers.
They want him to kick the bucket.
They should pay attention to the problems he saw.
Great attention was paid to the problems he had seen.
The students had been put at risk.
They took the problems he had seen into account.
They took into account the problems they had seen.
He should take them into account.
The workshop will take place in the library.
They were shooting the breeze.
They allowed her to teach linguistics.
She was allowed to teach linguistics.
They wanted to teach linguistics.
He wanted them to put the workshop off.
John tried to teach linguistics.
* John tried them to teach linguistics.
They persuaded her to teach linguistics.
She was persuaded to teach linguistics.
They expected her to teach linguistics.
She was expected to teach linguistics.
Mary is expected to be elected.
She promised to teach linguistics.
She promised them to teach linguistics.
She seems to have taught linguistics.
It seems that she has taught mathematics.
The book seems to have been read by the students.
The book was expected to have been read.
She is eager to teach.
She is easy to please.
She is an easy woman to please.
The teacher was seen to read a bad book.
The students saw John teach mathematics.
Teachers avoid reading books.
She wants to avoid their reading bad books.
They believed him to have killed a student.
He was believed to have killed a student.
* The book seems to read.
The book seems to be read.
The book seems to have been read by the student.
* The book was seen to read.
The book is believed to have been read.
The man is believed to have read the book.
* The man is believed to have been read.
Mary tends to be annoyed by John.
John tends to annoy Mary.
John tries to annoy Mary.
Mary tries to be annoyed by John.
John wants to appear to be loved by Mary.
John appears to want to be loved by Mary.
When Mary saw John she told him that she wanted him to meet the teacher.
He warned her that she had been seen before she went to the library.
If he saw her he must have seen her before she went into the library.
The teacher who teaches linguistics is good.
The workshop that he wants to put off will fail.
The genius a book about whom he has read teaches mathematics.
She likes the town in which she lives.
She likes the town which she lives in.
She likes the town that she lives in.
She likes the town she lives in.
She likes the town where she lives.
The teacher whose books she likes thinks that she is a good student.
I know the university which she tells him she knows he wants her to go to.
Who knew that John expected her to break down ?
What might the man have been looking at ?
On which table has he put the books ?
Which table has he put the books on ?
Where did he go ?
Have you met Mary ?
Do I know him ?
Are they the teachers who taught you linguistics ?
I knew where he wanted to go.
You must decide which books the students should read.
I told him where to go.
He must have been told where to go.
Might he have been writing a book ?
Does he believe her to have gone in for linguistics ?
Have you read the letter to the teacher about the library ?
The problem with you is that you know me.
Do you back up the decision to give him money ?
They are easy to teach.
John is reluctant to teach linguistics.
John is black.
John has seen a black dog.
He is sure to tell them what to read.
He is sure I will tell them what to read.
Mary is an easy woman to please.
The man reading a book in the library is a teacher.
I want to read a book written by a student.
He went to the library with Mary.
Mary was reading a book about linguistics in the library.
The woman is reading a book in the library.
* The woman is reading in the library a book.
I am reading in the library a book the students want me to read.
She gave books to the students.
* She gave to the students books.
She gave the students good books.
* She gave books the students.
She gave the students the books she wanted them to read.
The teacher took the problems into account.
The teacher took into account the problems.
The teacher took them into account.
* The teacher took into account them.
The teacher took into account the problems the students had seen.
Do you like books about linguistics ?
The man reading a book in the library is a good teacher.
He considers the claim she made an oversimplification.
The students were persuaded to read the books in the library.
He had been looked down on.
Mary has been given a book.
A good book has been given to Mary.
A book has been given to Mary by the student.
The students are expected to read books about linguistics.
The teacher was seen to read a book about women.
Books should be read.
The student was declared a genius.
The problems were paid attention to.
Great attention was paid to the problems the students had seen.
The books they said they liked were put in the library.
He had been told where to meet her.
He was believed to have killed a bad student.
The good books seem to have been read by the students.
The teacher whose books I told her I liked knows the university I have persuaded her to go to.
The students like the books the teacher wants them to read.
What does the teacher think the student is learning ?
Who is the man the woman has been looking for listening to ?
On which table might the man have put the books ?
Which table might the man have put the books on ?
I decided what to tell her I believed her to like.

With coordination:

Mary teaches linguistics and John is learning mathematics.
Mary teaches linguistics and John mathematics.
Mary is and John wants to be in the library.
Mary is in and John wants to be in the library.
Mary went to the library and John to the workshop.
The teacher has been given a book and the students a library.
John and the students want to put off the workshop.
John likes dogs and black cats.
He looked at the teacher and the students.
She made a valid and true claim.
The teacher turned up and broke down.
She declares and considers him a genius.
She told him where to go and what to see.
He can and should see her.
He relied on and liked the students.
He liked and relied on the students.
He backed up and liked the decision to give them money.
He liked and backed up the decision to give them money.
She likes the books that I have written and you have put into the library.
They had been tripping the light fantastic and shooting the breeze.
They tripped the light fantastic and shot the breeze.
They may trip the light fantastic and shoot the breeze.

Based on Flickinger et al. 1987

Abrams works.
Abrams hired Browne.
Abrams showed the office to Browne.
Abrams showed Chiang the office.
Abrams became competent.
Abrams became a manager.
* Abrams became in the office.
* Abrams became working.
Abrams is interviewing an applicant.
Abrams is interviewing.
Abrams is working for Browne.
Abrams is working.
Abrams works for Browne.
* Abrams works of Browne.
Abrams approves of Browne.
* Abrams approves for Browne.
She hired him.
He hired her.
* She hired he.
* He hired she.
* Her hired he.
* Him hired she.
* Her hired him.
* Him hired her.
He showed it to her.
He showed her it.
He showed her an office.
He showed it to Chiang.
* He showed Chiang it.
He interviewed them.
* He interviewed they.
They interviewed him.
* Them interviewed him.
I interviewed Abrams.
Abrams interviewed me.
* Me interviewed Abrams.
* Abrams interviewed I.
We interviewed Abrams.
Abrams interviewed us.
* Us interviewed Abrams.
* Abrams interviewed we.
You interviewed Abrams.
Abrams interviewed you.
She and I interviewed Abrams.
I and she interviewed Abrams.
* Me and her interviewed Abrams.
* Her and me interviewed Abrams.
* Her and I interviewed Abrams.
* She and me interviewed Abrams.
* Me and she interviewed Abrams.
* I and her interviewed Abrams.
Abrams interviewed her and me.
* Abrams interviewed she and I.
Abrams interviewed me and her.
* Abrams interviewed I and she.
* Abrams interviewed her and I.
* Abrams interviewed she and me.
* Abrams interviewed me and she.
* Abrams interviewed I and her.
Whom does she work for ?
Who does she work for ?
For whom does she work ?
Who hired Browne ?
* Whom hired Browne ?
A manager works.
Managers work.
* A manager work.
* Managers works.
I work.
* I works.
You work.
* You works.
He works.
* He work.
We work.
* We works.
They work.
* They works.
They list women who have bookcases.
* They list women which have bookcases.
* They list bookcases who women have.
They list bookcases which women have.
A manager is an employee.
Managers are employees.
Managers are a problem.
Abrams may hire Browne.
Abrams might hire Browne.
Abrams can hire Browne.
Abrams shall hire Browne.
Abrams should hire Browne.
Abrams will hire Browne.
Abrams would hire Browne.
Abrams must hire Browne.
Abrams has hired Browne.
Abrams is hiring Browne.
Abrams could have hired Browne.
Abrams could be hiring Browne.
Abrams could have been hiring Browne.
Browne was interviewed by Abrams.
Browne could be interviewed by Abrams.
Browne has been interviewed by Abrams.
Browne is being interviewed by Abrams.
Browne could have been interviewed by Abrams.
Browne could be being interviewed by Abrams.
Browne has been being interviewed by Abrams.
Browne could have been being interviewed by Abrams.
* Abrams could hired Browne.
* Abrams could hiring Browne.
* Abrams has hire Browne.
* Abrams has hiring Browne.
* Abrams is hire Browne.
* Abrams is hired Browne.
* Abrams has could hire Browne.
* Abrams is coulding hire Browne.
* Abrams is having hired Browne.
* Abrams has could be hiring Browne.
* Abrams could be having hired Browne.
* Abrams could may hire Browne.
* Abrams has had hired Browne.
* Abrams is being hiring Browne.
* Abrams did could hire Browne.
* Browne is could interviewed by Abrams.
* Browne is had interviewed by Abrams.
* Browne did be interviewed by Abrams.
* Browne is did interviewed by Abrams.
* Browne is been interviewed by Abrams.
Abrams knew who to hire.
Abrams knew who Browne hired.
Abrams knew that Browne hired Chiang.
Abrams laid off a programmer.
Abrams laid a programmer off.
Abrams managed to hire Browne.
Abrams promised Browne to hire Chiang.
Abrams promised Browne to be interviewed by Chiang.
Browne was promised by Abrams to hire Chiang.
Browne was promised to hire Chiang.
Abrams urged Browne to hire Chiang.
Browne was urged to hire Chiang.
Abrams urged Browne to be interviewed.
Abrams was urged to be interviewed.
Abrams failed to hire Browne.
Abrams failed to be interviewed by Browne.
Abrams was hired by Browne.
Abrams was hired.
He was hired by her.
* Him was hired by her.
* He was hired by she.
An office was shown to Abrams by Chiang.
An office was shown to Abrams.
An office was shown by Abrams to Browne.
Devito was shown an office by Abrams.
An office was shown Chiang by Abrams.
Abrams was urged to hire Browne by Chiang.
Abrams was urged by Devito to hire Browne.
Abrams was urged to hire Browne.
Abrams was known to be interviewing Browne.
Abrams was known by Chiang to be interviewing Browne.
Abrams was approved of by Chiang.
Abrams was approved of.
Abrams is competent.
Abrams is a manager.
Abrams is in the office.
Abrams hired a woman who was competent.
Abrams hired women who were competent.
* Abrams hired a woman who were competent.
* Abrams hired women who was competent.
Abrams hired women whose manager was competent.
Abrams hired a woman who Browne interviewed.
Abrams hired a woman who Browne approved of.
Abrams hired a woman who Browne knew Chiang interviewed.
Abrams has a bookcase which is heavy.
Abrams has a bookcase that is heavy.
* Abrams has a bookcase what is heavy.
Abrams has an office that Browne showed Chiang.
Abrams has an office which Browne showed Chiang.
Abrams has an office Browne showed Devito.
Abrams has an office that Browne showed to Devito.
Abrams interviewed a woman who Browne showed an office to.
Abrams has an office Devito showed to Chiang.
Abrams hired a woman that was competent.
* Abrams hired a woman was competent.
Abrams hired a woman that Browne interviewed.
Abrams hired a woman Browne interviewed.
Abrams hired a woman that Browne approved of.
Abrams hired a woman Devito approved of.
Abrams hired a woman that Browne knew Chiang interviewed.
Abrams hired a woman Browne knew Devito interviewed.
Abrams hired a woman interviewed by Chiang.
Abrams hired a woman working for Chiang.
Abrams hired a woman of whom Chiang approved.
Abrams has a bookcase of which Chiang approved.
Abrams hired a woman the manager of whom Chiang had interviewed.
Abrams hired a woman whose manager Chiang had interviewed.
Abrams interviewed programmers whose manager was Chiang.
* Abrams interviewed programmers whose manager were Chiang.
* Which project is Abrams a programmer and the manager of ?
* Which project does Abrams manage the department and ?
* Which manager did Abrams have an office that Chiang showed to ?
* Which projects does Abrams know who had worked on ?
* Which managers does Abrams know which offices Browne had shown to ?
* Which programmer did Abrams know which office had ?
* Which programmer was that Abrams had interviewed known by Browne ?
Is Abrams a manager ?
Is Abrams competent ?
* Are Abrams competent ?
Does Abrams work for Browne ?
Could Abrams work for Browne ?
Is Abrams working for Browne ?
Has Abrams worked for Browne ?
Could Abrams be working for Browne ?
Could Abrams have worked for Browne ?
Has Abrams been working for Browne ?
Could Abrams have been working for Browne ?
* Has been Abrams working for Browne ?
* Has Abrams is working for Browne ?
* Has Abrams be working for Browne ?
* Have Abrams could be working for Browne ?
Could Abrams have been interviewed by Browne ?
Has Abrams been interviewed by Browne ?
Was Abrams interviewed by Browne ?
Was Abrams interviewed ?
Was Abrams being interviewed by Browne ?
* Did Abrams be interviewed by Browne ?
* Was Abrams been interviewing by Browne ?
Who works for Chiang ?
Who is a manager ?
Who will be a manager ?
Who is managed by Abrams ?
What was shown to Browne ?
Which programmer works for Abrams ?
Which programmer is a manager ?
Which programmer will be a manager ?
Which programmer was hired by Abrams ?
Who does Chiang employ ?
What did Abrams show Browne ?
What did Abrams show to Chiang ?
What was Abrams shown ?
Which programmer did Abrams interview ?
What project does Abrams manage ?
Which office was Abrams shown ?
Where does Abrams work ?
Where was Abrams interviewed ?
Who does Abrams work for ?
Which office does Abrams work in ?
Whom did Abrams show an office to ?
Who is Browne managed by ?
Which department does Abrams know the manager of ?
Whose department does Abrams work in ?
Of whom does Abrams approve ?
In which office does Abrams work ?
To whom did Abrams show an office ?
Who did Abrams show an office ?
Who was shown an office by Abrams ?
By whom is Browne managed ?
Of which department is Browne the manager ?
In whose department does Chiang work ?
Whose manager did Abrams hire ?
* Whose did Abrams hire manager ?
Abrams knows who hired Browne.
Abrams knows who was hired by Browne.
Abrams knows which managers interviewed Browne.
Abrams knows who Browne hired.
Abrams knows who showed Browne an office.
Abrams knows who Browne was hired by.
Abrams knows whom Browne was hired by.
Abrams knows by whom Browne was hired.
Abrams knows which programmers Browne hired.
Abrams knows where Browne works.
The managers of the projects are trustworthy.
The consultants to the managers were hired by Abrams.
The programmer who was hired by Abrams manages the project.
Abrams manages the project that has consultants.
* Abrams manages the project has consultants.
* The programmer was hired by Abrams manages the project.
The programmer whom Browne hired manages the project.
Abrams works on the project that Browne manages.
Abrams works on the project Browne manages.
Abrams works for a competent manager.
Browne hired a competent programmer who Abrams interviewed.
It appears that Chiang interviewed Browne.
Programmers are hard to interview.
* Programmers are hard to interview engineers.
* Programmers is hard to interview.
Chiang was hard to show an office to.
Offices are hard to show to Chiang.
Devito was hard to show an office.
Offices are hard to show Devito.
He worked.
He was working.
He had worked.
He had been working.
He is working.
He has worked.
He has been working.
He will work.
He will be working.
He will have worked.
He will have been working.
Abrams hired a programmer before Devito interviewed an engineer.
Abrams hired a programmer after Devito interviewed an engineer.

With coordination:

Chiang is a manager and Devito is a programmer.
Chiang and Devito work.
* Chiang and Devito works.
Chiang hired Devito and Devito manages Browne.
Chiang hired Devito and manages Browne.
* Chiang hired Devito and manage Browne.
Abrams was interviewed and Browne was hired.
Abrams interviewed programmers and Browne was hired.
Abrams interviews programmers and is managed by Browne.
Abrams was interviewed and was hired.
Abrams was interviewed by Browne and hired by Devito.
Abrams was interviewed and hired by Browne.
Abrams was interviewed and hired.
Chiang works and manages programmers.
* Chiang work and manages programmers.
* Chiang work and manage programmers.
Chiang interviewed programmers and showed Browne an office.
Chiang hired and manages Devito.
* Chiang hired and manage Devito.
Devito works for Devito and with Browne.
Devito works for and with Chiang.
An old and trustworthy employee manages Devito.
* An old and trustworthy employee manage Devito.
Devito is old and trustworthy.
The managers and programmers are employees.
Devito manages a programmer who Abrams interviewed and Browne hired.
Devito manages a programmer Abrams interviewed and Browne hired.
Devito is the manager who interviewed Abrams and hired Browne.
Did Abrams interview a programmer and hire an engineer ?
Who did Abrams interview and Browne hire ?
Who was interviewed by Abrams and hired by Browne ?
* Who does Devito manage and Chiang work for ?
* Who does Devito manage and Chiang works for ?
* Who does Devito manages and Chiang works for ?
* Who does Devito manage Browne and Chiang work for ?
Who was Abrams interviewed by and hired by ?
* Who was Abrams interviewed by Browne and hired by ?
* Who does Devito manage and Chiang work for Browne ?
Devito is the manager with whom and for whom Chiang works.

For Generation

(the following are generatable from the corresponding parses produced by horatio)

they failed
he was eager to back down
do the facts allow the explanation he gave the students
they should back up the teacher
they should back up the good teachers
they should back up the teacher they like
the teacher should have been backed up
she must allow john is a good teacher
she must allow john is a bad teacher
you must allow for the oversimplifications he has made
the teacher allows the boys money for books
he told her he loved mary
she told him what to see
john has alienated the students from the teacher
he allowed the students into the library
the students he had allowed into the library were reading books
they are teachers
he is reluctant to go into the library
the problem is she knows him
we have been in the library
he has become a good teacher
the books belong in the library
the girl went to the library
he brought the books he had liked to the library
he brought the books he liked to the library
he considers the claim she has made an oversimplification
they declared the claim valid
they will decide where to go
they did away with the bad teachers
they want him to kick the bucket
they should pay attention to the problems he saw
great attention was paid to the problems he had seen
the students had been put at risk
they took into account the problems he had seen
they took into account the problems they had seen
he should take them into account
the workshop will take place in the library
they were shooting the breeze
they allowed her to teach linguistics
she was allowed to teach linguistics
they wanted to teach linguistics
he wanted them to put off the workshop
john tried to teach linguistics
they persuaded her to teach linguistics
she was persuaded to teach linguistics
they expected her to teach linguistics
she was expected to teach linguistics
mary is expected to be elected
she promised to teach linguistics
she promised them to teach linguistics
she seems to have taught linguistics
she seems to have taught mathematics
the book seems to have been read by the students
the book was expected to have been read
she is eager to teach
she is easy to please
she is an easy woman to please
the teacher was seen to read a bad book
the students saw john teach mathematics
teachers avoid reading books
she wants to avoid them reading bad books
they believed him to have killed a student
he was believed to have killed a student
the book seems to be read
the book seems to have been read by the student
the book is believed to have been read
the man is believed to have read the book
mary tends to be annoyed by john
john tends to annoy mary
john tries to annoy mary
mary tries to be annoyed by john
john wants to appear to be loved by mary
john appears to want to be loved by mary
when mary saw john she told him she wanted him to meet the teacher
if he saw her he must have seen her before she went into the library
the teacher who teaches linguistics is good
the workshop that he wants to put off will fail
the genius a book about whom he has read teaches mathematics
she likes the town in which she lives
she likes the town which she lives in
she likes the town that she lives in
she likes the town she lives in
she likes the town where she lives
the teacher whose books she likes thinks she is a good student
* i know the university which she tells him she knows he wants she go to
who knew john expected her to break down
what might the man have been looking at
on which table has he put the books
which table has he put the books on
where did he go
have you met mary
do i know him
are they the teachers who taught you linguistics
i knew where he wanted to go
you must decide which books the students should read
i told him where to go
he must have been told where to go
might he have been writing a book
does he believe her to have gone in for linguistics
have you read the letter to the teacher about the library
the problem with you is you know me
do you back up the decision to give him money
they are easy to teach
john is reluctant to teach linguistics
john is black
john has seen a black dog
he is sure to tell them what to read
he is sure i will tell them what to read
mary is an easy woman to please
the man reading a book in the library is a teacher
i want to read a book written by a student
he went to the library with mary
mary was reading a book about linguistics in the library
the woman is reading a book in the library
i am reading a book the students want me to read in the library
she gave the students books
she gave the students good books
she gave the books she wanted them to read the students
she gave the students the books she wanted them to read
the teacher took into account the problems
the teacher took them into account
the teacher took into account the problems the students had seen
do you like books about linguistics
the man reading a book in the library is a good teacher
he considers the claim she made an oversimplification
the students were persuaded to read the books in the library
he had been looked down on
mary has been given a book
mary has been given a good book
mary has been given a book by the student
the students are expected to read books about linguistics
the teacher was seen to read a book about women
books should be read
the student was declared a genius
the problems were paid attention to
great attention was paid to the problems the students had seen
the books they said they liked were put in the library
he had been told where to meet her
he was believed to have killed a bad student
the good books seem to have been read by the students
the teacher whose books i told her i liked knows the university i have persuaded her to go to
the students like the books the teacher wants them to read
what does the teacher think the student is learning
who is the man the woman has been looking for listening to
on which table might the man have put the books
which table might the man have put the books on
i decided what to tell her i believed her to like

Appendix H. Prolog Predicates

            The predicates covered in this appendix are the nonpredefined Prolog predicates used in horatio and horgen and described in the body of the text. Each predicate is assigned one or several class(es), which reflect(s) the area and/or job it is mainly used for. The list of classes is the following:

CHECK: used for checking purposes

DATA BASE: defined by a set of facts (no rules)

EXPANSION: expands a lexical macro-clause

GENERATION: used by the generator

LEXICAL: belongs to the vocabulary files

MACRO: macro-clause for the vocabulary

PARSING: used by the parser

UTIL: utility predicate

            The purpose section describes what the predicate is used for in horatio and/or horgen. It is not a description of the purpose of the predicate in general terms. The argument pattern section is likewise geared towards the application in hand. The list of arguments describes any or all of the typical args used in the predicate argument structure. It should be noted that the parsing predicates implement difference list parsing and therefore open up with two arguments concerned with the word list: the first is the word list on entry, and the second the word list on exit. These two arguments are called InputWordList and OutputWordList in argument patterns. The parsing predicates often also include the following arguments:

     Gaplist, which houses the list of gaps to be passed up and down in the treatment of long distance dependencies.

     Preference, a number reflecting the preference to be assigned to the parse (a tree with a higher preference is to be preferred over one with a lower preference).

     Weight, a number reflecting the structural complexity of a phrase, to account for end placement of weightier elements
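
            As a minimal sketch of the difference-list convention just described, the toy predicates below thread a word list on entry and a word list on exit through each call and pass a preference value upwards. The names toy_np, toy_det and toy_noun, the tree shapes and the preference numbers are assumptions made for illustration; they are not predicates or values taken from horatio.

```prolog
% Toy lexicon: each lexical predicate consumes one word from the
% input word list and hands back the remainder as the output word list.
toy_det([the|Out], Out, det(the)).
toy_det([a|Out], Out, det(a)).

% The last argument is an invented preference value.
toy_noun([manager|Out], Out, noun(manager), 2).
toy_noun([office|Out], Out, noun(office), 1).

% toy_np(InputWordList, OutputWordList, Preference, ParseTree)
% The output list of the determiner is the input list of the noun.
toy_np(In, Out, Preference, np(Det, Noun)) :-
    toy_det(In, Mid, Det),
    toy_noun(Mid, Out, Noun, Preference).
```

            The query toy_np([the, manager, works], Rest, Preference, Tree) leaves Rest bound to [works], the part of the input still to be parsed.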

accu
            Class: UTIL   
            Purpose: accumulates the preference indexes of the constituents in order to compute the preference index for the whole S
            Argument pattern: accu(Total, ListofValues)
            List of arguments:            Total: sum of the list of values contained in ListofValues
                                                ListofValues: values to be added up
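
            A possible definition matching this argument pattern is sketched below; the clause bodies are an assumption, since only the argument pattern is given here.

```prolog
% accu(Total, ListofValues): Total is the sum of the numbers in ListofValues.
accu(0, []).
accu(Total, [Value|Values]) :-
    accu(Subtotal, Values),      % sum the tail first
    Total is Subtotal + Value.   % then add the head
```

            With this sketch, accu(Total, [2, 0, 1]) binds Total to 3.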

adverb_sentence
            Class: PARSING
            Purpose: parses an adverbial clause attached to a main declarative S
            Argument pattern: adverb_sentence(InputWordList, OutputWordList, Gaps, Preference, Parsetree, Finiteness, Person, Number, Voice)
            List of arguments:            Gaps is set to the empty list
                                                Finiteness is set to finite

adj
            Class: LEXICAL / EXPANSION 
            Purpose: expands the macro-clause m_adj for adjectives
            Argument pattern: adj(InputWordList, OutputWordList, LexicalTree, SemRestr, Arglist)
            List of arguments:            SemRestr is the semantic restriction the adjective places on the noun it modifies
                                                Arglist is the adjective's argument list; it may be empty

agree
            Class: PARSING / CHECK
            Purpose: checks subject-verb agreement
            Argument pattern: agree(Personnp, Numbernp, VpAgr)
            List of arguments:            Personnp: person of subject NP
                                                Numbernp: number of subject NP
                                                VpAgr: one of /firstsg, thirdsg, other/
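
            The check can be pictured as a small table of facts. In the sketch below the three VpAgr values come from the description above, whereas the person atoms (first, second, third) and number atoms (sing, plural) are assumed encodings, which may differ from those used in horatio.

```prolog
% agree(Personnp, Numbernp, VpAgr)
agree(first,  sing,   firstsg).
agree(third,  sing,   thirdsg).
agree(second, sing,   other).
agree(_,      plural, other).    % all plural subjects take the 'other' form
```

            Under these assumptions agree(third, sing, other) fails, which is the kind of mismatch found in starred test sentences such as "* He work."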

allopt
            Class: CHECK
            Purpose: checks that all remaining args in the arglist are optional (allopt is called after satisfylist has applied to the arglist)
            Argument pattern: allopt(Arglist)
            List of arguments: Arglist is the list of remaining arguments, after satisfylist has worked through it

allsubject
            Class: CHECK / GENERATION
            Purpose: checks that the function passed as argument is a subject
            Argument pattern: allsubject(SubjectFunction)
            List of arguments: SubjectFunction: subject, subject_inf, subject_pass, ...

append
            Class: UTIL
            Purpose: standard list appending        
            Argument pattern: append(L1, L2, L3)

arglist
            Class: PARSING
            Purpose: modifies, if necessary, the list of arguments to be satisfied. It does so by calling reog to deal with arg-changing transformations such as passive or raising
            Argument pattern: arglist(InputWordList, OutputWordList, Gaps, Status, Preference, Precedence, RelStat, RelorInt, Voice, Parsetree, Nature, Arglist, PromotedFunction, SubjectFunctor1, SubjectFunctor2, Class)
            List of arguments:            Status indicates whether an argument or a modifier has been found; it is set to 1 if the answer is yes
                                                RelStat shows whether a relative or interrogative pronoun or determiner has been found
                                                RelorInt is used to keep relatives and interrogatives apart
                                                Nature shows whether we have an np or a vp arglist
                                                PromotedFunction is the function promoted to subject
                                                SubjectFunctor1 is a subject functor showing the running subject before satisfaction of the arglist predicate
                                                SubjectFunctor2: running subject after satisfaction of the arglist predicate
                                                Class is the predicate class of the arg-bearing predicate

assoc
            Class: UTIL / DATA BASE
            Purpose: associates a rank with a grammatical function (used to give args in canonical order in the parse)
            Argument pattern: assoc(Gf, Rank)
            List of arguments:            Gf: a grammatical function (subject, object, ...)
                                                Rank: a number reflecting the position of the arg in the canonical order

aux
            Class: LEXICAL
            Purpose: lexical predicate for auxiliaries
            Argument pattern: aux(InputWordList, OutputWordList, LexemeValue, Type, TypeRequired, Number, AgreementValue, Tense)
            List of arguments:            Type: inflectional type of the aux
                                                TypeRequired: inflectional type of the next aux to the right

before
            Class: UTIL / CHECK
            Purpose: checks that an argument tree precedes another in canonical order
            Argument pattern: before([F1,R1], [F2,R2])
            List of arguments: the arguments are lists whose first member is a grammatical function (object, pp_arg, ...)

c_sentence
            Class: PARSING
            Purpose: parses main declarative clauses, whether or not they are flanked, on either side or on both, by adverbial subordinate clauses
            Argument pattern: c_sentence(InputWordList, OutputWordList, Preference, ParseTree, Finiteness, Person, Number, Voice)
            List of arguments: Finiteness is set to finite: we are dealing with main declarative clauses

checkaux3
            Class: PARSING
            Purpose: records relationships between aspect, voice and auxiliary type
            Argument pattern: checkaux3(AspectList, InflectionalSpec, Voice)
            List of arguments:            AspectList is a list, possibly empty; members are progressive and perfect
                                                InflectionalSpec is an inflectional specification on the next verb to the right
                                                Voice is either passive or uninstantiated (interpreted as active voice)

checksem
            Class: SEMANTICS
            Purpose: checks that two semantic restrictions are compatible
            Argument pattern: checksem(Semrestriction1, Semrestriction2)

control
            Class: GENERATION
            Purpose: takes care of deleting the controller element in the controlled clause; the controller is always the subject in that clause; its index is retained but the body of the np is ghosted
            Argument pattern: control(DeeperTree, MoreSurfacyTree)

controller
            Class: GENERATION / CHECK / DATA BASE
            Purpose: checks that the grammatical function is one that can control the governed sub­ordinate clause, i.e. act as subject
            Argument pattern: controller(Function)
            List of arguments: Function: subject, object, subject_inf, indirect_object, ...

corenounphrase
            Class: PARSING
            Purpose: parses typical, 'core' nps
            Argument pattern: corenounphrase(InputWordList, OutputWordList, Gaplist, Index, Preference, Weight, Rel, IntRel, Function, ParseTree, Number, Person, Semantics)
            List of arguments: see nounphrase

cv
            Class: LEXICAL / CHECK / GENERATION
            Purpose: checks that a verb belongs to the class of control verbs. Such verbs have an arg which plays the role of subject in a nonfinite complement clause (infinitive or ing clause) which is itself an arg of the control verb
            Argument pattern: cv(ControlVerb, Required)
            List of arguments: Required: one of /to, bare, ing /, indicating the type of nonfinite complement clause

determiner
            Class: LEXICAL
            Purpose: provides lexical entries for determiners
            Argument pattern: determiner(InputWordList, OutputWordList, LexicalParseTree, Number, Semantics, Index)
            List of arguments:            Semantics is a semantic restriction used for relative and interrogative determiners
                                                Index is an index functor (rel(Index)) used for indexing purposes in relative clauses; Index is set to the string int for interrogative determiners

drop
            Class: UTIL
            Purpose: removes athematic arguments from the list of arguments to appear in the parse tree
            Argument pattern: drop([AthematicFunction, RestofTree])
            List of arguments: the arg to drop is the list corresponding to the tree for the athematic argument, whose head is the name of the athematic function (surf_object, subject_inf, ...)

expos
            Class: LEXICAL / GENERATION
            Purpose: checks that a predicate belongs to the class of extraposition verbs (such as seem as used in It seems that ...)
            Argument pattern: expos(Extraposverb, to)
            List of arguments: to is the infinitive particle to

extrapos
            Class: GENERATION
            Purpose: extraposes the subject clause to the right and fills the surface subject position with place-filler IT (It seems that John likes linguistics)
            Argument pattern: extrapos(DeeperTree, MoreSurfacyTree)

first
            Class: UTIL
            Purpose: generates the first auxiliary in an auxiliary list
            Argument pattern: first(List, FirstElement)

first_header
            Class: CHECK / GENERATION
            Purpose: checks on the environment in generation
            Argument pattern: first_header(String)
            List of arguments: String: object, cplt_s, ...

flatten
            Class: UTIL
            Purpose: flattens a list (code is borrowed from Bratko 1990, p.572)
            Argument pattern: flatten(List, FlattenedList)
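
            For readers without the cited source to hand, one common way of writing such a predicate is sketched below. It is not claimed to be the borrowed code; the name my_flatten is hypothetical, chosen to avoid clashing with library predicates, and append/3 is assumed to be available as standard list appending.

```prolog
% my_flatten(List, FlattenedList): intended for fully instantiated lists.
my_flatten([], []) :- !.
my_flatten([Head|Tail], Flat) :- !,
    my_flatten(Head, FlatHead),        % the head may itself be a list
    my_flatten(Tail, FlatTail),
    append(FlatHead, FlatTail, Flat).
my_flatten(Element, [Element]).        % a non-list element becomes a one-element list
```

            With this sketch, my_flatten([a, [b, [c]], d], Flat) binds Flat to [a, b, c, d].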

gen
            Class: GENERATION
            Purpose: central predicate in generation
            Argument pattern: gen(ParseTreeList, GeneratedStringList)
            List of arguments:            ParseTreeList is a list representing a parse tree
                                                GeneratedStringList is a list of generated words

genasp1
            Class: GENERATION
            Purpose: generates aspectual auxiliaries (genasp1 takes care of perfect HAVE)
            Argument pattern: genasp1(AspectSlotInParseTree, GeneratedAux, RequiredType1, RequiredType2)
            List of arguments:            RequiredType1 is the inflectional type of the aux
                                                RequiredType2 is the inflectional type of the next aux to the right

genasp2
            Class: GENERATION
            Purpose: cf. genasp1, but deals with the generation of progressive aux BE
            Argument pattern: cf. genasp1

generate
            Class: GENERATION
            Purpose: generates strings from parse trees; it is higher than gen, because it first calls prepgen to undo the results of passive, raising, etc. and then gen to perform the actual generation
            Argument pattern: generate(ParseTreeList, GeneratedStringList)
            List of arguments: cf. gen

genlist
            Class: GENERATION
            Purpose: help predicate for gen; it generates from a list by calling gen on the head and then itself on the tail
            Argument pattern: genlist(ParseTreeList, GeneratedStringList)
            List of arguments: cf. gen
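
            The head-and-tail recursion described for genlist can be sketched with a toy generator. The names toy_gen and toy_genlist and the tree terms word(W) and node(Label, Children) are assumptions made for illustration and do not reproduce horatio's parse tree format; append/3 is assumed to be available as standard list appending.

```prolog
% toy_gen(ParseTree, GeneratedStringList)
toy_gen(word(Word), [Word]).
toy_gen(node(_Label, Children), Words) :-
    toy_genlist(Children, Words).

% toy_genlist(ParseTreeList, GeneratedStringList):
% generate from the head, recurse on the tail, then concatenate.
toy_genlist([], []).
toy_genlist([Tree|Trees], Words) :-
    toy_gen(Tree, HeadWords),
    toy_genlist(Trees, TailWords),
    append(HeadWords, TailWords, Words).
```

            With this sketch, toy_gen(node(s, [node(np, [word(they)]), node(vp, [word(failed)])]), Words) binds Words to [they, failed].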

genyesno
            Class: GENERATION
            Purpose: generates auxiliaries in yes-no questions; it inserts DO when necessary
            Argument pattern: genyesno(ClausePropertyList, Person, Number, GeneratedAuxList)
            List of arguments: ClausePropertyList is the prop functor to be found at clause level

getagr
            Class: GENERATION
            Purpose: fills in the agreement feature in the first auxiliary
            Argument pattern: getagr(Auxiliary, Auxagree, Tense, ToOrNotTo)
            List of arguments:            Tense: to, bare and ing are also possible values, alongside proper tense values (present / past)
                                                ToOrNotTo: either a one-element list ([to]) or an empty list ([])

inlist
            Class: UTIL / CHECK
            Purpose: nondeterministic check that an element belongs to a list
            Argument pattern: inlist(Element, List)
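
            A membership check of this kind is usually written as the two clauses sketched below; the clause bodies are an assumption, since only the argument pattern is given here. Because neither clause contains a cut, the predicate is nondeterministic: on backtracking it enumerates the elements of the list in order.

```prolog
% inlist(Element, List)
inlist(Element, [Element|_]).
inlist(Element, [_|Tail]) :-
    inlist(Element, Tail).
```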

insert
            Class: UTIL
            Purpose: inserts an element into a list (used by insertion sort)
            Argument pattern: insert(Element, List, NewList)

insort
            Class: UTIL
            Purpose: sorts the args in canonical order for outputting in the parse, dropping athematic args from the list (see drop)
            Argument pattern: insort(List, SortedList)
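
            How assoc, before, insert and insort could fit together is sketched below as a plain insertion sort. The clause bodies and the ranks are assumptions made for illustration, and the dropping of athematic args is left out of the sketch.

```prolog
% assoc(Gf, Rank): the ranks are invented for this sketch.
assoc(subject, 1).
assoc(object, 2).
assoc(pp_arg, 3).

% before(Tree1, Tree2): each tree is a list headed by its grammatical function.
before([F1|_], [F2|_]) :-
    assoc(F1, R1),
    assoc(F2, R2),
    R1 < R2.

% insert(Element, List, NewList): List is assumed to be sorted already.
insert(X, [], [X]).
insert(X, [Y|Ys], [X, Y|Ys]) :- before(X, Y), !.
insert(X, [Y|Ys], [Y|Zs]) :- insert(X, Ys, Zs).

% insort(List, SortedList): sort the tail, then insert the head.
insort([], []).
insort([X|Xs], Sorted) :-
    insort(Xs, SortedTail),
    insert(X, SortedTail, Sorted).
```

            With this sketch, insort([[object, browne], [subject, abrams]], Sorted) binds Sorted to [[subject, abrams], [object, browne]].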

interrogative
            Class: LEXICAL
            Purpose: provides lexical entries for interrogatives
            Argument pattern: interrogative(InputWordList, OutputWordList, Tree, Semrestric, PPsemantics, PpOrNp)
            List of arguments:            Semrestric: semantic restriction for interrogative nps
                                                PPsemantics: a list of semantic features for interrogatives playing the part of an adverbial (why, when, how, ...)
                                                PpOrNp: indicates whether the interrogative plays the part of an np or an adverbial (who vs when)

m_adj
            Class: LEXICAL / MACRO
            Purpose: provides lexical entries (macro-clauses) for adjectives (expanded by adj)
            Argument pattern: m_adj(Lexeme, Adjective, Semrestric, Arglist)
            List of arguments:            Lexeme: includes a reading number, e.g. black_1
                                                Adjective: the positive degree of the adjective, e.g. black
                                                Semrestric: the semantic restriction placed by the adjective on the noun it modifies
                                                Arglist: the adjective's argument list, possibly empty
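
            The relation between a macro-clause and the lexical predicate that expands it can be sketched as follows. The lexeme black_1 and the form black are the examples given above; the semantic restriction [concrete], the tree shape adj(Lexeme) and the body of adj are assumptions made for illustration.

```prolog
% m_adj(Lexeme, Adjective, Semrestric, Arglist): sample entry with an invented restriction.
m_adj(black_1, black, [concrete], []).

% adj(InputWordList, OutputWordList, LexicalTree, SemRestr, Arglist)
% Consumes one word and looks it up in the macro-clauses.
adj([Adjective|Out], Out, adj(Lexeme), SemRestr, Arglist) :-
    m_adj(Lexeme, Adjective, SemRestr, Arglist).
```

            With this sketch, adj([black, dog], Rest, Tree, SemRestr, Arglist) binds Rest to [dog] and Tree to adj(black_1).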

m_noun
            Class: LEXICAL / MACRO
            Purpose: provides lexical entries (macro-clauses) for nouns (expanded by noun)
            Argument pattern: Two patterns, the first one for countable nouns, the second for uncountable ones:
                                                m_noun(Lexeme, Sing, Plural, Semantics, Arglist)
                                                m_noun(Lexeme, Noun, Semantics, Arglist)
            List of arguments:            Lexeme: includes a reading number, e.g. patience_1
                                                Sing, Plural: inflectional variants
                                                Noun: invariant form (for uncountables)
                                                Semantics: noun semantics is a list of semantic features
                                                Arglist: the noun's argument list, possibly empty

m_verb
            Class: LEXICAL / MACRO
            Purpose: provides lexical entries (macro-clauses) for verbs (expanded by verb)
            Argument pattern: m_verb(Verbclass, Particle, Lexeme, Infinitive, Firstp, Secondp, Thirdp, Ing, FirstOrThirdPast, SecondOrPluralPast, En, Transitivity, Semrestric, Arglist)
            List of arguments:            Verbclass: provides a handle for the lexicographer, and is used by the parser and the generator as a check
                                                Particle: the type and form of the particle attached to the verb, if any, e.g. part0:down
                                                Lexeme: includes a reading number, as in look_down_on_1
                                                Infinitive: bare infinitive form, e.g. look
                                                Firstp: first person singular present tense
                                                Secondp: second person present tense, plural present tense (all persons)
                                                Thirdp: third person singular present tense
                                                Ing: ing-form
                                                FirstOrThirdPast: first or third singular past tense
                                                SecondOrPluralPast: second person past tense; plural past tense (all persons)
                                                En: en-form
                                                Transitivity: transitivity feature
                                                Semrestric: semantic restriction on the deep subject
                                                Arglist: the verb's argument list
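
As a rough illustration of how the fourteen slots line up, here is a hypothetical entry built around the lexeme look_down_on_1 and the particle part0:down mentioned above. The values given for Verbclass, Transitivity, Semrestric and Arglist are placeholders invented for this sketch; the actual values and their format are not shown in this section.

```prolog
% Hypothetical macro-clause for a verb (illustration only).
% vclass_placeholder, trans_placeholder, human and the empty Arglist
% are assumptions, not values from the HORATIO lexicon.
m_verb(vclass_placeholder,   % Verbclass
       part0:down,           % Particle
       look_down_on_1,       % Lexeme
       look,                 % Infinitive
       look,                 % Firstp
       look,                 % Secondp (also plural present, all persons)
       looks,                % Thirdp
       looking,              % Ing
       looked,               % FirstOrThirdPast
       looked,               % SecondOrPluralPast
       looked,               % En
       trans_placeholder,    % Transitivity
       human,                % Semrestric (restriction on the deep subject)
       []).                  % Arglist (placeholder)
```

A regular verb such as look repeats the same form in several slots; keeping the slots separate leaves room for verbs whose forms differ, for instance a first or third person singular past that is not identical to the plural past.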

modifier
            Class: PARSING
            Purpose: parses modifiers, i.e. strings in the input list that do not match any of the lexical arguments provided in the predicate's arglist
            Argument pattern: modifier(InputWordList, OutputWordList, Type, Gaplist, Preference, Precedence, Rel, Intrel, ParseTree, SubjectFunctor)
            List of arguments:            Type: to distinguish vp from np modifiers
                                                Rel, Intrel: to keep track of relatives or interrogatives inside the modifier
                                                SubjectFunctor: to provide a subject to be used for vps within the modifier

modppnp
            Class: CHECK / DATA BASE
            Purpose: checks that a given preposition can head an np modifier
            Argument pattern: modppnp(Preposition)

modppvp
            Class: CHECK / DATA BASE
            Purpose: checks that a given preposition can head a vp modifier
            Argument pattern: modppvp(Preposition)

myappend
            Class: UTIL
            Purpose: does list appending; is able to deal with the value none sometimes used in the property list of the clause
            Argument pattern: cf. append
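
A minimal sketch of such a predicate is given below. It assumes that none stands for "no value" and can be treated like the empty list, and that the first two arguments are instantiated when the predicate is called; the actual HORATIO definition may differ.

```prolog
:- use_module(library(lists)).   % append/3

% myappend(List1, List2, Result): like append/3, except that the atom
% none in either of the first two positions is skipped (assumption).
myappend(X, List, List) :- X == none, !.
myappend(List, Y, List) :- Y == none, !.
myappend(List1, List2, Result) :- append(List1, List2, Result).
```

The tests use ==/2 rather than unification so that an unbound first or second argument is never silently bound to none.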

nonfinite
            Class: CHECK / GENERATION / DATA BASE
            Purpose: checks that the inflectional specification of the wordform is non-finite
            Argument pattern: nonfinite(InflectionSpec)
            List of arguments: InflectionSpec: ing, toinfinitive, en_active, ...
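
A check of this kind is naturally stated as a small set of facts. The sketch below covers only the three specifications named above; the full list in the program is longer and is not reproduced here.

```prolog
% Partial sketch: only the inflection specifications named in this entry.
nonfinite(ing).
nonfinite(toinfinitive).
nonfinite(en_active).
```

A call such as nonfinite(ing) succeeds, while a call with any specification not listed as a fact simply fails.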

noun
            Class: LEXICAL
            Purpose: expands the macro-clauses for nouns, i.e. m_noun
            Argument pattern: noun(InputWordList, OutputWordList, ParseTree, Number, Semlist, Arglist)
            List of arguments: cf. m_noun

nounphrase
            Class: PARSING
            Purpose: parses non-coordinate nps
            Argument pattern: nounphrase(InputWordList, OutputWordList, Gaplist, Index, Preference, Weight, Rel, Intrel, Function, ParseTree, Number, Person, Semantics)
            List of arguments:            Index: used for coindexing purposes
                                                Rel, Intrel: used in the treatment of relative and interrogative nps
                                                Number, Person: percolated from the head noun
                                                Semantics: a list of semantic features percolated from the np head

nsubject
            Class: PARSING / CHECK
            Purpose: checks that the np which bears the function passed as arg can be the subject of further vps to the right (the running subject, used in control relations)
            Argument pattern: nsubject(Function)
            List of arguments: Function: a grammatical function, such as subject_inf, object, ...

oraise
            Class: LEXICAL / GENERATION
            Purpose: checks that a verb belongs to the class of object-raising verbs
            Argument pattern: oraise(Verb, Requires)
            List of arguments: Requires: set to the particle to, indicating that oraise verbs are followed by a to-infinitive (I believe him to have taught linguistics)
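
By way of illustration, the verb in the example sentence could be recorded as follows. This is a sketch only: whether the first argument is the bare verb form, as assumed here, or a lexeme with a reading number is not specified in this entry.

```prolog
% Hypothetical fact for the object-raising verb of the example sentence.
% Using the bare form believe as the Verb argument is an assumption.
oraise(believe, to).
```

The query oraise(believe, Requires) then binds Requires to to, which the generator can use to build the to-infinitive.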

oraising
            Class: GENERATION
            Purpose: carries out subject-to-object raising in generation: the subject in the controlled clause is extracted and made the object of the matrix clause (the control predicate will take care of deleting the subject of the controlled clause through ghosting)
            Argument pattern: oraising(DeeperTree, MoreSurfacyTree)

parse
            Class: PARSING
            Purpose: main predicate for the parser; parse parses main declarative clauses, yes-no questions and wh-questions
            Argument pattern: parse(InputWordList, OutputWordList, ParseTree)

passive
            Class: GENERATION
            Purpose: restores the surface subject and produces a by-phrase for the deep subject
            Argument pattern: passive(DeeperTree, MoreSurfacyTree)

pick
            Class: UTIL
            Purpose: non-deterministic selection of an element in a list
            Argument pattern: pick(List, Element, RemainderOfList)
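
A predicate with this argument pattern is usually written in two clauses. The definition below is a standard textbook sketch, not necessarily the one used in HORATIO.

```prolog
% pick(List, Element, RemainderOfList)
% Either take the head of the list, or keep the head and pick from the tail.
pick([Element|Rest], Element, Rest).
pick([Head|Tail], Element, [Head|Rest]) :-
    pick(Tail, Element, Rest).
```

On backtracking, pick([a, b, c], E, R) yields each element in turn together with the list of the remaining elements, in their original order. It behaves like the select/3 predicate found in many Prolog libraries, with the arguments in a different order.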

poids