Sorry this took an extra week to get out. I’ll try to be more reliable in the future. Please forgive me :(
The development of syntactic theory after the core of EST was laid down was relatively hodgepodge. Some aspects were relatively minor in the grand scheme of things, while others were significant, large-scale changes in the way people viewed grammar. In this part on the Revised Extended Standard Theory (REST), I’m going to address the relatively minor aspect of COMP and S’, and then move on to a more fundamental paradigm shift: traces and Move α. It’s worth pointing out that from this point on, people were thinking of sentences as being trees, not merely the vague “annotated strings” of previous decades.
COMP and S’
The development of cycles in syntax allowed for a number of useful generalizations to be made about what parts of sentences could be reordered. To recap, the idea of the cycle is basically that all transformations apply to a lower sentence before applying to the higher sentence that contains it. One important hypothesis that grew out of the use of cycles is that once a cycle has been finished (once all transformations have been applied to it), no movement out of that cycle can happen; call this the PIC. This poses significant problems for questions like who does John think Mary knows?, where “who” originates as the complement of “knows” and would need to escape the lower sentence. Since this is a main-sentence movement, the embedded sentence should be a completed cycle, and thus impenetrable to extraction. Despite this obvious flaw, the PIC does make a lot of other phenomena simple to explain, and so people sought a way to reconcile these two opposing facts. In come COMP and S’.
Sentences traditionally had been viewed as having a structure roughly like NP (tense) VP, with no projection above them. With the introduction of S’, a minor optional layer on top of S was added. S’ has the structure COMP S. COMP is actually something of a metasyntactic variable, in that it denotes any phrase that moves from within S to adjoin with S. What’s the benefit? Well, the idea of the COMP position is that it is not subject to the PIC. In other words, if something can move to the COMP position, it can escape the restriction the PIC places on structure. This notion made it possible to account for both the apparent power of the PIC and the odd behavior of embedded clause questions.
A derivation of who does John think Mary knows? is relatively straightforward, then: the WH phrase “who” raises from next to “knows” to COMP of the lower sentence, and then raises to the question word position of the higher phrase:
- John thinks Mary knows who
- John thinks who Mary knows (raise to COMP)
- who does John think Mary knows (raise to question word position; “do” support)
A more important development was the notion of traces. A great deal of work had been done attempting to understand the relationship between transformations and the interpretations of the derived sentences. In a sentence where something moves from one position to another, but still retains some sort of interpretation from the original position, the question could be asked: how does it maintain that interpretation? Up until trace theory, the normal approach was to have some semantic interpretation rule that goes along with each transformation. This was inelegant and very ad hoc, providing no generalizations, and so an important notion from logic, that of the bound variable, was sort of imported into syntactic theory. Traces are essentially just phonologically empty phrases left behind in the position of a phrase that moves, having the same category (NPs leave behind [NP ε], etc.), which are “bound” by (i.e. share the value of) the phrase that moved. Oftentimes they’re denoted as t, leaving out the category information, with indices linking them visually to the phrases that bind them when more than one trace is shown.
An example derivation of the WH movement in who does John think Mary knows? with traces would be:
- John thinks Mary knows who
- John thinks who Mary knows t
- who does John think t Mary knows t
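The derivation above can be sketched in a few lines of Python. To be clear, the nested-list clause representation and the `move_wh` helper are my own illustrative inventions, not anything from the literature, and do-support is ignored:

```python
# Toy model of WH movement leaving a coindexed trace. Clauses are
# nested lists; a trace of "who" is written "t_who".

def move_wh(clause, wh):
    """Replace occurrences of `wh` inside `clause` with a coindexed
    trace, returning the new clause. The moved phrase itself is then
    placed wherever it lands (COMP, question word position, etc.)."""
    def replace(node):
        if isinstance(node, list):
            return [replace(child) for child in node]
        return f"t_{wh}" if node == wh else node
    return replace(clause)

# Deep structure: "John thinks Mary knows who"
deep = ["John", "thinks", ["Mary", "knows", "who"]]

# Step 1: raise "who" to COMP of the embedded clause, leaving a trace.
step1 = ["John", "thinks", ["who", move_wh(deep[2], "who")]]

# Step 2: raise "who" again, to the front of the matrix clause.
step2 = ["who", move_wh(step1, "who")]

print(step2)
# ['who', ['John', 'thinks', ['t_who', ['Mary', 'knows', 't_who']]]]
```

The final structure has a trace in the embedded COMP and another in the object position of “knows”, both bound by the fronted “who”, mirroring who does John think t Mary knows t.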
This now gives us an interpretation roughly like who is the x such that John thinks Mary knows x. The use of traces as bound variables makes it incredibly easy to describe the apparent interpretation of a single element as if it were in multiple places. Other phenomena, such as the inability to have a sentence like who what does John think ate? (intended to mean something like who is the person x, and what is the thing y, such that John thinks x ate y?), can be accounted for simply by saying that “what” cannot raise into the COMP position of the lower clause because the trace of “who” is occupying that position:
- John thinks [S whoi ate whatj]
- John thinks [S' whoi [S ti ate whatj]]
- whoi does John think [S' ti [S ti ate whatj]] (lower COMP has ti, so there’s no space for whatj to move into in order to escape the lower clause)
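The blocking story can be sketched as a slot that only one phrase can land in. The dict-based S’ representation and the `raise_to_comp` helper below are invented for illustration (and I simplify by leaving the moved “who” itself in COMP rather than its later trace):

```python
# Toy sketch of "COMP is already occupied": S' is a record with one
# COMP slot; movement to COMP fails when the slot is filled.

def raise_to_comp(s_bar, wh):
    """Try to move `wh` from inside S into the COMP of this S'.
    Returns None (movement blocked) when COMP is already filled."""
    if s_bar["COMP"] is not None:
        return None  # landing site occupied; wh cannot escape this clause
    s = [f"t_{wh}" if word == wh else word for word in s_bar["S"]]
    return {"COMP": wh, "S": s}

# Embedded clause "who ate what": raise "who" first, then try "what".
clause = {"COMP": None, "S": ["who", "ate", "what"]}
clause = raise_to_comp(clause, "who")    # succeeds: COMP was empty
blocked = raise_to_comp(clause, "what")  # fails: COMP holds "who"

print(clause)   # {'COMP': 'who', 'S': ['t_who', 'ate', 'what']}
print(blocked)  # None
```

With COMP taken, “what” has no escape hatch from the lower clause, which is exactly why who what does John think ate? is out.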
In prior work by Emonds, it had been shown that a certain kind of transformation, the structure-preserving transformation, could account for a large variety of phenomena. The notion was simple: a phrase could only be shifted into a position where another phrase of the same kind had been, or could be base generated. That is to say, an NP could only be displaced from its position into another position that either previously had an NP in it, or where an NP could be generated by the base phrase structure rules of the grammar. Trace theory simplified this: since a moved phrase leaves a phonologically empty trace behind, a position where a phrase previously had been is no longer available as a landing site, and structure-preserving transformations reduce to movement into an empty position where a phrase of that sort could be base generated. This sort of transformation turned out to show up all over the place, in all sorts of phenomena, and a generalization was made that NPs (specifically) weren’t moved around by individual transformations, but rather by a general rule called Move NP, which said simply: for each NP, dislocate it somewhere until no more dislocations can be made. Later, a similar rule, Move WH, was conceived. Eventually, it was realized that all transformations could be described in terms of a single rule, Move α, which says: for each phrase (or word) α, dislocate it somewhere until no more dislocations can be made.
It was also quickly realized, however, that this idea, moving stuff until you can’t move anymore, was pretty darned unconstrained, and so a number of solutions were found that not only constrained the grammar, but accounted for even more data. One such method was the case filter, whereby all NPs are required to be assigned case at some point in the derivation of a sentence. Any sentence with a caseless NP was ungrammatical. A second, and undoubtedly more crucial, notion was the idea of defining candidate positions for the landing site of a dislocation. The idea presented at the time was that dislocation could only move an item into a c-commanding position, where the landing site is sister to the original position, or to one of the original position’s ancestor nodes in the tree for the sentence (a simple way of thinking of this is that a node c-commands its sisters and their descendants). Movement of any other sort was ungrammatical. Another way of describing this that was popular in the literature (and still is to this day) is to say that traces must be bound by their antecedents (which we get from trace theory), and antecedents can only bind traces from a c-commanding position.
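The “sisters and their descendants” definition is easy to state over tree positions. In the sketch below, a node’s position is a path of child indices from the root; the encoding and function names are my own illustration of the definition, not standard notation:

```python
# Minimal c-command check over tree positions encoded as paths
# (tuples of child indices from the root).

def dominates(x, y):
    """x dominates y iff x's path is a proper prefix of y's path."""
    return x == y[:len(x)] and x != y

def c_commands(a, b):
    """a c-commands b iff a's parent dominates b and neither node
    dominates the other, i.e. b sits in a sister subtree of a."""
    if len(a) == 0:
        return False  # the root has no parent, so it c-commands nothing
    return (a != b and dominates(a[:-1], b)
            and not dominates(a, b) and not dominates(b, a))

# Tree for "Mary knows who": [S [NP Mary] [VP [V knows] [NP who]]]
subj_np, vp, v, obj_np = (0,), (1,), (1, 0), (1, 1)

print(c_commands(subj_np, obj_np))  # True: subject c-commands into VP
print(c_commands(v, obj_np))        # True: sisters c-command each other
print(c_commands(obj_np, subj_np))  # False: obj would have to bind upward
```

The last case is the one that matters for traces: an antecedent in object position could not bind a trace in subject position, because it doesn’t c-command it.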
This addition now made grammatical sentences especially easy to generate. Simply start with some basic sentence, and just start moving things up the tree, extending the tree when necessary according to the phrase structure rules, and throw away any sentence that violates the case filter, has movement to a non-c-commanding position, and so forth. This would eventually give rise to a whole new industry in syntax of discovering what filters or conditions on movement existed, and of using existing filters and conditions to make inferences about the existence of new structures, or the nature of old structures, that weren’t obvious at first sight. This reduction of all of syntax to some core phrase structure, a single generic Move α rule, and a collection of constraints on derived structures, would eventually give rise in the early 80s to the program known as Government and Binding (GB).
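The generate-and-filter procedure can be caricatured in a few lines: produce candidate derived structures, then keep only those that pass every filter. The `case_filter` stand-in below checks a toy case annotation and is purely illustrative, not a real implementation of the theory:

```python
# Caricature of generate-and-filter: candidates in, survivors out.

def case_filter(sentence):
    """Stand-in for the case filter: every NP must carry a case
    annotation, written here as 'NP:nom', 'NP:acc', etc."""
    return all(":" in word for word in sentence if word.startswith("NP"))

candidates = [
    ["NP:nom", "V", "NP:acc"],  # all NPs case-marked: survives
    ["NP:nom", "V", "NP"],      # caseless object NP: filtered out
]

grammatical = [s for s in candidates if case_filter(s)]
print(grammatical)  # [['NP:nom', 'V', 'NP:acc']]
```

In the real program the candidate set comes from freely applying Move α, and the filter list grew to include c-command on binding, the case filter, and many more; the overall shape, though, is just this generate-then-prune loop.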