Tree Adjoining Grammars
CIS 530: Intro to NLP

Context Free Grammars: Derivations

[Figure: CFG phrase-structure tree for "Who does Bill think Harry likes?", with "who" attached at the top of the tree, far from the verb "likes" it is an argument of]

Context Free Grammars: Semantics

[Figure: the same CFG tree for "Who does Bill think Harry likes?"]
- The meaning relation of the predicate/argument structure, likes(Harry, who), is lost in the tree.

Context Free Grammars: Complexity

- CFGs can be parsed in time proportional to n^3, where n is the length of the input in words, by algorithms like CKY.

Transformational Grammars

[Figure: context-free deep-structure tree for "Who does Bill think Harry likes?", with "who" generated as the object of "likes" and displaced to the front by movement]
- Context-free deep structure plus movement transformations.

Transformational Grammars: Complexity

- TGs can be parsed in exponential time, 2^n, where n is the length of the input in words.
- Exponential time is intractable, because exponentials grow so quickly.

Lexicalized TAG (LTAG)

- Finite set of elementary trees anchored on lexical items; each tree encapsulates syntactic and semantic dependencies.
- Elementary trees come in two kinds: initial and auxiliary.

LTAG: A Set of Elementary Trees

[Figure: a sample of elementary trees from an LTAG lexicon]

LTAG: Examples

- α1 (transitive): S(NP, VP(V(likes), NP))
- α2 (object extraction): S(NP(who), S(NP, VP(V(likes), NP(e))))
- Some other trees for "likes": subject extraction, topicalization, subject relative, object relative, passive, etc.

Lexicalized TAG (LTAG)

- Finite set of elementary trees anchored on lexical items; each tree encapsulates syntactic and semantic dependencies.
- Elementary trees: initial and auxiliary.
- Operations: substitution and adjoining.

Substitution

[Figure: initial tree β, rooted in X, substituted at a leaf node labeled X in tree α, yielding the derived tree γ]

Adjoining

[Figure: auxiliary tree β, with root X and foot node X*, adjoined at an interior node labeled X in tree α, yielding the derived tree γ: the subtree under X is excised from α and reattached at the foot node of β]
- Tree β is adjoined to tree α at the node labeled X in α (both operations are sketched in code below).
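
To make the two operations concrete, here is a minimal Python sketch of substitution and adjoining. It is my own illustration, not part of the slides: trees are plain dictionaries, the function names are invented, and real LTAG systems additionally track node addresses and adjoining constraints that this toy omits.

    from copy import deepcopy

    def tree(label, *children, subst=False, foot=False):
        """A node: label, ordered children, plus flags marking a substitution
        site (drawn with a down-arrow on the slides) or a foot node (X*)."""
        return {"label": label, "children": list(children),
                "subst": subst, "foot": foot}

    def substitute(host, initial):
        """Replace the first substitution site matching the root label of
        `initial` with a copy of `initial`."""
        for i, child in enumerate(host["children"]):
            if child["subst"] and child["label"] == initial["label"]:
                host["children"][i] = deepcopy(initial)
                return True
            if substitute(child, initial):
                return True
        return False

    def adjoin(host, label, auxiliary):
        """Adjoin `auxiliary` (root and foot labeled `label`) at the first
        matching interior node below the root: that node's subtree is
        excised and reattached at the auxiliary tree's foot node."""
        for i, child in enumerate(host["children"]):
            if child["label"] == label and not child["subst"]:
                aux = deepcopy(auxiliary)
                _plant_at_foot(aux, child)
                host["children"][i] = aux
                return True
            if adjoin(child, label, auxiliary):
                return True
        return False

    def _plant_at_foot(aux, subtree):
        for i, child in enumerate(aux["children"]):
            if child["foot"]:
                aux["children"][i] = subtree
                return True
            if _plant_at_foot(child, subtree):
                return True
        return False

    def yield_of(t):
        """Read the sentence off the leaves, skipping empty (e) nodes."""
        if not t["children"]:
            return [] if t["label"] == "e" else [t["label"]]
        return [w for c in t["children"] for w in yield_of(c)]

    # The slides' derivation of "who does Bill think Harry likes":
    alpha2 = tree("S", tree("NP", tree("who")),                       # α2: likes
                  tree("S", tree("NP", subst=True),
                       tree("VP", tree("V", tree("likes")),
                            tree("NP", tree("e")))))
    beta1 = tree("S", tree("NP", subst=True),                         # β1: think
                 tree("VP", tree("V", tree("think")), tree("S", foot=True)))
    beta2 = tree("S", tree("V", tree("does")), tree("S", foot=True))  # β2: does

    adjoin(alpha2, "S", beta1)                     # stretch α2 at its inner S
    adjoin(alpha2, "S", beta2)                     # invert with "does"
    substitute(alpha2, tree("NP", tree("Bill")))   # subject of "think"
    substitute(alpha2, tree("NP", tree("Harry")))  # subject of "likes"
    print(" ".join(yield_of(alpha2)))    # -> who does Bill think Harry likes

Substitution rewrites a frontier nonterminal, exactly like applying a context-free rule; adjoining is the operation that goes beyond CFG power, since it splices new material into the middle of an already-built tree.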

LTAG: A Derivation

[Figure: a sequence of frames deriving "who does Bill think Harry likes": starting from α2 (the object-extraction tree for "likes"), β1 ("think") is adjoined at the inner S node, β2 ("does") is adjoined at S, and the NP sites are filled by substitution]

LTAG: Semantics

[Figure: the derived tree for "who does Bill think Harry likes"]
- The meaning relations of the predicate/argument structures are clear in the original base trees!

LTAG: A Derivation

- Elementary trees: α2 (likes, object extraction), β1 (think: S(NP, VP(V(think), S*))), β2 (does: S(V(does), S*)), α3 (who), α4 (Harry), α5 (Bill).
- "who does Bill think Harry likes" is derived by substituting α3, α4, and α5 at NP nodes and adjoining β1 and β2 at S nodes.

LTAG: Derivation Tree

[Figure: the derivation tree for "who does Bill think Harry likes": α2 (likes) is the root; α3 (who) and α4 (Harry) are substituted into it and β1 (think) is adjoined to it; α5 (Bill) is substituted into β1 and β2 (does) is adjoined to β1]
- Compositional semantics is defined on this derivation structure.
- The derivation tree is closely related to dependency diagrams.

TAGs: Complexity

- TAGs can be parsed in polynomial time: O(n^6) for standard TAG parsers, rather than O(n^3) for CFGs.
- TAGs are a prime example of mildly context sensitive grammars (MCSGs).
- Plausible: MCSGs are sufficient to capture the grammars of all human languages; e.g., they can parse Swiss German.

Adequacy vs. Complexity

- Context Free Grammars: the structure does not represent well the "domains of locality" reflecting meaning; parsed in polynomial time, n^3 (n is the length of the input).
- Transformational Grammars: capture domains of locality, accounting for surface word order by "movement"; parsing is intractable, requiring 2^n time.
- Tree Adjoining Grammars: capture domains of locality, with surface discontiguities the result of adjunction; parsed in polynomial time, n^6 (rather than n^3 for CFGs).

TAGs & Mildly Context Sensitive Languages: Swiss German

English relative clauses are nested

- NP1 [The mouse] VP1 [ate the cheese]. Form: NP1 VP1.
- NP1 [The mouse] NP2 [the cat] VP2 [chased] VP1 [ate the cheese]. Form: NP1 NP2 VP2 VP1.
- Theorem: languages of the form w w^R (a string followed by its reversal) are context free.

CFG trees naturally nest structure

[Figure: a CFG tree in which the relative clause NP2 VP2 ("the cat chased") nests between NP1 ("the mouse") and VP1 ("ate the cheese")]

Swiss German sentences are harder

- In English: NP1 [Claudia] VP1 [watched] NP2 [Eva] VP2 [make] NP3 [Ulrich] VP3 [work]. Form: NP1 VP1 NP2 VP2 NP3 VP3. Not hard.
- In Swiss German: NP1 [Claudia] NP2 [Eva] NP3 [Ulrich] VP1 [watched] VP2 [make] VP3 [work]. Form: NP1 NP2 NP3 VP1 VP2 VP3.
- Theorem: languages of the form w w (a string followed by a copy of itself) cannot be generated by context free grammars.

Scrambling: N1 N2 N3 V1 V2 V3

[Figure: three frames of a TAG derivation of the cross-serial order N1 N2 N3 V1 V2 V3: VP auxiliary trees carrying the N2/V2 and N3/V3 pairs (with empty e nodes) are adjoined at interior VP nodes of the N1/V1 tree]
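
To see concretely why the nested pattern is easy for a CFG while the cross-serial pattern is not, here is a small Python illustration of my own, not from the slides. A context-free recognizer has exactly one stack's worth of memory, and a single stack can only pair each VP with the most recently opened NP; that discipline accepts the nested English order and rejects the crossing Swiss German order. This illustrates the two theorems above; it is not itself a proof.

    def stack_matchable(tokens):
        """Pair each VPi with an NPi using one stack: push NPs, and force
        each VP to match the most recently pushed, still-open NP. This is
        the only pairing discipline a pushdown automaton can enforce."""
        stack = []
        for tok in tokens:
            kind, idx = tok[:2], tok[2:]
            if kind == "NP":
                stack.append(idx)
            elif not stack or stack.pop() != idx:
                return False              # VP closes the wrong dependency
        return not stack                  # every NP must be matched

    # Nested, English-style (w w^R shape): dependencies close in reverse order.
    print(stack_matchable(["NP1", "NP2", "VP2", "VP1"]))                # True
    # Cross-serial, Swiss-German-style (w w shape): VP1 arrives while NP3
    # is still on top of the stack, so the stack discipline fails.
    print(stack_matchable(["NP1", "NP2", "NP3", "VP1", "VP2", "VP3"]))  # False

Adjoining supplies exactly the extra power needed here: as in the scrambling trees above, each adjunction interleaves an N with its V across material already in the tree.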

A Simple Synchronous TAG Translator

[Figure: a pair of linked elementary trees, one per language, combined in lockstep]

Substituting in "John" and "Mary"

[Figure: the paired trees after substituting "John" and "Mary" in both languages]

Substituting "Apparently"

[Figure: the paired trees after adding "Apparently", completing the translated sentence pair]

Parsing TAGs by "Supertagging": Reducing Parsing to POS Tagging+

Supertag disambiguation: supertagging

- Given a corpus parsed by an LTAG grammar, we have statistics of supertags: unigram, bigram, trigram, etc.
- These statistics combine lexical statistics with the statistics of the constructions in which the lexical items appear.

Supertagging

[Figure: the sentence "the purchase price includes two ancillary companies", with each word shown above its column of candidate supertags (α and β trees)]
- On average, a lexical item has about 8 to 10 supertags.

Supertagging

[Figure: the same lattice with the correct supertag for each word highlighted]
- Select the correct supertag for each word.
- The correct supertag for a word is the one that corresponds to that word in the correct parse of the sentence.

Supertagging: Performance

Performance of a trigram supertagger on the WSJ corpus (Srinivas 1997):

  Training corpus    Test corpus (words)   Words correctly supertagged   % correct
  Baseline           47,000                35,391                        75.3%
  1 million words    47,000                43,334                        92.2%

Abstract character of supertagging

- Complex (richer) descriptions of primitives.
- This is contrary to the standard mathematical convention, in which descriptions of primitives are simple and complex descriptions are built out of simple ones.
- Instead, associate with each primitive all the information associated with it.

Complex descriptions of primitives

- Making the descriptions of primitives more complex increases local ambiguity: there are more descriptions for each primitive.
- However, these richer descriptions locally constrain each other.
- Analogy to a jigsaw puzzle: the richer the description of each piece, the better.

Complex descriptions of primitives

- Making the descriptions of primitives more complex allows statistics to be computed over these complex descriptions, and these statistics are more meaningful.
- Local statistical computations over these complex descriptions lead to robust and efficient processing.

A different perspective on LTAG

- Treat the elementary trees associated with a lexical item as a super part of speech (super-POS, or supertag).
- Local statistical techniques have been remarkably successful in disambiguating standard POS.
- Applying the same techniques to disambiguate supertags yields "almost parsing".
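
As a concrete picture of supertagging as n-gram sequence disambiguation, here is a minimal Viterbi sketch in Python. It is my own toy illustration, not Srinivas's system: it uses a bigram rather than a trigram model for brevity, and the lexicon, supertag names, and probabilities are invented, covering only a prefix of the slides' example sentence.

    import math
    from collections import defaultdict

    # Toy ambiguity classes: each word maps to its candidate supertags
    # (real lexical items average 8 to 10 candidates, per the slides).
    LEXICON = {
        "the":      ["a_det"],
        "purchase": ["a_nn", "a_vtrans"],   # noun tree or transitive-verb tree
        "price":    ["a_nn", "a_vtrans"],
        "includes": ["a_vtrans"],
    }

    # P(tag | previous tag): estimated from an LTAG-parsed corpus in a real
    # supertagger; hard-coded toy numbers here, with a tiny floor for
    # unseen bigrams.
    BIGRAM = defaultdict(lambda: 1e-6, {
        ("<s>", "a_det"):     0.9,
        ("a_det", "a_nn"):    0.8,
        ("a_nn", "a_nn"):     0.3,
        ("a_nn", "a_vtrans"): 0.5,
    })

    def supertag(words):
        """Viterbi search for the most probable supertag sequence."""
        trellis = [{"<s>": (0.0, None)}]    # tag -> (log prob, backpointer)
        for w in words:
            column = {}
            for tag in LEXICON[w]:
                score, back = max(
                    (prev + math.log(BIGRAM[(ptag, tag)]), ptag)
                    for ptag, (prev, _) in trellis[-1].items()
                )
                column[tag] = (score, back)
            trellis.append(column)
        # Trace the best path back through the backpointers.
        path = []
        tag = max(trellis[-1], key=lambda t: trellis[-1][t][0])
        for column in reversed(trellis[1:]):
            path.append(tag)
            tag = column[tag][1]
        return path[::-1]

    print(supertag("the purchase price includes".split()))
    # -> ['a_det', 'a_nn', 'a_nn', 'a_vtrans']

The search is exactly the one used for standard POS tagging; only the tag set has changed. Because each supertag is an entire elementary tree, picking the right one per word already fixes most of the parse, which is why the slides call this "almost parsing".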