CPSC Computational Linguistics - cs.ubc.ca

Intelligent Systems (AI2)
Computer Science cpsc422, Lecture 23
Oct 30, 2019
Slide credit: Probase Microsoft Research Asia, YAGO Max Planck Institute, National Lib. of Medicine, NIH
CPSC 422, Lecture 23 Slide 1

To summarize: Truth in first-order logic
Sentences are true with respect to an interpretation
World contains objects (domain elements)
Interpretation specifies referents for
  constant symbols -> objects
  predicate symbols -> relations
  function symbols -> functional relations
An atomic sentence predicate(term1,...,termn) is true iff the objects referred to by term1,...,termn are in the relation referred to by predicate

Quantifiers
Allow us to express
  Properties of collections of objects instead of enumerating objects by name (Universal: "for all" ∀)
  Properties of an unspecified object (Existential: "there exists" ∃)

Universal quantification
Everyone at UBC is smart: ∀x At(x, UBC) ⇒ Smart(x)
∀x P(x) is true in an interpretation I iff P is true with x being each possible object in I
Equivalent to the conjunction of instantiations of P
  At(KingJohn, UBC) ⇒ Smart(KingJohn)
  ∧ At(Richard, UBC) ⇒ Smart(Richard)
  ∧ At(Ralphie, UBC) ⇒ Smart(Ralphie)
  ∧ ...

Existential quantification
Someone at UBC is smart: ∃x At(x, UBC) ∧ Smart(x)
∃x P(x) is true in an interpretation I iff P is true with x being some possible object in I
Equivalent to the disjunction of instantiations of P
  At(KingJohn, UBC) ∧ Smart(KingJohn)
  ∨ At(Richard, UBC) ∧ Smart(Richard)
  ∨ At(Ralphie, UBC) ∧ Smart(Ralphie)
  ∨ ...

Properties of quantifiers
∃x ∀y is not the same as ∀y ∃x
  ∃x ∀y Loves(x,y): "There is a person who loves everyone in the world"
  ∀y ∃x Loves(x,y): "Everyone in the world is loved by at least one person"
Quantifier duality: each can be expressed using the other
  ∀x Likes(x,IceCream) ≡ ¬∃x ¬Likes(x,IceCream)
  ∃x Likes(x,Broccoli) ≡ ¬∀x ¬Likes(x,Broccoli)
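Over a finite domain, the reading of the quantifiers as a conjunction or disjunction of instantiations can be checked directly. The following is a minimal sketch, assuming a toy interpretation: the three objects come from the slides, but which of them are at UBC or smart is made up for illustration.

```python
# Toy interpretation (assumed for illustration): a finite domain and the
# extensions of the predicates At(x, UBC) and Smart(x).
domain = {"KingJohn", "Richard", "Ralphie"}
at_ubc = {"Richard", "Ralphie"}
smart = {"Richard", "Ralphie"}


def at(x):
    return x in at_ubc


def is_smart(x):
    return x in smart


def implies(p, q):
    # Material implication: p => q is false only when p is true and q is false.
    return (not p) or q


# Universal: conjunction of the instantiations of At(x, UBC) => Smart(x).
everyone_at_ubc_is_smart = all(implies(at(x), is_smart(x)) for x in domain)

# Existential: disjunction of the instantiations of At(x, UBC) and Smart(x).
someone_at_ubc_is_smart = any(at(x) and is_smart(x) for x in domain)

# Duality: "forall x P(x)" has the same truth value as "not exists x not P(x)".
forall_smart = all(is_smart(x) for x in domain)
not_exists_not_smart = not any(not is_smart(x) for x in domain)
duality_holds = forall_smart == not_exists_not_smart
```

Note that the universal sentence uses implication while the existential one uses conjunction; swapping them gives sentences that are almost never what was intended.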

FOL: Inference
Resolution Procedure can be generalized to FOL
Every formula can be rewritten in inferentially equivalent CNF (satisfiable exactly when the original formula is)
  Additional rewriting rules for quantifiers
Similar Resolution step, but variables need to be unified (like in DATALOG)

NLP Practical Goal for FOL: the ultimate Web question-answering system?
Map NL queries into FOPC so that answers can be effectively computed
What African countries are not on the Mediterranean Sea?
  ∃c Country(c) ∧ ¬Borders(c, Med.Sea) ∧ In(c, Africa)
Was 2007 the first El Nino year after 2001?
  ElNino(2007) ∧ ∀y (Year(y) ∧ After(y, 2001) ∧ Before(y, 2007) ⇒ ¬ElNino(y))

Just a sketch: to provide some context for some concepts / techniques covered in 422

Logics in AI: similar slide to the one for planning
[Diagram relating Propositional Definite Clause Logics, Propositional Logics, First-Order Logics, Ontologies, Semantics and Proof Theory, and Satisfiability Testing (SAT) to applications: Production Systems, Cognitive Architectures, Semantic Web, Information ..., Hardware Verification, Product Configuration, Video Games, Summarization, Tutoring Systems]

Lecture Overview
Ontologies
  what objects/individuals should we represent?
  what relations (unary, binary, ...)?
Inspiration from Natural Language:

  WordNet and FrameNet
Extensions based on Wikipedia and mining the Web (YAGO, ProBase, Freebase)
Domain Specific Ontologies (e.g., Medicine: MeSH, UMLS)
Links to Web Interfaces on course webpage
Each can be downloaded

Ontologies
Given a logical representation (e.g., FOL):
What individuals and relations are there that we need to model?
In AI an Ontology is a specification of what individuals and relationships are assumed to exist and what terminology is used for them
  What types of individuals
  What properties of the individuals
  ...

Ontologies: inspiration from Natural Language
How do we refer to individuals and relationships in the world in Natural Languages, e.g., English?
Where do we find definitions for words?
Most of the definitions are circular. They are descriptions.
Fortunately, there is still some useful semantic info (Lexical Relations):
  Homonymy: w1, w2 same Form and Sound, different Meaning
  Synonymy: w1, w2 same Meaning, different Form
  Antonymy: w1, w2 opposite Meaning
  Hyponymy: w1, w2 Meaning1 subclass of Meaning2

Polysemy
Def. The case where we have a set of words with the same form and multiple related meanings.
Consider the homonym: bank -> commercial bank1 vs. river bank2
Now consider:
  "VGH is the hospital with the largest blood bank in BC"
or
  "A PCFG can be trained using derivation trees from a tree bank annotated by human experts"
Are these new independent senses of bank?

Synonyms
Def. Different words with the same meaning.
Substitutability: if they can be substituted for one another in some environment without changing meaning or acceptability.
  Would I be flying on a large/big plane?
  ? became kind of a large/big sister to
  ? You made a large/big mistake

Hyponymy/Hypernym
Def. Pairings where one word denotes a sub/super class of the other
Since dogs are canids
  Dog is a hyponym of canid and
  Canid is a hypernym of dog
car/vehicle
doctor/human

Lexical Resources
Databases containing all lexical relations among all words
Development:
  Mining info from dictionaries and thesauri
  Handcrafting it from scratch
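The hyponym/hypernym pairs from the Hyponymy slide can be stored as a simple child-to-parent map and queried transitively. This is only a sketch under simplifying assumptions: each word has a single hypernym (real lexical resources relate senses, not words, and allow several parents), and the `canid -> mammal` link is an added assumption used only to show a chain longer than one step.

```python
# Child -> parent (hypernym) links.
HYPERNYM = {
    "dog": "canid",      # from the slide
    "car": "vehicle",    # from the slide
    "doctor": "human",   # from the slide
    "canid": "mammal",   # assumed extra link, only to illustrate transitivity
}


def hypernym_chain(word):
    """Return the list of successively more general terms above `word`."""
    chain = []
    while word in HYPERNYM:
        word = HYPERNYM[word]
        chain.append(word)
    return chain


def is_hyponym_of(word, ancestor):
    """True if `ancestor` is reachable from `word` by following hypernym links."""
    return ancestor in hypernym_chain(word)
```

For example, `hypernym_chain("dog")` walks up two links, and `is_hyponym_of("canid", "dog")` is false because the relation is not symmetric.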

WordNet: first developed with reasonable coverage and widely used, started with [Fellbaum 1998] for English (versions for other languages have been developed)

WordNet 3.0
Part Of Speech   Unique Strings   Synsets   Word-Sense Pairs
Noun                     117798     82115             146312
Verb                      11529     13767              25047
Adjective                 21479     18156              30002
Adverb                     4481      3621               5580
Totals                   155287    117659             206941
For each word: all possible senses (no distinction between homonymy and polysemy)
For each sense: a set of synonyms (synset) and a gloss

WordNet: entry for "table"
The noun "table" has 6 senses in WordNet.
  1. table, tabular array -- (a set of data ...)
  2. table -- (a piece of furniture ...)
  3. table -- (a piece of furniture with tableware ...)
  4. mesa, table -- (flat tableland ...)
  5. table -- (a company of people ...)
  6. board, table -- (food or meals ...)
The verb "table" has 1 sense in WordNet.
  1. postpone, prorogue, hold over, put over, table, shelve, set back, defer, remit, put off -- (hold back to a later time; "let's postpone the exam")

WordNet Relations (between synsets!)
[Table of WordNet relations]

WordNet Hierarchies: Vancouver
WordNet: example from ver. 1.7.1
For the three senses of Vancouver:
  (city, metropolis, urban center) => (municipality) => (urban area) => (geographical area) => (region) => (location) => (entity, physical thing)
  (administrative district, territorial division) => (district, territory) => (region) => (location) => (entity, physical thing)
  (port) => (geographic point) => (point) => (location) => (entity, physical thing)

Visualizing WordNet Relations
C. Collins, WordNet Explorer: Applying visualization principles to lexical semantics, University of Toronto, Technical Report kmdi 2007-2, 2007.

Web interface & API

WordNet: NLP Tasks
First success in an obscure task for Probabilistic Parsing (PP-attachments): words + word-classes extracted from the hypernym hierarchy increase accuracy from 84% to 88% [Stetina and Nagao, 1997]
Word sense disambiguation
Lexical Chains (summarization)
and many others!
More importantly, starting point for larger Ontologies!

More ideas from NLP
Relations among words and their meanings (paradigmatic)
Internal structure of individual words (syntagmatic)

Predicate-Argument Structure
Represent relationships among concepts, events and their participants
"I ate a turkey sandwich for lunch"
  ∃w: Isa(w, Eating) ∧ Eater(w, Speaker) ∧ Eaten(w, TurkeySandwich) ∧ MealEaten(w, Lunch)
"Nam does not serve meat"
  ¬∃w: Isa(w, Serving) ∧ Server(w, Nam) ∧ Served(w, Meat)
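The event-based representation above can be mimicked with a small fact store in which every event gets its own identifier and each role is a separate binary fact. This is a minimal sketch: the helper names `add_event` and `exists_event` are hypothetical, and the negated sentence is checked under a closed-world assumption (whatever is not stored is taken to be false), which FOL itself does not make.

```python
from itertools import count

facts = set()      # ground atoms stored as (predicate, event, value) triples
_ids = count(1)    # generator of fresh event identifiers


def add_event(kind, **roles):
    """Assert Isa(e, kind) plus one Role(e, filler) fact per keyword argument."""
    e = f"e{next(_ids)}"
    facts.add(("Isa", e, kind))
    for role, filler in roles.items():
        facts.add((role, e, filler))
    return e


def exists_event(kind, **roles):
    """Check 'exists w: Isa(w, kind) and Role1(w, f1) and ...' against the store."""
    events = {e for (p, e, k) in facts if p == "Isa" and k == kind}
    return any(all((r, e, f) in facts for r, f in roles.items()) for e in events)


# "I ate a turkey sandwich for lunch"
w = add_event("Eating", Eater="Speaker", Eaten="TurkeySandwich", MealEaten="Lunch")

ate_sandwich = exists_event("Eating", Eater="Speaker", Eaten="TurkeySandwich")

# "Nam does not serve meat": true here because no such serving event is stored.
nam_does_not_serve_meat = not exists_event("Serving", Server="Nam", Served="Meat")
```

Giving each role its own fact means a query can mention only the roles it cares about, as the partial `Eating` query does.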

Semantic Roles: Resources
Move beyond inferences about single verbs
  "IBM hired John as a CEO"
  "John is the new IBM hire"
  "IBM signed John for 2M$"
FrameNet: Databases containing frames and their syntactic and semantic argument structures
  book online, Version 1.5-update (revised in 2016)
  for English (versions for other languages are under development)
  FrameNet Tutorial at NAACL/HLT 2015!

FrameNet Entry: Hiring
Definition: An Employer hires an Employee, promising the Employee a certain Compensation in exchange for the performance of a job. The job may be described either in terms of a Task or a Position in a Field.
Inherits From: Intentionally affect
Lexical Units: commission.n, commission.v, give job.v, hire.n, hire.v, retain.v, sign.v, take on.v

FrameNet: Semantic Role Labeling
Some roles: Employer, Employee, Task, Position
  np-vpto: In 1979, singer Nancy Wilson HIRED him to open her nightclub act.
  np-ppas: Castro has swallowed his doubts and HIRED Valenzuela as a cook in his small restaurant.
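One way to picture the Hiring entry and its two labeled sentences is as a frame record plus annotations that map role names to text spans. The sketch below is an illustrative data layout, not FrameNet's actual file format or API: the `Frame` and `Annotation` classes are hypothetical, and the role assignments for the two sentences are a manual reading of the examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    name: str
    roles: tuple
    lexical_units: tuple


@dataclass
class Annotation:
    frame: Frame
    target: str    # the frame-evoking word in the sentence
    fillers: dict  # role name -> text span filling that role

    def __post_init__(self):
        # Reject role names that the frame does not define.
        unknown = set(self.fillers) - set(self.frame.roles)
        if unknown:
            raise ValueError(f"roles not in frame {self.frame.name}: {sorted(unknown)}")


HIRING = Frame(
    name="Hiring",
    roles=("Employer", "Employee", "Compensation", "Task", "Position", "Field"),
    lexical_units=("commission.n", "commission.v", "give job.v", "hire.n",
                   "hire.v", "retain.v", "sign.v", "take on.v"),
)

# np-vpto pattern: the job is described as a Task.
ex1 = Annotation(HIRING, "HIRED", {
    "Employer": "singer Nancy Wilson",
    "Employee": "him",
    "Task": "to open her nightclub act",
})

# np-ppas pattern: the job is described as a Position.
ex2 = Annotation(HIRING, "HIRED", {
    "Employer": "Castro",
    "Employee": "Valenzuela",
    "Position": "as a cook",
})
```

The two annotations share the Employer and Employee roles even though the sentences differ syntactically, which is the generalization a frame is meant to capture.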

Lecture Overview
Ontologies
  what objects/individuals should we represent?
  what relations (unary, binary, ...)?
Inspiration from Natural Language:
  WordNet and FrameNet
Extensions based on Wikipedia and mining the Web & Web search logs (YAGO, ProBase, Freebase)
Domain Specific Ontologies (e.g., Medicine: MeSH, UMLS)

YAGO2: huge semantic knowledge base
Derived from Wikipedia, WordNet and GeoNames (started in 2007, paper in WWW conference)
10^6 entities (persons, organizations, cities, etc.)
>120 * 10^6 facts about these entities
Accuracy of 95%: YAGO has been manually evaluated.
Anchored in time and space: YAGO attaches a temporal dimension and a spatial dimension to many of its facts and entities.

Freebase
Collaboratively constructed database.
Freebase contains tens of millions of topics, thousands of types, tens of thousands of properties, and over a billion facts
Automatically extracted from a number of resources including Wikipedia, MusicBrainz, and Notable Names Database (NNDB), as well as the knowledge contributed by human volunteers.
All was available for free through the APIs or to download from weekly data dumps

Fast Changing Landscape
On 16 December 2015, Google officially announced the Knowledge Graph API, which is meant to be a replacement for the Freebase API.
Freebase.com was officially shut down on 2 May 2016.

Probase (MS Research)
Harnessed from billions of web pages and years' worth of search logs
Extremely large concept/category space (2.7 million categories)
Probabilistic model for correctness, typicality (e.g., between concept and instance)
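To make "typicality between concept and instance" concrete, one simple reading is a conditional probability estimated from how often an instance is observed under a concept. The sketch below is an assumption about what such a score could look like, not Probase's actual model, and all counts are hypothetical.

```python
from collections import Counter

# Hypothetical counts of how often "<instance> is a <concept>" was observed.
pair_counts = Counter({
    ("city", "Paris"): 60,
    ("city", "Vancouver"): 30,
    ("city", "Springfield"): 10,
    ("musician", "Madonna"): 50,
})


def typicality(instance, concept):
    """Estimate P(instance | concept) as a relative frequency of observations."""
    total = sum(n for (c, _), n in pair_counts.items() if c == concept)
    if total == 0:
        return 0.0
    return pair_counts[(concept, instance)] / total
```

Under these toy counts, "Paris" is a more typical city than "Springfield", and an instance never seen with a concept gets a score of zero.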

[Figure: a snippet of Probase's core taxonomy]

Frequency distribution of the 2.7 million concepts
The Y axis is the number of instances each concept contains, and on the X axis are the 2.7 million concepts ordered by their size.
Besides popular concepts such as "cities" and "musicians", which are included by almost every general purpose taxonomy, Probase has millions of long tail concepts such as "anti-parkinson treatments", "celebrity wedding dress designers" and "basic ..."

Fast Changing Landscape
From Probase page [Sept. 2016]: Please visit our Microsoft Concept Graph release for up-to-date information of this project!
Another one: DBpedia
DBpedia is a crowd-sourced community effort to extract structured information from Wikipedia and make this information available on the Web.
DBpedia allows you to ask sophisticated queries against Wikipedia, and to link the different data sets on the Web

Interesting dimensions to compare Ontologies (but from Probase so possibly biased)

Lecture Overview
Ontologies
  what objects/individuals should we represent?
  what relations (unary, binary, ...)?
Inspiration from Natural Language:
  WordNet and FrameNet
Extensions based on Wikipedia and mining the Web (YAGO, ProBase, Freebase)
Domain Specific Ontologies (e.g., Medicine: MeSH, UMLS)

Domain Specific Ontologies: UMLS, MeSH
Unified Medical Language System: brings together many health and biomedical vocabularies
  Enable interoperability (linking medical terms, drug names)
  Develop electronic health records, classification tools
  Search engines, data mining

Portion of the UMLS Semantic Net

Ontologies: Summary

Learning Goals for today's class
You can:
  Define an Ontology
  Describe and Justify the information represented in WordNet and FrameNet
  Describe and Justify the three dimensions for comparing ontologies

Assignment-3 out, due Nov 18 (8-18 hours; working in pairs on programming parts is strongly advised)
Next class Fri: Similarity measures in ontologies (e.g., WordNet)

DBpedia is a structured twin of Wikipedia. Currently it describes more than 3.4 million entities. DBpedia resources bear the names of the Wikipedia pages from which they have been extracted.
YAGO is an automatically created ontology, with taxonomy structure derived from WordNet, and knowledge about individuals extracted from Wikipedia. Therefore, the identifiers of resources describing individuals in YAGO are named as the corresponding Wikipedia pages. YAGO contains knowledge about more than 2 million entities and 20 million facts about them.
Freebase is a collaboratively constructed database. It contains knowledge automatically extracted from a number of resources including Wikipedia, MusicBrainz, and NNDB, as well as the knowledge contributed by human volunteers. Freebase describes more than 12 million interconnected entities. Each Freebase entity is assigned a set of human-readable unique keys, which are assembled from a value and a namespace. One of the namespaces is the Wikipedia namespace, in which a value is the name of the Wikipedia page describing an entity.

Summary
Relations among words and their meanings
  WordNet
  YAGO

  Probase
Internal structure of individual words
  PropBank
  VerbNet
  FrameNet

Table 1: Scale of concept dimension
name            # of concepts
SenticNet              14,244
Freebase                1,450
WordNet                25,229
WikiTaxonomy        < 127,325
YAGO                  149,162
DBPedia                   259
ResearchCyc           120,000
KnowItAll                 N/A
TextRunner                N/A
OMCS                   23,365
NELL                      123
Probase             2,653,872

Today 12 Feb: Syntax-Driven Semantic Analysis
Meaning of words
Relations among words and their meanings (Paradigmatic)
Internal structure of individual words (Syntagmatic)

Practical Goal for (Syntax-driven) Semantic Analysis
Map NL queries into FOPC so that answers can be effectively computed
What African countries are not on the Mediterranean Sea?
  ∃c Country(c) ∧ ¬Borders(c, Med.Sea) ∧ In(c, Africa)
Was 2007 the first El Nino year after 2001?
  ElNino(2007) ∧ ∀y (Year(y) ∧ After(y, 2001) ∧ Before(y, 2007) ⇒ ¬ElNino(y))
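Once a question has been mapped to a logical form, answering it amounts to evaluating that form against a knowledge base. The sketch below evaluates both example queries over tiny hand-built fact sets. The function names are hypothetical, the country facts cover only a handful of countries, and the El Nino years are placeholder values chosen for illustration, not climate data.

```python
# Toy knowledge base for the first query (a small, hand-picked set of countries).
country = {"Egypt", "Libya", "Kenya", "Chad", "France"}
in_africa = {"Egypt", "Libya", "Kenya", "Chad"}
borders_med_sea = {"Egypt", "Libya", "France"}


def african_countries_not_on_med():
    """All c with Country(c) and not Borders(c, Med.Sea) and In(c, Africa)."""
    return sorted(c for c in country
                  if c not in borders_med_sea and c in in_africa)


def first_el_nino_after(candidate, start, el_nino_years, years):
    """ElNino(candidate) and no El Nino year y with start < y < candidate."""
    return candidate in el_nino_years and all(
        y not in el_nino_years for y in years if start < y < candidate
    )


years = range(1995, 2011)
toy_el_nino_years = {1997, 2007}  # placeholder values, not real observations
```

With these toy facts, `african_countries_not_on_med()` returns the two African countries without a Mediterranean coast, and `first_el_nino_after(2007, 2001, toy_el_nino_years, years)` holds because no stored El Nino year falls strictly between 2001 and 2007.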

Semantic Analysis
[Diagram: Sentence -> Syntax-driven Semantic Analysis -> Literal Meaning -> Further Analysis -> Intended meaning, with INFERENCE spanning the pipeline. Inputs: meanings of grammatical structures, meanings of words, common-sense domain knowledge, discourse structure, context. Example exchanges: "Shall we meet on Tue?" / "I am going to SFU on Tu"; "What time is it?" / "The garbage truck just left"]

Compositional Analysis
Principle of Compositionality: The meaning of a whole is derived from the meanings of the parts
What parts? The constituents of the syntactic parse of the input

Compositional Analysis: Example
AyCaramba serves meat
  ∃e Serving(e) ∧ Server(e, AyCaramba) ∧ Served(e, Meat)

Augmented Rules
Augment each syntactic CFG rule with a semantic formation rule
Abstractly: A -> α1 ... αn   { f(α1.sem, ..., αn.sem) }
i.e., the semantics of A can be computed from some function applied to the semantics of its parts.
The class of actions performed by f will be quite restricted.

Simple Extension of FOL: Lambda Forms
A FOL sentence with variables in it that are to be bound: λx P(x)
Lambda-reduction: variables are bound by treating the lambda form as a function with formal arguments
  λx P(x)(Sally)  =>  P(Sally)
  λx λy In(x, y) ∧ Country(y)
  λx λy In(x, y) ∧ Country(y) (BC)  =>  λy In(BC, y) ∧ Country(y)
  λy In(BC, y) ∧ Country(y) (CANADA)  =>  In(BC, CANADA) ∧ Country(CANADA)

Augmented Rules: Example
Concrete entities: assigning FOL constants
  PropNoun -> AyCaramba   {AyCaramba}
  MassNoun -> meat        {MEAT}
Simple non-terminals: copying from daughters up to mothers
  NP -> PropNoun   {PropNoun.sem}
  NP -> MassNoun   {MassNoun.sem}

Augmented Rules: Example
Semantics attached to one daughter is applied to semantics of the other daughter(s).
  S -> NP VP       {VP.sem(NP.sem)}
  VP -> Verb NP    {Verb.sem(NP.sem)}
  Verb -> serves   {λx λy ∃e Serving(e) ∧ Server(e, y) ∧ Served(e, x)}   (lambda-form)

Example
[Figure: parse tree annotated with the semantic attachments AC and MEAT, built with the rules below]
  S -> NP VP              {VP.sem(NP.sem)}
  VP -> Verb NP           {Verb.sem(NP.sem)}
  Verb -> serves          {λx λy ∃e Serving(e) ∧ Server(e, y) ∧ Served(e, x)}
  NP -> PropNoun          {PropNoun.sem}
  NP -> MassNoun          {MassNoun.sem}
  PropNoun -> AyCaramba   {AC}
  MassNoun -> meat        {MEAT}
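The semantic attachments in the rules above map naturally onto nested functions: each lambda form becomes a one-argument function, and each attachment becomes a function application. The following is a minimal sketch for the single sentence "AyCaramba serves meat", assuming the parse is already known; the logical form is built as a plain string, and the helper names (`np_sem`, `vp_sem`, `s_sem`) are hypothetical.

```python
# Lexical rules: constants for the nouns, a curried lambda form for the verb.
PROPNOUN_SEM = "AyCaramba"   # PropNoun -> AyCaramba {AyCaramba}
MASSNOUN_SEM = "MEAT"        # MassNoun -> meat {MEAT}

# Verb -> serves {λx λy ∃e Serving(e) ∧ Server(e, y) ∧ Served(e, x)}
serves_sem = lambda x: lambda y: (
    f"∃e Serving(e) ∧ Server(e, {y}) ∧ Served(e, {x})"
)


def np_sem(daughter_sem):
    # NP -> PropNoun / MassNoun: copy the daughter's semantics up to the mother.
    return daughter_sem


def vp_sem(verb, obj_np):
    # VP -> Verb NP {Verb.sem(NP.sem)}: binds x, leaving a function of y.
    return verb(obj_np)


def s_sem(subj_np, vp):
    # S -> NP VP {VP.sem(NP.sem)}: binds y with the subject's semantics.
    return vp(subj_np)


vp = vp_sem(serves_sem, np_sem(MASSNOUN_SEM))
sentence_sem = s_sem(np_sem(PROPNOUN_SEM), vp)
```

The order of the lambda variables matters: the object is consumed first inside the VP, and the subject is supplied last at the S level, which is why `x` is the thing served and `y` is the server.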

References (Project?)
Text Book: Patrick Blackburn and Johan Bos (2005): Representation and Inference for Natural Language: A First Course in Computational Semantics. CSLI.
J. Bos (2011): A Survey of Computational Semantics: Representation, Inference and Knowledge in Wide-Coverage Text Understanding. Language and Linguistics Compass 5(6): 336-366.

Next Time
Read Chp. 19 (Lexical Semantics)

Next Time
Read Chp. 20 Computational Lexical Semantics
  Word Sense Disambiguation
  Word Similarity
  Semantic Role Labeling

Lexeme: Orthographic form + Phonological form + Meaning (sense)
[Modulo inflectional morphology]
  content? bank? duck? celebration? celebrate? banks?
  Stem? Word? Lemma?
Lexicon: A collection of lexemes

Homonymy
Def. Lexemes that have the same forms but unrelated meanings
Examples:
  Bat (wooden stick-like thing) vs. Bat (flying scary mammal thing)
  Plant (...) vs. Plant (...)
Homonyms
  Homographs: content/content
  Homophones: wood/would

Relevance to NLP Tasks
Information retrieval (homonymy): QUERY: bat
Spelling correction: homophones can lead to real-word spelling errors
Text-to-Speech: homographs (which are not homophones)

Polysemy
Lexeme (new def.): Orthographic form + Phonological form + Set of related senses
How many distinct (but related) senses?
  They serve meat
  He served as Dept. Head
  She served her time (prison)
Different subcat
Intuition
Zeugma
  Does AC serve vegetarian food?
  Does AC serve Rome?
  (?) Does AC serve vegetarian food and Rome?

Thematic Roles: Usage
[Diagram: Sentence -> Syntax-driven Semantic Analysis -> Literal Meaning expressed with thematic roles -> Further Analysis -> Intended meaning. Thematic roles support constraint generation (e.g., Instrument "with"; e.g., Subject?) and more abstract INFERENCE (e.g., Result did not exist before)]

Semantic Roles
Def. Semantic generalizations over the specific roles that occur with specific verbs.
I.e. eaters, servers, takers, givers, makers, doers, killers all have something in common
We can generalize (or try to) across other roles as well

Thematic Role Examples
[Table of thematic role examples]

Thematic Roles
[Table of thematic roles]
Not definitive, not from a single theory!

Problem with Thematic Roles
NO agreement on what should be the standard set
NO agreement on formal definition
Fragmentation problem: when you try to formally define a role you end up creating more specific sub-roles
Two solutions
  Generalized semantic roles
  Define verb (or class of verbs) specific semantic roles

Generalized Semantic Roles
Very abstract roles are defined heuristically as a set of conditions
The more conditions are satisfied, the more likely an argument fulfills that role
Proto-Agent
  Volitional involvement in event or state
  Sentience (and/or perception)
  Causing an event or change of state in another participant
  Movement (relative to position of another participant)
  (exists independently of the event named)
Proto-Patient
  Undergoes change of state
  Incremental theme
  Causally affected by another participant
  Stationary relative to movement of another participant
  (does not exist independently of event, or at all)

Semantic Roles: Resources
Databases containing for each verb its syntactic and thematic argument structures
PropBank: sentences in the Penn Treebank annotated with semantic roles
Roles are verb-sense specific
Arg0 (PROTO-AGENT), Arg1 (PROTO-PATIENT), Arg2, ...
(see also VerbNet)

PropBank Example
Increase "go up incrementally"
  Arg0: causer of increase
  Arg1: thing increasing
  Arg2: amount increased by
  Arg3: start point
  Arg4: end point
Glosses for human reader. Not formally defined
PropBank semantic role labeling would identify common aspects among these three examples:
  "Y performance increased by 3%"
  "Y performance was increased by the new X technique"
  "The new X technique increased performance of ..."
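The point of the PropBank example is that the same numbered argument shows up in different syntactic positions. A small sketch of such labeled data, assuming a plain dictionary layout (not PropBank's own file format) and a manual reading of the first two example sentences:

```python
# Role glosses for "increase" as listed on the slide.
INCREASE_ROLES = {
    "Arg0": "causer of increase",
    "Arg1": "thing increasing",
    "Arg2": "amount increased by",
    "Arg3": "start point",
    "Arg4": "end point",
}

# Hand-labeled examples (labels are an illustrative reading, not corpus data).
examples = [
    {"text": "Y performance increased by 3%",
     "args": {"Arg1": "Y performance", "Arg2": "3%"}},
    {"text": "Y performance was increased by the new X technique",
     "args": {"Arg1": "Y performance", "Arg0": "the new X technique"}},
]


def fillers(role):
    """Collect the spans labeled with `role` across all examples."""
    if role not in INCREASE_ROLES:
        raise KeyError(role)
    return [ex["args"][role] for ex in examples if role in ex["args"]]
```

Here `fillers("Arg1")` returns "Y performance" for both sentences, even though it is the subject of an intransitive verb in the first and the subject of a passive in the second.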
