94 Cards in this Set
What are grammar formalisms? |
The rules of a language, a precise way to detail and describe the structure of sentences. |
|
What are specific grammars? |
Implementations of a particular grammar formalism |
|
Types of verbs? |
1) Intransitive verbs: take only a subject (sleep) 2) Transitive verbs: take one direct object (eat) 3) Ditransitive verbs: take a direct object and an indirect object (give) |
|
Why can't FSA describe grammar? |
FSAs cannot generate the hierarchical structure that most grammars require. |
|
Difference between strong and weak generative capacity? |
Weak generative capacity: generating only the strings of a language. Strong generative capacity: also generating the right structures for those strings. |
|
Use of CFG's? |
CFGs can capture unbounded recursion, which FSAs cannot (an FSA can only embed a bounded number of relative clauses, e.g. one or two). CFGs are equivalent to pushdown automata (PDAs). |
|
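To illustrate the equivalence with pushdown automata, here is a minimal sketch (not from the cards): it recognizes the context-free language aⁿbⁿ, which no FSA can accept, by simulating a PDA's stack with a counter.

```python
def matches_anbn(s: str) -> bool:
    """Recognize a^n b^n (n >= 0) by simulating a pushdown automaton:
    push for each 'a', pop for each 'b'."""
    depth = 0       # stack depth
    seen_b = False
    for ch in s:
        if ch == "a":
            if seen_b:
                return False  # an 'a' after a 'b' is out of order
            depth += 1
        elif ch == "b":
            seen_b = True
            depth -= 1
            if depth < 0:
                return False  # more b's than a's so far
        else:
            return False
    return depth == 0
```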
DG? |
Dependency Grammar: describes the structure of sentences as a directed acyclic graph (words = nodes, dependencies = edges). |
|
Constituent? |
A constituent is a phrase that acts as a single unit. |
|
How to find constituents? |
1) Substitution Test 2) Movement Test 3) Answer Test: can the phrase stand alone as the answer to a question? |
|
Arguments vs Adjuncts |
Arguments: Mandatory Adjuncts: Optional |
|
Chomsky Normal Form |
Can only have two kinds of right-hand sides: 1) two nonterminals 2) one terminal |
|
CKY Algorithm |
An O(n³|G|) dynamic programming algorithm that finds all parse trees of a sentence under a grammar in Chomsky Normal Form. |
|
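A compact sketch of the CKY recognizer, assuming the grammar is already in Chomsky Normal Form and encoded as a dict (the toy grammar below is hypothetical):

```python
from collections import defaultdict

def cky(words, grammar):
    """CKY recognizer. grammar maps a terminal or a (B, C) nonterminal
    pair to the set of left-hand sides that can produce it.
    Returns the nonterminals spanning the whole sentence."""
    n = len(words)
    chart = defaultdict(set)  # chart[(i, j)] = nonterminals deriving words[i:j]
    for i, w in enumerate(words):
        chart[(i, i + 1)] |= grammar.get(w, set())
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):          # split point
                for B in chart[(i, k)]:
                    for C in chart[(k, j)]:
                        chart[(i, j)] |= grammar.get((B, C), set())
    return chart[(0, n)]

# Hypothetical toy grammar in CNF
toy = {"she": {"NP"}, "sushi": {"NP"}, "eats": {"V"},
       ("V", "NP"): {"VP"}, ("NP", "VP"): {"S"}}
```

`cky(["she", "eats", "sushi"], toy)` yields `{"S"}`, so the sentence is grammatical under the toy grammar.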
Number of Parse Trees in CKY |
Count trees bottom-up, one per way of building each constituent: #(VP → V NP) = #(V) × #(NP); then add up all the possible combinations for each cell. |
|
Probabilistic Context-Free Grammar |
A CFG with a probability attached to each rule. The probabilities of all rules with the same left-hand side must sum to 1. |
|
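The normalization constraint can be checked mechanically; a small sketch (the rule triples used in the test are hypothetical):

```python
from collections import defaultdict

def is_normalized(rules, tol=1e-9):
    """Check that, for every left-hand side, the probabilities of its
    rules sum to 1. rules: iterable of (lhs, rhs, prob) triples."""
    totals = defaultdict(float)
    for lhs, _rhs, prob in rules:
        totals[lhs] += prob
    return all(abs(total - 1.0) < tol for total in totals.values())
```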
PCFG Modeling |
PCFGs make strong independence assumptions. Flatter trees require fewer rule applications, and hence fewer independence assumptions: the flatter, the better. |
|
Bias/Variance Tradeoff |
A probability model has low bias if it makes few independence assumptions, but this leads to higher variance: the estimates from limited data become unreliable. |
|
Parser Evaluation: Evaluation of phrase-structure trees |
Precision: what percentage of retrieved items are relevant? P = TP/(TP + FP). Recall: what percentage of relevant items were retrieved? R = TP/(TP + FN). F-measure: harmonic mean of P and R. |
|
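The three scores reduce to a few lines; a sketch from raw counts:

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 (their harmonic mean) from true-positive,
    false-positive and false-negative counts."""
    p = tp / (tp + fp)
    r = tp / (tp + fn)
    return p, r, 2 * p * r / (p + r)
```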
Penn Treebank |
The Penn Treebank was the first large publicly available syntactically annotated corpus |
|
Ways to improve performance in PFCG |
1) Change the internal grammar 2) Change the probability model |
|
Parent Transformation |
PCFGs assume the expansion of any nonterminal is independent of its parent. We can change the grammar by adding the name of the parent node to each nonterminal. |
|
Lexicalization with PCFG |
PCFGs can't distinguish between "eat sushi with chopsticks" and "eat sushi with tuna"; therefore we need to take head words into account. |
|
Dependency Grammar |
Word-word dependencies are a component of many (most/all) grammar formalisms. Dependency grammar assumes that syntactic structure consists only of dependencies. Purely descriptive. |
|
Kinds of dependencies |
Head-argument: eat sushi Head-modifier: fresh sushi Head-specifier: the sushi Coordination: Sushi and Sashimi |
|
Dependency |
There is a syntactic relation between a head H and a dependent D in a construction C if: 1) H determines the syntactic category of C 2) H determines the semantic category of C; D gives semantic specification 3) H is obligatory 4) H selects D and determines whether D is obligatory or not. |
|
Prague Dependency Treebank |
3 levels of annotation: Morphological Surface-syntactic Semantic |
|
Parsing Algorithms for DG |
1) Transition-based parsers 2) Graph-based parsers |
|
Transition-based parsing |
Transition-based shift-reduce parsing processes the sentence S = w0 w1 ... wn from left to right and constructs a single tree. It uses a stack and a buffer to create arcs. |
|
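A minimal arc-standard sketch (hypothetical, not the full algorithm): it only replays a given action sequence over word indices, producing (head, dependent) arcs; in a real parser a classifier chooses the next action.

```python
def replay_transitions(n_words, actions):
    """Arc-standard shift-reduce sketch: apply a fixed action sequence.
    LEFT makes the stack top the head of the item below it;
    RIGHT makes the item below the head of the stack top."""
    stack, buffer, arcs = [], list(range(n_words)), []
    for action in actions:
        if action == "SHIFT":
            stack.append(buffer.pop(0))        # move next word onto the stack
        elif action == "LEFT":
            dependent = stack.pop(-2)
            arcs.append((stack[-1], dependent))
        elif action == "RIGHT":
            dependent = stack.pop()
            arcs.append((stack[-1], dependent))
    return arcs
```

For "she eats sushi" (indices 0, 1, 2), the sequence SHIFT, SHIFT, LEFT, SHIFT, RIGHT yields the arcs eats→she and eats→sushi.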
Feature Structures |
A list of features (= attributes) and values, represented as an AVM (attribute-value matrix). Can be recursive. Can be used to express constraints. |
|
Grammar Formalisms |
Formalisms provide a language in which linguistic theories can be expressed and implemented. Formalisms define elementary objects and recursive operations which generate complex objects from simple objects |
|
Mildly context-sensitive grammars |
Contain all context-free grammars/languages. Can be parsed in polynomial time (O(n⁶)). Strong generative capacity: capture certain kinds of dependencies, both nested and cross-serial. |
|
Tree-Adjoining Grammar (TAG) |
Tree-rewriting formalism that defines operations on trees (elementary objects = trees). Is lexicalized: each elementary tree is anchored to a lexical item. Is mildly context-sensitive. |
|
Combinatory Categorial Grammar (CCG) |
Categories: specify subcat lists of words/constituents Combinatory rules: specify how constituents can combine Lexicon: Specifies which categories a word can have Derivations: spell out process of combining constituents |
|
CCG Categories |
Simple (atomic) categories: NP, S, PP
VP, intransitive verbs: S\NP. Transitive verbs: (S\NP)/NP. Adverbs: (S\NP)\(S\NP). Prepositions: ((S\NP)\(S\NP))/NP, (NP\NP)/NP, PP/NP |
|
Function Application |
Forward Application: X/Y Y => X, e.g. (S\NP)/NP NP => S\NP. Backward Application: Y X\Y => X, e.g. NP S\NP => S. |
|
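A naive sketch that treats CCG categories as plain strings — sufficient for the examples above, not a full parser:

```python
def strip_parens(cat):
    """Remove outer parentheses only if they wrap the whole category."""
    if not (cat.startswith("(") and cat.endswith(")")):
        return cat
    depth = 0
    for i, ch in enumerate(cat):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0 and i < len(cat) - 1:
                return cat  # the "(" closes early: not a wrapper
    return cat[1:-1]

def forward_apply(fn, arg):
    """Forward application: X/Y  Y  =>  X."""
    if fn.endswith("/" + arg):
        return strip_parens(fn[: -(len(arg) + 1)])
    return None

def backward_apply(arg, fn):
    """Backward application: Y  X\\Y  =>  X."""
    if fn.endswith("\\" + arg):
        return strip_parens(fn[: -(len(arg) + 1)])
    return None
```

Deriving "she eats sushi": forward-apply the transitive verb (S\NP)/NP to the object NP to get S\NP, then backward-apply to the subject NP to get S.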
Why is MT Difficult |
1) Correspondences 2) Lexical divergences 3) Syntactic divergences |
|
MT - Correspondences |
One to one, One to Many/Many to One, Many to Many Reordering Required |
|
MT - Lexical Divergences |
Different senses of homonymous words generally have different translations (river bank vs. financial bank). Different senses of polysemous words may also have different translations. |
|
MT - Lexical Specificity and Morphological Divergences |
Lexical Specificity: German Kürbis = English pumpkin or squash English brother = Chinese gege (older), didi (younger) Morphological divergences: un nouveau, une nouvelle, etc. |
|
MT - Syntactic Divergences |
Head-marking vs. Dependent-marking |
|
MT - Semantic Differences |
Peter swims vs Peter is swimming |
|
Direct Translation |
1) Morphological analysis of the source string 2) Lexical transfer (using a translation dictionary) 3) Local reordering 4) Morphological generation |
|
Limits of direct translation: |
Phrasal reordering |
|
Syntactic Transfer |
Requires a syntactic parse of the source language, followed by reordering of the tree, which moves constituents into target-language order. |
|
Interlingua Approach |
Based on the assumption that there is a common meaning representation (interlingua) shared by all languages: analyze the source into it, then generate the target from it. |
|
Noisy Channel Model |
Ê = argmax_E P(F|E) × P(E). P(F|E) = translation model, captures the faithfulness of the translation. P(E) = language model, captures the fluency of the translation. |
|
Translation Probability |
Use a phrase table and phrase alignment on a parallel corpus to get the counts |
|
IBM Model 1 |
A word-based translation model for the noisy channel: 1) generate the length 2) generate an alignment 3) generate the words of the translation |
|
IBM Model 1 Parameters |
Length probability P(m|n): probability of generating a source sentence of length m given a target sentence of length n. Alignment probability P(A|m,n). Translation probability P(fj = "lac" | aj = i, ei = "lake") |
|
IBM Model 1 EM |
Expectation-Maximization (EM) E: Go through training data to compute counts M: Use the expected counts to compute a new model Check for convergence |
|
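The E and M steps above, applied to IBM Model 1's translation table t(f|e), fit in a short sketch (the two-sentence parallel corpus in the usage below is a hypothetical toy):

```python
from collections import defaultdict

def ibm_model1(corpus, iterations=10):
    """Estimate word translation probabilities t(f|e) with EM (IBM Model 1 sketch).
    corpus: list of (source_words, target_words) sentence pairs."""
    f_vocab = {f for fs, _ in corpus for f in fs}
    t = defaultdict(lambda: 1.0 / len(f_vocab))  # uniform initialization
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: collect expected alignment counts
        for fs, es in corpus:
            for f in fs:
                z = sum(t[(f, e)] for e in es)   # normalizer over alignments
                for e in es:
                    c = t[(f, e)] / z
                    count[(f, e)] += c
                    total[e] += c
        # M-step: re-estimate t(f|e) from the expected counts
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t
```

On the toy pairs ("la maison" / "the house", "la fleur" / "the flower"), EM gradually concentrates probability on the correct pairings (la↔the, maison↔house, fleur↔flower).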
Phrase-based MT |
The fundamental units of translation are phrases. Segments the source sentence into phrases and translates each phrase; reorders the foreign phrases with a distortion probability. |
|
Phrase-based MT - Finding the best translation |
Since there is an exponential number of possible translations, use a heuristic search algorithm and score each partial translation by its cost. |
|
MT Evaluation |
Automatic: BLEU uses n-gram precision to score a translation. Manual: humans rate fluency and adequacy. |
|
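The core of BLEU is clipped n-gram precision; a sketch for a single reference (real BLEU combines several n-gram orders and adds a brevity penalty):

```python
from collections import Counter

def ngram_precision(candidate, reference, n):
    """Clipped n-gram precision: each candidate n-gram counts at most
    as often as it appears in the reference."""
    cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
    ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
    overlap = sum(min(c, ref[g]) for g, c in cand.items())
    return overlap / max(sum(cand.values()), 1)
```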
Semantics |
Lexical Semantics: meaning of a word Compositional Semantics: meaning of a sentence Discourse Semantics: meaning of a longer piece of text |
|
Nouns/Verbs |
Nouns: refer to entities in the world. Verbs: define n-ary predicates; depending on the arguments they take, the result can be true or false. |
|
Sentences |
Declarative sentences can be true or false depending on the state of the world. Consist of a verb + one or more NPs. |
|
Principle of compositionality (Frege): |
The meaning of an expression depends on the meaning of its parts and how they are put together. |
|
Predicate logic expressions/ First-order predicate language (FOL) |
Terms: refer to entities. Predicates: relations between entities. Formulas: can be true or false. |
|
Predicate Formulas |
Atomic Formulas: book(x), eat(x, y) Complex Formulas involve 1) Negation 2) Connectives 3) Quantifiers |
|
Limitations of FOL |
Tense, Aspect, Modals, Quantifiers |
|
lambda-expressions |
Use lambda expressions along with CCGs to construct sentence meanings, e.g. N = λz. chef(z) |
|
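Python's own lambdas can mimic the beta-reduction involved; a hypothetical toy (the predicates chef/cooks and the entities are made up for illustration):

```python
# N: λz. chef(z) — a unary predicate over entities
chef = lambda z: z in {"alice"}

# Transitive verb: λy.λx. cooks(x, y), curried so the object applies first
cooks = lambda y: lambda x: (x, "cooks", y)

# "alice cooks soup": apply to the object, then to the subject
meaning = cooks("soup")("alice")
```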
Quantifier Scope Ambiguity |
"Every chef cooks a meal": For every chef, there is a meal which he cooks or There is some meal which every chef cooks |
|
Thematic Roles |
Verbs describe events or states. Thematic roles refer to the participants of these events: Agent: the person who performs the action. Patient: the entity the action is performed on. Tool (instrument): what was used to perform the action. |
|
Proposition Bank (PropBank) |
PropBank annotates the arguments that verbs take. |
|
Diathesis Alternations |
Active/passive alternation Causative alternation Dative alternation Locative alternation |
|
Discourse |
Discourse is a linguistic unit that consists of multiple sentences.
Speaker attempts to get the listener to construct a similar model of the situation |
|
Entity-based coherence |
How we refer to entities influences how coherent a discourse is. |
|
Centering Theory |
A linguistic theory of entity-based coherence and salience that predicts which entities are salient at any point during a discourse. Also predicts whether a discourse is entity-coherent based on its referring expressions. It is not a computational model. |
|
Coherence relations |
A sentence is coherent with its context if it stands in a meaningful relation to it, e.g. provides an explanation. |
|
Rhetorical Structure Theory (RST) |
Describes coherence relations between utterances. It defines a set of rhetorical relations (Evidence, Elaboration, Attribution, Contrast, List) that hold between a nucleus and a satellite. Some impose constraints. |
|
How to understand discourse |
1) Coreference Resolution: "the cafe" = "Bevande" 2) Identifying discourse relations: "He wanted to buy lunch" is the reason "John went to Bevande" |
|
Discourse Models |
Explicit Representation of the events and entities that a discourse talks about and the relations between them. Captures: Physical Entities, Events, States, Temporal Relations, Rhetorical Relations |
|
Referring expression ("this book") Co-reference |
Refers to some entity which is called the referent
Two referring expressions that refer to the same entity |
|
Indefinite NPs |
Indefinites usually introduce a new discourse entity. They can refer to a specific entity or not. Forms: no determiner; the indefinite determiner ("a"); numerals; indefinite quantifiers ("some"); indefinite "this". |
|
Definite NPs |
Definite NPs refer to an identifiable entity (previously mentioned or not) |
|
Information Status |
Hearer-new vs. hearer-old: whether the entity is assumed (un)known to the listener. Discourse-new vs. discourse-old: whether the speaker introduces a new entity into the discourse. |
|
Coref as binary classifier |
Represent each NP-NP pair (+ context) as a feature vector. Pass through the text and for each NP go through the previous sentences to find the best antecedent NP |
|
Features for Coref Resolution |
Do the two NPs have the same head noun? Do they contain the same modifier? Do the gender and number match? Is one NP a hypernym/synonym of the other? Are both NPs named entities of the same type? |
|
Evaluation of Coref resolution: |
B-Cubed F-Score. For each mention: precision P = |C ∩ T| / |C| and recall R = |C ∩ T| / |T|, where C is the computed cluster and T the true cluster containing the mention. F-measure: 2PR/(P+R). |
|
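A sketch of the per-mention B-cubed computation, assuming both clusterings cover the same mentions:

```python
def b_cubed(system, gold):
    """B-cubed precision/recall/F for coreference.
    system, gold: lists of sets clustering the same mentions."""
    mentions = {m for cluster in gold for m in cluster}

    def cluster_of(m, clustering):
        return next(c for c in clustering if m in c)

    p = r = 0.0
    for m in mentions:
        c, t = cluster_of(m, system), cluster_of(m, gold)
        overlap = len(c & t)
        p += overlap / len(c)   # precision contribution of this mention
        r += overlap / len(t)   # recall contribution of this mention
    n = len(mentions)
    p, r = p / n, r / n
    return p, r, 2 * p * r / (p + r)
```

For example, if the system splits a true cluster {1, 2, 3} into {1, 2} and {3}, precision stays 1.0 but recall drops to 5/9.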
Natural Language Generation |
Automatic production of natural language text usually from underlying semantic representation |
|
NLG Architectures |
1) Text Planning: content determination and discourse planning 2) Sentence Planning: sentence aggregation, lexicalization and referring expression generation 3) Linguistic Realization: syntactic, morphological and orthographic processing |
|
NLG Tasks - Text Planning |
1. Content Determination: What information should be communicated. 2. Discourse Planning: How should the message be structured |
|
NLG Tasks - Sentence Planning |
3. Sentence Aggregation: Which messages should be combined into individual sentences?
4. Lexicalization: In which words should domain concepts be expressed? 5. Referring expression generation: How should entities be referred to? |
|
NLG Tasks - Linguistic Realization |
Linguistic Realization: generate grammatically and orthographically well-formed text. |
|
Summarization |
"The process of distilling the most important information from a text to produce an abridged version for a particular task and user" |
|
Centroid-based content selection |
Which sentences are most central in a document? The central sentences are those most similar to the other sentences in the document. |
|
RST-Based Summarization |
Use a discourse parser to identify rhetorical relations between sentences/clauses in a document. This gives a discourse tree that defines a salience ranking. |
|
Dialogue Manager |
Controls the architecture and structure of the dialogue. 1. Finite-State 2. Frame-Based 3. Information State 4. AI Planning |
|
Grounding |
Dialogue is a collective act performed by speaker and hearer. Common ground: the set of things mutually believed by both speaker and hearer. |
|
NLP Pipeline |
1. Tokenizer/Segmenter: to identify words and sentences. 2. Morphological Analyzer/POS tagger: to identify the part of speech and structure of words. 3. Word Sense Disambiguation: To find the meaning of the words. 4. Syntactic/Semantic Parser: To obtain the structure and meaning of sentences. 5. Coreference resolution/discourse model: to keep track of the various entities and events mentioned. |
|
Core Problems of NLP |
Ambiguity: Natural Language is highly ambiguous Coverage (Zipf's Law): Any NLP system will come across words that did not occur during training. |
|
Statistical Models for NLP |
Probabilistic models (HMMs, MEMMs, CRFs, PCFGs) |
|
Deep Learning |
Neural networks with several hidden layers. Very impressive performance gains in computer vision and speech recognition. Fast computers have finally made them viable. |
|
Challenges in Neural Networks for NLP |
The input is discrete, while neural networks work best with continuous vectors. Addressed with word embeddings and neural language models. |
|
NLG Architecture |
Goal → Text Planner → Sentence Planner → Linguistic Realizer → Surface Text |