Alex's Notes

Lexical Semantics (Jurafsky & Martin)

As presented in Jurafsky & Martin, Chapter 6: Vector Semantics and Word Embeddings

How do we represent the meaning of a word? In classical NLP applications and N-gram models, we do so simply with a string of letters.

A model of word meaning needs to do more. It should tell us that some words have similar meanings, others are antonyms, some are positive, others negative. It should represent that the words buy and sell offer differing perspectives on the same event.

More generally a model of word meaning should allow us to draw inferences to address meaning-related tasks like QA or dialogue. This section of the book summarizes what we want such a model to do, drawing on the linguistic study of lexical semantics.

Lemmas and Senses

Take the word mouse. A dictionary might define it as a rodent, or as a device for controlling a computer. The form mouse is a lemma or citation form; it is also the lemma for the plural mice. Many languages use the infinitive of a verb as its lemma. In English we might use sing as the lemma for sung or sang. The latter are wordforms.

Each lemma can have multiple meanings; we call each of these aspects of the meaning a word sense. Lemmas are polysemous if they have multiple senses. Such lemmas can be hard to interpret, and the task of determining which sense is used in a particular context is the NLP problem of word sense disambiguation.
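One classic approach to word sense disambiguation is the simplified Lesk algorithm: choose the sense whose dictionary gloss shares the most words with the surrounding context. The sketch below illustrates the idea for mouse; the sense labels and glosses are invented for illustration, not drawn from a real dictionary.

```python
# Toy word sense disambiguation via simplified Lesk: pick the sense
# whose gloss has the largest word overlap with the context sentence.

SENSES = {  # hypothetical glosses for the lemma "mouse"
    "mouse_animal": "small rodent with a long tail that eats cheese",
    "mouse_device": "hand operated device that controls a cursor on a computer screen",
}

def disambiguate(context: str) -> str:
    context_words = set(context.lower().split())
    def overlap(sense: str) -> int:
        # number of words shared by the gloss and the context
        return len(context_words & set(SENSES[sense].split()))
    return max(SENSES, key=overlap)

print(disambiguate("move the mouse to click the screen icon"))  # mouse_device
print(disambiguate("the cat chased a mouse with a long tail"))  # mouse_animal
```

Real systems use richer context than raw word overlap, but the structure of the task — a lemma, an inventory of senses, and a decision driven by context — is the same.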

Synonymy

When one word has a sense whose meaning is identical, or nearly identical, to a sense of another word, we say that the two senses of the two words are synonyms.

A more formal definition of synonymy (between words rather than senses) is that two words are synonymous if they are substitutable for one another in any sentence without changing the truth conditions of the sentence, the situations in which the sentence would be true. We often say in this case that the two words have the same propositional meaning.

Substitutions may be truth preserving, yet the words are still not identical in meaning. A fundamental tenet of semantics, the principle of contrast, states that a difference in linguistic form is always associated with some difference in meaning. For example, H2O and water may be truth-conditionally interchangeable but are appropriate in different contexts. So in practice synonym is used to describe approximate synonymy.

Similarity and Relatedness

Words don’t have many synonyms, but most words do have many similar words. Cat and dog are not synonyms, but they are similar. In moving from synonymy to similarity, we also shift from talking about word senses to talking about words, which turns out to simplify things.

The notion of word similarity is very useful in larger semantic tasks. Knowing how similar two words are can help compute how similar in meaning two sentences are, which matters for QA and summarization. Human-created datasets like SimLex-999 encode human judgments of word pair similarity.
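One way word similarity feeds into sentence similarity is to score each word of one sentence by its best match in the other and average. A minimal sketch, in the spirit of SimLex-999 (which rates pairs on a 0–10 scale); the scores below are invented for illustration, not taken from the actual dataset.

```python
# Hypothetical SimLex-style word-pair similarity scores (0-10 scale).
SIM = {
    ("cat", "dog"): 7.0,
    ("cup", "mug"): 9.0,
    ("coffee", "cup"): 1.5,  # related but not very similar
}

def word_sim(w1: str, w2: str) -> float:
    if w1 == w2:
        return 10.0
    return SIM.get((w1, w2), SIM.get((w2, w1), 0.0))

def sentence_sim(s1: str, s2: str) -> float:
    """Average, over words of s1, of each word's best match in s2."""
    words1, words2 = s1.split(), s2.split()
    return sum(max(word_sim(w1, w2) for w2 in words2) for w1 in words1) / len(words1)

print(sentence_sim("the cat slept", "the dog slept"))  # 9.0
```

Modern systems derive the word scores from embeddings rather than a lookup table, but the aggregation idea is similar.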

Words can be related in ways other than similarity. One such class is called word relatedness, or association in psychology. Consider coffee and cup: these words aren’t similar, but they are related.

One kind of relatedness is belonging to the same semantic field. A semantic field is a set of words that cover a particular semantic domain and bear structured relations with each other. For example, words might be related by being in the field of hospitals (surgeon, scalpel, nurse, anesthetic, hospital). Semantic fields are related to topic models, like Latent Dirichlet Allocation (LDA), which applies unsupervised learning to large text corpora to induce sets of associated words. These are useful tools for discovering topical structure in documents.

Closely related is the idea of a semantic frame: a set of words that denote perspectives or participants in a particular type of event. Take a commercial transaction where Sam buys a book from Ling. The event can be encoded lexically by using verbs like buy (from the buyer’s perspective) or sell (from the seller’s), pay (the monetary aspect), or nouns like buyer. Frames have semantic roles, and words in a sentence can take on these roles.

Knowing that buy and sell have this relation makes it possible for a system to know that “Sam bought the book from Ling” can be paraphrased as “Ling sold the book to Sam”, and that Sam has the role of buyer. This is important for QA, machine translation, and other tasks.
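The buy/sell relation above can be made concrete by encoding the frame as a data structure: each verb maps its syntactic arguments onto the same underlying frame roles. The role names and mappings here are hand-written for illustration, not from an actual frame lexicon like FrameNet.

```python
# A commercial-transaction frame: buy and sell offer different
# perspectives on the same event, so they map their arguments onto
# the same roles (buyer, seller, goods).

VERB_ROLE_MAP = {
    "buy":  {"subject": "buyer",  "object": "goods", "oblique": "seller"},
    "sell": {"subject": "seller", "object": "goods", "oblique": "buyer"},
}

def to_frame(verb: str, subject: str, obj: str, oblique: str) -> dict:
    """Map a parsed clause onto frame roles."""
    m = VERB_ROLE_MAP[verb]
    return {m["subject"]: subject, m["object"]: obj, m["oblique"]: oblique}

# "Sam bought the book from Ling" and "Ling sold the book to Sam"
# yield the same frame instance, so a system can treat them as paraphrases:
a = to_frame("buy", "Sam", "the book", "Ling")
b = to_frame("sell", "Ling", "the book", "Sam")
assert a == b == {"buyer": "Sam", "seller": "Ling", "goods": "the book"}
```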

Connotation

Words have affective meanings or connotations. The word connotation here means the aspects of a word’s meaning that are tied to the reader’s emotions, sentiment, opinions, or evaluations. Some words have positive connotations (happy), others negative (sad). Words with similar meanings can still vary in connotation (e.g. naive vs. innocent).

Positive or negative evaluation language is called sentiment, and word sentiment plays an important role in sentiment analysis, stance detection, and applications of NLP to politics and consumer reviews.

Words vary along three important dimensions of affective meaning (Osgood et al., 1957) and can be scored accordingly:

  • valence: pleasantness of the stimulus (high: happy, low: annoyed)

  • arousal: intensity of the emotion provoked by the stimulus (high: excited, low: calm)

  • dominance: degree of control exerted by the stimulus (high: controlling, low: awed)

So we could represent a word’s value on each of the three dimensions:

              Valence  Arousal  Dominance
  courageous  8.05     5.5      7.38
  music       7.67     5.57     6.5
  heartbreak  2.45     5.65     3.58
  cub         6.71     3.95     4.24

Osgood et al. (in The Measurement of Meaning, University of Illinois Press, 1957) were the first to notice that representing the meaning of a word with these three numbers gives us a vector in three-dimensional space. This was the first expression of vector semantics.
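Treating each word's (valence, arousal, dominance) scores from the table above as a 3-dimensional vector lets us compare words geometrically; a minimal sketch using Euclidean distance:

```python
# Each word is a point in 3-dimensional affect space:
# (valence, arousal, dominance), using the scores from the table.
import math

VAD = {
    "courageous": (8.05, 5.5, 7.38),
    "music":      (7.67, 5.57, 6.5),
    "heartbreak": (2.45, 5.65, 3.58),
    "cub":        (6.71, 3.95, 4.24),
}

def distance(w1: str, w2: str) -> float:
    """Euclidean distance between two words in affect space."""
    return math.dist(VAD[w1], VAD[w2])

# Words with similar affective meaning sit close together:
assert distance("courageous", "music") < distance("courageous", "heartbreak")
```

The same geometric intuition, with hundreds of dimensions learned from corpora instead of three hand-scored ones, underlies the word embeddings developed in the rest of the chapter.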