
RESEARCH

Just for laughs: Utilizing Machine Learning to
Rate and Generate Humorous Analogies

LIMOR GULTCHIN, Harvard College '17

THURJ Volume 11 | Issue 1

Abstract

This thesis presents a procedure for generating humorous 4-word analogies of the form humans : water :: Texans : barbecue. Using a neural word embedding, we created a system that constructs comedic 4-tuples of words, based on an initial pool of funny analogies written and rated by Amazon Mechanical Turk (AMT) users. Our procedure involved 4 main steps:
1. Generating a collection of “funny words” by using a support vector machine (SVM) to classify words from the embedding that are similar to words used in the analogies written by AMT users.
2. Generating funny pairs of words taken from the collection of funny words, and classifying them to obtain more likely funny words. Negative examples were randomly generated pairs from the embedding, and positive examples were pairs from the AMT users’ analogies.
3. Generating matchings of generated pairs by another SVM classifier. Negative examples were random 4-tuples of words from the embedding; positive examples were complete humorous analogies obtained from AMT.

Our method was shown to perform significantly better than the following baselines: random 4-tuples of words from the embedding; random 4-tuples of words from the “funny pool” of words we classified; and random matching of funny pairs we generated.
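
As a rough illustration of the 4-tuple classification step described above, the following sketch trains a linear SVM on a synthetic embedding and then ranks random candidate tuples by classifier score. It is not the thesis implementation. The toy embedding, the word names, the feature map (concatenating the four word vectors), and the hand-written subgradient trainer are all assumptions made so the example runs with only the Python standard library.

```python
import random

random.seed(0)
DIM = 4  # toy embedding dimension (real embeddings use hundreds)

def toy_vector(funny):
    # Hypothetical structure: "funny" words are shifted along the first axis.
    first = random.gauss(1.0 if funny else -1.0, 0.3)
    return [first] + [random.gauss(0.0, 0.3) for _ in range(DIM - 1)]

# Synthetic stand-in for a neural word embedding: word -> vector.
funny_words = [f"funny{i}" for i in range(20)]
plain_words = [f"plain{i}" for i in range(20)]
embedding = {w: toy_vector(True) for w in funny_words}
embedding.update({w: toy_vector(False) for w in plain_words})
vocab = list(embedding)

def features(tup):
    # Assumed feature map: concatenate the four word vectors, plus a bias term.
    vec = [x for word in tup for x in embedding[word]]
    return vec + [1.0]

def sample(n):
    # Positives stand in for human-written analogies;
    # negatives are random 4-tuples drawn from the whole vocabulary.
    data = []
    for _ in range(n):
        data.append((tuple(random.choices(funny_words, k=4)), 1))
        data.append((tuple(random.choices(vocab, k=4)), -1))
    return data

def train_linear_svm(data, epochs=50, lr=0.01, reg=0.001):
    # Stochastic subgradient descent on the L2-regularised hinge loss.
    w = [0.0] * (4 * DIM + 1)
    for _ in range(epochs):
        random.shuffle(data)
        for tup, y in data:
            x = features(tup)
            margin = y * sum(wi * xi for wi, xi in zip(w, x))
            for j in range(len(w)):
                grad = reg * w[j] - (y * x[j] if margin < 1 else 0.0)
                w[j] -= lr * grad
    return w

def score(w, tup):
    # Signed distance-like score; positive means "classified as funny".
    return sum(wi * xi for wi, xi in zip(w, features(tup)))

train, test = sample(200), sample(100)
weights = train_linear_svm(train)
accuracy = sum((score(weights, t) > 0) == (y > 0) for t, y in test) / len(test)

# "Generation": rank random candidate 4-tuples by classifier score.
candidates = [tuple(random.choices(vocab, k=4)) for _ in range(200)]
best = max(candidates, key=lambda t: score(weights, t))
print(f"held-out accuracy: {accuracy:.2f}")
print("top candidate:", " : ".join(best[:2]), "::", " : ".join(best[2:]))
```

With a real embedding, `embedding` would be replaced by pretrained word vectors and the positive examples by the human-written analogies. A library SVM would be the more usual choice than the hand-written trainer shown here.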

We assessed performance in terms of the mean score obtained per analogy in each baseline, and in terms of the maximum funniness score obtained in each category (7/10 fully computer-generated, 5/10 random match of pairs, 4/10 random “funny” words, and 3/10 random words). To further establish the usefulness of neural word embeddings for capturing humor and generating comedic structures, we introduced a “funniness” score prediction, which showed a positive correlation with actual ratings obtained from AMT users, and performed a Turing test in which 35% of computer-generated analogies were mistaken for human-made ones.
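
The evaluation quantities mentioned above (mean and maximum score per category, and the correlation between predicted and actual ratings) could be computed along the following lines. This is a minimal sketch: all numbers are placeholders chosen for illustration and are not the study's data, the category names are hypothetical, and Pearson correlation is an assumption, since the abstract does not name the correlation measure used.

```python
from statistics import mean

def pearson(xs, ys):
    # Pearson correlation coefficient between two equal-length sequences.
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Placeholder ratings on a 0-10 scale; NOT the study's data.
toy_ratings = {
    "category_a": [2, 5, 7, 3],
    "category_b": [1, 4, 2, 3],
}
# Per-category (mean score, maximum score).
summary = {name: (mean(r), max(r)) for name, r in toy_ratings.items()}

# Placeholder predicted vs. actual funniness scores.
toy_predicted = [1.0, 2.0, 3.0, 4.0]
toy_actual = [2.0, 1.0, 4.0, 3.0]
r = pearson(toy_predicted, toy_actual)

for name, (avg, top) in summary.items():
    print(f"{name}: mean={avg:.2f}, max={top}")
print(f"correlation: {r:.2f}")
```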
