Quanteda is an R package for managing and analyzing text (https://quanteda.io/). One benefit of using quanteda is its plotting tools. For example, you can easily plot a keyness graph, which shows the words most characteristic of a target document compared with the rest of a set of documents, i.e. a corpus.

Here, we will use quanteda’s built-in inaugural address corpus (data_corpus_inaugural) to demonstrate how to plot a keyness graph.

Step 1: load the quanteda package.
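A minimal sketch of the loading step. It assumes quanteda version 3 or later, where the textstat_* functions live in the companion package quanteda.textstats and the textplot_* functions in quanteda.textplots; on older versions, loading quanteda alone should be enough.

```r
# Core package: corpus, tokens, and dfm handling
library(quanteda)
# Companion packages (assumption: quanteda >= 3.0 is installed)
library(quanteda.textstats)  # provides textstat_keyness()
library(quanteda.textplots)  # provides textplot_keyness()
```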


Step 2: create a dfm (document-feature matrix) by tokenizing the corpus and removing stopwords.

my_tokens <- tokens(data_corpus_inaugural, remove_punct = TRUE, remove_symbols = TRUE)
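# dfm_remove() drops the stopwords; the remove argument of dfm() is no longer supported in recent quanteda versions
my_dfm <- dfm_remove(dfm(my_tokens), stopwords("en"))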

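Step 3: use that dfm to create a keyness object. By default, textstat_keyness treats the first document as the target and all other documents as the reference.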

my_keyness <- textstat_keyness(my_dfm)
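The default target is only the first address, so it can help to pick the target explicitly. The sketch below is an illustration rather than part of the original walkthrough: the name my_keyness_recent and the cutoff year 2000 are arbitrary choices, and it assumes the Year document variable that ships with data_corpus_inaugural.

```r
library(quanteda)
library(quanteda.textstats)  # assumption: quanteda >= 3.0

# Rebuild the dfm so this example runs on its own
my_tokens <- tokens(data_corpus_inaugural, remove_punct = TRUE, remove_symbols = TRUE)
my_dfm <- dfm_remove(dfm(my_tokens), stopwords("en"))

# Target: addresses from 2000 onward; reference: all earlier addresses
# (my_keyness_recent is a hypothetical name used only for this example)
my_keyness_recent <- textstat_keyness(my_dfm, target = docvars(my_dfm, "Year") >= 2000)
head(my_keyness_recent)
```

The target argument also accepts a document name or index, so a single address can be compared against all the others in the same way.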

Step 4: finally, we plot the keyness object.
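A sketch of the plotting step, assuming textplot_keyness from quanteda.textplots (quanteda 3 or later). The earlier steps are repeated so that the example runs on its own.

```r
library(quanteda)
library(quanteda.textstats)
library(quanteda.textplots)

# Steps 2 and 3: dfm without stopwords, then keyness with the first document as target
my_tokens <- tokens(data_corpus_inaugural, remove_punct = TRUE, remove_symbols = TRUE)
my_dfm <- dfm_remove(dfm(my_tokens), stopwords("en"))
my_keyness <- textstat_keyness(my_dfm)

# Step 4: plot the strongest target and reference features
textplot_keyness(my_keyness)
```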


There are many more options for the plot. Please refer to the documentation of textplot_keyness for more details.
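As an illustration of those options, the sketch below limits the plot to ten features per side and changes the bar colors. The specific values and the name keyness_plot are arbitrary choices made for this example, and it assumes the n and color arguments of textplot_keyness in quanteda.textplots.

```r
library(quanteda)
library(quanteda.textstats)
library(quanteda.textplots)

# Rebuild the keyness object so this example runs on its own
my_tokens <- tokens(data_corpus_inaugural, remove_punct = TRUE, remove_symbols = TRUE)
my_dfm <- dfm_remove(dfm(my_tokens), stopwords("en"))
my_keyness <- textstat_keyness(my_dfm)

# n: number of features shown per side; color: target and reference bar colors
keyness_plot <- textplot_keyness(my_keyness, n = 10L, color = c("darkred", "gray60"))
keyness_plot
```

Because the result is a ggplot object, it can be adjusted further with the usual ggplot2 functions.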

