Reading and Analyzing Text with Voyant

Objectives

By the end of this tutorial, you will be able to:

What You Need

About Voyant

Voyant Tools is an open-source, web-based text reading and analysis environment. When you upload or submit text to Voyant, it builds a corpus and opens it in a set of built-in tools that help you quickly “see through your data,” as the Voyant catchphrase states.

This tutorial is adapted from documentation found in the Voyant help database. See the references and resources section below for a link to the database.

What We Can Do With Voyant

Getting Started

  1. Using your web browser, navigate to https://voyant-tools.org/.
  2. Input your data using one of the available options (text or URLs). For this tutorial, copy/paste this URL into the text box: https://www.imsdb.com/scripts/Princess-Bride,-The.html.
  3. Select “reveal”.
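
The steps above can also be approximated with a link that opens Voyant with the source already filled in. The sketch below is a minimal illustration, assuming Voyant accepts the source through an `input` query parameter; check the Voyant help database before relying on it. The helper name `voyant_link` is hypothetical.

```python
from urllib.parse import urlencode

VOYANT_BASE = "https://voyant-tools.org/"

def voyant_link(source_url):
    """Build a Voyant link for one source URL.

    Assumption: Voyant reads the source from an `input` query parameter.
    """
    return VOYANT_BASE + "?" + urlencode({"input": source_url})

link = voyant_link("https://www.imsdb.com/scripts/Princess-Bride,-The.html")
print(link)
```

Opening the printed link in a browser should have the same effect as pasting the URL into the text box and selecting “reveal”, if the assumption about the parameter holds.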

Exploring Your Voyant Corpus

When Voyant generates a corpus for your data, it automatically displays it using the default skin. In this case, “skin” refers to the set of tools that is displayed. The skin is customizable based on your needs. For this tutorial, we are going to use the default skin. Let’s explore what we can do with these tools one by one.

Cirrus

The top left panel of your corpus shows a cirrus. This is essentially a word cloud that sizes a set number of terms from your data based on how frequently they appear, displaying the most frequently used words as the largest.

At first glance, you may notice some words appearing larger than you would expect from the script of the movie (if you are already familiar with it). For example, the word “cut” appears larger than any other word in the cirrus. This seems odd, since it is unlikely to be used that often in the actual dialog of the film. A quick look back at the original script shows why: every camera cut is labeled in the script. We don’t need that word in our analysis, so let’s remove it.

  1. Hover your mouse over the header of the cirrus block on your screen. You will notice a few buttons appear. Click the icon that looks like a small switch to “define options for this tool.” This will reveal the different options for filtering out function words and stopwords like “I”, “the”, “a”, etc.
  2. To the right of the “Stopwords” dropdown, click the “Edit List” button.
  3. Here you will see the full list of words that Voyant identified as stopwords, listed one word, number, or symbol per line. On the line directly underneath the last entry of this list, type the word “cut” and click “Save”. Click “Confirm”. “Cut” is now treated as a stopword and has been removed from our cirrus.
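
To make the effect of the stopword list concrete, here is a minimal Python sketch of frequency counting with and without “cut” in the list. It is an illustration only, not how Voyant is implemented; the function name `term_frequencies`, the tiny stopword set, and the sample text are all hypothetical.

```python
import re
from collections import Counter

def term_frequencies(text, stopwords=frozenset()):
    """Count lowercase word tokens, skipping any token in `stopwords`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in stopwords)

# Hypothetical sample text, not taken from the actual script.
sample = "CUT TO the castle. The prince waits. CUT TO the cliffs. The prince climbs."
stopwords = {"the", "to"}

before = term_frequencies(sample, stopwords)
after = term_frequencies(sample, stopwords | {"cut"})  # "cut" added to the list

print(before.most_common(3))
print(after.most_common(3))
```

A word cloud sizes each term by counts like these, so once “cut” is in the stopword list it no longer competes with the words from the dialog.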

Summary

Let’s move down to the Summary tool right underneath the cirrus. This tool is a brief summary of some of the key elements in our text like the total word count, average words per sentence, and a list and count of the most frequent words in the corpus.
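
The figures the Summary tool reports can be sketched in a few lines of Python. This is a rough approximation under simple assumptions (words are runs of letters, sentences end at `.`, `!`, or `?`); Voyant's own tokenization may differ. The function name `summarize` and the sample text are hypothetical.

```python
import re
from collections import Counter

def summarize(text, top_n=5):
    """Return basic corpus statistics for a single text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    return {
        "total_words": len(words),
        "unique_words": len(set(words)),
        "avg_words_per_sentence": len(words) / len(sentences) if sentences else 0.0,
        "most_frequent": Counter(words).most_common(top_n),
    }

# Hypothetical sample text, not taken from the actual script.
sample = "The prince rides. The prince waits! Does the giant carry the rock?"
stats = summarize(sample, top_n=2)
print(stats)
```

The `top_n` argument plays the same role as the “Items” slider in the Summary block: raising it lengthens the list of most frequent words.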

By default, the summary lists the top 5 most frequent words in your corpus. Let’s expand that list.

  1. Click and drag the slider labeled “Items” in the bottom left corner of the block to your desired number. The value changes in increments of 5.
  2. Move the “Items” slider up to 30. Give Voyant about 30 seconds to catch up, and you will see your list of most frequent words grow.
  3. In your list of most frequent words, take turns selecting the yellow highlighted words. Now take a look at what this selection does to the rest of your corpus. Selecting an individual word in the Summary tool triggers the rest of the tools in the corpus to adjust and show their analysis for that specific word. The most obvious change in this case is in the “Trends” block of your corpus.

Trends

The Trends tool shows you a line graph of the frequency of a selected word in your corpus.

[Screenshot: the Trends tool in its default view]

(This is the default view from the beginning of this tutorial. Yours will look a little different right now if you have been following along.)

  1. Select “westley” in the Summary tool. You should now see just one line on your Trends graph. If you’re familiar with the movie (spoiler alert), you will know that Westley is also the Man in Black, who is mentioned frequently in the first third of the movie. By using the search syntax to add a line for the Man in Black, we can use the Trends tool to show visually when Westley leaves the picture, becomes the Man in Black, and is eventually revealed to be Westley.
  2. Click within the search bar at the bottom of the Trends block and select the term “westley” from the list that appears. Even though we have the term selected from our Summary tool already, we need to select it again here to make sure we get the results we are looking for.
  3. In the search bar at the bottom of the Trends block, hover your mouse over the “?” symbol to reveal a cheat sheet of the search syntax.

    This little pop-up only lasts for a few seconds. Click the “?” symbol to open a static lightbox version of the cheat sheet.

  4. Based on this cheat sheet, if we want Voyant to search for “man in black” as a single term, all we need to do is surround those three words in quotation marks when we type them into the search bar. Now our line graph shows a line for each term, and we can see the point in the story when the Man in Black is revealed to actually be Westley! (What a twist!)

    After you press the Enter key, you might see a random word pop into your search bar. Just click the little “x” next to that word to remove it.

  5. In the bottom right corner of the Trends block, click on the “Display” dropdown. Here you can change the appearance of the graph.
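
The idea behind the Trends graph, counting a word or quoted phrase in successive slices of the text, can be sketched as follows. This is a simplified illustration rather than Voyant's implementation: it uses raw counts, and the number of segments is an arbitrary choice. The function name `trend` and the sample text are hypothetical.

```python
import re

def trend(text, term, segments=10):
    """Count occurrences of `term` (a word or multi-word phrase) per segment.

    The text is divided into `segments` equal parts by token position, and
    each occurrence is assigned to the part in which it starts.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    target = term.lower().split()
    counts = [0] * segments
    if not tokens or not target:
        return counts
    for i in range(len(tokens) - len(target) + 1):
        if tokens[i:i + len(target)] == target:
            counts[i * segments // len(tokens)] += 1
    return counts

# Hypothetical sample text, not taken from the actual script.
sample = (
    "the man in black climbs. the man in black fights. "
    "westley is the man in black. westley returns. westley wins."
)
man_in_black = trend(sample, "man in black", segments=4)
westley = trend(sample, "westley", segments=4)
print(man_in_black)
print(westley)
```

Plotting the two lists as lines on the same axes gives the kind of picture the Trends block draws: the phrase dominates the early segments and the name takes over toward the end.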

Reader

The Reader tool displays the text from your corpus with some useful features for quickly finding information on a word while showing it in the context of the rest of the text.

  1. Hover over a few words. The hover text displays the total number of times that word occurs.

  2. The search bar works the same way as the search bar in the Trends tool. In Reader, however, it highlights the searched word or term and displays a small, simple line graph to illustrate that word or term’s frequency throughout the text. Click within the search bar and select the word “humperdinck”. You will notice the line graph shift. You can also click on the line graph itself to jump to a part of the text that has a higher frequency of your searched word or term.

Contexts

The Contexts tool shows each occurrence of a keyword with a bit of surrounding text (the context). It can be useful for studying more closely how terms are used in different contexts.

  1. By default, Voyant selects the word with the highest frequency from your corpus in the Contexts block.

    You can select a new word for the Contexts tool in several ways:

    • Click the word in the Reader block
    • Click a data point from the line graph in the Trends block
    • Use the search bar at the bottom of the Contexts block
  2. However you want to get there, select the word “fezzik”. Take a look down the list of the occurrences of “fezzik”. Each row of the table shows one occurrence of the word, with one column showing the text to the left of the word and another column showing the text to the right of the word. Click the first row of the table and notice how the blocks for the Reader and Trends tools both change to help elaborate on the context of that occurrence of the word.
  3. To add additional terms to your Contexts tool, use the search bar by either clicking and selecting a word or by typing in a word or phrase using Voyant’s search syntax. You can find a cheat sheet for the syntax again by clicking the little “?” symbol on the right end of the search bar. For this example, add “inigo”, since he and Fezzik are together for most of the movie.

Adding multiple terms to your Contexts tool gives you a bird’s-eye view of any consistencies in the contexts of different terms!
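
The left/term/right table that the Contexts tool shows is commonly called a keyword-in-context (KWIC) listing. Below is a minimal Python sketch of the idea, assuming a fixed window of a few words on each side; it is not Voyant's implementation. The function name `kwic` and the sample text are hypothetical.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left, term, right) rows for each occurrence of `keyword`."""
    tokens = re.findall(r"[A-Za-z']+", text)
    rows = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, tok, right))
    return rows

# Hypothetical sample text, not taken from the actual script.
sample = "Fezzik lifts the rock. Inigo waits while Fezzik climbs the wall."
rows = kwic(sample, "fezzik")
for left, term, right in rows:
    print(f"{left:>20} | {term} | {right}")
```

Running `kwic` once per term and stacking the rows gives the multi-term comparison described above.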

Exporting and Embedding your Corpus

One of the nice features of Voyant is the ability to not only export your corpus but also embed it into an online collection, article, or webpage. This allows readers to interact with your data or research.

  1. Hover your mouse over the blue header at the top of your browser window to expose four icons. Click on the icon that is a rounded rectangle with an arrow sticking diagonally out of it.
  2. You are provided with a few different options here. The default is a URL that, when opened, will load your corpus in the exact state it was in when you clicked the “Export” button. If you select this option, Voyant will open that URL in a new tab and you can copy it from there. To find additional options, click the “Export View (Tools and Data)” dropdown.

Other Tips and Tricks

Resize Your Tool Blocks

If you need to expand or contract the block of any tool displayed in your corpus, hover your mouse over the border of the block until you see the resize icon, then click and drag the border to your desired size.

Changing Your Blocks to Other Tools

This tutorial covered only five of the tools available for your corpus. To change one of your blocks to a different tool, hover over the gray header area of the block you want to change and click the icon with four small rectangles. Check out Voyant’s help database (linked below) to explore all of the available tools!

Other Resources and References

This tutorial was adapted and written by Jane Thaler in 2020.