CWRCshop: Using Voyant for Analyzing Texts

This is a script for a workshop on using Voyant for the CWRC community. Please note that URLs and resources may no longer be available.

1.0 Introduction

  • The workshop leaders will introduce themselves:
    • Geoffrey Rockwell, University of Alberta, geoffrey (dot) rockwell (at) ualberta (dot) ca, http://www.geoffreyrockwell.com
    • Susan Brown, University of Alberta, University of Guelph, sbrown (at) uoguelph (dot) ca
  • Overview
    Voyant is currently a beta release by Stéfan Sinclair and Geoffrey
    Rockwell. It was previously called “Voyeur”, so do not be confused if
    that name still appears (for example, in the URLs below). Voyant is the
    next generation in a series of text analysis tools that include HyperPo
    and TAPoRware. It provides tables and graphs related to word use across
    a single document or a collection. Voyant adds, among other things, the
    ability to handle much larger files than the previous tools could.
  • Outline
    In this workshop we will:

    • First, look at how to use a single Voyant tool, Cirrus, with Mary Shelley’s Frankenstein.
    • Then learn how to use the normal skin of Voyant with a single text.
    • Finally, show how to load your own text into Voyant.
  • Now make sure you can connect to the wireless.
  • Help
    If you need help, connect to Hermeneuti.ca and explore the resources there. Here are some useful links:

2.0 Using a Single Voyant Tool: Cirrus

Voyant Tools has a number of different tools that can be composed into skins or used individually. We will start with just one tool, called Cirrus, that can then spawn other tools. We will try it with Mary Shelley’s Frankenstein. Click on the link below to open it.

Cirrus (Frankenstein): http://dev.voyeurtools.org:8080/tool/Cirrus/?corpus=1317355585427.2492&stopList=stop.en.taporware.txt

For a backup, go to http://voyeur.hermeneuti.ca/tool/Cirrus/ and enter the text URL http://www.gutenberg.org/cache/epub/84/pg84.txt
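
If you are curious how the Cirrus link is put together, the sketch below takes the URL from this script apart with Python's standard library. It only splits the string; it does not contact the server, and what the server does with each parameter is not something this example can confirm. The function name describe_tool_url is made up for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# The Cirrus link given in this script (it may no longer resolve).
CIRRUS_URL = (
    "http://dev.voyeurtools.org:8080/tool/Cirrus/"
    "?corpus=1317355585427.2492&stopList=stop.en.taporware.txt"
)

def describe_tool_url(url):
    """Split a tool URL into host, path, and query parameters.

    Hypothetical helper for illustration; it is not part of Voyant.
    """
    parts = urlsplit(url)
    # parse_qs returns a list per key; keep the first value of each.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {"host": parts.netloc, "path": parts.path, "params": params}

info = describe_tool_url(CIRRUS_URL)
print(info["host"])
print(info["path"])
for name, value in info["params"].items():
    print(name, "=", value)
```

The path names the tool (Cirrus), while the query string carries a corpus identifier and the name of a stop list.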

The Cirrus tool shows you a word cloud of high frequency words. Some questions to ask yourself:

  • What words did you expect? What words are missing? What words are interesting?
  • How does the tool arrange words and choose colours? Is there any correspondence between size and frequency?
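
To think about these questions, it can help to see the kind of counting that sits behind a word cloud. The following is a minimal sketch, not Voyant's actual code: it assumes a simple lowercase tokenizer and a tiny, made-up stop list (real stop lists are far longer), and the sample sentence is invented for the example.

```python
import re
from collections import Counter

# Hypothetical, very short stop list for illustration only.
STOP_WORDS = {"the", "of", "and", "a", "to", "in", "i", "my", "was", "that"}

def top_words(text, n=5, stop_words=STOP_WORDS):
    """Return the n most frequent words in text, skipping stop words."""
    # Lowercase the text and keep runs of letters and apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(word for word in words if word not in stop_words)
    return counts.most_common(n)

# Invented sample sentence, not a quotation.
sample = "the monster fled, the monster wept, and the man saw the monster"
print(top_words(sample, n=3))
```

Notice that "the" is the most common word in the sample but never appears in the result. A stop list makes the same difference to what you see in a word cloud.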

Try It: Try clicking on a word. It will launch a second tab or window with a list of the texts in the corpus with the frequency of the word you clicked on.

Try It: Now try double-clicking on one of the texts. This should launch another tab or window with a Key Word In Context (KWIC) of the word in that text.
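
A Key Word In Context display lists each occurrence of a word with a few words on either side. The sketch below shows the idea under simple assumptions (whitespace-and-punctuation tokenizing, a fixed window counted in words). It is an illustration, not how Voyant builds its KWIC panel, and the sample sentence is invented.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each occurrence."""
    tokens = re.findall(r"\w+", text)
    rows = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            # Take up to `width` words on each side of the match.
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, token, right))
    return rows

# Invented sample sentence, not a quotation.
sample = "The monster fled and the man followed the monster north."
for left, word, right in kwic(sample, "monster", width=2):
    print(f"{left:>15} | {word} | {right}")
```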

3.0 Using a Reading Skin

Voyant Tools can also be composed into “skins” that combine tools as panels so that they can be used interactively. Here is the same corpus in a simple skin:

Frankenstein: http://dev.voyeurtools.org:8080/?corpus=1317355585427.2492&skin=simple&event=corpusTypeSelected

In this skin clicking in one window will often (but not always) update other windows. Try the following:

  • Triggering: Click on words in the Cirrus word cloud. Then click on a text in the Word Trends and play with the KWIC.
  • Changing Settings: Try changing the settings for the Cirrus by clicking on the small gear icon. Try playing with the Word Trends settings as well.
  • Showing and Hiding Panels: Try showing and hiding panels using the small up and down arrows in the upper-right of the panels.

When in doubt just restart the session by hitting refresh.

4.0 Using Voyant on Your Own Text

Voyant Tools can be used on your own text or corpus. To do that, go to the base URL for the tool:

Voyant: http://voyeurtools.org

Just the Cirrus tool in Voyant: http://voyeurtools.org/tool/Cirrus/

Backup older version: http://voyeur.hermeneuti.ca

You will get a panel that asks you for a text. You can provide:

  • One or more URLs to texts on the web
  • An uploaded text or a zipped collection of texts
  • Plain text, HTML, or XML files
  • A PDF (Voyant will try to extract the text)
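
If your corpus is a folder of plain text files, one way to prepare the zipped collection mentioned above is sketched here with Python's standard zipfile module. The function name zip_texts and the folder layout (one .txt file per document, all in one folder) are assumptions for the example, not requirements stated by Voyant.

```python
import zipfile
from pathlib import Path

def zip_texts(folder, archive_path):
    """Bundle every .txt file in folder into one zip archive.

    Hypothetical helper; returns the file names that were added.
    """
    folder = Path(folder)
    names = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # Sort so the documents are stored in a predictable order.
        for path in sorted(folder.glob("*.txt")):
            # Store just the file name, without the folder path.
            archive.write(path, arcname=path.name)
            names.append(path.name)
    return names
```

For example, zip_texts("my_corpus", "my_corpus.zip") would pack the .txt files in a folder called my_corpus (a placeholder name) into a single file that you can then upload.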

Voyant is forgiving, but there are nonetheless bugs.

5.0 Other Stuff

Here are some corpora and skins: