RezoViz visualizes the relationships between people, locations and organizations in a collection of documents.
- Getting Started
- Interface Elements
When you first arrive at the RezoViz tool you will see one of two possible screens:
RezoViz with a pre-loaded corpus. You were probably given a URL that included the corpus, or you’re viewing a page that has an embedded Voyeur tool in it. If you prefer, you can also start without a corpus.
RezoViz includes the standard set of interface elements (see image below). For more help with these see the Voyeur Tools Standard Interface Elements page.
RezoViz creates a link between every pair of people, locations and organizations that occur in the same document. Imagine three documents that include the following terms:
|Document 1|Document 2|Document 3|
|Paris|Sydney|Tokyo|
|Montreal|Montreal|Montreal|
|Mexico|Boston|Mexico|
That would produce the following pairs:
|Document 1|Document 2|Document 3|
|Paris - Montreal|Sydney - Montreal|Tokyo - Montreal|
|Paris - Mexico|Sydney - Boston|Tokyo - Mexico|
|Montreal - Mexico|Montreal - Boston|Montreal - Mexico|
Shown as a network graph this might look something like the following:
This shows that Montreal is central, since it appears in all three documents. The placement of the other cities tries to make the best use of the available space by putting closely linked items near each other and keeping the number of crossing lines as low as possible. As the data become more complex this gets harder, because these priorities compete with each other. This type of graph is called a force-directed network graph.
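The pairing rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not RezoViz's own code: the function names are hypothetical, and counting how many documents share a pair is an assumption about how link weight might be measured.

```python
from collections import Counter
from itertools import combinations

# Terms per document, taken from the three-document example above.
documents = {
    "Document 1": ["Paris", "Montreal", "Mexico"],
    "Document 2": ["Sydney", "Montreal", "Boston"],
    "Document 3": ["Tokyo", "Montreal", "Mexico"],
}

def cooccurrence_links(docs):
    """Count how many documents each unordered pair of terms shares."""
    links = Counter()
    for terms in docs.values():
        # sorted(set(...)) drops repeats and gives every pair a stable order,
        # so ("Mexico", "Montreal") is always the same key.
        for pair in combinations(sorted(set(terms)), 2):
            links[pair] += 1
    return links

def neighbours(links):
    """Map each term to the set of terms it is linked to."""
    result = {}
    for a, b in links:
        result.setdefault(a, set()).add(b)
        result.setdefault(b, set()).add(a)
    return result

links = cooccurrence_links(documents)
linked = neighbours(links)

# The term linked to the most other terms is the most "central" one here.
most_connected = max(linked, key=lambda term: len(linked[term]))
print(most_connected, sorted(linked[most_connected]))
```

In this sketch Montreal ends up linked to every other city, which is why it sits in the middle of the graph.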
Because RezoViz creates links between pairs of terms in a document, it works best with several documents (all terms in a single document will be linked). Depending on the genre of the documents, it may work better with shorter documents (like news articles, where there tends to be a high density of people, locations and organizations) or longer documents (like novels where there tend to be fewer unique people, locations and organizations).
How does Voyant know which terms are people, locations and organizations? It performs an automated process called named entity recognition, currently using the Stanford Natural Language Processing library. Most named entity recognizers combine heuristics (capital letters indicating proper nouns, for instance) with a training set in which words have already been tagged (New York and John Smith are both two-word capitalized sequences, but a training set can tell them apart). It’s important to recognize that named entity recognition is powerful and useful, but also subject to a number of problems, such as the following:
- many taggers are trained with news feeds and may work less well with other corpora
- often instances of names can vary across a document (Mary, Mary Smith, Ms. Smith, Dr. Smith, etc.)
- names can be ambiguous (Johns Hopkins the person or the university)
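To see why heuristics alone are not enough, here is a deliberately naive sketch that treats any run of capitalized words as a candidate entity. It is not how the Stanford library works; the pattern and the function name are assumptions made purely for illustration.

```python
import re

# One or more capitalized words in a row, e.g. "New York" or "John Smith".
CAPITALIZED_RUN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")

def candidate_entities(text):
    """Return every run of capitalized words found in the text."""
    return CAPITALIZED_RUN.findall(text)

sample = "John Smith flew to New York on Monday."
candidates = candidate_entities(sample)
print(candidates)
```

The heuristic finds "John Smith" and "New York" but cannot say which is a person and which is a location, and it also picks up "Monday", which is neither. A tagged training set is what lets a real recognizer make those distinctions.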
The links that are produced can be viewed and edited in the collapsible “Editor” window on the right side of the tool. Edits can be typed directly into the panel. To update the visualization with your changes, select “update” at the bottom of the window. To discard your changes to the list of links, select “reset” at the bottom of the tool.
The bottom toolbar provides a number of options to change the parameters of the force-directed graph. These include the number of nodes to be depicted, the category of nodes to be depicted, the mode with which to filter nodes, and a number of force parameters that determine how the nodes will orient themselves.
- The slider sets the number of nodes to be depicted. Showing more nodes gives a more inclusive but less focused representation of the corpus.
- The menu to select categories allows the user to select between representing people, locations, and organizations.
- The menu to select the mode allows the user to choose between four modes: “Top frequency links”, “Top links with all connections”, “Top frequencies items” and “Top frequencies items with all connections”.
- The “Tension” parameter affects how closely a link will bind two nodes together. The higher the value the more closely they will be bound.
- The “Repulsion” parameter affects how strongly all nodes push each other away. The higher the value, the stronger the repulsive force.
- The “Friction” parameter affects how long nodes continue to move around while finding a position that balances the tensile and repulsive forces. With low friction the nodes can keep moving indefinitely; with high friction they move for only a very short time.
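The interplay of the three force parameters can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the code RezoViz runs: the `simulate` function, the inverse-square repulsion, the spring `rest_length` and the default values are all hypothetical choices made for the example.

```python
import math

def simulate(nodes, links, tension=0.1, repulsion=100.0, friction=0.5,
             rest_length=50.0, steps=200):
    """Run a simple force-directed layout and return the final positions.

    nodes: dict mapping a node name to its (x, y) start position
    links: list of (name, name) pairs
    """
    pos = {name: list(p) for name, p in nodes.items()}
    vel = {name: [0.0, 0.0] for name in nodes}
    names = list(nodes)
    for _ in range(steps):
        force = {name: [0.0, 0.0] for name in names}
        # Repulsion: every pair of nodes pushes apart, more strongly when close.
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-6
                push = repulsion / (dist * dist)
                force[a][0] += push * dx / dist
                force[a][1] += push * dy / dist
                force[b][0] -= push * dx / dist
                force[b][1] -= push * dy / dist
        # Tension: each link behaves like a spring with the given rest length.
        for a, b in links:
            dx = pos[b][0] - pos[a][0]
            dy = pos[b][1] - pos[a][1]
            dist = math.hypot(dx, dy) or 1e-6
            pull = tension * (dist - rest_length)
            force[a][0] += pull * dx / dist
            force[a][1] += pull * dy / dist
            force[b][0] -= pull * dx / dist
            force[b][1] -= pull * dy / dist
        # Friction: damp the velocity each step so the nodes settle down.
        for name in names:
            for axis in (0, 1):
                vel[name][axis] = (vel[name][axis] + force[name][axis]) * (1.0 - friction)
                pos[name][axis] += vel[name][axis]
    return {name: tuple(p) for name, p in pos.items()}

def gap(layout, a, b):
    """Distance between two nodes in a finished layout."""
    return math.hypot(layout[a][0] - layout[b][0], layout[a][1] - layout[b][1])

start = {"Montreal": (0.0, 0.0), "Paris": (200.0, 0.0)}
link = [("Montreal", "Paris")]
loose = simulate(start, link, tension=0.05)
tight = simulate(start, link, tension=0.2)
print(gap(loose, "Montreal", "Paris"), gap(tight, "Montreal", "Paris"))
```

In this sketch the linked pair settles slightly closer together with the higher tension value, because the spring has to be stretched less to balance the same repulsion.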
Selecting words in RezoViz highlights them, which makes the visualization easier to interpret. There are two ways to do this. Hovering over a node highlights it in red, along with the nodes that are directly connected to it in the graph. Typing the name of a person, place or organization in the search box in the bottom toolbar highlights that keyword in blue.
Like all Voyeur tools, RezoViz can be reused in a variety of ways:
- create a link that is specific to the corpus and options that are currently being used
- embed the current corpus and options as a tool in an external page
For more information see exporting and reusing Voyeur Tools.