Many Eyes

Word Tree Guide

When to use a Word Tree

A word tree is a visual search tool for unstructured text, such as a book, article, speech or poem. It lets you pick a word or phrase and shows you all the different contexts in which it appears. The contexts are arranged in a tree-like branching structure to reveal recurrent themes and phrases.

The image below is a word tree made from Martin Luther King's famous "I have a dream" speech, using the search term "I." Font sizes show frequency of use, so you can see that among King's many uses of "I," the most frequent context is the phrase "I have a dream."

How the word tree works

Unlike most of our visualizations, the word tree starts with a blank slate instead of a full visualization of the data. It waits for you to choose a search term. Once you type a word or phrase, the computer finds every occurrence of that term, along with the phrases that appear after it. For instance, the phrase "if love" occurs three times in Romeo and Juliet.

You'll notice that in the words following "if love" there are many repeated phrases. For instance, "be" follows "if love" in all three cases. And "be blind" follows in two cases. To create a word tree, the computer merges all the matching phrases, as in this diagram:
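The merging step can be sketched in a few lines of Python. This is only an illustration, not the code behind the visualization: the function names, the `depth` cutoff, the lowercasing and the nested-dictionary layout are all assumptions made for the example, and the three sample lines are quoted as illustrations of the "if love" contexts described above.

```python
import re

def build_word_tree(text, phrase, depth=3):
    """Merge the words that follow each occurrence of `phrase` into one tree.

    Each node is a dict {"count": n, "children": {next_word: node}}, where
    "count" is the number of occurrences that pass through the node.
    `depth` is an assumed cutoff on how many following words are kept.
    """
    # Assumption: lowercase the text, then split it into words and
    # individual punctuation marks.
    tokens = re.findall(r"\w+|[^\w\s]", text.lower())
    target = phrase.lower().split()
    n = len(target)
    root = {"count": 0, "children": {}}
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] != target:
            continue
        root["count"] += 1
        node = root
        for word in tokens[i + n:i + n + depth]:
            # Occurrences that share a prefix reuse the same child node;
            # this reuse is the "merge".
            node = node["children"].setdefault(word, {"count": 0, "children": {}})
            node["count"] += 1
    return root

def show(node, indent=0):
    """Print the tree as an indented outline with counts."""
    for word, child in node["children"].items():
        print(" " * indent + f"{word} ({child['count']})")
        show(child, indent + 2)

# Illustrative sample lines containing "if love".
lines = (
    "If love be rough with you, be rough with love. "
    "If love be blind, love cannot hit the mark. "
    "Or, if love be blind, it best agrees with night."
)
tree = build_word_tree(lines, "if love")
show(tree)
```

In the resulting tree, "be" carries a count of 3 and its child "blind" a count of 2, matching the description above. A count like this is what a renderer could map to font size.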

You can manipulate the tree in several ways. Clicking on a word in the diagram will zoom into that particular branch. If you control-click on a word, the diagram will use that word as the new main search term. And if you wish to see the context occurring before rather than after a phrase, click the "End" radio button. As you navigate the word tree, you can use the "Back" and "Forward" buttons just as you would in a browser to quickly step through your history of views.
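To make the difference between the two directions concrete, here is a small Python sketch that collects the context after a phrase or, like the "End" option, the context before it. The function name `contexts`, its `before` flag and the `width` cutoff are hypothetical names chosen for this example, and the tokenizing rule is an assumption.

```python
import re

def contexts(text, phrase, before=False, width=3):
    """List the context of every occurrence of `phrase`.

    With before=False, return the words that follow the phrase.
    With before=True, return the words that precede it, nearest word
    first, so that merging would proceed outward from the phrase.
    """
    tokens = re.findall(r"\w+|[^\w\s]", text.lower())
    target = phrase.lower().split()
    n = len(target)
    found = []
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] != target:
            continue
        if before:
            found.append(tuple(reversed(tokens[max(0, i - width):i])))
        else:
            found.append(tuple(tokens[i + n:i + n + width]))
    return found

poem = """Whose woods these are I think I know.
His house is in the village though;
He will not see me stopping here
To watch his woods fill up with snow.
My little horse must think it queer
To stop without a farmhouse near
Between the woods and frozen lake
The darkest evening of the year."""

after_woods = contexts(poem, "woods")
before_woods = contexts(poem, "woods", before=True)
print(after_woods)
print(before_woods)
```

Storing the preceding words nearest-first is a design choice for the sketch: it lets the same merging logic build a tree that grows leftward from the search term.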

Highlighting

If you want to point out particular words in the tree, select "Highlight Mode" from the menu at the upper right. You can then click to highlight a word, or control-click to highlight multiple words. Once you're done highlighting, choose the "Clicks Will Zoom" menu option so that you can once again zoom by clicking.

Punctuation Matters

Unlike many text visualization methods, such as tag clouds, the word tree does not ignore punctuation. In fact, it treats periods, commas and the like as separate words in the text. The reason is that within the flow of a text, punctuation can be critical to the meaning and rhythm of the phrases.
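As a rough illustration of the difference, the Python sketch below splits text so that each punctuation mark becomes its own token, and contrasts that with a words-only split of the kind a tag cloud might use. The regular expressions and function names are assumptions for this example, not the rules the word tree actually applies.

```python
import re

# A word (optionally with internal apostrophes, as in "can't"),
# or a single punctuation character.
TOKEN = re.compile(r"\w+(?:'\w+)*|[^\w\s]")
WORD = re.compile(r"\w+(?:'\w+)*")

def tokenize(text):
    """Split text into words and punctuation marks, each as a separate token."""
    return TOKEN.findall(text)

def words_only(text):
    """Split text into words and silently drop all punctuation."""
    return WORD.findall(text)

line = "His house is in the village though;"
print(tokenize(line))    # the trailing ";" is kept as its own token
print(words_only(line))  # the ";" is gone
```

With the first tokenizer, "though ;" and "though ." would form different branches in a tree, which is the behavior described above.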

Branch Order

By default, the tree branches are ordered from top to bottom by order of occurrence in the text. For instance, if the phrase "we saw" first occurs before "we conquered," then the "saw" branch will be above the "conquered" branch in the word tree for "we". You can choose two other modes: alphabetical, or ordered by overall branch size.
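The three orderings can be sketched as follows. The function `branch_order` and its mode names are hypothetical, the sample sentence is made up, branch size is approximated by how often each word directly follows the search term, and breaking size ties by order of occurrence is an assumption.

```python
def branch_order(tokens, word, mode="occurrence"):
    """Return the words that directly follow `word`, in the requested order."""
    counts = {}  # dicts keep insertion order, i.e. order of first occurrence
    for current, following in zip(tokens, tokens[1:]):
        if current == word:
            counts[following] = counts.get(following, 0) + 1
    branches = list(counts)
    if mode == "occurrence":
        return branches
    if mode == "alphabetical":
        return sorted(branches)
    if mode == "size":
        # sorted() is stable, so equally sized branches stay in order
        # of first occurrence.
        return sorted(branches, key=lambda b: -counts[b])
    raise ValueError(f"unknown mode: {mode}")

# Made-up sample text, already split into tokens.
tokens = "we saw . we conquered . we conquered . we ate .".split()

by_occurrence = branch_order(tokens, "we")
by_alphabet = branch_order(tokens, "we", "alphabetical")
by_size = branch_order(tokens, "we", "size")
print(by_occurrence, by_alphabet, by_size)
```

Here "saw" comes first in occurrence order because "we saw" appears before "we conquered", while ordering by size moves "conquered" to the top because it follows "we" twice.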

Data requirements

The word tree accepts free (unstructured) text data. It can handle documents with up to about a million words.

Here is a sample free text dataset:

Whose woods these are I think I know.
His house is in the village though;
He will not see me stopping here
To watch his woods fill up with snow.
My little horse must think it queer
To stop without a farmhouse near
Between the woods and frozen lake
The darkest evening of the year.

Expert Notes

A word tree is a visual version of a traditional concordance. (For computer scientists: it is a visual version of a suffix tree.) This is one of our more experimental techniques, and we're interested in feedback! The display method is relatively straightforward, but there are some subtleties. One is that not everything in the tree is displayed on the screen. We "prune" the tree for legibility and rapid interaction by not drawing every possible branch: if you type the search term "the" or "a", for example, the full tree may contain hundreds or thousands of items that would be too tiny to read.
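The exact pruning rule is not spelled out here, so the following Python sketch shows just one plausible approach: hide any branch that occurs fewer than `min_count` times and remember how many occurrences were hidden. The threshold, the `hidden` field, the node layout and the sample counts are all assumptions for illustration.

```python
def prune(node, min_count=2):
    """Return a pruned copy of `node`; the original tree is left untouched.

    A node is a dict {"count": n, "children": {word: node}}. Children seen
    fewer than `min_count` times are dropped, and the number of occurrences
    they represented is recorded under "hidden".
    """
    kept = {}
    hidden = 0
    for word, child in node["children"].items():
        if child["count"] >= min_count:
            kept[word] = prune(child, min_count)
        else:
            hidden += child["count"]
    return {"count": node["count"], "children": kept, "hidden": hidden}

# Hypothetical counts, not taken from any real text.
tree = {"count": 5, "children": {
    "dream": {"count": 3, "children": {
        "that": {"count": 2, "children": {}},
        "today": {"count": 1, "children": {}},
    }},
    "cat": {"count": 1, "children": {}},
    "hill": {"count": 1, "children": {}},
}}

pruned = prune(tree)
print(list(pruned["children"]), pruned["hidden"])
```

Keeping a tally of hidden occurrences would let a display hint that more branches exist without drawing them.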