In the last week I've completed reports for two separate clients that included word clouds as a way of showing the results of text responses from surveys. One client was delighted. One client said it was - in so many words - useless. The wordles might as well have looked like these:
So who is right? Are they beautifully insightful? Or are they mere fluff that provides little insight into data? What are word clouds?
Nearly all research we conduct includes some sort of open-ended text question where respondents are asked to type an answer. This could be the name of the store they shop at most (brand mentions), what they like or dislike about a new product concept (likes/dislikes), or a lengthier description of their educational experience at their alma mater. Coding open-ended data has been, and continues to be, a very reliable way to understand themes quantitatively. However, it comes at a price.
Word clouds present a low-cost alternative for analyzing text from online surveys, and they are much faster than coding. Essentially, word cloud generators work by breaking the text down into its component words and counting how frequently each one appears. The font size of each word in the cloud is then set by that frequency: the more often a word appears in the text, the larger it is shown in the cloud. Instead of discussing the technicalities of how word clouds are created, here are some of my thoughts as to the pros and cons of using word clouds to represent your research data.
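For readers who do want a feel for the mechanics, the count-then-scale idea described above can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not how Wordle or any particular generator is implemented: the function names, the 10 to 48 point range, and the linear scaling are all hypothetical choices, and real tools also handle layout, which is skipped here.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Break text into lowercase words and count how often each appears."""
    words = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())
    return Counter(words)


def font_sizes(counts, min_pt=10, max_pt=48):
    """Map each word's count to a point size.

    Sizes scale linearly from min_pt (least frequent word) to
    max_pt (most frequent word). The point range is an assumption
    made for this sketch.
    """
    if not counts:
        return {}
    lowest, highest = min(counts.values()), max(counts.values())
    span = highest - lowest
    sizes = {}
    for word, n in counts.items():
        # When every word has the same count, all get the minimum size.
        scale = (n - lowest) / span if span else 0.0
        sizes[word] = round(min_pt + scale * (max_pt - min_pt))
    return sizes


# Made-up survey text, purely for illustration.
responses = "great price great service slow delivery great price"
sizes = font_sizes(word_frequencies(responses))

for word, pt in sorted(sizes.items(), key=lambda item: -item[1]):
    print(f"{word}: {pt}pt")
```

In this toy example "great" appears most often and so gets the largest size, while words mentioned once get the smallest.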
The Pros
- They reveal the essential. Brand names pop and key words float to the surface.
- They delight and provide emotional connection. Both creating a word cloud and observing one help to provide an overall sense of the text. The same visceral response doesn't happen when staring at a page of text.
- They're fast. Poring over text to develop themes from research takes time.
- They're engaging. Visual representation of data tends to have an impact and generates interest amongst the audience. For your client, it may stimulate more questions than it answers, but that's often a good entry point to discussion. In addition, Christy Ransom recently mentioned how word clouds are a great way to show themes for engaging your community panel members. Word clouds can allow you to share back results from research in a way that doesn't require an understanding of the technicalities.
The Cons
- Size isn't everything. Although a word cloud is designed to make words stand out by sizing them according to how often they occur, other factors can affect how the observer visually 'decodes' the data. For example, the length of a word and the white space around its glyphs (letters) can make it look more or less important relative to others in the cloud. This can mislead your interpretation.
Consider this example where "titillate" and "erroneous" both have nine letters and the same weight in the word cloud. They appear to be very different sizes because of the shape of the glyphs.
- Colour me silly. In a recent blog post, Noah Marconi noted how colour should communicate, not confuse. Most word cloud generators randomly assign colours to words from a palette. If the option were available, these generators could use colour to improve understanding of the text in at least three ways:
  - Use similar colours for words that tend to be near each other in a sample of text
  - Highlight words that appear in one sample of text but not in another
  - Show two sample texts in the same wordle, each with its own colour
- Fonts shouldn't decorate. Although they're fun, decorative fonts often sacrifice communication. Avoid fonts that over-complicate the data and always aim for legibility.
- Counting is not comparing. Showing how often a word appears is fairly basic. Many cloud creators have ways of removing common joiners like "the", "and", "it", and some let you customize this list further to remove noise. However, any sort of comparison to normative text can heighten the understanding further. Consider the example of asking survey respondents for their likes and dislikes of product concepts. You could compare the list of likes for one concept to the likes for another and then create a cloud of the words unique to each.
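The "counting is not comparing" idea can be sketched as follows: strip the common joiners, count what remains, and keep only the words that show up for one concept but not the other. This is a minimal sketch under my own assumptions. The stop-word list, the function names, and the sample responses are all made up for illustration and do not come from any particular word cloud tool.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; real tools ship much longer ones.
STOP_WORDS = {"the", "and", "it", "a", "is", "of", "to", "i"}


def count_words(responses, stop_words=STOP_WORDS):
    """Count words across a list of responses, skipping stop words."""
    counts = Counter()
    for response in responses:
        for word in re.findall(r"[a-z]+(?:'[a-z]+)?", response.lower()):
            if word not in stop_words:
                counts[word] += 1
    return counts


def unique_words(target, other):
    """Keep only the words counted in target that never occur in other."""
    return Counter({w: n for w, n in target.items() if w not in other})


# Hypothetical "likes" for two product concepts.
likes_a = ["I like the bright colour", "The colour and the price"]
likes_b = ["The price is fair", "It is easy to carry"]

unique_to_a = unique_words(count_words(likes_a), count_words(likes_b))
print(unique_to_a.most_common())
```

Here "price" drops out because both concepts mention it, leaving only the words that distinguish the first concept. Those counts could then be fed to whichever cloud generator you prefer.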
For something that has famously been called the mullet of the internet, and even harmful, it's important to understand what word clouds do and don't tell you. Certainly they are engaging, fun and offer some insight into textual data. My view is that neither of my clients was particularly right or wrong in what they saw in the word clouds. As with all research data, skilled interpretation is what provides the beautiful insight. What would you like a text analysis tool to do for you? Beyond counting, what do you want it to unveil in your text data? Share your examples, successes, and challenges with us.
For a more in-depth look at Wordle, I recommend the discussion by its creator, Jonathan Feinberg, in this book: Feinberg, J. (2010). Wordle. In J. Steele & N. Iliinsky (Eds.), Beautiful Visualization: Looking at Data Through the Eyes of Experts (pp. 37-58). Sebastopol: O'Reilly.