Word Association Spectrums

By: Jeff Clark    Date: Fri, 09 May 2008

Chris Harrison has a wonderful collection of visualizations, one of which I featured recently in More Color Name Graphics.

Chris recently posted a set of beautiful Word Association Spectrums based on an extremely large dataset from Google containing word bigram distributions. The example shown below is for the words 'war' and 'peace'. The horizontal position of each word indicates whether it more frequently follows 'war' or 'peace' in the analyzed text. So the word 'memorial' is positioned very close to the left edge (near the bottom) because the bigram 'war memorial' occurs much more often, after normalizing by overall counts, than 'peace memorial'. The vertical position is random.
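To make the positioning rule concrete, here is a minimal sketch of how such a layout could be computed. It is not Chris's actual code: the function name, the normalization (each count divided by that pole word's total), and the toy counts are all assumptions made for illustration.

```python
import random

def spectrum_positions(counts_a, counts_b, seed=0):
    """Return {word: (x, y)} in the unit square.

    counts_a / counts_b map a following word to its bigram count after
    pole word A / pole word B. x near 0 means the word mostly follows A,
    x near 1 means it mostly follows B. y is random, as in the post.
    Assumes both count tables are non-empty.
    """
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    rng = random.Random(seed)  # seeded so the layout is repeatable
    positions = {}
    # sorted() keeps the word order, and so the random y values, stable
    for word in sorted(set(counts_a) | set(counts_b)):
        # Normalize by overall counts so a more common pole word
        # does not dominate the comparison.
        freq_a = counts_a.get(word, 0) / total_a
        freq_b = counts_b.get(word, 0) / total_b
        if freq_a + freq_b == 0:
            continue  # word never follows either pole
        x = freq_b / (freq_a + freq_b)
        positions[word] = (x, rng.random())
    return positions

# Invented toy counts, not taken from the Google dataset.
war = {"memorial": 90, "and": 50, "treaty": 10}
peace = {"memorial": 5, "and": 50, "treaty": 45}
pos = spectrum_positions(war, peace)
for word, (x, y) in sorted(pos.items(), key=lambda item: item[1][0]):
    print(f"{word:10s} x={x:.2f} y={y:.2f}")
```

With these made-up numbers 'memorial' lands near the 'war' end, 'treaty' near the 'peace' end, and 'and' sits close to the middle, which is the pattern the diagram shows for common function words.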

My own Document Contrast Diagrams also stretch out words along a horizontal axis based on the strength of association between two poles. My diagrams try to express a lot more information as well, probably too much. Chris's Word Association Spectrums carry less information, and this simplicity allows for a much more elegant design. He has generated spectrums for other interesting word pairs like 'kids:adults', 'good:evil', and 'american:chinese'. I would like to see versions that leave out the common prepositions so that the nouns, verbs, and adjectives stand out more.

Word Association Spectrum for War and Peace (click to visit Chris Harrison's Post)

