New Testament Word Clouds

By: Jeff Clark    Date: Sun, 12 Jul 2009

The word cloud below was created from the text of the four gospels of the New Testament of the Christian Bible. I used the King James Version from the wonderful Project Gutenberg. The primary words of emphasis are not surprising: 'jesus', 'son', 'father', 'lord', and 'god'.

Lately I have been exploring the idea of using clouds built from relative word frequency counts to emphasize the differences between a text and some baseline text. I'm leaning toward calling these accentuated word clouds.

I have created a separate accentuated word cloud for each of the four gospels and show them below. The baseline text was all four gospels together, so each cloud shows which words are used frequently in that gospel and proportionally more often there than in the overall collection. This kind of cloud illustrates the unique aspects of that particular text.
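For readers who want to experiment, here is a rough sketch of one way such weights could be computed. It is not necessarily the method used for the clouds here: the function names are made up for illustration, and scoring a word by its relative frequency in the text minus its relative frequency in the baseline is just one plausible choice (a ratio would be another).

```python
import re
from collections import Counter


def relative_frequencies(text):
    """Map each word to its share of all words in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    if not words:
        return {}
    counts = Counter(words)
    total = len(words)
    return {word: count / total for word, count in counts.items()}


def accentuated_weights(text, baseline):
    """Score words by how much more often they occur in `text` than in `baseline`.

    Assumption: the score is a simple difference of relative frequencies,
    and only words that are over-represented in `text` are kept.
    """
    text_freq = relative_frequencies(text)
    base_freq = relative_frequencies(baseline)
    weights = {}
    for word, freq in text_freq.items():
        excess = freq - base_freq.get(word, 0.0)
        if excess > 0:
            weights[word] = excess
    return weights
```

A word that is common everywhere, such as 'the', gets little or no weight with this scoring, while a word concentrated in one text stands out. The resulting weights could then be handed to whatever layout tool draws the cloud.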

Let's look at a word that is very prominent in one of the clouds. In the gospel of John, the word 'jews' seems central but it either doesn't appear or is very small in the other three. The number of times it appears in the four gospels is 5, 6, 5, and 67 for Matthew, Mark, Luke, and John respectively. If you calculate the number of occurrences per 1000 lines to account for the different sizes of the various texts then you get 1.4, 2.6, 1.3, and 23.2 times/1000 lines.
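The per-1000-lines normalization can be sketched in a few lines. This is only an illustration under some assumptions: the function name is hypothetical, a "line" is taken to be a line of the plain text file (blank lines included), and matching is whole-word and case-insensitive. The exact counting rules behind the figures above may differ.

```python
import re


def occurrences_per_1000_lines(text, word):
    """Return how many times `word` occurs per 1000 lines of `text`."""
    lines = text.splitlines()
    if not lines:
        return 0.0
    # Whole-word, case-insensitive match (an assumption about how to count).
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    hits = sum(len(pattern.findall(line)) for line in lines)
    return 1000.0 * hits / len(lines)
```

Dividing by the line count is what makes the numbers comparable across texts of different lengths; dividing by the total word count would serve the same purpose.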

These accentuated word clouds appear to be doing a good job of highlighting the terms that are characteristic of the various gospels. It is certainly possible to design a visualization that more directly shows the relative frequency of the key words in different texts, but the visual simplicity of these accentuated word clouds has some advantages.


