Advised by Brian Kernighan
May 31, 2008
As part of my AISES Broadening Participation in Computing summer project, my brother and I decided to examine religious texts. The project had several goals. The first was to build a dataset of as many holy books as we could reasonably digitize, and then to convert these digital books into forms that would allow easy comparison. This part of the project proved more difficult than anticipated, as many of those involved in creating these holy texts likely had no idea this sort of project would be their eventual fate (and thus discourteously did not spend a great deal of time making them computer-friendly).
After a considerable amount of difficult and tedious work getting different holy books into shape, we succeeded in converting three texts to a reasonable format: The Bible, The Koran, and The Book of Mormon.
Our first attempt to examine their inner workings was to see which words were conserved among the texts. Using GraphViz, we created the illustration shown at right. This energy-minimizing layout shows the words conserved among the three books.
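The write-up does not say how the word comparison was done, so the following is only a minimal sketch of one way such a figure could be produced. It assumes each book is available as a plain-text string, treats a word as "conserved" if it appears in at least two books, and asks for GraphViz's spring-model `neato` layout as a guess at what "energy-minimizing" refers to. The function names and the sample texts are hypothetical.

```python
import re
from collections import Counter


def word_counts(text):
    """Lowercase the text and count runs of ASCII letters as words."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def shared_word_dot(books):
    """Return GraphViz DOT source linking each book to the words it shares.

    `books` maps a title to its full text. A word is treated as shared
    when it occurs in at least two books (an assumption, not the
    project's documented rule).
    """
    counts = {title: word_counts(text) for title, text in books.items()}

    # Count how many books each word appears in.
    books_per_word = Counter()
    for book_counts in counts.values():
        books_per_word.update(book_counts.keys())
    shared = sorted(w for w, n in books_per_word.items() if n >= 2)

    lines = ["graph holy_books {", "  layout=neato;"]
    for title in counts:
        lines.append(f'  "{title}" [shape=box];')
    for word in shared:
        for title, book_counts in counts.items():
            if word in book_counts:
                lines.append(f'  "{title}" -- "{word}";')
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder snippets, not real excerpts from the three books.
    sample = {
        "Book A": "the lord said light",
        "Book B": "the lord spoke",
        "Book C": "light of the world",
    }
    print(shared_word_dot(sample))
```

Saving the printed output to a file such as `books.dot` and running `neato -Tpng books.dot -o books.png` would render it. On full-length texts, the shared vocabulary would probably need trimming (for example, to the most frequent words) to keep the figure readable.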
Holy Book Viz
GraphViz demonstration of word frequency relatedness between Koran, Bible, and Book of Mormon