Religious Text Processing

Advised by Brian Kernighan

May 31, 2008

As part of my AISES Broadening Participation in Computing summer project, my brother and I decided to examine religious texts. The project had several goals. The first was to build a dataset of as many holy books as we could reasonably digitize, and then to convert these digital books into forms that would allow easy comparison. This proved more difficult than originally anticipated, as many of those involved in creating these holy texts likely had no idea this sort of project would be their eventual fate (and thus discourteously did not spend a great deal of time making them computer-friendly).

After a considerable amount of difficult and tedious work getting different holy books into shape, we succeeded in converting three texts to a workable format: the Bible, the Koran, and the Book of Mormon.

Our first attempt to examine their inner workings was to see which words were shared among the texts. Using GraphViz, we created the illustration shown at right. This energy-minimizing layout shows the words that the three books have in common.
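The shared-word comparison can be sketched roughly as follows. This is a minimal illustration, not the code used for the project: the function names (`word_counts`, `shared_words`, `shared_word_dot`), the simple alphabetic tokenizer, and the `top_n` cutoff are all assumptions made for the example. It links each book to its most frequent words and writes the result as GraphViz DOT text. Words used by more than one book end up connected to several book nodes, so an energy-minimizing layout such as neato pulls them between those books.

```python
import re
from collections import Counter


def word_counts(text):
    """Lowercase the text and count runs of ASCII letters as words."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def shared_words(texts):
    """Return the set of words that appear in every text.

    texts: dict mapping a book title to its full text (hypothetical input).
    """
    vocabularies = [set(word_counts(text)) for text in texts.values()]
    return set.intersection(*vocabularies) if vocabularies else set()


def shared_word_dot(texts, top_n=50):
    """Build DOT source linking each book to its top_n most frequent words.

    Assumes titles contain no double quotes, since they are used
    directly as quoted DOT node names.
    """
    lines = ["graph holy_books {", "  layout=neato;", "  overlap=false;"]
    # Draw the books as boxes so they stand out from the word nodes.
    for title in texts:
        lines.append(f'  "{title}" [shape=box];')
    # One undirected edge per (book, frequent word) pair.
    for title, text in texts.items():
        for word, _count in word_counts(text).most_common(top_n):
            lines.append(f'  "{title}" -- "{word}";')
    lines.append("}")
    return "\n".join(lines)
```

Saving the returned string to a file such as `books.dot` and running `neato -Tpng books.dot -o books.png` would render the graph, assuming GraphViz is installed. In practice a stop-word filter would probably be needed so that words like "the" and "and" do not dominate the picture.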

Related files:
6

Holy Book Viz
GraphViz demonstration of word frequency relatedness between Koran, Bible, and Book of Mormon

Bible Arc Diagram
Demonstration of appearance and span of proper nouns found in the Bible

Mormon Arc Diagram
Demonstration of appearance and span of proper nouns found in the Book of Mormon

Bible Rib Diagram
Frequency of word use over the span of the Bible

Koran Rib Diagram
Frequency of word use over the span of the Koran

Mormon Rib Diagram
Frequency of word use over the span of the Book of Mormon