Article Title
Application of the Term Frequency-Inverse Document Frequency Weighting Scheme to the Pauline Corpus
Abstract
The term frequency-inverse document frequency (TF-IDF) weighting scheme is applied to the text of the thirteen epistles traditionally associated with the Apostle Paul. The analysis uses the morphologically tagged text of the Society of Biblical Literature's Greek New Testament. The TF-IDF scheme is used to construct the document-term matrix (DTM) for the corpus under consideration, so that each document is represented by a multi-dimensional document vector. A query document is then chosen, its vector representation is constructed, and the cosine similarity between the query document and each document in the corpus is calculated. The following pairs of documents are consistently found to have the highest similarity: (1) Romans and Galatians, (2) Ephesians and Colossians, and (3) First Timothy and Titus. It is shown that computational methods may be applied to the thirteen epistles and that the results accord with those obtained from theological or literary analysis.
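The pipeline described in the abstract (TF-IDF weights, a document-term matrix, then cosine similarity against a query document) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the particular TF-IDF variant (relative term frequency multiplied by log(N/df)), the choice to take the query from the corpus itself, and the toy token lists are all assumptions made for illustration; the toy documents are invented placeholders, not text from the epistles.

```python
import math
from collections import Counter


def tfidf_matrix(docs):
    """Build a document-term matrix of TF-IDF weights.

    docs maps a document name to its list of tokens (for example lemmas).
    Assumed weighting: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    vocab = sorted({t for tokens in docs.values() for t in tokens})
    # Document frequency: number of documents containing each term.
    df = Counter(t for tokens in docs.values() for t in set(tokens))
    idf = {t: math.log(n_docs / df[t]) for t in vocab}
    matrix = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        matrix[name] = [counts[t] / len(tokens) * idf[t] for t in vocab]
    return vocab, matrix


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_by_similarity(query, matrix):
    """Rank every other document by cosine similarity to the query document."""
    q = matrix[query]
    scores = {name: cosine(q, vec) for name, vec in matrix.items() if name != query}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy corpus: placeholder tokens, not real epistle text.
toy = {
    "doc_a": ["law", "faith", "works", "faith", "grace"],
    "doc_b": ["faith", "law", "grace", "spirit"],
    "doc_c": ["church", "body", "head", "grace"],
}
vocab, dtm = tfidf_matrix(toy)
ranking = rank_by_similarity("doc_a", dtm)
print(ranking)
```

In this toy corpus the term shared by all three documents receives an IDF of zero, so it contributes nothing to any similarity score; only terms that distinguish some documents from others drive the ranking.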
Van der Ventel, Brandon and Newman, Richard.
"Application of the Term Frequency-Inverse Document Frequency Weighting Scheme to the Pauline Corpus."
Andrews University Seminary Studies (AUSS)
Available at: https://digitalcommons.andrews.edu/auss/vol59/iss2/4