Network Pattern Recognition in Large Humanities Corpora
A new grant to the Electronic Cultural Atlas Initiative will support the application of techniques developed for analyzing very large scientific datasets to newly available very large textual datasets in the humanities. In collaboration with Tina Eliassi-Rad (Rutgers) and Christos Faloutsos (Carnegie Mellon & Google), we will draw on recent developments in network analysis that address complex problems, including visual query systems, topic discovery, anomaly detection, and rapid mining of complex time-stamped data. Our aim is to extend these approaches to noisy humanities data, using three corpora: Buddhist canonical texts (Chinese and Sanskrit), Irish studies journals (English and Gaelic), and Danish folklore (English and Danish). We propose to begin by tuning GRAPHITE, a visual query system for large graphs.