How is the rapid development of global computer networks, digital media, and massive data archives changing the way we study history and culture? We now have access to unprecedentedly large and rich bodies of information generated from the digitization of older materials and the explosion of new content through social media. Machine learning and natural language processing make it possible to answer traditional research questions with greater rigor, and to tackle new kinds of projects that would once have been deemed impracticable. At the same time, scholars now have many more ways to communicate with one another and with the broader public, and it is becoming both easier and more necessary to collaborate across disciplines.
Students in this course will begin by learning about some of the core concepts and practices of traditional literary, cultural, and historical analysis, and then consider how these might be transformed. They will explore tools and techniques that include data curation, named-entity extraction, part-of-speech tagging, topic modeling, sentiment analysis, machine and crowd-sourced translation, social and citation network analysis, and text visualization. The course will take shape as an intensive workshop, where we will gain and share methodological expertise, and begin to think big about digital archives, information architectures, live data, and large-scale textual corpora.
Expect to form groups led by graduate and faculty researchers, to work collaboratively, and to actively shape the trajectory of the course. Case studies could include digital editions, collaborative annotation and editing practices, and the curation of archives and libraries. There will be methodological excursions into the fields of anthropology, economics, bioinformatics, and design. The course is open to students at all levels of technical skill and with a variety of research interests, but they should be open to applied training in the basics of database discovery, natural language processing, graph theory, image analysis, text visualization, and network analysis.