Tag Archives: text analysis

Teaching with Technology Innovators Series: As Easy As ABC: Digital Humanities in the Classroom

Topic: As Easy as ABC: Digital Humanities in the Classroom
Speakers: Bill (William) Gleason (Professor & Chair, Department of English), Andrea Immel (Curator, Cotsen Children’s Library), Ben Johnston (Manager, Humanities Resource Center, OIT), Clifford Wulfman (Coordinator, Library Digital Initiatives)

Time: Tuesday, April 29, 4:30pm – 6:00pm
Location: 330 Frist Campus Center, McGraw Center Conference Room

Refreshments will be provided! To register for this session: http://bit.ly/TT-ABC
(Registration is not required for attendance; however, refreshments may be limited.)

The collaborators behind the new Interactive Digital Archive of Rare ABC Books, featuring selections from the Cotsen Children’s Library, will discuss the vision, planning, and work of the project, which was supported with a course development grant from the Digital Humanities Initiative and has been integrated into ENG 385: Children’s Literature. They will also describe a special course component in which students receive training in the methods and materials of the digital humanities, including text encoding.

Bill Gleason is Professor and Chair of the Department of English. A specialist in American literature and culture, he works on material from the 18th century to the present, with particular emphasis on the late 19th and early 20th centuries; his research and teaching interests include popular culture, material culture, environmental studies, and the history of the book.

Andrea Immel, Curator of the Cotsen Children’s Library since 1995, organizes international conferences and gallery and virtual exhibitions, and acquires materials for the collection. She contributed chapters to volumes 5 and 6 of the Cambridge History of the Book in Britain, and co-edited Childhood and Children’s Books in Early Modern Europe and the Cambridge Companion to Children’s Literature.

Ben Johnston is manager of OIT’s Humanities Resource Center in East Pyne. Since 2005, Ben has worked with Princeton educators, students, and researchers across the Humanities and Social Sciences to facilitate the use of digital assets, technology tools, databases, and digital video in teaching and research. Ben is also an active member of the Princeton Digital Humanities Initiative.

Clifford Wulfman is coordinator of Library Digital Initiatives and Director of the Blue Mountain Project. In addition to many years’ experience with text encoding, Cliff has published numerous articles on topics in the digital humanities and is co-author, with Robert Scholes, of Modernism in the Magazines: An Introduction.


The Productive Scholar: Risk in Media Discourse: An Introduction to Topic Modeling with R and Python

Topic: Risk in Media Discourse: An Introduction to Topic Modeling with R and Python
Speaker: Manish Nag

Time: Thursday, April 10, 12:00pm – 1:00pm
Location: New Media Center (NMC), 130 Lewis Library, First Floor

Lunch will be provided. To register for this session: http://bit.ly/Risk-TM
(Registration is not required for attendance; however, refreshments may be limited.)

Amidst global concerns over financial markets, terrorism, and outbreaks of disease, the term “risk” pervades contemporary Western media discourse. Manish Nag’s dissertation examines the overall landscape of risk in contemporary news media discourse, using the full text of the New York Times from 1987 to 2006. What are the predominant threads of discourse related to risk, and how does this discourse grow and change over time? Manish’s presentation demonstrates how topic modeling can be used to help answer these questions.
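The announcement does not say which packages the talk relies on, but the basic move of topic modeling can be sketched in a few lines of Python with the gensim library (an illustrative choice, not necessarily Manish’s toolchain), using a handful of toy documents in place of the Times corpus:

    from gensim import corpora, models

    # Toy stand-ins for newspaper articles; a real corpus would be thousands of
    # documents, tokenized and cleaned much more carefully.
    docs = [
        "bank stress tests measure risk in financial markets",
        "officials warn of terrorism risk at the airport",
        "flu outbreak raises public health risk this winter",
        "investors weigh market risk after the crash",
    ]
    texts = [doc.lower().split() for doc in docs]

    dictionary = corpora.Dictionary(texts)                # map each word to an integer id
    bow = [dictionary.doc2bow(text) for text in texts]    # bag-of-words vectors

    # Fit an LDA topic model; each "topic" is a weighted list of co-occurring words.
    lda = models.LdaModel(bow, num_topics=2, id2word=dictionary, passes=20)
    for topic_id, words in lda.print_topics():
        print(topic_id, words)

Applied to twenty years of articles rather than four toy sentences, the same procedure surfaces the threads of risk discourse and how much of the corpus each occupies over time.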

Manish Nag is a Doctoral Candidate in Sociology. His research seeks to understand the global landscape of media discourse on risk, as well as change and resilience in global networks of people, goods, and ideas. His research utilizes mapping, data visualization, the analysis of text, and social network analysis. Manish received his BA in Computer Science from Brown University and worked as a software engineer, entrepreneur, and manager prior to his graduate work.

Presentation co-sponsored with the Digital Humanities Initiative at Princeton (DHI).

The Productive Scholar: Tools for Text Analysis in the Humanities

Text Analysis with NLTK Cheatsheet

Topic: Tools for Text Analysis in the Humanities
Speaker: Ben Johnston

Time: Thursday, April 3, 12:00 PM – 1:00 PM
Location: New Media Center, 130 Lewis Library, First Floor


A sequel to last semester’s ‘Tools for Text Analysis in the Humanities’, this session will give participants a brief yet hands-on introduction to NLTK, the Natural Language Toolkit. This library for the popular Python programming language is geared specifically toward computational work with written human language data. In this introduction, we will use tools from the library to tokenize a corpus into sentences, n-grams, and words, create word frequency lists, view concordances, and do part-of-speech tagging. In doing so, this session will also serve as a very gentle introduction to the Python programming language. Absolutely no experience with Python or with programming is expected or required.
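As a preview of those steps, here is a minimal sketch in Python with NLTK; the file corpus.txt and the keyword searched in the concordance are placeholders, not materials from the session:

    import nltk
    from nltk import FreqDist, pos_tag, sent_tokenize, word_tokenize
    from nltk.text import Text
    from nltk.util import ngrams

    # One-time downloads of the models these functions depend on.
    nltk.download("punkt")
    nltk.download("averaged_perceptron_tagger")

    raw = open("corpus.txt").read()            # any plain-text file stands in for the corpus

    sentences = sent_tokenize(raw)             # tokenize into sentences
    words = word_tokenize(raw)                 # tokenize into word tokens
    bigrams = list(ngrams(words, 2))           # n-grams (here, n = 2)

    freq = FreqDist(w.lower() for w in words if w.isalpha())
    print(freq.most_common(20))                # word frequency list

    Text(words).concordance("language")        # keyword-in-context concordance
    print(pos_tag(words[:50]))                 # part-of-speech tagging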

SESSION RECAP: Presenter Ben Johnston started by providing a contextual framework for the session, which focused on the Natural Language Toolkit (NLTK) and Python. He emphasized the impossibility of actually learning Python in an hour, and the importance of ‘knowledge sharing’: those who have developed a sincere enthusiasm for the digital tools they have become familiar with should share what they know with peers and others. Knowledge sharing requires knowledge, but not expert-level knowledge. Digital humanists should be encouraged to share what they know even while they themselves are still learning (as you will likely never stop learning). Doing so reinforces learning and helps build community, both important aspects of gaining competency in the digital humanities.

Introduction to Text Encoding and TEI

Time: Wednesday, January 29, 2:00pm – 4:00pm
Location: HRC Classroom, East Pyne Room 012, Lower Level
Instructors: Clifford Wulfman and Ben Johnston

What’s with all the pointy brackets???

[Image: a diary entry from poet Robert Graves, from “Getting started using TEI,” http://tei.oucs.ox.ac.uk/GettingStarted/html/in.html]

Text encoding involves rendering transcriptions of documents (books, newspapers, magazines, manuscripts, engravings, and so on) into machine-readable form, so that they may be processed by computers in a variety of ways. Most of us are familiar with word-processing programs that create encoded texts for printing; and many of us have heard about HTML, a way of marking up, or annotating, a text for display on the World Wide Web.

What most people don’t know is that text markup has uses far beyond simple presentation (formatting and print layout). It can be used to support fundamental scholarly practices like glossing, annotation, linking, and other kinds of semantic analysis and interpretation, making the scholar’s intellectual work readable by machines.
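For instance, once a diary like the one pictured above has been encoded in TEI, a few lines of Python can pull out every name or date the encoder tagged; the file diary.xml and the particular elements queried here are hypothetical, though the namespace is the standard TEI one:

    import xml.etree.ElementTree as ET

    # "diary.xml" is a hypothetical TEI-encoded transcription.
    TEI = {"tei": "http://www.tei-c.org/ns/1.0"}
    tree = ET.parse("diary.xml")

    # Collect every personal name tagged <persName> and every <date> carrying a
    # machine-readable @when attribute; print layout alone cannot offer this.
    people = [el.text for el in tree.findall(".//tei:persName", TEI) if el.text]
    dates = [el.get("when") for el in tree.findall(".//tei:date", TEI)]

    print(sorted(set(people)))
    print(dates)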

(To register for the workshop, click here or scan the QR code.)

The Productive Scholar: Intro to Basic Text Analysis

Topic: Introduction to Basic Text Analysis
Speaker: Ben Johnston

Time: Thursday, October 17, 12noon – 1pm
Location: HRC Classroom, Room 012, Lower Level, East Pyne

This hands-on workshop will introduce participants to several tools useful for the analysis of text. AntConc, an easy-to-use but quite powerful concordance program, allows one to run sophisticated and detailed searches over a corpus, compare the textual characteristics of one text against another, and run collocation analyses. The online Voyant Tools offers a spectacular suite of tools for text analysis, including visualizations of the results, providing an excellent entry point to the field. Tools for basic topic modeling and named-entity recognition will also be presented. For those curious about topic modeling, MALLET provides an easy way to get started. Finally, the Stanford Named Entity Recognizer (NER) is a tool for recognizing and tagging proper nouns, or entities, such as people, place names, or organizations in a text.
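The Stanford NER itself is a standalone Java tool, so by way of illustration only, here is what named-entity output looks like using NLTK’s built-in chunker in Python (a sketch of the idea rather than the tool covered in the session):

    import nltk

    # One-time downloads of the tokenizer, tagger, and named-entity models.
    for pkg in ("punkt", "averaged_perceptron_tagger", "maxent_ne_chunker", "words"):
        nltk.download(pkg)

    sentence = ("Ben Johnston presented at the New Media Center "
                "in Lewis Library at Princeton University.")

    tokens = nltk.word_tokenize(sentence)
    tagged = nltk.pos_tag(tokens)        # part-of-speech tags feed the chunker
    tree = nltk.ne_chunk(tagged)         # labels spans as PERSON, ORGANIZATION, GPE, etc.

    for subtree in tree:
        if hasattr(subtree, "label"):    # named-entity chunks are subtrees
            entity = " ".join(token for token, tag in subtree.leaves())
            print(subtree.label(), entity)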

Speaker: Ben Johnston is Senior Educational Technologist and Manager at OIT’s Humanities Resource Center (HRC) in East Pyne, and Consultant for the Digital Humanities Initiative (DHI). Ben has been involved with educational technology for over twelve years in positions at Columbia University, Bryn Mawr College, and Princeton University. While at Princeton, Ben has worked with educators and researchers across the Humanities and Social Sciences to facilitate the use of digital assets, technology tools, databases, and digital video in teaching and research.