Tag Archives: Python

The Productive Scholar: Risk in Media Discourse: An Introduction to Topic Modeling with R and Python

Topic: Risk in Media Discourse: An Introduction to Topic Modeling with R and Python
Speaker: Manish Nag

Time: Thursday, April 10, 12:00pm – 1:00pm
Location: New Media Center (NMC), 130 Lewis Library, First Floor

Lunch will be provided. To register for this session: http://bit.ly/Risk-TM
(Registration is not required for attendance; however, refreshments may be limited.)

Amidst global concerns over financial markets, terrorism, and outbreaks of disease, the term “risk” pervades contemporary Western media discourse. Manish Nag’s dissertation examines the overall landscape of risk in contemporary news media discourse, using the full text of the New York Times from 1987 to 2006. What are the predominant threads of discourse related to risk, and how does this discourse grow and change over time? Manish’s presentation shows how topic modeling can help answer these questions.

Manish Nag is a Doctoral Candidate in Sociology. His research seeks to understand the landscape of media discourse on global risk, as well as change and resilience in global networks of people, goods, and ideas. His research utilizes mapping, data visualization, the analysis of text, and social network analysis. Manish received his BA in Computer Science from Brown University, and worked as a software engineer, entrepreneur, and manager prior to his graduate work.

Presentation co-sponsored with Digital Humanities Initiative at Princeton (DHI).

The Productive Scholar: Tools for Text Analysis in the Humanities

Text Analysis with NLTK Cheatsheet

Topic: Tools for Text Analysis in the Humanities
Speaker: Ben Johnston

Time: Thursday, April 3, 12:00 PM – 1:00 PM
Location: New Media Center, 130 Lewis Library, First Floor


A sequel to last semester’s ‘Tools for Text Analysis in the Humanities’, this session will give participants a brief yet hands-on introduction to NLTK, the Natural Language Toolkit. This extension to the popular Python programming language is geared specifically toward computational work with written human language data. In this introduction, we will use tools from this library to tokenize a corpus into sentences, n-grams, and words, create word frequency lists, view concordances, and do part-of-speech tagging. In doing so, this session will also serve as a very gentle introduction to the Python programming language. Absolutely no experience with Python or with programming is expected or required.
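
For readers who want a feel for these operations before the session, here is a minimal sketch of what sentence tokenization, word tokenization, n-grams, a frequency list, and a concordance compute. It is not the presenter's material and deliberately uses only the Python standard library rather than NLTK itself; the sample text, the regular expressions, and the helper name concordance are assumptions made for illustration. NLTK provides its own, more robust counterparts (for example nltk.sent_tokenize, nltk.word_tokenize, nltk.ngrams, nltk.FreqDist, and nltk.Text.concordance), and part-of-speech tagging (nltk.pos_tag) is left out here because it relies on a trained model.

```python
import re
from collections import Counter

# Hypothetical sample text, chosen only for illustration.
text = ("The quick brown fox jumps over the lazy dog. "
        "The dog sleeps. The fox runs away!")

# Sentence tokenization (naive): split on whitespace that follows ., ! or ?
sentences = re.split(r"(?<=[.!?])\s+", text.strip())

# Word tokenization (naive): lowercase runs of letters and apostrophes.
words = re.findall(r"[a-z']+", text.lower())

# Bigrams (n-grams with n = 2): pair each word with the word after it.
bigrams = list(zip(words, words[1:]))

# Word frequency list: count how often each word occurs.
freq = Counter(words)


def concordance(tokens, target, width=2):
    """Return each occurrence of `target` with up to `width` tokens of context per side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            lines.append(" ".join(left + ["[" + tok + "]"] + right))
    return lines


print(sentences)
print(bigrams[:3])
print(freq.most_common(3))
for line in concordance(words, "fox"):
    print(line)
```

The naive regular expressions will mis-split abbreviations such as "Dr." and ignore non-English letters, which is one reason to prefer NLTK's tokenizers for real corpora.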

SESSION RECAP: Presenter Ben Johnston started by providing a contextual framework for this session, which focused on the Natural Language Toolkit (NLTK) and Python. He emphasized the impossibility of actually learning Python in an hour, and the importance of ‘knowledge sharing’ with peers and others by those who have developed a sincere enthusiasm for the applications of the digital tools with which they’ve become familiar. Knowledge sharing requires knowledge, but not at the expert level. Digital humanists should be encouraged to share knowledge even while they themselves are still learning (as you will likely never stop learning). Doing so reinforces learning and helps build community, both important aspects of gaining competency in the digital humanities.

Interactive Web Tool To Clean Data for Analysis and Visualization

I recently came across a free interactive web tool from the Stanford Visualization Group called DataWrangler. This tool is great if you are downloading public data sets or have data sets of your own and want to clean them up before putting them into a data analysis or data visualization program. What’s great about this tool is that DataWrangler suggests different operations to perform to clean up the data, or you can adjust the operations to reflect how you want the data cleaned up. Some of the operations also give you a preview of what the data will look like before you commit to that operation. You can export the cleaned-up data from DataWrangler in Python or Java, or choose CSV, TSV, Row-Oriented JSON, Column-Oriented JSON, or Lookup Table. If you want to try out this tool, click on the link below to get started: http://vis.stanford.edu/wrangler/
