Classic Posts

Can you believe Overview has existed for more than five years already? Here are some classic posts from our archives.

Using Overview

Overview’s search syntax — Booleans, fuzzy matches, and more.

Keyboard shortcuts in Overview — Speed up your work!

Dealing with massive PDFs by splitting them into pages — a super handy trick

Import, edit, and create document metadata — using Overview’s “fields” feature

Comparing text to data by importing tags. How to use the Topic Tree to look for correlations between topic clusters and structured data.

View the same documents in different ways with multiple trees — how to make new Topic Trees that zero in on what you want to see.

Doing journalism

What did private security contractors do in Iraq? — The very first story done with Overview, an analysis of 4,500 pages of “escalation of force reports” for the Associated Press. See also how it was done.

Some other completed stories.

The different kinds of document-driven stories — Sometimes you want to search, sometimes you want to categorize and count, sometimes you want to remove the junk.

The document mining Pulitzers. There were plenty of document-driven stories in the 2014 Pulitzers.


What do journalists do with documents? — A talk (video) and paper which reports on 15 different stories done with Overview, plus uses of other NLP techniques in journalism.

Algorithms are not Enough: lessons bringing computer science to journalism — What we learned applying NLP techniques to journalism, but useful for anyone designing software.

VIDEO: Text analysis in transparency — A talk at Sunlight Labs, 2013. Old but good.

How Overview can organize thousands of documents for a reporter — how the Topic Tree works, in detail

What is xkcd about? Text mining a web comic — A comparison of Overview’s clustering vs. LDA topic modeling.

Who will bring AI to those who cannot pay? Some reflections on the barriers to applying advanced technology in journalism.

The development of Overview

VIDEO: What the Overview Project does — A presentation from a conference in Berlin, 2014.

Overview: The Design, Adoption, and Analysis of a Visual Document Mining Tool For Investigative Journalists. Paper accepted to IEEE InfoVis 2014 that describes the evolution of the system.

VIDEO: Document mining with the Overview prototype — NICAR conference, March 2012. The prototype was so awkward, but people did good stories with it anyway.

VIDEO: Investigating thousands (or millions) of documents by clustering — Demonstration of document clustering methods at NICAR conference, February 2011.

A full-text visualization of the Iraq war logs — December 2010.  This is where it all began, with a hacked-together proof of concept based on document clustering.