How To – Political Ad Sleuth, Part 1

We’re going to teach ourselves a new skill today. Step by step.

We’ll download a raw set of fresh data and analyze it (using Excel and Google Refine) as fodder for a print story – deadline Friday (today is Wednesday; this is real time). I’ll break this tutorial down into smaller posts to leave space for questions along the way.

I don’t want to lead you into the dark. Let me shine a little light backward for a minute.

Public Inspection Files

The FCC started requiring that broadcast television stations affiliated with the top networks in the top fifty markets upload their public inspection files to a website, where the public can access them. The new rule went into effect on August 2, 2012.

I wrote a story about the process. It was a bitch. Before August 2, the stations kept their public files – including the political advertising file – in the office on paper. You could come and look and pay a quarter per page if you wanted them to run copies. It was hard to keep track of everything and even more difficult to figure out details, like where the money is coming from. Details about the contents of the public file here.

Political Ad Sleuth

The Sunlight Foundation launched a new project called Political Ad Sleuth. They gather political advertising contracts from the FCC’s website and from volunteers who submit scans. Doing it all yourself is extremely time consuming because each contract is a PDF tucked away in folders.

Political Ad Sleuth saves you days or maybe weeks of work and gives you a good starting point. They bring it all together and make it available, but it still needs to be cleaned and analyzed. You’ll soon see what I mean.

The Hypothesis

They sent out a press release today saying that Cincinnati was No. 7 in the nation last week, ranked by the number of documents uploaded. Each document should be a contract, usually purchased or renewed on a weekly basis. That makes us the most heavily saturated Ohio market. Cleveland was the top market when I wrote that earlier story.

Is it true? Who is doing the advertising? What are they running? When did things change, exactly? Where are they buying the most time? Why did things change? Can you think of any other questions?

I stumble down the path and find new things that I never even thought of while I work.

Now that that’s out of the way, let’s test the theory.

Download the data

Click [CSV of all files]. It’s big – 28,951 records. Name it and save it somewhere that it won’t get lost.

Have a look around. Let me know when you’ve made it this far. Ask any questions as we go along. I respond pretty quickly.
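If you’d rather have a look around with a script than by scrolling, here is a minimal sketch in Python using only the standard library. It reads the CSV into rows and counts how often each value shows up in one column. I’m assuming the file has a header row; the column name you pass in (I use "market" as a stand-in below) is hypothetical, so check the real headers first and swap in whatever the file actually calls it.

```python
import csv
from collections import Counter


def read_records(lines):
    """Parse CSV text (an open file or any iterable of lines) into a list of dicts.

    The first line is treated as the header row.
    """
    return list(csv.DictReader(lines))


def load_records(path):
    """Open the downloaded CSV at `path` and return its rows as dicts."""
    with open(path, newline="", encoding="utf-8") as f:
        return read_records(f)


def top_values(records, column, n=10):
    """Return the n most common non-empty values in `column` as (value, count) pairs.

    `column` must match a header in the file; the name is up to you to supply.
    """
    counts = Counter()
    for row in records:
        value = (row.get(column) or "").strip()
        if value:
            counts[value] += 1
    return counts.most_common(n)
```

To try it, call `load_records` with whatever you named the download, check `len(records)` against the record count on the site, look at `records[0].keys()` to see the real column names, and then pass one of them to `top_values`. It won’t clean anything, but it is a quick way to see which values dominate a column before we start refining.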

Next, we’ll start cleaning it up and refining things.

Public Records

 

This post clarifies some thoughts from my last. When I speak of public records, I’m not talking about personal information – necessarily.

I picked up a list of registered voters from the Hamilton County Board of Elections. It lists your name, address, phone number, party affiliation (within the confines of our current political structure) and the date of your last vote. Sure. It’s election season. You knew that information was public.

What about the information attached to the registration of your dog?

That list has your name and address as well as the name and breed of your dog. Even though this is public information available to anyone, it has been (weakly) argued that the information should not be made available.

You can find the information neatly displayed in a map on the Cincinnati Enquirer’s website. The argument against making this information THAT public was that a burglar may enter your home without fear of your beloved protector because he would be able to call the dog by name.

You can search through other types of public records on the Enquirer’s website, but there are many, many more. Most can be found with a little digging in the right corners of the web, but not all.

The Ohio Department of Education releases a lot of data, but it maintains much more than it releases. They keep track of everything about everybody – literally.

Every student is assigned a student ID number, in an attempt to maintain their privacy, and is tracked throughout their entire school career. Every teacher is graded on the performance of those students and every school is evaluated on the performance of its teachers. And it’s all about testing.

When suspicions of cheating swirled, according to a local reporter who covers education, USA Today fought for more than a year to obtain the information it needed.

So public records inevitably contain information about individual members of the public that can also be useful to the public (and watchdog reporters). What are your feelings on public records? Should it be required that all public records be easily accessible?

My mind explodes when I think about how much information is out there.

Open Missouri lists all of the public records that are not easily accessible, just for that state. David Herzog, the founder of the organization, said it took a small team one year to determine what information most of the state’s agencies maintain but do not make available online. Not all agencies responded to their requests.

They didn’t fight for the information like USA Today because, he said, he didn’t think they should be doing the government’s job.

 

Link

Unredacted’s flow chart breaks down the process of getting information. Click on the image for the full article.

Remember: You have the right to request the information in the format in which it is stored. If the information is on a computer, you need it electronically. NEVER take a PDF. XLS, XML, TXT… even Word. Anything but a PDF!

I prefer Excel.