How did it go? Did you have any problems opening the file? If so, you may want to make it smaller by getting rid of information you don’t really need. If your ‘puter didn’t choke on that, you can skip the next step and move on to cleaning it up and refining things.
Reduce File Size
Leave your original file alone and save a second copy with a suffix such as _edit.
The first thing I do is right-click the row number 1 so the entire row is highlighted, then choose Delete. *Tip: In Office 2010, go all the way to the top and click View>Freeze Panes>Freeze Top Row so the top row will always be visible.
Every data set is different. Let’s see what we want to keep and what we want to get rid of. Less data makes it easier to focus on what’s important and reduces the file size:
- source_file_url is useful, but not right now. The column is too big and gets in the way.
- tv_market-id is not necessary. We’ll use the tv_market.
- fcc_folder can go.
- file_name can go.
- and everything else to the right of advertiser_name can be deleted.
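If the file is too big to open comfortably in a spreadsheet, the same trimming can be done with a short script. This is only a rough sketch using Python's built-in csv module. The column names come from the list above, but the column order and the sample row are made up for illustration, and `trim_columns` is my own helper name, not part of any tool.

```python
import csv
import io

# Columns named in the list above as safe to drop for now.
DROP = {"source_file_url", "tv_market-id", "fcc_folder", "file_name"}
# Everything to the right of this column is dropped as well.
LAST_KEEP = "advertiser_name"


def trim_columns(reader, writer, drop=DROP, last_keep=LAST_KEEP):
    """Copy rows from reader to writer, keeping only the wanted columns."""
    header = next(reader)
    cutoff = header.index(last_keep)  # raises ValueError if the column is missing
    keep = [i for i, name in enumerate(header[: cutoff + 1]) if name not in drop]
    writer.writerow([header[i] for i in keep])
    for row in reader:
        # Skip indexes that a short (ragged) row does not have.
        writer.writerow([row[i] for i in keep if i < len(row)])


# Tiny made-up sample; the real export has many more columns and rows.
sample = io.StringIO(
    "tv_market-id,tv_market,fcc_folder,file_name,source_file_url,"
    "ad_type,advertiser_name,extra_1\n"
    "1,Chicago,folder/a,a.pdf,http://example.com/a.pdf,"
    "US House,Example Committee,x\n"
)
out = io.StringIO()
trim_columns(csv.reader(sample), csv.writer(out, lineterminator="\n"))
print(out.getvalue())
```

To run it on a real file, open your _edit copy for reading and a new file for writing (both with `newline=""`) and pass `csv.reader` and `csv.writer` objects built on them instead of the in-memory sample.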
It’s trimmed. Time to polish.
Download Refine, extract all files, and run the .exe file. Now you’ve got Refine. It opens in a browser window.
Browse to your file and upload it into Refine.
The Art of Refining
Google has some great videos to get you started. I’m starting with ad_type. I clicked the little drop-down box at the top of the column and chose Facet>Text Facet. The list on the left is populated with each unique value and how many times it appears. At the top of that box, sort by count.
Look at each group and try to get it down to as few groups as possible. Watch the first tutorial at the link above to see how to do this.
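If you want to see what the text facet is doing under the hood, here is a rough stand-in in Python. It is not how Refine is implemented, just a sketch of the same idea: count every distinct value exactly as typed and sort by count. The `text_facet` name and the sample values are made up for illustration.

```python
from collections import Counter


def text_facet(values):
    """Return (value, count) pairs sorted by count, largest first.

    Values are counted exactly as typed, so "US House" and "us house"
    land in separate groups, which is what makes the cleanup visible.
    """
    counts = Counter(v if v else "(blank)" for v in values)
    return counts.most_common()


# Made-up sample of ad_type cells, including one empty cell.
ad_types = ["US House", "US Senate", "US House", "", "President", "US House", "US Senate"]
for value, count in text_facet(ad_types):
    print(value, count)
```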
Here’s my list of ad_types.
Non-Candidate Issue Ads 10999
US Senate 4993
US House 4860
President 4642
State 2739
Local 461
US Congress 100
Terms and Disclosures 21
Candidate Ads Rate Cards 4
Classes of Time 3
Political Guidelines 3
Station Contacts 3
CRAVAACK715920 (13500717880925) 1
Duckworth 10.01-10.01 C399508 R 1
Duclworth 09.25-09.30 C399804 R 1
flinn for congress 9-12_2012091 1
Foster 10.09-10.14 C396162 Rev0 1
Foster 10.22-10.28C396159 Rev00 1
KNBC tacts (13444398098929)_.p 1
Smith Inv. 94328 (1346341508269 1
Station tacts (13450581429382) 1
(blank) 112
I can’t put any of the remaining contracts into any of the bigger groups for certain. Remember not to get overzealous. Make sure you’re making changes that maintain the integrity of what we started with.
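Here is one way to picture "not getting overzealous" in code. This is a sketch, not what Refine does internally: labels are merged only when they appear in an explicit lookup table, and anything else is left exactly as it was. The spelling variants in `MERGES` are hypothetical examples, not values from my data set.

```python
# Hypothetical spelling variants mapped to the labels used in the list above.
# Only add a pair here when you are certain both labels mean the same thing.
MERGES = {
    "us house": "US House",
    "u.s. house": "US House",
    "us senate": "US Senate",
    "u.s. senate": "US Senate",
}


def recode(value, merges=MERGES):
    """Return the merged label, or the original value if there is no sure match."""
    key = value.strip().lower()
    return merges.get(key, value)


values = ["U.S. Senate", "us house ", "US Congress", "Foster 10.09-10.14 C396162 Rev0"]
for original in values:
    print(repr(original), "->", repr(recode(original)))
```

Notice that "US Congress" and the stray contract name pass through untouched. Since I can't say for certain which bigger group they belong to, they stay as they are.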
Let me know if you have any questions so far. How’s it going? Are my instructions easy to follow? Is this helpful so far?
By the end, we’ll be able to generate charts and graphs with amazing detail. Just stick with it! Next time, we’ll get even more detailed.