At long last, we're getting ready to create the first set of data files containing detailed entries on campaign committee spending. We're planning to start with all 2009-2010 congressional campaigns (including Senate), covering the period from January 1, 2009 onward. There have been a couple of problems in preparing these data, but we think we now have an approach that can work.
First, as some of you know, we've had a REALLY hard time dealing with paper filings from Senate committees. This stems mainly from the fact that an amendment to one of these reports can be a complete resubmission of the filing, but it can also be just a few pages with some changes, or in extreme cases just a letter telling us that a committee would like to add or change something in an earlier report. This creates very complex problems when you try to build automated systems that identify only the most recent material for disclosure purposes.
By sheer force of will, our folks have overcome these problems, so we're now in a position to provide data (hand-entered from the paper filings) for the 2010 Senate campaigns.
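For those who like to see the amendment problem spelled out, here is a minimal sketch in Python of the "most recent material" logic. It is only an illustration, not a description of our actual process: the `Filing` structure, its field names, and the idea that every amendment arrives neatly labeled as complete or partial are all assumptions made for the example. Real paper amendments are rarely that tidy, which is exactly why this has been so hard to automate.

```python
from dataclasses import dataclass


@dataclass
class Filing:
    # All field names here are hypothetical, chosen for illustration only.
    report_id: str   # which report this filing belongs to
    sequence: int    # 0 = original filing, 1 and up = later amendments
    complete: bool   # True if the filing restates the whole report
    entries: dict    # entry key -> amount


def latest_entries(filings):
    """Return the most recent view of the entries in each report."""
    current = {}
    # Process each report's filings in the order they were submitted.
    for filing in sorted(filings, key=lambda f: (f.report_id, f.sequence)):
        if filing.complete or filing.report_id not in current:
            # A complete resubmission replaces everything filed earlier.
            current[filing.report_id] = dict(filing.entries)
        else:
            # A partial amendment overrides only the entries it mentions.
            current[filing.report_id].update(filing.entries)
    return current


# Made-up filings: one partial amendment and one complete resubmission.
example = [
    Filing("Q1", 0, True, {"a": 100, "b": 50}),
    Filing("Q1", 1, False, {"b": 75}),
    Filing("Q2", 0, True, {"x": 10}),
    Filing("Q2", 1, True, {"y": 20}),
]
result = latest_entries(example)
```

In this example the partial amendment to "Q1" changes only entry "b", while the complete resubmission of "Q2" discards entry "x" altogether. A letter asking for a change has no such structure at all, so someone has to read it and decide which entries it touches.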
The other challenge we've faced is that the process we've used for the first few files in our data catalog won't work with files of this size. (I continue to think that process is really very nice, combining search and download in a straightforward way. If you haven't done so already, please check it out and let us know what you think.) These files will be a lot bigger than the ones we've posted so far: right now the full candidate disbursements file is about 260,000 rows, and it will probably grow to several times this size by the end of the cycle. This is too much data for the query and file generation processes we've used so far, so we're working on changes that will allow us to accomplish the search and download functions more quickly in the future.
In the meantime, we're planning to make available soon (by that I mean by the end of April) a set of downloadable files in XML and CSV formats that contain all of the disbursements for the candidates in any given 2010 race. We'll make it possible to select everything, all Senate spending, all House spending, all spending in a particular state, or all spending in a particular race. I've posted a small Excel file on our FTP server to give you an example of the file format; there is also a file with an explanation of the columns. Take a look and get back to us with any questions or suggestions for how we can make these data as useful to you as possible.
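If you'd rather slice one of the CSV files yourself, the kinds of selections described above might look something like the following Python sketch. The column names (`chamber`, `state`, `district`, `committee`, `amount`) and the sample rows are placeholders invented for this example; please check the column explanation file for the real layout before relying on any of them.

```python
import csv
import io

# Placeholder data with hypothetical column names, not real disbursements.
SAMPLE = """chamber,state,district,committee,amount
H,OH,01,Example Committee A,250.00
S,OH,,Example Committee B,1200.50
H,TX,07,Example Committee C,75.25
"""


def select_rows(csv_file, chamber=None, state=None, district=None):
    """Yield the rows that match every filter that was supplied."""
    wanted = {"chamber": chamber, "state": state, "district": district}
    for row in csv.DictReader(csv_file):
        # A filter left as None matches every row.
        if all(v is None or row[k] == v for k, v in wanted.items()):
            yield row


# All spending in one state, all Senate spending, and one particular race.
ohio = list(select_rows(io.StringIO(SAMPLE), state="OH"))
senate = list(select_rows(io.StringIO(SAMPLE), chamber="S"))
tx_07 = list(select_rows(io.StringIO(SAMPLE), chamber="H", state="TX", district="07"))
everything = list(select_rows(io.StringIO(SAMPLE)))

# Total spending for the selected state.
ohio_total = sum(float(row["amount"]) for row in ohio)
```

With a real download you would pass an opened file, for example `open(path, newline="")`, in place of the `io.StringIO` object. Because the function reads one row at a time, it should cope with files of several hundred thousand rows without loading everything into memory.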