Federal Election Commission

Disclosure Data Weblog

We're ready to start providing files in the data catalog that contain detailed information about the specific receipts and disbursements for candidates and committees.  This will include, for example, details for all contributions from individual people where the aggregate amount the person has given to a committee exceeds $200.  Similarly, all payments by committees to specific vendors will be available once those payments have exceeded $200 to that vendor.  Obviously, these files will contain lots of data - potentially millions of rows.
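As a rough illustration of the $200 rule described above, here is a minimal Python sketch. It assumes each record is a dictionary with hypothetical `donor`, `committee`, and `amount` fields (whole dollars), none of which are the actual column names in the files, and it ignores the time period over which amounts are aggregated.

```python
from collections import defaultdict

def itemized(contributions, threshold=200):
    """Keep every contribution from a donor whose total to a committee exceeds the threshold.

    The field names "donor", "committee", and "amount" are assumptions
    made for this sketch, not the layout of the published files.
    """
    # First pass: total what each donor gave to each committee.
    totals = defaultdict(int)
    for c in contributions:
        totals[(c["donor"], c["committee"])] += c["amount"]

    # Second pass: keep all records belonging to a pair over the threshold.
    return [
        c for c in contributions
        if totals[(c["donor"], c["committee"])] > threshold
    ]


sample = [
    {"donor": "A", "committee": "C1", "amount": 150},
    {"donor": "A", "committee": "C1", "amount": 100},
    {"donor": "B", "committee": "C1", "amount": 200},
    {"donor": "A", "committee": "C2", "amount": 50},
]
detailed = itemized(sample)
```

In this sample only donor A's two gifts to committee C1 are kept: they total $250, while donor B's single $200 gift does not exceed the threshold.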

One problem we're having, therefore, is deciding what groupings of candidates or other committees we should use as the starting point for these files. (We're also keeping in mind the need to search these data based on information about the donor or the entity being paid - if you have ideas about how these might be grouped in more manageable sets of information, let us know.) We've got lots of ideas and we'd like to know what you think. Use the comments section to tell us your thoughts on these or other ways of organizing the information that would be helpful for you. 

We're beginning with data from the 2009-2010 time period, and when we've settled on a process we'll expand with more historical information.

First, we're thinking about placing the largest sets of itemized data in XML (if it's not too big) and CSV files on our FTP server, so if you choose something like "all 2010 candidate receipts" from a listing in the catalog you would be redirected to a zip file in the format you choose.  These would be updated once a day.  Is there a different file format we should consider because these files are so large?
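For a sense of how a very large zipped CSV might be handled without reading it all into memory, here is a hedged Python sketch that streams rows out of the zip file and into a local SQLite database in batches. The file names, the table name `receipts`, and the assumption that the first row holds column names are all hypothetical, since the actual layout has not been settled.

```python
import csv
import io
import sqlite3
import zipfile

def load_zip_csv_to_sqlite(zip_path, member, db_path, table="receipts", batch_size=10000):
    """Stream one CSV file inside a zip archive into a SQLite table.

    Assumes the CSV's first row is a header and every row has the same
    number of fields. Returns the number of data rows inserted.
    """
    with zipfile.ZipFile(zip_path) as zf, zf.open(member) as raw:
        # Wrap the binary stream so csv.reader sees text, one row at a time.
        reader = csv.reader(io.TextIOWrapper(raw, encoding="utf-8", newline=""))
        header = next(reader)
        cols = ", ".join('"%s"' % h.replace('"', '""') for h in header)
        marks = ", ".join("?" for _ in header)

        conn = sqlite3.connect(db_path)
        try:
            conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({cols})')
            insert = f'INSERT INTO "{table}" VALUES ({marks})'
            count = 0
            batch = []
            for row in reader:
                batch.append(row)
                if len(batch) >= batch_size:
                    conn.executemany(insert, batch)
                    count += len(batch)
                    batch = []
            if batch:  # flush the final partial batch
                conn.executemany(insert, batch)
                count += len(batch)
            conn.commit()
        finally:
            conn.close()
    return count
```

Once loaded, the data can be queried with ordinary SQL, which is one way to get at the donor-based or vendor-based searches mentioned earlier without opening the whole file in a spreadsheet.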

We might do this for a number of groups - e.g.:

  • All 2010 Senate campaign receipts
  • All 2010 Senate campaign disbursements
  • All 2010 campaign receipts in [STATE]
  • All 2010 campaign disbursements in [STATE]
  • All receipts for PACs
  • All disbursements for PACs
  • All receipts for Democratic national party committees
  • All receipts for Democratic state and local party committees
  • Etc.

Do these look like the right groupings to work with? Are there others that would be helpful to you?

No matter what, we'll offer some "customize" options that will allow for more specific requests - and we're working on a process that would allow you to choose a specific candidate or committee and get a package of two files - one for receipts and one for disbursements, with just one click. Is there anything else that would be useful?

Thanks


Comments:

As a consumer of nearly every type of electronic data the FEC produces, I'm really excited about these additions to the data catalog. My only concern is about the usability of a 1,000,000 row XML file. You might consider SQLite database files as an alternative to schema-less CSV or memory-hungry XML. SQLite is reasonably fast, free, cross-platform, and databases consist of a single file. No matter the file format, I think these alternative groupings will be very interesting. I especially like the idea of not having to parse 200+ 3x filings to find all the disbursements made by a party @ the local level, for example. Keep up the great work. jjh

Posted by Jason Holt on January 16, 2010 at 08:25 AM EST #

Bob, for those of us currently using legacy data: whatever groupings and format you decide on, is it possible to have two files that mimic the current itoth and icont files by election cycle?

Posted by Tony Raymond on January 21, 2010 at 08:53 PM EST #

No more COBOL characters? Thanks for everything you're doing.

Posted by Matt Stiles on January 28, 2010 at 03:53 PM EST #

I would like groupings by industries. Some that come to mind are: health insurance, life insurance, defense industry, fast food, agricultural, pharmaceutical, etc. It would be beneficial if donations could be tracked by industry.

Posted by Joseph Sparks on February 05, 2010 at 03:36 PM EST #

This is really exciting, thanks for all the work on this. I really appreciate the open dialog that this blog represents.

Regarding the size of the (formerly) icont files, I think as XML or CSV they will compress nicely, and you're not going to get a much more efficient method of storing the data. SQLite would be interesting as well, and then we could embed your data directly into an iPhone app (partially kidding).

Do you have a sense of when you will start making any of this data available? I'm gearing up for a reporting project for the 09-10 cycle, and am trying to hold off until this data is available. Thanks!

Posted by Adam S on February 18, 2010 at 11:13 AM EST #

Adam, Thanks for the feedback - we're having a meeting this afternoon with the folks who will work on this and we'll talk about these comments. We're working on the files now and we expect to post files within the next quarter. We're probably going to try to do campaign expenditures first, since that's the data we've failed to provide up to now - followed (quickly I hope) by the other sets we talked about in this post. We'll have more on this soon.

Posted by Bob Biersack on February 18, 2010 at 11:37 AM EST #

If a commenter has clicked 'notify me by email comments' they will receive an email notification for all comments, even those caught and discarded by your spam filter. I'm getting 10+ emails a day from disclosureblog@fec.gov about cheap meds available over the internet. Could somebody take a look at this? Or at the very least remove the 'notify' button for the time being. I couldn't find a contact email so I figured it was easiest to post here. thanks-jjh

Posted by Jason Holt on March 31, 2010 at 10:32 AM EDT #

For Montana House, the webl10.zip file puts REHBERG, DENNIS R and GERNANT, TYLER REED in District 1, while all of the other candidates are in District 0. The same is true in North Dakota, where CRAMER, KEVIN is in District 1 while the others are in District 0. Should we avoid using that?

Posted by Dan Keating on April 29, 2010 at 11:31 AM EDT #

Thanks for your efforts, Bob. What's the latest on individual contribution data in XML or CSV format?

Posted by Alo Konsen on June 15, 2010 at 02:47 AM EDT #

Alo, We're still working on the new versions of contributions data. I'll get back to you as soon as I know what specific time frame we're looking at.

Posted by Bob Biersack on June 15, 2010 at 01:29 PM EDT #

Thanks, Bob!

Posted by Alo Konsen on June 16, 2010 at 06:23 PM EDT #

Any progress to report? Election season starts in earnest on Tuesday.

Posted by Alo Konsen on September 02, 2010 at 03:57 AM EDT #

Bob, any progress on individual contribution data in XML or CSV format?

Posted by Alo Konsen on December 28, 2010 at 10:06 PM EST #
