With PlainSite I wanted to see if it was possible to build a business model around public information that is actually available to the public, unlike the traditional model that Lexis, West and Bloomberg use. The dockets are all available for free. The documents are all available for free. Cleaned-up USPTO data is available for free. What isn't free is analytics on the data of the kind that generally only lawyers would care about.
Furthermore, the data is being uploaded to the Internet Archive, which PlainSite then re-downloads. Anyone can use it. If you don't like what I'm doing with it, you can do something else.
Aaron was an entrepreneur as well as someone who cared deeply about open access to data. So no, I don't think there's much irony.
"Furthermore, the data is being uploaded to the Internet Archive, which PlainSite then re-downloads. Anyone can use it. If you don't like what I'm doing with it, you can do something else."
Excellent idea (UK resident so the actual information is of no use to me but the model is good)
In all humility, I did not know who Carl Malamud was before reading this post and the comments, and I still had to look him up. As a former corporate attorney I used the SEC's EDGAR database regularly, never realizing that if it were not for one person's efforts that system would not exist. But what speaks to me even more is his current effort with law.gov to bring online all primary legal materials (including legal codes and case law) for open public access.
It is eye-opening to someone whose reality was subscriptions to Westlaw and LexisNexis, which could cost thousands of dollars, for access to case law, codes, statutes, rules and regulations (in other words, public material). I am going to see if I can find some of his talks on YouTube, but it would be awesome to be able to interact with someone like this.
For those unfamiliar with Carl Malamud, he's a national treasure cataloging and open-sourcing the nation's legal codes, government videos, legal filings and other public documents.
He had Aaron's back many times, including when the FBI was investigating the PACER liberation. If you want to support the kind of work that Aaron believed in, resource.org takes donations in many denominations.
Well, in the PACER case, Carl talked to the FBI and offered to take the fall for Aaron if it came to that. That's even though Aaron used archive.org servers without Carl knowing what had transpired.
There were other data issues, including USPS zip code data stuff that Carl counseled Aaron on how to make public legally. Carl was generous in cautioning Aaron, ordering gov files for him, digitizing files, getting Aaron server space, etc.
Carl also wrangled free legal help for Aaron in setting up a non-profit.
I hope they post the video of Carl's memorial speech somewhere online; it was passionate and inspiring! Text on a webpage conveys his message, but everyone jumped up applauding when he finished speaking tonight in SF.
A good talk, but I feel it's not smart to use the 'Army' definition, especially because you're non-violent. You're definitely not an army, see Wikipedia: An army (from Latin "arma" “arms, weapons”) in the broadest sense is the land-based military branch, service branch or armed service of a nation or state.
The way this 'game' is played, it's not even possible to defend yourselves (by design). And with regard to your rights, I'm reminded of George Carlin: http://www.youtube.com/watch?v=Kgj4ARfAqI0 (from 4:23).
I see your point, but "Army" in context here doesn't seem threatening, e.g. the Salvation Army. Of course, that wouldn't stop those who insist on taking things out of context to fight dirty, which is inevitable.
This was a lot of pent-up frustration, not only with lawmakers but also with the rest of us who were blissfully oblivious.
You are absolutely correct, but this is bigger than people with computer knowledge. The use of language is a (PR) tool that shapes how you are being framed and how the public will see you. Words like 'hacker', 'lone wolf', 'Army', and 'Anonymous' play right into the 'other side's' hands.
I've always seen a clear distinction in hacktivism between legal fighters (EFF, GNU, democracynow) and crackers. We could not think of EFF doing anything illegal, for example.
This article/speech is interesting because it seems to be dropped right in the middle. I can't help but think this army vocabulary is aimed precisely at getting both groups to join.
Well, I don't like the army thing much myself. But government does like a "war" on this, that and everything. Perhaps it's language government might understand and, hell, may even respect.
Maybe, too, it sort of beefs up the geeks. I get the impression that the government/corporate cartel treats tech folk like harmless, meek geeks. Perhaps language like this helps change that image a bit?
That aside, what a brilliantly put article. It sums up in words what so many feel but can't fully articulate.
I see what you're saying. I remember the different treatment by the police of people protesting/marching in Washington. One group was peaceful and non-violent (Occupy), the other peaceful and 'don't-tread-on-me' (Tea Party). Hackers build this world, and they can fight back as well, and they know it.
It seems strange to me that Google, Amazon, or any other cloud provider hasn't partnered with a non-profit organization to secure grants for a modern, comprehensive public data repository. You might call it something like "gov.org". One requirement, of course, is to have an open process for modifying data representation. Crowd-sourced data formats, heh.
(Out of curiosity, why can't we consider the content of .gov websites to constitute this archive and simply a) petition that all public datasets be available on a .gov domain (format to be sorted later) and b) that all future datasets start out life open on .gov.)
We're trying.
It's not as easy as you think.
The incentives for government to care or want to work with you really aren't there.
That said, we've had success in some limited areas.
For example, voting information (which was 7+ figure data for the US) is now online. This was originally a partnership I helped create with Pew and Google back in 2008 (now expanded to include MS and others as well, wonderfully):
After 4+ years, we now have a large number of states' voting information online and free.
A large number of people at various states also put their asses on the line to help make this happen over the years. I wish I could give them medals.
:)
There are other examples, like patent data, etc. To be honest, I'd rather we stay behind the scenes and just have the info released, even if it means people never know we were involved. It prevents a lot of issues from people who make large amounts of money off data that should be public and open. Of course, there are times/cases where it makes sense to use our name and brand to help, and when necessary, we do that.
We also fund plenty of non-profit orgs, including folks like Carl. But getting traction is simply not that easy. A lot of government agencies make revenue from data they publish by selling subscriptions to it or otherwise charging. They don't want to give it to you if your plan is to open it up, even if you are willing to pay large sums. I can't often blame them. Congress cares more about seeing agencies budget neutral than they do about "open data".
There were also mandates that public datasets be cataloged sanely and released. This led to data.gov. However, because of the way it worked, some agencies had perverse incentives, like "release the most datasets".
This led to humorous things like every single separate piece of data being published as its own dataset on data.gov, which had no good search, making parts of it entirely useless.
Anyway, the short answer is: We're trying. We've been trying.
Good! Thanks for your comment. I thought of an amusing way to think of a rather unamusing situation, so I figured I'd share it. Maybe you'll laugh, I hope.
Let's say you have a large extended family and elderly grandparents that have boxes of family photos. You know that the family would love to flip through these, and digitizing them and putting them online is beyond your grandparents' abilities.
So you offer to help them digitize and share the files. On your time and dime.
But your grandparents surprise you by saying "no". The first thing they're worried about is that, mixed in with those photos, are some risque photos of grandma when she was younger. She doesn't want those shared. But also - and this is the part that really drops your jaw - unknown to you for many years your grandparents have been charging family members a small fee to access the photographs, and it's quite a little side income for them, especially around the holidays when they need it most!
From these past couple of weeks, I have picked up on the basics of the PACER incident. Is the archive out there anywhere to be found? Maybe at Wikileaks or an onion address?
How about the mass of data Aaron got from JSTOR? Surely someone else must have a copy for safe keeping.
Seems this particular subset of data deserves to be liberated. Not that the archives in their entirety do not, but since a subset is already out there, why hasn't some group released it yet?
As someone who's had to pay PACER fees for their own court concerns, I find this entire paywall mentality offensive.
If you haven't seen it already, please participate in Operation Asymptote, and tell others to as well:
http://www.plainsite.org/asymptote/
I'd like to have every U.S. Attorney's full case history on PlainSite by March 31, 2013. I paid for Ortiz [1] and Heymann [2]. There are a lot more.
[1] http://www.plainsite.org/flashlight/attorney.html?id=69049...
[2] http://www.plainsite.org/flashlight/attorney.html?id=73864...
Also, help us with extending RECAP:
http://www.plainsite.org/aaronsw/