I am deeply afraid of the impact of this law. The amount of meta-work required to consolidate and annotate data we collect, in order to prepare it for public consumption, seems likely to hurt government efficiency.
In addition to the administrative burden, it appears to ignore the fact that non-sensitive information, in sufficient quantity and correlation, becomes sensitive information.
Perhaps my skepticism is misplaced, but my initial reaction is that this sounds better in the abstract than it will turn out to be in practice.
Part of my wife's job is to research Medicaid billing codes for every state (yes, this is a state thing, but I'm just using it as an example). Once in a while she can get the codes in a form as "advanced" as an Excel spreadsheet. But more likely she'll get a PDF that she has to run through an OCR program to convert into a spreadsheet, then check for errors. Or, for some states, nothing is published at all, and she has to piece the codes together from partner hospitals' billing records.
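To make that concrete: for scanned documents an OCR pass (e.g. Tesseract) would have to come first, but for digitally generated PDFs the table-extraction step looks something like this sketch with pdfplumber (the filenames are made up):

```python
# Hypothetical sketch: pull billing-code tables out of a non-scanned PDF
# and dump them to CSV. Scanned PDFs need an OCR pass before this works.
import csv
import pdfplumber

rows = []
with pdfplumber.open("state_medicaid_codes.pdf") as pdf:  # made-up filename
    for page in pdf.pages:
        for table in page.extract_tables():
            rows.extend(table)

with open("state_medicaid_codes.csv", "w", newline="") as f:
    csv.writer(f).writerows(rows)
```

The error-checking step stays manual either way: extraction mangles merged cells, multi-line entries, and footnotes often enough that every row needs eyes.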
There's no doubt that getting this data into a sane format will take the states some extra resources.
But when you consider how much more efficient this will make my wife's company, and every other provider of Medicaid services, it's bound to be a huge win on net. And improving the efficiency of healthcare delivery should count for a lot.
The government is big, but the private sector is still much larger, so there's great leverage in making our overall systems more efficient: an investment in efficiency on the government side gets multiplied many times over across the many private entities the government oversees.
Hospitals routinely spend huge sums of money on new equipment that significantly improves their competitive edge on diagnoses and outcomes. They are also willing to spend money on drop-in solutions that lessen the need for paperwork that eats up admin, nurse, or doctor time.
However, you're right that they are notoriously stingy about buying new things if the economics aren't immediately apparent, and they will never buy into something that demands radical workflow or org changes.
By hard to sell to, I mean it's hard to get in front of the right person. The people who control the purse strings are not necessarily the most knowledgeable about the problems at hand, either.
Inefficiency and high expense are the primary burdens of open, representative, democratic government. If you want a cheap, efficient government, you want an absolute monarchy. This is why corporations tend to be structured into rather strict hierarchies that bear no small resemblance to feudal kingdoms. It's also why they're terrible at meeting worker demands.
Your reply leans pretty heavily on the assumption that all government agencies consist of bureaucrats. I think you should re-examine that assumption. A small minority of government workers are involved in issuing regulations at places like the EPA. Most are military service members, Homeland Security workers, DOJ law enforcement, etc.
Yeah, I think the proximal issue is that many govt. bureaucrats struggle to keep up with documentation requirements as it is. In some (many?) cases the documentation takes an exceedingly long time, or it doesn't get done at all.
But it's possible that better access to open data, and a stronger culture in govt. around data-driven decision-making and policymaking, could improve on this in the long run. Not to mention that much of the pre-existing paperwork could be automated (if implemented carefully).
>The amount of meta-work required to consolidate and annotate data we collect, in order to prepare it for public consumption, seems likely to hurt government efficiency.
I (briefly) thought the same thing when I first read about it, but I think the efficiency gained from having digital, standardized formats will eventually outweigh the inefficiencies of the initial conversion to that format.
I'm also happy that otherwise "dead" data (e.g. papers sitting in boxes in a basement somewhere) could now be used more effectively in aggregate to further increase operating efficiency. Imagine trying to put together a comparison of a specific subset of finance reports across departments when Department A uses one digital format, Department B uses another, and Departments C through Z have theirs in boxes. What would have been a bureaucratic headache _before you even get to data munging_ becomes a job that's easier on all fronts, and that data can then be used to fight back against otherwise invisible inefficiencies.
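To sketch what that looks like once everything is machine-readable (the departments and filenames are invented, and I'm assuming pandas):

```python
# Invented example: combining finance reports published in different formats.
import pandas as pd

frames = [
    pd.read_excel("dept_a_finance.xlsx"),  # Department A: Excel
    pd.read_csv("dept_b_finance.csv"),     # Department B: CSV
    # Departments C through Z: paper in boxes, no programmatic path at all
]

# With a standardized schema this concat is the whole job; without one,
# each frame needs its own column-renaming and cleaning pass first.
combined = pd.concat(frames, ignore_index=True)
```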
>The amount of meta-work required to consolidate and annotate data we collect, in order to prepare it for public consumption, seems likely to hurt government efficiency.
As a data analyst working for a state government, I can tell you that not consolidating data or creating metadata really hurts my efficiency. I've gotten too comfortable munging tables out of PDFs.
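For anyone wondering what "creating metadata" means in practice, here's a rough sketch loosely following the Frictionless Data Table Schema convention (the field names are invented, not from any real publication):

```python
# Invented example of a machine-readable data dictionary for a published table,
# loosely following the Frictionless Data Table Schema layout.
schema = {
    "fields": [
        {"name": "fiscal_year", "type": "year", "description": "State fiscal year"},
        {"name": "agency_id", "type": "string", "description": "Reporting agency code"},
        {"name": "amount_usd", "type": "number", "description": "Expenditure in US dollars"},
    ],
    # Missing values declared up front, instead of left as undocumented sentinels
    "missingValues": ["", "NA"],
}
```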
>it appears to ignore the fact that non-sensitive information, in sufficient quantity and correlation, becomes sensitive information
This is something we're trying to figure out. The problem is, I doubt many agencies are actually maintaining privacy in their publications. The Census Bureau is adopting differential privacy strategies [0], but my own agency still relies on practices from the days of printed reports. I know for a fact that some of them don't work, but government is slow to adapt.
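For the curious: the core idea behind differential privacy is simple to sketch, even though real deployments involve far more machinery. A minimal example of the Laplace mechanism (the epsilon and the count are illustrative):

```python
# Minimal sketch of the Laplace mechanism underlying differential privacy.
# A count query has sensitivity 1 (adding or removing one person changes it
# by at most 1), so Laplace(1/epsilon) noise makes the release epsilon-DP.
import numpy as np

def dp_count(true_count: int, epsilon: float, rng=None) -> float:
    rng = rng or np.random.default_rng()
    return true_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

# Smaller epsilon means stronger privacy and a noisier answer.
print(dp_count(true_count=1234, epsilon=0.5))
```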
I am a big supporter of government and do not consider efficiency a primary objective (a good one, but secondary).
To make a cartoony analogy: flight security would be more efficient if everyone flew naked with no hand luggage, but that would defeat the purpose of people traveling from place to place for their own reasons.
Likewise: the government has collected or generated that info; let's put it into a reasonably clean and accessible format so others (who, in the US, have funded its collection/generation anyway) can build upon it.
The inefficiencies of correctly recording and distributing data will, I think, be greatly outweighed by the increased efficiencies of having standardized, machine-readable data that's easy to access and use. I work at a think tank that uses various government data sets across agencies and jurisdictions, and the cleaning that goes into analysis is a nightmare. Some agencies have their own quirky conventions: I've seen "-1" used as a flag for "no data," which, as you can imagine, returned some strange results on analysis. A regulation that says "publish data and do it precisely this way" will be a welcome one for me.
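As a toy illustration of why those sentinel values bite (the numbers are made up):

```python
# Made-up numbers: a "-1 means no data" convention silently skews aggregates.
import numpy as np
import pandas as pd

df = pd.DataFrame({"enrollment": [120, -1, 95, -1, 210]})

print(df["enrollment"].mean())  # 84.6: the sentinels drag the mean down

clean = df["enrollment"].replace(-1, np.nan)
print(clean.mean())             # ~141.7: missing values properly excluded
```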
I agree when this is bolted onto existing data, but if data collections are properly designed to be findable, accessible, interoperable, and reusable from the start, I think the long-term data-management burden drops thanks to more efficient processes.
I think closed data, or data locked in PDFs, masks a lot of technical debt in the form of manual labor and expensive proprietary licenses (looking at you, SAS, for archived data sets).
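(For what it's worth, pandas can read SAS's on-disk formats without a SAS license, which is one escape hatch for those archives; the filename below is made up.)

```python
# pandas reads SAS's sas7bdat and xport formats without a SAS license,
# which is one way to liberate archived data sets. Filename is hypothetical.
import pandas as pd

df = pd.read_sas("archived_dataset.sas7bdat")
df.to_csv("archived_dataset.csv", index=False)  # re-publish in an open format
```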
With the right tools most of it could be automatic, and eventually it will be a non-concern. It's getting up to speed that is painful, and getting everyone on board.