
Just format the identifier cells as text. I've also had this problem, and this is how I solved it.



That's fine for an individual working on a specific set of data in a specific sheet, but this isn't just a problem for one individual or a small team that can solve it and move on. It's a systemic problem throughout many vast organisations, with continuous influxes of new personnel and constantly changing requirements. When you get an XLS sheet sent over from another team that already made this mistake, it's too late, and this happens all the time.


> Just format the identifier cells as text

CSV is text. If you mean in Excel, if you opened it in Excel (rather than importing and choosing non-default options), you've already lost the data so formatting doesn't help you.
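To make the "already lost the data" point concrete, here is a rough sketch of what happens when an identifier is coerced to a number and written back out. This is not Excel's actual algorithm (Excel has its own precision and display rules); `coerce_like_a_spreadsheet` and `round_trip` are hypothetical names that just mimic the general effect with a plain float.

```python
def coerce_like_a_spreadsheet(cell: str):
    """Illustrative only: treat anything that parses as a number as a float."""
    try:
        return float(cell)
    except ValueError:
        return cell  # not numeric, so it stays text


def round_trip(cell: str) -> str:
    """Coerce a cell, then turn it back into text, as a save would."""
    value = coerce_like_a_spreadsheet(cell)
    if isinstance(value, float) and value.is_integer():
        return str(int(value))
    return str(value)


# Made-up identifiers: one with leading zeros, one too long for a float, one non-numeric.
for original in ["00123", "1234567890123456789", "AB-42"]:
    print(repr(original), "->", repr(round_trip(original)))
```

The leading zeros and the low-order digits of the long identifier are gone after the first conversion, so reformatting the column as text afterwards cannot bring them back.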


Yes, I mean Excel. We have CSV to XLS import scripts/forms that format identifier cells as text. The data format is standardised. Using templates to do the imports was the dumb part. Microsoft has a Power BI tool if one doesn't want to write or use scripts. Use that. I assume a government agency has the resources to pay for it and for data scientists.

https://powerbi.microsoft.com/en-us/
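For the scripted route, the core idea can be sketched roughly like this: read every cell as text and convert only the columns an explicit schema says are numeric. This is a minimal illustration, not our actual import script; the column names `patient_id` and `amount`, the `NUMERIC_COLUMNS` set, and `load_rows` are all hypothetical.

```python
import csv
import io
from decimal import Decimal

# Hypothetical schema: only the columns listed here are converted to numbers.
# Everything else, including identifiers, stays a string.
NUMERIC_COLUMNS = {"amount"}


def load_rows(csv_text: str) -> list[dict]:
    """Parse CSV text, leaving every column as text unless the schema says otherwise."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for name in row.keys() & NUMERIC_COLUMNS:
            row[name] = Decimal(row[name])  # Decimal keeps the digits exactly as written
        rows.append(row)
    return rows


# Made-up sample data for illustration.
sample = "patient_id,amount\n00123,19.90\n1234567890123456789,5\n"
for row in load_rows(sample):
    print(row)
```

The point of the design is that conversion is opt-in per column, so an identifier can never be silently reinterpreted as a number.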

Thanks for bothering to respond instead of downvoting.


> I assume a government agency has the resources to pay for it and for data scientists

First, the upthread commenter said "healthcare", not "government agency".

Second, as someone who has worked in public sector healthcare: HA HA HA!

I mean, sure, we have the resources to pay for data scientists (of which we have quite a few) and could conceivably afford to develop custom scripts for any CSV subformat that we decided we needed one for (though if it's a regular workflow, we're probably acquiring it and importing it into a database without nontechnical staff even touching it, and providing a reporting solution and/or native Excel exports for people who need it in Excel).

The problem is that when people who aren't technical staff or data scientists encounter and try to use CSVs (often without realizing that's what they are) and produce problems, it's usually well before the kind of analysis which would go into that. If it's a regular workflow that's been analyzed and planned for, we have probably either built specialized tools or at least the relevant unit has desk procedures. But the aggregate of the stuff outside of regularized workflows is...large.


Nearly all end-users open a CSV like this:

1. They see a file (they have file extensions turned off, which is the default, so they probably don't even know what a CSV is)

2. They double click it

Excel has now corrupted the data. That is the problem. Good luck teaching all end-users how to use Excel properly.


> They have file extensions turned off, which is the default, so they probably don't even know what a CSV is

And if, also by default, Excel is set up with an association with CSVs, the CSV file will, in addition to not having an extension to identify it, have an icon which identifies it with Excel.


> I assume a government agency has the resources to pay for it and for data scientists.

Bold strategy there, let's see how that plays out.

Having been in and around military / DoD usages for a long time, I can tell you it's always an uphill battle to get processes to work well, instead of defaulting to whatever happened to get included in the original spec by some incompetent who wasn't even aware of good practice.



