
I would love to have a command-line tool that reads CSV and has a ton of features to cover different quirks and errors, which can output cleaner formats that I can pipe into other command-line tools.

csvkit [0] might be that tool; I discovered it after my last painful encounter with CSV files and haven't used it in anger yet. Among other things, it translates CSV to JSON, so you can compose it with jq.

[0] https://csvkit.readthedocs.io/en/latest/index.html
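For illustration, here is a minimal sketch of the CSV-to-JSON step using only the Python standard library. It is not csvkit's implementation, and the function name `csv_to_json_lines` is made up for this example. It emits one JSON object per row, a shape jq can consume directly.

```python
import csv
import io
import json


def csv_to_json_lines(text: str) -> str:
    """Convert CSV text (with a header row) to newline-delimited JSON.

    Hypothetical helper for illustration; every value stays a string,
    because no type inference is attempted.
    """
    # newline="" leaves line endings untouched so the csv module can
    # handle newlines embedded in quoted fields itself.
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return "\n".join(json.dumps(row) for row in reader)


if __name__ == "__main__":
    sample = 'id,city\n1,"Paris, FR"\n2,Oslo\n'
    print(csv_to_json_lines(sample))
```

Piping that output into something like `jq .city` then works on clean fields, with the quoted comma no longer in the way.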




I love these TSV utilities: https://github.com/eBay/tsv-utils Granted they're for "T"sv files, not "C"sv, but there's a handy `csv2tsv` utility included.

There's also `xsv`, from the author of `ripgrep`: https://github.com/BurntSushi/xsv



Seconding the recommendation for xsv; I've used it extensively and it works great.


Use miller and never look back.

https://miller.readthedocs.io/en/latest/10min.html

It's so much faster than csvkit.


At my last employer, I built a filter program, creatively called CSVTools[0], to do something like this. One piece of the project parses CSVs and replaces the commas/newlines (in an escaping- and multiline-aware manner, of course) with ASCII record/unit separator characters[1] (0x1E and 0x1F); the other piece converts that format back into well-formed CSV files. I usually used this with GNU awk, and reconfigured RS[2] and FS[3] appropriately. Or you can set just the input separators (RS/FS), leave the output separators (ORS/OFS) at their defaults, and produce plaintext output from AWK.

[0]: https://bitbucket.org/rbr/csvtools

[1]: https://en.wikipedia.org/wiki/Delimiter#ASCII_delimited_text

[2]: https://www.gnu.org/software/gawk/manual/html_node/awk-split...

[3]: https://www.gnu.org/software/gawk/manual/html_node/Field-Sep...
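As a rough sketch of the round trip described above (not the actual CSVTools code; the function names are invented for this example), the two halves could look like this in Python. It assumes fields never contain the 0x1E or 0x1F bytes themselves.

```python
import csv
import io

RS = "\x1e"  # ASCII record separator: replaces the row-ending newline
US = "\x1f"  # ASCII unit separator: replaces the field-separating comma


def csv_to_ascii_delimited(text: str) -> str:
    """Parse CSV (quoting- and multiline-aware) and re-emit it with
    US between fields and RS after every record."""
    reader = csv.reader(io.StringIO(text, newline=""))
    return "".join(US.join(row) + RS for row in reader)


def ascii_delimited_to_csv(data: str) -> str:
    """Convert the RS/US-delimited form back into well-formed CSV."""
    records = data.split(RS)
    if records and records[-1] == "":
        records.pop()  # drop the empty piece after the final RS
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\n")
    for record in records:
        writer.writerow(record.split(US))
    return out.getvalue()


if __name__ == "__main__":
    sample = 'name,note\n"Smith, J","line one\nline two"\n'
    encoded = csv_to_ascii_delimited(sample)
    print(repr(encoded))
    print(ascii_delimited_to_csv(encoded), end="")
```

Once the data is in that form, embedded commas and newlines are ordinary characters, so an awk invocation along the lines of `awk -v RS='\036' -v FS='\037' '{ print $2 }'` (octal escapes for the same two bytes) can split records and fields without knowing anything about CSV quoting.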


Good idea! Looks similar to something I wrote called csvquote https://github.com/dbro/csvquote , which enables awk and other command line text tools to work with CSV data that contains embedded commas and newlines.


csvtool is also nice.[0][1] csvkit is very flexible and can certainly be used in anger, but is a bit finicky; you almost always want to use the -I (--no-inference) option. Additionally, I wrote a tiny Perl script for quick awk-like oneliners.[2]

[0] https://github.com/Chris00/ocaml-csv [1] https://colin.maudry.com/csvtool-manual-page/ [2] https://github.com/gpvos/csved/blob/master/csved


"q" is the tool you're looking for: http://harelba.github.io/q/ . Impossible to Google for, indispensable for CSV manipulation.


> As of version 2.0.9, there's no need for any external dependency. Python itself (3.7), and any needed libraries are self-contained inside the installation, isolated from the rest of your system.

Oh, sh*t. I will look for something else.


Is your concern that you don't want to use a library that depends on python?


Nope. I don't want to use a tool that depends on an unsupported copy of Python.



