Got to love awk. My weapon of choice for ad hoc arbitrary text processing and data analysis. I’ve tried to replace it with more modern tools time and again but nothing else really comes close in that domain.
Completely agree. Even Python, which has a very low barrier to entry to "read file, possibly csv, do something" has a barrier to entry. Column projections are one-liners in AWK, and aggregates and/or some stats can be a couple of lines in an AWK script proper.
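To make that concrete, here is a minimal sketch, assuming a comma-separated input with a numeric second column. The function names and the column layout are made up for illustration.

```shell
# Column projection: print the 1st and 3rd comma-separated fields of each line.
project_cols() {
  awk -F, '{ print $1, $3 }' "$@"
}

# Aggregate: print the sum and the mean of the 2nd field.
# The NR check avoids a division by zero on empty input.
sum_and_mean() {
  awk -F, '{ sum += $2 } END { if (NR > 0) print sum, sum / NR }' "$@"
}
```

Both read from the files given as arguments, or from stdin if there are none, e.g. "project_cols data.csv" or "cat data.csv | sum_and_mean".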
I've been replacing some ad hoc bash scripts (nothing fancy, just a few if conditions and some output formatting for a deployment) with AWK, and it's so much handier than bash both to write (after 10 years I still can't remember bash's if syntax) and to read (it's a proper programming language).
Interestingly, Ruby (MRI, anyway) has command-line options that make it act much like awk:
"-n" adds an implicit "while gets ... end" loop. "-p" does the same but also prints the contents of $_ at the end of each iteration. "-e" lets you put an expression on the command line. "-F" specifies the field separator, as in awk. "-a" turns on auto-split mode when used with -n or -p, which basically adds an implicit "$F = $_.split" at the top of the "while gets ... end" loop.
So "ruby -[p or n]a -F[some separator] -e '[expression that gets run once per loop iteration]'" is good for tasks that suit awk-like processing but where you may need functionality beyond what awk provides.
You probably know it, but in case not, and for others who might not know: Ruby was influenced by Perl, and Perl was influenced by awk. (Both Ruby and Perl were influenced by other languages too.) And (relevant to this thread) Perl was influenced by C, sed, and Unix shell too.
>(Even though the one-liner turned out to be a bit more difficult to write than I thought at first due to '\r' characters in the input file)
Although you've already solved it, another option is to pipe the input through a filter like dos2unix first. Such a filter is very easy to write, and versions can be found or written in many languages. Essentially, you just read each character from stdin and write it to stdout, unless it is a '\r' (a.k.a. carriage return, a.k.a. ASCII character 13), in which case you don't write it.
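A minimal sketch of such a filter, using standard tools rather than a hand-written loop (the function names are made up). The first version drops every '\r' it sees, exactly as described above. The second is a slightly more conservative variant that only strips a '\r' at the end of a line, which is closer to what dos2unix does.

```shell
# Copy stdin to stdout, deleting every carriage return (ASCII 13).
strip_cr() {
  tr -d '\r'
}

# Variant: only remove a carriage return that ends a line (CRLF -> LF).
strip_trailing_cr() {
  awk '{ sub(/\r$/, ""); print }' "$@"
}
```

Then the original one-liner can stay simple: "strip_cr < input.txt | awk '...'".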
I've often found that beginners these days don't know what carriage return, line feed, etc. are, or what their ASCII codes are. Basic but important stuff for text processing.