Some more tips from someone who does this every day. 1) Be careful with CSV file...

collyw · on Aug 28, 2014

Actually I would say Perl is more appropriate. I went back to Perl after 4 years for this sort of task, as it has so many features built into the syntax. Plus it can be run as a one-liner.

etrain · on Aug 29, 2014

I'm reminded of the old joke, "Python is executable pseudocode, while Perl is executable line noise."

But seriously, I've got some battle scars from the Perl days, and hope not to revisit them. Honestly, there's very little I find I can do with Perl and not Python, and in Python it's just as easy to express (if not quite as concise) and much simpler to maintain.

But, use the tool that works for you!

collyw · on Aug 29, 2014

I use Python and Django most of the time, and it's true, you can do pretty much the same thing in each language. But for quick hacky stuff manipulating the filesystem a lot, Perl has many more features built into the language. Things like regex syntax, globbing directories, backticks to execute Unix commands, and the fact you can use it directly from the command line as a one-liner. You can do all these (except the last one?) in Python, but Perl is quicker.
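For comparison, here is a rough sketch of how those three things (regex matching, globbing a directory, capturing a command's output) might look in Python using only the standard library. The helper names `list_matching`, `capture`, and `extract_year` are made up for illustration; the point is only that each one needs an import and a function call where Perl has dedicated syntax.

```python
import glob
import os
import re
import subprocess


def list_matching(directory, pattern):
    # Glob a directory, roughly what Perl's glob() does.
    # Returns bare file names, sorted for a stable order.
    paths = glob.glob(os.path.join(directory, pattern))
    return sorted(os.path.basename(path) for path in paths)


def capture(command):
    # Run a command (given as an argument list) and return its stdout
    # as text, roughly what Perl backticks do. Raises if the command fails.
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout


def extract_year(text):
    # Regex match with one capture group; returns None when nothing matches.
    match = re.search(r"(\d{4})", text)
    return match.group(1) if match else None
```

All of it works, but it is noticeably more typing than the Perl equivalents, which is probably the "Perl is quicker" point.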

vram22 · on Aug 30, 2014

>But for quick hacky stuff manipulating the filesystem a lot, Perl has many more features built into the language. Things like regex syntax, globbing directories, backticks to execute Unix commands

All good points.

>you can use it directly from the command line as a one-liner. You can do all these (except the last one?) in Python

You can use Python from the command line too, but Perl has more features for doing that, like the -n and -p flags. Then again, Python has the fileinput module. Here's an example:

http://jugad2.blogspot.in/2013/05/convert-multiple-text-file...
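In case the link goes stale, here is a minimal sketch (not the code from that post) of the `fileinput` idea: loop over every line of the files named on the command line, or stdin if there are none, and print each line after a substitution, which is roughly what `perl -pe 's/.../.../'` does. The names `substitute_lines` and `main` are just for illustration.

```python
import fileinput
import re
import sys


def substitute_lines(lines, pattern, replacement):
    # Yield each input line with the regex substitution applied.
    regex = re.compile(pattern)
    for line in lines:
        yield regex.sub(replacement, line)


def main(paths, pattern, replacement):
    # fileinput.input() chains the given files together line by line;
    # with an empty list of paths it reads from stdin instead.
    with fileinput.input(files=paths) as lines:
        for line in substitute_lines(lines, pattern, replacement):
            sys.stdout.write(line)
```

Calling something like `main(sys.argv[1:], r"foo", "bar")` at the bottom of a script would give it the usual filter behaviour. `fileinput.input()` also accepts `inplace=True` if you want the equivalent of editing files in place.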

ars · on Aug 28, 2014

> you almost inevitably want to run sort before you run uniq

And then you don't actually want uniq anyway since sort has a -u switch that removes duplicate lines.

potatosareok · on Aug 28, 2014

What if you want uniq -c? Is there any simple way to replicate that functionality better than ... sort | uniq -c?

ars · on Aug 28, 2014

Then you run uniq -c (which I do all the time).

But for the examples in the main article sort -u would be fine.
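If you ever want the same two results outside the shell, a small Python sketch follows. The function names are made up for illustration, and the ordering is by Python's default string comparison, which may differ from what `sort` gives under your locale.

```python
from collections import Counter


def sort_unique(lines):
    # Same idea as `sort -u`: drop duplicates, then sort.
    return sorted(set(lines))


def sort_uniq_count(lines):
    # Same idea as `sort | uniq -c`: count each distinct line,
    # then report (count, line) pairs in sorted line order.
    counts = Counter(lines)
    return [(counts[line], line) for line in sorted(counts)]
```

The counting version uses a hash table, so it does not need the input sorted first; only the distinct lines get sorted at the end.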