Are there any pipeline tools for command-line stream processing? When you have several terabytes of data, you can't exactly afford to restart because of a stray comma in your CSV file.
If you have a stray comma in your multi-TB CSV file, you probably don't _want_ it to keep going. You risk misinterpreting the mistake and producing grossly malformed output. There's no way to reliably and elegantly recover from something like that. Validation should preferably happen before processing.
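To make "validate first" concrete, here is a minimal sketch of a streaming pre-pass in Python. It assumes the first row is a header and that a stray comma shows up as a row with the wrong number of fields; `find_bad_rows` and `max_report` are hypothetical names for illustration, not part of any existing tool. It reads one row at a time, so memory use stays flat regardless of file size.

```python
import csv
import io


def find_bad_rows(lines, max_report=10):
    """Return (line_number, field_count) for rows whose width differs from the header.

    `lines` is any iterable of text lines (an open file, sys.stdin, ...).
    Stops after `max_report` problems so a badly broken file fails fast.
    """
    reader = csv.reader(lines)
    expected = None
    bad = []
    for row in reader:
        if expected is None:
            expected = len(row)  # assumption: the first row is the header
            continue
        if len(row) != expected:
            # reader.line_num counts physical lines read from the source so far
            bad.append((reader.line_num, len(row)))
            if len(bad) >= max_report:
                break
    return bad


# Tiny in-memory stand-in for a real file; line 3 has a stray comma.
sample = io.StringIO("id,name,qty\n1,apple,3\n2,pear,,4\n3,plum,5\n")
print(find_bad_rows(sample))  # [(3, 4)]
```

For a real file you would pass `open(path, newline="")` instead of the `StringIO` sample, and only start the expensive processing once the returned list is empty. This only catches width mismatches; a comma that is balanced out by a missing field elsewhere in the same row would still slip through.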