To the extent the command-line program returns proper exit codes, or gives other hints via its output or side effects as to whether it did anything useful, I could see it working up to a point.
Discovering mutually exclusive options, or options that must be specified together, sounds like exactly the sort of thing it could deduce, when every invocation with/without certain pairs of options fails.
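As a rough sketch of that kind of deduction (not a real tool; `probe_pairs`, `exits_ok`, `fake_cli`, and the option names are all made up for illustration, and it assumes exit status 0 reliably means "worked"), something like this would do it:

```python
import subprocess
from itertools import combinations


def exits_ok(argv):
    """Run a command and report whether it exited with status 0."""
    result = subprocess.run(
        argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode == 0


def probe_pairs(options, succeeds):
    """Classify option pairs by which with/without combinations succeed.

    `succeeds` is any callable taking a list of options and returning a bool.
    """
    exclusive, together = [], []
    for a, b in combinations(options, 2):
        only_a = succeeds([a])
        only_b = succeeds([b])
        both = succeeds([a, b])
        if only_a and only_b and not both:
            # Each works alone, but the pair fails: likely mutually exclusive.
            exclusive.append((a, b))
        elif both and not only_a and not only_b:
            # Neither works alone, but the pair does: likely required together.
            together.append((a, b))
    return exclusive, together


def fake_cli(args):
    """Stand-in for a real program, so the sketch runs without one."""
    given = set(args)
    if {"--quiet", "--verbose"} <= given:
        return False  # hypothetical rule: cannot combine these two
    if ("--user" in given) != ("--password" in given):
        return False  # hypothetical rule: these two must appear together
    return True


exclusive, together = probe_pairs(
    ["--quiet", "--verbose", "--user", "--password"], fake_cli
)
print("exclusive:", exclusive)
print("together:", together)
```

Against a real program you'd pass something like `lambda opts: exits_ok(["sometool", *opts])` instead of `fake_cli`, where `sometool` is a placeholder. It only finds pairwise rules, and it is only as trustworthy as the program's exit codes.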
Perhaps wrapper utilities like 'Gooey', or some sort of automated probing, could encourage greater formalization in specifying de facto command-line APIs. Then, command-line programs might want to offer not just the human-interpretable help texts and docs, but more rigorous usage specs that could drive automatic wrappers.
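For a sense of what a machine-readable usage spec can look like, Python's argparse already lets a program declare some of these constraints, and a wrapper can read them off the parser rather than guess. A minimal sketch, where the program name, the options, and the `accepts` helper are invented for the example:

```python
import argparse


def build_parser():
    """Declare the options, including a mutual-exclusion constraint."""
    parser = argparse.ArgumentParser(prog="sometool")
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--quiet", action="store_true")
    group.add_argument("--verbose", action="store_true")
    return parser


def accepts(parser, argv):
    """Return True if the parser accepts argv; argparse exits on a usage error."""
    try:
        parser.parse_args(argv)
        return True
    except SystemExit:
        return False


parser = build_parser()
print(accepts(parser, ["--quiet"]))
print(accepts(parser, ["--quiet", "--verbose"]))
```

This only covers what argparse can express directly (it has no built-in "must be given together" group, as far as I know), so a fuller spec format would need more than this.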
You'd of course do it inside a controlled environment, via virtualization or similar technologies that let you monitor, veto, or roll back anything the analyzed program does. (And if it's malicious/sneaky enough to escape that, then it wasn't safe to use in a manual fashion, either.)