
With gemini-cli and claude-cli you can now prompt the agent in natural language while it, in turn, drives ffmpeg, and it does work.

Yeah, you can give an LLM queries like “make this smaller with libx265 and add the hvc1 tag” or “concatenate these two videos” and it usually crushes it. They have a similar level of mastery over imagemagick, too!
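For reference, the commands it lands on are typically along these lines (a sketch; the filenames and the CRF value are illustrative):

    # "make this smaller with libx265 and add the hvc1 tag"
    ffmpeg -i input.mp4 -c:v libx265 -crf 28 -tag:v hvc1 -c:a copy output.mp4

    # "concatenate these two videos" (concat demuxer; assumes matching codecs)
    printf "file 'a.mp4'\nfile 'b.mp4'\n" > list.txt
    ffmpeg -f concat -safe 0 -i list.txt -c copy joined.mp4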

Yeah, LLMs have honestly made ffmpeg usable for me for the first time. The difficulty of constructing commands is not really ffmpeg's fault; it's an artifact of how much power the tool has and how hard it is to shoehorn that power into the flags of a single CLI. The command line is just not the ideal human interface to ffmpeg's functionality, but keeping it a CLI makes it much more useful as part of a larger, often automated workflow.

It's funny, because GPU stuff like what this article is about is exactly where LLMs fall apart. I can make any LLM produce volumes of hallucinations at the drop of a hat by asking it to construct ffmpeg commands that use hardware acceleration.
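(For contrast, a known-good hardware-accelerated invocation looks something like the sketch below; it assumes an NVIDIA GPU and an ffmpeg build with NVENC enabled, and the preset and filenames are illustrative.)

    ffmpeg -hwaccel cuda -hwaccel_output_format cuda -i input.mp4 \
        -c:v hevc_nvenc -preset p5 -c:a copy output.mp4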

Just seeking a clarification on how this would be done:

One would use gemini-cli (or claude-cli),

- and give a natural language prompt to gemini (or claude) on what processing needs to be done,

- with the correct paths to FFmpeg and the media file,

- and g-cli (or c-cli) would take it from there.

Is this correct?


Another option is to use a non-cli LLM and ask it to produce a script (bash/ps1) that uses ffmpeg to do X, Y, and Z to your video files. If using a chat LLM it will often provide suggestions or ask questions to improve your processing as well. I do this often and the results are quite good.
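The kind of script it hands back looks roughly like this (a sketch; the encoder settings and naming scheme are illustrative):

    #!/usr/bin/env bash
    # re-encode every .mp4 in the current directory to HEVC
    for f in *.mp4; do
        ffmpeg -i "$f" -c:v libx265 -crf 28 -tag:v hvc1 -c:a copy "${f%.mp4}_x265.mp4"
    done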

Yes. It works amazingly well for ffmpeg.
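For example (a minimal sketch; both CLIs accept a one-shot prompt via -p, and the path and wording here are illustrative):

    gemini -p "Use ffmpeg to re-encode ~/videos/input.mp4 with libx265, tag it hvc1, and save the result next to the original"

The agent proposes the ffmpeg command and, once you approve it, runs it for you.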

Thank you.

Curious to see how quickly each LLM picks up the new codecs/options.

I use the Warp terminal and I can ask it to run --help and it figures it out.

the canonical (if that's the right word for a 2-year-old technique) solution is to paste the whole manual into the context before asking questions
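For instance (a sketch; it assumes the CLI accepts piped stdin as context alongside -p, which current versions of both tools do):

    # feed ffmpeg's full built-in help into the model before asking
    ffmpeg -h full 2>&1 | claude -p "Given this help text, write a command that burns subs.srt into input.mp4"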

Gemini can now load context from a URL in the API (https://ai.google.dev/gemini-api/docs/url-context), but I'm not sure if that has made it to the web interfaces yet.
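Per those docs, enabling it in a raw API call looks roughly like this (the model name and prompt are illustrative):

    curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent" \
      -H "x-goog-api-key: $GEMINI_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "contents": [{"parts": [{"text": "Summarize https://ffmpeg.org/ffmpeg-filters.html"}]}],
        "tools": [{"url_context": {}}]
      }'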


