A tool for capturing captions and transcripts from online videos (simonwillison.net)
123 points by mzs on Sept 30, 2022 | 28 comments



Wow wish it could integrate with this:

https://filmot.com/

I'm constantly finding myself remembering a part of a video I saw a long time ago (e.g. Robert Sapolsky's hour-long lectures on human behavioral biology), and I've wasted more than a few days just trying to figure out which of his two dozen lectures it was. Filmot solved this for me, but I still have this problem with badly captioned videos or non-YouTube videos. This tool seems perfect for building a personal, searchable collection.


Wow that's cool. I've been using Whisper from a script I wrote which reads my Dropbox videos, transcribes them, and uploads both to Notion. If anyone's interested feel free to reach out. [0]

I may pivot to this GitHub Action so my CPU doesn't explode.

[0] jack at koptional dot com
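
A rough sketch of what the Whisper step of a script like that could look like. This is not the commenter's actual code: it assumes the open-source `openai-whisper` package and a local folder of already-synced videos. `find_videos`, `transcribe_folder`, and the suffix list are hypothetical names, and the Dropbox and Notion parts are left out entirely.

```python
from pathlib import Path

# Assumption: which file extensions count as "videos" in the synced folder.
VIDEO_SUFFIXES = {".mp4", ".mov", ".mkv"}


def find_videos(folder: Path) -> list[Path]:
    """Return the video files directly inside `folder`, sorted by path."""
    return sorted(p for p in folder.iterdir() if p.suffix.lower() in VIDEO_SUFFIXES)


def transcribe_folder(folder: Path, model_name: str = "base") -> dict[str, str]:
    """Map each video's file name to its Whisper transcript text."""
    # Imported lazily so find_videos() still works without whisper installed.
    import whisper

    model = whisper.load_model(model_name)
    return {p.name: model.transcribe(str(p))["text"] for p in find_videos(folder)}
```

Running `transcribe_folder(Path("videos"))` on a CPU-only machine will be slow for long videos, which is the motivation for moving the work somewhere else.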


Why not put it in a gist or GitHub repo and share it here?



There is a serverless machine learning course that includes GH Actions to implement serverless feature pipelines and serverless batch inference pipelines.

https://github.com/featurestoreorg/serverless-ml-course

Disclaimer: I am involved in it.


youtube-dl has --embed-subs and --convert-subs (currently supported: ass, lrc, srt, vtt)
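
A minimal sketch of how those flags could be combined, assuming `youtube-dl` is on the PATH. The helper name `embed_subs_command` and its defaults are made up for illustration; the function only builds the argument list, so nothing is downloaded until you pass it to `subprocess.run`.

```python
def embed_subs_command(url: str, lang: str = "en", fmt: str = "srt") -> list[str]:
    """Build a youtube-dl command that downloads a video with embedded subtitles."""
    return [
        "youtube-dl",
        "--write-sub",          # fetch uploader-provided subtitles if present
        "--write-auto-sub",     # also fetch auto-generated captions (YouTube only)
        "--sub-lang", lang,     # which subtitle language(s) to fetch
        "--convert-subs", fmt,  # one of: ass, lrc, srt, vtt
        "--embed-subs",         # mux subtitles into the file (mp4, webm, mkv only)
        url,
    ]


# Usage (hypothetical URL):
#   import subprocess
#   subprocess.run(embed_subs_command("https://www.youtube.com/watch?v=VIDEO_ID"), check=True)
```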

The automatic thing is interesting, and I'll have to check how to use Whisper. I have a ton of old DVD rips that either have no subtitles or have OpenSubtitles files with timing issues severe enough to make them unusable with a dumb player (like a Roku).


I just use youtube-dl to get the youtube ttf captions and run them through a python script to clean up the timestamps and format somewhat readably. There are tons of transcription errors but it's still a lot faster than watching the stupid video.
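
A cleanup script along those lines might look roughly like this. It is a sketch, not the script described above, and it assumes the captions were saved as WebVTT in the style YouTube produces for auto-captions (inline timing tags, and the same line repeated across consecutive cues). `vtt_to_text` is a hypothetical name.

```python
import re

# Cue timing lines, e.g. "00:00:01.000 --> 00:00:04.000 align:start position:0%"
TIMING = re.compile(r"^(\d{2,}:)?\d{2}:\d{2}\.\d{3}\s+-->\s+")
# Inline tags such as <c>, </c> or <00:00:01.500> embedded in auto-captions
TAG = re.compile(r"<[^>]+>")


def vtt_to_text(vtt: str) -> str:
    """Strip WebVTT headers, timings and inline tags; drop consecutive repeats."""
    lines: list[str] = []
    for raw in vtt.splitlines():
        line = TAG.sub("", raw).strip()
        if not line or TIMING.match(line):
            continue
        if line.startswith(("WEBVTT", "Kind:", "Language:", "NOTE")):
            continue  # file header and metadata lines
        if lines and lines[-1] == line:
            continue  # auto-captions repeat the previous line in the next cue
        lines.append(line)
    return " ".join(lines)
```

This does nothing about the transcription errors themselves; it only makes the text quicker to skim.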


Would this give you a similar but quicker result?

https://you-tldr.com/

(No affiliation).


If it were a DirectShow filter, we could just connect: container demuxer -> audio decoder -> transcriber -> translator -> subtitle renderer :)


> any activity that places a burden on our servers, where that burden is disproportionate to the benefits provided to users (for example, don't use Actions as a content delivery network or *as part of a serverless application*, but a low benefit Action could be ok if it’s also low burden); or

Not a lawyer but pretty sure that is a violation of their ToS


I'm very confident that what I've built here fits the set of things that you are allowed to do with Actions.

The workflow I've written here is a shortcut for writing content directly to the repository. You could go and run the commands on your laptop and copy-and-paste the extracted captions into a file and push them to the repo... but Actions are specifically designed to automate that kind of process.

(Also: I've shown this to GitHub people who have worked on Actions and they thought it was really cool.)


Being confident/cool is irrelevant if GH legal decides that this isn't permitted under their ToS.

I would have reached out to GH to ask for permission instead of asking for forgiveness.


I doubt GitHub have the support capacity to handle everyone pinging them to ask permission any time they want to do something interesting with Actions.

I'll take my chances. If they tell me it's not a supported use-case, I'll update the project to tell people they shouldn't use it.


Keep building and ignore the haters. I'm sure GitHub deals with actual abuse of GitHub Actions (like trying to mine crypto) on a regular basis. This is neat and interesting, and at most they'll rate limit it if it gets too popular. Plus you're connecting to a hosted paid service for the GPU backend side, so it's not all CPU time.


I'm not a hater, I'm a realist. Services like this have a free tier to encourage paid accounts. When people abuse that free tier, everyone else suffers. It is not much effort to ask the support team for permission. I've also been on a devops team that had to run services like this, and it really isn't fun when people abuse them. It is a lot of extra work.


Exactly, they would probably say no since that is the easiest answer.

Now that you're top of HN and they might see more abuse of their systems, it'll just come more quickly.

Great work on the actions though, it is a pleasure to read the source code. Learning a few tricks in there.


I think

> if using GitHub-hosted runners, any other activity unrelated to the production, testing, deployment, or publication of the software project associated with the repository where GitHub Actions are used.

is far more pertinent, and can be solved by self-hosting a runner.


Can you self-host a runner outside of GH enterprise?

EDIT: TIL you can! That's wild.


Also check out https://text-generator.io which is over 8x cheaper than Google for speech to text, 5.5 hours free every month too


You should disclose that this is your business, as the wording makes it seem like an unbiased recommendation.


In the repo he says $0.20/min, which seems quite high to me even for a roll-your-own setup like this. But I noticed that Otter.ai downscaled the audio-to-text minutes/month allotments on their paid and free tiers as of last week, so what used to get you 6000 now gets you 1200. They also capped video file transcriptions, so I wonder if costs are going up for some reason?


This is an amazing "misuse"/hack of GitHub Actions and probably something that will cause major headaches for us in the future if they decide to crack down on it. I love it.


why is it a misuse?


See the above comment: https://news.ycombinator.com/item?id=33037494

Kinda breaks the spirit of GHA, IMO. I like it, but I think it's a bad path to start down. Entirely IMO, keep in mind.


This is the wrong link - this is just to a demo of the system.

https://simonwillison.net/2022/Sep/30/action-transcription/ is my full write-up of the project

https://github.com/simonw/action-transcription is the project repository.


Ok, we've changed to the first link from https://github.com/simonw/action-transcription-demo. Thanks!


Thanks!


Sorry I can't seem to edit the submission anymore, but you edited the readme thankfully.



