
Are ffmpeg users interested in a cloud-based solution?

You push the input files, the command, and fetch the output when done.


There's quite a few cloud services built around it already, but not usually so loose or general. I can see it being expensive to run.

The plan would be to charge at cost plus: whatever the user consumes, plus a markup.

Would you, or anyone else, be interested in ffmpeg in the cloud?

Connect a credit card, open a web UI, send the command and the files, and eventually get the output?
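For a sense of what that flow might look like from the user's side, here is a rough sketch; the service URL, endpoints, and job fields are all hypothetical, just to illustrate the push-command-fetch loop:

    import time
    import requests

    BASE = "https://ffmpeg-cloud.example.com/api"  # hypothetical service URL

    # Push the input file and the ffmpeg command to run on it.
    with open("input.mov", "rb") as f:
        job = requests.post(
            f"{BASE}/jobs",
            files={"input": f},
            data={"command": "-i input.mov -c:v libx264 -crf 23 output.mp4"},
        ).json()

    # Poll until the job is done, then fetch the output.
    while requests.get(f"{BASE}/jobs/{job['id']}").json()["state"] != "done":
        time.sleep(5)

    with open("output.mp4", "wb") as out:
        out.write(requests.get(f"{BASE}/jobs/{job['id']}/output").content)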


I would SO love it! I regularly take a look at the existing offerings, and there are a few options for "transcode video as an API". However, it's pretty costly: I regularly have batches of videos that would set me back 30 to 80 bucks if I were to transcode them in the cloud. I don't think it can be done at any price point I'd be happy with for this kind of personal project - especially considering that the alternative is just to max out my CPU for a day or two.

I see a few blind spots in the write-up.

1. Traffic for the new version was ramped up too quickly. I usually lobby for releasing updates slowly. This alone would have prevented the issue.

2. Tasks cannot be allowed to fail under load. Load shedding should be in place exactly for this reason: you don't bite off more than you can chew. If more arrives, you slowly and politely refuse the request. You need to be both slow and polite, so that the client retries slowly and you don't run into a thundering-herd issue (see the sketch after this list).

3. The monitoring issue should (most likely) have shown up as an increase in latency. That should have been enough to stop the deployment and roll back carefully.
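To make the load-shedding point concrete, here is a minimal sketch; the capacity limit, the 429 handling, and the jittered client backoff are illustrative, not anything from the Canva write-up:

    import random
    import time

    MAX_IN_FLIGHT = 100   # illustrative capacity limit
    in_flight = 0

    def do_work(request):
        return "ok"       # placeholder for the real task

    def handle(request):
        """Server side: refuse work beyond capacity instead of queueing it."""
        global in_flight
        if in_flight >= MAX_IN_FLIGHT:
            # Polite refusal: tell the client when to come back rather than timing out.
            return {"status": 429, "retry_after_s": 2}
        in_flight += 1
        try:
            return {"status": 200, "body": do_work(request)}
        finally:
            in_flight -= 1

    def call_with_backoff(send, request, attempts=5):
        """Client side: back off with jitter so shed requests don't all retry at once."""
        for attempt in range(attempts):
            resp = send(request)
            if resp["status"] != 429:
                return resp
            time.sleep(resp["retry_after_s"] * (2 ** attempt) + random.random())
        raise RuntimeError("service still overloaded after retries")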

I am sure the engineers at Canva had their reasons, and that the write-up does not account for everything. Just some food for thought for other engineers.


I am very biased on this topic.

I have started using AI coding assistants and I am not looking back.

This comes from an engineer who KEEPS telling the juniors on his team NOT to use GenAI.

The reality is that those tools are POWER TOOLS best used by engineers very well versed in the domain and in coding itself.

For them, it is really a huge time saver. The work is more like approving a PR from a quite competent engineer than writing the PR myself.

My tool of choice is Cline, which is great, but not perfect.

And the quality is 100% correlated to:

1. The model

2. The context window

3. How well I prompt it.

In reverse order of importance.

Even an OK model, well prompted, gives you satisfactory code.


> The reality is that those tools are POWER TOOLS best used by engineers very well versed in the domain and in coding itself.

I'm starting to get a feeling of dread that our entire engineering organization is digging itself into a hole with lots of buggy code being written which no one seems to understand, presumably written with heavy LLM assistance. Our team seems to be failing to deliver more, and quality has seemingly worsened, despite leaning in to these tools.

Reading Hacker News gives me the idea that LLMs are a miracle panacea, a true silver bullet. I think the positive stories I hear on Hacker News go through a big selection bias. It has always been the motivated people who utilize their tools to the best of their ability.

I definitely don't consider myself to be good in this regard either and struggle to use LLM tools effectively. Most of the time I would be happy with myself if I could just have a solid mental understanding of what the codebase is doing, never mind be a 10x AI enhanced developer.


In my experience, coding with AI is much more mentally taxing than coding without.

But it is much faster.

When I use AI I need to continuously review, direct, and manage the AI.

I go through every change and either agree with it, update nits, or regenerate code that is not up to par with a better or more specific prompt.

Not doing this exercise is disastrous for the codebase.

It really explodes in complexity in no time.

Moreover, it always tries to fix errors with more code, not with better code.


> have a solid mental understanding of what the codebase is doing

I think this is what truly matters no matter how or even if you're slinging code. I think this is what makes highly effective folks and also cleanly explains why high performers in one team or org can fail to deliver in another company or position.


Sorry for the late (and now in the wrong thread) reply; is https://news.ycombinator.com/item?id=42318876 still active? If so, happy to have a chat about it. If you want, let me know how to contact you best (feel free to send me mail to d10.pw).

Matches my observations, having used GitHub Copilot for several months. It's a POWER TOOL.

I wrote down the same concept in a more structured Substack post.

https://slowtechred.substack.com/publish/posts/detail/154138...


I am getting quite deep into coding with AI, and the cost of tokens is indeed a bit of an issue.

A trivial issue for me, because it saves me A LOT of time, but it could be a problem for new people testing it.

I would love to test this approach. Are you guys fine tuning for each codebase?


Yes, we fine-tune for each codebase. Now we are focusing on larger enterprise codebases that would: 1. benefit from the fine-tuning the most. 2. have the budget to pay us for the service. For smaller projects that are price-sensitive we are probably not a good fit at this point.

>>cost of tokens is a bit of an issue indeed

Their cost is $0.70 per 1M tokens.

DeepSeek is $0.14 / 1M tokens (cache miss).


DeepSeek is an amazing product but has a few issues:

1. Data is used for training

2. The context window is rather small and doesn't fit large codebases as well.

I keep saying this over and over in all the content I create: the value of coding with AI will come from working on big, complex, legacy codebases, not from flashy demos where you create a to-do app.

For that you need solid models with big context and private inference.


DeepSeek is open source and has a context length of 128k tokens.

The commercial service has a context of 64k tokens, which I find quite limiting.

https://api-docs.deepseek.com/quick_start/pricing

Running it locally is well beyond what's practical if the goal is to be productive while coding with AI.

Besides that, 128k is still significantly less than Claude's.


Shouldn't we be comparing with other open-source models? In particular, since this is about Llama 3.3, it has the exact same context limit of 128k [1].

[1] https://huggingface.co/meta-llama/Llama-3.3-70B-Instruct


Why?

Whenever using a model to be more effective as a developer I don't particularly care if the model is open source or closed source.

I would love to use open-source models as well, but the convenience of just plugging into an API endpoint is unbeatable.


Most of our white collar jobs are about knowledge sharing and synchronization between people.

And surprisingly, this is an aspect in which I see very, very little progress.

The most we have are tools like Confluence or Jira, which are actually quite bad in my opinion.

The bad part is how knowledge is shared. At the moment it is just formatted text with questionable search.

LLMs, I believe, can help synthesize what knowledge is there and what is missing.

Moreover, it would be possible to ask what is missing or what could be improved. And it would be possible to continuously test the knowledge base, asking the model questions about the topic and checking the answers.
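A rough sketch of what that continuous testing could look like; the `ask_llm` helper and the prompts are placeholders, since this works with whatever model API you already use:

    def ask_llm(prompt: str) -> str:
        """Placeholder for a call to whatever LLM API is available."""
        raise NotImplementedError

    def find_gaps(topic_title: str, topic_text: str, n_questions: int = 5) -> list[str]:
        """Quiz the knowledge base: generate questions, then check the text answers them."""
        questions = ask_llm(
            f"Write {n_questions} questions a new team member would ask about '{topic_title}'."
        ).splitlines()
        gaps = []
        for q in questions:
            verdict = ask_llm(
                "Answer the question using ONLY the text below, or reply 'MISSING' "
                f"if the text does not contain the answer.\n\nText:\n{topic_text}\n\nQuestion: {q}"
            )
            if "MISSING" in verdict:
                gaps.append(q)  # questions the knowledge base cannot answer yet
        return gaps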

I am working on a prototype and it is looking great. If someone is interested, please let me know.


Knowledge is power and people don't always want to share. Maybe it's more reflective of my company culture, but I've seen knowledge effectively hoarded and used strategically as a weapon at times.

It is visible everywhere. Some people hoard knowledge so that they stay important in the company. Some people hoard knowledge so that they can get more money from bug bounties. It is almost always about personal gain.

Of course, there is no upside to spending time updating documentation unless it is actually part of your job description or there is a legal requirement for the company.

If you put knowledge in a wiki, no one will read it and they will keep asking about stuff anyway.

Then if you put it there and keep it up to date, you open yourself up to a bunch of attacks from unhappy coworkers, who might use it as a weapon, nagging that you did not do a good job or finding gaps they can complain about.


I write documentation because I enjoy it and I see it as a tool for consolidating/solidifying my own knowledge.

Of course, but at least in my personal case it is more about the lack of tooling.

> LLMs I believe can help in synthesize what knowledge is there and what is missing.

How could the LLM help?

Given that it is missing the critical context and knowledge described in the article, wouldn’t it be (at best) on par with a new developer making guesses about a codebase?


The open domain frame problem is simply the halting problem.

https://philarchive.org/rec/DIEEOT-2

While humans and computers both suffer from the frame problem, LLMs do not have access to semantic properties, let alone the open domain.

This is related to why pair programming and self-organizing cross-functional teams work so well, btw.


As engineers we often aim for perfection, but oftentimes it is not really needed. And this is such a case.

Knowledge is organised into topics, and each topic has a title and a goal. Topics are made of markdown chunks.

I see the model being able to generate insightful questions about what is missing from the chunks, as well as synthesise good answers for specific queries.
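A minimal sketch of that structure, with illustrative names (this is not the actual prototype code):

    from dataclasses import dataclass, field

    @dataclass
    class Topic:
        """One unit of knowledge: a title, the goal it serves, and its markdown chunks."""
        title: str
        goal: str
        chunks: list[str] = field(default_factory=list)

        def gap_prompt(self) -> str:
            """Prompt asking the model what a reader would still be missing after the chunks."""
            body = "\n\n".join(self.chunks)
            return (
                f"Topic: {self.title}\nGoal: {self.goal}\n\n{body}\n\n"
                "List the questions a reader pursuing this goal could NOT answer from the text above."
            )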


I think companies have a lot of data in systems like Confluence, Jira, and their chat solution which is hard to find, and people in the company don't even know it might be there to search for.

An LLM trained on these sources might be very powerful at helping people not solve the same problem many times over.


The problem isn't the interface, it's the access: having everything in one place vs. fragmented across different systems and different departments.

I built a chatbot under the same assumption you have, for a large ad agency in 2017: an "analyst assistant" for pointing to work that's already been done, offering to run scripts that were written years ago so you don't have to write them from scratch.

Through user testing, the chat interface was essentially reduced to drop-down menus of various categories of documentation, but actually it was the hype of having a chatbot that justified the funding to pull all the resources together into one database with the proper access controls.

I would expect that after you went through the trouble of training an LLM on all that data, people using the system would just use the search function on the database itself instead of chatting with it, but be grateful management finally lifted all the information silo-ing.


Some of these companies aren't exactly eager to make it cheap to access the data you have entered into their systems. It's like they own your data, in a sense, and want to make it harder to leave.

I love your point about the chatbot being the catalyst for doing something obvious. I curate a page for my team with all the common links to important documentation and services and find myself nevertheless posting that link over and over again to the same people because nobody can be bothered to bookmark the blasted thing. Sometimes I feel it's pointless making any effort to improve but I think you have a clever solution.

The other aspect of it, IMO is that searching for the obvious terms doesn't always return the critical information. That might be my company's penchant for frequently changing the term it likes to use for something - as Architects decide on "better terminology". I imagine an LLM somehow helping to get past this need for absolute precision in search terms - but perhaps that's just wishful thinking.


Wouldn't it be simpler to just use a linter then?


It would be simpler, but it wouldn't be as effective.


How so? The build fails if linter checks don't pass. What does an ai add to this?


Much more nuanced linting rules.


What exactly is gained from these nuanced linting rules? Why do they need to exist? What are they actually doing in your codebase other than sucking up air?


Are you arguing against linters? These disingenuous arguments are just tiring, you either accept that linters can be beneficial, and thus more nuanced rules are a good thing, or you believe linters are generally useless. This "LLM bad!" rhetoric is exhausting.


No. And speaking of disingenuous arguments, you've made an especially absurd one here.

Linters are useful. But you're arguing that you have such complex rules that they cannot be performed by a linter and thus must be offloaded to an LLM. I think that's wrong, and it's a classical middle management mistake.

We can all agree that some rules are good. But that does not mean more rules are good, nor does it mean more complex rules are good. Not only are you integrating a whole extra system to support these linting rules, you are doing so for the sake of adding even more complex linting rules that cannot be automated in a way that prevents developer friction.


If you think "this variable name isn't descriptive enough" or "these comments don't explain the 'why' well enough" is constructive feedback, then we won't agree, and I'm very curious as to what your PR reviews look like.


Linters suffer from a false positive/negative tradeoff that AI can improve. If they falsely flag things then developers tend to automatically ignore or silence the linter. If they don't flag a thing then ... well ... they didn't flag it, and that particular burden is pushed to some other human reviewer. Both states are less than ideal, and if you can decrease the rate of them happening then the tool is better in that dimension.

How does AI fit into that picture then? The main benefits IMO are the abilities to (1) use contextual clues, (2) process "intricate" linting rules (implicitly, since it's all just text for the LLM -- this also means you can process many more linting rules, since things too complicated to be described nicely by the person writing a linter without too high of a false positive rate are unlikely to ever be introduced into the linter), and (3) give better feedback when rules are broken. Some examples to compare and contrast:

For that `except` vs `except Exception:` thing I mentioned, all a linter can do is check whether the offending pattern exists, making the ~10% of proper use cases just a little harder to develop. A smarter linter (not that I've seen one with this particular rule yet) could allow a bare `except:` if the exception is always re-raised (that being both the normal use-case in DB transaction handling and whatnot where you might legitimately want to catch everything, and also a coding pattern where the practice of catching everything is unlikely to cause the bugs it normally does). An AI linter can handle those edge cases automatically, not giving you spurious warnings for properly written DB transaction handling. Moreover, it can suggest a contextually relevant proper fix (`except BaseException:` to indicate to future readers that you considered the problems and definitely want this behavior, `except Exception:` to indicate that you do want to catch "everything" but without weird shutdown bugs, `except SomeSpecificException:` because the developer was just being lazy and would have accidentally written a new bug if they caught `Exception` instead, or perhaps just suggesting a different API if exceptions weren't a reasonable way to control the flow of execution at that point).
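Concretely, a minimal sketch of the two patterns described above (the DB code and the config helper are illustrative):

    import json
    import sqlite3

    def transfer(conn: sqlite3.Connection, src: int, dst: int, amount: int) -> None:
        """DB transaction handling: a bare except is defensible only because it always re-raises."""
        try:
            conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
            conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))
            conn.commit()
        except:            # catch everything, roll back, and re-raise
            conn.rollback()
            raise

    def read_settings(path: str) -> dict:
        """Intentionally swallowing errors: catch Exception, not a bare except."""
        try:
            with open(path) as f:
                return json.load(f)
        except Exception:  # avoids trapping KeyboardInterrupt/SystemExit
            return {}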

As another example, you might have a linting rule banning low-level atomics (fences, seq_cst loads, that sort of thing). Sometimes they're useful though, and an AI linter could handle the majority of cases with advice along the lines of "the thing you're trying to do can be easily handled with a mutex; please remove the low-level atomics". Incorporating the context like that is impossible for normal linters.

My point wasn't that you're replacing a linter with an AI-powered linter; it's that the tool generates the same sort of local, mechanical feedback a linter does -- all the stuff that might bog down a human reviewer and keep them from handling the big-picture items. If the tool is tuned to have a low false-positive rate then almost any advice it gives is, by definition, an important improvement to your codebase. Human reviewers will still be important, both in catching anything that slips through, and with the big-picture code review tasks.


How do you handle write concurrency?

If different processes write to the same file at the same time, what do I read afterwards?


All connected file system clients see read-after-write consistency, so you see the up-to-date file data!


I hear you about the "limited hands, infinite wishlist", but nowadays when I see someone making bold claims about transactions and consistency over the network, I grab my popcorn bucket and eagerly await the Jepsen report about it.

The good news is that you, personally, don't have to spend the time to create the Jepsen test harness; you can pay them to run the test, but I have no idea what kind of O($) we're talking about here. Still, it could be worth it to inspire confidence, and it is almost an imperative if you're going to (ahem) roll your own protocol for network file access :-/


We've actually been thinking about getting Jepsen to do this, so I'm happy to hear that you also think that it would inspire confidence!


That's exactly right!


> I grab my popcorn bucket and eagerly await the Jepsen report about it

I am the same, as distributed consensus is notoriously hard especially when it fronts distributed storage.

However, it is not impossible. Hunter and I were both on the EFS team at AWS (I am still there), and he was deeply involved in all aspects of our consensus and replication layers. So if anyone can do it, Hunter can!


Thank you for the kind words, Geert!


Congrats on the launch!

A somewhat unrelated question, but what is the fastest way to obtain a similar look and feel for the UI? Is there a framework?


I am the author of https://getgabrielai.com

Basically, an LLM to filter and manage emails.

I am not sure I understand why you are using a multimodal model. Why render and read back the email?

How does the visual aspect help?

