
If you're referring to the post from yesterday, they actually relicensed it as Apache 2.0: https://news.ycombinator.com/item?id=45196173

No, I had different recent announcements in mind, where companies selected closed licenses that let you look at the code but not actually use it, then bragged about open sourcing their project.

The divergence between announcement titles and the actual licenses has made reading these announcements a bit of a chore on HN, since you now have to read the full post. Good on these guys for not openwashing their project.

And of course it doesn't help the tedium of reading HN that there are five very vocal commentators who want the world to know that "OSI doesn't own the definition of open source", even though, when asked, they will define open source as "can be commercially restricted".


We really do appreciate how open source leads to real innovation and genuinely useful code; no intention of openwashing here. Thank you for noticing that :-)

I know AWS in particular does not, because they do not increment the bill on every request. I don't know exactly how they calculate billing, but based on what I do know about it, I imagine it as a MapReduce job that runs over Lambda logs every so often to calculate what to bill each user for the preceding time interval.

That billing strategy makes it impossible to prevent cost overruns because by the time the system knows your account exceeded the budget you set, the system has already given out $20k worth of gigabyte-seconds of RAM to serve requests.

I think most other serverless providers work the same way. In practice, you would prevent such high traffic spikes with rate limiting in your AWS API Gateway or equivalent to limit the amount of cost you could accumulate in the time it takes you to receive a notification and decide on a course of action.
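
A minimal sketch of that kind of guardrail, assuming boto3 and an existing API Gateway REST API (the IDs and the limit values here are hypothetical; the right numbers depend entirely on your budget):

    import boto3

    apigw = boto3.client("apigateway")

    # Hypothetical REST API / stage identifiers.
    REST_API_ID = "abc123"
    STAGE_NAME = "prod"

    # A usage plan caps both instantaneous rate and total daily volume,
    # which bounds how much cost can accrue before you get a notification
    # and react.
    apigw.create_usage_plan(
        name="cost-guardrail",
        apiStages=[{"apiId": REST_API_ID, "stage": STAGE_NAME}],
        throttle={"rateLimit": 100.0, "burstLimit": 200},  # requests/second
        quota={"limit": 1_000_000, "period": "DAY"},       # requests/day
    )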


The algorithm for BB(N) is (with some hand waving) “for each N-state Turing machine, check if it halts on an initially empty tape. If it does not, skip it. If it does, run it until it halts and count the number of steps, k. BB(N) is the maximum value of k you find.”

The problem is that the naive algorithm above requires an oracle for the Halting Problem, which every CS student learns is impossible in general for a computational model that isn’t more powerful than a Turing Machine.
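
To make that concrete, here is the naive algorithm as a Python sketch, with the impossible part isolated in a hypothetical halts() oracle (enumerate_machines and the machine's step/halted interface are also assumed helpers, not a real library):

    def halts(machine) -> bool:
        """Oracle for the Halting Problem -- provably not implementable."""
        raise NotImplementedError

    def busy_beaver(n: int) -> int:
        best = 0
        for machine in enumerate_machines(n):  # every n-state Turing machine
            if not halts(machine):             # skip the non-halters
                continue
            steps = 0
            while not machine.halted:          # run the halters to completion
                machine.step()
                steps += 1
            best = max(best, steps)            # BB(n) = longest halting run
        return best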

So if a function can be expressed in terms of a Turing machine, Busy Beaver has to grow faster, because eventually, for some value of N, you can just encode the candidate function as a Turing machine and have Busy Beaver simulate it.

Busy Beaver hunters on the low end (trying to find the next number) have to exhaustively prove, for every Turing machine with that number of states, whether it eventually halts, so that they can evaluate the longest-running halting machine with a finite execution.


> against a key-value store no less, which probably fundamentally limits what optimizations can be done in any general way

I would disagree with this assumption for two reasons. First, theoretically, a file system is a key-value store, and basically all databases run on file systems, so it stands to reason that any optimization Postgres does can be achieved as an abstraction over a key-value store with a good API, because Postgres already did exactly that.

Second, less theoretically, this has already been done by CockroachDB, which stores data in Pebble in its current iteration and previously used RocksDB (Pebble is CRDB’s Go rewrite of RocksDB), and by TiDB, which stores its data in TiKV.

A thin wrapper over a KV store will only be able to use optimizations provided by the KV store, but if your wrapper is thick enough to include abstractions like adding multiple tables or inserting values into multiple cells in multiple tables atomically, then you can build arbitrary indices into the abstraction.
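
As a toy sketch of what that "thick enough" wrapper looks like (hypothetical key layout, with an in-memory dict standing in for RocksDB/Pebble/TiKV and a single batch standing in for an atomic commit):

    # A table row and its secondary-index entry are just two keys in the
    # same KV namespace, written together in one atomic batch.
    kv: dict[str, str] = {}

    def put_user(user_id: str, email: str) -> None:
        batch = {
            f"table/users/{user_id}": email,           # primary row
            f"index/users_by_email/{email}": user_id,  # secondary index
        }
        kv.update(batch)  # a real engine commits this batch atomically

    def find_user_by_email(email: str) -> str | None:
        user_id = kv.get(f"index/users_by_email/{email}")
        return kv.get(f"table/users/{user_id}") if user_id else None

    put_user("42", "ada@example.com")
    print(find_user_by_email("ada@example.com"))  # -> "ada@example.com"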

I wouldn’t tend to call a KV store a bad database engine because I don’t think of it as a database engine at all. It might technically be one under the academic definition of a database engine, but I mostly see it being used as a building block in a more complicated database engine.


CS is sort of unique in that regard. I value my university degree, but when I think about the classes that helped me the most, only one of them was a degree requirement. Not because the degree was useless, but because the information was accessible enough that I already knew most of the required content for an undergraduate degree when I got there.

I took lots of electives outside my major, and I know that I could have easily loved chemistry, mathematics, mechanical engineering, electrical engineering, or any number of fields. But when you're 12 years old with a free period in the school computer lab, you can't download a chemistry set or an oscilloscope or parts for building your next design iteration. You can download a C compiler and a PDF of K&R's "The C Programming Language," though.

CS just had a huge head-start in capturing my interest compared to every other subject because the barrier to entry is so low.


Streaming services were great back when they were separate from content producers and IP holders.

Once every media company became a streaming company and started using anticompetitive licensing practices in an attempt to drive viewership to their own platforms, the market fractured too much for it to be profitable.

Something smells “prisoner’s dilemma” about it: the best move for any individual streaming service is to have exclusive content (and the best-positioned players to do that are the studios), but when everyone does that, it decreases the overall profit available in the market more than it increases their slice of the pie.


> more than it increases their slice of the pie.

That's the part that might not be true, unfortunately. If each individual content producer sees more return on their own streaming service than they did sharing revenue from one of the independent services, then that's better for them, even if the total pie got smaller. If that wasn't the case, you'd think we'd see some of them shut their services down and go back to independent services once their income drops.

Sacrificing a wide audience to extract more from the most dedicated portion of the fanbase isn't an entirely new concept, and it financially makes sense short-term (until you start losing some of those dedicated fans over time and don't have the mindshare outside your bubble to attract new ones).


I think we will see this eventually.

Once Netflix isn't the only one that doesn't share their monthly subscriber numbers anymore, we'll know that they're beginning to at least question why they own everything instead of licensing their content out.


They just have to out-survive the competition, selling theme park tickets and merch. Oh, and putting hit movies in theaters.

The streaming service itself doesn’t need to be profitable.


A Markov process is any process where, if you have perfect information about the current state, you cannot gain any more information about the next state by looking at any previous state.

Physics models of closed systems moving under classical mechanics are deterministic, continuous Markov processes. Random walks on a graph are non-deterministic, discrete Markov processes.

You may generalize further: if a process has state X, and the prior N states contribute to predicting the next state, you can define a new process whose state is an N-vector of Xs. The graph connecting those vector states reduces the evolution of the system to a random walk on a graph, and thus a Markov process.

Thus any system where the best possible model of its evolution requires you to examine at most finitely many consecutive states immediately preceding the current state is a Markov process.

For example, an LLM that will process a finite context window of tokens and then emit a weighted random token is most definitely a Markov process.
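
A small illustration of that state-augmentation trick, using a made-up order-2 token model: the "state" is the last two tokens, so the next state depends only on the current state, and the whole thing is an ordinary Markov chain.

    import random

    # Hypothetical transition table over 2-token states.
    transitions = {
        ("the", "cat"): ["sat", "ran"],
        ("cat", "sat"): ["down", "there"],
        ("cat", "ran"): ["away"],
    }

    def step(state: tuple[str, str]) -> tuple[str, str]:
        token = random.choice(transitions.get(state, ["<eos>"]))
        return (state[1], token)  # new state: shift in the emitted token

    state = ("the", "cat")
    for _ in range(3):
        state = step(state)
        print(state[1])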


I thought that the point of replaceState was precisely to avoid appending elements to the history, and instead replace the most recent one, so I think I must be missing something if that line causes lots of additional history items.


The main “why” for me is that it allows you to intentionally design your API types and to know when a change is touching them.

I worked on a project with a codebase on the order of millions of lines, and many times a response was built by taking an ORM object or an app-internal data structure and JSON-serializing it. We had a frequent problem where we’d make some change to how we processed a data structure internally and, oops, breaking API change. Or worse yet, sensitive data would get added to a structure typically processed alongside that data, without anyone realizing it also got serialized by a response handler.

It was hard to catch this in code review because it was hard to even know when a type might be involved in generating a response elsewhere in the code base.

Switching to a schema-first API design meant that if you were making a change to a response data type, you knew it. And the CODEOWNERS file also knew it, and would bring the relevant parties into the code review. Suddenly those classes of problems went away.
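
A tiny sketch of the before/after (all names hypothetical; the point is only that the wire format is its own declared type rather than whatever the internal object happens to contain):

    from dataclasses import dataclass, asdict

    # Internal record: free to grow fields, including sensitive ones.
    @dataclass
    class UserRecord:
        id: int
        email: str
        password_hash: str  # must never leak into a response

    # Explicit API type: editing this is visibly an API change, and a
    # CODEOWNERS rule on this file pulls in the right reviewers.
    @dataclass
    class UserResponse:
        id: int
        email: str

    def to_response(user: UserRecord) -> dict:
        return asdict(UserResponse(id=user.id, email=user.email))

    print(to_response(UserRecord(1, "ada@example.com", "hash")))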


Off by one errors strike again, unless you EDC a machine pistol?

