Agreeing with the point the question makes here: the game theory of global politics does not work with the same morals that we prescribe for individual people.
Personally, I think everything should be hackable, however...
Limiting the ability to _easily_ modify what's running on a system is more about public cyber-health than the individual's freedom. Viruses and malware infect systems much more easily when software runs outside of a sandbox.
The main reason we can't do that now is that we require models to be digitally reproducible (IMHO, but also read Geoffrey Hinton on mortal computation).
The energy cost comes from error correction as much as from the training algorithms.
Would be really cool to convert its predictive model into a computer program, written in something like Python/C/Rust/whatever, that makes the same predictions. I think that would better serve our ability to understand the world.
We don't need to; dynamical meteorology is an incredibly mature field, and our understanding of the fluid dynamics of the atmosphere far exceeds the resolution and limitations of coarse, 0.25-degree global numerical models.
This reads solely as a sales pitch, which quickly cuts to the "we're selling this product so you don't have to think about it."
...when you actually do want to think about it (in 2024).
Right now, we're collectively still figuring out:
1. Best chunking strategies for documents
2. Best ways to add context around chunks of documents
3. How to mix and match similarity search with hybrid search
4. Best way to version and update your embeddings
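To make items 1 and 2 concrete, here is a minimal Python sketch of one common baseline: fixed-size character windows with overlap, plus a context header prepended to each chunk before embedding. The function names (`chunk_text`, `add_context`) and the default sizes are made up for illustration and don't come from any particular library; real pipelines often split on sentences or tokens instead.

```python
def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper: `size` and `overlap` are illustrative defaults.
    """
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        # Stop once this window already reaches the end of the text.
        if start + size >= len(text):
            break
    return chunks


def add_context(chunk: str, title: str, section: str) -> str:
    """Prefix a chunk with document-level context before embedding it."""
    return f"Document: {title}\nSection: {section}\n\n{chunk}"
```

The overlap means a sentence cut at a window boundary still appears whole in the neighboring chunk, and the header gives the embedding model some of the context the chunk lost when it was cut out of the document.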
We agree a lot of stuff still needs to be figured out, which is why we made vectorizer very configurable. You can configure chunking strategies and formatting (which is a way to add context back into chunks), and you can mix semantic and lexical search on the results. That handles your 1, 2, and 3. Versioning can mean a different version of the data (in which case the versioning info lives with the source data) OR a different embedding config, which we also support[1].
Admittedly, right now we have predefined chunking strategies. But we plan to add custom-code options very soon.
Our broader point is that the things you highlight above are the right things to worry about, rather than data workflow ops and babysitting your lambda jobs. The ops side is what we want to handle for you.
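For readers wondering what "mixing semantic and lexical search" can look like in practice, one generic option is reciprocal rank fusion (RRF), sketched below in Python. This is an illustration of the general technique, not vectorizer's API; the function name is hypothetical, and `k=60` is just the constant commonly used with RRF.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Ties are broken by document id.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=lambda d: (-scores[d], d))


# Example: ids returned by a vector similarity query and a keyword query.
semantic_hits = ["a", "b", "c"]
lexical_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([semantic_hits, lexical_hits])
```

Because RRF only looks at ranks, it sidesteps the problem of putting cosine distances and keyword relevance scores on the same scale.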