A pretty boring and stable year in SQLite land, which is just how I like it.
The JSON -> and ->> operators introduced in 3.38 are my personal favorites, as these really help in implementing my "ship early, then maybe iterate as you're starting to understand what you're doing" philosophy (a minimal sketch follows the list):
1. In the initial app.sqlite database, have a few tables, all with just a `TEXT` field containing serialized JSON, mapping 1:1 to in-app objects;
2. Once a semblance of a schema materializes, use -> and ->> to create `VIEW`s with actual field names and corresponding data types, and update in-app SELECT queries to use those. At this point, it's also safe to start communicating database details to other developers requiring read access to the app data;
3. As needed, convert those `VIEW`s to actual `TABLE`s, so INSERT/UPDATE queries can be converted as well, and developers-that-are-not-me can start updating app data.
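Roughly, steps (1) and (2) look like this; a minimal sketch, assuming a hypothetical `users_raw` table and a linked SQLite of 3.38+ (needed for -> / ->>):

```python
import sqlite3

con = sqlite3.connect("app.sqlite")  # -> / ->> need the linked SQLite to be 3.38+

# Step 1: one TEXT column of serialized JSON, mapping 1:1 to in-app objects.
con.execute("CREATE TABLE IF NOT EXISTS users_raw (body TEXT NOT NULL)")
con.execute("INSERT INTO users_raw (body) VALUES (?)",
            ('{"id": 1, "name": "Ada", "plan": "pro"}',))

# Step 2: once the shape settles, expose real field names through a VIEW.
# ->> extracts an SQL value (TEXT/INTEGER/...); -> would return JSON.
con.execute("""
    CREATE VIEW IF NOT EXISTS users AS
    SELECT body ->> '$.id'   AS id,
           body ->> '$.name' AS name,
           body ->> '$.plan' AS plan
    FROM users_raw
""")
print(con.execute("SELECT id, name FROM users WHERE plan = 'pro'").fetchall())

# Step 3 (only if ever needed): materialize the VIEW as a real TABLE, e.g.
# DROP VIEW users; CREATE TABLE users AS SELECT <the view's SELECT>;
con.commit()
```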
The interesting part here is that step (3) is actually not required for, like, 60% of successful apps, and (of course) for 100% of failed apps, saving hundreds of hours of upfront schema/database development time. Basically, you get 'NoSQL/YOLO' benefits for initial development, while still being able to communicate actual database details once things get serious...
As a counterpoint, I find that nutting out the schema upfront is an incredibly helpful process for defining the functionality of the app. Once you have a strict schema that models the application well, nearly everything else just falls into place. Strong foundation.
I worked at a startup that did the whole "every table schema is `id, data`, where data is a JSON blob with all the data" approach. I loved it at first. For initial development, it felt so freeing and so productive.
Then one of the devs left and I inherited his code, and started to realize exactly how much a rigid schema helps make an app more robust and more maintainable. I will never, ever take that shortcut again. The production outages/downtime caused by database issues that never should have been issues in the first place (like a corner-case row missing data, or rampant defensive nil checks spread across every function that reads any field of a record, since any field might be inconsistent or nil at any time) contributed to killing the company by pissing off customers.
Elixir/Phoenix does help a lot though because I still treat most rows like JSON objects, but under the hood it's a normal rigid postgres schema. Best of both worlds IMHO.
With JSON blobs in simple tables a very large part of designing a database schema becomes designing a JSON schema, with different opportunities to make mistakes but (potentially) the same strictness.
We are talking about the internal database of some application: its schema is something that should be designed well, not enforced defensively like a schema for validating undependable inputs.
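For what it's worth, SQLite lets you bolt some of that strictness onto a blob-only table with CHECK constraints; a minimal sketch (the `docs` table and its fields are hypothetical):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# A blob-only table can still enforce pieces of the JSON schema:
con.execute("""
    CREATE TABLE docs (
        body TEXT NOT NULL
            CHECK (json_valid(body))                      -- must be well-formed JSON
            CHECK (body ->> '$.id' IS NOT NULL)           -- required field
            CHECK (typeof(body ->> '$.qty') = 'integer')  -- rough type check
    )
""")
con.execute("INSERT INTO docs VALUES (?)", ('{"id": 7, "qty": 3}',))  # OK
# con.execute("INSERT INTO docs VALUES (?)", ('{"id": 7}',))          # fails the qty CHECK
```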
Similarly, one of my most important projects makes a lot of API calls returning serialized JSON. Those calls are expensive, so I have, over time, tried many complicated caching mechanisms.
These days though it's _so_much_simpler_and_cleaner_ to just wrap the call in a decorator that caches the request to sqlite3 and only makes the call if the cache is stale.
I don't worry about parsing the results or doing any of the heavy lifting right away - just cache the json.
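A minimal sketch of such a decorator; the `cached_json` name, cache table, and TTL handling are all illustrative assumptions, not a library API:

```python
import functools
import json
import sqlite3
import time

# Hypothetical cache: response JSON keyed by call arguments, with a timestamp.
_db = sqlite3.connect("api_cache.sqlite")
_db.execute("""CREATE TABLE IF NOT EXISTS cache (
    key TEXT PRIMARY KEY, fetched_at REAL, body TEXT)""")

def cached_json(ttl_seconds):
    """Serve from SQLite unless the cached row is older than ttl_seconds."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            key = json.dumps([fn.__name__, args, kwargs], sort_keys=True, default=str)
            row = _db.execute(
                "SELECT fetched_at, body FROM cache WHERE key = ?", (key,)
            ).fetchone()
            if row and time.time() - row[0] < ttl_seconds:
                return json.loads(row[1])   # cache hit: skip the expensive call
            result = fn(*args, **kwargs)    # cache miss or stale: call the API
            _db.execute(
                "INSERT OR REPLACE INTO cache VALUES (?, ?, ?)",
                (key, time.time(), json.dumps(result)),
            )
            _db.commit()
            return result
        return wrapper
    return decorator

@cached_json(ttl_seconds=3600)
def fetch_profile(user_id):
    ...  # the real (expensive) HTTP call would go here
```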
SQLite is so good at querying those blobs and is so fast it's just not worth munging them. Nice.
And using something like datasette to prototype queries for some of the more complicated tree structures is a breeze.
I've been hacking together one-off Python scripts to parse out bits of saved API responses, flatten the data to CSVs, and load them into SQLite tables. Looks like I can skip all of that and go straight to querying the raw JSON text.
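For example, json_each() can do the flattening those scripts were doing; a minimal sketch with hypothetical response data:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE responses (body TEXT)")
con.execute("INSERT INTO responses VALUES (?)",
            ('{"items": [{"sku": "a1", "qty": 2}, {"sku": "b2", "qty": 5}]}',))

# json_each() flattens a JSON array into rows; no per-API Python script needed.
rows = con.execute("""
    SELECT j.value ->> '$.sku' AS sku,
           j.value ->> '$.qty' AS qty
    FROM responses, json_each(responses.body, '$.items') AS j
""").fetchall()
print(rows)  # [('a1', 2), ('b2', 5)]
```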
Finally! I've had good experience following a similar process with Postgres (start with the obvious key fields and a JSON "data" column, then promote components of "data" to columns; maybe use expression indexes to experiment with optimizations). Good to know this is now possible in SQLite without ugly function calls.
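The expression-index trick carries over to SQLite too; a minimal sketch (the `events` table and `kind` field are hypothetical, and note the query has to use the exact indexed expression for the index to apply):

```python
import sqlite3

con = sqlite3.connect("app.sqlite")
con.execute("CREATE TABLE IF NOT EXISTS events (data TEXT)")
# Index the extracted expression so lookups are fast without promoting
# the field to a real column yet.
con.execute("CREATE INDEX IF NOT EXISTS idx_events_kind ON events (data ->> '$.kind')")
plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM events WHERE data ->> '$.kind' = 'click'"
).fetchall()
print(plan)  # should mention idx_events_kind
```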
This seems awful but maybe I'm underestimating the number of failed apps you're dealing with. And at that rate, maybe the app ideas should be vetted a bit more.
I don't understand the use case? From the docs [https://sqlite.org/wasm/doc/trunk/demo-123.md], the database disappears on page reload, on top of requiring workers for longer-running processes.
Even without persistence it's absurdly useful. In 2023 loading even a 50MB+ database file into a browser is feasible (that's only 10 heavy React webpage loads) and now you can run SQL queries directly against it in the browser.
Plenty of interesting databases fit into less than a MB even.
Additionally, SQLite in WebAssembly with a HTTP-Range-request based virtual file system greatly simplifies interactive visualisation of large datasets https://observablehq.com/@mjbo/sqljs-httpvfs (my own demo)
The intention is to make use of the new "origin private file system" API, which provides the website with a sandboxed block-level file system on the user's device that can be used for efficient reads and writes. It will be possible for WASM SQLite, or any other DB engine ported to WASM, to have full ACID compliance.
The SQLite team have been working with browser developers to ensure the new API is sufficient to enable all this.
Honestly, and I keep going on about it, SQLite in the browser via WASM is the missing piece to make "offline first" PWAs a serious contender when deciding an architecture for an app.
2023 is going to be the year of SQLite in the browser.
You can still persist them; you just have to wire it up yourself. Using workers is a best practice for any CPU-bound task, so that's not a drawback by itself in my mind.
It's good for single-page applications. Many datasets are relatively small -- pushing them to the client is reasonable. In exchange, you get zero-latency querying and can build very responsive UIs that can use SQL, versus writing REST APIs or GraphQL APIs.
Taken to an extreme, it permits publishing datasets that can be queried with no ongoing server-side expenses.
A wild example: Datasette is a Python service that makes SQLite databases queryable via the web. It turns out that since you can compile Python and SQLite to WASM, you can run Datasette entirely in the user's browser [1]. The startup time is brutal, because it's literally simulating the `pip install`, but a purpose-built SPA wouldn't have this problem.
The original plan from Google for the file system access API was to allow read/write of real files, but both Mozilla and Apple said that was too dangerous, and the OPFS was created as a compromise.
Yeah, it's going to be confusing for users when they actually want to use one of these files outside of the application that created it. But it's better than nothing!
Well, it makes sense that the browser can't work outside of the OPFS boundary, but shouldn't there be a way to talk to the OPFS from outside the browser?
But I would prefer using something SQLite officially maintains and not having to maintain anything myself. I also prefer not to use libraries from random people, unfortunately. Working at a fintech makes me paranoid...
Unfortunately I don't think that can be done natively (like defining a function before using it in a query via the CLI), only by writing and compiling some C.
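For comparison, language bindings can register functions at runtime even though the CLI can't; a minimal sketch with Python's sqlite3.create_function (the function name and body are illustrative):

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Register a one-argument scalar SQL function at runtime:
con.create_function("reverse", 1, lambda s: s[::-1] if s is not None else None)
print(con.execute("SELECT reverse('sqlite')").fetchone())  # ('etilqs',)
```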