Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

This particular demo page actually makes queries against an almost 700 MB large (fills one CD!) SQLite database. Because the amount of data read is almost negligible (few hundred kB), performance is limited by latency (as you say). However, high-latency links also tend to be slower, so downloading the entire database a-priori would almost always be much slower.

For example, on a 1 megabit/s link with 300 ms RTT, one example would take about 2 seconds for the data transfer itself while spending another 3 seconds or so on waiting. Downloading the entire file would take around an hour and a half.

For your 10 MB database, transferring it as a whole would take 80 seconds. Assuming this solution instead needs to read e.g. 250 kB (taking 2 seconds to transfer), it could still bounce around 250 times to the database before those 10 MB are fully downloaded. (This would be a really odd query, since it would only read on average two pages per read request)



Right but that is an artificially created demo by the author to justify the solution being presented (no offense). The question is how common are ~GB large SQLite databases in the real world relative to databases that are ~MB large?

In my experience SQLite databases of millions of rows of raw tabular data tend to compress very well into dozens of megabytes. Indeed SQLite is often touted as a file format for applications.


I'm running magnetico (https://github.com/boramalper/magnetico) on my VPS. I currently have an index of 1.6M magnet links stored in a 5GB database.

SQLite is most interesting not when the database is small, but when there are very few writes and all you do is reading. You can also look at https://datasette.io/ and see how SQLite is perfect for representing a lot of datasets and querying them


Even if you compress it, you need it in client memory which can also be a dealbreaker. Some people might need it for some projects, I know I did.


> you need it in client memory which can also be a dealbreaker

Most workstations have GBs of available memory. If not you can dump it in indexeddb as a raw data store.

I never disputed that it would be useful for some use cases. I only said it would be unusable with high latency links. If you have a low latency link and aren’t running very complex queries with lots of random seeks, then this should work fine for you.


My team has a few TB of data in SQLite files that are themselves dozens of GB each.

We're using them as a replacement for leveldb's sstables, but with the structure of full SQL. It is highly effective.


Do you think your team’s usage of SQLite is representative of the average SQLite user?


This is the fundamental flaw of 80% thinking. The fact that SQLite continues to reach for more users is what has made it such a successful general-purpose tool.


You didn’t answer the question.

> The fact that SQLite continues to reach for more users is what has made it such a successful general-purpose tool.

I never disputed this. You’re responding to a straw man.


Where has it been suggested that this is the best solution for "the average SQLite user", instead of a tool you can use if it fits your requirements? To take your 10MB number, the article starts by mentioning you can probably just download the entire thing if you aren't above that exact same number.


I made two claims:

> this would be unusable over high latency links.

That is objectively true

> SQLite databases of pure data usually aren’t over 10MB in size.

No one here has refuted this point.

Any other counterargument is addressing a claim I did not make.


It's not anymore unusable over high latency links than most website. Also worth noting that the caching is very smart, so once things are downloaded it's very fast.

But most high latency links are very slow (so downloading large databases is a horrible experience) and (more importantly) are often priced by the size of downloads.


> It's not anymore unusable over high latency links than most website.

That’s false. Not all web applications suffer equally from high latency links. Depends on how reliant the web application is on independent requests. Making one request and receiving a single bulk download is much less bad than making many dependent requests on a high latency link.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: