
> Once again, SQLite happily stored my UUIDs on this column and even indexed them, but to get these across to PostgreSQL I needed to fix the schema and properly declare these keys as strings.

In PostgreSQL you're leaving performance on the table if you store UUIDs as strings instead of as the built-in UUID type (a 128-bit value under the hood).
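
For anyone curious, a minimal SQLAlchemy sketch of the difference (table and column names are invented for illustration):

    # Both declarations work, but the second stores and compares a
    # 128-bit value instead of a 36-character string.
    import uuid
    from sqlalchemy import Column, MetaData, String, Table
    from sqlalchemy.dialects.postgresql import UUID

    metadata = MetaData()

    # UUIDs as text: bigger rows, bigger index, text comparisons.
    jobs_text = Table(
        "jobs_text", metadata,
        Column("id", String(36), primary_key=True,
               default=lambda: str(uuid.uuid4())),
    )

    # Native uuid type: 16 bytes on disk, binary comparisons.
    jobs_native = Table(
        "jobs_native", metadata,
        Column("id", UUID(as_uuid=True), primary_key=True,
               default=uuid.uuid4),
    )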




Plus PG and SQLite3 have fundamentally different approaches to types.

SQLite3 has very few types, doesn't let you create new ones, and is duck-typed.

PG has many built-in types, lets you create new ones, and is (mostly) statically typed.
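
A rough sketch of that difference, using the stdlib sqlite3 module for the SQLite half (the PG statements are comments only, since they need a server):

    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (n INTEGER)")
    # Accepted: column types are affinities, not constraints.
    con.execute("INSERT INTO t VALUES ('not a number')")
    print(con.execute("SELECT n, typeof(n) FROM t").fetchall())
    # [('not a number', 'text')]

    # PostgreSQL rejects the equivalent insert:
    #   CREATE TABLE t (n integer);
    #   INSERT INTO t VALUES ('not a number');
    #   ERROR:  invalid input syntax for type integer: "not a number"
    # And it lets you define new types, e.g.:
    #   CREATE TYPE mood AS ENUM ('happy', 'sad');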

You really have to know this going in. If you have to support both, you'll probably have to write two versions of your SQL -- or you'll end up using an ORM that supports the two RDBMSes, and then you may get stuck with the ORM.

SQLite3 is amazing, but it's mainly amazing because of how easy it is to embed in apps. If that's what you need, use SQLite3. If you need more power, if you're building a server, then think long and hard about why not just go straight to PG.

EDIT: SQLite3 is amazing for other reasons too.


> As of version 3.37.0 (2021-11-27), SQLite provides STRICT tables that do rigid type enforcement

https://www.sqlite.org/datatype3.html
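
A quick sketch of what STRICT changes (assumes the underlying SQLite library is 3.37.0 or newer; the exact error text may vary):

    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (n INTEGER) STRICT")
    try:
        con.execute("INSERT INTO t VALUES ('not a number')")
    except sqlite3.IntegrityError as e:
        # e.g. "cannot store TEXT value in INTEGER column t.n"
        print(e)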


Thanks! Here's the link to the actual details

https://www.sqlite.org/stricttables.html


I'm aware of this, and it's a very welcome addition, though the supported types are still very limited. But...

> though new types might be added in future releases of SQLite

Intriguing! Can't wait for SQLite3 to have `CREATE TYPE`!


This is so 'on brand' for HN: the author, for once, actually uses the main selling point of the ORM concept and changes the underlying database by updating one config variable and nothing else in the application code. One of the first comments warns "you may get stuck with the ORM".


I work with (and occasionally have made contributions to) a proprietary Rails (ORM) app that in principle supports SQLite3 and PG but, in practice, left SQLite3 behind long ago. The main problem is that the app in question needed something like SQLite3's `ANY` type in PG but all there was back then was `hstore` (and even now, all there is is `hstore`, `json`, and `jsonb`), so `hstore`-using code metastasized and no one kept the SQLite3 bits working.

Nowadays, of course, SQLite3 has JSON support, and PG has JSON and JSONB, and so it should be possible to fix that, but it would be quite the migration, so no one bothers.

This is real life with a multi-RDBMS ORM: the promise of RDBMS neutrality always fails in some way as the differences between the RDBMSes bleed through the ORM's abstractions, and that bleeding occurs the moment you need more expressive power than the ORM provides out of the box for dead-simple apps.

It's the same old story: dev meets ORM, ORM makes things easy (at first), dev needs more power, dev bumps up against ORM's limitations, dev works around those limitations, dev now has a frankenapp, dev wants to switch RDBMSes but can't, or dev wants to ditch the ORM but it's a lot of work.

ORMs are a nice crutch to get started, but that's it.


You're right. Calling out tradeoffs transparently in the comment section is 'on brand' and that's precisely why I keep coming back to HN and the comment section, specifically.

Blog articles are helpful, but they're most often a single person's perspective, and no one person is perfect. I love HN because I can get some great anecdotes (or sometimes response articles) from many perspectives.


tbf GP went, "You really have to know this going in. If you have to support both, you'll probably have to write two versions of your SQL."

That is, the problem isn't just the ORM but the very different characteristics of the databases behind the ORM.

With any serious production app, I bet no one is touching that config, especially if it has been running for months with one db or the other.

And now you're stuck with an extra database and an ORM.


And SQLAlchemy made the process of migrating from one to another quite simple too.

https://www.sqlalchemy.org/
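
For context, the "one config variable" in SQLAlchemy terms is just the connection URL (paths and credentials below are placeholders):

    from sqlalchemy import create_engine

    # Before: engine = create_engine("sqlite:////var/lib/myapp/app.db")
    engine = create_engine("postgresql+psycopg2://user:password@localhost/myapp")
    # Model definitions and queries stay the same.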


Thank you for highlighting this. I learned this about a year ago when I tried porting my storage layer from postgres to sqlite... loads of little things that don't quite work the way you expect them to, or require a bit more code as workarounds.


Normally you expect the reverse migration, from SQLite3 to PG!


Is it really duck typed if no static type checking is performed on the SQL? Duck typing to me seems to imply checking quacks and waddles at "compile" time. I would think it's at most loosely typed and at worst dynamically typed.

EDIT: Without the new strict mode of course, where it is not duck typed but typed by name.

EDIT 2: Actually maybe duck typing is fine to use for dynamic languages too. The affinity of INT keys and stuff is sort of duck typing I guess.
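
A small demo of what affinity does, using the stdlib sqlite3 module:

    # A numeric-looking string fed to an INTEGER column gets quietly coerced,
    # while a number fed to a TEXT column is stored as text.
    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (n INTEGER, s TEXT)")
    con.execute("INSERT INTO t VALUES ('123', 456)")
    print(con.execute("SELECT n, typeof(n), s, typeof(s) FROM t").fetchall())
    # [(123, 'integer', '456', 'text')]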


You're getting downvoted because the most well-known duck-typed languages, and probably where people first learned the term, are all very dynamic languages like python, ruby, and even js.

I'm sure there are others, but the main language anyone knows static duck typing for is go. In go, at least, interfaces get checked against implementations at compile time, based on what instances are bound to what interface variables or parameters.


Just to add more detail. SQLite was designed for Tcl interop. Its typing model reflects that.


TypeScript is another mainstream programming language that allows, or even encourages, static duck typing.


[dead]


I'd say that structural typing is a type system feature/paradigm that enables you to use duck typing in a statically-verifiable way. Duck typing is a "pattern" which you can do in dynamic languages (because there's nobody checking), or in static structurally typed languages.
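
For comparison, the same pattern in Python rather than TypeScript (typing.Protocol gives you statically checkable duck typing; the names are made up):

    from typing import Protocol

    class Quacks(Protocol):
        def quack(self) -> str: ...

    class Duck:
        def quack(self) -> str:
            return "quack"

    class Robot:
        def beep(self) -> str:
            return "beep"

    def greet(bird: Quacks) -> str:
        return bird.quack()

    greet(Duck())    # fine: Duck matches the protocol structurally
    # greet(Robot()) # mypy/pyright reject this; at runtime it would be an AttributeError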

I'm not sure why you'd call duck typing "dogma"? It's just a way of writing code.

https://www.typescriptlang.org/play?#code/JYOwLgpgTgZghgYwgA...


There's also no word about indexing or analysis (and postgres is very sensitive to having up-to-date statistics).

And on sqlite you might be leaving memory / storage on the table if you use a UUID as PK and don't remove the implicit ROWID.
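
For reference, the difference is one clause (table names invented for illustration):

    # In an ordinary table a TEXT/BLOB primary key lives in a separate unique
    # index beside the hidden rowid; WITHOUT ROWID makes the declared key the
    # actual storage key, so the UUID is stored once instead of twice.
    import sqlite3

    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE jobs_rowid (id TEXT PRIMARY KEY, payload TEXT)")
    con.execute(
        "CREATE TABLE jobs_clustered (id TEXT PRIMARY KEY, payload TEXT) WITHOUT ROWID"
    )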


I did not feel that was a relevant topic for this article. I have spent a good amount of time over the life of this application optimizing indexes, though. But the article is about how I designed a test to benchmark these two databases, not a general discussion on all the optimizations you can make.


In postgres there's usually no good reason to use varchar unless you specifically want to enforce size limits. If you don't care, use TEXT; the performance is the same.


For this particular application I discuss in the article it would make no difference, since only one table uses UUIDs and it isn't heavily used. All the other tables use integer primary keys.

Also, this may not be a very popular opinion, but I prefer to design the database without using database-specific features: things such as using SQLAlchemy and sticking to basic types. Sure it may affect performance some, but thanks to having done that I was able to port the app from SQLite to PostgreSQL in like an hour.


> Also, this may not be a very popular opinion, but I prefer to design the database without using database-specific features.

The problem with that is that SQL as a standard is so patchy and inconsistently implemented that effectively everything has database-specific features. You discovered that yourself when you inadvertently relied on SQLite's "feature" of letting you store UUID strings in an integer column. No matter how portable you imagine your schema is, it will work very, very differently between any two SQL implementations.

I can't see any value whatsoever in voluntarily tying one hand behind your back performance-wise, honestly. It's not like changing "text" to "uuid" would have noticeably increased the time this port-job took you.

As a wise man once said, if you're doing a job manually while the tool designed for that job is sitting idle beside you, the tool is smarter than you are.


> The problem with that is that SQL as a standard is so patchy and inconsistently implemented that effectively everything has database-specific features.

Noting that when it comes to such patchiness, sqlite's developers specifically prefer to follow PG's example. When they write new SQL features, their basis for comparison is literally always "do we get the same answer as PG?" Yes, there are differences between the two, but porting between sqlite and PG, provided no fancy PG features are involved, is very likely simpler than porting between any two other arbitrary SQL engines. (That's ignoring any performance factors - I'm only talking about semantic compatibility.)


> Also, this may not be a very popular opinion, but I prefer to design the database without using database-specific features.

I prefer to leverage the DB schema to do as much as possible for me. Generally with PostgreSQL that means using a lot of domain types etc.
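
A minimal sketch of what that looks like, assuming psycopg2 is installed and a server is reachable (the DSN, domain, and table are invented for illustration):

    import psycopg2

    conn = psycopg2.connect("dbname=myapp")  # placeholder DSN
    with conn, conn.cursor() as cur:
        # A domain is a named, reusable constraint on top of a base type,
        # so every column declared with it gets the same validation.
        cur.execute("""
            CREATE DOMAIN email AS text
              CHECK (value ~ '^[^@]+@[^@]+$');

            CREATE TABLE users (
              id    bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
              email email NOT NULL UNIQUE
            );
        """)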


> Sure it may affect performance some, but thanks to having done that I was able to port the app from SQLite to PostgreSQL in like an hour.

This is unpopular because it's a bad tradeoff. In the real world, performance matters all day, every day, while switching from Postgres to SQLite is an incredibly unusual scenario (especially because their hosting models are completely different).

Tuning a database for performance (or just taking the hit and scaling the database) is far more expensive than the likely-not-going-to-happen migration from one RDBMS to another. The vast majority of companies will never do it. It makes no sense to optimize for it at great cost elsewhere.


That's ignoring the specific use case of TFA though, which is not "I'm setting out to build something for my bigcorp to scale infinitely" but "I built a little tool that ended up handling more than I thought it would, good thing I used an ORM so that I could switch to X based on actual metrics and A/B tests."

I'm as anti-ORM as the next jaded dev but this is a picture-perfect use case for it: use an ORM to rapidly prototype and get the job done and leave a path for an industrial solution. The adage "you never switch your RDBMS" does not apply here.


> this may not be a very popular opinion, but I prefer to design the database without using database-specific features

I used to be in this camp, having started my dev life in Rails when it ran Sqlite in dev. This turned out to be an anti-pattern for Rails because dev/prod parity was never quite there. Over time I also realized it just wasn't worth the effort to keep things agnostic. Sure a hypothetical migration will be easier later on, but that migration very, very, very (very) rarely occurs in my experience.


You can't completely avoid using database-specific features - even the most basic types such as strings effectively require database-specific treatment at least if you want strings that are long and/or unicode.


Prefer this approach too. Makes unit testing easier and makes data more portable. Stick to the commonly supported set of SQL data-types.


> Makes unit testing easier

In what language? Unit testing uses whatever DB client comes with your stack, and those clients have generic interfaces.

For example, they will transparently decode UUIDs into strings when you read them, and write them back as the database's UUID type when you write. As far as your code is concerned, they were always just strings.

The same is true of dates for mainstream, mature stacks.

> Stick to the commonly supported set of SQL data-types.

Postgres is among the most vanilla SQL implementations. UUID/GUID/uniqueidentifier is available in every modern SQL-based RDBMS.

Refusing to use the idioms of your database is a guaranteed way to lose performance for no reason.


In languages like Go, Java, and C++. The product in question is an integration product where the customer can configure their choice of database in a public/private cloud.

Using basic SQL types that have common support across all database vendors simplified development, testing, and support. I have seen folks waste weeks (to months!) in data migration/porting and still lose time troubleshooting issues across databases when they used db-specific features. Too much tech debt incurred.

The minimal loss of performance is weighed against simplicity, maintainability, and no vendor lock-in. When you have the same tests, the same schema, and the same DB layer code working against multiple databases, life becomes wonderful and hassle-free.


> The product in question is an integration product where customer can configure choice of database in public/private cloud.

I've seen this type of thing before, but it's not a common use case (and I'm not sure why people still do it).

It makes more sense to just say that the product uses a FOSS RDBMS (like Postgres) and leave it at that. There's no vendor lock-in.


Because many customers have their own expertise in the database system of their choice. A shop that has hired Oracle DBAs doesn't want to hire Postgres folks. A MySQL-based place doesn't want to hire Oracle folks. And they have already built extensive data-management processes around their choice of database system and have the in-house experience to manage it on a wide variety of platforms.

If you want to sell a packaged, deployable product to the widest market, yet want to minimise support, you leverage the expertise that the customer already possesses.

This is probably more of a consideration in the enterprise space than for a simple end-consumer SaaS, so I am not saying this approach makes sense everywhere.


I've done this before: used sqlite as a unit-testing stand-in for a read-only MSSQL database my app used. It allowed me to unit test the data transform code conveniently. But that's about as far as I'd go. Running tests against one database and production against another is a great way to discover bugs in production, like the text length constraint covered in the article. And if speed is a concern but data storage is not, just launch the test database with tmpfs-backed storage.


Just wait until you learn that SQLite stores everything as a string :(


True, but there's a difference between representing a UUID as a sequence of bytes in a string and as the ASCII-encoded hex characters representing those bytes.
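
Concretely (stdlib only; the table is made up):

    import sqlite3
    import uuid

    u = uuid.uuid4()
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (as_text TEXT, as_blob BLOB)")
    con.execute("INSERT INTO t VALUES (?, ?)", (str(u), u.bytes))
    # The hex representation takes 36 bytes, the raw bytes take 16.
    print(con.execute("SELECT length(as_text), length(as_blob) FROM t").fetchone())
    # (36, 16)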



