In my experience a huge portion of the userbase who loves CTEs are analysts or d...

epgui · on Jan 12, 2023

Note that there are basically no special performance considerations when using CTEs in recent versions of postgres… unless your CTE is recursive or does some weird side-effect (which is unlikely).

srcreigh · on Jan 12, 2023

This is incorrect. If you use a CTE more than once in a query, Postgres will materialize that data into a temp table that has no indexes. If that CTE gets joined a lot, it will incur many O(N) searched through unindexed data.

I've seen cases where a few GBs of product data gets pulled into a CTE this way. Queries take 10s to complete from processing GBs of unindexed data

Zanfa · on Jan 12, 2023

Wow, this is amazing. Just the other day I was staring at a query with multiple CTEs that took 20+ seconds to run. Ended up adding an additional WHERE clause on an indexed column to the slowest CTE, which took the execution time down to 100s of ms.

Just tried the original query, but with explicitly not materialized CTEs and the runtime was basically as fast as the manually pre-filtered solution.

srcreigh · on Jan 12, 2023

Not materializing lets Postgres push extra where clauses and join conditions into the CTE so this makes sense!

Now you know why I spam all the “CTEs are amazing” threads about materialization (there have been 5 or so in the last few months)

btown · on Jan 12, 2023

Is there a way to force the temp table to have indices within the CTE definition?

srcreigh · on Jan 12, 2023

No, but this would make an incredible Postgres extension.

(I’d also love one that lets Postgres use a covering index for primary storage. This can reduce data size and load less data off disk due to locality. MySQL has this by default and SQLite has it via WITHOUT ROWID tables.)

chaps · on Jan 12, 2023

Strange take you have there in assuming they don't understand performance possibilities. Many of the problems that analysts and such work on aren't performance bound because focusing on the performance would be a net waste of time. "Don't optimize early" and all that. For example, I had to share SQL recently where the absolute most important thing, beyond all else was to make it readable. CTEs fit the role.

srcreigh · on Jan 12, 2023

Postgres has materialization options for CTEs. You can ask it to rewrite the CTE as a join via "WITH foobar AS NOT MATERIALIZED (...)". Alternatively you can force materialization by writing "WITH foobar AS MATERIALIZED (...)"

emodendroket · on Jan 12, 2023

Readability over optimization is often a reasonable compromise, don't you think?

rkagerer · on Jan 12, 2023

So could an optimizer be built that let's you have your cake and eat it too? CTE syntax but without the performance compromise?

emodendroket · on Jan 12, 2023

Based on other comments, this has been done for psql, or at least to some extent. So that's the other thing... pre-emptively writing code in a non-intuitive way for performance reasons often leads to practices sticking around long past the rationale for them even being true.

hobs · on Jan 16, 2023

In my experience no, not in a production database that's doing real work.