I was pretty confused by the index performance when getting ~50% of the table.
I think what's happening is that since the tuples in the medium table are larger, less are stored on each page. Just looking at the sizes of the tables:
medium: 977 MB
Large: 25 MB
we can assume that the medium query had to access 39.08 ( 977 / 25 ) * more pages.
medium time: 3198.539
large time: 85.74
3198.539 / 85.74 = 37 which seems to track.
What was confusing me is that I thought that the io would take longer for the large text objects because there's more of it. I think that the reason why this isn't the case is because TOAST is probably optimised to make fewer calls to io, but they tend to be much larger. it's not necessarily the size of the data, but rather how many syscalls you have to make to request the data.
pretty unsure about this, but it seems to make sense now.
I think what's happening is that since the tuples in the medium table are larger, less are stored on each page. Just looking at the sizes of the tables:
medium: 977 MB
Large: 25 MB
we can assume that the medium query had to access 39.08 ( 977 / 25 ) * more pages.
medium time: 3198.539
large time: 85.74
3198.539 / 85.74 = 37 which seems to track.
What was confusing me is that I thought that the io would take longer for the large text objects because there's more of it. I think that the reason why this isn't the case is because TOAST is probably optimised to make fewer calls to io, but they tend to be much larger. it's not necessarily the size of the data, but rather how many syscalls you have to make to request the data.
pretty unsure about this, but it seems to make sense now.