Re: tableam vs. TOAST

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Prabhat Sahu <prabhat(dot)sahu(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: tableam vs. TOAST
Date: 2019-09-12 13:03:43
Message-ID: CA+Tgmoa0zFcaCpOJCsSpOLLGpzTVfSyvcVB-USS8YoKzMO51Yw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Aug 2, 2019 at 6:42 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> The obvious problem with this is the slice fetching logic. For slices
> with an offset of 0, it's obviously trivial to implement. For the higher
> slice logic, I'd assume we'd have to fetch the first slice by estimating
> where the start chunk is based on the current suggest chunk size, and
> restarting the scan earlier/later if not accurate. In most cases it'll
> be accurate, so we'd not lose efficiency.

Dilip and I were discussing this proposal this morning, and after
talking to him, I don't quite see how to make this work without
significant loss of efficiency. Suppose that, based on the estimated
chunk size, you decide that there are probably 10 chunks and that the
value that you need is probably located in chunk 6. So you fetch chunk
6. Happily, chunk 6 is the size that you expect, so you extract the
bytes that you need and go on your way.

But ... what if there are actually 6 chunks, not 10, the first five are
bigger than you expected, and the size of chunk 6 matched your
expectation only because it was the last chunk and thus smaller than
the rest? Now you've just returned the wrong answer.

AFAICS, there's no way to detect that except to always read at least
two chunks, which seems like a pretty heavy price to pay. It costs
nothing extra if you were going to read at least 2 chunks anyway, but
if you only needed to read 1, having to fetch 2 stinks.

Actually, when I initially read your proposal, I thought you were
proposing to relax things such that the chunks didn't even have to all
be the same size. That seems like it would be something cool that
could potentially be leveraged not only by AMs but perhaps also by
data types that want to break up their data strategically to cater to
future access patterns. But then there's really no way to make slicing
work without always reading from the beginning.
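In that fully-variable-size world, with no stored offsets, locating byte N really does degenerate to a sequential walk. A minimal sketch (hypothetical, not PostgreSQL code):

```python
# With chunks of arbitrary, unrecorded sizes, finding the chunk that
# holds a given byte offset requires scanning from chunk 0 and summing
# lengths -- there is no arithmetic shortcut to jump into the middle.

def find_chunk_sequentially(chunk_lengths, offset):
    """Return (chunk_index, offset_within_chunk) by walking from the start."""
    pos = 0
    for idx, length in enumerate(chunk_lengths):
        if offset < pos + length:
            return idx, offset - pos
        pos += length
    raise IndexError("offset past end of datum")

# Chunks deliberately of wildly different sizes:
assert find_chunk_sequentially([500, 3000, 100, 1200], 3550) == (2, 50)
```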

There would be no big problem here - in either scenario - if each
chunk contained the byte-offset of that chunk relative to the start of
the datum. You could guess wildly and always know whether or not you
had got it right without reading any extra data. But that's not the
case.
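Continuing the toy sketch above (again hypothetical, not the actual TOAST chunk format): if each chunk carried its starting byte offset, a wild guess could be verified from a single fetch and corrected by restarting earlier or later, exactly as proposed:

```python
# Self-describing chunks: each chunk records the byte offset at which it
# starts.  A guess based on an estimated chunk size can then be checked
# against the fetched chunk, and the scan restarted earlier/later only
# when the guess was wrong.

def make_chunks_with_offsets(datum, chunk_size):
    """Chunks as (start_offset, data) pairs."""
    return [(off, datum[off:off + chunk_size])
            for off in range(0, len(datum), chunk_size)]

def fetch_slice_verified(chunks, offset, length, est_size):
    """Guess a chunk index from est_size; each iteration fetches one chunk
    and uses its stored offset to confirm or redirect the guess.
    (Assumes the requested slice fits in one chunk, for brevity.)"""
    idx = min(offset // est_size, len(chunks) - 1)   # wild guess
    while True:
        start, data = chunks[idx]                    # one chunk fetch
        if start <= offset < start + len(data):
            return data[offset - start:offset - start + length]
        idx += -1 if offset < start else 1           # restart earlier/later

datum = bytes(range(256)) * 40

# Even with a badly wrong size estimate (1000 vs. actual 2000), the
# stored offsets let the reader detect the miss and converge on the
# right chunk instead of silently returning the wrong bytes:
chunks = make_chunks_with_offsets(datum, 2000)
assert fetch_slice_verified(chunks, 5500, 8, 1000) == datum[5500:5508]
```

When the estimate happens to be right, the loop exits after the first fetch, so the common case still costs a single chunk read.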

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
