Quick Links

Re: Tid scan improvements

From:	Edmund Horner <ejrh00(at)gmail(dot)com>
To:	tomas(dot)vondra(at)2ndquadrant(dot)com
Cc:	David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: Tid scan improvements
Date:	2018-11-24 05:20:25
Message-ID:	CAMyN-kBd0K5ZK6bXd4JvA5aYDa1rVOhRnm8kB1Av0enV3gjkvQ@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Sat, 24 Nov 2018 at 15:46, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> On 11/24/18 1:56 AM, Edmund Horner wrote:
> > On Fri, 23 Nov 2018 at 07:03, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> >> On 11/22/18 8:41 AM, David Rowley wrote:
> >>> ...
> >>> 3. I'd rather see EnsureTidRangeSpace() keep doubling the size of the
> >>> allocation until it reaches the required size. See how
> >>> MakeSharedInvalidMessagesArray() does it. Doing it this way ensures
> >>> we always have a power of two sized array which is much nicer if we
> >>> ever reach the palloc() limit as if the array is sized at the palloc()
> >>> limit / 2 + 1, then if we try to double it'll fail. Of course, it's
> >>> unlikely to be a problem here, but... the question would be how to
> >>> decide on the initial size.
> >>
> >> I think it kinda tries to do that in some cases, by doing this:
> >>
> >> *numAllocRanges *= 2;
> >> ...
> >> tidRanges = (TidRange *)
> >> repalloc(tidRanges,
> >> *numAllocRanges * sizeof(TidRange));
> >>
> >> The problem here is that what matters is not numAllocRanges being 2^N,
> >> but the number of bytes allocated being 2^N. Because that's what ends up
> >> in AllocSet, which keeps lists of 2^N chunks.
> >>
> >> And as TidRange is 12B, so this is guaranteed to waste memory, because
> >> no matter what the first factor is, the result will never be 2^N.
> >
> > For simplicity, I think making it a strict doubling of capacity each
> > time is fine. That's what we see in numerous other places in the
> > backend code.
> >
>
> Sure.
>
> > What we don't really see is intentionally setting the initial capacity
> > so that each subsequent capacity is close-to-but-not-exceeding a power
> > of 2 bytes. You can't really do that optimally if working in terms of
> > whole numbers of items that aren't each a power of 2 size. This step,
> > there may be 2/3 of an item spare; next step, we'll have a whole item
> > spare that we're not going to use.
>
> Ah, I missed the detail with setting initial size.
>
> > So we could keep track in terms of bytes allocated, and then figure
> > out how many items we can fit at the current time.
> >
> > In my opinion, such complexity is overkill for Tid scans.
> >
> > Currently, we try to pick an initial size based on the number of
> > expressions. We assume each expression will yield one range, and
> > allow that a saop expression might require us to enlarge the array.
> >
> > Again, for simplicity, we should scrap that and pick something like
> > floor(256/sizeof(TidRange)) = 21 items, with about 1.5% wastage.
> >
>
> Probably. I don't think it'd be a lot of code to do the exact sizing,
> but you're right 1.5% is close enough. As long as there is a comment
> explaining the initial sizing, I'm fine with that.
>
> If I could suggest one more thing, I'd define a struct combining the
> array of ranges, numRanges and numAllocRangeslike:
>
> typedef struct TidRanges
> {
> int numRanges;
> int numAllocRanges;
> TidRange ranges[FLEXIBLE_ARRAY_MEMBER];
> } TidRanges;
>
> and use that instead of the plain array. I find it easier to follow
> compared to passing the various fields directly (sometimes as a value,
> sometimes pointer to the value, etc.).

Ok, I've made rewritten it to use a struct:

typedef struct TidRangeArray {
TidRange *ranges;
int numRanges;
int numAllocated;
} TidRangeArray;

which is slightly different from the flexible array member version you
suggested. The TidRangeArray is allocated on the stack in the
function that builds it, and then ranges and numRanges are copied into
the TidScanState before the function returns.

Any particular pros/cons of this versus your approach? With yours, I
presume we'd have a pointer to TidRanges in TidScanState.

My other concern now is that EnsureTidRangeSpace needs a loop to
double the allocated size. Most such arrays in the backend only ever
grow by 1, so a single doubling is fine, but the TID scan one can grow
by an arbitrary number with a scalar array op, and it's nice to not
have to check the space for each individual item. Here's what I've
got.

void
EnsureTidRangeSpace(TidRangeArray *tidRangeArray, int numNewItems)
{
int requiredSpace = tidRangeArray->numRanges + numNewItems;
if (requiredSpace <= tidRangeArray->numAllocated)
return;

/* it's not safe to double the size unless we're less than half MAX_INT */
if (requiredSpace >= INT_MAX / 2)
tidRangeArray->numAllocated = requiredSpace;
else
while (tidRangeArray->numAllocated < requiredSpace)
tidRangeArray->numAllocated *= 2;

tidRangeArray->ranges = (TidRange *)
repalloc(tidRangeArray->ranges,
tidRangeArray->numAllocated * sizeof(TidRange));
}

If you're in danger of overflowing numAllocated with the number of
TIDs in your query, you're probably going to have other problems. But
I'd prefer to at least not get stuck in an infinite doubling loop.

Note that you don't need any single ScalarArrayOp to return a huge
result, because you can have multiple such ops in your query, and the
results for each all need to get put into the TidRangeArray before
de-duplication occurs.

What's a safe way to check that we're not trying to process too many items?

In response to

Re: Tid scan improvements at 2018-11-24 02:46:17 from Tomas Vondra

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Ron	2018-11-24 06:25:02	Re: could not connect to server, in order to operate pgAdmin/PostgreSQL
Previous Message	Paul A Jungwirth	2018-11-24 02:55:54	Re: SQL:2011 PERIODS vs Postgres Ranges?