Re: Patch: Optimize memory allocation in function 'bringetbitmap'

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch: Optimize memory allocation in function 'bringetbitmap'
Date: 2015-12-15 01:41:25
Message-ID: 566F6FC5.1080706@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 10/16/2015 08:00 PM, Jinyu Zhang wrote:
>
> Update the patch_brin_optimze_mem according to your comment.

I've reviewed the last version of the patch, and I do have a few
comments. I haven't done any performance testing at this point, as the
machine I'm using for that is chewing on something else at the moment.

1) bringetbitmap(PG_FUNCTION_ARGS)

+ /*
+ * Estimate the approximate size of btup and allocate memory for btup.
+ */
+ btupInitSize = bdesc->bd_tupdesc->natts * 16;
+ btup = palloc(btupInitSize);

What is the reasoning behind this estimate? I mean, this only accounts
for the values, not the NULL bitmap or BrinTuple fields. Maybe it does
not really matter much, but I'd like to see some brief description in
the docs, please.

This also interacts with our memory allocator in a rather annoying way,
because palloc() always allocates memory in 2^k chunks, but sadly the
estimate ignores that. So for natts=3 we get btupInitSize=48, but
internally allocate 64B (and ignore the top 16B).

2) bringetbitmap(PG_FUNCTION_ARGS)

if (tup)
{
if(size <= btupInitSize)
memcpy(btup, tup, size);
else
btup = brin_copy_tuple(tup, size);
LockBuffer(buf, BUFFER_LOCK_UNLOCK);
}

Is there any particular reason why not to update btupInitSize to the
size value? I mean, if the estimate is too low, then we'll call
brin_copy_tuple() for all BRIN tuples, no?

In other words, I think the code should do this:

if (tup)
{
if(size <= btupInitSize)
memcpy(btup, tup, size);
else
{
btup = brin_copy_tuple(tup, size);
brinInitSize = size;
}
LockBuffer(buf, BUFFER_LOCK_UNLOCK);
}

Another option might be the usual "doubling" strategy, i.e. do something
like this instead:

if (tup)
{
while (btupInitSize < size)
btupInitSize *= 2;

btup = repalloc(btup, btupInitSize);
memcpy(btup, tup, size);

LockBuffer(buf, BUFFER_LOCK_UNLOCK);
}

Possibly with only doing the repalloc when actually needed.

3) brin_new_memtuple(...)

The changes in this function seem wrong to me. Not only you've removed
the code that initialized all the bv_* attributes, you're also using
palloc() so there could easily be random data. This might be OK for our
code as we call brin_memtuple_initialize() right after this function,
and that seems to do the init, but AFAIK brin_new_memtuple() is a part
of our public API at it's in a header file. And these changes surely
break the contract documented in the comment above the function:

* Create a new BrinMemTuple from scratch, and initialize it to an empty
* state.

So I think this is wrong and needs to be reverted.

It's possible that we'll immediately reset the attributes, but that's
negligible cost - we expect to do brin_memtuple_initialize() many times
for the tuple, so it should not make any difference.

Another thing is that the three arrays (bt_values, bt_allnulls and
bt_hasnulls) may be allocated as a single chunk and then sliced.

char * ptr = palloc0(space for all three arrays);
bt_values = ptr;
bt_allnulls = ptr + (size of bt_values)
bt_bt_hasnulls = ptr + (size of bt_values + bt_allnulls)

which is essentially what the original code did with bv_values. This
reduces palloc() overhead and fragmentation.

4) brin_memtuple_initialize(...)

It's possible that the new code needs to reset more fields, but I don't
really see why it should mess with dtuple->bt_columns[i].bv_values for
example.

5) brin_deform_tuple(..)

The comment before the function probably should explain the purpose of
the new parameter (that it can either be NULL or an existing tuple).

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Greg Stark 2015-12-15 02:58:12 Re: Using quicksort for every external sort run
Previous Message Michael Paquier 2015-12-15 00:23:53 Re: Function and view to retrieve WAL receiver status