Re: speedup tidbitmap patch: cache page

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: speedup tidbitmap patch: cache page
Date: 2014-12-17 11:28:43
Message-ID: CAApHDvrzZgb6SXQXd4xAoOGRQBsP-t1X-Li8LahBCUJxwLYmvg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 17 December 2014 at 05:25, Teodor Sigaev <teodor(at)sigaev(dot)ru> wrote:
>
> I've been having a look at this and I'm wondering about a certain scenario:
>>
>> In tbm_add_tuples, if tbm_page_is_lossy() returns true for a given block,
>> and on
>> the next iteration of the loop we have the same block again, have you
>> benchmarked any caching code to store if tbm_page_is_lossy() returned
>> true for
>> that block on the previous iteration of the loop? This would save from
>> having to
>> call tbm_page_is_lossy() again for the same block. Or are you just
>> expecting
>> that tbm_page_is_lossy() returns true so rarely that you'll end up
>> caching the
>> page most of the time, and gain on skipping both hash lookups on the next
>> loop,
>> since page will be set in this case?
>>
> I believe that if we fall in lossy pages then tidbitmap will not have a
> significant impact on preformance because postgres will spend a lot of
> time on tuple rechecking on page. If work_mem is to small to keep exact
> tidbitmap then postgres will significantly slowdown. I implemented it,
> (v2.1 in attachs) but
> I don't think that is an improvement, at least significant improvement.
>
>
You could well be right, but it would be good to compare the numbers just
so we know this for sure.

There seems to be a problem with the v2.1. The code does not look quite
right, if have expected something more like:

if (lossy_page == blk)
continue; /* whole page is already marked */

if (page == NULL || page->blockno != blk)

where you have the lossy_page check within the 2nd if test. I'd imagine
this will never be true as page->blockno != blk, or perhaps it'll only be
true when tbm_lossify() is called and you set the page to NULL.

There's also something weird going on using your original test case:

postgres=# set work_mem = '256MB';
SET
Time: 0.450 ms
postgres=# select count(*) from t where i >= 0;
count
---------
5000000
(1 row)

That's ok

Time: 1369.562 ms
postgres=# set work_mem = '64kB';
SET
Time: 0.425 ms
postgres=# select count(*) from t where i >= 0;
count
---------
4999998
(1 row)

Time: 892.726 ms

Notice the row counts are not the same.

create table t_result (i int not null);

postgres=# select a.a,a.b,b.a,b.b from (select i a,count(*) b from t group
by i) a full outer join (select i a,count(*) b from t_result group by i) b
on a.a = b.a where
b.b <> a.b;
a | b | a | b
--------+----+--------+---
315744 | 10 | 315744 | 8
(1 row)

postgres=# select ctid from t where i = 315744;
ctid
-------------
(13970,220)
(13970,221)
(13970,222)
(13970,223)
(13970,224)
(13970,225)
(13970,226)
(13971,1)
(13971,2)
(13971,3)
(10 rows)

This only seems to be broken in the v2.1.

Are you seeing the same?

>> It would be nice to see a comment to explain why it might be a good idea
>> to
>> cache the page lookup. Perhaps something like:
>>
>>
> added, see attachment (v2)
>
>
Thanks. That looks better. It would be good to compare performance of v2
and v2.1, but I think the problem with v2.1 needs to be fixed before that
can happen.

>
>> I also wondered if there might be a small slowdown in the case where the
>> index
>> only finds 1 matching tuple. So I tried the following:
>> avg.2372.4456 2381.909 99.6%
>> med.2371.224 2359.494 100.5%
>>
>> It appears that if it does, then it's not very much.
>>
>
> I believe, that's unmeasurable because standard deviation of your tests is
> about 2% what is greater that difference between pathed and master versions.
>
>
Agreed.

Regards

David Rowley

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2014-12-17 12:31:25 Re: Combining Aggregates
Previous Message didier 2014-12-17 11:27:04 Re: WALWriter active during recovery