From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: speedup tidbitmap patch: cache page
Date: 2014-12-17 11:28:43
Message-ID: CAApHDvrzZgb6SXQXd4xAoOGRQBsP-t1X-Li8LahBCUJxwLYmvg@mail.gmail.com
Lists: pgsql-hackers
On 17 December 2014 at 05:25, Teodor Sigaev <teodor(at)sigaev(dot)ru> wrote:
>
> I've been having a look at this and I'm wondering about a certain scenario:
>>
>> In tbm_add_tuples, if tbm_page_is_lossy() returns true for a given block,
>> and on
>> the next iteration of the loop we have the same block again, have you
>> benchmarked any caching code to store if tbm_page_is_lossy() returned
>> true for
>> that block on the previous iteration of the loop? This would save from
>> having to
>> call tbm_page_is_lossy() again for the same block. Or are you just
>> expecting
>> that tbm_page_is_lossy() returns true so rarely that you'll end up
>> caching the
>> page most of the time, and gain on skipping both hash lookups on the next
>> loop,
>> since page will be set in this case?
>>
> I believe that if we fall into lossy pages then tidbitmap will not have a
> significant impact on performance, because postgres will spend a lot of
> time rechecking tuples on the page. If work_mem is too small to keep an
> exact tidbitmap then postgres will slow down significantly. I implemented
> it (v2.1 in the attachments), but I don't think it is an improvement, at
> least not a significant one.
>
>
You could well be right, but it would be good to compare the numbers just
so we know this for sure.
There seems to be a problem with v2.1. The code does not look quite
right; I'd have expected something more like:
if (lossy_page == blk)
continue; /* whole page is already marked */
if (page == NULL || page->blockno != blk)
whereas you have the lossy_page check within the 2nd if test. I'd imagine
it will never be true there, since page->blockno != blk, or perhaps it'll
only be true after tbm_lossify() is called and the page is set to NULL.
There's also something weird going on with your original test case:
postgres=# set work_mem = '256MB';
SET
Time: 0.450 ms
postgres=# select count(*) from t where i >= 0;
count
---------
5000000
(1 row)
Time: 1369.562 ms
That's ok.
postgres=# set work_mem = '64kB';
SET
Time: 0.425 ms
postgres=# select count(*) from t where i >= 0;
count
---------
4999998
(1 row)
Time: 892.726 ms
Notice the row counts are not the same.
create table t_result (i int not null);
postgres=# select a.a,a.b,b.a,b.b from (select i a,count(*) b from t group
by i) a full outer join (select i a,count(*) b from t_result group by i) b
on a.a = b.a where
b.b <> a.b;
a | b | a | b
--------+----+--------+---
315744 | 10 | 315744 | 8
(1 row)
postgres=# select ctid from t where i = 315744;
ctid
-------------
(13970,220)
(13970,221)
(13970,222)
(13970,223)
(13970,224)
(13970,225)
(13970,226)
(13971,1)
(13971,2)
(13971,3)
(10 rows)
This only seems to be broken in the v2.1.
Are you seeing the same?
>> It would be nice to see a comment to explain why it might be a good idea
>> to
>> cache the page lookup. Perhaps something like:
>>
>>
> added, see attachment (v2)
>
>
Thanks. That looks better. It would be good to compare performance of v2
and v2.1, but I think the problem with v2.1 needs to be fixed before that
can happen.
>
>> I also wondered if there might be a small slowdown in the case where the
>> index
>> only finds 1 matching tuple. So I tried the following:
>> avg.  2372.4456  2381.909  99.6%
>> med.  2371.224   2359.494  100.5%
>>
>> It appears that if it does, then it's not very much.
>>
>
> I believe that's unmeasurable, because the standard deviation of your
> tests is about 2%, which is greater than the difference between the
> patched and master versions.
>
>
Agreed.
Regards
David Rowley