Re: Yet another fast GiST build (typo)

From: "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Erik Rijkers <er(at)xs4all(dot)nl>, Michael Paquier <michael(at)paquier(dot)xyz>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Yet another fast GiST build (typo)
Date: 2020-09-06 18:33:25
Lists: pgsql-hackers

> On 6 Sep 2020, at 18:26, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> On 05/09/2020 14:53, Andrey M. Borodin wrote:
>> Thanks for ideas, Heikki. Please see v13 with proposed changes.
> Thanks, that was quick!
>> But I've found out that logging page-by-page slows down GiST build by
>> approximately 15% (when CPU constrained). Though I think that this
>> is IO-wise.
> Hmm, any ideas why that is? log_newpage_range() writes one WAL record for 32 pages, while now you're writing one record per page, so you'll have a little bit more overhead from that. But 15% seems like a lot.
I do not know. My guess is that it could be an effect of pglz compression in a cold state: it may be slower and compress less than pglz with a warm cache table. But this is a shot in the dark.
Nevertheless, here is a patch identical to v13 but with a third part: it logs flushed pages in batches of 32.
This brings CPU performance back, and makes it slightly better than it was before page-by-page logging.
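
To illustrate the difference in record counts between the two logging strategies, here is a minimal sketch in Python. It is not PostgreSQL code: the helper names `batches` and `record_count` and the page count of 1000 are hypothetical, and only the batch size of 32 comes from the discussion above.

```python
from typing import Iterator, List

BATCH_SIZE = 32  # pages per log record, as discussed in this thread


def batches(page_numbers: List[int], size: int = BATCH_SIZE) -> Iterator[List[int]]:
    """Yield consecutive groups of at most `size` page numbers."""
    for start in range(0, len(page_numbers), size):
        yield page_numbers[start:start + size]


def record_count(num_pages: int, size: int) -> int:
    """Number of log records needed when one record covers up to `size` pages."""
    return sum(1 for _ in batches(list(range(num_pages)), size))


if __name__ == "__main__":
    pages = 1000  # hypothetical index size in pages
    print("page-by-page records:", record_count(pages, 1))
    print("batched records:     ", record_count(pages, BATCH_SIZE))
```

The sketch only counts records; it says nothing about where the extra CPU time in the page-by-page variant actually goes.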

Some details about the test:
MacOS, 6-core i7
psql -c '\timing' -c "create table x as select point (random(),random()) from generate_series(1,10000000,1);" -c "create index on x using gist (point);"

With patch v13 this takes 20.567 seconds, with v14 18.149 seconds, and with v12 about 18.3 seconds (so the slowdown is closer to 10%, by the way; sorry for the miscalculation). This was not statistically significant testing, just a quick laptop benchmark with 2-3 runs to verify stability.
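
The relative differences implied by these timings can be checked with a short calculation. This is a sketch only: the function name `relative_change` is illustrative, and the inputs are the single-run timings quoted above, with v12 taken as the approximate 18.3 s.

```python
def relative_change(new: float, base: float) -> float:
    """Percentage change of `new` relative to `base` (positive means slower)."""
    return (new - base) / base * 100.0


# Timings in seconds, as reported in this message (v12 is approximate).
timings = {"v12": 18.3, "v13": 20.567, "v14": 18.149}

if __name__ == "__main__":
    for version in ("v13", "v14"):
        change = relative_change(timings[version], timings["v12"])
        print(f"{version} vs v12: {change:+.1f}%")
```

With so few runs, the resulting percentages should be read as rough indications rather than measurements.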

Best regards, Andrey Borodin.

Attachment Content-Type Size
v14-0001-Add-sort-support-for-point-gist_point_sortsuppor.patch application/octet-stream 4.2 KB
v14-0002-Implement-GiST-build-using-sort-support.patch application/octet-stream 19.3 KB
v14-0003-Log-GiST-build-with-packs-of-32-pages.patch application/octet-stream 3.8 KB
