Re: Parallel GiST build on Cube

From: Darafei "Komяpa" Praliaskouski <me(at)komzpa(dot)net>
To: Shyam Saladi <saladi(at)caltech(dot)edu>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parallel GiST build on Cube
Date: 2020-04-27 18:48:47
Message-ID: CAC8Q8t+2YFhgu=Poj0ynz92MqRSP_6UDxhKSf0LQ_YarkA83ow@mail.gmail.com
Lists: pgsql-hackers

Hello,

Here are the GiST efforts I know of that could help:
- Sorting-based fast GiST build, a commitfest entry by Andrey Borodin; not
parallel, but faster -
https://www.postgresql.org/message-id/flat/1A36620E-CAD8-4267-9067-FB31385E7C0D%40yandex-team.ru

- Sorting-based fast GiST build by Nikita Glukhov; it reuses the btree code,
so it is also parallel -
https://github.com/postgres/postgres/compare/master...glukhovn:gist_btree_build

- A "Choose Subtree" routine is needed, as the current "penalty" approach is
very inefficient -
https://www.postgresql.org/message-id/flat/CAPpHfdssv2i7CXTBfiyR6%3D_A3tp19%2BiLo-pkkfD6Guzs2-tvEw%40mail.gmail.com#eaa98342462a4713c0d3a94be636e259
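
To illustrate the general idea behind the sorting-based builds in the first
two items, here is a rough Python sketch. It is not taken from either patch:
the Z-order (Morton) key, the non-negative integer coordinates, the page size,
and all function names are assumptions made for the example. The points are
sorted along a space-filling curve and then packed into leaf pages bottom-up,
so no per-tuple tree descent is needed.

```python
from typing import List, Tuple

Point = Tuple[int, ...]
Box = Tuple[Point, Point]  # (lower corner, upper corner)


def morton_key(point: Point, bits: int = 16) -> int:
    """Interleave the bits of non-negative integer coordinates (Z-order).

    Hypothetical helper: a real opclass would have to map float cube
    coordinates to a sortable key in its own way.
    """
    key = 0
    dims = len(point)
    for bit in range(bits):
        for d, coord in enumerate(point):
            key |= ((coord >> bit) & 1) << (bit * dims + d)
    return key


def bounding_box(points: List[Point]) -> Box:
    """Smallest axis-aligned box that contains all the given points."""
    lower = tuple(min(c) for c in zip(*points))
    upper = tuple(max(c) for c in zip(*points))
    return lower, upper


def build_leaf_pages(points: List[Point], page_size: int = 4) -> List[Tuple[Box, List[Point]]]:
    """Sort the points once, then fill leaf pages sequentially."""
    ordered = sorted(points, key=morton_key)
    pages = [ordered[i:i + page_size] for i in range(0, len(ordered), page_size)]
    return [(bounding_box(page), page) for page in pages]
```

Upper levels can be built the same way by packing the leaf bounding boxes
into parent pages. Because the expensive step is a sort, a build of this
shape could in principle reuse an existing parallel sort.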

These are much wanted for PostGIS, which also indexes everything as 2- to
4-dimensional cubes. They require improvements in both the core
infrastructure and the opclass.

On Mon, Apr 27, 2020 at 8:57 PM Shyam Saladi <saladi(at)caltech(dot)edu> wrote:

> Hello --
>
> I regularly build GiST indexes on large databases. Recently, the size of
> the database has ballooned (30 million rows), along with the build time on
> a column of points in 3- and 8-dimensional space (0-volume cube).
>
> Is anyone working on (or has anyone already implemented) a parallel GiST build on
> Cube? If not, I'd like to try contributing this--any pointers from folks
> familiar with the backend of Cube? Any input would be great.
>
> Thanks,
> Shyam
>
> --
> Shyam Saladi <http://shyam.saladi.org>
> NSF Graduate Research Fellow - Clemons Lab
> Biochemistry and Molecular Biophysics
> California Institute of Technology
>

--
Darafei Praliaskouski
Support me: http://patreon.com/komzpa
