From: "Constantin S(dot) Pan" <kvapen(at)gmail(dot)com>
Subject: Re: [WIP] speeding up GIN build with parallel workers
On Sat, 16 Jan 2016 01:38:39 +0300
"Constantin S. Pan" <kvapen(at)gmail(dot)com> wrote:
> The task of building GIN can require lots of time and eats 100 % CPU,
> but we could easily make it use more than a 100 %, especially since we
> now have parallel workers in postgres.
> The process of building GIN looks like this:
> 1. Accumulate a batch of index records into an rbtree in maintenance
> work memory.
> 2. Dump the batch to disk.
> 3. Repeat.
> I have a draft implementation which divides the whole process between
> N parallel workers, see the patch attached. Instead of a full scan of
> the relation, I give each worker a range of blocks to read.
> This speeds up the first step N times, but slows down the second one,
> because when multiple workers dump item pointers for the same key,
> each of them has to read and decode the results of the previous one.
> That is a huge waste, but there is an idea on how to eliminate it.
> When it comes to dumping the next batch, a worker does not do it
> independently. Instead, it (and every other worker) sends the
> accumulated index records to the parent (backend) in ascending key
> order. The backend, which receives the records from the workers
> through shared memory, can merge them and dump each of them once,
> without the need to reread the records N-1 times.
> In current state the implementation is just a proof of concept
> and it has all the configuration hardcoded, but it already works as
> is, though it does not speed up the build process more than 4 times
> on my configuration (12 CPUs). There is also a problem with temporary
> tables, for which the parallel mode does not work.
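The block-range split described in the quoted text can be illustrated with a minimal sketch. It is written in Python rather than the C of the patch, and the function name `block_ranges` and the even-split policy are assumptions made for illustration, not what the patch necessarily does.

```python
def block_ranges(nblocks, nworkers):
    """Split blocks 0..nblocks-1 into nworkers contiguous half-open ranges.

    Hypothetical helper: the first (nblocks % nworkers) workers get one
    extra block so that the range sizes differ by at most one.
    """
    base, extra = divmod(nblocks, nworkers)
    ranges = []
    start = 0
    for worker in range(nworkers):
        length = base + (1 if worker < extra else 0)
        ranges.append((start, start + length))
        start += length
    return ranges


# Each worker scans only its own (start, end) range instead of the whole relation.
print(block_ranges(10, 4))  # [(0, 3), (3, 6), (6, 8), (8, 10)]
```

Because the ranges are disjoint, no two workers ever produce the same item pointer, which is what makes a later merge by key straightforward.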
I have made some progress on the proposal (see the attached patch):
0. Moved some repeated code to functions (e.g. ginDumpAccumulator,
1. Implemented merging of the results in the backend.
2. Disabled parallel mode when creating an index on temporary
tables. No point in using parallel mode for temporary tables anyway,
3. Added GUC parameter to control the number of workers for GIN
4. Hit an 8x speedup limit and analyzed the reasons (see the
attached plot or the data file).
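The backend-side merge from item 1 could look roughly like the following. This is a Python sketch under stated assumptions, not the code in the patch: each worker is modelled as an iterable of `(key, sorted_item_pointers)` pairs in ascending key order, item pointers are modelled as `(block, offset)` tuples, and the name `merge_worker_streams` is hypothetical.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_worker_streams(streams):
    """Merge per-worker streams of (key, sorted item pointers).

    Every stream must already be in ascending key order. Each key is
    yielded exactly once, with the item pointers of all workers merged
    into a single sorted list.
    """
    # K-way merge of the streams by key only.
    by_key = heapq.merge(*streams, key=itemgetter(0))
    for key, group in groupby(by_key, key=itemgetter(0)):
        # Merge the sorted item pointer lists that share this key.
        tids = heapq.merge(*(pointers for _, pointers in group))
        yield key, list(tids)


worker_a = [(1, [(0, 1), (0, 5)]), (3, [(1, 2)])]
worker_b = [(1, [(7, 1)]), (2, [(8, 3)]), (3, [(9, 9)])]
for key, tids in merge_worker_streams([worker_a, worker_b]):
    print(key, tids)
```

With this shape each key is written once by the backend, so no worker has to read and decode what another worker dumped for the same key.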
To analyze the performance issues, I ran the following test:
create table t (k int, v int[]);

-- returns an array of 'width' random integers in [low, high]
create or replace
function randarray(width int, low int, high int)
returns int[] as
$$
    select array(select (random()*(high-low) + low)::int
                 from generate_series(1, width))
$$ language sql;

insert into t select k, randarray(3000, 0, 100000)
from generate_series(1, 100000) as k;

create index t_v_idx on t using gin (v);
This creates 100000 arrays of 3000 random integers each, drawn from the
range [0, 100000]. I then measured how long each of the two GIN build
steps takes: scan and merge.
The results show that the 'scan' step scales perfectly with the number
of workers, but the 'merge' step takes longer as the number of workers
increases. The profiler shows that the bottleneck here is
ginMergeItemPointers(), which I use to merge the results.
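For readers unfamiliar with it, ginMergeItemPointers() merges two sorted arrays of item pointers into one. A rough Python analog of that kind of two-way merge is sketched below. It is not the C function itself: the name `merge_item_pointers` and the `(block, offset)` tuple representation are assumptions made for illustration.

```python
def merge_item_pointers(a, b):
    """Merge two sorted lists of (block, offset) tuples into one sorted list.

    An item pointer present in both inputs appears only once in the output.
    """
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] < b[j]:
            out.append(a[i])
            i += 1
        elif a[i] > b[j]:
            out.append(b[j])
            j += 1
        else:
            # Same item pointer on both sides: keep a single copy.
            out.append(a[i])
            i += 1
            j += 1
    # At most one of the inputs still has elements left.
    out.extend(a[i:])
    out.extend(b[j:])
    return out


print(merge_item_pointers([(0, 1), (2, 4)], [(0, 1), (1, 3)]))
# [(0, 1), (1, 3), (2, 4)]
```

Every call walks both inputs in full, so the cost of one merge is proportional to the combined length of the two lists.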
I also ran into a problem with workers deadlocking during
heap_open, but that seems to have been resolved by Robert Haas in his
commit regarding group locking.
Please leave your feedback!
Constantin S. Pan
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company