Re: [WIP] speeding up GIN build with parallel workers

From: "Constantin S(dot) Pan" <kvapen(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [WIP] speeding up GIN build with parallel workers
Date: 2016-03-16 14:20:26
Message-ID: 20160316172026.48e63ec2@ppg
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 16 Mar 2016 18:08:38 +0530
Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:

> On Wed, Mar 16, 2016 at 2:55 PM, Constantin S. Pan <kvapen(at)gmail(dot)com>
> wrote:
> >
> > On Wed, 16 Mar 2016 12:14:51 +0530
> > Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > > On Wed, Mar 16, 2016 at 5:41 AM, Constantin S. Pan
> > > <kvapen(at)gmail(dot)com> wrote:
> > > > 3. Tested on some real data (GIN index on email message body
> > > > tsvectors). Here are the timings for different values of
> > > > 'gin_shared_mem' and 'gin_parallel_workers' on a 4-CPU
> > > > machine. Seems 'gin_shared_mem' has no visible effect.
> > > >
> > > > wnum mem(MB) time(s)
> > > > 0 16 247
> > > > 1 16 256
> > > >
> > >
> > >
> > > It seems from you data that with 1 worker, you are always seeing
> > > slowdown, have you investigated the reason of same?
> > >
> >
> > That slowdown is expected. It slows down because with 1 worker it
> > has to transfer the results from the worker to the backend.
> >
> > The backend just waits for the results from the workers and merges
> > them (in case wnum > 0).
>
> Why backend just waits, why can't it does the same work as any worker
> does? In general, for other parallelism features the backend also
> behaves the same way as worker in producing the results if the
> results from workers is not available.

We can make backend do the same work as any worker, but that
will complicate the code for less than 2 % perfomance boost.
This performance issue matters even less as you increase the
number of workers N, since you save only 1/N-th of the
transfer time.

Is backend waiting for workers a really bad practice that
should be avoided at all costs, or can we leave it as is?

Regards,

Constantin S. Pan
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2016-03-16 14:29:37 Re: Declarative partitioning
Previous Message Aleksander Alekseev 2016-03-16 14:10:57 Patch: typo, s/espaced/escaped/