Re: Massive parallel queue table causes index deterioration, but REINDEX fails with deadlocks.

From: Gunther <raj(at)gusw(dot)net>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: "pgsql-performa(dot)" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Massive parallel queue table causes index deterioration, but REINDEX fails with deadlocks.
Date: 2019-02-24 02:55:00
Message-ID: c8131222-d8dd-54d8-9198-c604ed879ab4@gusw.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 2/23/2019 16:13, Peter Geoghegan wrote:
> On Sat, Feb 23, 2019 at 1:06 PM Gunther <raj(at)gusw(dot)net> wrote:
>> I thought to keep my index tight, I would define it like this:
>>
>> CREATE UNIQUE INDEX Queue_idx_pending ON Queue(jobId) WHERE pending;
>>
>> so that only pending jobs are in that index.
>>
>> When a job is done, follow up work is often inserted into the Queue as pending, thus adding to that index.
> How many distinct jobIds are there in play, roughly? Would you say
> that there are many fewer distinct Jobs than distinct entries in the
> index/table? Is the number of jobs fixed at a fairly low number, that
> doesn't really grow as the workload needs to scale up?

Jobs start on another, external queue, there were about 200,000 of them
waiting when I started the run.

When the SQL Queue is empty, the workers pick one job from the external
queue and add it to the SQL queue.

When that happens immediately 2 more jobs are created on that queue.
Let's cal it phase 1 a and b

When phase 1 a has been worked off, another follow-up job is created.
Let' s call it phase 2.

When phase 2 has been worked off, a final phase 3 job is created.

When that is worked off, nothing new is created, and the next item is
pulled from the external queue and added to the SQL queue.

So this means, each of the 200,000 items add (up to) 4 jobs onto the
queue during their processing.

But since these 200,000 items are on an external queue, the SQL queue
itself is not stuffed full at all. It only slowly grows, and on the main
index where we have only the pending jobs, there are only probably than
20 at any given point in time. When I said 7 jobs per second, it meant
7/s simultaneously for all these 3+1 phases, i.e., 28 jobs per second.
And at that rate it takes little less than 30 min for the index to
deteriorate. I.e. once about 50,000 queue entries have been processed
through that index it has deteriorated to become nearly unusable until
it is rebuilt.

thanks,
-Gunther

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message David Rowley 2019-02-24 06:41:08 Re: Massive parallel queue table causes index deterioration, but REINDEX fails with deadlocks.
Previous Message Peter Geoghegan 2019-02-23 21:13:53 Re: Massive parallel queue table causes index deterioration, but REINDEX fails with deadlocks.