
From: Gunther Schadow <gunther(at)pragmaticdata(dot)com>
To: pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Massive parallel queue table causes index deterioration, but REINDEX fails with deadlocks.
Date: 2019-02-25 03:06:02
Message-ID: 9600aa3c-25a0-e057-e693-216fba4145b6@pragmaticdata.com
Lists: pgsql-performance

Wow, yes, partition instead of index, that is interesting. Thanks Corey
and Justin.

The index isn't required at all if all my pending jobs are in a
partition of only pending jobs. In that case the plan can just be a
sequential scan.

And Jeff Janes, sorry, I failed to show the LIMIT 1 clause on my dequeue
query. That was an omission. My query is actually somewhat more complex;
I translated it down to the essentials but forgot the LIMIT 1 clause.

SELECT seqNo, action
FROM Queue
WHERE pending
AND /* ... other criteria ... */
LIMIT 1
FOR UPDATE SKIP LOCKED;

And sorry, I didn't capture the stats from VACUUM VERBOSE. They would
have been confusing anyway, because there are other things involved.

Anyway, I think the partitioned table is the right and brilliant
solution, because an index really isn't required. The actual pending
partition will always remain quite small, and being a queue, it doesn't
even matter how big it might grow, as long as new rows are appended at
the end of the data file rather than inserted in the middle, and there
is some fast way to skip over the dead rows at the beginning that have
already been processed and moved away.

The good thing is, I don't have to worry about a maintenance window. I
have the leisure to simply tear down my design now and make a better
one. What's 2 million transactions if I can re-process them at a rate
of 80/s? 7 hours max. I am still in development, so there is no need to
worry about migration / transition acrobatics. So if I take Corey's
steps and envision the final result, not worrying about the transition
steps, then I understand this:

1. Create the Queue table partitioned on the pending column. This
creates the partition with the pending jobs (on which I set the
fillfactor kind of low, maybe 50) and the default partition with all
the rest. Of course this also allows people with a constant transaction
volume to partition on jobId or completionTime and move chunks out to
cold archive storage, but that's beside the current point. I picture
something like the sketch below.
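
A minimal sketch of what I mean (assuming PostgreSQL 11, so that an
UPDATE of pending can move a row between partitions; columns beyond
seqNo, pending, and action are omitted):

CREATE TABLE Queue (
    seqNo   bigint  NOT NULL,
    pending boolean NOT NULL,
    action  text
    /* ... other columns ... */
) PARTITION BY LIST (pending);

-- the small, hot partition holding only the pending jobs;
-- a low fillfactor leaves room on each page for updates
CREATE TABLE Queue_pending PARTITION OF Queue
    FOR VALUES IN (true)
    WITH (fillfactor = 50);

-- completed jobs fall through to the default partition
CREATE TABLE Queue_done PARTITION OF Queue DEFAULT;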

2. Add all needed indexes on the partitioned table, except the main
partial index that I used before, which required all that reindexing
maintenance. Actually, I won't really need any other indexes; why
invite a similar problem again?
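
(If I ever did want one, say a primary key, a sketch would be:

-- a unique constraint on a partitioned table must include the
-- partition key, hence (seqNo, pending)
ALTER TABLE Queue ADD PRIMARY KEY (seqNo, pending);

but for now I'd rather not.)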

That's really simple.

One question I have though: I imagine our pending partition's heap
file to now be essentially organized sequentially, as a queue. New jobs
are appended at the end, old jobs are at the beginning. As pending jobs
become completed (pending = false), those initial rows are marked as
dead. So while the number of live rows will remain small in that
pending partition, sequential scans will have to skip over the dead
rows at the beginning.

Does PostgreSQL structure its files such that skipping over dead rows is
fast? Or do the dead rows have to be read and discarded during a table
scan?

Of course vacuum eliminates dead rows, but unless I do VACUUM FULL it
will not re-pack the live rows, and that requires an exclusive table
lock. So what is the benefit of vacuuming that pending partition? What
I *don't* want is for insertions of new jobs to go into open slots at
the beginning of the file; I want them to be appended (in Oracle there
is an INSERT /*+ APPEND */ hint for that). How does that work in
PostgreSQL?

Ultimately that partition will amass too many dead rows; then what do
I do? I don't think the OS has a way to truncate files physically from
the head, does it? I guess it could repoint the first block to a later
block, but I don't know of an ioctl/fcntl command for that. On some
OSes there is a way of making blocks sparse again; is that how
PostgreSQL might do it, just knocking out blocks as sparse from the
front of the file?

If not, the next thing I can think of is to partition the table
further by time, maybe alternating even and odd days, such that on any
given day one of the two pending partitions is quiet. Is that how it's
done? Something like the sketch below is what I have in mind.
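
A hypothetical sketch, replacing the Queue_pending definition from
above (dayParity would be a new column on Queue, set on insert, e.g.
mod(extract(day from now())::int, 2); all names are made up):

-- make the pending partition itself partitioned by day parity
CREATE TABLE Queue_pending PARTITION OF Queue
    FOR VALUES IN (true)
    PARTITION BY LIST (dayParity);

CREATE TABLE Queue_pending_even PARTITION OF Queue_pending
    FOR VALUES IN (0);
CREATE TABLE Queue_pending_odd  PARTITION OF Queue_pending
    FOR VALUES IN (1);

-- then, on an odd day, once all even-day jobs are done:
-- TRUNCATE Queue_pending_even;  -- instantly discards the dead rows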

regards,
-Gunther
