Re: [PATCHES] GIN improvements

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Teodor Sigaev <teodor(at)sigaev(dot)ru>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCHES] GIN improvements
Date: 2008-11-27 20:36:41
Message-ID: 492F04D9.5070404@enterprisedb.com
Lists: pgsql-hackers pgsql-patches

There's a pretty fundamental issue with this patch, which is that while
buffering the inserts in the "list pages" makes the inserts fast, all
subsequent queries become slower until the tuples have been properly
inserted into the index. I'm sure it's a good tradeoff in many cases,
but there has got to be a limit to it. Currently, if you create an empty
table, and load millions of tuples into it using INSERTs, the index
degenerates into just a pile of "fast" tuples that every query needs to
grovel through. The situation will only be rectified at the next vacuum,
but if there are no deletes or updates on the table, just inserts,
autovacuum won't run until the next anti-wraparound vacuum.

To make things worse, a query will fail if the matching fast-inserted
tuples don't all fit in the non-lossy tid bitmap. That's another reason
to limit the number of list pages; queries will start failing otherwise.

Yet another problem is that if so much work is offloaded to autovacuum,
it can tie up autovacuum workers for a very long time. And the work can
happen at an unfortunate time, when the system is busy, and affect other
queries. There are no vacuum_delay_point() calls in gininsertcleanup, so
there's no way to throttle that work.
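
To illustrate the kind of throttling I mean, here is a toy model, not
PostgreSQL code: the struct, the function names and the cost numbers are
all made up for the example, and the real cost-based vacuum delay
machinery is only loosely imitated. The cleanup loop charges a cost per
list page and takes a nap whenever the accumulated cost reaches a limit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical throttle state, loosely modelled on cost-based vacuum delay. */
typedef struct ToyThrottle
{
    int     balance;    /* cost accumulated since the last nap */
    int     limit;      /* take a nap once balance reaches this */
    size_t  naps;       /* number of naps taken so far */
} ToyThrottle;

/* Stand-in for a delay point: nap (here: just count) when over the limit. */
static void
toy_delay_point(ToyThrottle *t)
{
    if (t->balance >= t->limit)
    {
        t->naps++;          /* a real implementation would sleep here */
        t->balance = 0;
    }
}

/* Toy cleanup loop: process npages list pages, checking the throttle after each. */
static void
toy_cleanup(ToyThrottle *t, size_t npages, int cost_per_page)
{
    assert(t->limit > 0);

    for (size_t i = 0; i < npages; i++)
    {
        /* ... move this page's tuples into the main tree ... */
        t->balance += cost_per_page;
        toy_delay_point(t);
    }
}
```

With a limit of 200 and a cost of 10 per page, the loop pauses once every
20 pages instead of running flat out.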

I think we need a hard limit on the number of list pages before we can
consider accepting this patch. Once the limit is reached, the next
inserter can either flush the list, inserting the tuples in the list
into the tree, or just fall back to regular, slow, inserts.
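
As a rough sketch of the two options, again as a self-contained toy
rather than anything from the patch: PENDING_LIST_LIMIT, ToyIndex and the
toy_* functions are hypothetical names, the limit counts entries rather
than pages to keep the example short, and the "tree" is just a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PENDING_LIST_LIMIT 4    /* hypothetical hard limit on pending entries */

typedef struct ToyIndex
{
    int     pending[PENDING_LIST_LIMIT];    /* fast-inserted, unsorted entries */
    size_t  npending;                       /* entries currently in the list */
    size_t  ntree;                          /* entries in the main tree */
    size_t  nflushes;                       /* how many times the list was flushed */
} ToyIndex;

/* Move everything from the pending list into the main tree. */
static void
toy_flush(ToyIndex *idx)
{
    idx->ntree += idx->npending;
    idx->npending = 0;
    idx->nflushes++;
}

/*
 * Insert one key. If the pending list is at its limit, either flush it
 * first (flush_when_full) or bypass the list with a regular slow insert.
 */
static void
toy_insert(ToyIndex *idx, int key, bool flush_when_full)
{
    if (idx->npending == PENDING_LIST_LIMIT)
    {
        if (!flush_when_full)
        {
            idx->ntree++;       /* regular, slow insert straight into the tree */
            return;
        }
        toy_flush(idx);         /* this inserter pays for the cleanup */
    }

    assert(idx->npending < PENDING_LIST_LIMIT);
    idx->pending[idx->npending++] = key;
}
```

Either way the list never grows past the limit, so the amount of "fast"
tuples a query has to grovel through stays bounded. The difference is
only in who pays: with flushing, one unlucky inserter does the whole
cleanup; with the fallback, every inserter after the limit pays the
normal insert cost until something else empties the list.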

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
