Re: Better tracking of free space during SP-GiST index build

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, emre(at)hasegeli(dot)com
Subject: Re: Better tracking of free space during SP-GiST index build
Date: 2016-09-22 17:37:03
Message-ID: 30822.1474565823@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> writes:
>> On 08/25/2016 01:45 AM, Tom Lane wrote:
>>> I'll put this in the commitfest queue. It could use review from
>>> someone with the time and motivation to do performance
>>> testing/tuning.

> I've been toying with this patch a bit today, particularly looking at
> (1) sizing of the cache, and (2) how beneficial it'd be to choose pages
> from the cache in a smarter way.

Thanks for reviewing!

> I wonder why the patch only sets the cache size to 100 items, when we
> might fit many more entries into the ~8kB limit.

I chose that because it would still work with the minimum allowed page
size of 1K. We could make the cache size variable depending on BLCKSZ,
but I'm not sure it's worth the trouble. There is some cost to a larger
lastUsedPages array, in that you spend more time memcpy'ing it back and
forth between the metapage buffer and backend local memory; and I was
afraid of that outweighing the incremental gains from a larger cache.

> ... I've tried increasing the cache size to 768
> entries, with vast majority of them (~600) allocated to leaf pages.
> Sadly, this seems to only increase the CREATE INDEX duration a bit,
> without making the index significantly smaller (still ~120MB).

Yeah, that's in line with my results: not much further gain from a
larger cache. Though if you were testing with the same IRRExplorer
data, it's not surprising that our results would match. Would be
good to try some other cases...

> I do think selecting the page with the least free space would save
> additional space. Instead of making SpGistGetBuffer() more complicated,
> I've simply shoved a pg_qsort() calls on a few places, sorting the cache
> by free space, with unused slots at the end. Clearly, this is quite
> expensive and proper patch would have to do that differently (e.g.
> maintaining the order incrementally), but OTOH it'd allow some
> optimizations in SpGistGetBuffer() - the first page with enough free
> space would have the smallest amount of free space (more or less).

> This actually helped a bit, and the index size dropped by ~2MB. So not
> bad, but nowhere close to the initial 132MB -> 120MB improvement.

OK, I'll think about how to do that more efficiently. The smaller
incremental improvement isn't surprising, because in this example the
index would still be 90-something MB if it had no free space at all,
so there's going to be decreasing returns from any additional work
to avoid wasted free space. But if we can do it cheaply, this does
suggest that using pages in order by free space is of value.

> It's worth mentioning the spgist fillfactor is a bit crude, most likely
> thanks to splits - e.g. the 109MB index still contains ~10MB of free
> space on the pages (measures using pageinspect as upper-lower), so
> almost 10%. Perhaps it really is time to increase the spgist default
> fillfactor?

Maybe; I don't think anyone's put much effort into tuning that.

> It seems the patch keeps new/empty/deleted pages in the cache, and thus
> segregated by type. Is that intentional, or should
> SpGistSetLastUsedPage() remove such pages from the cache? Or maybe move
> them into a special category? It's true we'll reuse those pages, as
> allocNewBuffer() allocates the buffer directly, but those pages are
> unlikely to get evicted from the cache due to high freeSpace value
> (despite possibly already reused).

My thought was that once we've found a new/empty/deleted page, putting
it in the cache is a cheap way of making sure it gets used soon. Sure,
we could record it in the FSM instead, but that sounds relatively more
expensive. That intuition could be wrong of course.

> Similarly for completely full pages (with freeSpace==0) - does it make
> sense to keep them in the cache? Although it's probably harmless, as
> those pages will get evicted first if needed.

IIRC, we don't have anything to replace the cache entry with at that
point, so there's no particular value in marking the entry unused
rather than used-but-with-zero-space. It will be first priority for
replacement either way, so there seemed little point in writing
more code to make it unused.

> One thing I'd change is making the SpGistLUPCache dynamic, i.e. storing
> the size and lastUsedPagesMap on the meta page. That should allow us
> resizing the cache and tweak lastUsedPagesMap in the future.

Yeah, probably a good idea. I had thought of bumping SPGIST_MAGIC_NUMBER
again if we want to revisit the cache size; but keeping it as a separate
field won't add noticeable cost, and it might save some trouble.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Banck 2016-09-22 17:37:54 Re: Exclude schema during pg_restore
Previous Message Tom Lane 2016-09-22 17:10:08 Re: Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] Windows service is not starting so there’s message in log: FATAL: "could not create shared memory segment “Global/PostgreSQL.851401618”: Permission denied”