From: | Yeb Havinga <yebhavinga(at)gmail(dot)com> |
---|---|
To: | Greg Smith <greg(at)2ndquadrant(dot)com> |
Cc: | Matthew Wakeling <matthew(at)flymine(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, Eliot Gable <egable+pgsql-performance(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org |
Subject: | Re: B-Heaps |
Date: | 2010-06-18 18:30:17 |
Message-ID: | 4C1BBB39.10308@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Greg Smith wrote:
> Matthew Wakeling wrote:
>> This sort of thing has been fairly well researched at an academic
>> level, but has not been implemented in that many real world
>> situations. I would encourage its use in Postgres.
>
> I guess, but don't forget that work on PostgreSQL is driven by what
> problems people are actually running into. There's a long list of
> performance improvements sitting in the TODO list waiting for people
> to find time to work on them, ones that we're quite certain are
> useful. That anyone is going to chase after any of these speculative
> ideas from academic research instead of one of those is unlikely.
> Your characterization of the potential speed up here is "Using a
> proper tree inside the index page would improve the CPU usage of the
> index lookups", which seems quite reasonable. Regardless, when I
> consider "is that something I have any reason to suspect is a
> bottleneck on common workloads?", I don't think of any, and return to
> working on one of things I already know is instead.
>
There are two different things concerning gist indexes:
1) with larger block sizes and hence, larger # entries per gist page,
results in more generic keys of those pages. This in turn results in a
greater number of hits, when the index is queried, so a larger part of
the index is scanned. NB this has nothing to do with caching / cache
sizes; it holds for every IO model. Tests performed by me showed
performance improvements of over 200%. Since then implementing a speedup
has been on my 'want to do list'.
2) there are several approaches to get the # entries per page down. Two
have been suggested in the thread referred to by Matthew (virtual pages
(but how to order these?) and tree within a page). It is interesting to
see if ideas from Prokop's cache oblivous algorithms match with this
problem to find a suitable virtual page format.
regards,
Yeb Havinga
From | Date | Subject | |
---|---|---|---|
Next Message | Kevin Grittner | 2010-06-18 18:44:22 | Re: B-Heaps |
Previous Message | Greg Smith | 2010-06-18 18:11:53 | Re: requested shared memory size overflows size_t |