Quick Links

Re: Yet another abort-early plan disaster on 9.3

From:	Tomas Vondra <tv(at)fuzzy(dot)cz>
To:	pgsql-performance(at)postgresql(dot)org
Subject:	Re: Yet another abort-early plan disaster on 9.3
Date:	2014-10-10 18:13:24
Message-ID:	543821C4.3030301@fuzzy.cz
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers pgsql-performance

On 10.10.2014 19:59, Craig James wrote:
> On Fri, Oct 10, 2014 at 9:53 AM, Tomas Vondra <tv(at)fuzzy(dot)cz
> <mailto:tv(at)fuzzy(dot)cz>> wrote:
>
>
> On 10.10.2014 16:21, Craig James wrote:
> > Our index is for chemical structures. Chemicals are indexed on
> > chemical fragments
> > <http://emolecules.com/info/molecular-informatics>. A search
> > typically starts with 50-200 indexed "columns" (chemical fragments).
> > The query is always flat, "A and B and ... and Z". The indexed
> > fragments are both correlated (the existence of one strongly raises
> > the chances of another) and anti-correlated (certain combinations are
> > very rare).
>
> Maybe I don't understand the problem well enough, but isn't this a
> perfect match for GIN indexes? I mean, you essentially need to do
> queries like "WHERE substance @@ ('A & B & !C')" etc. Which is exactly
> what GIN does, because it keeps pointers to tuples for each fragment.
>
>
> On the day our web site opened we were using tsearch. Before the end of
> the day we realized it was a bad idea, for the very reasons discussed
> here. The early-abort/late-start problem ("offset N limit M") could take
> minutes to return the requested page. With the external
> dynamically-optimized index, we can almost always get answers in less
> than a couple seconds, often in 0.1 seconds.

In the early days of tsearch, it did not support GIN indexes, and AFAIK
GiST are not nearly as fast for such queries. Also, the GIN fastscan
implemented by Alexander Korotkov in 9.4 makes a huge difference for
queries combining frequent and rare terms.

Maybe it'd be interesting to try this on 9.4. I'm not saying it will
make it faster than the optimized index, but it might be an interesting
comparison.

Tomas

In response to

Re: Yet another abort-early plan disaster on 9.3 at 2014-10-10 17:59:52 from Craig James

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Bruce Momjian	2014-10-10 18:26:27	Re: Column Redaction
Previous Message	Tomas Vondra	2014-10-10 18:08:38	Re: bad estimation together with large work_mem generates terrible slow hash joins

Browse pgsql-performance by date

	From	Date	Subject
Next Message	Dave Johansen	2014-10-14 17:08:26	Partitions and work_mem?
Previous Message	Craig James	2014-10-10 17:59:52	Re: Yet another abort-early plan disaster on 9.3