Re: Queryplan within FTS/GIN index -search.

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Jesper Krogh <jesper(at)krogh(dot)cc>
Cc:	pgsql-performance(at)postgresql(dot)org, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject:	Re: Queryplan within FTS/GIN index -search.
Date:	2009-10-31 03:11:32
Message-ID:	22673.1256958692@sss.pgh.pa.us
Lists:	pgsql-performance

Jesper Krogh <jesper(at)krogh(dot)cc> writes:
> I've now got a test set that can reproduce the problem where the two
> fully equivalent queries
> body_fts @@ to_tsquery('commonterm & nonexistingterm')
> and
> body_fts @@ to_tsquery('commonterm') AND body_fts @@
> to_tsquery('nonexistingterm')
> give a difference of x300 in execution time (it grows with
> document-base size).

I looked into this a bit. It seems the reason the first is much slower
is that the AND nature of the query is not exposed to the GIN control
logic (ginget.c). It has to fetch every index-entry combination that
involves any of the terms, which of course is going to be the whole
index in this case. This is obvious when you realize that the control
logic doesn't know the difference between tsqueries "commonterm &
nonexistingterm" and "commonterm | nonexistingterm". The API for
opclass extractQuery functions just isn't powerful enough to show that.
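
To make the point concrete, here is a toy Python model of the behaviour described above. It is not PostgreSQL code and does not mirror ginget.c; the names INDEX, extract_query, consistent and scan are hypothetical stand-ins, and the 1000-document index is made up for illustration. The only property it tries to capture is that the control logic receives a flat list of keys, so it has to visit the union of their posting lists whether the query is an AND or an OR.

```python
# Toy model only: term -> document ids containing that term.
INDEX = {
    "commonterm": list(range(1000)),  # assumed to occur in every document
    "nonexistingterm": [],            # assumed to occur in no document
}

def extract_query(query):
    """Hypothetical stand-in for extractQuery: returns the keys, nothing else."""
    _op, terms = query
    return list(terms)  # the AND/OR operator is lost at this point

def consistent(query, present):
    """Hypothetical stand-in for the opclass check run on each candidate."""
    op, terms = query
    hits = [present[t] for t in terms]
    return all(hits) if op == "and" else any(hits)

def scan(query):
    """Return (candidates visited, candidates that match)."""
    keys = extract_query(query)
    postings = {k: set(INDEX.get(k, [])) for k in keys}
    # Without knowing the operator, every document holding any key is a candidate.
    candidates = set().union(*postings.values())
    matches = 0
    for doc in candidates:
        present = {k: doc in postings[k] for k in keys}
        if consistent(query, present):
            matches += 1
    return len(candidates), matches

and_query = ("and", ("commonterm", "nonexistingterm"))
or_query = ("or", ("commonterm", "nonexistingterm"))
print(scan(and_query))  # (1000, 0)
print(scan(or_query))   # (1000, 1000)
```

In this model both queries visit all 1000 candidates; the AND query simply rejects every one of them after doing the work.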

I think a possible solution to this could involve allowing extractQuery
to mark individual keys as "required" or "optional". Then the control
logic could know not to bother with combinations that haven't got all
the "required" keys. There might be other better answers though.
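
A rough sketch of the "required"/"optional" idea, again as a toy Python model rather than anything resembling the real GIN API. The function names and the (key, required) pair format are assumptions made up for this illustration, as is the 1000-document index. The sketch only shows why the flag helps: once the control logic knows which keys are required, it can intersect their posting lists and skip everything else.

```python
# Toy model only: term -> document ids containing that term.
INDEX = {
    "commonterm": list(range(1000)),  # assumed to occur in every document
    "nonexistingterm": [],            # assumed to occur in no document
}

def extract_query_flagged(query):
    """Hypothetical extractQuery variant returning (key, required) pairs."""
    op, terms = query
    required = (op == "and")  # every key of an AND query is required
    return [(t, required) for t in terms]

def candidates_flagged(query):
    """Return the candidate documents the control logic would have to visit."""
    keys = extract_query_flagged(query)
    required = [set(INDEX.get(k, [])) for k, req in keys if req]
    optional = [set(INDEX.get(k, [])) for k, req in keys if not req]
    if required:
        # A candidate must contain every required key; optional keys
        # cannot add candidates, so they are ignored here.
        return set.intersection(*required)
    # No required keys: fall back to visiting the union, as before.
    return set().union(*optional)

and_query = ("and", ("commonterm", "nonexistingterm"))
or_query = ("or", ("commonterm", "nonexistingterm"))
print(len(candidates_flagged(and_query)))  # 0
print(len(candidates_flagged(or_query)))   # 1000
```

With the flag, the AND query produces no candidates at all because one required posting list is empty, while the OR query still visits the whole set. A per-candidate consistency check would still be needed for queries that mix operators; it is left out here to keep the sketch short.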

But having said that, this particular test case is far from compelling.
Any sane text search application is going to try to filter out
common words as stopwords; it's only the failure to do that that's
making this run slow.

regards, tom lane

In response to

Re: Queryplan within FTS/GIN index -search. at 2009-10-30 19:46:37 from Jesper Krogh

Responses

Re: Queryplan within FTS/GIN index -search. at 2009-10-31 06:20:48 from Jesper Krogh
Re: Queryplan within FTS/GIN index -search. at 2009-10-31 08:55:34 from Greg Stark
