Re: WIP: index support for regexp search

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Erikjan Rijkers <er(at)xs4all(dot)nl>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Tomas Vondra <tv(at)fuzzy(dot)cz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Pavel Stěhule <pavel(dot)stehule(at)gmail(dot)com>
Subject: Re: WIP: index support for regexp search
Date: 2013-04-03 09:18:27
Message-ID: CAPpHfdsitdJZNyQk5UCK0sAh1F08147pP7DkDH_Gh_Men8ofxw@mail.gmail.com
Lists: pgsql-hackers
On Wed, Apr 3, 2013 at 11:10 AM, Erikjan Rijkers <er(at)xs4all(dot)nl> wrote:

> On Tue, April 2, 2013 23:54, Alexander Korotkov wrote:
>
> > [trgm-regexp-0.15.patch.gz]
>
> Yes, it does look good now; Attached a list of measurements. Most of the
> searches that I put in that test-program are now speeded up very much.
>
> There still are a few regressions, for example:
>
> HEAD          azjunk6  x[aeiou]{4,5}q          83  Seq Scan           1393.465 ms
> trgm_regex15  azjunk6  x[aeiou]{4,5}q          83  Bitmap Heap Scan   1728.319 ms
>
> HEAD          azjunk7  x[aeiou]{1,3}q      190031  Seq Scan          16819.555 ms
> trgm_regex15  azjunk7  x[aeiou]{1,3}q      190031  Bitmap Heap Scan  21286.804 ms
>
> Not exactly negligible, and ideally those regressions would be removed but
> with the huge advantages for other cases I'd say it's worth it.
>

Thank you for testing!
Exploring the results in more detail, I found version 13 to be buggy. That
version is a dead end; we have quite a different API now. Could you please
use v12 instead of v13 in the comparison?
Sometimes we regress in comparison with HEAD, for two reasons:
1) We select an index scan in both cases, but with the patch we spend more
time on analysis. This is an inevitable disadvantage of any index. We can
only take care that the analysis doesn't take too long. The current test
results don't show this reason to be significant.
2) Sometimes we select an index scan when a sequential scan would be faster.
This is also an inevitable disadvantage until we have relevant statistics.
We currently have a similar situation with, for example, the in-core
geometric search and the LIKE/ILIKE search in pg_trgm. However, the
situation could probably be improved somehow even without such statistics.
But I don't think we can draw such a conclusion from synthetic testing,
because improvements for synthetic cases could turn out to be regressions
for real-life cases.
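
To put the two regressions quoted above in proportion, here is a minimal
sketch that computes the slowdown of the patched build relative to HEAD. The
timings are copied from the quoted measurements; the names `timings`,
`slowdown` and `ratios` are only illustrative and are not part of the patch
or of the test program.

```python
# Timings in milliseconds, copied from the measurements quoted above.
# Keys are (table, regex); the dictionary layout is an illustrative assumption.
timings = {
    ("azjunk6", "x[aeiou]{4,5}q"): {"HEAD": 1393.465, "trgm_regex15": 1728.319},
    ("azjunk7", "x[aeiou]{1,3}q"): {"HEAD": 16819.555, "trgm_regex15": 21286.804},
}


def slowdown(head_ms, patched_ms):
    """Return the patched run time as a multiple of the HEAD run time."""
    return patched_ms / head_ms


# Ratio above 1.0 means the patched build was slower than HEAD.
ratios = {
    key: slowdown(t["HEAD"], t["trgm_regex15"]) for key, t in timings.items()
}

for (table, regex), ratio in ratios.items():
    print(f"{table}  {regex}: {ratio:.2f}x of HEAD")
```

Both cases come out at roughly a quarter slower than HEAD. In each of them
HEAD used a Seq Scan and the patched build used a Bitmap Heap Scan, which
matches reason 2 above.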

------
With best regards,
Alexander Korotkov.
