|From:||Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com>|
|Subject:||BUG #15285: Query used index over field with ICU collation in some cases wrongly return 0 rows|
I'm bumping this thread on pgsql-hackers, hopefully it will drag some more
Should we try to fix this issue or not? This is clearly an upstream bug. It has
been reported, including regression tests, but it has not moved in two years.
If we choose not to fix it on our side using, e.g., a workaround (see patch), I
suppose this small bug should be documented somewhere so people are not left
alone in the wild.
Begin forwarded message:
Date: Sat, 13 Jun 2020 00:43:22 +0200
From: Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Роман Литовченко <roman(dot)lytovchenko(at)gmail(dot)com>, PostgreSQL mailing lists
<pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #15285: Query used index over field with ICU collation in some cases wrongly return 0 rows
On Fri, 12 Jun 2020 18:40:55 +0200
Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com> wrote:
> On Wed, 10 Jun 2020 00:29:33 +0200
> Jehan-Guillaume de Rorthais <jgdr(at)dalibo(dot)com> wrote:
> > After playing with ICU regression tests, I found functions ucol_strcollIter
> > and ucol_nextSortKeyPart are safe. I'll do some performance tests and report
> > here.
> I did some benchmarks. See attachment for the script and its header to
> It sorts 935895 French phrases of 0 to 122 characters, with an average of 49.
> Performance tests were done on current master HEAD (buggy) and using the patch
> in attachment, relying on ucol_strcollIter.
> My preliminary test with ucol_getSortKey was catastrophic, as we might
> expect. 15-17x slower than the current HEAD. So I removed it from actual
> tests. I didn't try with ucol_nextSortKeyPart though.
> Using ucol_strcollIter performs ~20% slower than HEAD on UTF8 databases, but
> this might be acceptable. Here are the numbers:
> DB Encoding   HEAD   strcollIter   ratio
> UTF8          2.74   3.27          1.19x
> LATIN1        5.34   5.40          1.01x
> I plan to add a regression test soon.
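For readers checking the figures, the ratio column in the quoted table is simply the strcollIter timing divided by the HEAD timing. A minimal sketch of that arithmetic follows; the helper names `slowdown_ratio` and `print_ratios` are hypothetical and not part of the patch, and the timings are copied as reported (the message does not state their unit).

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper, not part of the patch: how many times slower the
 * candidate timing is compared to the baseline timing. Both values must
 * use the same unit, and the baseline must be positive.
 */
double slowdown_ratio(double baseline, double candidate)
{
    assert(baseline > 0.0);
    return candidate / baseline;
}

/* Print the ratio for each encoding, using the timings from the quoted table. */
void print_ratios(void)
{
    const char *encoding[] = { "UTF8", "LATIN1" };
    const double head[] = { 2.74, 5.34 };        /* current master HEAD */
    const double strcolliter[] = { 3.27, 5.40 }; /* patch using ucol_strcollIter */

    for (int i = 0; i < 2; i++)
        printf("%-8s %.2fx\n", encoding[i],
               slowdown_ratio(head[i], strcolliter[i]));
}
```

Calling `print_ratios()` from a small `main` prints one line per encoding. For UTF8, 3.27 / 2.74 is about 1.19, which is where the "~20% slower" figure comes from.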
Please, find in attachment the second version of the patch, with a
Jehan-Guillaume de Rorthais