Re: Reverse collations (initially for making keyset pagination cover more cases)

From: David Fetter <david(at)fetter(dot)org>
To: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Reverse collations (initially for making keyset pagination cover more cases)
Date: 2019-11-18 02:29:39
Message-ID: 20191118022938.GB7444@fetter.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Nov 17, 2019 at 11:23:23PM +0000, Andrew Gierth wrote:
> >>>>> "David" == David Fetter <david(at)fetter(dot)org> writes:
>
> First, in testing the patch I found there were indeed some missing
> cases: the sortsupport version of the comparator needs to be fixed too.
> I attach a draft addition to your patch, you should probably look at
> adding test cases that need this to work.
>
> David> (a, b, c) < ($1, $2 COLLATE "C_backwards", $3)
> David> ...
> David> ORDER BY a, b DESC, c
>
> That would have to be:
>
> WHERE (a, b COLLATE "C_backwards", c) < ($1, $2, $3)
> ...
> ORDER BY a, b COLLATE "C_backwards", c
>
> Adding the below patch to yours, I can get this on the regression test
> db (note that this is a -O0 asserts build, timings may be slow relative
> to a production build):
>
> create collation "C_rev" ( LOCALE = "C", REVERSE = true );
> create index on tenk1 (hundred, (stringu1::text collate "C_rev"), string4);
>
> explain analyze
> select hundred, stringu1::text, string4
> from tenk1
> where (hundred, stringu1::text COLLATE "C_rev", string4)
> > (10, 'WKAAAA', 'VVVVxx')
> order by hundred, (stringu1::text collate "C_rev"), string4
> limit 5;
> QUERY PLAN
> --------------------------------------------------------------------------------------------------------------------------------------------------------
> Limit (cost=0.29..1.28 rows=5 width=132) (actual time=0.029..0.038 rows=5 loops=1)
> -> Index Scan using tenk1_hundred_stringu1_string4_idx on tenk1 (cost=0.29..1768.49 rows=8900 width=132) (actual time=0.028..0.036 rows=5 loops=1)
> Index Cond: (ROW(hundred, ((stringu1)::text)::text, string4) > ROW(10, 'WKAAAA'::text, 'VVVVxx'::name))
> Planning Time: 0.225 ms
> Execution Time: 0.072 ms
> (5 rows)
>
> and I checked the results, and they look correct now.

Here's that patch with your correction rolled in.

This will need more tests, and possibly more documentation.

Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

Attachment Content-Type Size
v2-0001-Reverse-collations.patch text/x-diff 18.7 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2019-11-18 03:43:14 Re: dropdb --force
Previous Message Thomas Munro 2019-11-18 01:40:50 Re: Invisible PROMPT2