Re: Theory of operation of collation patch

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Theory of operation of collation patch
Date: 2011-03-07 17:52:57
Message-ID: 8480.1299520377@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> On mn, 2011-03-07 at 11:43 -0500, Tom Lane wrote:
>> ... I now think that the reason it doesn't
>> obviously fail to fail is that indcollation is dead code, and so is
>> approximately 99% of what you added to the planner, because two
>> expressions that are equal() must necessarily have the same collation
>> property. Tracking the collation as a separate property of a pathkey
>> is thus a useless activity. If this conclusion isn't correct, please
>> explain why not.

> I'll have to check into these details, but here is a test case that
> shows that it's doing something with the collation of an index:

[ pokes at it some more... ] It looks like indcollation is acting as a
substitute for including a CollateClause in the index key expression,
which doesn't seem like a particularly good tradeoff considering all the
overhead you must introduce into the default case.

But more to the point, your examples do *not* work for me. I can
reproduce both failing to use an index that should work, and selecting
an index that doesn't work:

d1u=# CREATE TABLE test (a text);
CREATE TABLE
d1u=# CREATE INDEX test_1 ON test (a);
CREATE INDEX
d1u=# CREATE INDEX test_d ON test (a COLLATE "de_DE");
CREATE INDEX
d1u=# CREATE INDEX test_e ON test (a COLLATE "es_ES");
CREATE INDEX
d1u=# CREATE INDEX test_f ON test (a COLLATE "fr_FR");
CREATE INDEX
d1u=# CREATE INDEX test_fz ON test ((a||'z') COLLATE "fr_FR");
CREATE INDEX
d1u=# explain select * from test order by a;
QUERY PLAN
------------------------------------------------------------------------
Index Scan using test_f on test (cost=0.00..63.90 rows=1310 width=32)
(1 row)

d1u=# explain select * from test order by a collate "fr_FR";
QUERY PLAN
---------------------------------------------------------------
Sort (cost=90.93..94.20 rows=1310 width=32)
Sort Key: ((a COLLATE "fr_FR"))
-> Seq Scan on test (cost=0.00..23.10 rows=1310 width=32)
(3 rows)

d1u=# set enable_seqscan TO 0;
SET
d1u=# explain select * from test order by a collate "fr_FR";
QUERY PLAN
----------------------------------------------------------------------------------
Sort (cost=10000000090.93..10000000094.20 rows=1310 width=32)
Sort Key: ((a COLLATE "fr_FR"))
-> Seq Scan on test (cost=10000000000.00..10000000023.10 rows=1310 width=32)
(3 rows)

d1u=# explain select * from test order by a||'z';
QUERY PLAN
-------------------------------------------------------------------------
Index Scan using test_fz on test (cost=0.00..67.18 rows=1310 width=32)
(1 row)

(This is in a database with encoding utf8 and lc_collate = c)

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thom Brown 2011-03-07 17:59:06 Re: Column-level trigger doc typo fix
Previous Message David Fetter 2011-03-07 17:51:06 Re: [HACKERS] Sync rep doc corrections