Re: sketchy partcollation handling

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: sketchy partcollation handling
Date: 2017-06-05 02:18:59
Message-ID: 08b96204-2eea-d083-b6b4-eaf184e1f059@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2017/06/03 1:31, Robert Haas wrote:
> If you create a partitioned table in the obvious way, partcollation ends up 0:
>
> rhaas=# create table foo (a int, b text) partition by list (a);
> CREATE TABLE
> rhaas=# select * from pg_partitioned_table;
> partrelid | partstrat | partnatts | partattrs | partclass |
> partcollation | partexprs
> -----------+-----------+-----------+-----------+-----------+---------------+-----------
> 16420 | l | 1 | 1 | 1978 | 0 |
> (1 row)
>
> You could argue that 0 is an OK value there; offhand, I'm not sure
> about that.

If you create index on an integer column, you'll get a 0 in indcollation:

select indcollation from pg_index where indrelid = 'foo'::regclass;
indcollation
--------------
0
(1 row)

> But there's nothing in
> https://www.postgresql.org/docs/10/static/catalog-pg-partitioned-table.html
> which indicates that some entries can be 0 rather than a valid
> collation OID.

Yeah, it might be better to explain that. BTW, pg_index.indcollation's
description lacks a note about this too, so maybe, we should fix both.
OTOH, pg_attribute.attcollation's description mentions it:

The defined collation of the column, or zero if the column is
not of a collatable data type.

It might be a good idea to update partcollation's and indcollation's
description along the same lines.

> And this is definitely not OK:
>
> rhaas=# select * from pg_depend where objid = 16420;
> classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype
> ---------+-------+----------+------------+----------+-------------+---------
> 1259 | 16420 | 0 | 2615 | 2200 | 0 | n
> 1259 | 16420 | 0 | 3456 | 0 | 0 | n
> (2 rows)
>
> We shouldn't be storing a dependency on non-existing collation 0.
>
> I'm not sure whether the bug here is that we should have a valid
> collation OID there rather than 0, or whether the bug is that we
> shouldn't be recording a dependency on anything other than a real
> collation OID, but something about this is definitely not right.

I think we can call it a bug of StorePartitionKey(). I looked at the
similar code in index_create() (which actually I had originally looked at
for reference when writing the partitioning code in question) and looks
like it doesn't store the dependency for collation 0 and for the default
collation of the database. I think the partitioning code should do the
same. Attached find a patch for the same (which also updates the
documentation as mentioned above); with the patch:

create table p (a int) partition by range (a);
select partcollation from pg_partitioned_table;
partcollation
---------------
0
(1 row)

-- no collation dependency stored for invalid collation
select * from pg_depend where objid = 'p'::regclass;
classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype
---------+-------+----------+------------+----------+-------------+---------
1259 | 16434 | 0 | 2615 | 2200 | 0 | n
(1 row)

create table p (a text) partition by range (a);
select partcollation from pg_partitioned_table;
partcollation
---------------
100
(1 row)

-- no collation dependency stored for the default collation
select * from pg_depend where objid = 'p'::regclass;
classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype
---------+-------+----------+------------+----------+-------------+---------
1259 | 16407 | 0 | 2615 | 2200 | 0 | n
(1 row)

create table p (a text) partition by range (a collate "en_US");
select partcollation from pg_partitioned_table;
partcollation
---------------
12513
(1 row)

-- dependency on non-default collation is stored
select * from pg_depend where objid = 'p'::regclass;
classid | objid | objsubid | refclassid | refobjid | refobjsubid | deptype
---------+-------+----------+------------+----------+-------------+---------
1259 | 16410 | 0 | 2615 | 2200 | 0 | n
1259 | 16410 | 0 | 3456 | 12513 | 0 | n
(2 rows)

BTW, the places which check whether the collation to store a dependency
for is the database default collation don't need to do that. I mean the
following code block in all of these places:

/* The default collation is pinned, so don't bother recording it */
if (OidIsValid(attr->attcollation) &&
attr->attcollation != DEFAULT_COLLATION_OID)
{
referenced.classId = CollationRelationId;
referenced.objectId = attr->attcollation;
referenced.objectSubId = 0;
recordDependencyOn(&myself, &referenced, DEPENDENCY_NORMAL);
}

That's because the default collation is pinned and the dependency code
checks isObjectPinned() and does not create pg_depend entries if so.
Those places are:

AddNewAttributeTuples
StorePartitionKey
index_create
GenerateTypeDependencies
add_column_collation_dependency

Removing that check still passes `make check-world`.

Thanks,
Amit

Attachment Content-Type Size
0001-Do-not-store-dependency-on-invalid-and-default-colla.patch text/plain 2.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Petr Jelinek 2017-06-05 02:38:51 Re: logical replication - still unstable after all these months
Previous Message Mark Kirkwood 2017-06-05 02:01:51 Re: logical replication - still unstable after all these months