Fix to enum hashing for dump and restore

From: Andrew <pgsqlhackers(at)andrewrepp(dot)com>
To: pgsql-hackers(at)lists(dot)postgresql(dot)org
Cc: soumyadeep2007(at)gmail(dot)com, alexandra(dot)wanglei(at)gmail(dot)com
Subject: Fix to enum hashing for dump and restore
Date: 2023-01-25 03:30:06
Message-ID: 765e5968-6c39-470f-95bf-7b14e6b9a1c0@app.fastmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello,

I have discovered a bug in one usage of enums. If a table with hash
partitions uses an enum as a partitioning key, it can no longer be
backed up and restored correctly. This is because enums are represented
simply as oids, and the hash function for enums hashes that oid to
determine partition distribution. Given the way oids are assigned, any
dump+restore of a database with such a table may fail with the error
"ERROR: new row for relation "TABLENAME" violates partition constraint".

This can be reproduced with the following steps:
************************************************
create database test;
\c test
create type colors as enum ('red', 'green', 'blue', 'yellow');
create table part (color colors) partition by hash(color);

create table prt_0 partition of part for values
with (modulus 3, remainder 0);

create table prt_1 partition of part for values
with (modulus 3, remainder 1);

create table prt_2 partition of part for values
with (modulus 3, remainder 2);
insert into part values ('red');

/usr/local/pgsql/bin/pg_dump -d test -f /tmp/dump.sql
/usr/local/pgsql/bin/createdb test2
/usr/local/pgsql/bin/psql test2 -f /tmp/dump.sql
************************************************

I have written a patch to fix this bug (attached), by instead having the
hashenum functions look up the enumsortorder ID of the value being
hashed. These are deterministic across databases, and so allow for
stable dump and restore. This admittedly comes at the performance cost
of doing a catalog lookup, but there is precedent for this in
e.g. hashrange and hashtext.

I look forward to your feedback on this, thank you!

Sincerely,
Andrew J Repp (VMware)

Attachment Content-Type Size
v1-0001-Change-enum-hashing-to-consider-enumsortorder.patch application/octet-stream 3.8 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2023-01-25 03:56:10 Re: Fix to enum hashing for dump and restore
Previous Message David Rowley 2023-01-25 03:26:24 Re: document the need to analyze partitioned tables