Quick Links

Re: pg_restore handles extended statistics inconsistently with statistics data

From:	Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
To:	Michael Paquier <michael(at)paquier(dot)xyz>
Cc:	Corey Huinker <corey(dot)huinker(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>
Subject:	Re: pg_restore handles extended statistics inconsistently with statistics data
Date:	2026-06-16 06:36:21
Message-ID:	62F7A896-B40E-4AD4-A08D-497FA8E01C67@gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

> On Jun 16, 2026, at 11:23, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Tue, Jun 16, 2026 at 11:07:22AM +0900, Michael Paquier wrote:
>> Sharing the check for a STATISTICS DATA TOC entry on table and index
>> names was making me ticking a bit, as this is not entirely
>> collision-proof for the names, but it also looks like we do things the
>> same way with TABLE DATA and INDEX, so.. At the end, applied down to
>> v18 as suggested.
>
> And I am having second thoughts on this one. Take for example this
> case:
> CREATE SCHEMA s1;
> CREATE SCHEMA s2;
> CREATE TABLE s1.foo (id int);
> INSERT INTO s1.foo SELECT generate_series(1,100);
> ANALYZE s1.foo;
> CREATE TABLE s2.bar (id int);
> CREATE INDEX foo ON s2.bar(id);
> INSERT INTO s2.bar SELECT generate_series(1,100);
> ANALYZE s2.bar;
>
> And then this:
> pg_dump --statistics -Fc -f stats.dump mydb
> pg_restore --statistics-only --index=foo -f stats_foo.sql stats.dump
>
> On HEAD, we get relation and attribute we should not in stats_foo.sql,
> getting also some data from the table s1.foo. With the patch
> attached, that strengthens the name check based on the type of the
> depending TOC entries, we only get the relation stats of s2.foo,
> nothing about the table s1.foo. This feels too funky to write a test
> for, wasting cycles compared to the existing coverage.
>
> pg_restore --index is as old as e8f69be054e9, so it's not like we
> could just remove it, but I'd say that with the schema-level restore
> this would be tempting.
>
> Anyway, let's improve this situation with the attached, for HEAD and
> v18.
> --
> Michael
> <0001-Use-dependency-based-matching-for-STATISTICS-DATA-in.patch>

I think the dependency-based matching is better, but what happens if the archive is created with --statistics-only? In that case, the statistics entries still have dependencies, but the referenced parent TOC entries are not present in the archive. See the attached test script. It creates the archive with --statistics-only, with the new patch, neither table nor index stats are restored:
```
% ./test_pgdump_stats_new.sh
CREATE SCHEMA
SELECT 1000
CREATE INDEX
ANALYZE
archive TOC:
3936; 0 0 STATISTICS DATA s1 t
3937; 0 0 STATISTICS DATA s1 idx_expr
pg_restore --statistics-only --table t:
pg_restore --statistics-only --index idx_expr:
```

With HEAD, both table and index stats are restored:
```
% ./test_pgdump_stats_new.sh
CREATE SCHEMA
SELECT 1000
CREATE INDEX
ANALYZE
archive TOC:
3936; 0 0 STATISTICS DATA s1 t
3937; 0 0 STATISTICS DATA s1 idx_expr
pg_restore --statistics-only --table t:
SELECT * FROM pg_catalog.pg_restore_relation_stats(
SELECT * FROM pg_catalog.pg_restore_attribute_stats(
SELECT * FROM pg_catalog.pg_restore_attribute_stats(
pg_restore --statistics-only --index idx_expr:
SELECT * FROM pg_catalog.pg_restore_relation_stats(
SELECT * FROM pg_catalog.pg_restore_attribute_stats(
```

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

Attachment	Content-Type	Size
test_pgdump_stats_new.sh	application/octet-stream	908 bytes

In response to

Re: pg_restore handles extended statistics inconsistently with statistics data at 2026-06-16 03:23:46 from Michael Paquier

Responses

Re: pg_restore handles extended statistics inconsistently with statistics data at 2026-06-16 06:59:29 from Michael Paquier

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Shlok Kyal	2026-06-16 06:39:47	Re: synchronized_standby_slots behavior inconsistent with quorum-based synchronous replication
Previous Message	Peter Smith	2026-06-16 06:36:10	Re: Support EXCEPT for TABLES IN SCHEMA publications