| From: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> |
|---|---|
| To: | Michael Paquier <michael(at)paquier(dot)xyz> |
| Cc: | PostgreSQL-development <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Corey Huinker <corey(dot)huinker(at)gmail(dot)com> |
| Subject: | Re: pg_restore handles extended statistics inconsistently with statistics data |
| Date: | 2026-06-15 09:11:11 |
| Message-ID: | 37E4590F-EB79-4F00-A4B2-18019B03A7EF@gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
> On Jun 15, 2026, at 15:55, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Mon, Jun 15, 2026 at 11:51:40AM +0800, Chao Li wrote:
>> If the fix direction is to make pg_restore behave consistently with
>> pg_dump, then I think Michael's change is correct.
>
> I have been checking for a few hours, and well.. There was more.
>
>> See the attached v2 for details.
>
> That's nice, unfortunately incomplete. I have found a different
> pattern that behaves incorrectly, reusing your script of upthread:
> pg_restore --statistics --table t \
> -f /tmp/archive-s1-stats.sql /tmp/stats.dump
> pg_restore --statistics-only --table t \
> -f /tmp/archive-s1-stats.sql /tmp/stats.dump
>
> With "--statistics --table t", we restore the definition of table "t"
> and its data (correct), miss the stats (incorrect!). With
> "--statistics-only --table t", we restore no definition and no data
> (correct), and still miss the stats (incorrect!). So we still have a
> borked restore of the stats when selecting an individual table. We
> don't care about extended stats with this case, as extstats are
> objects defined at schema level and we cannot filter them, but I think
> that we definitely care about attribute and relation stats here.
>
> This case requires a second change in _tocEntryRequired() even after
> the first fix so as it is possible to select a STATISTICS DATA that
> depends on a table part of the authorized list close to the list that
> includes the "SEQUENCE SET".
>
> With all that in mind, I have the attached. I have expanded the tests
> with the --table case, using dump and restore commands, and that seems
> to work correctly now with the restores, including the cases with
> extstats.
>
> And I'm planning to apply that down to v18 tomorrow, after a second
> round of lookups.
> --
> Michael
> <v3-0001-Fix-pg_restore-with-schema-table-and-statistics-o.patch>
Ah, I missed the --table case. Your finding made me think about the --index case as well. It looks like --index still misses stats with v3 applied. I created two additional test scripts for the table and index cases, see the attached .sh files.
With v3, the table case works:
```
% ./test_pgdump_stats_table.sh
CREATE SCHEMA
SELECT 1000
CREATE INDEX
ANALYZE
archive TOC:
242; 1259 16609 TABLE s1 t chaol
3930; 0 16609 TABLE DATA s1 t chaol
3937; 0 0 STATISTICS DATA s1 t
3938; 0 0 STATISTICS DATA s1 idx_expr
pg_restore --statistics --table t:
CREATE TABLE s1.t (
SELECT * FROM pg_catalog.pg_restore_relation_stats(
SELECT * FROM pg_catalog.pg_restore_attribute_stats(
SELECT * FROM pg_catalog.pg_restore_attribute_stats(
pg_restore --statistics-only --table t:
SELECT * FROM pg_catalog.pg_restore_relation_stats(
SELECT * FROM pg_catalog.pg_restore_attribute_stats(
SELECT * FROM pg_catalog.pg_restore_attribute_stats(
```
But for the index case, stats are still missing:
```
% ./test_pgdump_stats_index.sh
CREATE SCHEMA
SELECT 1000
CREATE INDEX
ANALYZE
archive TOC:
3937; 0 0 STATISTICS DATA s1 t
3764; 1259 16618 INDEX s1 idx_expr chaol
3938; 0 0 STATISTICS DATA s1 idx_expr
pg_restore --statistics --index idx_expr:
CREATE INDEX idx_expr ON s1.t USING btree (((a + 1)));
pg_restore --statistics-only --index idx_expr:
```
We cannot just add a strcmp(te->desc, "STATISTICS DATA") == 0 check to the "else if (strcmp(te->desc, "INDEX") == 0)" branch, because STATISTICS DATA would already have matched the earlier table branch. So in v4, I pulled STATISTICS DATA into its own branch before the table and index branches.
With v4, stats for the index are restored:
```
% ./test_pgdump_stats_index.sh
CREATE SCHEMA
SELECT 1000
CREATE INDEX
ANALYZE
archive TOC:
3937; 0 0 STATISTICS DATA s1 t
3764; 1259 16624 INDEX s1 idx_expr chaol
3938; 0 0 STATISTICS DATA s1 idx_expr
pg_restore --statistics --index idx_expr:
CREATE INDEX idx_expr ON s1.t USING btree (((a + 1)));
SELECT * FROM pg_catalog.pg_restore_relation_stats(
SELECT * FROM pg_catalog.pg_restore_attribute_stats(
pg_restore --statistics-only --index idx_expr:
SELECT * FROM pg_catalog.pg_restore_relation_stats(
SELECT * FROM pg_catalog.pg_restore_attribute_stats(
```
Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/
| Attachment | Content-Type | Size |
|---|---|---|
| test_pgdump_stats_index.sh | application/octet-stream | 997 bytes |
| test_pgdump_stats_table.sh | application/octet-stream | 969 bytes |
| v4-0001-Fix-pg_restore-with-schema-table-and-statistics-o.patch | application/octet-stream | 5.1 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Ashutosh Bapat | 2026-06-15 09:16:24 | Re: Report bytes and transactions actually sent downtream |
| Previous Message | Richard Guo | 2026-06-15 09:08:36 | Re: Reduce LEFT/FULL JOIN to ANTI JOIN in more cases |