BUG #16444: Vacuum writes the table twice

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: kurt(at)roeckx(dot)be
Subject: BUG #16444: Vacuum writes the table twice
Date: 2020-05-17 14:54:12
Message-ID: 16444-4c2f66fed04080b8@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 16444
Logged by: Kurt Roeckx
Email address: kurt(at)roeckx(dot)be
PostgreSQL version: 12.3
Operating system: Debian
Description:

I had a large query fail, where most of the data wasn't commited, and a very
small amount was. So I decided to run vacuum on the table. The size of the
table is:
table_name | table_size | indexes_size | total_size
-----------------------------+------------+--------------+------------
"public"."ct_entry_chain" | 27 GB | 39 GB | 66 GB

The table looks like:
Table "public.ct_entry_chain"
Column | Type | Collation | Nullable | Default
----------------+--------+-----------+----------+--------------------------------------------
id | bigint | | not null |
nextval('ct_entry_chain_id_seq'::regclass)
ct_entry_id | bigint | | not null |
certificate_id | bigint | | not null |
Indexes:
"ct_entry_chain_pkey" PRIMARY KEY, btree (id)
"ct_entry_chain_ct_entry_id_certificate_id_key" UNIQUE CONSTRAINT, btree
(ct_entry_id, certificate_id)
Foreign-key constraints:
"ct_entry_chain_certificate_id_fkey" FOREIGN KEY (certificate_id)
REFERENCES raw_certificates(id)
"ct_entry_chain_ct_entry_id_fkey" FOREIGN KEY (ct_entry_id) REFERENCES
ct_entry(id)

This has resulted in 205 GB being read from disk, and 93 GB being written.
While I only expects 66 GB to be rewritten.

Looking at the verbose output of the vacuum, it seems that it needed to scan
the indexes 4 times. The amount of data read is about what I expect.

Looking with strace what happens, it seems that the first time it reads the
data, it also writes everything back. I assume it marks the rows as dead.
Then when indexes are cleaned up, it reads+writes the whole table again, to
actually vacuum it. Can that first write be avoided?

Browse pgsql-bugs by date

  From Date Subject
Next Message Kurt Roeckx 2020-05-17 15:32:17 Re: BUG #16443: Too much memory usage on insert query
Previous Message PG Bug reporting form 2020-05-17 14:42:09 BUG #16443: Too much memory usage on insert query