Re: base backup vs. concurrent truncation

From: Aleksander Alekseev <aleksander(at)timescale(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: base backup vs. concurrent truncation
Date: 2023-05-01 16:54:27
Message-ID: CAJ7c6TMGG=7AYkPtmo7jagwXS4_R=hDoOUejusnDLfB6j_fURQ@mail.gmail.com
Lists: pgsql-hackers

Hi,

> I think that to reproduce the scenario, you want the truncate to happen in
> its own checkpoint cycle.

OK, let's try this again.

To effectively disable the checkpointer, I added the following lines to postgresql.conf:

```
checkpoint_timeout = 3600
max_wal_size = 100G
```
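
Both settings have `sighup` context, so after editing the file something like this should be enough to apply them without a restart (just the standard reload, nothing scenario-specific):

```
select pg_reload_conf();
```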

I'm also keeping an eye on `logfile` in order to make sure the system doesn't
do anything unexpected.
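
Something along these lines, assuming stderr logging goes to `logfile` in the data directory:

```
$ tail -f logfile | grep -iE 'checkpoint|error|fatal'
```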

Then:

```
-- Just double-checking
show checkpoint_timeout;
checkpoint_timeout
--------------------
1h

show max_wal_size;
max_wal_size
--------------
100GB

create table truncateme(id integer, val varchar(1024));
alter table truncateme set (autovacuum_enabled = off);
select relfilenode from pg_class where relname = 'truncateme';

relfilenode
-------------
16385

-- takes ~30 seconds
insert into truncateme
select id,
       (select string_agg(chr((33 + random()*(126 - 33)) :: integer), '')
        from generate_series(1, 1000))
from generate_series(1, 2*1024*1024) as id;

delete from truncateme where id > 1024*1024;

select count(*) from truncateme;
count
---------
1048576

-- Making a checkpoint as pg_basebackup would do.
-- Also, making sure truncate will happen in its own checkpoint cycle.
checkpoint;
```

Again I see 3 segments (2*1024*1024 rows at roughly 1 kB each is about 2.3 GB, i.e. two full 1 GB segments plus a ~293 MB tail):

```
$ ls -lah 16385*
-rw------- 1 eax eax 1.0G May 1 19:24 16385
-rw------- 1 eax eax 1.0G May 1 19:27 16385.1
-rw------- 1 eax eax 293M May 1 19:27 16385.2
-rw------- 1 eax eax 608K May 1 19:24 16385_fsm
```

Making a backup of .2, as if I were pg_basebackup:

```
cp 16385.2 ~/temp/16385.2
```
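
To double-check that the copy is byte-identical (a sanity check on my side, not part of the scenario):

```
$ cmp 16385.2 ~/temp/16385.2 && echo identical
```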

Truncating the table (since all rows with id > 1024*1024 sit at the end of the relation, vacuum will cut off the trailing empty pages):

```
vacuum truncateme;
```
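
Before killing the server, one can confirm that vacuum actually released the trailing segments; a minimal check (output omitted, it should be around 1.1 GB, matching the ls output below):

```
select pg_relation_size('truncateme');
```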

... and killing postgres:

```
$ pkill -9 postgres
```

Now I see the truncated segments, and I put the stale .2 back as if restoring it from the backup:

```
$ ls -lah 16385*
-rw------- 1 eax eax 1.0G May 1 19:30 16385
-rw------- 1 eax eax 147M May 1 19:31 16385.1
-rw------- 1 eax eax 0 May 1 19:31 16385.2
-rw------- 1 eax eax 312K May 1 19:31 16385_fsm
-rw------- 1 eax eax 40K May 1 19:31 16385_vm
$ cp ~/temp/16385.2 ./16385.2
```

Starting postgres:

```
LOG: starting PostgreSQL 16devel on x86_64-linux, compiled by gcc-11.3.0, 64-bit
LOG: listening on IPv4 address "0.0.0.0", port 5432
LOG: listening on IPv6 address "::", port 5432
LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
LOG: database system was interrupted; last known up at 2023-05-01 19:27:22 MSK
LOG: database system was not properly shut down; automatic recovery in progress
LOG: redo starts at 0/8AAB36B0
LOG: invalid record length at 0/CE9BDE60: expected at least 24, got 0
LOG: redo done at 0/CE9BDE28 system usage: CPU: user: 6.51 s, system: 2.45 s, elapsed: 8.97 s
LOG: checkpoint starting: end-of-recovery immediate wait
LOG: checkpoint complete: wrote 10 buffers (0.0%); 0 WAL file(s) added, 0 removed, 68 recycled; write=0.026 s, sync=1.207 s, total=1.769 s; sync files=10, longest=1.188 s, average=0.121 s; distance=1113129 kB, estimate=1113129 kB; lsn=0/CE9BDE60, redo lsn=0/CE9BDE60
LOG: database system is ready to accept connections
```
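
To see whether redo actually replayed the truncation, the WAL range from the log above can be inspected with pg_waldump; something like this (the start LSN is the redo start reported above, and the grep pattern is approximate):

```
$ pg_waldump -p pg_wal -s 0/8AAB36B0 | grep -i truncate
```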

```
$ ls -lah 16385*
-rw------- 1 eax eax 1.0G May 1 19:33 16385
-rw------- 1 eax eax 147M May 1 19:33 16385.1
-rw------- 1 eax eax 0 May 1 19:33 16385.2
-rw------- 1 eax eax 312K May 1 19:33 16385_fsm
-rw------- 1 eax eax 40K May 1 19:33 16385_vm
```

```
select count(*) from truncateme;
count
---------
1048576
```
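
As one more sanity check (my addition, output omitted), the surviving ids should all be from the non-deleted half:

```
-- expected to return 1048576 if only the non-deleted rows survived
select max(id) from truncateme;
```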

So I'm still unable to reproduce the described scenario, at least on PG16.

--
Best regards,
Aleksander Alekseev
