Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Maciek Sakrejda <m(dot)sakrejda(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints
Date: 2022-08-03 16:02:00
Message-ID: 20220803160200.GD19644@telsasoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Aug 03, 2022 at 04:45:23PM +0530, Dilip Kumar wrote:
> Another version of the patch which closes the smgr at the end using
> smgrcloserellocator() and I have also added a commit message.

Thanks for providing a patch.
This seems to fix the second problem with accessing freed memory.

But I reproduced the first problem with a handful of tries interrupting the
while loop:

2022-08-03 10:39:50.129 CDT client backend[5530] [unknown] PANIC: could not open critical system index 2662

In the failure, when trying to connect to the new "a" DB, it does this:

[pid 10700] openat(AT_FDCWD, "base/17003/pg_filenode.map", O_RDONLY) = 11
[pid 10700] read(11, "\27'Y\0\21\0\0\0\353\4\0\0\353\4\0\0\341\4\0\0\341\4\0\0\347\4\0\0\347\4\0\0\337\4\0\0\337\4\0\0\24\v\0\0\24\v\0\0\25\v\0\0\25\v\0\0K\20\0\0K\20\0\0L\20\0\0L\20\0\0\202\n\0\0\202\n\0\0\203\n\0\0\203\n\0\0\217\n\0\0\217\n\0\0\220\n\0\0\220\n\0\0b\n\0\0b\n\0\0c\n\0\0c\n\0\0f\n\0\0f\n\0\0g\n\0\0g\n\0\0\177\r\0\0\177\r\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\362\366\252\337", 524) = 524
[pid 10700] close(11) = 0
[pid 10700] openat(AT_FDCWD, "base/17003/pg_internal.init", O_RDONLY) = -1 ENOENT (No such file or directory)
[pid 10700] openat(AT_FDCWD, "base/17003/1259", O_RDWR) = 11
[pid 10700] lseek(11, 0, SEEK_END) = 106496
[pid 10700] lseek(11, 0, SEEK_END) = 106496

And then reads nothing but zero bytes from FD 11 (rel 1259/pg_class)

So far, I haven't succeeded in eliciting anything useful from valgrind.

--
Justin

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2022-08-03 16:13:59 Re: pg15b2: large objects lost on upgrade
Previous Message Jacob Champion 2022-08-03 15:58:49 Re: [PATCH] CF app: add "Returned: Needs more interest"