Re: In-place persistence change of a relation

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: osumi(dot)takamichi(at)fujitsu(dot)com
Cc: tsunakawa(dot)takay(at)fujitsu(dot)com, sfrost(at)snowman(dot)net, masao(dot)fujii(at)oss(dot)nttdata(dot)com, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: In-placre persistance change of a relation
Date: 2020-11-13 08:23:12
Message-ID: 20201113.172312.1767546251154140847.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Fri, 13 Nov 2020 07:15:41 +0000, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com> wrote in
> Hello, Tsunakawa-San
>

Thanks for sharing it!

> > Do you know the reason why data copy was done before? And, it may be
> > odd for me to ask this, but I think I saw someone referred to the past
> > discussion that eliminating data copy is difficult due to some processing at
> > commit. I can't find it.
> I can share 2 sources why to eliminate the data copy is difficult in hackers thread.
>
> Tom's remark and the context to copy relation's data.
> https://www.postgresql.org/message-id/flat/31724.1394163360%40sss.pgh.pa.us#31724.1394163360@sss.pgh.pa.us

https://www.postgresql.org/message-id/CA+Tgmob44LNwwU73N1aJsGQyzQ61SdhKJRC_89wCm0+aLg=x2Q@mail.gmail.com

> No, not really. The issue is more around what happens if we crash
> part way through. At crash recovery time, the system catalogs are not
> available, because the database isn't consistent yet and, anyway, the
> startup process can't be bound to a database, let alone every database
> that might contain unlogged tables. So the sentinel that's used to
> decide whether to flush the contents of a table or index is the
> presence or absence of an _init fork, which the startup process
> obviously can see just fine. The _init fork also tells us what to
> stick in the relation when we reset it; for a table, we can just reset
> to an empty file, but that's not legal for indexes, so the _init fork
> contains a pre-initialized empty index that we can just copy over.
>
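
To illustrate the reset rule described in the quoted paragraph, here is a
minimal sketch in Python. It is not the actual startup-process code. It
assumes fork files named "<relfilenode>", "<relfilenode>_fsm",
"<relfilenode>_vm" and "<relfilenode>_init" in one directory, it ignores
segment files and fsync, and reset_unlogged is a hypothetical helper name.

```python
import os
import shutil

INIT_SUFFIX = "_init"  # assumed on-disk suffix of the init fork


def reset_unlogged(dirpath):
    """Reset every relation in dirpath that has an init fork.

    Simplified illustration: no segment files, no fsync, no catalog access.
    """
    names = os.listdir(dirpath)
    # The only sentinel available at recovery: an "_init" file exists.
    unlogged = {n[:-len(INIT_SUFFIX)] for n in names if n.endswith(INIT_SUFFIX)}

    for name in names:
        base = name.split("_", 1)[0]
        if base in unlogged and not name.endswith(INIT_SUFFIX):
            # Drop the main, fsm and vm forks; their contents are not crash-safe.
            os.remove(os.path.join(dirpath, name))

    for base in unlogged:
        # Copy the init fork over the main fork: an empty file for a table,
        # a pre-initialized empty index for an index.
        shutil.copyfile(os.path.join(dirpath, base + INIT_SUFFIX),
                        os.path.join(dirpath, base))
    return sorted(unlogged)
```

Relations without an "_init" file are left untouched, which is why removing
or creating that file is what actually switches the behavior after a crash.
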
> Now, to make an unlogged table logged, you've got to at some stage
> remove those _init forks. But this is not a transactional operation.
> If you remove the _init forks and then the transaction rolls back,
> you've left the system an inconsistent state. If you postpone the
> removal until commit time, then you have a problem if it fails,

That's true. That is the cause of the headache.

> particularly if it works for the first file but fails for the second.
> And if you crash at any point before you've fsync'd the containing
> directory, you have no idea which files will still be on disk after a
> hard reboot.

This is not an issue in this patch *except* in the case where removal
of the init fork fails but the subsequent removal of the inittmp fork
succeeds. Another idea is to add a "not-yet-committed" property to a
fork. I added a new fork type to keep the patch simple, but I could
go that way if that is an issue.
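
As an aside on the quoted point about fsync'ing the containing directory:
removing a file only becomes durable once the directory entry change itself
has reached disk. A minimal sketch in Python, assuming a POSIX system where
a directory can be opened read-only and fsync'd. remove_durably is a
hypothetical helper for illustration, not a function from the patch.

```python
import os


def remove_durably(path):
    """Unlink path, then fsync its parent directory (POSIX assumption)."""
    parent = os.path.dirname(os.path.abspath(path))
    os.unlink(path)
    # Without this fsync, a hard reboot may bring the removed file back.
    dirfd = os.open(parent, os.O_RDONLY)
    try:
        os.fsync(dirfd)
    finally:
        os.close(dirfd)
```

Even with this, removing two files is still two separate steps, so a crash
between them can leave one file removed and the other not.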

> Amit-San quoted this thread and mentioned that point in another thread.
> https://www.postgresql.org/message-id/CAA4eK1%2BHDqS%2B1fhs5Jf9o4ZujQT%3DXBZ6sU0kOuEh2hqQAC%2Bt%3Dw%40mail.gmail.com

This sounds like a slightly different discussion. Making part of a
table UNLOGGED looks far more difficult to me.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
