From: | David Steele <david(at)pgmasters(dot)net> |
---|---|
To: | Adam Brightwell <adam(dot)brightwell(at)crunchydata(dot)com> |
Cc: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Stephen Frost <sfrost(at)snowman(dot)net>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: PATCH: Exclude unlogged tables from base backups |
Date: | 2018-01-24 21:23:12 |
Message-ID: | 3a0be571-dc15-7384-849e-ad8f69412986@pgmasters.net |
Lists: | pgsql-hackers |
On 1/24/18 4:02 PM, Adam Brightwell wrote:
>>> If a new unlogged relation is created after unloggedHash is
>>> constructed but before its files are sent, we cannot exclude that
>>> relation. This would not be a problem if the backup is short, because
>>> the new unlogged relation is unlikely to become very large. However,
>>> if taking a backup takes a long time, we could include a large main
>>> fork in the backup.
>>
>> This is a good point. It's per database directory which makes it a
>> little better, but maybe not by much.
>>
>> Three options here:
>>
>> 1) Leave it as is knowing that unlogged relations created during the
>> backup may be copied and document it that way.
>>
>> 2) Construct a list for SendDir() to work against so the gap between
>> creating that and creating the unlogged hash is as small as possible.
>> The downside here is that the list may be very large and take up a lot
>> of memory.
>>
>> 3) Check each file that looks like a relation in the loop to see if it
>> has an init fork. This might affect performance since an
>> opendir/readdir loop would be required for every relation.
>>
>> Personally, I'm in favor of #1, at least for the time being. I've
>> updated the docs as indicated in case you and Adam agree.
>
> I agree with #1 and feel the updated docs are reasonable and
> sufficient to address this case for now.
>
> I have retested these patches against master at d6ab720360.
>
> All tests succeed.
>
> Marking "Ready for Committer".
Thanks, Adam!
Actually, I was talking to Stephen about this, and it seems like #3 would be
more practical if we just stat'd the init fork for each relation file
found. I doubt the stat would add a lot of overhead and we can track
each unlogged relation in a hash table to reduce overhead even more.
I'll look at that tomorrow and see if I can work out something practical.
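
To make the stat idea concrete, here is a rough standalone sketch. It is not code from the patch. It assumes a plain POSIX stat() call and the usual on-disk naming (relfilenode digits, optionally followed by ".segno" or "_forkname"). The function names relfilenode_prefix(), has_init_fork(), and exclude_from_backup() are made up for illustration, and the hash table mentioned above is left out.

```c
/* Sketch only: hypothetical helpers, not the code from the patch. */
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * Copy the leading run of digits in a file name (the relfilenode) into
 * "out".  Returns false if the name does not look like a relation file.
 */
static bool
relfilenode_prefix(const char *name, char *out, size_t outlen)
{
    size_t      n = 0;

    assert(outlen > 0);

    while (isdigit((unsigned char) name[n]))
        n++;

    if (n == 0 || n >= outlen)
        return false;

    /* After the digits: end of name, ".segno", or "_forkname". */
    if (name[n] != '\0' && name[n] != '.' && name[n] != '_')
        return false;

    memcpy(out, name, n);
    out[n] = '\0';
    return true;
}

/* True if "<dir>/<relfilenode>_init" exists as a regular file. */
static bool
has_init_fork(const char *dir, const char *name)
{
    char        rel[32];
    char        path[1024];
    struct stat st;

    if (!relfilenode_prefix(name, rel, sizeof(rel)))
        return false;

    if (snprintf(path, sizeof(path), "%s/%s_init", dir, rel) >= (int) sizeof(path))
        return false;

    return stat(path, &st) == 0 && S_ISREG(st.st_mode);
}

/*
 * Skip every file of an unlogged relation except the init fork itself,
 * which has to stay in the backup so the relation can be reset on restore.
 */
static bool
exclude_from_backup(const char *dir, const char *name)
{
    size_t      len = strlen(name);

    if (len > 5 && strcmp(name + len - 5, "_init") == 0)
        return false;

    return has_init_fork(dir, name);
}
```

A SendDir()-style loop would call exclude_from_backup(dir, entry_name) for each directory entry. That costs one stat per relation file instead of an opendir/readdir pass per relation, and because the check runs when the file is about to be sent, a relation created after the backup started is still caught.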
--
-David
david(at)pgmasters(dot)net