Re: Make unlogged table resets detectable

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Jeff Davis <pgsql(at)j-davis(dot)com>, pgsql-hackers(at)postgresql(dot)org
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Make unlogged table resets detectable
Date: 2021-06-04 06:42:22
Message-ID: d9f90455-0f76-7615-1303-849c430054dd@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 03/06/2021 23:04, Jeff Davis wrote:
> One problem with unlogged tables is that the application has no way to
> tell if they were reset, or they just happen to be empty.
>
> This can be a problem with sharding, where you might have different
> shards of an unlogged table on different servers. If one server
> crashes, you'll be missing only one shard of the data, which may appear
> inconsistent. In that case, you'd like the application (or sharding
> solution) to be able to detect that one shard was lost, and TRUNCATE
> those that remain to get back to a reasonable state.
>
> It would be easy enough for the init fork to have a single page with a
> flag set. That way, when the main fork is replaced with the init fork,
> other code could detect that a reset happened.

I'd suggest using a counter rather than a flag. With a flag, if one
client clears the flag to acknowledge that a truncation happened, others
might miss it. See also ABA problem.

> When detected, depending on a GUC, the behavior could be to auto-
> truncate it (to get the current silent behavior), or refuse to perform
> the operation (except an explicit TRUNCATE), or issue a
> warning/log/notice.

TRUNCATE isn't quite what happens when an unlogged table is
re-initialized. It changes the relfilenode, resets stats, and requires a
more strict lock. So I don't think repurposing TRUNCATE for
re-initializing a table is a good idea. There's also potential for a
race condition, if two connections see that a table needs
re-initialization, and issue "TRUNCATE + INSERT" concurrently. One of
the INSERTs will be lost.

A warning or notice is easy to miss.

> The biggest challenge would be: when should we detect that the reset
> has happened? There might be a lot of entry points. Another idea would
> be to just have a SQL function that the application could call whenever
> it needs to know.

Yeah, a SQL function to get the current "reset counter" would be nice.

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2021-06-04 06:58:15 Re: security_definer_search_path GUC
Previous Message Joel Jacobson 2021-06-04 06:37:52 Re: security_definer_search_path GUC