From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
---|---|
To: | Noah Misch <noah(at)leadboat(dot)com> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: A few patches to clarify snapshot management |
Date: | 2025-08-19 20:45:01 |
Message-ID: | bc806cb7-843b-4605-8fe4-44b9ceb21500@iki.fi |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 19/08/2025 03:14, Noah Misch wrote:
> On Fri, Aug 15, 2025 at 02:12:03PM +0300, Heikki Linnakangas wrote:
>> How about always using a fresh snapshot instead? Instead of pushing the
>> historic snapshot as the active snapshot, _disable_ the historic snapshot
>> and use GetTransactionSnapshot() to acquire a regular snapshot?
>
> I see advantages in using the historic snapshot:
>
> 1. It's the longstanding behavior, and applications aren't complaining.
>
> 2. If someone wants "fresh snapshot", they can do that today with a C
> extension that provides an execute_at_fresh_snapshot(sql text) SQL
> function. If we adopt the fresh snapshot in pglogical or in core, I don't
> see a comparably-clean way for the application code to get back to the
> historic snapshot. (That's because the historic snapshot lives only in
> stack variables at the moment in question.)
>
> 3. If an application is relying on the longstanding behavior and needs to
> adapt to the proposed "fresh snapshot" behavior, that may be invasive to
> implement and harmful to performance. For example, instead of reading from
> a user_catalog_table inside the filter, the application may need to
> duplicate that table's data into the rows being filtered.
Oh, I had not considered user_catalog_tables. I didn't remember that's a
thing.
The docs on user_catalog_table says:
Note that access to user catalog tables or regular system catalog
tables in the output plugins has to be done via the systable_* scan APIs
only. Access via the heap_* scan APIs will error out.
That doesn't quite say that you should be able to run arbitrary queries
on a user_catalog_table. In fact it suggests that you can't, because
surely you're not using the systable_* scan APIs when running arbitrary
queries.
That said, I agree it would be nice if we can keep it working.
> Does the "fresh snapshot" alternative bring strengths to outweigh those?
The argument for the fresh snapshot is that using a historic snapshot
only makes sense for catalog tables, and by taking a fresh snapshot, we
avoid the mistake of using the historic snapshot for anything else. I
thought that there's practically no valid use case for using a historic
snapshot in anything that calls GetTransactionSnapshot(), and if it
happens to work, it's only because the snapshot isn't actually used for
anything or is only used to read data that hasn't changed so that you
get away with it.
I agree that reading a table marked as user_catalog_table is valid case,
however, so I take back that argument.
How about the attached, then? It reverts the GetTransactionSnapshot()
change. But to still catch at least some of the invalid uses of the
historic snapshot, it adds checks to heap_beginscan() and
index_beginscan(), to complain if they are called on a non-catalog
relation with a historic snapshot.
- Heikki
Attachment | Content-Type | Size |
---|---|---|
0001-Revert-GetTransactionSnapshot-to-return-historic-sna.patch | text/x-patch | 5.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Jelte Fennema-Nio | 2025-08-19 20:49:06 | Re: BackendKeyData is mandatory? |
Previous Message | Tom Lane | 2025-08-19 20:26:41 | Re: Improve LWLock tranche name visibility across backends |