Re: Detailed questions about pg_xact_commit_timestamp

From: Adrien Nayrat <adrien(dot)nayrat(at)anayrat(dot)info>
To: Morris de Oryx <morrisdeoryx(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Detailed questions about pg_xact_commit_timestamp
Date: 2019-07-11 09:48:11
Message-ID: 1b2e2697-76fc-7d8f-1ca6-588093531d7d@anayrat.info
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 7/9/19 12:22 AM, Morris de Oryx wrote:
> I have some specific questions about pg_xact_commit_timestamp, and am hoping
> that this is the right place to ask. I read a lot of the commentary about the
> original patch, and the contributors seem to be here. If I'm asking in the wrong
> place, just let me know.
>
> I'm working on a design for a concurrency-safe incremental aggregate rollup
> system,and pg_xact_commit_timestamp sounds perfect. But I've found very little
> commentary on it generally, and couldn't figure out how it works in detail from
> the source code.
>
> Hopefully, someone knows the answers to a few questions:
>
> * Is it possible for pg_xact_commit_timestamp to produce times out of order?
> What I'm after is a way to identify records that have been chagned since a
> specific time so that I can get any later changes for processing. I don't need
> them in commit order, so overlapping timestamps aren't a problem.

I think yes. For example, you can have a session "A" xid 34386826 that commit
after session "B" xid 34386827:
postgres=# select pg_xact_commit_timestamp('34386827'::xid);
pg_xact_commit_timestamp
-------------------------------
2019-07-11 09:32:29.806183+00
(1 row)

postgres=# select pg_xact_commit_timestamp('34386826'::xid);
pg_xact_commit_timestamp
------------------------------
2019-07-11 09:32:38.99444+00
(1 row)

>
> * How many bytes are added to each row in the final implementation? The
> discussions I saw seemed to be ranging from 12-24 bytes. There was discussion of
> adding in extra bytes for "just in case." This is pre 9.5, so a world ago.

src/backend/access/transam/commit_ts.c says 8+4 bytes per xact.

Note it is not per row but per xact: We only have to store the timestamp for one
xid.

>
> * Are the timestamps indexed internally? With a B-tree? I ask for
> capacity-planning reasons.

I think no.

>
> * I've seen on StackOverflow and the design discussions that the timestamps are
> not kept indefinitely, but can't find the details on exactly how long they are
> stored.
>

Yes timestamp are stored in pg_commit_ts directory. Old timestamp are removed
after freeze has explained in
https://www.postgresql.org/docs/current/routine-vacuuming.html :

> The sole disadvantage of increasing autovacuum_freeze_max_age (and
vacuum_freeze_table_age along with it) is that the pg_xact and pg_commit_ts
subdirectories of the database cluster will take more space, because it must
store the commit status and (if track_commit_timestamp is enabled) timestamp of
all transactions back to the autovacuum_freeze_max_age horizon. The commit
status uses two bits per transaction, so if autovacuum_freeze_max_age is set to
its maximum allowed value of two billion, pg_xact can be expected to grow to
about half a gigabyte and pg_commit_ts to about 20GB. If this is trivial
compared to your total database size, setting autovacuum_freeze_max_age to its
maximum allowed value is recommended. Otherwise, set it depending on what you
are willing to allow for pg_xact and pg_commit_ts storage. (The default, 200
million transactions, translates to about 50MB of pg_xact storage and about 2GB
of pg_commit_ts storage.)

> * Any rules of thumb on the performance impact of enabling
> pg_xact_commit_timestamp? I don't need the data on all tables but, where I do,
> it sounds like it might work perfectly.
>
> Many thanks for any assistance!

I didn't notice any performance impact, but I didn't do any extensive testing.

Regards,

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2019-07-11 09:48:20 Re: Add parallelism and glibc dependent only options to reindexdb
Previous Message Binguo Bao 2019-07-11 09:23:24 Re: [proposal] de-TOAST'ing using a iterator