And yet, if I try to implement a similar mechanism and it is successful, will my revision be considered?

regards
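P.S. To make the proposal concrete: below is roughly the bookkeeping I have in mind for the VACUUM hash table discussed in the quoted mail below. This is only a sketch; ItemAddr, VersionChainEntry, the fixed-size chained hash table, and the function names are simplified stand-ins invented for illustration, not PostgreSQL's actual TID or dynahash infrastructure.

    /*
     * Sketch only: stand-in types, not PostgreSQL internals.
     * Idea: while scanning, file the older version (tuple1) under the
     * address of the newer version (tuple2) that its t_ctid names;
     * when the scan later reaches tuple2, look its address up.
     */
    #include <stdint.h>
    #include <stdlib.h>

    typedef struct ItemAddr {   /* stand-in for a tuple address */
        uint32_t block;
        uint16_t offset;
    } ItemAddr;

    typedef struct VersionChainEntry {
        ItemAddr  new_addr;     /* key: address of tuple2, the newer version */
        ItemAddr  old_addr;     /* address of tuple1, the older version      */
        uint32_t  old_xid;      /* xid recorded for tuple1                   */
        uint32_t  old_len;      /* length of tuple1                          */
        struct VersionChainEntry *next;  /* collision chain                  */
    } VersionChainEntry;

    #define NBUCKETS 1024
    static VersionChainEntry *buckets[NBUCKETS];

    static unsigned hash_addr(ItemAddr a)
    {
        return (a.block * 2654435761u + a.offset) % NBUCKETS;
    }

    static int addr_eq(ItemAddr a, ItemAddr b)
    {
        return a.block == b.block && a.offset == b.offset;
    }

    /* Scan sees tuple1 whose t_ctid points at tuple2: remember tuple1
     * under tuple2's address. */
    static void remember_old_version(ItemAddr tuple2, ItemAddr tuple1,
                                     uint32_t xid, uint32_t len)
    {
        unsigned h = hash_addr(tuple2);
        VersionChainEntry *e = malloc(sizeof(VersionChainEntry));
        e->new_addr = tuple2;
        e->old_addr = tuple1;
        e->old_xid  = xid;
        e->old_len  = len;
        e->next = buckets[h];
        buckets[h] = e;
    }

    /* Scan reaches tuple2: was some older version pointing here?  If so,
     * the caller evaluates the pair and makes the cleanup decision. */
    static VersionChainEntry *lookup_old_version(ItemAddr tuple2)
    {
        VersionChainEntry *e;
        for (e = buckets[hash_addr(tuple2)]; e != NULL; e = e->next)
            if (addr_eq(e->new_addr, tuple2))
                return e;
        return NULL;
    }

    int main(void)
    {
        ItemAddr t1 = { 7, 3 }, t2 = { 42, 1 };
        remember_old_version(t2, t1, 1001, 64);
        VersionChainEntry *e = lookup_old_version(t2);
        return (e != NULL && e->old_xid == 1001) ? 0 : 1;
    }

The structure itself is simple and cheap; as the quoted reply argues, the real cost is that the whole table has to be scanned to populate and consume it, much like the existing pointer tracking for index cleanup.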
03.11.2019, 22:15, "Tomas Vondra" <tomas(dot)vondra(at)2ndquadrant(dot)com>:
> On Sun, Nov 03, 2019 at 02:17:15PM +0300, Павел Ерёмин wrote:
>> I completely agree with all of the above. Therefore, the proposed
>> mechanism may entail larger improvements (and not only for VACUUM).
>
> I think the best thing you can do is try implementing this ...
>
> I'm afraid the "improvements" essentially mean making various important
> parts of the system much more complicated and expensive. There's a
> trade-off between saving 8B per row and the additional overhead (during
> vacuum etc.), and it does not seem like a winning strategy. What started
> as "we can simply look at the next row version" is clearly way more
> complicated and expensive.
>
> The trouble here is that it adds dependencies between pages in the data
> file. That means, for example, that cleaning up one page may require
> modifying another page that would otherwise have stayed read-only in
> that checkpoint interval. That's essentially write amplification, and it
> may significantly increase the amount of WAL by generating full-page
> writes (FPW) for the other page.
>
>> I can offer the following solution.
>> For VACUUM, create a hash table.
>> When VACUUM, while scanning the table, sees that a version (tuple1) has
>> t_ctid set and pointing to the address of tuple2, it creates a
>> structure recording tuple1's address, tuple1's xid, tuple1's length
>> (and whatever other information is needed), and puts that structure
>> into the hash table, keyed by tuple2's address.
>> When VACUUM reaches tuple2, it looks up tuple2's address in the hash
>> table; if an entry is found, it evaluates the link between the two
>> versions and makes a cleanup decision.
>
> We know VACUUM is already pretty expensive, so making it even more
> expensive seems pretty awful. And the proposed solution seems damn
> expensive. We already do something similar for indexes - we track
> pointers to removed rows so that we can remove them from the indexes.
> And it's damn expensive precisely because we don't know where in the
> index the tuples are, so we have to scan the whole index.
>
> This would mean doing the same thing for the table, because we don't
> know where in the table the older versions of those rows are. That
> seems mighty expensive.
>
> Not to mention that this does nothing for page-level vacuum, which we
> do when trying to fit another row onto a page (e.g. for HOT). That has
> to be absolutely cheap; we are certainly not going to do lookups of
> other pages or go looking for older versions of the row there.
>
> Being able to make visibility decisions based on the tuple alone (or
> possibly page-level plus tuple information) has a lot of value, and I
> don't think we want to make that more complicated.
>
> regards
>
> --
> Tomas Vondra                  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
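For what it's worth, I do see the property being defended here: the visibility decision reads nothing beyond the tuple's own header and the snapshot. Below is a deliberately simplified sketch of that idea; the types and the committed() callback are invented for illustration, and the real check (HeapTupleSatisfiesMVCC) additionally handles hint bits, in-progress transaction lists, and pg_xact lookups.

    /*
     * Simplified illustration only: made-up types, not PostgreSQL's
     * HeapTupleSatisfiesMVCC.  The point is that the decision reads only
     * the tuple header plus the snapshot; no other page or row version
     * has to be visited.
     */
    #include <stdbool.h>
    #include <stdint.h>

    typedef uint32_t Xid;

    typedef struct TupleHeader {
        Xid xmin;               /* xid that created this row version      */
        Xid xmax;               /* xid that deleted/updated it, 0 if none */
    } TupleHeader;

    typedef struct Snapshot {
        Xid  xmax_horizon;      /* xids >= this were still running        */
        bool (*committed)(Xid); /* callback: did this xid commit?         */
    } Snapshot;

    static bool tuple_visible(const TupleHeader *tup, const Snapshot *snap)
    {
        /* Creator must be committed and within the snapshot's horizon. */
        if (tup->xmin >= snap->xmax_horizon || !snap->committed(tup->xmin))
            return false;
        /* A committed deleter within the horizon means the row is gone. */
        if (tup->xmax != 0 &&
            tup->xmax < snap->xmax_horizon && snap->committed(tup->xmax))
            return false;
        return true;
    }

    static bool always_committed(Xid xid) { (void) xid; return true; }

    int main(void)
    {
        TupleHeader live = { 100, 0 }, dead = { 100, 200 };
        Snapshot snap = { 1000, always_committed };
        return (tuple_visible(&live, &snap) &&
                !tuple_visible(&dead, &snap)) ? 0 : 1;
    }

Any design that replaces xmax with a pointer to the next version has to preserve something equivalent to this tuple-local property, which is exactly the complication being objected to above.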