Re: Table AM Interface Enhancements

From: Japin Li <japinli(at)hotmail(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Table AM Interface Enhancements
Date: 2024-03-19 14:26:06
Message-ID: ME3P282MB3166E42AEB15A41E6EBADC67B62C2@ME3P282MB3166.AUSP282.PROD.OUTLOOK.COM
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On Tue, 19 Mar 2024 at 21:05, Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
> Hi, Pavel!
>
> On Tue, Mar 19, 2024 at 11:34 AM Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com> wrote:
>> On Tue, 19 Mar 2024 at 03:34, Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
>>>
>>> On Sun, Mar 3, 2024 at 1:50 PM Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
>>> > On Mon, Nov 27, 2023 at 10:18 PM Mark Dilger
>>> > <mark(dot)dilger(at)enterprisedb(dot)com> wrote:
>>> > >
>>> > > > On Nov 25, 2023, at 9:47 AM, Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
>>> > > >
>>> > > >> Should the patch at least document which parts of the EState are expected to be in which states, and which parts should be viewed as undefined? If the implementors of table AMs rely on any/all aspects of EState, doesn't that prevent future changes to how that structure is used?
>>> > > >
>>> > > > New tuple tuple_insert_with_arbiter() table AM API method needs EState
>>> > > > argument to call executor functions: ExecCheckIndexConstraints(),
>>> > > > ExecUpdateLockMode(), and ExecInsertIndexTuples(). I think we
>>> > > > probably need to invent some opaque way to call this executor function
>>> > > > without revealing EState to table AM. Do you think this could work?
>>> > >
>>> > > We're clearly not accessing all of the EState, just some specific fields, such as es_per_tuple_exprcontext. I think you could at least refactor to pass the minimum amount of state information through the table AM API.
>>> >
>>> > Yes, the table AM doesn't need the full EState, just the ability to do
>>> > specific manipulation with tuples. I'll refactor the patch to make a
>>> > better isolation for this.
>>>
>>> Please find the revised patchset attached. The changes are following:
>>> 1. Patchset is rebase. to the current master.
>>> 2. Patchset is reordered. I tried to put less debatable patches to the top.
>>> 3. tuple_is_current() method is moved from the Table AM API to the
>>> slot as proposed by Matthias van de Meent.
>>> 4. Assert added to the table_free_rd_amcache() as proposed by Pavel Borisov.
>>
>>
>> Patches 0001-0002 are unchanged compared to the last version in thread [1]. In my opinion, it's still ready to be committed, which was not done for time were too close to feature freeze one year ago.
>>
>> 0003 - Assert added from previous version. I still have a strong opinion that allowing multi-chunked data structures instead of single chunks is completely safe and makes natural process of Postgres improvement that is self-justified. The patch is simple enough and ready to be pushed.
>>
>> 0004 (previously 0007) - Have not changed, and there is consensus that this is reasonable. I've re-checked the current code. Looks safe considering returning a different slot, which I doubted before. So consider this patch also ready.
>>
>> 0005 (previously 0004) - Unused argument in the is_current_xact_tuple() signature is removed. Also comparing to v1 the code shifted from tableam methods to TTS's level.
>>
>> I'd propose to remove Assert(!TTS_EMPTY(slot)) for tts_minimal_is_current_xact_tuple() and tts_virtual_is_current_xact_tuple() as these are only error reporting functions that don't use slot actually.
>>
>> Comment similar to:
>> +/*
>> + * VirtualTupleTableSlots never have a storage tuples. We generally
>> + * shouldn't get here, but provide a user-friendly message if we do.
>> + */
>> also applies to tts_minimal_is_current_xact_tuple()
>>
>> I'd propose changes for clarity of this comment:
>> %s/a storage tuples/storage tuples/g
>> %s/generally//g
>>
>> Otherwise patch 0005 also looks good to me.
>>
>> I'm planning to review the remaining patches. Meanwhile think pushing what is now ready and uncontroversial is a good intention.
>> Thank you for the work done on this patchset!
>
> Thank you, Pavel!
>
> Regarding 0005, I did apply "a storage tuples" grammar fix. Regarding
> the rest of the things, I'd like to keep methods
> tts_*_is_current_xact_tuple() to be similar to nearby
> tts_*_getsysattr(). This is why I'm keeping the rest unchanged. I
> think we could refactor that later, but together with
> tts_*_getsysattr() methods.
>
> I'm going to push 0003, 0004 and 0005 if there are no objections.
>
> And I'll update 0001 and 0002 in their dedicated thread.
>

When I try to test the patch on Ubuntu 22.04 with GCC 11.4.0. There are some
warnings as following:

/home/japin/Codes/postgres/build/../src/backend/access/heap/heapam_handler.c: In function ‘heapam_acquire_sample_rows’:
/home/japin/Codes/postgres/build/../src/backend/access/heap/heapam_handler.c:1603:28: warning: implicit declaration of function ‘get_tablespace_maintenance_io_concurrency’ [-Wimplicit-function-declaration]
1603 | prefetch_maximum = get_tablespace_maintenance_io_concurrency(onerel->rd_rel->reltablespace);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/japin/Codes/postgres/build/../src/backend/access/heap/heapam_handler.c:1757:30: warning: implicit declaration of function ‘floor’ [-Wimplicit-function-declaration]
1757 | *totalrows = floor((liverows / bs.m) * totalblocks + 0.5);
| ^~~~~
/home/japin/Codes/postgres/build/../src/backend/access/heap/heapam_handler.c:49:1: note: include ‘<math.h>’ or provide a declaration of ‘floor’
48 | #include "utils/sampling.h"
+++ |+#include <math.h>
49 |
/home/japin/Codes/postgres/build/../src/backend/access/heap/heapam_handler.c:1757:30: warning: incompatible implicit declaration of built-in function ‘floor’ [-Wbuiltin-declaration-mismatch]
1757 | *totalrows = floor((liverows / bs.m) * totalblocks + 0.5);
| ^~~~~
/home/japin/Codes/postgres/build/../src/backend/access/heap/heapam_handler.c:1757:30: note: include ‘<math.h>’ or provide a declaration of ‘floor’
/home/japin/Codes/postgres/build/../src/backend/access/heap/heapam_handler.c:1603:21: warning: implicit declaration of function 'get_tablespace_maintenance_io_concurrency' is invalid in C99 [-Wimplicit-function-declaration]
prefetch_maximum = get_tablespace_maintenance_io_concurrency(onerel->rd_rel->reltablespace);
^

It seems you forgot to include math.h and utils/spccache.h header files
in heapam_handler.c.

diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index ac24691bd2..04365394f1 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -19,6 +19,8 @@
*/
#include "postgres.h"

+#include <math.h>
+
#include "access/genam.h"
#include "access/heapam.h"
#include "access/heaptoast.h"
@@ -46,6 +48,7 @@
#include "utils/builtins.h"
#include "utils/rel.h"
#include "utils/sampling.h"
+#include "utils/spccache.h"

static TM_Result heapam_tuple_lock(Relation relation, Datum tupleid,
Snapshot snapshot, TupleTableSlot *slot,

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2024-03-19 14:28:06 Re: Read data from Postgres table pages
Previous Message Sushrut Shivaswamy 2024-03-19 14:22:44 Read data from Postgres table pages