Re: Table AM Interface Enhancements

From: Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Japin Li <japinli(at)hotmail(dot)com>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Table AM Interface Enhancements
Date: 2024-03-28 09:46:30
Message-ID: CALT9ZEGNiQH57Edrdk7LRdbMQ3spc_8pQ1auem1BjphKH5z=4A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi, Alexander!
Thank you for working on these patches.
On Thu, 28 Mar 2024 at 02:14, Alexander Korotkov <aekorotkov(at)gmail(dot)com>
wrote:

> Hi, Pavel!
>
> Thank you for your feedback. The revised patch set is attached.
>
> I found that vacuum.c has a lot of heap-specific code. Thus, it
> should be OK for analyze.c to keep heap-specific code. Therefore, I
> rolled back the movement of functions between files. That leads to a
> smaller patch, easier to review.
>
I agree with you.
And with the changes remaining in heapam_handler.c I suppose we can also
remove the includes introduced:

#include <math.h>
#include "utils/sampling.h"
#include "utils/spccache.h"

On Wed, Mar 27, 2024 at 2:52 PM Pavel Borisov <pashkin(dot)elfe(at)gmail(dot)com>
> wrote:
> >> The revised rest of the patchset is attached.
> >> 0001 (was 0006) – I prefer the definition of AcquireSampleRowsFunc to
> >> stay in vacuum.h. If we move it to sampling.h then we would have to
> >> add there includes to define Relation, HeapTuple etc. I'd like to
> >> avoid this kind of change. Also, I've deleted
> >> table_beginscan_analyze(), because it's only called from
> >> tableam-specific AcquireSampleRowsFunc. Also I put some comments to
> >> heapam_scan_analyze_next_block() and heapam_scan_analyze_next_tuple()
> >> given that there are now no relevant comments for them in tableam.h.
> >> I've removed some redundancies from acquire_sample_rows(). And added
> >> comments to AcquireSampleRowsFunc based on what we have in FDW docs
> >> for this function. Did some small edits as well. As you suggested,
> >> turned back declarations for acquire_sample_rows() and compare_rows().
> >
> >
> > In my comment in the thread I was not thinking about returning the old
> name acquire_sample_rows(), it was only about the declarations and the
> order of functions to be one code block. To me heapam_acquire_sample_rows()
> looks better for a name of heap implementation of *AcquireSampleRowsFunc().
> I suggest returning the name heapam_acquire_sample_rows() from v4. Sorry
> for the confusion in this.
>
> I found that the function name acquire_sample_rows is referenced in
> quite several places in the source code. So, I would prefer to save
> the old name to keep the changes minimal.
>
The full list shows only a couple of changes in analyze.c and several
comments elsewhere.

contrib/postgres_fdw/postgres_fdw.c: * of the relation. Same
algorithm as in acquire_sample_rows in
src/backend/access/heap/vacuumlazy.c: * match what analyze.c's
acquire_sample_rows() does, otherwise VACUUM
src/backend/access/heap/vacuumlazy.c: * The logic here is a bit
simpler than acquire_sample_rows(), as
src/backend/access/heap/vacuumlazy.c: * what
acquire_sample_rows() does.
src/backend/access/heap/vacuumlazy.c: *
acquire_sample_rows() does, so be consistent.
src/backend/access/heap/vacuumlazy.c: * acquire_sample_rows()
will recognize the same LP_DEAD items as dead
src/backend/commands/analyze.c:static int
acquire_sample_rows(Relation onerel, int elevel,
src/backend/commands/analyze.c: acquirefunc = acquire_sample_rows;
src/backend/commands/analyze.c: * acquire_sample_rows -- acquire a random
sample of rows from the table
src/backend/commands/analyze.c:acquire_sample_rows(Relation onerel, int
elevel,
src/backend/commands/analyze.c: * This has the same API as
acquire_sample_rows, except that rows are
src/backend/commands/analyze.c: acquirefunc =
acquire_sample_rows;

My point for renaming is to make clear that it's a heap implementation of
acquire_sample_rows which could be useful for possible reworking heap
implementations of table am methods into a separate place later. But I'm
also ok with the existing naming.

> > The changed type of static function that always returned true for heap
> looks good to me:
> > static void heapam_scan_analyze_next_block
> >
> > The same is for removing the comparison of always true "block_accepted"
> in (heapam_)acquire_sample_rows()
>
> Ok!
>
> > Removing table_beginscan_analyze and call scan_begin() is not in the
> same style as other table_beginscan_* functions. Though this is not a
> change in functionality, I'd leave this part as it was in v4.
>
> With the patch, this method doesn't have usages outside of table am.
> I don't think we should keep a method, which doesn't have clear
> external usage patterns. But I agree that starting a scan with
> heap_beginscan() and ending with table_endscan() is not correct. Now
> ending this scan with heap_endscan().
>
Good!

> > Also, a comment about it was introduced in v5:
> >
> > src/backend/access/heap/heapam_handler.c: * with
> table_beginscan_analyze()
>
> Corrected.

> For comments I'd propose:
> > %s/In addition, store estimates/In addition, a function should store
> estimates/g
> > %s/zerp/zero/g
>
> Fixed.
>
> >> 0002 (was 0007) – I've turned the redundant "if", which you've pointed
> >> out, into an assert. Also, added some comments, most notably comment
> >> for TableAmRoutine.reloptions based on the indexam docs.
> >
> > %s/validate sthe/validates the/g
>
> Fixed.
>
> > This seems not needed, it's already inited to InvalidOid before.
> > +else
> > +accessMethod = default_table_access_method;
> > + accessMethodId = InvalidOid;
> >
> > This code came from 374c7a22904. I don't insist on this simplification
> in a patch 0002.
>
> This is minor redundancy. I prefer to keep it. This makes it obvious
> that patch just moved this code.
>
I agree with the remaining.

Regards,
Pavel Borisov

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bertrand Drouvot 2024-03-28 10:04:21 Re: Synchronizing slots from primary to standby
Previous Message Bertrand Drouvot 2024-03-28 09:43:44 Re: Introduce XID age and inactive timeout based replication slot invalidation