Re: [PATCH] Allow multiple recursive self-references

From: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
To: Denis Hirn <denis(dot)hirn(at)uni-tuebingen(dot)de>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Pantelis Theodosiou <ypercube(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Zhihong Yu <zyu(at)yugabyte(dot)com>
Subject: Re: [PATCH] Allow multiple recursive self-references
Date: 2022-01-25 10:19:58
Message-ID: 8c9479c8-ef06-bfbf-9779-4da3b3bc2e59@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

The explanations below are satisfactory to me. I think the executor
changes in this patch are ok. But I would like someone else who has
more experience in this particular area to check it too; I'm not going
to take committer responsibility for this by myself without additional
review.

As I said earlier, I think semantically/mathematically, the changes
proposed by this patch set are okay.

The rewriting in the parse analysis is still being debated, but it can
be tackled in separate patches/commits.

On 11.01.22 12:33, Denis Hirn wrote:
>> I wonder why in ExecWorkTableScan() and ExecReScanWorkTableScan() you call
>> tuplestore_copy_read_pointer() instead of just tuplestore_select_read_pointer().
> The WorkTableScan reads the working_table tuplestore of the parent RecursiveUnion
> operator. But after each iteration of the RecursiveUnion operator, the following
> operations are performed:
>
>> 122 /* done with old working table ... */
>> 123 tuplestore_end(node->working_table); -- (1)
>> 124
>> 125 /* intermediate table becomes working table */
>> 126 node->working_table = node->intermediate_table; -- (2)
>> 127
>> 128 /* create new empty intermediate table */
>> 129 node->intermediate_table = tuplestore_begin_heap(false, false,
>> 130 work_mem); -- (3)
> https://doxygen.postgresql.org/nodeRecursiveunion_8c_source.html#l00122
>
> In step (1), the current working_table is released. Therefore, all read pointers
> that we had additionally allocated are gone, too. The intermediate table becomes
> the working table in step (2), and finally a new empty intermediate table is
> created (3).
>
> Because of step (1), we have to allocate new unique read pointers for each worktable
> scan again. Just using tuplestore_select_read_pointer() would be incorrect.
>
>> What is the special role of read pointer 0 that you are copying. Before your
>> changes, it was just the implicit read pointer, but now that we have several,
>> it would be good to explain their relationship.
> To allocate a new read pointer, tuplestore_alloc_read_pointer() could also be used.
> However, we would have to know about the eflags parameter – which the worktable
> scan has no information about.
>
> The important thing about read pointer 0 is that it always exists, and it is initialized correctly.
> Therefore, it is save to copy read pointer 0 instead of creating a new one from scratch.
>
>
>> Also, the comment you deleted says "Therefore, we don't need a private read pointer
>> for the tuplestore, nor do we need to tell tuplestore_gettupleslot to copy."
>> You addressed the first part with the read pointer handling, but what about the
>> second part? The tuplestore_gettupleslot() call in WorkTableScanNext() still
>> has copy=false. Is this an oversight, or did you determine that copying is
>> still not necessary?
> That's an oversight. Copying is still not necessary. Copying would only be required,
> if additional writes to the tuplestore occur. But that can not happen here.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2022-01-25 10:34:17 Re: Non-decimal integer literals
Previous Message Heikki Linnakangas 2022-01-25 10:12:40 Re: Split xlog.c