From: | jian he <jian(dot)universality(at)gmail(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: speedup COPY TO for partitioned table. |
Date: | 2025-10-09 07:10:16 |
Message-ID: | CACJufxEheV10DpjFf+J1OabMgRe6CH+4c6d8ca3Wh1v8Twh3ZA@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Oct 9, 2025 at 9:14 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> > please check the attached v15.
> >
> Thank you for working on this! I've reviewed the v15 patch, and here
> are review comments:
>
> ---
> + children = find_all_inheritors(RelationGetRelid(rel),
> + AccessShareLock,
> + NULL);
> +
> + foreach_oid(childreloid, children)
> + {
> + char relkind = get_rel_relkind(childreloid);
>
> I think it's better to write some comments summarizing what we're
> doing in the loop.
sure.
>
> ---
> + relation_name = get_rel_name(childreloid);
> + ereport(ERROR,
> + errcode(ERRCODE_WRONG_OBJECT_TYPE),
> + errmsg("cannot copy from foreign table
> \"%s\"", relation_name),
> + errdetail("Partition \"%s\" is a foreign
> table in the partitioned table \"%s\"",
> + relation_name,
> RelationGetRelationName(rel)),
> + errhint("Try the COPY (SELECT ...) TO variant."));
>
> I think we don't need "the" in the error message.
>
> It's conventional to put all err*() macros in parentheses (i.e.,
> "(errcode(), ...)", it's technically omittable though.
>
https://www.postgresql.org/docs/current/error-message-reporting.html
QUOTE:
<<<<>>>>>
The extra parentheses were required before PostgreSQL version 12, but
are now optional.
Here is a more complex example:
.....
<<<<>>>>>
related commit:
https://git.postgresql.org/cgit/postgresql.git/commit/?id=e3a87b4991cc2d00b7a3082abb54c5f12baedfd1
Less parenthesis is generally more readable, I think.
> ---
> + if (RELKIND_HAS_PARTITIONS(relkind))
> + continue;
> +
> + scan_oids = lappend_oid(scan_oids, childreloid);
>
> find_all_inheritors() returns a list of OIDs of child relations. I
> think we can delete relations whose kind is RELKIND_HAS_PARTITIONS()
> from the list instead of creating a new list scan_oids. Then, we can
> set cstate->partition to the list.
>
yech, we can use foreach_delete_current to delete list elements on the fly.
> ---
> tupDesc = RelationGetDescr(cstate->rel);
> + cstate->partitions = list_copy(scan_oids);
> }
>
> Why do we need to copy the list here?
>
yech, list_copy is not needed.
> ---
> With the patch we have:
>
> /*
> * If COPY TO source table is a partitioned table, then open each
> * partition and process each individual partition.
> */
> if (cstate->rel && cstate->rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
> {
> foreach_oid(scan_oid, cstate->partitions)
> {
> Relation scan_rel;
>
> /* We already got the needed lock in BeginCopyTo */
> scan_rel = table_open(scan_oid, NoLock);
> CopyRelTo(cstate, scan_rel, cstate->rel, &processed);
> table_close(scan_rel, NoLock);
> }
> }
> else if (cstate->rel)
> CopyRelTo(cstate, cstate->rel, NULL, &processed);
> else
> {
> /* run the plan --- the dest receiver will send tuples */
>
> I think we can refactor the code structure as follow:
>
> if (cstate->rel)
> {
> if (cstate->rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
> {
> do CopyRelTo() for each OIDs in cstate->partition here.
> }
> else
> CopyRelTo(cstate, cstate->rel, NULL, &processed);
> }
> else
> {
> ...
sure, this may increase readability.
> + if (root_rel != NULL)
> + {
> + rootdesc = RelationGetDescr(root_rel);
> + root_slot = table_slot_create(root_rel, NULL);
> + map = build_attrmap_by_name_if_req(rootdesc, tupdesc, false);
> + }
>
> rootdesc can be declared inside this if statement or we can directly
> pass 'RelationGetDescr(root_rel)' to build_attrmap_by_name_if_req().
>
sure. good idea.
> ---
> + /* Deconstruct the tuple ... */
> + if (map != NULL)
> + copyslot = execute_attr_map_slot(map, slot, root_slot);
> + else
> + {
> + slot_getallattrs(slot);
> + copyslot = slot;
> + }
>
> ISTM that the comment "Deconstruct the tuple" needs to move to before
> slot_getallattrs(slot).
>
ok.
> How about doing "slot = execute_attr_map_slot(map, slot, root_slot);"
> instead? (i.e., no need to have 'copyslot')
>
I tried but it seems not possible.
table_scan_getnextslot function require certain type of "slot", if we do
"slot = execute_attr_map_slot(map, slot, root_slot);"
then pointer "slot" type becomes virtual slot, then
it will fail on second time call table_scan_getnextslot
> ---
> + if (root_slot != NULL)
> + ExecDropSingleTupleTableSlot(root_slot);
> + table_endscan(scandesc);
>
> We might want to pfree 'map' if we create it.
>
ok.
Please check the attached v16.
Attachment | Content-Type | Size |
---|---|---|
v16-0001-support-COPY-partitioned_table-TO.patch | text/x-patch | 12.8 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Chao Li | 2025-10-09 07:15:24 | Re: Reorganize GUC structs |
Previous Message | Richard Guo | 2025-10-09 07:01:28 | Re: Eager aggregation, take 3 |