Re: speedup COPY TO for partitioned table.

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: jian he <jian(dot)universality(at)gmail(dot)com>
Cc: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, Álvaro Herrera <alvherre(at)kurilemu(dot)de>, Kirill Reshke <reshkekirill(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: speedup COPY TO for partitioned table.
Date: 2025-10-09 01:13:30
Message-ID: CAD21AoA7EwQYG+fy01ye9WOZsxOAn=PQXsnibqv-j_5bFfN68g@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Mon, Oct 6, 2025 at 2:49 AM jian he <jian(dot)universality(at)gmail(dot)com> wrote:
>
> On Fri, Oct 3, 2025 at 8:31 AM torikoshia <torikoshia(at)oss(dot)nttdata(dot)com> wrote:
> >
> > It’s been about two months since this discussion, and there don’t seem
> > to be any further comments.
> > How about updating the patch accordingly?
> > If there are no new remarks, I’d like to mark the patch as Ready for
> > Committer.
> >
> > >> + List *children = NIL;
> > >> ...
> > >> + {
> > >> + children = find_all_inheritors(RelationGetRelid(rel),
> > >>
> > >> Since 'children' is only used inside the else if block, I think we
> > >> don't
> > >> need the separate "List *children = NIL;" declaration.
> > >> Instead, it could just be "List *children =
> > >> find_all_inheritors(...)".
> > >>
> > > you are right.
> > > ""List *children = find_all_inheritors(...)"." should be ok.
> >
>
> hi.
>
> please check the attached v15.
>

Thank you for working on this! I've reviewed the v15 patch, and here
are review comments:

---
+ children = find_all_inheritors(RelationGetRelid(rel),
+ AccessShareLock,
+ NULL);
+
+ foreach_oid(childreloid, children)
+ {
+ char relkind = get_rel_relkind(childreloid);

I think it's better to write some comments summarizing what we're
doing in the loop.

---
+ relation_name = get_rel_name(childreloid);
+ ereport(ERROR,
+ errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("cannot copy from foreign table
\"%s\"", relation_name),
+ errdetail("Partition \"%s\" is a foreign
table in the partitioned table \"%s\"",
+ relation_name,
RelationGetRelationName(rel)),
+ errhint("Try the COPY (SELECT ...) TO variant."));

I think we don't need "the" in the error message.

It's conventional to put all err*() macros in parentheses (i.e.,
"(errcode(), ...)", it's technically omittable though.

---
+ if (RELKIND_HAS_PARTITIONS(relkind))
+ continue;
+
+ scan_oids = lappend_oid(scan_oids, childreloid);

find_all_inheritors() returns a list of OIDs of child relations. I
think we can delete relations whose kind is RELKIND_HAS_PARTITIONS()
from the list instead of creating a new list scan_oids. Then, we can
set cstate->partition to the list.

---
tupDesc = RelationGetDescr(cstate->rel);
+ cstate->partitions = list_copy(scan_oids);
}

Why do we need to copy the list here?

---
With the patch we have:

/*
* If COPY TO source table is a partitioned table, then open each
* partition and process each individual partition.
*/
if (cstate->rel && cstate->rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
{
foreach_oid(scan_oid, cstate->partitions)
{
Relation scan_rel;

/* We already got the needed lock in BeginCopyTo */
scan_rel = table_open(scan_oid, NoLock);
CopyRelTo(cstate, scan_rel, cstate->rel, &processed);
table_close(scan_rel, NoLock);
}
}
else if (cstate->rel)
CopyRelTo(cstate, cstate->rel, NULL, &processed);
else
{
/* run the plan --- the dest receiver will send tuples */

I think we can refactor the code structure as follow:

if (cstate->rel)
{
if (cstate->rel->rd_rel->relkind == RELKIND_PARTITIONED_TABLE)
{
do CopyRelTo() for each OIDs in cstate->partition here.
}
else
CopyRelTo(cstate, cstate->rel, NULL, &processed);
}
else
{
...

---
+ if (root_rel != NULL)
+ {
+ rootdesc = RelationGetDescr(root_rel);
+ root_slot = table_slot_create(root_rel, NULL);
+ map = build_attrmap_by_name_if_req(rootdesc, tupdesc, false);
+ }

rootdesc can be declared inside this if statement or we can directly
pass 'RelationGetDescr(root_rel)' to build_attrmap_by_name_if_req().

---
+ /* Deconstruct the tuple ... */
+ if (map != NULL)
+ copyslot = execute_attr_map_slot(map, slot, root_slot);
+ else
+ {
+ slot_getallattrs(slot);
+ copyslot = slot;
+ }

ISTM that the comment "Deconstruct the tuple" needs to move to before
slot_getallattrs(slot).

How about doing "slot = execute_attr_map_slot(map, slot, root_slot);"
instead? (i.e., no need to have 'copyslot')

---
+ if (root_slot != NULL)
+ ExecDropSingleTupleTableSlot(root_slot);
+ table_endscan(scandesc);

We might want to pfree 'map' if we create it.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jeremy Schneider 2025-10-09 01:25:20 Re: another autovacuum scheduling thread
Previous Message Jeremy Schneider 2025-10-09 01:11:57 Re: sync_standbys_defined and pg_stat_replication