Re: reload-through-the-top-parent switch the partition table

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Rushabh Lathia <rushabh(dot)lathia(at)gmail(dot)com>
Cc: Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: reload-through-the-top-parent switch the partition table
Date: 2017-08-10 14:56:53
Message-ID: CA+Tgmobu0OOLhE-QgBNCPFBAqhw7AMFjHKzZGVLSKThg+8eKKw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Aug 10, 2017 at 3:47 AM, Rushabh Lathia
<rushabh(dot)lathia(at)gmail(dot)com> wrote:
>> (1) seems like a pretty arbitrary restriction, so I don't like that
>> option. (2) would hurt performance in some use cases. Do we have an
>> option (3)?
>
> How about protecting option 2) with the load-via-partition-root protection.
> Means
> load the parents information even dump is not set only when
> load-via-partition-root
> & ispartition.
>
> This won't hurt performance for the normal cases.

Yes, that seems like the right approach.

+ Dump data via the top-most partitioned table (rather than partition
+ table) when dumping data for the partition table.

I think we should phrase this a bit more clearly, something like this:
When dumping a COPY or INSERT statement for a partitioned table,
target the root of the partitioning hierarchy which contains it rather
than the partition itself. This may be useful when reloading data on
a server where rows do not always fall into the same partitions as
they did on the original server. This could happen, for example, if
the partitioning column is of type text and the two system have
different definitions of the collation used to partition the data.

+ printf(_(" --load-via-partition-root load partition table via
the root relation\n"));

"relation" seems odd to me here. root table?

/* Don't bother computing anything for non-target tables, either */
if (!tblinfo[i].dobj.dump)
+ {
+ /*
+ * Load the parents information for the partition table when
+ * the load-via-partition-root option is set. As we need the
+ * parents information to get the partition root.
+ */
+ if (dopt->load_via_partition_root &&
+ tblinfo[i].ispartition)
+ findParentsByOid(&tblinfo[i], inhinfo, numInherits);
continue;
+ }

Duplicating the call to findParentsByOid seems less then ideal. How
about doing something like this:

if (tblinfo[i].dobj.dump)
{
find_parents = true;
mark_parents = true;
}
else if (dopt->load_via_partition_root && tblinfo[i].ispartition)
find_parents = true;

if (find_parents)
findParentsByOid(&tblinfo[i], inhinfo, numInherits);

etc.

The comments for this function also need some work - e.g. the function
header comment deserves some kind of update for these changes.

+static TableInfo *
+getRootTableInfo(TableInfo *tbinfo)
+{
+ Assert(tbinfo->ispartition);
+ Assert(tbinfo->numParents == 1);
+ if (tbinfo->parents[0]->ispartition)
+ return getRootTableInfo(tbinfo->parents[0]);
+
+ return tbinfo->parents[0];
+}

This code should iterate, not recurse, to avoid any risk of blowing
out the stack.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alexander Korotkov 2017-08-10 15:00:24 Re: Remove 1MB size limit in tsvector
Previous Message Tom Lane 2017-08-10 14:36:11 Re: pl/perl extension fails on Windows