Re: Adding support for Default partition in partitioning

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>, Beena Emerson <memissemerson(at)gmail(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Adding support for Default partition in partitioning
Date: 2017-06-19 07:04:04
Message-ID: 35d68d49-555f-421a-99f8-185a44d085a4@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2017/06/16 14:16, Ashutosh Bapat wrote:
> On Fri, Jun 16, 2017 at 12:48 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> On Thu, Jun 15, 2017 at 12:54 PM, Ashutosh Bapat
>> <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
>>> Some more comments on the latest set of patches.
>>>
>>> In heap_drop_with_catalog(), we heap_open() the parent table to get the
>>> default partition OID, if any. If the relcache doesn't have an entry for the
>>> parent, this means that the entry will be created, only to be invalidated at
>>> the end of the function. If there is no default partition, this all is
>>> completely unnecessary. We should avoid heap_open() in this case. This also
>>> means that get_default_partition_oid() should not rely on the relcache entry,
>>> but should growl through pg_inherit to find the default partition.
>>
>> I am *entirely* unconvinced by this line of argument. I think we want
>> to open the relation the first time we touch it and pass the Relation
>> around thereafter.
>
> If this would be correct, why heap_drop_with_catalog() without this
> patch just locks the parent and doesn't call a heap_open(). I am
> missing something.

As of commit c1e0e7e1d790bf, we avoid creating relcache entry for the
parent. Before that commit, drop table
partitioned_table_with_many_partitions used to take too long and consumed
quite some memory as result of relcache invalidation requested at the end
on the parent table for every partition.

If this patch reintroduces the heap_open() on the parent table, that's
going to bring back the problem fixed by that commit.

>> Anything else is prone to accidentally failing to
>> have the relation locked early enough,
>
> We are locking the parent relation even without this patch, so this
> isn't an issue.

Yes.

>> or looking up the OID in the
>> relcache multiple times.
>
> I am not able to understand this in the context of default partition.
> After that nobody else is going to change its partitions and their
> bounds (since both of those require heap_open on parent which would be
> stuck on the lock we hold.). So, we have to check only once if the
> table has a default partition. If it doesn't, it's not going to
> acquire one unless we release the lock on the parent i.e at the end of
> transaction. If it has one, it's not going to get dropped till the end
> of the transaction for the same reason. I don't see where we are
> looking up OIDs multiple times.

Without heap_opening the parent, the only way is to look up parentOid's
children in pg_inherits and for each child looking up its pg_class tuple
in the syscache to see if its relpartbound indicates that it's a default
partition. That seems like it won't be inexpensive either.

It would be nice if could get that information (that is - is a given
relation being heap_drop_with_catalog'd a partition of the parent that
happens to have default partition) in less number of steps than that.
Having that information in relcache is one way, but as mentioned, that
turns out be expensive.

Has anyone considered the idea of putting the default partition OID in the
pg_partitioned_table catalog? Looking the above information up would
amount to one syscache lookup. Default partition seems to be special
enough object to receive a place in the pg_partitioned_table tuple of the
parent. Thoughts?

>>> + defaultPartOid = get_default_partition_oid(rel);
>>> + if (OidIsValid(defaultPartOid))
>>> + ereport(ERROR,
>>> + (errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
>>> + errmsg("there exists a default partition for table
>>> \"%s\", cannot attach a new partition",
>>> + RelationGetRelationName(rel))));
>>> +
>>> Should be done before heap_open on the table being attached. If we are not
>>> going to attach the partition, there's no point in instantiating its relcache.
>>
>> No, because we should take the lock before examining any properties of
>> the table.
>
> There are three tables involved here. "rel" which is the partitioned
> table. "attachrel" is the table being attached as a partition to "rel"
> and defaultrel, which is the default partition table. If there exists
> a default partition in "rel" we are not allowing "attachrel" to be
> attached to "rel". If that's the case, we don't need to examine any
> properties of "attachrel" and hence we don't need to instantiate
> relcache of "attachrel". That's what the comment is about.
> ATExecAttachPartition() receives "rel" as an argument which has been
> already locked and opened. So, we can check the existence of default
> partition right at the beginning of the function.

It seems that we are examining the properties of the parent table here
(whether it has default partition), which as Ashutosh mentions, is already
locked before we got to ATExecAttachPartition(). Another place where we
are ereporting before locking the table to be attached (actually even
before looking it up by name), based just on the properties of the parent
table, is in transformPartitionCmd():

/* the table must be partitioned */
if (parentRel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
ereport(ERROR,
(errcode(ERRCODE_INVALID_OBJECT_DEFINITION),
errmsg("\"%s\" is not partitioned",
RelationGetRelationName(parentRel))));

Thanks,
Amit

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2017-06-19 07:30:26 Phantom segment upon promotion causing troubles.
Previous Message Ashutosh Sharma 2017-06-19 07:03:06 Re: BUG: pg_dump generates corrupted gzip file in Windows