Re: Partitioned tables and relfilenode

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partitioned tables and relfilenode
Date: 2017-02-28 02:53:16
Message-ID: 29b4b273-1e48-adf2-2ca7-887f382b2a1b@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2017/02/28 3:54, Bruce Momjian wrote:
> On Fri, Feb 10, 2017 at 03:19:47PM +0900, Amit Langote wrote:
>> The new partitioned tables do not contain any data by themselves. Any
>> data inserted into a partitioned table is routed to and stored in one of
>> its partitions. In fact, it is impossible to insert *any* data before a
>> partition (to be precise, a leaf partition) is created. It seems wasteful
>> then to allocate physical storage (files) for partitioned tables. If we
>> do not allocate the storage, then we must make sure that the right thing
>> happens when a command that is intended to manipulate a table's storage
>> encounters a partitioned table, the "right thing" here being that the
>> command's code either throws an error or warning (in some cases) if the
>> specified table is a partitioned table or ignores any partitioned tables
>> when it reads the list of relations to process from pg_class. Commands
>> that need to be taught about this are vacuum, analyze, truncate, and alter
>> table. Specifically:
>>
>> - In case of vacuum, specifying a partitioned table causes a warning
>>
>> - In case of analyze, we do not throw an error or warning but simply
>> avoid calling do_analyze_rel() *non-recursively*. Further in
>> acquire_inherited_sample_rows(), any partitioned tables in the list
>> returned by find_all_inheritors() are skipped.
>>
>> - In case of truncate, only the part which manipulates table's physical
>> storage is skipped for partitioned tables.
>>
>> - ATRewriteTables() skips on the AlteredTableInfo entries for partitioned
>> tables, because there is nothing to be done.
>>
>> - Since we cannot create indexes on partitioned tables anyway, there is
>> no need to handle cluster and reindex (they throw a meaningful error
>> already due to the lack of indexes.)
>
> I don't think we are doing this, but if the parent table doesn't have a
> physical file pg_upgrade will need to be taught that. We have that case
> now for unlogged tables on standby servers that we need to address.

Partitioned tables do have physical files as of now. This thread is to
discuss the proposal to get rid of the physical file for partitioned tables.

By the way, I just tried pg_upgrade on top of this patch, and it seems
partitioned tables without the physical file migrate just fine to the new
cluster. To test I did: created a partitioned table and few partitions,
inserted some data into it, pg_upgraded the cluster to find the
partitioned table intact with its data in the new cluster (to be clear,
the data is contained in partitions). Is there something that wouldn't
work that I should instead be testing?

Also, it seems that the partitioned tables (without physical files) won't
have the same issue on the standby servers as unlogged tables. It's just
that we route inserts into a partitioned table to its partitions and hence
the partitioned table itself does not contain because all the incoming
data is routed. Am I missing something?

Thanks,
Amit

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2017-02-28 03:01:15 Re: rename pg_log directory?
Previous Message Bruce Momjian 2017-02-28 02:52:10 Re: bytea_output output of base64