RE: Skip partition tuple routing with constant partition key

From: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Skip partition tuple routing with constant partition key
Date: 2021-05-18 02:11:00
Message-ID: OS0PR01MB57165FFCADDA5BFAF8DC18A0942C9@OS0PR01MB5716.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

> > Hmm, does this seem common enough for the added complexity to be
> > worthwhile?
>
> I'd also like to know if there's some genuine use case for this. For testing
> purposes does not seem to be quite a good enough reason.

Thanks for the response.

In some big-data scenarios, we periodically move data from one table (which stores only unexpired data)
to another table (which stores historical data) for later analysis.
In this case, we import data into the historical table at regular intervals (for example, daily or every half day),
and the data is likely to be imported with a date label specified. All of the data in a given
import then belongs to the same partition of a table that is partitioned by time range.

So, personally, I think it would be nice if Postgres could skip tuple routing for each row in this scenario.

> A slightly different optimization that I have considered and even written
> patches before was to have ExecFindPartition() cache the last routed to
> partition and have it check if the new row can go into that one on the next call.
> I imagined there might be a use case for speeding that up for RANGE
> partitioned tables since it seems fairly likely that most use cases, at least for
> time series ranges will always hit the same partition most of the time. Since
> RANGE requires a binary search there might be some savings there. I imagine that
> optimisation would never be useful for HASH partitioning since it seems most
> likely that we'll be routing to a different partition each time and wouldn't save
> much since routing to hash partitions are cheaper than other types. LIST
> partitioning I'm not so sure about. It seems much less likely than RANGE to hit
> the same partition twice in a row.

I think your approach looks good too,
and it does not seem to conflict with the approach proposed here.
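
As a rough illustration of the caching idea quoted above, here is a minimal C sketch.
It is not taken from any patch and does not use PostgreSQL's actual data structures:
RangeRouter, router_init() and router_find() are hypothetical names, and plain integer
keys stand in for real partition bounds. It only shows the shape of the fast path:
check the last-routed partition first, and fall back to a binary search on a miss.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a RANGE-partitioned table: partition i accepts
 * keys in [bounds[i], bounds[i + 1]).  None of these names come from
 * PostgreSQL; they exist only for this sketch.
 */
typedef struct RangeRouter
{
    const int  *bounds;     /* nparts + 1 ascending boundary values */
    int         nparts;     /* number of partitions */
    int         last_part;  /* most recently routed partition, or -1 */
    long        cache_hits; /* rows routed without a binary search */
} RangeRouter;

void
router_init(RangeRouter *r, const int *bounds, int nparts)
{
    assert(bounds != NULL && nparts > 0);
    r->bounds = bounds;
    r->nparts = nparts;
    r->last_part = -1;
    r->cache_hits = 0;
}

/* Returns the partition index for key, or -1 if no partition accepts it. */
int
router_find(RangeRouter *r, int key)
{
    int lo, hi;

    /* Fast path: does the key fall inside the cached partition's bounds? */
    if (r->last_part >= 0 &&
        key >= r->bounds[r->last_part] &&
        key < r->bounds[r->last_part + 1])
    {
        r->cache_hits++;
        return r->last_part;
    }

    /* The key lies outside every partition; leave the cache untouched. */
    if (key < r->bounds[0] || key >= r->bounds[r->nparts])
        return -1;

    /* Slow path: binary search for the greatest i with bounds[i] <= key. */
    lo = 0;
    hi = r->nparts - 1;
    while (lo < hi)
    {
        int mid = lo + (hi - lo + 1) / 2;

        if (r->bounds[mid] <= key)
            lo = mid;
        else
            hi = mid - 1;
    }

    r->last_part = lo;      /* remember it for the next row */
    return lo;
}
```

With bounds {0, 10, 20, 30}, routing the keys 15 and then 17 would run the binary
search once and take the cached path once. For a bulk import where every row carries
the same date label, nearly all rows would take the cached path.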

Best regards,
houzj
