From: | Ron <ronljohnsonjr(at)gmail(dot)com> |
---|---|
To: | pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | Re: suggestion about time based partitioning and hibernate |
Date: | 2023-07-18 06:47:58 |
Message-ID: | 92ccfffe-8bcf-249e-82c1-a0ae9cebae41@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On 7/18/23 01:18, Luca Ferrari wrote:
> Dear all,
> I'm looking for ideas here, and it could be someone already stepped
> into declarative partitioning of an existing database where Hibernate
> (a Java ORM) handles the tables.
> The situation is as follows:
>
> create table foo( id primary key, a_date date, ... );
>
> Now, the trivial way to partition this would be on a range based on
> a_date, so that the primary key of the tables shifts from id to (id,
> a_date). One thing that frightens me is that Hibernate does a lot of
> per-row lookups by means of the id, so while the partitioning is
> probably going to make things more manageable and even faster in some
> scenarios, could lead to drawbacks when Hibernate queries by id.
> Moreover, hibernate will think id is unique while it is not anymore.
> Last but not least, referencing foreign keys are made by Hibernate
> thru the id column, and it means that incoming foreign keys to foo
> will not be in place anymore.
>
> Now, I know that I can define a composite key in hibernate, in order
> to match the effective new data structure, but this requires a huge
> rewrite of the application code.
> And after all, we are talking about a non-PostgreSQL related piece, so
> the problem is not on the PostgreSQL side.
>
> Anyone has already done a partitioning in such scenario?
>
> I am thinking that partitioning on an hash of id could be the only way
> to go without having to touch the hibernate side, even if this would
> bring up a less balanced partitioned structure. In such case, I mean
> partitioning by hash, having a table with 60 millions rows per 50 GB
> in size, what would be the rule of thumb to select the number of
> partitions (i.e., a suggested modulus)?
We're in the exact same situation, using Hibernate and having many tables
like foo, partitioned by a_date with PK of (id, a_date) and FK definitions
(id, a_date).
It was a *massive mistake*, since many queries span partitions. Within a
year, I "departitioned" all tables except two giant tables that have large
bytea columns. (All the formerly partitioned tables still have (id, a_date)
as PK and FK. I'd like to change that, but the amount of code change is
untennable given them amount of new features that need to be added.)
Thus, my recommendations are:
1. only partition the tables you /must/, and
2. partition by id.
--
Born in Arizona, moved to Babylonia.
From | Date | Subject | |
---|---|---|---|
Next Message | Zhijie Hou (Fujitsu) | 2023-07-18 07:09:23 | RE: Support logical replication of DDLs |
Previous Message | Luca Ferrari | 2023-07-18 06:18:01 | suggestion about time based partitioning and hibernate |