| From: | Jonathan Reis <jon(dot)reis(at)conevity(dot)com> |
|---|---|
| To: | Greg Sabino Mullane <htamfids(at)gmail(dot)com> |
| Cc: | Olof Salberger <olof(dot)salberger(at)gmail(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Performance implications of partitioning by UUIDv7 range in PostgreSQL v18 |
| Date: | 2025-10-23 20:11:11 |
| Message-ID: | CAE_7N35fiC7Rq03hWutO3tPGfADQeO_gQzKHru1iNiASmUdsPQ@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-performance |
Greg,
Thank you very much for your recommendations and your sample code. I
originally had it your way, but then I found out this is not possible
create table message (
id uuid PRIMARY KEY
-- ... plus other columns
) partition by range (uuid_extract_timestamp(id));
whereas, this is
create table message (
id uuid PRIMARY KEY
-- ... plus other columns
) partition by range (id);
Also, I had a misunderstanding that in this query
explain select * from message where uuid_extract_timestamp(id)
between '2025-10-23 23:00:00' and '2025-10-24 03:00:00';
that having uuid_extract_timestamp(id) on the left was SARGable, but
clearly it is based on your sample.
Thank you again, this was very helpful.
On Thu, Oct 23, 2025 at 9:52 AM Greg Sabino Mullane <htamfids(at)gmail(dot)com>
wrote:
> I think from a practical standpoint, partitioning directly on uuidv7 is
> going to cause problems. You can't directly see the partition constraints,
> you have to do tricks like your floor function to make it work, and you
> have to be super careful in how you construct your where clauses. However,
> what if you partition by the extracted timestamp? That way, queries are
> simplified, timestamps will not span multiple tables, partitions are
> human-readable again, and you can use pg_partman once more. Untested for
> large-scale performance, but something like this:
>
> \set ON_ERROR_STOP on
>
> drop schema if exists gregtest cascade;
> create schema gregtest;
> set search_path = gregtest;
>
> create table message (
> id uuid
> -- ... plus other columns
> ) partition by range (uuid_extract_timestamp(id));
>
> create table message_2025_10_22 partition of message for values from
> ('2025-10-22') to ('2025-10-23');
> create table message_2025_10_23 partition of message for values from
> ('2025-10-23') to ('2025-10-24');
> create table message_2025_10_24 partition of message for values from
> ('2025-10-24') to ('2025-10-25');
>
> create index m_2025_10_22_id on message_2025_10_22
> (uuid_extract_timestamp(id));
> create index m_2025_10_23_id on message_2025_10_23
> (uuid_extract_timestamp(id));
> create index m_2025_10_24_id on message_2025_10_24
> (uuid_extract_timestamp(id));
>
> -- Today:
> insert into message select uuidv7() from generate_series(1, 111_000);
> -- Yesterday:
> insert into message select uuidv7('-1 day') from generate_series(1,
> 222_000);
> -- Tomorrow:
> insert into message select uuidv7('+1 day') from generate_series(1,
> 333_000);
>
> set random_page_cost = 1.1; -- SSD rulez
> vacuum analyze message;
>
> select count(id) from only message;
> select count(id) from message_2025_10_22;
> select count(id) from message_2025_10_23;
> select count(id) from message_2025_10_24;
>
> explain select * from message where uuid_extract_timestamp(id) =
> '2025-10-23 10:23:45';
>
> explain select * from message where uuid_extract_timestamp(id)
> between '2025-10-23 23:00:00' and '2025-10-24 03:00:00';
>
>
> Which gives this output when run:
>
> count
> -------
> 0
>
> count
> --------
> 222000
>
> count
> --------
> 111000
>
> count
> --------
> 333000
>
>
> QUERY PLAN
>
> -----------------------------------------------------------------------------------------------------
> Index Scan using m_2025_10_23_id on message_2025_10_23 message
> (cost=0.29..5.29 rows=160)
> Index Cond: (uuid_extract_timestamp(id) = '2025-10-23
> 10:23:45-04'::timestamptz)
>
>
> QUERY PLAN
>
> -------------------------------------------------------------------------------------------------------
> Append (cost=0.29..5.04 rows=2)
> -> Index Scan using m_2025_10_23_id on message_2025_10_23 message_1
> (cost=0.29..2.51 rows=1)
> Index Cond: ((uuid_extract_timestamp(id) >= '2025-10-23
> 23:00:00-04'::timestamptz)
> AND (uuid_extract_timestamp(id) <= '2025-10-24
> 03:00:00-04'::timestamptz))
> -> Index Scan using m_2025_10_24_id on message_2025_10_24 message_2
> (cost=0.30..2.52 rows=1)
> Index Cond: ((uuid_extract_timestamp(id) >= '2025-10-23
> 23:00:00-04'::timestamptz)
> AND (uuid_extract_timestamp(id) <= '2025-10-24
> 03:00:00-04'::timestamptz))
>
>
>
> Cheers,
> Greg
>
> --
> Crunchy Data - https://www.crunchydata.com
> Enterprise Postgres Software Products & Tech Support
>
>
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Laurenz Albe | 2025-10-23 20:38:11 | Re: Performance implications of partitioning by UUIDv7 range in PostgreSQL v18 |
| Previous Message | Greg Sabino Mullane | 2025-10-23 16:51:47 | Re: Performance implications of partitioning by UUIDv7 range in PostgreSQL v18 |