Re: Performance implications of partitioning by UUIDv7 range in PostgreSQL v18

From: Jonathan Reis <jon(dot)reis(at)conevity(dot)com>
To: Greg Sabino Mullane <htamfids(at)gmail(dot)com>
Cc: Olof Salberger <olof(dot)salberger(at)gmail(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org
Subject: Re: Performance implications of partitioning by UUIDv7 range in PostgreSQL v18
Date: 2025-10-23 20:11:11
Message-ID: CAE_7N35fiC7Rq03hWutO3tPGfADQeO_gQzKHru1iNiASmUdsPQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Greg,

Thank you very much for your recommendations and your sample code. I
originally had it your way, but then I found out this is not possible

create table message (
id uuid PRIMARY KEY
-- ... plus other columns
) partition by range (uuid_extract_timestamp(id));

whereas, this is

create table message (
id uuid PRIMARY KEY
-- ... plus other columns
) partition by range (id);

Also, I had a misunderstanding that in this query

explain select * from message where uuid_extract_timestamp(id)
between '2025-10-23 23:00:00' and '2025-10-24 03:00:00';

that having uuid_extract_timestamp(id) on the left was SARGable, but
clearly it is based on your sample.

Thank you again, this was very helpful.

On Thu, Oct 23, 2025 at 9:52 AM Greg Sabino Mullane <htamfids(at)gmail(dot)com>
wrote:

> I think from a practical standpoint, partitioning directly on uuidv7 is
> going to cause problems. You can't directly see the partition constraints,
> you have to do tricks like your floor function to make it work, and you
> have to be super careful in how you construct your where clauses. However,
> what if you partition by the extracted timestamp? That way, queries are
> simplified, timestamps will not span multiple tables, partitions are
> human-readable again, and you can use pg_partman once more. Untested for
> large-scale performance, but something like this:
>
> \set ON_ERROR_STOP on
>
> drop schema if exists gregtest cascade;
> create schema gregtest;
> set search_path = gregtest;
>
> create table message (
> id uuid
> -- ... plus other columns
> ) partition by range (uuid_extract_timestamp(id));
>
> create table message_2025_10_22 partition of message for values from
> ('2025-10-22') to ('2025-10-23');
> create table message_2025_10_23 partition of message for values from
> ('2025-10-23') to ('2025-10-24');
> create table message_2025_10_24 partition of message for values from
> ('2025-10-24') to ('2025-10-25');
>
> create index m_2025_10_22_id on message_2025_10_22
> (uuid_extract_timestamp(id));
> create index m_2025_10_23_id on message_2025_10_23
> (uuid_extract_timestamp(id));
> create index m_2025_10_24_id on message_2025_10_24
> (uuid_extract_timestamp(id));
>
> -- Today:
> insert into message select uuidv7() from generate_series(1, 111_000);
> -- Yesterday:
> insert into message select uuidv7('-1 day') from generate_series(1,
> 222_000);
> -- Tomorrow:
> insert into message select uuidv7('+1 day') from generate_series(1,
> 333_000);
>
> set random_page_cost = 1.1; -- SSD rulez
> vacuum analyze message;
>
> select count(id) from only message;
> select count(id) from message_2025_10_22;
> select count(id) from message_2025_10_23;
> select count(id) from message_2025_10_24;
>
> explain select * from message where uuid_extract_timestamp(id) =
> '2025-10-23 10:23:45';
>
> explain select * from message where uuid_extract_timestamp(id)
> between '2025-10-23 23:00:00' and '2025-10-24 03:00:00';
>
>
> Which gives this output when run:
>
> count
> -------
> 0
>
> count
> --------
> 222000
>
> count
> --------
> 111000
>
> count
> --------
> 333000
>
>
> QUERY PLAN
>
> -----------------------------------------------------------------------------------------------------
> Index Scan using m_2025_10_23_id on message_2025_10_23 message
> (cost=0.29..5.29 rows=160)
> Index Cond: (uuid_extract_timestamp(id) = '2025-10-23
> 10:23:45-04'::timestamptz)
>
>
> QUERY PLAN
>
> -------------------------------------------------------------------------------------------------------
> Append (cost=0.29..5.04 rows=2)
> -> Index Scan using m_2025_10_23_id on message_2025_10_23 message_1
> (cost=0.29..2.51 rows=1)
> Index Cond: ((uuid_extract_timestamp(id) >= '2025-10-23
> 23:00:00-04'::timestamptz)
> AND (uuid_extract_timestamp(id) <= '2025-10-24
> 03:00:00-04'::timestamptz))
> -> Index Scan using m_2025_10_24_id on message_2025_10_24 message_2
> (cost=0.30..2.52 rows=1)
> Index Cond: ((uuid_extract_timestamp(id) >= '2025-10-23
> 23:00:00-04'::timestamptz)
> AND (uuid_extract_timestamp(id) <= '2025-10-24
> 03:00:00-04'::timestamptz))
>
>
>
> Cheers,
> Greg
>
> --
> Crunchy Data - https://www.crunchydata.com
> Enterprise Postgres Software Products & Tech Support
>
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Laurenz Albe 2025-10-23 20:38:11 Re: Performance implications of partitioning by UUIDv7 range in PostgreSQL v18
Previous Message Greg Sabino Mullane 2025-10-23 16:51:47 Re: Performance implications of partitioning by UUIDv7 range in PostgreSQL v18