Re: range_agg

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Paul A Jungwirth <pj(at)illuminatedcomputing(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Jeff Davis <pgsql(at)j-davis(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, David Fetter <david(at)fetter(dot)org>
Subject: Re: range_agg
Date: 2020-12-15 23:21:47
Message-ID: CAPpHfduQ3iwkOH+myncE7Gb=dAn9vM=KrH2oALN0LQzi+XPOqA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 8, 2020 at 3:20 AM Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
> I'd like to publish my revision of the patch. So Paul could start
> from it. The changes I made are minor
> 1. Add missing types to typedefs.list
> 2. Run pg_indent run over the changed files and some other formatting changes
> 3. Reorder the regression tests to evade the error spotted by
> commitfest.cputube.org
>
> I'm switching this patch to WOA.

I decided to work on this patch myself. The next revision is attached.

The changes are as follows.

1. CREATE TYPE ... AS RANGE command now accepts new argument
multirange_type_name. If multirange_type_name isn't specified, then
multirange type name is selected automatically. pg_dump always
specifies multirange_type_name (if dumping at least pg14). Thanks to
that dumps are always restorable.
2. Multiranges now have a new binary format. After the MultirangeType
struct, an array of offsets comes, then an array of flags and finally
bounds themselves. Offsets points to the bounds of particular range
within multirange. Thanks to that particular range could be accessed
by number without deserialization of the whole multirange. Offsets
are stored in compression-friendly format similar to jsonb (actually
only every 4th of those "offsets" is really offsets, others are
lengths).
3. Most of simple functions working with multirages now don't
deserialize the whole multirange. Instead they fetch bounds of
particular ranges, and that doesn't even require any additional memory
allocation.
4. I've removed ExpandedObject support from the patch. I don't see
much point in it assuming all the functions are returning serialized
multirage anyway. We can add ExpandedObject support in future if
needed.
5. multirange_contains_element(), multirange_contains_range(),
multirange_overlaps_range() now use binary search. Thanks to binary
format, which doesn't require full deserialization, these functions
now work with O(log N) complexity.

Comments and documentation still need revision according to these
changes. I'm going to continue with this.

------
Regards,
Alexander Korotkov

Attachment Content-Type Size
v26-multirange.patch application/octet-stream 434.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-12-15 23:22:30 Re: Sorting case branches in outfuncs.c/outNode alphabetically
Previous Message Alastair Turner 2020-12-15 23:13:12 Re: Proposed patch for key managment