Re: [PATCH] SortSupport for macaddr type

From: Brandur Leach <brandur(at)mutelight(dot)org>
To: Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [PATCH] SortSupport for macaddr type
Date: 2017-02-07 17:06:58
Message-ID: CABR_9B_RTT=OXR2YM2x2XNwOKiSRdWRbpY2z7StBca6VTz+YOg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

And as a short follow up, I've moved the patch to the
current commit fest:

https://commitfest.postgresql.org/13/743/

On Sun, Feb 5, 2017 at 1:56 PM, Brandur Leach <brandur(at)mutelight(dot)org> wrote:

> Hi Julien,
>
> Thank you for taking the time to do this review, and my
> apologies for the very delayed response. I lost track of
> this work and have only jumped back into it today.
>
> Please find attached a new version of the patch with your
> feedback integrated. I've also rebased the patch against
> the current master and selected a new OID because my old
> one is now in use.
>
> > * you used macaddr_cmp_internal() function name, for uuid
> > the same function is named uuid_internal_cmp(). Using
> > the same naming pattern is probably better.
>
> I was a little split on this one! It's true that UUID uses
> `_internal_cmp`, but `_cmp_internal` is also used in a
> number of places like `enum`, `timetz`, and `network`. I
> don't have a strong feeling about it either way, so I've
> changed it to `_internal_cmp` to match UUID as you
> suggested.
>
> > * the function comment on macaddr_abbrev_convert()
> > doesn't mention specific little-endian handling
>
> I tried to bake this into the comment text. Here are the
> relevant lines of the amended version:
>
> * Packs the bytes of a 6-byte MAC address into a Datum and treats it
> as an
> * unsigned integer for purposes of comparison. On a 64-bit machine,
> there
> * will be two zeroed bytes of padding. The integer is converted to
> native
> * endianness to facilitate easy comparison.
>
> > * "There will be two bytes of zero padding on the least
> > significant end"
> >
> > "least significant bits" would be better
>
> Also done. Here is the amended version:
>
> * On a 64-bit machine, zero out the 8-byte datum and copy the 6 bytes
> of
> * the MAC address in. There will be two bytes of zero padding on the
> end
> * of the least significant bits.
>
> > * This patch will trigger quite a lot modifications after
> > a pgindent run. Could you try to run pgindent on mac.c
> > before sending an updated patch?
>
> Good call! I've run the new version through pgindent.
>
> Let me know if you have any further feedback and/or
> suggestions. Thanks!
>
> Brandur
>
>
> On Wed, Sep 14, 2016 at 3:14 AM, Julien Rouhaud <julien(dot)rouhaud(at)dalibo(dot)com
> > wrote:
>
>> On 26/08/2016 19:44, Brandur wrote:
>> > Hello,
>> >
>>
>> Hello,
>>
>> > I've attached a patch to add SortSupport for Postgres' macaddr which
>> has the
>> > effect of improving the performance of sorting operations for the type.
>> The
>> > strategy that I employ is very similar to that for UUID, which is to
>> create
>> > abbreviated keys by packing as many bytes from the MAC address as
>> possible into
>> > Datums, and then performing fast unsigned integer comparisons while
>> sorting.
>> >
>> > I ran some informal local benchmarks, and for cardinality greater than
>> 100k
>> > rows, I see a speed up on `CREATE INDEX` of roughly 25% to 65%. (For
>> those
>> > interested, I put a few more numbers into a small report here [2].)
>> >
>>
>> That's a nice improvement!
>>
>> > Admittedly, this is not quite as useful as speeding up sorting on a
>> more common
>> > data type like TEXT or UUID, but the change still seems like a useful
>> > performance improvement. I largely wrote it as an exercise to
>> familiarize
>> > myself with the Postgres codebase.
>> >
>> > I'll add an entry into the current commitfest as suggested by the
>> Postgres Wiki
>> > and follow up here with a link.
>> >
>> > Thanks, and if anyone has feedback or other thoughts, let me know!
>> >
>>
>> I just reviewed your patch. It applies and compiles cleanly, and the
>> abbrev feature works as intended. There's not much to say since this is
>> heavily inspired on the uuid SortSupport. The only really specific part
>> is in the abbrev_converter function, and I don't see any issue with it.
>>
>> I have a few trivial comments:
>>
>> * you used macaddr_cmp_internal() function name, for uuid the same
>> function is named uuid_internal_cmp(). Using the same naming pattern is
>> probably better.
>>
>> * the function comment on macaddr_abbrev_convert() doesn't mention
>> specific little-endian handling
>>
>> * "There will be two bytes of zero padding on the least significant end"
>>
>> "least significant bits" would be better
>>
>> * This patch will trigger quite a lot modifications after a pgindent
>> run. Could you try to run pgindent on mac.c before sending an updated
>> patch?
>>
>> Best regards.
>>
>> --
>> Julien Rouhaud
>> http://dalibo.com - http://dalibo.org
>>
>
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David Fetter 2017-02-07 17:10:17 Re: Idea on how to simplify comparing two sets
Previous Message Michael Banck 2017-02-07 16:51:28 Re: Press Release Draft - 2016-02-09 Cumulative Update