Re: Improve error message for duplicate labels in enum types

From: Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>
To: Rahila Syed <rahilasyed90(at)gmail(dot)com>
Cc: Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Improve error message for duplicate labels in enum types
Date: 2025-07-04 05:48:52
Message-ID: 20250704144852.8fb6cd2aba5b8d259798bc77@sraoss.co.jp
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Rahila,

On Fri, 4 Jul 2025 07:42:58 +0530
Rahila Syed <rahilasyed90(at)gmail(dot)com> wrote:

> Thank you for sharing the patch.
> +1 to the idea of improving the error message.

Thank you for your review.

> Please take the following points mentioned into consideration.
>
> 1. I’m considering whether there might be a more efficient way to handle
> this.
> The current method adds an extra loop to check for duplicates, in addition
> to the existing duplicate index check,
> even when no duplicates are present. Would it be possible to achieve this
> by wrapping the following
> insert call in a PG_TRY() and PG_CATCH() block and logging more descriptive
> error in the PG_CATCH() block?

Although this introduces a loop-in-loop for checkin the dulicates, I believe
the impact to the performance is not high because the number of values in an
enum would be not so large and creating an enum type is not executed so fequently.
I found check_duplicates_in_publist() and interpret_function_parameter_list take
similar ways, where duplicates are checked in nested loops.

Although this introduces a nested loop to check for duplicates, I believe the
performance impact is negligible, since the number of values in an enum is typically
small, and enum type creation is not a frequent operation.
I found that check_duplicates_in_publist() and interpret_function_parameter_list()
take a similar approach, using nested loops to check for duplicates.

If we were to use PG_TRY and PG_CATCH block, we wouldn't be able to identify exactly
which label is duplicated.

> 2. If we choose to include the check in the 0001 patch you provided, would
> it make more sense to place
> it earlier in the function, before assigning OIDs to the labels and running
> qsort? This way, we could
> catch duplicates sooner and prevent unnecessary processing.

If the duplicate check were done before the loop, we would have to add an extra nested
loop, which seems wasteful. The loop is the first place where the actual label text is
accessed using strVal(), and there's already a check for the label length here.
So I believe this is also the appropriate place to check for duplicates.

Regards,
Yugo Nagata

--
Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Yugo Nagata 2025-07-04 05:58:05 Re: Prevent internal error at concurrent CREATE OR REPLACE FUNCTION
Previous Message Peter Smith 2025-07-04 05:44:27 Re: TOAST versus toast