| From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
|---|---|
| To: | Sugamoto Shinya <shinya34892(at)gmail(dot)com> |
| Cc: | Kirill Reshke <reshkekirill(at)gmail(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: [PATCH] Add error hints for invalid COPY options |
| Date: | 2025-12-03 22:55:44 |
| Message-ID: | CAD21AoBC9O80OJchfyjnnxCZV-YfR-fDXyCKLNxdXdTNFYVk1Q@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Tue, Dec 2, 2025 at 3:42 AM Sugamoto Shinya <shinya34892(at)gmail(dot)com> wrote:
>
>
>
> On Sat, Nov 29, 2025 at 9:36 PM Sugamoto Shinya <shinya34892(at)gmail(dot)com> wrote:
>>
>>
>>
>> On Fri, Nov 28, 2025 at 2:59 AM Kirill Reshke <reshkekirill(at)gmail(dot)com> wrote:
>>>
>>> On Wed, 26 Nov 2025 at 11:55, Sugamoto Shinya <shinya34892(at)gmail(dot)com> wrote:
>>> >
>>> >
>>> >
>>> > 2025年11月25日(火) 6:50 Nathan Bossart <nathandbossart(at)gmail(dot)com>:
>>> >>
>>> >> On Mon, Nov 24, 2025 at 11:56:34AM -0800, Masahiko Sawada wrote:
>>> >> > On Sat, Nov 22, 2025 at 8:33 PM Sugamoto Shinya <shinya34892(at)gmail(dot)com> wrote:
>>> >> >> This follows the pattern already used elsewhere in PostgreSQL for providing
>>> >> >> helpful error hints to users.
>>> >> >
>>> >> > Given we have 15 COPY options now, it sounds like a reasonable idea.
>>> >> >
>>> >> > One concern about the patch is that when adding a new COPY option, we
>>> >> > could miss updating valid_copy_options list, resulting in providing a
>>> >> > wrong suggestion. I think we can consider refactoring the COPY option
>>> >> > handling so that we check the given option is a valid name or not by
>>> >> > checking valid_copy_options array and then process the option value.
>>> >>
>>> >> +1. Ideally, folks wouldn't need to update a separate list when adding new
>>> >> options.
>>> >>
>>> >> >> Additionally, this patch corrects a misleading comment for the
>>> >> >> convert_selectively option. The comment stated it was "not-accessible-from-SQL",
>>> >> >> but actualy it has been accessible from SQL due to PostgreSQL's generic option parser.
>>> >> >> The updated comment clarifies that while technically accessible, it's intended for
>>> >> >> internal use and not recommended for end-user use due to potential data loss.
>>> >> >
>>> >> > Hmm, I'm not sure the proposed comment improves the clarification.
>>> >> > It's essentially non-accessible from SQL since we cannot provide a
>>> >> > valid value for convert_selectively from SQL commands.
>>> >>
>>> >> Yeah, I'd leave it alone, at least for this patch.
>>> >>
>>> >> --
>>> >> nathan
>>> >>
>>> >>
>>> >
>>> >
>>> > Thanks for checking my proposal.
>>> >
>>> >
>>> > For the refactoring of the COPY options, it sounds reasonable to me. Let me take that changes in my patch.
>>>
>>>
>>> Also one little thing:
>>>
>>>
>>> >+{
>>> >+ {"default", copy_opt_default, true},
>>> >+ {"delimiter", copy_opt_delimiter, true},
>>> >+ {"encoding", copy_opt_encoding, true},
>>> >+ {"escape", copy_opt_escape, true},
>>> >+ {"force_not_null", copy_opt_force_not_null, true},
>>> >+ {"force_null", copy_opt_force_null, true},
>>> >+ {"force_quote", copy_opt_force_quote, true},
>>> >+ {"format", copy_opt_format, true},
>>> >+ {"freeze", copy_opt_freeze, true},
>>> >+ {"header", copy_opt_header, true},
>>> >+ {"log_verbosity", copy_opt_log_verbosity, true},
>>> >+ {"null", copy_opt_null, true},
>>> >+ {"on_error", copy_opt_on_error, true},
>>> >+ {"quote", copy_opt_quote, true},
>>> >+ {"reject_limit", copy_opt_reject_limit, true},
>>> >+ {"convert_selectively", copy_opt_convert_selectively, false},
>>> >+ {NULL, NULL, false}
>>> >+};
>>>
>>> Maybe we need one more struct member here, to indicate which options
>>> are valid to be specified by user?
>>>
>>> Also, pattern
>>>
>>> static const struct {..} array_name[] = ... is not used in PostgreSQL
>>> sources. At least, I do not see any use of such .
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Best regards,
>>> Kirill Reshke
>>
>>
>>
>> Thanks for checking my proposal.
>>
>>
>> > Maybe we need one more struct member here, to indicate which options
>> > are valid to be specified by user?
>>
>> If you don't mind, I would like to make a separate patch for fixing the "convert_selectively"
>> option and focus on refactoring error handling here because I tend to feel we should
>> separate refactoring changes and non-backward compatible changes into different commits.
>> After this patch gets merged, I'll make another thread to discuss whether we should block
>> unexpected "convert_selectively" use or not.
>>
>>
>> > static const struct {..} array_name[] = ... is not used in PostgreSQL
>> > sources. At least, I do not see any use of such .
>>
>> I saw several places that use that sort of style, for example src/backend/utils/adt/encode.c:836
>> and src/backend/access/heap/heapam.c:122, but you seems to be more or less correct since
>> usually we define types explicitly like src/backend/foreign/foreign.c:576 and src/backend/backup/basebackup.c:191.
>> I updated my patch by defining a new type `CopyCoptionDef`.
>>
>>
>> Also, I added improvements to helper functions like defGet**. I just removed and unified those
>> into corresponding proceeOption** functions.
>>
>> Regards,
>
>
>
> Hi,
>
>
> Just a friendly ping on this thread.
>
>
> In the latest version of the patch, I refactored COPY option handling so that:
>
> All the COPY options and their validation functions are defined in a single table (CopyOptionDef), and
>
> Error hints for invalid option names/values are generated based on that table.
>
>
> The goal was to make it harder to forget updating the error-hinting logic when adding new options, and to keep validation logic in one place.
>
>
> But on the other hand, I can also simplify this if you feel the current approach is too heavy. For example, one alternative would be to keep the existing per-option handling and just add a minimal option check like validate_copy_option() near the top of the main options loop in order to keep our implementations simple and small, even if that does not completely eliminate the chance of someone missing an update.
>
> This is an alternative approche what I mentioned here.
>
> ```
> list valid_options = ["format", "force_null", ...]
>
> foreach(opt, options)
> {
> validate_copy_option(opt, valid_options) <--- THIS
>
> if (opt.name == "format") ...
> if (opt.name == "force_null") ...
> ...
> }
> ```
The proposed patch requires us to create one function per option. I'm
concerned that it could cause bloating functions and be overkill just
to save changing a separate list. I would suggest starting with
putting a validation check for options at the top of foreach() loop.
When adding a new COPY option in the future, it wouldn't work if we
miss either changing the valid-options list or handling the option,
which seems a good prevention.
>
> Regarding convert_selectively, I have kept behavior and comments unchanged in this patch. As I said, I plan to propose a separate patch to address the possibility of users specifying convert_selectively from SQL (e.g., by rejecting it in the parser), once we agree on the direction for this refactoring.
+1 for a separate discussion and patch. Thank you for pointing out
that it actually can be accessible via SQL command.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Chao Li | 2025-12-03 23:04:02 | Re: Cleanup shadows variable warnings, round 1 |
| Previous Message | Sami Imseih | 2025-12-03 22:40:26 | Re: explain plans for foreign servers |