From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Sutou Kouhei <kou(at)clear-code(dot)com> |
Cc: | david(dot)g(dot)johnston(at)gmail(dot)com, tgl(at)sss(dot)pgh(dot)pa(dot)us, zhjwpku(at)gmail(dot)com, michael(at)paquier(dot)xyz, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Make COPY format extendable: Extract COPY TO format implementations |
Date: | 2025-03-23 09:01:59 |
Message-ID: | CAD21AoAfWrjpTDJ0garVUoXY0WC3Ud4Cu51q+ccWiotm1uo_2A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Mar 19, 2025 at 6:25 PM Sutou Kouhei <kou(at)clear-code(dot)com> wrote:
>
> Hi,
>
> In <CAKFQuwaMAFMHqxDXR=SxA0mDjdmntrwxZd2w=nSruLNFH-OzLw(at)mail(dot)gmail(dot)com>
> "Re: Make COPY format extendable: Extract COPY TO format implementations" on Wed, 19 Mar 2025 17:49:49 -0700,
> "David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
>
> >> And could someone help (take over if possible) writing a
> >> document for this feature? I'm not good at writing a
> >> document in English... 0009 in the attached v37 patch set
> >> has a draft of it. It's based on existing documents in
> >> doc/src/sgml/ and *.h.
> >>
> >>
> > I haven't touched the innards of the structs aside from changing
> > programlisting to synopsis. And redoing the two section opening paragraphs
> > to better integrate with the content in the chapter opening.
> >
> > The rest I kinda went to town on...
>
> Thanks!!! It's very helpful!!!
>
> I've applied your patch. 0009 is only changed.
Thank you for updating the patches. I've reviewed the main part of
supporting the custom COPY format. Here are some random comments:
---
+/*
+ * Process the "format" option.
+ *
+ * This function checks whether the option value is a built-in format such as
+ * "text" and "csv" or not. If the option value isn't a built-in format, this
+ * function finds a COPY format handler that returns a CopyToRoutine (for
+ * is_from == false) or CopyFromRountine (for is_from == true). If no COPY
+ * format handler is found, this function reports an error.
+ */
I think this comment needs to be updated as the part "If the option
value isn't ..." is no longer true.
I think we don't necessarily need to create a separate function
ProcessCopyOptionFormat for processing the format option.
We need more regression tests for handling the given format name. For example,
- more various input patterns.
- a function with the specified format name exists but it returns an
unexpected Node.
- looking for a handler function in a different namespace.
etc.
---
I think that we should accept qualified names too as the format name
like tablesample does. That way, different extensions implementing the
same format can be used.
---
+ if (routine == NULL || !IsA(routine, CopyFromRoutine))
+ ereport(
+ ERROR,
+ (errcode(
+
ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("COPY handler function "
+ "%u did not return "
+ "CopyFromRoutine struct",
+ opts->handler)));
It's not conventional to put a new line between 'ereport(' and 'ERROR'
(similarly between 'errcode(' and 'ERRCODE_...'. Also, we don't need
to split the error message into multiple lines as it's not long.
---
+ if (routine == NULL || !IsA(routine, CopyToRoutine))
+ ereport(
+ ERROR,
+ (errcode(
+
ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("COPY handler function "
+ "%u did not return "
+ "CopyToRoutine struct",
+ opts->handler)));
Same as the above comment.
---
+ descr => 'pseudo-type for the result of a copy to/from method function',
s/method function/format function/
---
+ Oid handler; /* handler
function for custom format routine */
'handler' is used also for built-in formats.
---
+static void
+CopyFromInFunc(CopyFromState cstate, Oid atttypid,
+ FmgrInfo *finfo, Oid *typioparam)
+{
+ ereport(NOTICE, (errmsg("CopyFromInFunc: atttypid=%d", atttypid)));
+}
OIDs could be changed across major versions even for built-in types. I
think it's better to avoid using it for tests.
---
+static void
+CopyToOneRow(CopyToState cstate, TupleTableSlot *slot)
+{
+ ereport(NOTICE, (errmsg("CopyToOneRow: tts_nvalid=%u",
slot->tts_nvalid)));
+}
Similar to the above comment, the field name 'tts_nvalid' might also
be changed in the future, let's use another name.
---
+static const CopyFromRoutine CopyFromRoutineTestCopyFormat = {
+ .type = T_CopyFromRoutine,
+ .CopyFromInFunc = CopyFromInFunc,
+ .CopyFromStart = CopyFromStart,
+ .CopyFromOneRow = CopyFromOneRow,
+ .CopyFromEnd = CopyFromEnd,
+};
I'd suggest not using the same function names as the fields.
---
+/*
+ * Export CopySendEndOfRow() for extensions. We want to keep
+ * CopySendEndOfRow() as a static function for
+ * optimization. CopySendEndOfRow() calls in this file may be optimized by a
+ * compiler.
+ */
+void
+CopyToStateFlush(CopyToState cstate)
+{
+ CopySendEndOfRow(cstate);
+}
Is there any reason to use a different name for public functions?
---
+/*
+ * Export CopyGetData() for extensions. We want to keep CopyGetData() as a
+ * static function for optimization. CopyGetData() calls in this file may be
+ * optimized by a compiler.
+ */
+int
+CopyFromStateGetData(CopyFromState cstate, void *dest, int minread,
int maxread)
+{
+ return CopyGetData(cstate, dest, minread, maxread);
+}
+
The same as the above comment.
---
+ /* For custom format implementation */
+ void *opaque; /* private space */
How about renaming 'private'?
---
I've not reviewed the documentation patch yet but I think the patch
seems to miss the updates to the description of the FORMAT option in
the COPY command section.
---
I think we can reorganize the patch set as follows:
1. Create copyto_internal.h and change COPY_XXX to COPY_SOURCE_XXX and
COPY_DEST_XXX accordingly.
2. Support custom format for both COPY TO and COPY FROM.
3. Expose necessary helper functions such as CopySendEndOfRow().
4. Add CopyFromSkipErrorRow().
5. Documentation.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Richard Guo | 2025-03-23 09:25:15 | Re: Reduce "Var IS [NOT] NULL" quals during constant folding |
Previous Message | Richard Guo | 2025-03-23 08:50:25 | Re: Reduce "Var IS [NOT] NULL" quals during constant folding |