Re: WIP Patch: Selective binary conversion of CSV file foreign tables

From: "Etsuro Fujita" <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: "'Kohei KaiGai'" <kaigai(at)kaigai(dot)gr(dot)jp>
Cc: "'Robert Haas'" <robertmhaas(at)gmail(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP Patch: Selective binary conversion of CSV file foreign tables
Date: 2012-06-27 03:28:06
Message-ID: 002d01cd5414$ddf9d7f0$99ed87d0$@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Kaigai-san,

> -----Original Message-----
> From: Kohei KaiGai [mailto:kaigai(at)kaigai(dot)gr(dot)jp]
> Sent: Tuesday, June 26, 2012 11:05 PM
> To: Etsuro Fujita
> Cc: Robert Haas; pgsql-hackers(at)postgresql(dot)org
> Subject: Re: [HACKERS] WIP Patch: Selective binary conversion of CSV file
> foreign tables
>
> 2012/6/26 Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>:
> > Hi Kaigai-san,
> >
> >> -----Original Message-----
> >> From: Kohei KaiGai [mailto:kaigai(at)kaigai(dot)gr(dot)jp]
> >> Sent: Monday, June 25, 2012 9:49 PM
> >> To: Etsuro Fujita
> >> Cc: Robert Haas; pgsql-hackers(at)postgresql(dot)org
> >> Subject: Re: [HACKERS] WIP Patch: Selective binary conversion of CSV file
> >> foreign tables
> >>
> >> Fujita-san,
> >>
> >> The revised patch is almost good for me.
> >> Only point I noticed is the check for redundant or duplicated options I
> > pointed
> >> out on the previous post.
> >> So, if you agree with the attached patch, I'd like to hand over this patch
> for
> >> committers.
> >
> > OK I agree with you.  Thanks for the revision!
> >
> > Besides the revision, I modified check_selective_binary_conversion() to run
> > heap_close() in the whole-row-reference case.  Attached is an updated
version
> of
> > the patch.
> >
> Sorry, I overlooked this code path.

No, It's my fault.

> So, I'd like to take over this patch for committers.

Thanks,

Best regards,
Etsuro Fujita

>
> Thanks,
>
> > Thanks.
> >
> > Best regards,
> > Etsuro Fujita
> >
> >> All I changed is:
> >>  --- a/src/backend/commands/copy.c
> >>  +++ b/src/backend/commands/copy.c
> >>  @@ -122,6 +122,11 @@ typedef struct CopyStateData @@ -217,7 +221,7 @@
index
> >> 98bcb2f..0143d81 100644
> >>         }
> >>  +      else if (strcmp(defel->defname, "convert_binary") == 0)
> >>  +      {
> >> -+          if (cstate->convert_binary)
> >> ++          if (cstate->convert_selectively)
> >>  +              ereport(ERROR,
> >>  +                      (errcode(ERRCODE_SYNTAX_ERROR),
> >>  +                       errmsg("conflicting or redundant options")));
> >>
> >> Thanks,
> >>
> >> 2012/6/20 Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>:
> >> > Hi KaiGai-san,
> >> >
> >> > Thank you for the review.
> >> >
> >> >> -----Original Message-----
> >> >> From: pgsql-hackers-owner(at)postgresql(dot)org
> >> >> [mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Kohei KaiGai
> >> >> Sent: Wednesday, June 20, 2012 1:26 AM
> >> >> To: Etsuro Fujita
> >> >> Cc: Robert Haas; pgsql-hackers(at)postgresql(dot)org
> >> >> Subject: Re: [HACKERS] WIP Patch: Selective binary conversion of CSV
> >> >> file foreign tables
> >> >>
> >> >> Hi Fujita-san,
> >> >>
> >> >> Could you rebase this patch towards the latest tree?
> >> >> It was unavailable to apply the patch cleanly.
> >> >
> >> > Sorry, I updated the patch.  Please find attached an updated version
> >> > of the patch.
> >> >
> >> >> I looked over the patch, then noticed a few points.
> >> >>
> >> >> At ProcessCopyOptions(), defel->arg can be NIL, isn't it?
> >> >> If so, cstate->convert_binary is not a suitable flag to check
> >> >> redundant option. It seems to me cstate->convert_selectively is more
> >> >> suitable flag to check it.
> >> >>
> >> >> +       else if (strcmp(defel->defname, "convert_binary") == 0)
> >> >> +       {
> >> >> +           if (cstate->convert_binary)
> >> >> +               ereport(ERROR,
> >> >> +                       (errcode(ERRCODE_SYNTAX_ERROR),
> >> >> +                        errmsg("conflicting or redundant
> >> >> + options")));
> >> >> +           cstate->convert_selectively = true;
> >> >> +           if (defel->arg == NULL || IsA(defel->arg, List))
> >> >> +               cstate->convert_binary = (List *) defel->arg;
> >> >> +           else
> >> >> +               ereport(ERROR,
> >> >> +                       (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> >> >> +                        errmsg("argument to option \"%s\" must be a
> >> >> list of column names",
> >> >> +                               defel->defname)));
> >> >> +       }
> >> >
> >> > Yes, defel->arg can be NIL.  defel->arg is a List structure listing
> >> > all the columns needed to be converted to binary representation, which
> >> > is NIL for the case where no columns are needed to be converted.  For
> >> > example,
> >> > defel->arg is NIL for SELECT COUNT(*).  In this case, while
> >> > cstate->convert_selectively is set to true, no columns are converted
> >> > cstate->at
> >> > NextCopyFrom().  Most efficient case!  In short,
> >> > cstate->convert_selectively represents whether to do selective binary
> >> > conversion at NextCopyFrom(), and
> >> > cstate->convert_binary represents all the columns to be converted at
> >> > NextCopyFrom(), that can be NIL.
> >> >
> >> >> At NextCopyFrom(), this routine computes default values if configured.
> >> >> In case when these values are not referenced, it might be possible to
> >> >> skip unnecessary calculations.
> >> >> Is it unavailable to add logic to avoid to construct cstate->defmap
> >> >> on unreferenced columns at  BeginCopyFrom()?
> >> >
> >> > I think that we don't need to add the above logic because file_fdw
> >> > does
> >> > BeginCopyFrom() with attnamelist = NIL, in which case, BeginCopyFrom()
> >> > doesn't construct cstate->defmap at all.
> >> >
> >> > I fixed a bug plus some minor optimization in
> >> > check_binary_conversion() that is renamed to
> >> > check_selective_binary_conversion() in the updated version, and now
> >> > file_fdw gives up selective binary conversion for the following
> >> > cases:
> >> >
> >> >  a) BINARY format
> >> >  b) CSV/TEXT format and whole row reference
> >> >  c) CSV/TEXT format and all the user attributes needed
> >> >
> >> >
> >> > Best regards,
> >> > Etsuro Fujita
> >> >
> >> >> Thanks,
> >> >>
> >> >> 2012/5/11 Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>:
> >> >> >> -----Original Message-----
> >> >> >> From: Robert Haas [mailto:robertmhaas(at)gmail(dot)com]
> >> >> >> Sent: Friday, May 11, 2012 1:36 AM
> >> >> >> To: Etsuro Fujita
> >> >> >> Cc: pgsql-hackers(at)postgresql(dot)org
> >> >> >> Subject: Re: [HACKERS] WIP Patch: Selective binary conversion of
> >> >> >> CSV file foreign tables
> >> >> >>
> >> >> >> On Tue, May 8, 2012 at 7:26 AM, Etsuro Fujita
> >> >> > <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
> >> >> >> wrote:
> >> >> >> > I would like to propose to improve parsing efficiency of
> >> >> >> > contrib/file_fdw by selective parsing proposed by Alagiannis et
> >> >> >> > al.[1], which means that for a CSV/TEXT file foreign table,
> >> >> >> > file_fdw performs binary conversion only for the columns needed
> >> >> >> > for query processing.  Attached is a WIP patch implementing the
> > feature.
> >> >> >>
> >> >> >> Can you add this to the next CommitFest?  Looks interesting.
> >> >> >
> >> >> > Done.
> >> >> >
> >> >> > Best regards,
> >> >> > Etsuro Fujita
> >> >> >
> >> >> >> --
> >> >> >> Robert Haas
> >> >> >> EnterpriseDB: http://www.enterprisedb.com The Enterprise
> >> >> >> PostgreSQL Company
> >> >> >>
> >> >> >
> >> >> >
> >> >> >
> >> >> > --
> >> >> > Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> >> >> > To make changes to your subscription:
> >> >> > http://www.postgresql.org/mailpref/pgsql-hackers
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>
> >> >>
> >> >> --
> >> >> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org) To
> >> >> make changes to your subscription:
> >> >> http://www.postgresql.org/mailpref/pgsql-hackers
> >> >>
> >> >
> >> >
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>
>
>
>
> --
> KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2012-06-27 03:28:14 Re: Posix Shared Mem patch
Previous Message Tom Lane 2012-06-27 02:40:54 Re: Posix Shared Mem patch