Re: SQL/MED - file_fdw

From: Shigeru HANADA <hanada(at)metrosystems(dot)co(dot)jp>
To: Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: SQL/MED - file_fdw
Date: 2011-02-18 13:10:56
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

On Wed, 16 Feb 2011 16:48:33 +0900
Itagaki Takahiro <itagaki(dot)takahiro(at)gmail(dot)com> wrote:

> On Tue, Feb 8, 2011 at 00:30, Shigeru HANADA <hanada(at)metrosystems(dot)co(dot)jp> wrote:
> > Fixed version is attached.
> I reviewed your latest git version, that is a bit newer than the attached patch.
> The code still works with small adjustment, but needs to be rebased on the
> latest master, especially for extension support and copy API changes.
> Here are a list of comments and suggestions:

Thanks for the comments. Revised version of patch has been attached.

> * You might forget some combination or unspecified options in
> file_fdw_validator().
> For example, format == NULL or !csv && header cases. I've not tested all
> cases, but please recheck validations used in BeginCopy().

Right, I've revised validation based on BeginCopy(), and added
regression tests about validation.

> * estimate_costs() needs own row estimation rather than using baserel.
> > > What value does baserel->tuples have?
> > > Foreign tables are never analyzed for now. Is the number correct?
> > No, baserel->tuples is always 0, and baserel->pages is 0 too.
> > In addition, width and rows are set to 100 and 1, as default values.
> It means baserel is not reliable at all, right?

Right, tables which has not been ANALYZEd have default stats in
baserel. But getting # of records needs another parsing for the file...

> If so, we need alternative
> solution in estimate_costs(). We adjust statistics with runtime relation
> size in estimate_rel_size(). Also, we use get_rel_data_width() for not
> analyzed tables. We could use similar technique in file_fdw, too.

Ah, using get_relation_data_width(), exported version of
get_rel_data_width(), seems to help estimation. I'll research around
it little more. By the way, adding ANALYZE support for foreign tables
is reasonable idea for this issue?

> * Should use int64 for file byte size (or, uint32 in blocks).
> unsigned long might be 32bit. ulong is not portable.
> * Oid List (list_make1_oid) might be more handy than Const to hold relid
> in FdwPlan.fdw_private.
> * I prefer FileFdwExecutionState to FileFdwPrivate, because there are
> two different 'fdw_private' variables in FdwPlan and FdwExecutionState.
> * The comment in fileIterate seems wrong. It should be
> /* Store the next tuple as a virtual tuple. */ , right?
> * #include <sys/stat.h> is needed.

Fixed all of above.

Shigeru Hanada

Attachment Content-Type Size
20110218-file_fdw.patch.gz application/octet-stream 9.8 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-02-18 13:28:58 Re: contrib loose ends: 9.0 to 9.1 incompatibilities
Previous Message Robert Haas 2011-02-18 12:37:46 Re: About the performance of startup after dropping many tables