Re: WIP: Collecting statistics on CSV file data

From: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: WIP: Collecting statistics on CSV file data
Date: 2011-10-18 05:45:19
Message-ID: 4E9D126F.5010508@lab.ntt.co.jp
Lists: pgsql-hackers

(2011/10/18 2:27), Shigeru Hanada wrote:
> The new patch could be applied with some shifts. Regression tests of
> core and file_fdw have passed cleanly. Though I've tested only simple
> tests, ANALYZE works for foreign tables for file_fdw, and estimation of
> costs and selectivity seem appropriate.

Thank you for testing.

> New API AnalyzeForeignTable
> ===========================

> And I think that AnalyzeForeignTable should be optional, and it would be
> very useful if a default handler is provided. Probably a default
> handler can use basic FDW APIs to acquire sample rows from the result of
> "SELECT * FROM foreign_table" with skipping periodically. It won't be
> efficient but I think it's not so unreasonable.

I agree with you. However, I think it is difficult to support such a
default handler in a unified way, because there are external data
sources for which we cannot execute "SELECT * FROM foreign_table", e.g.,
web-accessible databases that restrict full access to their contents.
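
For data sources that do allow a full scan, the default handler discussed
above could be sketched roughly as follows. This is only an illustration,
not code from the patch: it uses plain reservoir sampling (Algorithm R)
in place of "skipping periodically", because the total row count is not
known before the scan ends. The names next_row_fn, sample_rows,
counter_state and counter_next are hypothetical, and a row is reduced to
a long identifier to keep the example self-contained.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical row source: stores the next row in *row and returns 1,
 * or returns 0 when the scan is finished.  A real FDW would return a
 * tuple here; a long stands in for it in this sketch.
 */
typedef int (*next_row_fn) (void *state, long *row);

/*
 * Reservoir sampling (Algorithm R) over a scan of unknown length.
 * Keeps at most targrows rows in sample[], returns the number kept,
 * and stores the number of rows scanned in *totalrows.
 */
static int
sample_rows(next_row_fn next, void *state,
            long *sample, int targrows, double *totalrows)
{
    int     numrows = 0;    /* rows currently held in the reservoir */
    long    seen = 0;       /* rows read from the source so far */
    long    row;

    while (next(state, &row))
    {
        seen++;
        if (numrows < targrows)
            sample[numrows++] = row;    /* reservoir not yet full */
        else
        {
            /*
             * Replace a random slot with probability targrows / seen.
             * rand() % seen is slightly biased and limited by RAND_MAX;
             * that is acceptable for a sketch only.
             */
            long    k = rand() % seen;

            if (k < targrows)
                sample[k] = row;
        }
    }

    *totalrows = (double) seen;
    return numrows;
}

/* Toy row source for demonstration: yields 0 .. limit - 1. */
typedef struct
{
    long    next;
    long    limit;
} counter_state;

static int
counter_next(void *state, long *row)
{
    counter_state *cs = (counter_state *) state;

    if (cs->next >= cs->limit)
        return 0;
    *row = cs->next++;
    return 1;
}
```

Calling sample_rows(counter_next, &cs, sample, 10, &total) with a
counter_state of {0, 1000} reads all 1000 rows and keeps 10 of them.
That full read is exactly the cost concern for remote sources, and the
approach does not work at all when the source refuses a full scan.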

> Other issues
> ============
> There are some other comments about non-critical issues.
> - When there is no analyzable column, vac_update_relstats is not called.
> Is this behavior intentional?
> - psql can't complete foreign table name after ANALYZE.
> - A new parameter has been added to vac_update_relstats in a recent
> commit. Perhaps 0 is OK for that parameter.

I'll check.

> - ANALYZE without relation name ignores foreign tables because
> get_rel_oids doesn't list foreign tables.

I think that it might be better to ignore foreign tables by default
because analyzing such tables may take a long time, depending on the FDW.

> - IMO logging "analyzing foo.bar" should not be done in
> AnalyzeForeignTable handler of each FDW because some FDW might forget to
> do it. Maybe it should be pulled up to analyze_rel or somewhere in core.
> - It should be mentioned in a document that foreign tables are not
> analyzed automatically because they are read-only.

OK. I'll revise.

> Regards,

Best regards,
Etsuro Fujita
