Re: WIP Patch: Use sortedness of CSV foreign tables for query planning

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP Patch: Use sortedness of CSV foreign tables for query planning
Date: 2012-08-06 14:33:06
Message-ID: 9700.1344263586@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Sun, Aug 5, 2012 at 10:41 PM, Etsuro Fujita
> <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> I think file_fdw is useful for managing log files such as PG CSV logs. Since
>> such files are often sorted by timestamp, I think the patch can improve the
>> performance of log analysis, though I have to admit my demonstration was not
>> realistic.

> Hmm, I guess I could buy that as a plausible use case.

In the particular case of PG log files, I'd bet good money against them
being *exactly* sorted by timestamp. Clock skew between backends, or
varying amounts of time to construct and send messages, will result in
small inconsistencies. This would generally not matter, until the
planner relied on the claim of sortedness for something like a mergejoin
... and then it would matter a lot.
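
To illustrate the failure mode, here is a minimal Python sketch. It is not
PostgreSQL's merge join code; merge_join, hash_join, and the sample rows are
hypothetical names and data made up for the example. A textbook merge join
trusts that both inputs are sorted, so a single out-of-order row is silently
dropped instead of raising an error:

```python
def merge_join(left, right):
    """Join two lists of (key, payload) tuples, ASSUMING both are sorted by key."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1          # left key too small: advance left
        elif lk > rk:
            j += 1          # right key too small: advance right
        else:
            # Emit every right row that matches the current left row.
            k = j
            while k < len(right) and right[k][0] == lk:
                out.append((lk, left[i][1], right[k][1]))
                k += 1
            i += 1
    return out


def hash_join(left, right):
    """Order-independent join, used here only as a reference result."""
    index = {}
    for key, payload in right:
        index.setdefault(key, []).append(payload)
    return [(k, lp, rp) for k, lp in left for rp in index.get(k, [])]


# Hypothetical "log" rows whose keys 2 and 3 arrived slightly out of order.
log_rows = [(1, "a"), (3, "c"), (2, "b"), (4, "d")]
events = [(1, "x"), (2, "y"), (3, "z"), (4, "w")]

merged = merge_join(log_rows, events)
hashed = hash_join(log_rows, events)
print(len(merged), len(hashed))
```

In this sketch the merge join returns three rows and the hash join four: the
row with key 2 is lost because the right-hand cursor has already moved past
it by the time that row is read.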

In general I'm quite suspicious of the idea of believing that externally
supplied data is sorted in exactly the way that PG thinks it should
sort. If we implement this you can bet that people will screw up, for
instance by using the wrong locale/collation to sort text data.
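
The collation hazard can be sketched in Python as well. This is only an
analogy: code-point comparison stands in for a "C"-style ordering, and
str.casefold stands in for a case-insensitive linguistic collation. Neither
is claimed to match any specific PostgreSQL collation, and is_sorted and the
sample words are hypothetical:

```python
def is_sorted(values, key=lambda v: v):
    """Return True if values are in non-decreasing order under the given key."""
    keys = [key(v) for v in values]
    return all(a <= b for a, b in zip(keys, keys[1:]))


# A file that an external tool sorted case-insensitively.
file_rows = ["apple", "Banana", "cherry"]

# Sorted under the ordering the external tool used...
sorted_for_tool = is_sorted(file_rows, key=str.casefold)

# ...but not under plain code-point comparison, where "B" < "a".
sorted_for_reader = is_sorted(file_rows)

print(sorted_for_tool, sorted_for_reader)
```

The same file is "sorted" or "unsorted" depending on which comparison rule
the reader applies, so a declaration of sortedness is only meaningful
together with the exact ordering it refers to.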

regards, tom lane
