Re: [HACKERS] Solution for LIMIT cost estimation

From: Chris Bitmead <chrisb(at)nimrod(dot)itg(dot)telstra(dot)com(dot)au>
To: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] Solution for LIMIT cost estimation
Date: 2000-02-14 04:32:31
Message-ID: 38A7855F.DB3EF2CD@nimrod.itg.telecom.com.au
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tom Lane wrote:
>
> Chris Bitmead <chrisb(at)nimrod(dot)itg(dot)telstra(dot)com(dot)au> writes:
> > Tom Lane wrote:
> >> I have currently got it working (I think; not too well tested yet)
> >> using the proposal I offered before of "pay attention to the size
> >> of LIMIT, but ignore OFFSET", so that the same query plan will be
> >> derived from similar queries with different OFFSETs. Does anyone
> >> have a substantial gripe with that compromise?
>
> > Would offset be any use if you did make use of it?
>
> Yes, because the number of tuples that will *actually* get fetched
> is offset+limit. If you had a large offset so that the tuples
> getting returned were from somewhere near the end of the query,
> then choosing a fast-start algorithm would be a Bad Idea; you'd
> really want a plan that optimizes on the basis of total cost
> rather than startup cost.
> Hmm, I'm on the verge of talking myself out of the compromise ;-).
> I'm not sure how many people will really use large offsets, but
> anyone who does might be a pretty unhappy camper. If you're asking
> for OFFSET 1000000 LIMIT 1, the thing might pick a nested loop
> which is exceedingly fast-start ... but also exceedingly expensive
> when you go ahead and fetch many tuples anyway.
>
> Perhaps we should stick to two alternatives:
>
> 1. If LIMIT is present, optimize on an assumption that X% of the
> tuples are fetched, where X does *not* depend on the specific
> values given for OFFSET or LIMIT. (But we could make X a settable
> parameter...)
>
> 2. Optimize using OFFSET+LIMIT as the expected number of tuples to
> fetch. Document that varying OFFSET or LIMIT will not necessarily
> generate consistent results unless you specify ORDER BY to force a
> consistent tuple order.
>
> I don't really like #1, but I can see where #2 might cause some
> unhappiness as well. Comments, opinions?

I agree you should probably go the whole hog one way or the other. I
think
ignoring offset+limit is a useful option, but like I said at the
beginning, it doesn't bother me _that_ much.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Don Baccus 2000-02-14 04:41:35 Re: [HACKERS] Suggestion to split /data/base directory
Previous Message Tom Lane 2000-02-14 04:24:35 Re: [HACKERS] Solution for LIMIT cost estimation