Re: PostgreSQL 8.3.3 chooses wrong query plan when LIMIT 1 added?

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: Mark Cave-Ayland <mark(dot)cave-ayland(at)siriusit(dot)co(dot)uk>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: PostgreSQL 8.3.3 chooses wrong query plan when LIMIT 1 added?
Date: 2008-10-28 08:47:56
Message-ID: 87skqhcdhf.fsf@oxford.xeocode.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


This looks like another form of the cross-column dependency problem. Postgres
is assuming that the revisions for all files will be evenly spread throughout
the date range and apparently there's a larger variety of dates than files so
it expects to find the last revision for that file fairly quickly scanning
backwards through the dates.

In fact of course files tend to be hot for a period of time and then mostly
idle, so depending on which file you pick that may work well if it's currently
hot or be absolutely terrible if it's a file that hasn't been touched
recently.

With the LIMIT Postgres favours the plan it thinks will return one row quickly
without sorting. Without it it's favouring the plan that will return all the
rows for that file_id most quickly.

I'm not sure what to suggest for this case if you can't change the data model
except perhaps increasing the statistics target.

One thing that comes to mind though, I would have defined one of those two
indexes to include both columns. Probably the file_id index, so you would have
an index on <revision_id> and an index on <file_id,revision_id>. That would
be a huge win for this query.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's Slony Replication support!

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Thomas 2008-10-28 09:42:47 Re: How to know the password for the user 'postgres'
Previous Message Ivano Luberti 2008-10-28 07:52:51 Re: Replication with slony-I