Skip site navigation (1) Skip section navigation (2)

Re: Sequential scan instead of index scan

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Ioannis Anagnostopoulos <ioannis(at)anatec(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Sequential scan instead of index scan
Date: 2012-08-07 17:42:08
Message-ID: CAMkU=1zYCHv4J6gw319inngd25-dsgWk+NRXNDVg5jTGr+0gug@mail.gmail.com (view raw or flat)
Thread:
Lists: pgsql-performance
On Tue, Aug 7, 2012 at 9:06 AM, Ioannis Anagnostopoulos
<ioannis(at)anatec(dot)com> wrote:
> On 07/08/2012 17:00, Jeff Janes wrote:
>>
>> What happens if you set "enable_seqscan=off" and run the query with
>> the very large list?  (This is an experiment, not a recommendation for
>> production use)
>>
>>
>> Cheers,
>>
>> Jeff
>
> As Tom said, the actual question is not valid. Seq scan are not bad,

Right, that is why I proposed it as an experiment, not for production use.

> we just
> need to understand the way around it instead of forcing them off.

I think the first step to understanding the way around it is to force
it off, and see what the planner thinks it's next best option is, and
why it thinks that.


> In my
> case, the problem was the ARRAY as a parameter (which all together is not
> that great for holding so many data).

I think the only thing that is great for holding that much data is a
query against live permanent tables which returns it.  Given the
choice between stuffing it in an ARRAY and stuffing it in a temp table
and then manually analyzing it, neither one of those seems
fundamentally better than the other at the scale of 300,000.


> By converting it into a temporary
> table and performing an inner join in the query (after analysing the temp
> table) you get a nice Hash join (or Merge Join if you don't analyse the temp
> table).

I don't see those as being very good.  The "primary key" part of the
query is far more selective than the date part, so what you are doing
is fetching a huge number of rows only to throw out the vast majority
of them.

I think the optimal plan would be a bitmap scan on the indexes of the
"primary key" column.  This should automatically take advantage of the
sequential read nature of the table data to the extent the results are
well clustered, and if they aren't clustered it should benefit from
effective_io_concurrency if that is set appropriately.

Cheers,

Jeff

In response to

pgsql-performance by date

Next:From: Craig JamesDate: 2012-08-07 18:48:23
Subject: Is drop/restore trigger transactional?
Previous:From: Ioannis AnagnostopoulosDate: 2012-08-07 16:06:54
Subject: Re: Sequential scan instead of index scan

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group