Re: [HACKERS] Optimizer fails?

From: dg(at)illustra(dot)com (David Gould)
To: mimo(at)interdata(dot)com(dot)pl (Michal Mosiewicz)
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: [HACKERS] Optimizer fails?
Date: 1998-03-25 05:48:59
Message-ID: 9803250549.AA20100@hawk.illustra.com

Michal Mosiewicz asks:
> xxxx=> \d logdt
>
> Table   = logdt
> +-------+-----------+--------+
> | Field | Type      | Length |
> +-------+-----------+--------+
> | dt    | timestamp | 4      |
> +-------+-----------+--------+
> xxxx=> explain select * from log where dt < '19980203';
> NOTICE: QUERY PLAN:
>
> Seq Scan on log (cost=105832.15 size=699588 width=62)
>
> There is an index on log table, dt field. The index is b-tree.
> However it doesn't seem to be used. (Of course I have vacuumed it). The
> table has about 2M records. I don't think that Seq Scan is a good idea.

This is almost certainly correct optimizer behavior, unless there are very few
rows that would qualify under "< '19980203'". Do you know how many rows
were returned?

The key to all this is that scanning sequentially is very cheap. Disks like
it, the OS filesystem is optimized for it, striping across multiple devices
is a big win... It is easy to scan 10 MB per second on PC-class hardware.

Using an index means using random I/O for each row. Unless the target table
fits in the buffer cache this means you are limited to examining a few
dozen rows per second (good disks can sustain about 80 to 100 random IOs
per second).

Back of the envelope calculation:

Assume

rowsize = 100 bytes.
Sequential I/O rate = 1 MB/second.
Random I/O rate = 50 IOs/second.

Then

2M rows @ 10,000 rows per MB = 200 MB, which at 1 MB/second = 200 seconds.

200 seconds * 50 IO per second = 10,000 IOs

So

If fewer than 10,000 rows (0.5% of the table) would qualify, the index
scan might be faster. Otherwise the table scan is optimal.

This calculation ignores the overhead of actually scanning the index and
probably underestimates the sequential I/O rate, so in real life the table
scan may look even better than this.
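
To make the arithmetic easy to play with, here is a tiny C sketch of the
break-even estimate above. The constants are the same assumed figures, not
measurements, so treat the output as a rough guide only.

#include <stdio.h>

int main(void)
{
    double rows      = 2e6;     /* rows in the table             */
    double row_size  = 100.0;   /* bytes per row (assumed)       */
    double seq_rate  = 1e6;     /* sequential I/O, bytes/second  */
    double rand_rate = 50.0;    /* random I/Os per second        */

    double table_bytes = rows * row_size;         /* 200 MB            */
    double seq_secs    = table_bytes / seq_rate;  /* 200 seconds       */
    double breakeven   = seq_secs * rand_rate;    /* 10,000 index I/Os */

    printf("seq scan: %.0f s, break-even: %.0f rows (%.2f%% of table)\n",
           seq_secs, breakeven, 100.0 * breakeven / rows);
    return 0;
}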

> Also, what if I aggregate it on dt field to count(*) or sum some values.
> It would be sequentially scanned, then sorted, then grouped and finally
> aggregated, right?
>
> Well, sometimes it may be good enough. But if this table is big enough,
> it would be wiser to use index to select ranges from the table and then
> aggregate those values without sorting.

Assuming a table and query like:

create table foo (d date, m money);  -- 2M rows of this, index on d.

select d, sum(m) from foo
group by d;

I would be very surprised if there existed a better plan than to sort the
table and then scan the sorted table (once) to do the grouping and
aggregation.
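
To make the one-pass grouping concrete, here is a rough C sketch (toy data in
arrays standing in for the sorted tuple stream, not the actual executor code):
once the input is ordered by d, a running sum is emitted each time the key
changes.

#include <stdio.h>

int main(void)
{
    /* stands in for the output of the sort step: tuples already ordered by d */
    int    d[] = { 19980201, 19980201, 19980202, 19980203, 19980203 };
    double m[] = { 1.50,     0.75,     2.25,     4.00,     1.00 };
    int    n   = 5;

    int    cur = d[0];
    double sum = 0.0;

    for (int i = 0; i < n; i++)
    {
        if (d[i] != cur)                /* key changed: emit the finished group */
        {
            printf("%d  %.2f\n", cur, sum);
            cur = d[i];
            sum = 0.0;
        }
        sum += m[i];
    }
    printf("%d  %.2f\n", cur, sum);     /* emit the last group */
    return 0;
}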

If you knew the number of 'd' values was smallish, you might try scanning
the table and building a hash containing the d values and aggregates.
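
A rough sketch of that idea in C (toy data and a toy open-addressing hash,
assumed for illustration, not anything from the backend):

#include <stdio.h>

#define NBUCKETS 1024   /* must comfortably exceed the number of distinct d values */

typedef struct { int used; int d; double sum; } Bucket;

static Bucket buckets[NBUCKETS];

static void accumulate(int d, double m)
{
    unsigned h = (unsigned) d % NBUCKETS;
    while (buckets[h].used && buckets[h].d != d)    /* linear probing */
        h = (h + 1) % NBUCKETS;
    if (!buckets[h].used)
    {
        buckets[h].used = 1;
        buckets[h].d = d;
        buckets[h].sum = 0.0;
    }
    buckets[h].sum += m;
}

int main(void)
{
    /* stands in for a single sequential scan of foo(d, m), in any order */
    int    d[] = { 19980203, 19980201, 19980202, 19980201 };
    double m[] = { 4.00,     1.50,     2.25,     0.75 };

    for (int i = 0; i < 4; i++)
        accumulate(d[i], m[i]);

    for (int i = 0; i < NBUCKETS; i++)
        if (buckets[i].used)
            printf("%d  %.2f\n", buckets[i].d, buckets[i].sum);
    return 0;
}

The win over the sort plan is that the table is read once in physical order
and nothing is sorted; the cost is memory for one entry per distinct key,
which is why it only pays off when the number of d values is smallish.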

I don't know if postgreSQL has this strategy. It is fairly uncommon to
actually find it implemented in real systems.

> Once I saw index based agregates in the TODO list, but somehow it
> disappeared.

Do you mean using covering indexes? This would be very worthwhile, but might
be a bit of a job.

-dg

David Gould dg(at)illustra(dot)com 510.628.3783 or 510.305.9468
Informix Software (No, really) 300 Lakeside Drive Oakland, CA 94612
- Linux. Not because it is free. Because it is better.
