Re: strange buildfarm failure on lionfish

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Stefan Kaltenbrunner" <stefan(at)kaltenbrunner(dot)cc>, "pgsql-hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: strange buildfarm failure on lionfish
Date: 2007-07-24 16:53:02
Message-ID: 87r6mxzqtt.fsf@oxford.xeocode.com
Lists: pgsql-hackers

"Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> Stefan Kaltenbrunner <stefan(at)kaltenbrunner(dot)cc> writes:
>> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=lionfish&dt=2007-07-24%2005:30:13
>> any ideas ?
>
> I saw what I think was the identical failure last night on my own
> machine, but it wasn't repeatable. Evidently the planner is changing to
> a different plan for those queries, but why has this only started
> recently? Maybe the recent changes to autovacuum defaults are causing
> autovac to hit these tables when it never did before?

Indeed, the only alternate plan I can imagine for this is doing the join the
other way around. And given the large difference in size between the two
tables, the only way I could get that to happen was by obliterating the
statistics entirely for one table while keeping stats for the other.
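
For the record, one way to get close to that state by hand is something like
the following -- a hypothetical reproduction on a scratch copy of the
regression database, not necessarily how a real system would get there, and
deleting from pg_statistic requires superuser:

  -- wipe the column statistics for one table, keep them for the other
  DELETE FROM pg_statistic WHERE starelid = 'interval_tbl'::regclass;
  ANALYZE timestamp_tbl;

With that, the planner has real column stats for timestamp_tbl but falls back
to its default guesses for interval_tbl, which is what produces the first plan
below.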

This does raise a possible issue with autovacuum. Treating ANALYZE like VACUUM
and running it on individual tables one at a time is probably the wrong thing
to be doing. What really ought to happen is that it runs ANALYZE on all the
tables together in a single transaction and commits all the new stats at once.
Out-of-sync stats can be worse than out-of-date stats.
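
To illustrate what I mean: plain ANALYZE, unlike VACUUM, is allowed inside a
transaction block, so the per-database pass could in principle batch things up
along these lines, with all the new pg_statistic rows becoming visible to other
backends together at commit. A rough sketch only; the table names are just the
ones from this test case:

  -- hypothetical sketch: commit related tables' statistics atomically
  BEGIN;
  ANALYZE timestamp_tbl;
  ANALYZE interval_tbl;
  COMMIT;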

With stats on timestamp_tbl but not on interval_tbl:

                                                              QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=2.02..48.29 rows=432 width=24) (actual time=0.112..1.515 rows=104 loops=1)
   ->  Seq Scan on interval_tbl i  (cost=0.00..36.55 rows=9 width=16) (actual time=0.036..0.070 rows=2 loops=1)
         Filter: ((f1 >= '00:00:00'::interval) AND (f1 <= '23:00:00'::interval))
   ->  Materialize  (cost=2.02..2.50 rows=48 width=8) (actual time=0.030..0.377 rows=52 loops=2)
         ->  Seq Scan on timestamp_tbl t  (cost=0.00..1.97 rows=48 width=8) (actual time=0.048..0.333 rows=52 loops=1)
               Filter: ((d1 >= '1990-01-01 00:00:00'::timestamp without time zone) AND (d1 <= '2001-01-01 00:00:00'::timestamp without time zone))
 Total runtime: 1.904 ms
(7 rows)
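
The statement being explained isn't reproduced here; judging from the filter
conditions and the row widths it's essentially a restricted cross join of the
two regression-test tables, something along the lines of:

  EXPLAIN ANALYZE
  SELECT t.d1, i.f1
    FROM timestamp_tbl t, interval_tbl i
   WHERE t.d1 BETWEEN timestamp '1990-01-01' AND timestamp '2001-01-01'
     AND i.f1 BETWEEN interval '00:00' AND interval '23:00';

(The exact select list may differ; the filters are exactly what shows up in
the plans.)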

All other combinations perform the join the other way around:

With both analyzed:

                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=1.15..6.37 rows=144 width=24) (actual time=0.109..1.653 rows=104 loops=1)
   ->  Seq Scan on timestamp_tbl t  (cost=0.00..1.97 rows=48 width=8) (actual time=0.063..0.356 rows=52 loops=1)
         Filter: ((d1 >= '1990-01-01 00:00:00'::timestamp without time zone) AND (d1 <= '2001-01-01 00:00:00'::timestamp without time zone))
   ->  Materialize  (cost=1.15..1.18 rows=3 width=16) (actual time=0.003..0.008 rows=2 loops=52)
         ->  Seq Scan on interval_tbl i  (cost=0.00..1.15 rows=3 width=16) (actual time=0.017..0.052 rows=2 loops=1)
               Filter: ((f1 >= '00:00:00'::interval) AND (f1 <= '23:00:00'::interval))
 Total runtime: 2.025 ms
(7 rows)

With no stats at all:

                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=36.56..80.89 rows=99 width=24) (actual time=0.147..1.698 rows=104 loops=1)
   ->  Seq Scan on timestamp_tbl t  (cost=0.00..42.10 rows=11 width=8) (actual time=0.101..0.388 rows=52 loops=1)
         Filter: ((d1 >= '1990-01-01 00:00:00'::timestamp without time zone) AND (d1 <= '2001-01-01 00:00:00'::timestamp without time zone))
   ->  Materialize  (cost=36.56..36.65 rows=9 width=16) (actual time=0.003..0.008 rows=2 loops=52)
         ->  Seq Scan on interval_tbl i  (cost=0.00..36.55 rows=9 width=16) (actual time=0.017..0.053 rows=2 loops=1)
               Filter: ((f1 >= '00:00:00'::interval) AND (f1 <= '23:00:00'::interval))
 Total runtime: 2.063 ms
(7 rows)

With stats on interval_tbl but not timestamp_tbl:

                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=1.15..44.00 rows=33 width=24) (actual time=0.100..1.725 rows=104 loops=1)
   ->  Seq Scan on timestamp_tbl t  (cost=0.00..42.10 rows=11 width=8) (actual time=0.055..0.351 rows=52 loops=1)
         Filter: ((d1 >= '1990-01-01 00:00:00'::timestamp without time zone) AND (d1 <= '2001-01-01 00:00:00'::timestamp without time zone))
   ->  Materialize  (cost=1.15..1.18 rows=3 width=16) (actual time=0.003..0.008 rows=2 loops=52)
         ->  Seq Scan on interval_tbl i  (cost=0.00..1.15 rows=3 width=16) (actual time=0.016..0.051 rows=2 loops=1)
               Filter: ((f1 >= '00:00:00'::interval) AND (f1 <= '23:00:00'::interval))
 Total runtime: 2.104 ms
(7 rows)

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
