Re: PG planning randomly ?

From: "Laurent Raufaste" <analogue(at)glop(dot)org>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: PG planning randomly ?
Date: 2008-02-27 17:38:48
Message-ID: 669dc9710802270938j78d6fc80tc3e588ece1add4f8@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

2008/2/26, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>
> That's the right RESTRICT function, but what exactly did you mean by
> "first definition"? Are there more?

I thought it was enough, here is the complete definition of the <@ operator:

--
-- Name: <@; Type: OPERATOR; Schema: public; Owner: postgres
--

CREATE OPERATOR <@ (
PROCEDURE = ltree_risparent,
LEFTARG = ltree,
RIGHTARG = ltree,
COMMUTATOR = @>,
RESTRICT = ltreeparentsel,
JOIN = contjoinsel
);

ALTER OPERATOR public.<@ (ltree, ltree) OWNER TO postgres;

--
-- Name: <@; Type: OPERATOR; Schema: public; Owner: postgres
--

CREATE OPERATOR <@ (
PROCEDURE = _ltree_r_isparent,
LEFTARG = ltree,
RIGHTARG = ltree[],
COMMUTATOR = @>,
RESTRICT = contsel,
JOIN = contjoinsel
);

ALTER OPERATOR public.<@ (ltree, ltree[]) OWNER TO postgres;

--
-- Name: <@; Type: OPERATOR; Schema: public; Owner: postgres
--

CREATE OPERATOR <@ (
PROCEDURE = _ltree_risparent,
LEFTARG = ltree[],
RIGHTARG = ltree,
COMMUTATOR = @>,
RESTRICT = contsel,
JOIN = contjoinsel
);

ALTER OPERATOR public.<@ (ltree[], ltree) OWNER TO postgres;

--
-- Name: <@; Type: OPERATOR; Schema: public; Owner: postgres
--

CREATE OPERATOR <@ (
PROCEDURE = hs_contained,
LEFTARG = hstore,
RIGHTARG = hstore,
COMMUTATOR = @>,
RESTRICT = contsel,
JOIN = contjoinsel
);

ALTER OPERATOR public.<@ (hstore, hstore) OWNER TO postgres;

>
> It may be that it's just not possible for the estimator to come up with
> accurate rowcount estimates given the amount of info it has available.
> The query you are complaining about confuses the issue quite a lot by
> involving other issues. Would you try just "explain analyze select 1
> from _commment where path <@ '....';" for various typical path values,
> and see if it's coming up with estimated rowcounts that are in the right
> ballpark compared to the actual ones?
>

It might be the source of the problem =)
I executed the following query on all the servers with a varying path
(but with the same path on each server), before and after lauching an
ANALYZE _comment.

EXPLAIN ANALYZE SELECT 1
FROM _comment
WHERE path <@ '0.1.810879'
;

On every server except one it showed the same plan before and after the ANALYZE:
Bitmap Heap Scan on _comment (cost=174.87..6163.31 rows=1536
width=0) (actual time=1.072..1.495 rows=1070 loops=1)
Recheck Cond: (path <@ '0.1.14155763'::ltree)
-> Bitmap Index Scan on gist_idx_comment_path (cost=0.00..174.48
rows=1536 width=0) (actual time=1.058..1.058 rows=1070 loops=1)
Index Cond: (path <@ '0.1.14155763'::ltree)
Total runtime: 1.670 ms

On a random server, the plan before the ANALYZE was:
Bitmap Heap Scan on _comment (cost=15833.00..440356.99 rows=155649
width=0) (actual time=1.581..2.885 rows=1070 loops=1)
Recheck Cond: (path <@ '0.1.14155763'::ltree)
-> Bitmap Index Scan on gist_idx_comment_path
(cost=0.00..15794.09 rows=155649 width=0) (actual time=1.552..1.552
rows=1070 loops=1)
Index Cond: (path <@ '0.1.14155763'::ltree)
Total runtime: 3.160 ms

The runtime is ok, but the planned cost is huge, because the row count
of the index scan estimates 100x more rows. After the ANALYZE it was
like the others. If this wrong row count happens, I understand why the
planner try to find an alternative plan in the first query I showed
you in a previous mail.

How can I help him to better estimate the row count ? Setting
default_stats_target to 1000 did not help =(

--
Laurent Raufaste
<http://www.glop.org/>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Bill Moran 2008-02-27 17:40:57 Re: questions about CLUSTER
Previous Message Frits Hoogland 2008-02-27 17:36:52 Re: how to identify expensive steps in an explain analyze output