From: | Merlin Moncure <mmoncure(at)gmail(dot)com> |
---|---|
To: | Andy Halsall <halsall_andy(at)hotmail(dot)com> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: Can I do better than this heapscan and sort? |
Date: | 2012-06-26 13:36:54 |
Message-ID: | CAHyXU0zVGH_MsXjJu5-++V0ZYQkEBju1JZY41nqEVddrtUWoYw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Thu, Jun 21, 2012 at 3:07 PM, Andy Halsall <halsall_andy(at)hotmail(dot)com> wrote:
> I have two tables node and relationship. Each relationship record connects
> two nodes and has an application keys (unfortunately named) that can be used
> by the application to look-up a relationship and get from one node to the
> other.
>
> My query uses a node id and a description of a relationship from the node,
> and must find the "next" relationship that the node has. It does this by
> finding all the relationships that could be "next", ordering them and then
> getting the first.
>
> Details are below but I end up with 6896 candidates for "next".
>
> If I'm reading the output correctly it takes 13.509 ms to apply the filter
> and another 7 ms or so to do the sort of the remaining 6896 nodes.
>
> Have tried many index combinations to try and improve the results. I suspect
> that with so many nodes to sort, postgresql will opt for heap scan rather
> than index. But why does it not use the IDX_order_sort_down_2 index for the
> sort?
>
> Thanks,
> Andy
>
>
> Details..........
>
> Version
> -------
> PostgreSQL 9.1.2 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.5.2,
> 64-bit
>
> Tables
> ------
> CREATE TABLE node (
> node_id bigint NOT NULL,
> node_type int4 NOT NULL,
> c_state int4 NOT NULL,
> d_state int4 NOT NULL,
> sort_key bigint NOT NULL,
> permissions bytea NOT NULL,
> audit bytea NOT NULL,
> pkg_id bytea NULL,
> created timestamp NOT NULL
> );
>
> CREATE TABLE relationship (
> rel_id bigint NOT NULL,
> rel_type integer NOT NULL,
> s_t_n bigint NOT NULL,
> t_s_n bigint NOT NULL,
> state integer NOT NULL,
> control integer NOT NULL,
> sort_key bigint NOT NULL,
> prime_key bytea NULL,
> prime_key_len integer NOT NULL,
> sec_key bytea NULL,
> sec_key_len integer NOT NULL,
> up_sort_key bigint NOT NULL,
> up_prime_key bytea NULL,
> up_prime_key_len integer NOT NULL,
> up_sec_key bytea NULL,
> up_sec_key_len integer NOT NULL,
> permissions bytea NOT NULL,
> t_s_n_type integer NOT NULL,
> created timestamp NOT NULL
> );
>
> Constraints
> -----------
> -- Primary keys
> ALTER TABLE node ADD CONSTRAINT PK_node PRIMARY KEY (node_id);
>
> ALTER TABLE relationship ADD CONSTRAINT PK_relationship PRIMARY KEY
> (rel_id);
>
> -- Foreign keys
> ALTER TABLE relationship ADD CONSTRAINT FK_node_s FOREIGN KEY (s_t_n)
> REFERENCES node (node_id);
>
> ALTER TABLE relationship ADD CONSTRAINT FK_node_n FOREIGN KEY (t_s_n)
> REFERENCES node (node_id);
>
>
> Indexes
> -------
> CREATE INDEX IDX_node_type ON node (node_type ASC) TABLESPACE ds_appex_ts_10
> ;
> CREATE INDEX IDX_node_sort_key ON node (sort_key ASC) TABLESPACE
> ds_appex_ts_10
> ;
> CREATE INDEX IDX_relationship_s_t_n ON relationship (s_t_n ASC) TABLESPACE
> ds_appex_ts_10
> ;
> CREATE INDEX IDX_relationship_t_s_n ON relationship (t_s_n ASC) TABLESPACE
> ds_appex_ts_10
> ;
> CREATE INDEX IDX_relationship_type ON relationship (rel_type ASC) TABLESPACE
> ds_appex_ts_10
> ;
> CREATE INDEX IDX_relationship_prime_key ON relationship (prime_key ASC)
> TABLESPACE ds_appex_ts_10
> ;
> CREATE INDEX IDX_relationship_u_prime_key ON relationship (up_prime_key ASC)
> TABLESPACE ds_appex_ts_10
> ;
> CREATE INDEX IDX_relationship_sec_key ON relationship (sec_key ASC)
> TABLESPACE ds_appex_ts_10
> ;
> CREATE INDEX IDX_order_first ON node(sort_key DESC, node_id DESC) TABLESPACE
> ds_appex_ts_10
> ;
> CREATE INDEX IDX_order_sort_down_1 ON relationship(sort_key DESC, prime_key
> ASC NULLS FIRST, sec_key ASC NULLS FIRST) TABLESPACE ds_appex_ts_10
> ;
> CREATE INDEX IDX_order_sort_down_2 ON relationship(sort_key DESC, prime_key
> ASC NULLS FIRST, sec_key DESC NULLS FIRST) TABLESPACE ds_appex_ts_10
> ;
> CREATE INDEX IDX_order_sort_up ON relationship(up_sort_key DESC,
> up_prime_key ASC NULLS FIRST, up_sec_key ASC NULLS FIRST) TABLESPACE
> ds_appex_ts_10
> ;
>
> Query
> -----
> CREATE OR REPLACE FUNCTION sp_get_rel_sort_dup_sec_desc(in_rel_type1
> integer, in_rel_type2 integer, in_node_type integer, in_own_guid bigint,
> in_prev_prime_key bytea, in_prev_prime_key_len integer, in_prev_sec_key
> bytea, in_prev_sec_key_len integer, in_prev_sort_key bigint, in_ctrl
> integer) RETURNS select_rel_holder AS
> '
> declare
> h select_rel_holder%rowtype;
>
> begin
> SELECT INTO h r.rel_id, r.t_s_n, r.rel_type, r.sort_key,
> r.state,r.permissions, r.control,
> r.prime_key, r.prime_key_len, r.sec_key,
> r.sec_key_len,
> r.up_prime_key, r.up_prime_key_len, r.up_sec_key,
> r.up_sec_key_len
> FROM relationship r
> WHERE r.s_t_n = in_own_guid AND (r.rel_type = in_rel_type1 OR
> r.rel_type = in_rel_type2)
> AND
> (
> (
> (
> r.prime_key > in_prev_prime_key
> OR
> ( r.prime_key = in_prev_prime_key AND r.sec_key <
> in_prev_sec_key)
> )
> AND
> r.sort_key = in_prev_sort_key
> )
>
> OR
> r.sort_key < in_prev_sort_key
> )
> AND t_s_n_type = in_node_type
> AND r.control >= in_ctrl
> ORDER BY sort_key DESC, prime_key ASC NULLS FIRST, sec_key DESC
> NULLS FIRST LIMIT 1;
> RETURN h;
> end
> '
> language 'plpgsql' STABLE;
>
>
> EXPLAIN ANALYZE output
> -------------------------------
> Limit (cost=48.90..48.90 rows=1 width=89) (actual
> time=21.480..21.480 rows=1 loops=1)
> Output: rel_id, t_s_n, rel_type, sort_key, state, permissions,
> control, prime_key, prime_key_len, sec_key, sec_key_len, up_prime_key,
> up_prime_key_l
> en, up_sec_key, up_sec_key_len
>
> -> Sort (cost=48.90..48.90 rows=1 width=89) (actual
> time=21.479..21.479 rows=1 loops=1)
> Output: rel_id, t_s_n, rel_type, sort_key, state,
> permissions, control, prime_key, prime_key_len, sec_key, sec_key_len,
> up_prime_key, up_prime
> _key_len, up_sec_key, up_sec_key_len
> Sort Key: r.sort_key, r.prime_key, r.sec_key
> Sort Method: top-N heapsort Memory: 25kB
>
> -> Bitmap Heap Scan on public.relationship r
> (cost=3.39..48.89 rows=1 width=89) (actual time=1.034..13.509 rows=6986
> loops=1)
> Output: rel_id, t_s_n, rel_type, sort_key, state,
> permissions, control, prime_key, prime_key_len, sec_key, sec_key_len,
> up_prime_key, up
> _prime_key_len, up_sec_key, up_sec_key_len
> Recheck Cond: (r.s_t_n = $4)
> Filter: ((r.control >= $10) AND (r.t_s_n_type = $3)
> AND ((r.rel_type = $1) OR (r.rel_type = $2)) AND ((((r.prime_key > $5) OR
> ((r.prime_
> key = $5) AND (r.sec_key < $7))) AND (r.sort_key = $9)) OR (r.sort_key <
> $9)))
>
> -> Bitmap Index Scan on idx_relationship_s_t_n
> (cost=0.00..3.39 rows=18 width=0) (actual time=0.951..0.951 rows=6989
> loops=1)
> Index Cond: (r.s_t_n = $4)
Absolutely. You need to learn and master row-wise comparison. It was
added for exactly this purpose :-).
SELECT * FROM foo WHERE (a,b,c) > (a1,b1,c1) ORDER BY a,b,c LIMIT k;
will be fully optimized if you have an index on a,b,c (a1,b1,c1 are
the last ones you read off). Be advised that if there is not a lot of
cardinality on 'a', you may need to disable certain index plans to get
a good plan in some cases.
merlin
From | Date | Subject | |
---|---|---|---|
Next Message | Merlin Moncure | 2012-06-26 13:42:34 | Re: Can I do better than this heapscan and sort? |
Previous Message | Pavel Stehule | 2012-06-26 09:03:14 | Re: Performance of a large array access by position (tested version 9.1.3) |