Re: A very long running query....

From: Ioannis Anagnostopoulos <ioannis(at)anatec(dot)com>
To: Marc Mamin <M(dot)Mamin(at)intershop(dot)de>
Cc: Claudio Freire <klaussfreire(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: A very long running query....
Date: 2012-07-21 09:22:09
Message-ID: 500A74C1.8020109@anatec.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice pgsql-performance

On 21/07/2012 10:16, Marc Mamin wrote:
> RE: [PERFORM] A very long running query....
>
> Hello,
> isn't the first test superfluous here ?
>
> > where extract('day' from message_copies.msg_date_rec) = 17
> > and date_trunc('day', message_copies.msg_date_rec) = '2012-07-17'
>
>
> > Here is the index:
> >
> > CREATE INDEX idx_message_copies_wk2_date_src_pos_partial
> > ON feed_all_y2012m07.message_copies_wk2
> > USING btree
> > (date_trunc('day'::text, msg_date_rec),
> > src_id,
> > (((pos_georef1::text || pos_georef2::text) || pos_georef3::text)
> || pos_georef4::text))
> > TABLESPACE archive
> > WHERE (((pos_georef1::text || pos_georef2::text) ||
> pos_georef3::text) || pos_georef4::text) IS NOT NULL
> > OR NOT (((pos_georef1::text || pos_georef2::text) ||
> pos_georef3::text) || pos_georef4::text) = ''::text;
>
>
> the georef test can be simplified using coalesce:
>
> > and (message_copies.pos_georef1 || message_copies.pos_georef2
> || message_copies.pos_georef3 || message_copies.pos_georef4) <> ''
> > and not (message_copies.pos_georef1 || message_copies.pos_georef2
> || message_copies.pos_georef3 || message_copies.pos_georef4) is null
> =>
> and coaesce (
> (message_copies.pos_georef1 || message_copies.pos_georef2 ||
> message_copies.pos_georef3 || message_copies.pos_georef4),
> '') <> ''
>
> In order to avoid this test at query time you might add a boolean
> column message_copies.pos.has_georef,
> and keep it up to date with a before insert or update trigger. This
> will allow to shorten your index definition and simplify the planner
> task a little bit.
> Moreover it will fasten your query in cases when the index don't get used.
>
> As Tom already mentioned it, it may make sense not to concatenate the
> georef within the index, but keep them separated, or even keep them in
> different indexes.
> Which is the best depend on the other queries running against this table
>
> HTH,
>
> Marc Mamin
>
>
>
> -----Original Message-----
> From: pgsql-performance-owner(at)postgresql(dot)org on behalf of Ioannis
> Anagnostopoulos
> Sent: Sat 7/21/2012 1:56 AM
> To: Tom Lane
> Cc: Claudio Freire; pgsql-performance(at)postgresql(dot)org
> Subject: Re: [PERFORM] A very long running query....
>
> On 21/07/2012 00:10, Tom Lane wrote:
> > Claudio Freire <klaussfreire(at)gmail(dot)com> writes:
> >> Looking at this:
> >> " -> Index Scan using
> >> idx_message_copies_wk2_date_src_pos_partial on message_copies_wk2
> >> message_copies (cost=0.00..19057.93 rows=52 width=32) (actual
> >> time=62.124..5486270.845 rows=387524 loops=1)"
> >> " Index Cond: ((date_trunc('day'::text,
> >> msg_date_rec) = '2012-07-17 00:00:00'::timestamp without time zone)
> >> AND (src_id = 1))"
> >> " Filter: ((date_part('day'::text,
> >> msg_date_rec) = 17::double precision) AND (NOT (((((pos_georef1)::text
> >> || (pos_georef2)::text) || (pos_georef3)::text) ||
> >> (pos_georef4)::text) IS NULL)) AND (((((pos_georef1)::text ||
> >> (pos_georef2)::text) || (pos_georef3)::text) || (pos_georef4)::text)
> >> <> ''::text))"
> > I think the real problem is that the planner has no hope of doing
> > anything very accurate with such an unwieldy filter condition. I'd look
> > at ways of making the filter conditions simpler, perhaps by recasting
> > the data representation. In particular, that's a horridly bad way of
> > asking whether some columns are empty, which I gather is the intent.
> > If you really want to do it just like that, creating an index on the
> > concatenation expression would guide ANALYZE to collect some stats about
> > it, but it would probably be a lot more efficient to put together an AND
> > or OR of tests on the individual columns.
> >
> > regards, tom lane
> So what you suggest is to forget all together the concatenation of the
> georef1/2/3/4 and instead alter my query with something like:
>
> georef1 is not null and not georeg1 = ''....etc for georef2 3 and 4
>
> That would require to alter my index and have the four georef columns
> separately in it and not as a concatenation and so on for the partial
> index part. And a final thing, you seem to imply that the indexes are
> used by the analyser to collect statistics even if they are not used. So
> an index serves not only as a way to speed up targeted queries but also
> to provide better statistics to the analyzer?
>
> Kind Regards
> Yiannis
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance
>
>
No because it is used to select a partition. Otherwise it will go
through the whole hierarchy...

In response to

Responses

Browse pgsql-novice by date

  From Date Subject
Next Message Tom Lane 2012-07-21 16:28:56 Re: Backend freezes at "idle in transaction" while using libpq PQexec (PostgreSQL version 9.1.2)
Previous Message Marc Mamin 2012-07-21 09:16:16 Re: A very long running query....

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2012-07-21 16:58:20 Re: A very long running query....
Previous Message Marc Mamin 2012-07-21 09:16:16 Re: A very long running query....