Re: Optimize date query for large child tables: GiST or GIN?

From: Thom Brown <thombrown(at)gmail(dot)com>
To: David Jarvis <thangalin(at)gmail(dot)com>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Optimize date query for large child tables: GiST or GIN?
Date: 2010-05-20 18:36:36
Message-ID: AANLkTimk7lRS8FHJOot3tF-c6Ds4jexCvWKEQNU1V0Xt@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 20 May 2010 17:36, David Jarvis <thangalin(at)gmail(dot)com> wrote:
> Hi, Thom.
>
> The query is given two items:
>
> Range of years
> Range of days
>
> I need to select all data between the range of days (e.g., Dec 22 - Mar 22)
> over the range of years (e.g., 1950 - 1970), such as shown here:
>
> http://i.imgur.com/MUkuZ.png
>
> For Jun 1 to Jul 1 it would be no problem because they the same year. But
> for Dec 22 to Mar 22, it is difficult because Mar 22 is in the next year
> (relative to Dec 22).
>
> How do I do that without strings?
>
> Dave
>
>

Okay, get your app to convert the month-date to a day of year, so we
have year_start, year_end, day_of_year_start, day_of_year_end

and your where clause would something like this:

WHERE extract(YEAR from m.taken) BETWEEN year1 and year2
AND (
extract(DOY from m.taken) BETWEEN day_of_year_start AND day_of_year_end
OR (
extract(DOY from m.taken) >= day_of_year_start OR extract(DOY from
m.taken) <= day_of_year_end
)
)

... substituting the placeholders where they appear.

So if we had:

year1=1941
year2=1952
day_of_year_start=244 (based on input date of 1st September)
day_of_year_end=94 (based on 4th April)

We'd have:

WHERE extract(YEAR from m.taken) BETWEEN 1941 and 1952
AND (
extract(DOY from m.taken) BETWEEN 244 AND 94
OR (
extract(DOY from m.taken) >= 244 OR extract(DOY from m.taken) <= 94
)
)

Then you could add expression indexes for the YEAR and DOY extract parts, like:

CREATE INDEX idx_taken_doy ON climate.measurement (EXTRACT(DOY from taken));
CREATE INDEX idx_taken_year ON climate.measurement (EXTRACT(YEAR from taken));

Although maybe you don't need those, depending on how the date
datatype matching works in the planner with the EXTRACT function.

Regards

Thom

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Thom Brown 2010-05-20 18:58:53 Re: Optimize date query for large child tables: GiST or GIN?
Previous Message David Jarvis 2010-05-20 15:43:26 Re: Optimize date query for large child tables: GiST or GIN?