Re: Optimize date query for large child tables: GiST or GIN?

From: David Jarvis <thangalin(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Thom Brown <thombrown(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Optimize date query for large child tables: GiST or GIN?
Date: 2010-05-21 00:46:43
Message-ID: AANLkTinAs8sWmMjEynf1K0DNsmaIiqRRNqVR5tNlkbuC@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi,

Something in here really smells fishy to me. Those extract's above are
> working on values which are from the table.. Why aren't you using these
> functions to figure out how to construct the actual dates based on the
> values provided by the *user*..?
>

Because I've only been using PostgreSQL for one week. For the last several
years I've been developing with Oracle on mid-sized systems (40 million
books, 5 million reservations per year, etc.). And even then, primarily on
the user-facing side of the applications.

Looking at your screenshot, I think you need to take those two date
> values that the user provides, make them into actual dates (maybe you
> need a CASE statement or something similar, that shouldn't be that hard,
>

So the user selects Dec 22 and Mar 22 for 1900 to 2009 and the system feeds
the report a WHERE clause that looks like:

m.taken BETWEEN '22-12-1900'::date AND '22-03-1901'::date and
m.taken BETWEEN '22-12-1901'::date AND '22-03-1902'::date and
m.taken BETWEEN '22-12-1902'::date AND '22-03-1903'::date and ...

That tightly couples the report query to the code that sets the report
engine parameters. One of the parameters would be SQL code in the form of a
dynamically crafted WHERE clause. I'd rather keep the SQL code that is used
to create the report entirely with the report engine if at all possible.

> Also, you're trying to do constraint_exclusion, but have you made sure
> that it's turned on? And have you made sure that those constraints are
> really the right ones and that they make sense? You're using a bunch of
> extract()'s there too, why not just specify a CHECK constraint on the
> date ranges which are allowed in the table..?
>

I don't know what the date ranges are? So I can't partition them by year?

Right now I created 72 child tables by using the category and month. This
may have been a bad choice. But at least all the data is in the system now
so dissecting or integrating it back in different ways shouldn't take days.

Thanks everyone for all your help, I really appreciate the time you've taken
to guide me in the right direction to make the system as fast as it can be.

Dave

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message David Jarvis 2010-05-21 02:02:34 Re: Optimize date query for large child tables: GiST or GIN?
Previous Message David Jarvis 2010-05-21 00:28:26 Re: Optimize date query for large child tables: GiST or GIN?