Thx and additional Q's .....

From: Dave Crooke <dcrooke(at)gmail(dot)com>
To: Joe Conway <mail(at)joeconway(dot)com>
Cc: pgsql-performance <pgsql-performance(at)postgresql(dot)org>
Subject: Thx and additional Q's .....
Date: 2010-02-23 23:37:04
Message-ID: ca24673e1002231537q475ca112j23ca8b1212e75ee@mail.gmail.com
Lists: pgsql-performance

Thanks Joe.

1. In my case, I'm erring on the side of not using the limited partitioning
support in PG 8.3 (the version we're on) .... since I'm generating new tables
all the time, I need to dynamically generate the DML anyway, and it's
actually less code to do my own calculation on the application side to work
out which table name to use (as opposed to dynamically creating all the
constraints and triggers needed to get PG to do it). Doing almost everything
in vanilla SQL also makes it much easier to port (I'm going to need to
support Oracle within the next 12 months). I have the luxury of having no
application code that hits the database directly: it all goes through one
persistence manager class which encapsulates both CRUD operations and bulk
stats queries, so I don't have to fake the existence of the old table with
views or such.
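
To make the routing concrete, here's a minimal sketch of the pattern (the
DDL is a reconstruction from the query plans below; the real column types
aren't shown in this thread):

-- hypothetical per-day raw table; names taken from the plans below, types guessed
CREATE TABLE perf_raw_2010_02_23 (
    a          integer   NOT NULL,  -- equipment id
    b          integer   NOT NULL,  -- performance counter type
    time_stamp timestamp NOT NULL,
    value      bigint    NOT NULL
);

-- the application computes the table name from the timestamp, e.g.
-- 'perf_raw_' || to_char(ts, 'YYYY_MM_DD'), and splices it into otherwise
-- vanilla SQL rather than relying on PG to route the rows:
INSERT INTO perf_raw_2010_02_23 (a, b, time_stamp, value)
VALUES (1, 14, '2010-02-23 09:05', 42);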

2. The idea of partitioning by "b", the performance counter type column, is
an interesting one .... I am definitely going to consider it for a future
release. For now, my new schema will turn one 300GB table (including
indexes) into about 100 tables ranging from about 0.1GB to 3.5GB each (with
indexes), which feels like an OK size range for both manageability and
performance (though I'd still be interested to hear what people on this list
think). Once I get to storing multiple terabytes, having 6,000 tables is
going to look more attractive :-)
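
If I do go that way, the naming scheme presumably just grows another
component, something like (hypothetical, not built yet):

-- one table per day per counter type, e.g. for b=14:
CREATE TABLE perf_raw_2010_02_23_b14
    (LIKE perf_raw_2010_02_23 INCLUDING INDEXES);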

3. If you were suggesting the (ts, a) index as an alternative to (a, b, ts)
and (ts), that's an interesting and cunning plan ... I'd need to see how it
performs on the queries which extract data series for graphing, which are
interactive and of the form ".... where a=2617 and b=4 and ts between
'2010-02-22 00:00' and '2010-02-25 00:00'".
<testing>
I tried this, and it took quite a bit longer than the version with the
"natural" index (102ms vs. 0.5ms, both from buffer cache) ... and this is a
test data set with only 1,000 pieces of equipment (values of a), while my
design target is 25,000 pieces.

# explain analyse select time_stamp, value from foo where a=1 and b=14 and time_stamp between '2010-02-23 09:00' and '2010-02-23 12:00';

                                                  QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Index Scan using foo1 on foo  (cost=0.00..21196.67 rows=17 width=16) (actual time=0.133..102.571 rows=72 loops=1)
   Index Cond: ((time_stamp >= '2010-02-23 09:00:00'::timestamp without time zone) AND (time_stamp <= '2010-02-23 12:00:00'::timestamp without time zone) AND (a = 1) AND (b = 14))
 Total runtime: 102.720 ms
(3 rows)

Time: 103.844 ms

# explain analyse select time_stamp, value from perf_raw_2010_02_23 where a=1 and b=14 and time_stamp between '2010-02-23 09:00' and '2010-02-23 12:00';

                                                  QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on perf_raw_2010_02_23  (cost=5.37..68.55 rows=16 width=16) (actual time=0.107..0.319 rows=72 loops=1)
   Recheck Cond: ((a = 1) AND (b = 14) AND (time_stamp >= '2010-02-23 09:00:00'::timestamp without time zone) AND (time_stamp <= '2010-02-23 12:00:00'::timestamp without time zone))
   ->  Bitmap Index Scan on perf_raw_2010_02_23_a  (cost=0.00..5.36 rows=16 width=0) (actual time=0.082..0.082 rows=72 loops=1)
         Index Cond: ((a = 1) AND (b = 14) AND (time_stamp >= '2010-02-23 09:00:00'::timestamp without time zone) AND (time_stamp <= '2010-02-23 12:00:00'::timestamp without time zone))
 Total runtime: 0.456 ms
(5 rows)

Time: 1.811 ms

"foo" is just a copy of "perf_raw_2010_02_23" with the different indexing
that Joe suggested.
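
(Reconstructed, since I didn't paste the setup: the copy was built roughly
like this, and the Index Cond above suggests foo1 was ts-first over all
three columns rather than just (ts, a).)

CREATE TABLE foo AS SELECT * FROM perf_raw_2010_02_23;
CREATE INDEX foo1 ON foo (time_stamp, a, b);
ANALYZE foo;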

I don't know how sophisticated the btree index scan in PG 8.3 is, i.e.
whether it will do a sparse scan that skips btree nodes which can be
excluded using the 2nd and subsequent columns of the index key. Judging by
the performance, it appears to simply scan all the index "leaf node" entries
in the time_stamp range and test each one individually.

*Tom* - is an optimization for this in, or planned for, a future release?

For my purposes, I only need the ts-first index for the data rollups, which
are done at the end of each rollup period .... once I have started a new
data table, I no longer need that index and can drop it from the older
tables. Right now I'm using one table per day for raw data and keeping 30
days of history, so I can drop the secondary index from 29 of the 30 tables,
which makes its disk space cost modest, and the insertion overhead seems to
be covered.
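
So the per-day index maintenance is just a couple of statements (a sketch;
the "_ts" index name is my own, only the "_a" one appears in the plans
above):

-- each new day's table gets both indexes:
CREATE INDEX perf_raw_2010_02_24_a  ON perf_raw_2010_02_24 (a, b, time_stamp);
CREATE INDEX perf_raw_2010_02_24_ts ON perf_raw_2010_02_24 (time_stamp, a);

-- once the previous day has been rolled up, its ts-first index can go:
DROP INDEX perf_raw_2010_02_23_ts;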

Looks like the a-first index is definitely necessary for acceptable
interactive performance ... it's powering a Flex UI that graphs multiple
data series concurrently.

4. Apart from query performance, the big benefit I wanted from the sharding
scheme is to get out of the DELETE and VACUUM business .... all my data
aging is now done with table drops, which are much, much faster in PG :-)
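
i.e. aging out a day of raw data is now just (with the naming above):

-- drop the table that has aged past the 30-day window; no DELETE, no VACUUM
DROP TABLE perf_raw_2010_01_24;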

On Mon, Feb 22, 2010 at 11:23 PM, Joe Conway <mail(at)joeconway(dot)com> wrote:

>
> Without going through your very long set of questions in detail, it
> strikes me that you might be better off if you:
>
> 1) use PostgreSQL partitioning (constraint exclusion)
> 2) partition by ts range
> 3) consider also including b in your partitioning scheme
> 4) create one index as (ts, a)
> 5) use dynamically generated SQL and table names in the application
> code to create (conditionally) and load the tables
>
> But of course test both this and your proposed method and compare ;-)
>
> Also you might consider PL/R for some of your analysis (e.g. mode would
> be simple, but perhaps not as fast):
> http://www.joeconway.com/web/guest/pl/r
>
> HTH,
>
> Joe
>
>
