Re: Postgresql 'eats' all mi data partition

From: Tomas Szepe <szepe(at)pinerecords(dot)com>
To: Gaetano Mendola <mendola(at)bigfoot(dot)com>
Cc: "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: Postgresql 'eats' all mi data partition
Date: 2003-09-27 11:31:27
Message-ID: 20030927113127.GC32507@louise.pinerecords.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

> [mendola(at)bigfoot(dot)com]
>
> >>>>>indexes:
> >>>>>stats_min_pkey primary key btree (ip, "start")
> >>>>>stats_min_start btree ("start")
> >>>>>stats_hr_pkey primary key btree (ip, "start")
> >>>>>stats_hr_start btree ("start")
> >>>>
> >>>>>ip is of type "inet" in all tables.
> >>>>>start is of type "timestamp without time zone" in all tables.
> >>>>
> >>>>Okay, so a pkey index entry will take 32 bytes counting overhead ...
> >>>>you've got about 10:1 bloat on the stats_min indexes and 2:1 in
> >>>>stats_hr.
> >>>>Definitely bad :-(
> >>>
> >>>
> >>>The only difference between the way stats_min and stats_hr are updated
> >>>stems from the fact that stats_min only holds records for the last 1440
> >>>minutes (because of its killer time granularity), whereas stats_hr
> >>>holds its data until we decide some of it is obsolete enough and
> >>>issue a "delete from" by hand.
> >>
> >>Are you sure that all indexes are needed and that a partial index could
> >>not help ? What about the statistics on these indexes ? Are they really
> >>used ?
> >
> >
> >Yup, they're all essential. :(
>
> May I see yours tipical queries where these indexes are involved ?

A very typical query (apart from those I've already posted in my "how the
updates work" mail) would be:

select ip, start::time,
(in_tcp_web + in_tcp_mail + in_udp_and_icmp
+ in_tcp_rest + in_rest) as d_in,
(out_tcp_web + out_tcp_mail + out_udp_and_icmp
+ out_tcp_rest + out_rest) as d_out,
(in_tcp_web + in_tcp_mail + in_udp_and_icmp
+ in_tcp_rest + in_rest
+ out_tcp_web + out_tcp_mail + out_udp_and_icmp
+ out_tcp_rest + out_rest) as d_sum,
((in_tcp_web + in_tcp_mail + in_udp_and_icmp
+ in_tcp_rest + in_rest
+ out_tcp_web + out_tcp_mail + out_udp_and_icmp
+ out_tcp_rest + out_rest) / intlen / 128) as rate_sum
from stats_hr
where start=(select start from stats_hr order by start desc limit 1)
order by (in_tcp_web + in_tcp_mail + in_udp_and_icmp
+ in_tcp_rest + in_rest + out_tcp_web + out_tcp_mail
+ out_udp_and_icmp + out_tcp_rest + out_rest)
desc
limit 20;

->

QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=5152.26..5152.31 rows=20 width=104)
InitPlan
-> Limit (cost=0.00..1.11 rows=1 width=8)
-> Index Scan Backward using stats_hr_start on stats_hr (cost=0.00..12059890.93 rows=10847279 width=8)
-> Sort (cost=5152.26..5162.55 rows=4115 width=104)
Sort Key: (((((((((in_tcp_web + in_tcp_mail) + in_udp_and_icmp) + in_tcp_rest) + in_rest) + out_tcp_web) + out_tcp_mail) + out_udp_and_icmp) + out_tcp_rest) + out_rest)
-> Index Scan using stats_hr_start on stats_hr (cost=0.00..4905.22 rows=4115 width=104)
Index Cond: ("start" = $0)

(done in 0.079s.)

--
Tomas Szepe <szepe(at)pinerecords(dot)com>

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Tomas Szepe 2003-09-27 11:50:16 [7.4beta3] pg_dump -t xxx won't output sequences
Previous Message Gaetano Mendola 2003-09-27 11:20:25 Re: Postgresql 'eats' all mi data partition