Re: Performance issues when the number of records are around 10 Million

From: venu madhav <venutaurus539(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Performance issues when the number of records are around 10 Million
Date: 2010-05-11 07:28:17
Message-ID: baa9e2ef-1e4b-4fe4-bd82-4d149f8fed12@u20g2000pru.googlegroups.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On May 11, 12:03 pm, andreas(dot)kretsch(dot)(dot)(dot)(at)schollglas(dot)com ("A.
Kretschmer") wrote:
> In response to venu madhav :
>
>
>
> > Hi all,
> >        In my database application, I've a table whose records can
> > reach 10M and insertions can happen at a faster rate like 100
> > insertions per second in the peak times. I configured postgres to do
> > auto vacuum on hourly basis. I have frontend GUI application in CGI
> > which displays the data from the database.
> >         When I try to get the last twenty records from the database,
> > it takes around 10-15 mins to complete the operation.This is the query
> > which is used:
>
> > select e.cid, timestamp, s.sig_class, s.sig_priority, s.sig_name,
> > e.sniff_ip, e.sniff_channel, s.sig_config, e.wifi_addr_1,
> > e.wifi_addr_2, e.view_status, bssid  FROM event e, signature s WHERE
> > s.sig_id = e.signature   AND e.timestamp >= '1270449180' AND
> > e.timestamp < '1273473180'  ORDER BY e.cid DESC,  e.cid DESC limit 21
> > offset 10539780;
>
> First, show us the table-definition for both tables.
> Secondly the output generated from EXPLAIN ANALYSE <your query>
[Venu Madhav]
-------------------------------- event schema
-------------------------------------------------------------
snort=# \d event;
Table "public.event"
Column | Type | Modifiers
--------------------+-----------------------+-----------
sid | integer | not null
cid | bigint | not null
sig_name | character varying(80) | not null
signature | integer | not null
sig_class | character varying(80) | not null
sig_priority | bigint |
timestamp | bigint | not null
sniff_ip | character varying(16) |
sniff_channel | smallint |
bssid | character varying(18) |
view_status | smallint |
wifi_ver | smallint | not null
wifi_type | smallint | not null
wifi_stype | smallint | not null
wifi_other_fc_bits | smallint | not null
wifi_dur_id | integer | not null
wifi_addr_1 | character varying(18) |
wifi_addr_2 | character varying(18) |
wifi_addr_3 | character varying(18) |
wifi_addr_4 | character varying(18) |
wifi_seq_ctrl | integer |
Indexes:
"event_pkey" PRIMARY KEY, btree (cid)
"cid_idx" btree (cid)
"signature_idx" btree (signature)
"timestamp_idx" btree ("timestamp")
"wifi_addr_1_idx" btree (wifi_addr_1)
"wifi_addr_2_idx" btree (wifi_addr_2)
Foreign-key constraints:
"event_fkey_sid" FOREIGN KEY (sid) REFERENCES sensor(sid) ON
UPDATE CASCADE
"event_fkey_signature" FOREIGN KEY (signature) REFERENCES
signature(sig_id) ON UPDATE CASCADE

----------------------------------------------- schema of signature
----------------------------------
snort=# \d signature;
Table "public.signature"
Column | Type |
Modifiers
--------------+-----------------------
+------------------------------------------------------------
sig_id | integer | not null default
nextval('signature_sig_id_seq'::regclass)
sig_name | character varying(80) | not null
sig_class | character varying(80) | not null
sig_priority | bigint |
sig_rev | bigint |
sig_sid | bigint |
sig_config | text |
Indexes:
"signature_pkey" PRIMARY KEY, btree (sig_id)
"sig_class_idx" btree (sig_class)
"sig_name_idx" btree (sig_name)

-------------------------- Explain Analyse of the query
-------------------------------------------
nort=# EXPLAIN ANALYZE select e.cid, timestamp, s.sig_class,
s.sig_priority, s.sig_name, e.sniff_ip, e.sniff_channel, s.sig_config,
e.wifi_addr_1, e.wifi_addr_2, e.view_status, bssid FROM event e,
signature s WHERE s.sig_id = e.signature AND e.timestamp >=
'1270449180' AND e.timestamp < '1273473180' ORDER BY e.cid DESC,
e.cid DESC limit 21 offset 10539780;
QUERY
PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=7885743.98..7885743.98 rows=1 width=287) (actual
time=1462193.060..1462193.083 rows=14 loops=1)
-> Sort (cost=7859399.66..7885743.98 rows=10537727 width=287)
(actual time=1349648.207..1456496.334 rows=10539794 loops=1)
Sort Key: e.cid
-> Hash Join (cost=2.44..645448.31 rows=10537727 width=287)
(actual time=0.182..139745.001 rows=10539794 loops=1)
Hash Cond: ("outer".signature = "inner".sig_id)
-> Seq Scan on event e (cost=0.00..487379.97
rows=10537727 width=104) (actual time=0.012..121595.257 rows=10539794
loops=1)
Filter: (("timestamp" >= 1270449180::bigint) AND
("timestamp" < 1273473180::bigint))
-> Hash (cost=2.35..2.35 rows=35 width=191) (actual
time=0.097..0.097 rows=36 loops=1)
-> Seq Scan on signature s (cost=0.00..2.35
rows=35 width=191) (actual time=0.005..0.045 rows=36 loops=1)
Total runtime: 1463829.145 ms
(10 rows)
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> I'm surprised about the "e.timestamp >= '1270449180'", is this a
> TIMESTAMP-column?
[Venu Madhav] Yes, it is timestamp in epoch time.
>
> And, to retrieve the last twenty records you should write:
>
> ORDER BY ts DESC LIMIT 20
>
> With a proper index on this column this should force an index-scan.
>
> Andreas
> --
> Andreas Kretschmer
> Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
> GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431  2EB0 389D 1DC2 3172 0C99
>
> --
> Sent via pgsql-general mailing list (pgsql-gene(dot)(dot)(dot)(at)postgresql(dot)org)
> To make changes to your subscription:http://www.postgresql.org/mailpref/pgsql-general

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Alexander Pyhalov 2010-05-11 07:53:37 log database in which error occurs
Previous Message Alex Hunsaker 2010-05-11 07:23:02 Re: initdb fails on Centos 5.4 x64