Re: PostgreSQL performance problem -> tuning

From: Yaroslav Mazurak <yamazurak(at)Lviv(dot)Bank(dot)Gov(dot)UA>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: PostgreSQL performance problem -> tuning
Date: 2003-08-07 16:30:49
Message-ID: 3F327EB9.5020707@lviv.bank.gov.ua
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Hi All!

First, thanks for answers!

Richard Huxton wrote:

> On Thursday 07 August 2003 09:24, Yaroslav Mazurak wrote:

>>>IIRC there is a limit on filesystem cache on freeBSD. 300MB by default.
>>>If that is the case, you might have to raise it to make
>>>effective_cache_size really effective..

>> "Try various sysctls" says nothing for me. I want use *all available
>>RAM* (of course, without needed for OS use) for PostgreSQL.

> PG will be using the OS' disk caching.

I think all applications using OS disk caching. ;)
Or you want to say that PostgreSQL tuned for using OS-specific cache
implementation?
Do you know method for examining real size of OS filesystem cache? If I
understood right, PostgreSQL dynamically use all available RAM minus
shared_buffers minus k * sort_mem minus effective_cache_size?
I want configure PostgreSQL for using _maximum_ of available RAM.

> Looks fine - PG isn't growing too large and your swap usage seems steady. We
> can try upping the sort memory later, but given the amount of data you're
> dealing with I'd guess 64MB should be fine.

> I think we're going to have to break the query down a little and see where the
> issue is.

> What's the situation with:
> EXPLAIN ANALYZE SELECT <some_field> FROM v_file02wide WHERE a011 = 3 AND
> inrepdate(data) AND b030 IN (SELECT b030 FROM dov_bank WHERE
> dov_bank_box_22(box) IN ('NL', 'NM')) AND r030 = 980;

> and:
> EXPLAIN ANALYZE SELECT SUM(showcalc(<parameters>)) FROM <something simple>

> Hopefully one of these will run in a reasonable time, and the other will not.
> Then we can examine the slow query in more detail. Nothing from your previous
> EXPLAIN (email of yesterday 13:42) looks unreasonable but something must be
> going wild in the heart of the query, otherwise you wouldn't be here.

Yes, you're right. I've tested a few statements and obtain interesting
results.
SELECT * FROM v_file02wide WHERE... executes about 34 seconds.
SELECT showcalc(...); executes from 0.7 seconds (without recursion) up
to 6.3 seconds if recursion is used! :(
This mean, that approximate execute time for fully qualified SELECT
with about 8K rows is... about 13 hours! :-O
Hence, problem is in my function showcalc:

CREATE OR REPLACE FUNCTION showcalc(VARCHAR(10), VARCHAR(2), VARCHAR(4),
NUMERIC(16)) RETURNS NUMERIC(16)
LANGUAGE SQL STABLE AS '
-- Parameters: code, dd, r020, t071
SELECT COALESCE(
(SELECT sc.koef * $4
FROM showing AS s NATURAL JOIN showcomp AS sc
WHERE s.kod = $1
AND NOT SUBSTR(acc_mask, 1, 1) = ''[''
AND SUBSTR(acc_mask, 1, 4) = $3
AND SUBSTR(acc_mask, 5, 1) = SUBSTR($2, 1, 1)),
(SELECT SUM(sc.koef * COALESCE(showcalc(SUBSTR(acc_mask, 2,
LENGTH(acc_mask) - 2), $2, $3, $4), 0))
FROM showing AS s NATURAL JOIN showcomp AS sc
WHERE s.kod = $1
AND SUBSTR(acc_mask, 1, 1) = ''[''),
0) AS showing;
';

BTW, cross join "," with WHERE clause don't improve performance
relative to NATURAL JOIN.
Additionally, with user-defined function beginchar (SUBSTR(..., 1, 1)),
used for indexing, showcalc executes about 16 seconds. With function
SUBSTR the same showcalc executes 6 seconds.

Table showing contain information about showing: showing id (id_show),
code (kod) and description (opys). Table showcomp contain information
about showing components (accounts): showing id (id_show), coefficient
(koef) and account_mask (acc_mask). Account mask is 4-char balance
account mask || 1-char account characteristics or another showing in
square bracket.
Example:
showing
=========+==========+===========
id_show | kod | opys
=========+==========+===========
1 | 'A00101' | 'Received'
2 | 'A00102' | 'Sent'
3 | 'A00103' | 'Total'
=========+==========+===========
showcomp
=========+======+===========
id_show | koef | acc_mask
=========+======+===========
1 | 1.0 | '60102'
1 | 1.0 | '60112'
2 | 1.0 | '70011'
2 | 1.0 | '70021'
3 | 1.0 | '[A00101]'
3 | -1.0 | '[A00102]'
=========+======+===========
This mean that: A00101 includes accounts 6010 and 6011 with
characteristics 2, A00102 includes accounts 7001 and 7002 with
characteristics 1, and A00103 = A00102 - A00101. In almost all cases
recursion depth not exceed 1 level, but I'm not sure. :)

View v_file02wide contain account (r020) and 2-char characteristics
(dd). Using showcalc I want to sum numbers (t071) on accounts included
in appropriate showings. I.e SELECT SUM(showcalc('A00101', dd, r020,
t071)) FROM ... must return sum on accounts 6010 and 6011 with
characteristics 2 etc.

Now I think about change function showcalc or/and this data
structures... :)
Anyway, 600Mb is too low for PostgreSQL for executing my query - DBMS
raise error after 11.5 hours (of estimated 13?). :(

With best regards
Yaroslav Mazurak.

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Yaroslav Mazurak 2003-08-07 17:04:03 Re: PostgreSQL performance problem -> tuning
Previous Message scott.marlowe 2003-08-07 16:20:26 Re: PostgreSQL performance problem -> tuning