Re: Improving performance of merging data between tables

From: Pawel Veselov <pawel(dot)veselov(at)gmail(dot)com>
To: Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com>
Cc: Andy Colson <andy(at)squeakycode(dot)net>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Improving performance of merging data between tables
Date: 2015-02-11 02:08:55
Message-ID: CAMnJ+BdXSCiCf10teXRd52HAPcR3yifBBe0Nx57Xy87Ns5RExg@mail.gmail.com
Lists: pgsql-general

Sorry, it took me a while to respond, but I have refactored all of this
process according to the suggestions.

On Wed, Jan 7, 2015 at 7:49 PM, Maxim Boguk <maxim(dot)boguk(at)gmail(dot)com> wrote:
>
> On Wed, Jan 7, 2015 at 8:49 PM, Pawel Veselov <pawel(dot)veselov(at)gmail(dot)com>
> wrote:
>>
>>> PS: your setup looks pretty complicated and hard to analyze without
>>> seeing all the involved table structures, the transaction/query flow,
>>> and (especially) the source code of the involved procedures.
>>>
>>
>> Sure :) At this point, I've put together the "bulk merge" code as well. I
>> can't quite see much of a difference, actually, but it's hard to trust the
>> execution times, as on the same amount of data they vary from, say, 0.5s to
>> 2s, and the sample data is not stepping on any other locks. In general, I'm
>> afraid all those left joins and multiple scans, even over a small amount of
>> data, are nullifying any positive effect.
>>
> Now some ideas to check.
> High CPU usage usually isn't related to locking; it is more often caused
> by seq scans, wrong plans, or simply inefficient pl/pgsql code. Locked
> processes usually don't use much CPU.
>
> 1) On the test database, perform select pg_stat_reset(); then perform a
> full round of merges, then check
> select * from pg_stat_user_tables where seq_scan>0 order by seq_tup_read;
> and if you find a lot of seq_scan and seq_tup_read on a particular table,
> try to find where they are coming from (it could be the reason for the
> high CPU usage).
>
> 2) Enable track_functions in postgresql.conf and perform the same
> sequence (select pg_stat_reset() + a full round of merges), then check
> select * FROM pg_stat_user_functions order by self_time desc;
> and see which function uses the most time.
>

These are good pointers. If the new process runs into the same sort of
problems, this will come in handy for figuring out where they are coming
from, thank you.

> 3) Old/lost prepared transactions can have a deadly effect on the
> performance of the database as a whole. So check select * from
> pg_prepared_xacts; and verify that you don't have hours- (or weeks-) old
> prepared xacts lying around.
>

If there are "lost" prepared transactions, they will lock a particular
instance out of writing into its table data, so they will just stall the
node. But it does happen, and we have an application mechanism to find and
delete those.
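
Roughly, the check amounts to this (pg_prepared_xacts is the standard
view; the one-hour cutoff here is just an example):

    -- list prepared transactions older than an hour
    SELECT gid, prepared, owner, database
      FROM pg_prepared_xacts
     WHERE prepared < now() - interval '1 hour'
     ORDER BY prepared;

    -- each offender can then be cleaned out with:
    -- ROLLBACK PREPARED '<gid>';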

> PS: btw, I still don't fully understand the relation between:
> "
> - merges data into its own node tables (using merge_xxx PL/pgSQL functions)
> "
> and the provided code for public."merge_all02-9A-46-8B-C1-DD" and
> PUBLIC.merge_agrio.
> As I see it,
> public."merge_all02-9A-46-8B-C1-DD" calls PUBLIC.merge_agrio, and
> PUBLIC.merge_agrio updates the global table R_AGRIO (not the "own node
> table").
>

It's a bit irrelevant at this point, but: the merge_all02-9A-46-8B-C1-DD()
function takes all the data for the 02-9A-46-8B-C1-DD node and moves it
into the master table. There is an analogous merge_02-9A-46-8B-C1-DD()
function that takes data from the application and writes it into the tables
for the 02-9A-46-8B-C1-DD node. The process of moving data node
tables->main tables and application->node tables is nearly identical, hence
I only provided the body once. The big difference is that when merging into
the master, there is a lot more data to look through, as the node tables
only contain data that has not yet been merged into the master.
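
For context, both merge flavors follow the usual update-then-insert loop; a
minimal sketch of that shape (the key/val columns are made up, this is not
the actual function body):

    CREATE OR REPLACE FUNCTION merge_sketch(p_key text, p_val numeric)
    RETURNS void AS $$
    BEGIN
        LOOP
            -- first, try to accumulate into an existing row
            UPDATE r_agrio SET val = val + p_val WHERE key = p_key;
            IF found THEN
                RETURN;
            END IF;
            -- no row yet: insert, and retry the UPDATE if another
            -- writer inserted the same key first
            BEGIN
                INSERT INTO r_agrio (key, val) VALUES (p_key, p_val);
                RETURN;
            EXCEPTION WHEN unique_violation THEN
                NULL;  -- lost the race, loop back to the UPDATE
            END;
        END LOOP;
    END;
    $$ LANGUAGE plpgsql;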

> I think the best implementation of such a task is asynchronous processing
> of these changes via a background process. The application only inserts
> events into a queue table (a lockless operation), and some background
> process reads the data from the queue table and merges it into the main
> table (again lockless, because it is a single thread, so there are no
> concurrent writes), and then deletes the merged data from the queue table.
>

Well, that was a really good suggestion, thank you. Some weeks later, I've
put it together. This hasn't hit production yet, so I'm yet to see the
overall improvement. Along with turning it into a queue, I've added
provisions to combine as much data as possible before writing it out into
the database tables, and merged all of the satellite tables into the main
data.
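
The consumer, roughly, drains a batch from the queue in one statement,
collapses rows with the same key, and only then writes. A sketch, with
made-up table/column names, the JSON columns left out for brevity, and
merge_sketch being the hypothetical function from above:

    WITH batch AS (
        -- grab and remove a batch from the queue in a single statement
        DELETE FROM event_queue
        WHERE id IN (SELECT id FROM event_queue ORDER BY id LIMIT 1000)
        RETURNING key, val
    ),
    combined AS (
        -- collapse duplicate keys so each key is written only once
        SELECT key, sum(val) AS val FROM batch GROUP BY key
    )
    SELECT merge_sketch(key, val) FROM combined;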

Before, I had:
r_agrio
r_brk_xxx (multiple entries reference rows in r_agrio)

Now, I have:
r_agrio_daily
r_agrio_total
r_agrio_hourly

All the data that was in the r_brk_xxx tables is now in columns of the
r_agrio* tables. To get around the fact that there are potentially multiple
BRK records for each AGR record, the data is now stored as a JSON object.
The primary key used for the BRK tables is turned into a string that serves
as a key in a top-level JSON object. This should help me tremendously on
the side that needs to read that data, as previously I had to join or left
join the BRK tables.
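
To illustrate the layout (a sketch with made-up column names, not the
actual DDL):

    CREATE TABLE r_agrio_hourly (
        agr_key   text        NOT NULL,
        period    timestamptz NOT NULL,
        val       numeric     NOT NULL DEFAULT 0,
        -- former r_brk_xxx rows, keyed by the stringified BRK primary key
        brk_data  json        NOT NULL DEFAULT '{}',
        PRIMARY KEY (agr_key, period)
    );

    -- reading a single breakdown no longer needs a join:
    SELECT brk_data -> '1234:xyz' FROM r_agrio_hourly WHERE agr_key = 'x';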

Splitting this into 3 tables may come back to bite me, since it means two
more inserts and the corresponding lookups, but it seriously helps me on
the reading side of things.

The code that aggregates the JSON data is still written in PL/pgSQL, which
is probably a bad idea, considering that PL/pgSQL doesn't have good ways of
manipulating JSON data in place (and I can't run PL/V8 or untrusted
languages). I should probably move this logic to Java and use updatable
result sets to modify the data in place. The reason I left it in PL/pgSQL
is that I didn't want to do a select and then an update, making it two
lookups per update. Another concern with moving it to the application side
is the network round-trips.
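
To illustrate the awkwardness: without an untrusted language, "updating" a
single key of a JSON object means rebuilding the whole object. A
self-contained example (assuming 9.4's json_object_agg; the keys and
values are made up):

    SELECT json_object_agg(key, value)
    FROM (
        SELECT key, value
        FROM json_each('{"a": {"count": 1}, "b": {"count": 2}}'::json)
        WHERE key <> 'b'                    -- drop the old value of "b"
        UNION ALL
        SELECT 'b', '{"count": 7}'::json    -- and append the new one
    ) AS merged(key, value);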
