Re: Fast data, slow data

From: Tim Uckun <timuckun(at)gmail(dot)com>
To: Steve Crawford <scrawford(at)pinpointresearch(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Fast data, slow data
Date: 2014-06-26 22:35:28
Message-ID: CAGuHJrOdoKktqKTxG1Eq=oE-LgUimToeV57MmVS-TVQF2OOjQw@mail.gmail.com
Lists: pgsql-general

The database is functioning fine now, but I am anticipating a much higher
workload in the future. The table in question is probably going to have a
few million rows per day inserted into it when it gets busy; if it gets
very busy it might be in the tens of millions per day, but that's
speculation at this point.

I don't want to say that the data is not important, but if I drop one or two
sensor readings it's not going to be the end of the world.
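
For context, a rough sketch of what I have in mind (the table definitions,
column names and the 15-minute cutoff are just placeholders):

-- Hot table: unlogged, so writes skip WAL. Its contents are lost on a
-- crash, which is acceptable for these sensor readings.
CREATE UNLOGGED TABLE readings_hot (
    sensor_id   integer      NOT NULL,
    recorded_at timestamptz  NOT NULL,
    value       double precision,
    PRIMARY KEY (sensor_id, recorded_at)
);

-- Archive table: a regular (logged) table with the same shape.
CREATE TABLE readings_archive (LIKE readings_hot INCLUDING ALL);

-- Periodic flush: move everything older than 15 minutes from the hot
-- table to the archive in a single statement, so the delete and the
-- insert happen in one transaction.
WITH moved AS (
    DELETE FROM readings_hot
    WHERE recorded_at < now() - interval '15 minutes'
    RETURNING *
)
INSERT INTO readings_archive
SELECT * FROM moved;

Run from cron every few minutes, the flush keeps the hot table small, so
autovacuum on it stays cheap and the archive table mostly just receives
inserts.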

On Fri, Jun 27, 2014 at 3:45 AM, Steve Crawford <
scrawford(at)pinpointresearch(dot)com> wrote:

> On 06/26/2014 02:29 AM, Tim Uckun wrote:
>
>> I have a use case in which the most recent data experiences a lot of
>> transactions (inserts and updates) and then the churn kind of calms down.
>> Eventually the data is relatively static and will only be updated in
>> special and sporadic events.
>>
>> I was thinking about keeping the high churn data in a different table so
>> that the vacuums on that table can go faster and the vacuums on the rest of
>> the data will rarely be needed.
>>
>> Ideally the high churn data would be in an in memory table but failing
>> that I thought that an unlogged table would be good enough. So now I need
>> a way to flush the oldest data in this table to another logged table and do
>> it in an efficient manner. I don't think partitioning is a good idea in
>> this case because the partitions will be for small time periods (5 to 15
>> minutes).
>>
>> Anybody have a similar problem? If so how did you solve it?
>>
>
> It's not at all unusual for a database to experience a high proportion of its
> activity on a small subset of the data, so in that sense the "problem" is
> already addressed by appropriate sizing and by tuning the caching and other
> facilities already provided by the OS and DB.
>
> Having said that, there is far too little information here to provide
> specific advice on tuning and other approaches that may be of use for you.
> In particular:
>
> Is this currently a hypothetical question or do you have a running system
> that is experiencing performance problems?
>
> In either case can you provide some parameters including rows and size of
> your large table(s), what you mean by "high churn" (rate of inserts,
> deletes, updates and how they are grouped into transactions), the size of
> the table that is heavily updated, how the database is used (transactional,
> data-mining, ...) and anything else you feel might help the list understand
> your use-case.
>
> If you have a live database, any metrics/observations you can supply might
> be helpful. For example, are particular queries slow? What speed is
> required? Do you have lots of simultaneous connections, or is everything
> going through a single connection?
>
> Finally, confirm that the data is not important or is easily recreated
> (implied by your in-memory/unlogged-table comments).
>
> Cheers,
> Steve
>
>
>
