Re: Deletion Challenge

From: Berend Tober <btober(at)computer(dot)org>
To: Steve Crawford <scrawford(at)pinpointresearch(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: Deletion Challenge
Date: 2015-12-10 01:59:32
Message-ID: 5668DC84.6080703@computer.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Steve Crawford wrote:
> The two general solutions are the "keep the last one" proposed by Adrian
> "keep the last N" that I sent.
>
> But it might be worth stepping back a bit. You said you are having
> performance problems that you feel would be improved by removing only a
> million rows which doesn't sound like that much to me. It's less than
> half of what I *add* to just one of my tables every week and my database
> is dwarfed by those of many of the participants on this list.
>
> This suggests that there may be other issues such as tuning, indexing or
> query optimization at play. Depending on your requirements, partitioning
> might be useful. It wouldn't be last N but could easily be done to
> partition by date-ranges which makes archiving and purging a low-cost
> operation.
>
> You might want to expand a bit on the core issue you are trying to solve.
>

I really appreciate the deep-dive.

I'm quite sure the performance issue is mostly a hardware limitation at
this point. The application is in a developmental phase and running for
test purposes on grossly inadequate hardware for production purposes ...
think home laptop computer in one case.

The issue is that I'd like the application (that is, the data base and
its stored procedures) to be robust enough to be a "long-running"
application, i.e. one that doesn't suffer gradual performance
degradation as time and the accumulated data increase. For the cash and
similar journals, I'd like to retain some amount of history so that
players can review "recent" transactions so as to understand and verify
how the current balance was arrived at, but at some point let old
transactions age-off and be deleted.

This question was sort of addressed at the "query tuning" aspect, and
I'm confident that partitioning would help. But since this is just a
game, retention of a full and auditable history is not really required:
I have a lot of flexibility to determine what to keep and in fact am not
exactly sure how much to keep. I know I need at least one row in order
to retain the current balance, but I'm thinking something on the order
of scores or hundreds, maybe a thousand transactions per player in each
of several similar journals retained at any point in time would be
sufficient.

This project is a game, btw, described at

https://github.com/bmtober/fairwinds

for those interested in the backstory.

I am eager to get some real-world experience with multiple players
actually using the application and providing feedback, which is
gradually happening by means of presentations I have and am scheduled to
make at local user groups. Eventually I want to raise some money to rent
some virtual private server space and host it on a publicly-available
site when obvious scalability issues like this are mitigated.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Corradini, Carlos 2015-12-10 13:38:10 [JDBC] plpgsql function with RETURNS SETOF refcursor in JAVA
Previous Message Andreas Kretschmer 2015-12-10 01:05:06 Re: HELP!!! The WAL Archive is taking up all space