| From: | Ron Johnson <ronljohnsonjr(at)gmail(dot)com> |
|---|---|
| To: | pgsql-general <pgsql-general(at)postgresql(dot)org> |
| Subject: | Re: Attempting to delete excess rows from table with BATCH DELETE |
| Date: | 2026-01-28 03:57:56 |
| Message-ID: | CANzqJaBcHnScWQjsVLoNg3OiQqWh3RcNEyNxwFV65Ps6PErQNw@mail.gmail.com |
| Lists: | pgsql-general |
On Tue, Jan 27, 2026 at 10:22 PM Gus Spier <gus(dot)spier(at)gmail(dot)com> wrote:
> Environment: AWS RDS Aurora for PostgreSQL 15 hosting tables that
> support scientific research. The development environment predominantly
> uses JPA with Hibernate.
>
> Years of neglect have allowed mission tables to accumulate hundreds of
> millions of rows of excess data. The developers and the customer
> decided we must delete all rows older than 75 days. Table partitioning
> was briefly considered but discarded because of the effort needed to
> refactor the codebase.
>
> I proposed the straightforward course of action: delete in batches
> from the victim tables.
>
> The solution seemed obvious:
>
> For candidate tables:
> - Determine the timestamp column with the most relevant value and call
> it the Discriminator.
> - Delete rows whose Discriminator value is more than 60 days old, with
> a LIMIT of 50000.
> - Get the batch's row count with GET DIAGNOSTICS and hold that value
> aside for later use.
> - COMMIT the delete transaction
> - Loop to the top and continue deleting batches until there are no
> more rows older than 60 days.
>
> - Before running a test, I ensure that the Discriminator column is indexed
>
> I wrote a procedure to accomplish all this work, but it persists in
> returning an error to the effect that a COMMIT is not valid in a block
> that tries to DELETE data.
>
> Has anybody seen this before?
Yup; that happens when you put the COMMIT in a DO block, which isn't allowed.
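For reference, here is a minimal, untested sketch of the batch loop described
above, assuming PostgreSQL 11 or later; the table name blarge and timestamp
column txn_date are hypothetical stand-ins. A DO block running inside an outer
transaction (which is what JDBC/Hibernate typically opens) cannot COMMIT, but
a PROCEDURE invoked by a top-level CALL can:

CREATE PROCEDURE purge_old_rows()
LANGUAGE plpgsql AS $$
DECLARE
    batch_rows BIGINT;
    total_rows BIGINT := 0;
BEGIN
    LOOP
        -- DELETE has no LIMIT clause, so cap the batch in a ctid subquery.
        DELETE FROM blarge
         WHERE ctid IN (SELECT ctid FROM blarge
                         WHERE txn_date < now() - INTERVAL '60 days'
                         LIMIT 50000);
        GET DIAGNOSTICS batch_rows = ROW_COUNT;
        total_rows := total_rows + batch_rows;
        EXIT WHEN batch_rows = 0;
        COMMIT;  -- ends this batch; a new transaction starts automatically
    END LOOP;
    RAISE NOTICE 'deleted % rows', total_rows;
END;
$$;

CALL purge_old_rows();  -- must run with autocommit on; wrapping the CALL
                        -- in an explicit BEGIN brings the same error back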
> Is there a solution?
>
Loop using bash instead. In a similar case, I first looked up the oldest
day's data in the tables, then did something like this bash pseudo-code:
StopDate=$(date -d '60 days ago' +%Y-%m-%d)
DeleteDay=$1                       # oldest day present in the table, as YYYY-MM-DD
export PGHOST=foo.example.com
export PGDATABASE=bar
while [[ ! "$DeleteDay" > "$StopDate" ]]; do
    # Delete exactly one day's rows; each psql -c commits on its own.
    psql -Xc "DELETE FROM blarge
              WHERE txn_date >= DATE '$DeleteDay'
                AND txn_date <  DATE '$DeleteDay' + INTERVAL '1 day';"
    DeleteDay=$(date -d "$DeleteDay + 1 day" +%Y-%m-%d)
done
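Since each psql invocation commits on its own, there is no long-running
transaction to babysit, and the loop can be stopped and re-run from any day.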
Using that method, I developed a fast and automated monthly archive process
which exported and then deleted from 120 tables.
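The export half of that can ride in the same per-day body; a rough sketch,
with the \copy target file and the table once again hypothetical:

# Archive one day's rows with \copy, then delete them.
psql -X -v ON_ERROR_STOP=1 <<EOF
\copy (SELECT * FROM blarge WHERE txn_date >= DATE '$DeleteDay' AND txn_date < DATE '$DeleteDay' + INTERVAL '1 day') TO 'blarge_$DeleteDay.csv' CSV
DELETE FROM blarge WHERE txn_date >= DATE '$DeleteDay' AND txn_date < DATE '$DeleteDay' + INTERVAL '1 day';
EOF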
--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!