Re: Large insert and delete batches

From: Anssi Kääriäinen <anssi(dot)kaariainen(at)thl(dot)fi>
To: Marti Raudsepp <marti(at)juffo(dot)org>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Large insert and delete batches
Date: 2012-03-02 12:51:53
Message-ID: 4F50C269.2020807@thl.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 03/01/2012 10:51 PM, Marti Raudsepp wrote:
> The problem with IN() and ARRAY[] is that the whole list of numbers
> has to be parsed by the SQL syntax parser, which has significant
> memory and CPU overhead (it has to accept arbitrary expressions in the
> list). But there's a shortcut around the parser: you can pass in the
> list as an array literal string, e.g:
> select * from the_table where id = ANY('{1,2,3,4,5}')
OK, that explains the memory usage.
> The SQL parser considers the value one long string and passes it to
> the array input function, which is a much simpler routine. This should
> scale up much better.
>
> Even better if you could pass in the array as a query parameter, so
> the SQL parser doesn't even see the long string -- but I think you
> have to jump through some hoops to do that in psycopg2.
Luckily there is no need to do any tricks. The question I was trying to
seek answer for was should there be some default batch size for inserts
and deletes in Django, and the answer seems clear: the problems appear
only when the batch sizes are enormous, so there doesn't seem to be a
reason to have default limits. Actually, the batch sizes are so large
that it is likely the Python process will OOM before you can trigger
problems in the DB.

- Anssi

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2012-03-02 19:31:00 Re: Inefficient min/max against partition (ver 9.1.1)
Previous Message Claudio Freire 2012-03-02 02:05:15 Re: [PERFORM] Re: [PERFORM] Re: [PERFORM] Re: [PERFORM] Re: 回复: [PERFORM] PG as in-memory db? How to warm up and re-populate buffers? How to read in all tuples into memory?