From: Abhijeet R <abhijeet(dot)1989(at)gmail(dot)com>
To: Josh Kupershmidt <schmiddy(at)gmail(dot)com>
Cc: pgsql-novice(at)postgresql(dot)org
Subject: Re: Batch updates to 1 column using python-pgsql in multiple rows
Date: 2012-08-29 18:00:32
Message-ID: 503E58C0.3040302@gmail.com
Lists: pgsql-novice
Hi Josh,
The reason I was using python-pgsql is that it was the first adapter to
come up in a Google search. After your suggestion, I looked at the
psycopg2 documentation, and it looks more promising.
Since the value of myvar is different for every row, I will use COPY
followed by a single UPDATE, as you suggested. Thanks for your suggestions.
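Putting those two steps together, a minimal psycopg2 sketch might look like the following. The table and column names (mytable, myvar, myvar2, my_temp_table) come from the thread; the helper names, the text column types, and the tab-separated buffer format are assumptions for illustration:

```python
import io


def make_copy_buffer(pairs):
    """Build an in-memory tab-separated buffer suitable for COPY FROM STDIN."""
    buf = io.StringIO()
    for myvar, myvar2 in pairs:
        buf.write(f"{myvar}\t{myvar2}\n")
    buf.seek(0)
    return buf


def bulk_update(conn, pairs):
    """Load (myvar, myvar2) pairs into a temp table via COPY, then run one UPDATE."""
    with conn.cursor() as cur:
        # Session-local temp table; dropped automatically at commit.
        cur.execute("""
            CREATE TEMP TABLE my_temp_table (myvar text, myvar2 text)
            ON COMMIT DROP
        """)
        # COPY is much faster than thousands of individual INSERTs/UPDATEs.
        cur.copy_from(make_copy_buffer(pairs), "my_temp_table",
                      columns=("myvar", "myvar2"))
        # One bulk UPDATE joined against the temp table.
        cur.execute("""
            UPDATE mytable
               SET myvar = my_temp_table.myvar
              FROM my_temp_table
             WHERE mytable.myvar2 = my_temp_table.myvar2
        """)
    conn.commit()
```

A connection from psycopg2.connect(...) would be passed in as conn; only make_copy_buffer runs without a live database.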
On Wed 29 Aug 2012 11:12:11 PM IST, Josh Kupershmidt wrote:
> On Wed, Aug 29, 2012 at 6:58 AM, Abhijeet R <abhijeet(dot)1989(at)gmail(dot)com> wrote:
>> I have to update a certain column of a table at many rows. (order of
>> thousands).
>>
>> For now, I am using a prepare statement and executing the queries like:
>>
>> query = "UPDATE mytable SET myvar = $1 WHERE myvar2=$2"
>> db.execute(query, (var, var1)) #db is my connection object
>
> Are you able to reformulate the multiple individual UPDATEs into a
> single bulk UPDATE statement? That will almost certainly be the
> fastest way.
>
>> But, I still feel like I can make it faster by using db.executemany()
>> method. The thing is I am not able to get the syntax of it.
>
> That's more of a question for your specific database adapter
> (python-pgsql, according to your message subject) rather than this
> list. But if you are able to construct a list of tuples, e.g.
> list_to_insert = [(var, var1) for (var, var1) in ... ]
>
> Then you should be able to use executemany() like this:
>
> db.executemany(query, list_to_insert)
>
> BTW, is there a reason you're using python-pgsql instead of, say,
> psycopg2? It seems like the former may be a dead project.
>
>> Can anyone please guide me as to how do I do batch updates in the fastest
>> possible manner? Also, will executemany() really be faster? If so, how does
>> that result in faster batch updates?
>
> Again, the fastest way to perform these UPDATEs would be to
> reformulate your individual queries into a single bulk UPDATE. If you
> really are setting each value of "myvar" to something different, and
> not a value derived from some other column within its row, and you
> can't reasonably formulate that logic in a single bulk UPDATE query,
> then I'd bet the next fastest way would be to use a COPY statement to
> load a temporary table with your (myvar, myvar2) pairs as rows, then
> perform a single:
>
> UPDATE mytable
> SET myvar = my_temp_table.myvar
> FROM my_temp_table
> WHERE mytable.myvar2 = my_temp_table.myvar2;
>
> Josh
--
Regards,
Abhijeet
http://blog.abhijeetr.com