Re: Database not browsable during COPY on PostgreSQL

From: Steve Crawford <scrawford(at)pinpointresearch(dot)com>
To: pgsql-novice(at)postgresql(dot)org
Cc: Daniel Staal <DStaal(at)usa(dot)net>, Majid Azimi <majid(dot)merkava(at)gmail(dot)com>
Subject: Re: Database not browsable during COPY on PostgreSQL
Date: 2012-03-06 23:55:08
Message-ID: 4F56A3DC.9000504@pinpointresearch.com
Lists: pgsql-novice

On 03/06/2012 08:59 AM, Daniel Staal wrote:
>>>>>> When using COPY to restore a CSV file, PostgreSQL shows a lot of
>>>>>> CONTEXT: COPY tbl_vbvdata, line 8039085:
>>>>>> "1648469982,20431325,1314343300,4.5,87,1,643160,1"
>>>>>>
>>>>>> Also phppgadmin shows that the real database size(4GB) but when I
>>>>>> choose to browse the table it shows Estimated row count to 0...
>>>>>
>>>>> Your COPY is running inside a transaction, PhpPgAdmin is outside, it
>>>>> can't see the uncommitted rows
>>>> I pressed CTRL+C when COPY was running and cancelled the process. Now
>>>> the DB size is still 4GB but no data is available for SELECT. How can
>>>> I commit that?...
>>>
>>> You can't, the whole transaction is canceled now....
>>>
>>
>> I cannot regain the disk space too?
> Disk space should be regained during the next vacuum. This will either
> happen with the autovacuum, or you can issue the vacuum command manually.

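No, it won't. Plain VACUUM makes the space available for reuse but
generally does not shrink the on-disk size (it only returns space to the
operating system when pages at the very end of the table are completely
empty). To reliably shrink the table you need to use CLUSTER or VACUUM
FULL. Alternatively, if you want to delete *all* data in the table, use
TRUNCATE.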

To clarify what is happening for the OP, PostgreSQL is ACID compliant
(check the docs or Google for all the details). You are observing the
effects of A, C and to some extent I.

(A)tomicity - your transaction will fully succeed or fully fail. You
can't have a transaction fail partway through and debit one account but
not credit another. In your example, you can't do a partial copy that
leaves some unknown portion of the data in your database with other
parts missing. It's all or none.

(C)onsistency - everything in the database must conform to your defined
rules or constraints. Until the full CSV has been copied successfully,
you can't know whether some later line would create a duplicate primary
key, violate another uniqueness constraint, put text into an int column,
etc. And per "A", any such problem will cause the full copy to fail. You
won't be left with a partial import to untangle.
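
A minimal sketch of the "A" and "C" points above, under stated
assumptions: it uses Python's built-in sqlite3 module as a stand-in for
PostgreSQL (so it runs without a server), and the table name tbl_demo and
the helper bulk_load are hypothetical. The last row of the batch repeats a
primary key, so the whole load is rolled back and no rows remain.

```python
import sqlite3


def bulk_load(conn, rows):
    """Insert all rows in one transaction; undo everything on any error."""
    conn.execute("BEGIN")
    try:
        conn.executemany("INSERT INTO tbl_demo (id, val) VALUES (?, ?)", rows)
    except sqlite3.Error:
        # A constraint violation (or any other database error) cancels
        # the entire batch, not just the offending row.
        conn.execute("ROLLBACK")
        return False
    conn.execute("COMMIT")
    return True


def demo_atomic_load():
    # isolation_level=None: we issue BEGIN/COMMIT/ROLLBACK ourselves.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE tbl_demo (id INTEGER PRIMARY KEY, val INTEGER)")
    # The third row reuses id 1 and violates the primary key.
    ok = bulk_load(conn, [(1, 100), (2, 200), (1, 300)])
    remaining = conn.execute("SELECT COUNT(*) FROM tbl_demo").fetchall()[0][0]
    conn.close()
    return ok, remaining


if __name__ == "__main__":
    print(demo_atomic_load())
```

The two good rows that preceded the bad one are gone as well, which is the
same all-or-none outcome the failed or cancelled COPY produces.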

(I)solation - a topic unto itself, but basically, in your case, your copy
will be invisible to other users until it succeeds and is committed.
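
The isolation point can be sketched the same way. This is an illustration
only, again assuming Python's sqlite3 module in place of PostgreSQL, with
hypothetical names (tbl_demo, demo_isolation). One connection plays the
session running the load and a second one plays the browsing tool; the
second sees no rows until the first commits.

```python
import os
import sqlite3
import tempfile


def demo_isolation():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        # Two independent sessions on the same database file.
        writer = sqlite3.connect(path, isolation_level=None)
        reader = sqlite3.connect(path, isolation_level=None)
        writer.execute("CREATE TABLE tbl_demo (id INTEGER PRIMARY KEY)")

        def visible_rows():
            # fetchall() finishes the query so the reader holds no lock.
            return reader.execute(
                "SELECT COUNT(*) FROM tbl_demo").fetchall()[0][0]

        writer.execute("BEGIN")
        writer.executemany("INSERT INTO tbl_demo (id) VALUES (?)",
                           [(i,) for i in range(100)])
        during = visible_rows()   # load still uncommitted
        writer.execute("COMMIT")
        after = visible_rows()    # load committed

        writer.close()
        reader.close()
        return during, after


if __name__ == "__main__":
    print(demo_isolation())
```

The locking details differ between SQLite and PostgreSQL, so take this
only as a picture of the visibility rule, not of how PostgreSQL implements
it.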

To achieve some of these goals, as well as for performance, deleting data
from a table does not shrink it. Instead, vacuuming (as performed by the
autovacuum process) identifies portions of the data files that can be
reused, so typical tables reach an equilibrium state. But odd occurrences
like failed bulk inserts, bulk deletes and bulk updates may bloat the
table in such a way that manual intervention is desirable.
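
As a rough analogy for "deleting does not shrink the file", here is a
sketch that again assumes Python's sqlite3 module rather than PostgreSQL;
tbl_demo, page_stats and demo_bloat are hypothetical names. Note the
mismatch in terminology: SQLite tracks free pages on its own without a
vacuum pass, and its VACUUM rebuilds the whole file, which makes it closer
to PostgreSQL's VACUUM FULL than to plain VACUUM.

```python
import os
import sqlite3
import tempfile


def page_stats(conn):
    """Return (pages in the file, pages that are free for reuse)."""
    total = conn.execute("PRAGMA page_count").fetchall()[0][0]
    free = conn.execute("PRAGMA freelist_count").fetchall()[0][0]
    return total, free


def demo_bloat():
    with tempfile.TemporaryDirectory() as tmp:
        conn = sqlite3.connect(os.path.join(tmp, "demo.db"),
                               isolation_level=None)
        # Assumption: make sure freed pages stay in the file.
        conn.execute("PRAGMA auto_vacuum = NONE")
        conn.execute(
            "CREATE TABLE tbl_demo (id INTEGER PRIMARY KEY, payload TEXT)")

        conn.execute("BEGIN")
        conn.executemany("INSERT INTO tbl_demo VALUES (?, ?)",
                         [(i, "x" * 200) for i in range(5000)])
        conn.execute("COMMIT")
        loaded = page_stats(conn)

        conn.execute("DELETE FROM tbl_demo")  # bulk delete
        deleted = page_stats(conn)            # same size, many free pages

        conn.execute("VACUUM")                # full rebuild of the file
        rebuilt = page_stats(conn)            # file is actually smaller

        conn.close()
        return loaded, deleted, rebuilt


if __name__ == "__main__":
    for label, stats in zip(("loaded", "deleted", "rebuilt"), demo_bloat()):
        print(label, stats)
```

After the delete the file keeps its size and the freed pages simply wait
to be reused; only the rebuild step gives the space back.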

Cheers,
Steve
