Disk usage problem with BLOBs using libpq

From: John Khouri <khouri(at)csl(dot)sri(dot)com>
To: pgsql-interfaces(at)postgresql(dot)org
Subject: Disk usage problem with BLOBs using libpq
Date: 2003-08-06 17:17:12
Message-ID: a05200f04bb56e3b071f5@[130.107.14.36]
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-interfaces

Hi,

I'm experiencing a disk usage "lossage" when rewriting large objects
(BLOBs) multiple times. I'm using the libpq libraries, posgresql
7.3.2 on FreeBSD-4.5 (default settings on installation), but I don't
believe the version or OS are factors here.

In the test case below, a 5k-byte BLOB is created once and rewritten
n-times. Given these multiple same-size rewrites, I would expect the
disk usage to increase only slightly due to (hopefully small)
overhead needed for the transaction. However I am seeing a
substantial increase in disk usage that is linear with respect to the
number of rewrites (in /usr/local/pgsql/data/base/16976):

- 10000 rewrites increases the disk space by 11.5 Mbytes
- 50000 rewrites increases the disk space by 57.0 Mbytes
and so on

Is this a postgres "bug" or is there a way to minimize this "lost"
disk space by adjusting a configuration setting or manipulating the
transaction? (Note: in our actual application with multiple BLOBS
summing 40 Mbytes, the resulting disk space consumed in
...pgsql/data/base... is over 4 Gbytes).

Any help is appreciated.

Thanks,
JK

#include <stdio.h>
#include "libpq-fe.h"
#include "libpq/libpq-fs.h"

#define BLOB_SIZE 5000

static void writeBlob (PGconn* conn, Oid lobjId)
{
char buf[BLOB_SIZE];
int lobj_fd;
int count;
int status;

lobj_fd = lo_open (conn, lobjId, INV_WRITE);

count = lo_write (conn, lobj_fd, buf, BLOB_SIZE);
if (count < BLOB_SIZE)
fprintf (stderr, "error while writing large object\n");

status = lo_close (conn, lobj_fd);
if (status != 0)
fprintf (stderr, "lo_close error %d\n", status);
}

int main (int argc, char **argv)
{
Oid lobjId;
PGconn* conn;
PGresult* res;
int num_rewrites;
int j;

conn = PQconnectdb ("dbname='test_db' user='postgres'");

if (PQstatus(conn) == CONNECTION_BAD) {
fprintf(stderr, "Connection to database 'test_db' failed.\n");
fprintf(stderr, "%s", PQerrorMessage(conn));
return (1);
}

num_rewrites = atoi (argv[1]);

res = PQexec (conn, "begin");
PQclear(res);

/* create initial blob */

lobjId = lo_creat (conn, INV_READ | INV_WRITE);

if (lobjId == 0)
fprintf (stderr, "error creating large object\n");

writeBlob (conn, lobjId);

res = PQexec (conn, "end");
PQclear (res);

/* rewrite blob "num_rewrites" times */

for (j = 0; j < num_rewrites; j++) {
res = PQexec (conn, "begin");
PQclear(res);
writeBlob (conn, lobjId);
res = PQexec (conn, "end");
PQclear (res);
}

PQfinish (conn);
return (0);
}

Browse pgsql-interfaces by date

  From Date Subject
Next Message Bruce Momjian 2003-08-07 03:59:56 Re: ecpg problem in 7.4 beta
Previous Message Alastair G. Hogge 2003-08-06 01:23:31 Re: Seeking PyGreSQL help