Re: Article on MySQL vs. Postgres

From: "Robert B(dot) Easter" <reaster(at)comptechnews(dot)com>
To: Hannu Krosing <hannu(at)tm(dot)ee>, Tim Perdue <tperdue(at)valinux(dot)com>
Cc: The Hermit Hacker <scrappy(at)hub(dot)org>, pgsql-hackers(at)hub(dot)org
Subject: Re: Article on MySQL vs. Postgres
Date: 2000-07-05 12:27:07
Message-ID: 00070508422804.01354@comptechnews
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 05 Jul 2000, Hannu Krosing wrote:
> Tim Perdue wrote:
> >
> > The Hermit Hacker wrote:
> > > > Further, I have had situations where postgres actually had DUPLICATE
> > > > ids in a primary key field, probably due to some abort or other nasty
> > > > situation in the middle of a commit. How did I recover from That?
> > > > Well, I had to run a count(*) next to each ID and select out the rows
> > > > where there was more than one of each "unique" id, then reinsert those
> > > > rows and drop and rebuild the indexes and reset the sequences.
> > >
> > > Odd, were you using transactions here, or transactionless?
> >
> > Does it matter? I suppose it was my programming error that somehow I got
> > duplicate primary keys in a table in the database where that should be
> > totally impossible under any circumstance? Another stupid
> > transactionless program I'm sure.
> >
> > At any rate, it appears that the main problem I had with postgres (the
> > 8K tuple limit) is being fixed and I will mention that in my writeup.
>
> Currently (as of 7.0.x) you could use BLKSIZE=32K + lztext datatype and
> get text fields about 64-128K depending on data if you are desperately
> after big textfields.
>
> -----------
> Hannu

While it is slow, I've been able to store unlimited amounts of text into
the database by using the following code. I've tested inserting over 4
megabytes from a TEXTAREA web form using PHP. When inserting such massive
amounts of text, you will have to wait a while, but it will eventually succeed
if you don't run out of memory. If you do run out of memory, the backend
terminates gracefully and the transaction aborts/rollsback.

-- Load the PGSQL procedural language
-- This could also be done with the createlang script/program.
-- See man createlang.
CREATE FUNCTION plpgsql_call_handler()
RETURNS OPAQUE AS '/usr/local/pgsql/lib/plpgsql.so'
LANGUAGE 'C';

CREATE TRUSTED PROCEDURAL LANGUAGE 'plpgsql'
HANDLER plpgsql_call_handler
LANCOMPILER 'PL/pgSQL';

--------------------------------------------------------------------------------
--
-- Large Text storage
--

-- putlgtext - generic function to store text into the
-- specified text storage table.
-- The table specified in $1 should have the following
-- fields:
-- id, text_seq, text_block
--
-- $1 is the name of the table into which $3 is stored
-- $2 is the id of the text and references id in another table
-- $3 is the text to store, which is broken into chunks.
-- returns 0 on success
-- nonzero otherwise
CREATE FUNCTION putlgtext (TEXT, INTEGER, TEXT) RETURNS INTEGER AS '
set i_table $1
set i_id $2
set i_t {}
regsub -all {([\\''\\\\])} $3 {\\\\\\1} i_t
set i_seq 0
while { $i_t != {} } {
set i_offset 0
set tblock [string range $i_t 0 [expr 7000 + $i_offset]]
# Do not split string at a backslash
while { [string range $tblock end end] == "\\\\" && $i_offset < 1001 } {
set i_offset [expr $i_offset + 1]
set tblock [string range $i_t 0 [expr 7000 + $i_offset]]
}
set i_t [string range $i_t [expr 7000 + [expr $i_offset + 1]] end]
spi_exec "INSERT INTO $i_table (id, text_seq, text_block) VALUES ( $i_id , $i_seq , ''$tblock'' )"
incr i_seq
}
return 0
' LANGUAGE 'pltcl';

-- getlgtext - like putlgtext, this is a generic
-- function that does the opposite of putlgtext
-- $1 is the table from which to get TEXT
-- $2 is the id of the text to get
-- returns the text concatenated from one or more rows
CREATE FUNCTION getlgtext(TEXT, INTEGER) RETURNS TEXT AS '
set o_text {}
spi_exec -array q_row "SELECT text_block FROM $1 WHERE id = $2 ORDER BY text_seq" {
append o_text $q_row(text_block)
}
return $o_text
' LANGUAGE 'pltcl';

-- largetext exists just to hold an id and a dummy 'lgtext' attribute.
-- This table's trigger function provides for inserting and updating
-- into largetext_block. The text input to lgtext actually gets
-- broken into chunks and stored in largetext_block.
-- Deletes to this table will chain to largetext_block automatically
-- by referential integrity on the id attribute.
-- Selects have to be done using the getlgtext function.
CREATE TABLE largetext (
id INTEGER PRIMARY KEY,
lgtext TEXT -- dummy field
);
COMMENT ON TABLE largetext IS 'Holds large text';

-- This table must have the field names as they are.
-- These attribute names are expected by put/getlgtext.
CREATE TABLE largetext_block (
id INTEGER NOT NULL
REFERENCES largetext
ON DELETE CASCADE,

text_seq INTEGER NOT NULL,

text_block TEXT,

PRIMARY KEY (id, text_seq)
);
COMMENT ON TABLE largetext_block IS 'Holds blocks of text for table largetext';
CREATE SEQUENCE largetext_seq;

-- SELECT:
-- SELECT id AS the_id FROM largetext;
-- SELECT getlgtext('largetext_block', id) FROM largetext WHERE id = the_id;

-- INSERT:
-- INSERT INTO largetext (lgtext) values ('.......');

-- DELETE:
-- DELETE FROM largetext WHERE id = someid;
-- deletes from largetext and by referential
-- integrity, from largetext_text all associated block rows.
CREATE FUNCTION largetext_trigfun() RETURNS OPAQUE AS '
set i_t {}
regsub -all {([\\''\\\\])} $NEW($2) {\\\\\\1} i_t
switch $TG_op {
INSERT {
spi_exec "SELECT nextval(''largetext_seq'') AS new_id"
set NEW($1) $new_id
spi_exec "SELECT putlgtext(''largetext_block'', $new_id, ''$i_t'') AS rcode"
if { $rcode != 0 } then { return SKIP }
}
UPDATE {
if { $NEW($2) != {} } then {
spi_exec "DELETE FROM largetext_text WHERE id = $OLD($1)"
spi_exec "SELECT putlgtext(''largetext_block'', $OLD($1), ''$NEW($2)'') AS rcode"
if { $rcode != 0 } then { return SKIP }
}
}
}
set NEW($2) "ok"
return [array get NEW]
' LANGUAGE 'pltcl';

-- Set the function as trigger for table largetext
CREATE TRIGGER largetext_trig BEFORE INSERT OR UPDATE
ON largetext FOR EACH ROW EXECUTE
PROCEDURE largetext_trigfun(id,lgtext);

I had to use the regsub function calls to replace the \ escaping on literal
'\'s. What a pain! If anyone can try this code and suggest ways to improve
its speed, I'd be happy.

--
Robert

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message eisentrp 2000-07-05 12:29:07 Re: Per-database/schema settings
Previous Message The Hermit Hacker 2000-07-05 12:11:43 Re: [HACKERS] Re: Revised Copyright: is this morepalatable?