Re: Problem with insert into select...

From: stephen farrell <sfarrell(at)almaden(dot)ibm(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: Problem with insert into select...
Date: 2003-11-21 02:05:09
Message-ID: 3FBD72D5.1040202@almaden.ibm.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Ok -- so we created indexes and it was able to complete successfully.
But why would creating indexes affect the memory footprint, and should it?

Does it buffer the sub-select before doing the insert, or does it do the
insert record-by-record?

See correspondence below for details:

Steve,

With the indexes created it worked. It took about 4 hours, but
it inserted all of the records.

stephen farrell <sfarrell(at)almaden(dot)ibm(dot)com>

11/20/2003 05:22 PM

To: James Rhodes/Almaden/IBM(at)IBMUS
cc:
Subject: Re: [Fwd: Re: [PERFORM] Problem with insert
into select...]

if you do "explain" before the sql statement (e.g., "explain select *
from foo"), it'll tell you the query plan.

James Rhodes wrote:
>
> Steve,
>
> Here is the detailed structure of the tables and the query that
> is failing (the "INSERT INTO FACT" query) and I attached the logfile.
> Also what is EXPLAIN???
>
> CREATE TABLE RAW ( RAW_KEY serial, PATNO_TEXT VARCHAR (9),
> APPDATE_DATETIME VARCHAR (11), ISDATE_DATETIME VARCHAR (11),
> WHATEVERSNO_TEXT VARCHAR (5), WHATEVERSNO_NUMBER VARCHAR (6), APPNO_TEXT
> VARCHAR (10), TITLE_TEXT TEXT, USCLASS_TEXT VARCHAR (14),
> USCLASS_TEXTLIST_TEXT TEXT, AUTHORCODE_TEXT VARCHAR (9),
> AUTHORNORM_TEXT VARCHAR (195), AUTHOR_TEXT VARCHAR (212),
> AUTHOR_TEXTLIST_TEXT TEXT, AUTHORADDRESS_TEXT VARCHAR (84),
> AUTHORADDRESS_TEXTLIST_TEXT TEXT, INVENTOR_TEXT VARCHAR (50),
> INVENTOR_TEXTLIST_TEXT TEXT, INVENTORADDRESS_TEXT VARCHAR (90),
> INVENTORADDRESS_TEXTLIST_TEXT TEXT, AGENT_TEXT TEXT, AGENT_TEXTLIST_TEXT
> TEXT, USSEARCHFIELD_TEXT VARCHAR (26), USSEARCHFIELD_TEXTLIST_TEXT
> VARCHAR (150), USREFISDATE_TEXT VARCHAR (13), USREFISDATE_TEXTLIST_TEXT
> TEXT, USREFNAME_TEXT VARCHAR (34), USREFNAME_TEXTLIST_TEXT TEXT,
> ABSTRACT_TEXT TEXT, ABSTRACT_TEXTLIST_TEXT TEXT, ABSTRACT_RICHTEXT_PAR
> TEXT, WHATEVERS_RICHTEXT_PAR TEXT, USREFPATNO_RICHTEXT_PAR TEXT, PRIMARY
> KEY(RAW_KEY));
>
>
> CREATE TABLE ISSUE_TIME (
> TAB_KEY serial,
> ISDATE_DATETIME varchar (8),
> MONTH INT,
> DAY INT,
> YEAR INT
> , PRIMARY KEY(TAB_KEY))
>
> CREATE TABLE SOMETHING_NUMBER (
> TAB_KEY serial,
> PATNO_TEXT varchar (7)
> , PRIMARY KEY(TAB_KEY))
>
> CREATE TABLE APP_TIME (
> TAB_KEY serial,
> APPDATE_DATETIME varchar (8),
> MONTH INT,
> DAY INT,
> YEAR INT
> , PRIMARY KEY(TAB_KEY))
>
> CREATE TABLE AUTHOR (
> TAB_KEY serial,
> CODE varchar (6),
> AUTHOR text
> , PRIMARY KEY(TAB_KEY))
>
> CREATE TABLE APPLICATION_NUMBER (
> TAB_KEY serial,
> APPNO_TEXT varchar (7)
> , PRIMARY KEY(TAB_KEY))
>
> CREATE TABLE WHATEVERS (
> TAB_KEY serial,
> abstract_richtext_par text,
> WHATEVERS_richtext_par text,
> raw_key int,
> title_text text
> , PRIMARY KEY(TAB_KEY))
>
> CREATE TABLE FACT (DYN_DIM1 BIGINT, DYN_DIM2 BIGINT,DYN_DIM3
> BIGINT,ISSUE_TIME BIGINT, SOMETHING_NUMBER BIGINT, APP_TIME BIGINT,
> AUTHOR BIGINT, APPLICATION_NUMBER BIGINT, WHATEVERS BIGINT)
>
> INSERT INTO FACT (ISSUE_TIME, SOMETHING_NUMBER, APP_TIME, AUTHOR,
> APPLICATION_NUMBER, WHATEVERS) SELECT ISSUE_TIME.TAB_KEY,
> SOMETHING_NUMBER.TAB_KEY, APP_TIME.TAB_KEY, AUTHOR.TAB_KEY,
> APPLICATION_NUMBER.TAB_KEY, WHATEVERS.TAB_KEY FROM ISSUE_TIME,
> SOMETHING_NUMBER, APP_TIME, AUTHOR, APPLICATION_NUMBER, WHATEVERS, raw
> WHERE ISSUE_TIME.ISDATE_DATETIME=raw.ISDATE_DATETIME AND
> SOMETHING_NUMBER.PATNO_TEXT=raw.PATNO_TEXT AND
> APP_TIME.APPDATE_DATETIME=raw.APPDATE_DATETIME AND
> AUTHOR.CODE=AUTHORCODE_TEXT AND AUTHOR.AUTHOR=(AUTHOR_TEXT ||
> ' | ' || AUTHOR_TEXTLIST_TEXT) AND
> APPLICATION_NUMBER.APPNO_TEXT=raw.APPNO_TEXT AND
> WHATEVERS.raw_key=raw.raw_key

Tom Lane wrote:
> stephen farrell <sfarrell(at)almaden(dot)ibm(dot)com> writes:
>
>>I'm having a problem with a queyr like: INSERT INTO FACT (x,x,x,x,x,x)
>>SELECT a.key,b.key,c.key,d.key,e.key,f.key from x,a,b,c,d,e,f where x=a
>>and x=b .... -- postgres7.4 is running out of memory. I'm not sure
>>why this would happen -- does it buffer the subselect before doing the
>>insert?
>
>
> What does EXPLAIN show for the query? And we need to see the exact
> query and table definitions, not abstractions.
>
> regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Josh Berkus 2003-11-21 06:24:45 Re: [HACKERS] More detail on settings for pgavd?
Previous Message Matthew T. O'Connor 2003-11-21 00:40:15 Re: [HACKERS] More detail on settings for pgavd?