Best Practice Question

Post Reply
mjacobson
Posts: 204
Joined: Fri Feb 08, 2002 3:35 pm

Best Practice Question

Post by mjacobson »

I am wondering which method would be better for processing a large amounts of records in a texis database.

Method 1
<sql select item1, item2 from table1>
<sql select item3 from table2 where id like item1 and host like item2>
Does something with item3 that also requires a sql insert
</sql>
</sql>

Method 2
<sql select item1, item2 from table1></sql>
<loop $item1 $item2>
<sql select item3 from table2 where id like item1 and host like item2>
Do something with item3 that also requires a sql insert
</sql>
</loop>

Method 3
Modify either method 1 or method 2 to use MAX and SKIP values in the SQL statements so I am only processing a few records at a time. I know a lot of this deals with how much memory and the processor speed of the box I am working on but I am looking for some general rules of thumb.
User avatar
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH
Contact:

Best Practice Question

Post by John »

A couple of thoughts as there are a few "it depends" involved.

First use ROW on the sqls, so you only keep one row at a time in memory, e.g.

<SQL ROW "select item1, item2 from table1">
...

That should avoid the need to read in chunks, or use your method 2.

Second is a question, is the insert going to either table1 or table2, or a third table? If it's a third table then I'd go with your method 1 and add "ROW" to the sqls. If you are inserting into table1 or table2 it would depend on the effect you want.
John Turnbull
Thunderstone Software
mjacobson
Posts: 204
Joined: Fri Feb 08, 2002 3:35 pm

Best Practice Question

Post by mjacobson »

Thanks, no it is updating a 3rd table. Would you use ROW on the second SQL select statement?

The first SQL statement gets all of the hosts from a table. This could be several 1000. The second SQL statement gets all of the matching URLS for this host. This could be anywhere from 1 to 10 Million rows.
User avatar
John
Site Admin
Posts: 2622
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH
Contact:

Best Practice Question

Post by John »

Yes, using ROW on both will help as it avoids saving all of them in memory. You wouldn't use ROW if you intended to loop over the results again after the SQL.
John Turnbull
Thunderstone Software
Post Reply