simultaneous updates

Post Reply
chugwa
Posts: 23
Joined: Thu Oct 03, 2002 2:15 am

simultaneous updates

Post by chugwa »

We're planning a system in which multiple users can be expected to run large updates on the same table simultaneously. The maximum number could be as high
as 10, but realistically I expect no more than 5 on
a regular basis.

Do you have any recommendations on how to handle this
in order to maximize performance?
chugwa
Posts: 23
Joined: Thu Oct 03, 2002 2:15 am

simultaneous updates

Post by chugwa »

To provide some more detail:

The system is a 440mhz SPARC, 1 GB of RAM,
table size 2-5 GB, total DB size of ~10 GB.
User avatar
mark
Site Admin
Posts: 5513
Joined: Tue Apr 25, 2000 6:56 pm

simultaneous updates

Post by mark »

Make sure the key field for update is indexed and that all texis processes are running at the same priority (don't nice any texis process).
chugwa
Posts: 23
Joined: Thu Oct 03, 2002 2:15 am

simultaneous updates

Post by chugwa »

Is there any benchmark data that would give me an
idea of what we should expect in terms of performance?


My benchmark query, running as the only query on the machine, and the only substantial process, pushes the
iowait to 85%+. The query is:
UPDATE foo = 'me' WHERE bar = '' AND Url MATCHES 'http://www.iloo.com/%'

bar is an indexed SMALLINT. Url is a unique indexed VARCHAR

There are aproximately 29k matches on iloo.com.

This is a single processor box, with indexes and tables on the same drive. I'm investigating whether the box is powerful enough for our purposes.

Thanks,

Joel
chugwa
Posts: 23
Joined: Thu Oct 03, 2002 2:15 am

simultaneous updates

Post by chugwa »

More --

The query takes at least 10 minutes to complete. We want
to be able to have others running similar queries at the same time with reasonable performance.

Should that work?
User avatar
John
Site Admin
Posts: 2597
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH
Contact:

simultaneous updates

Post by John »

What type is "foo"? If a varchar, which is what it looks like, then each record updated might move, so all indexes on the table need to be updated for every update, which will be a considerable amount.

What are you really trying to accomplish?
John Turnbull
Thunderstone Software
chugwa
Posts: 23
Joined: Thu Oct 03, 2002 2:15 am

simultaneous updates

Post by chugwa »

foo can be a varchar or a smallint in the actual
query.

What am I trying to accomplish?

I need to establish for management what sort of performance is reasonable to expect from our server
in this context - netra t105, 1 GB RAM, 440mhz.

They expect the ability to run 4 - 5 updates on the
same large table (3 GB now, going to be 10 GB in
initial production, then bigger) simultaneously without impacting performance for other users running searches.

I've been doing database development for 6 years ( but
only 6 mos. on texis). I expect the sorts of queries they want to run to take time, especially in a table
locking system like texis. They do not. I'd guess
that I'm right and they are wrong but I don't really know for sure.

I need to be able to tell them what is reasonable to
expect from an authority they'll respect.

Another Dilbertesque chapter...
User avatar
John
Site Admin
Posts: 2597
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH
Contact:

simultaneous updates

Post by John »

The general answer is it is going to impact performance, how much will depend on the field type that is being updated, and how many indexes you have, and a number of other factors.

If you really want to minimize the impact to searches then you should serialize the updates, so only update is occurring at one time. You only have one CPU anyway, so you won't gain much be running the processes simultaneously.

We usually ask why you would want to run multiple large updates simultaneously while maintaining search performance, as often there is a better way to achieve the ultimate goal.

If you were running your test on a "cold" database, then it may have been having to hit the disk for every record, rather than having some of it cached.
John Turnbull
Thunderstone Software
chugwa
Posts: 23
Joined: Thu Oct 03, 2002 2:15 am

simultaneous updates

Post by chugwa »

We are doing a project that is going to use human reviewers to catalog a part of the web. The reviewers
can be expected to be working simultaneously. Most
of what they're doing is looking for patterns in sets
of webpages and applying batch updates to our database
based on the patterns.

So is it fair to say Texis doesn't recommend running large simultaneous updates on systems that are being used for select querys at the same time?
Post Reply