Merge tables from two different profiles

KMandalia
Posts: 301
Joined: Fri Jul 09, 2004 3:50 pm

Post by KMandalia »

Could you provide the steps needed to merge all the tables required for searching from two separate Webinator 5 profiles?

Thanks.
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Do you want to merge 2 databases into one on disk and then perform searches against that? Or do you want to search 2 databases and merge the answers before presenting them to the user?

For merging on disk the html table is needed for normal searching. The refs table is needed for surfing parent-child links.

Merging results is somewhat complicated by pagination issues. But basically you would have to get a pageful of answers from each database and put them all into a temp table (probably a "ram table"), then select from that table with "order by rawrank desc" to present the answers. You may want to turn off "database frequency" to make the ranks more compatible between databases. To allow parent-child surfing you'd have to record somehow which database each url came from.
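The merge-the-answers route can be sketched outside of Vortex. The following is only an illustration in Python, not Webinator code: the hit dictionaries, the `merge_pages` helper and the `source_db` labels are made up for the example, and only the `rawrank` ordering comes from the description above.

```python
from operator import itemgetter

def merge_pages(page_a, page_b, page_size, label_a="db_a", label_b="db_b"):
    """Merge one page of hits from each database, highest rawrank first.

    Each hit is tagged with the database it came from, so that
    parent-child surfing can later query the right refs table.
    """
    tagged = [dict(hit, source_db=label_a) for hit in page_a]
    tagged += [dict(hit, source_db=label_b) for hit in page_b]
    tagged.sort(key=itemgetter("rawrank"), reverse=True)
    return tagged[:page_size]

# Hypothetical pages of answers, one from each database.
page_a = [{"url": "http://a.example/1", "rawrank": 900},
          {"url": "http://a.example/2", "rawrank": 400}]
page_b = [{"url": "http://b.example/1", "rawrank": 700}]

merged = merge_pages(page_a, page_b, page_size=2)
for hit in merged:
    print(hit["rawrank"], hit["source_db"], hit["url"])
```

Note that this only handles the first page cleanly. For later pages you would need to remember how many hits were consumed from each database, which is the pagination issue mentioned above.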
KMandalia
Posts: 301
Joined: Fri Jul 09, 2004 3:50 pm

Post by KMandalia »

Fortunately, I want to merge databases and then perform search.

Basically, I would have either the db1 or db2 folder from each profile I want to merge, depending on which one is live. Most probably I would just be appending, so it should be easy enough.

Here is the deal: instead of doing a new walk when I want to add a new url, I just want to create one profile with that url and then append the required tables to my main search database.

What I want to know is which tables are needed. Are html and refs the only ones that need to be merged? Please give a list of any other required tables, assuming I will be using all search features. Do I write a script to perform the operation, and what would the script look like?

AND THE BIG QUESTION: what will happen to my categories? Would I be able to merge the smaller profile containing only 1-2 urls so that they end up in the categories of my choosing?
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Interesting idea. You'll need a script to copy across the databases. The tables (found by looking at all the <sql statements in the search script) are html, refs, and categories. The "live" database for a profile can be found in the options table of the testdb database: "select String sourcedb from options where Profile='sourceprofile'"

refs can come across unchanged:
<sql db=$sourcedb row "select * from refs">
<sql db=$destdb novars "insert into refs values($Url,$Ref)"></sql>
</sql>
You'll use a similar construct to copy the other tables.
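To make the shape of that copy loop concrete, here is a stand-in sketch using Python's sqlite3 module rather than Texis. The in-memory databases and the `copy_refs` and `make_db` helpers are assumptions for illustration only. The Url and Ref column names are taken from the insert statement above.

```python
import sqlite3

def copy_refs(source, dest):
    """Append every (Url, Ref) row from source.refs to dest.refs."""
    rows = source.execute("SELECT Url, Ref FROM refs").fetchall()
    dest.executemany("INSERT INTO refs VALUES (?, ?)", rows)
    dest.commit()
    return len(rows)

def make_db(rows):
    """Build a throwaway in-memory database holding a refs table."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE refs (Url TEXT, Ref TEXT)")
    db.executemany("INSERT INTO refs VALUES (?, ?)", rows)
    return db

# Hypothetical contents: the small profile and the main database.
source = make_db([("http://new.example/", "http://new.example/a")])
dest = make_db([("http://main.example/", "http://main.example/b")])

copied = copy_refs(source, dest)
total = dest.execute("SELECT COUNT(*) FROM refs").fetchone()[0]
print(copied, total)
```

The Vortex version does the same thing row by row: the outer statement reads from the source db and the inner statement inserts into the destination db.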

For categories you'll need to find the max category in the destination database: "select max(convert(Catno,'int')) Maxcatno from categories". Then add $Maxcatno to every source category before inserting into the destination db.
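The renumbering arithmetic can be checked with a small Python sketch. It is not Vortex, and the (Catno, name) tuple layout and the `shift_categories` helper are assumptions for the example. Only the "add the destination's max Catno to every source category" rule comes from the post.

```python
def shift_categories(source_cats, dest_cats):
    """Renumber source categories so they follow the destination's highest Catno."""
    # Catno is stored as text, so convert before taking the maximum.
    max_catno = max((int(catno) for catno, _ in dest_cats), default=0)
    return [(int(catno) + max_catno, name) for catno, name in source_cats]

# Hypothetical category tables as (Catno, name) pairs.
dest_cats = [("1", "Test 1"), ("2", "Test 2"), ("3", "Test 3")]
source_cats = [("1", "News"), ("2", "Docs")]

shifted = shift_categories(source_cats, dest_cats)
print(shifted)  # [(4, 'News'), (5, 'Docs')]
```

The source categories land after the existing ones, so nothing in the destination has to be renumbered.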

For html you'll need to add the $Maxcatno to each category for each record before inserting into the destination.
<!-- clear the list so categories from the previous record don't carry over -->
<$newcatno=>
<$ret=(convert($Catno, 'varchar' ))>
<split nonempty row "," $ret>
<$x=($Maxcatno+$ret)>
<$newcatno=$newcatno $x>
</split>
<!-- join the shifted numbers with commas, then drop the trailing comma -->
<sum "%s," "" $newcatno>
<substr $ret 1 -1>
<$newcatno=$ret>
<sql db=$destdb novars "insert into html values($id,$Hash, $Size,$Visited,$Dlsecs,$Depth,$Url,$Title,$Body, $Keywords,$Description,$Meta,$newcatno,$Modified, $NextCheck,$Views,$Clicks,$CTR,$Pop,$MimeType,$Charset)"></sql>

Then, of course, you need to update the metamorph index. Rather than calling "dowalk/reindex.txt", which will drop and recreate the whole index, it would be better to just update it. See the following for adding a new entry point that updates the index instead of recreating it: http://thunderstone.master.com/texis/ma ... 3d7e3f7210
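The Catno translation for html records in this post is easy to get subtly wrong, so here is the same idea as a Python sketch. The `translate_catno` helper is hypothetical. It assumes, as the snippet in the post does, that Catno holds a comma-separated list of category numbers.

```python
def translate_catno(catno, max_catno):
    """Shift every number in a comma-separated Catno string by max_catno."""
    parts = [part.strip() for part in str(catno).split(",")]
    # Skip empty pieces, mirroring the "nonempty" flag on the split.
    return ",".join(str(int(part) + max_catno) for part in parts if part)

# A record in source categories 1 and 3, merged into a destination
# whose highest existing category number is 3.
new_catno = translate_catno("1,3", 3)
print(new_catno)  # 4,6
```

A fresh list is built for every record here. Any per-record accumulator in a loop like this has to be reset between records, otherwise one record's categories carry over into the next.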
KMandalia
Posts: 301
Joined: Fri Jul 09, 2004 3:50 pm

Post by KMandalia »

Very cool.

But I didn't quite get the maxcatno part. Say I have category 1, category 2 and category 3 in the destination db and I want to merge tables in such a way that the url from the source db ends up in both category 1 and category 3, but not category 2. Will your solution handle that?
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

My solution assumes categories from the 2 db's don't overlap. It just appends the new ones. If you use the exact same set of categories (in the same order) in both profiles you don't need to bring over the categories table at all and don't need to translate the html Catno field. If you have different overlapping categories in the 2 profiles the problem suddenly gets more complicated.
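For the overlapping case, one possible approach (not something described in this thread, so treat it as an assumption) is to match categories by name: reuse the destination number when the name already exists and append only the names that are new. A Python sketch with a hypothetical `build_catno_map` helper and made-up (Catno, name) pairs:

```python
def build_catno_map(source_cats, dest_cats):
    """Map source Catno -> destination Catno, matching categories by name.

    Returns the mapping plus the list of categories that would have to be
    added to the destination because no category with that name exists yet.
    """
    dest_by_name = {name: int(catno) for catno, name in dest_cats}
    next_catno = max(dest_by_name.values(), default=0) + 1
    mapping, added = {}, []
    for catno, name in source_cats:
        if name not in dest_by_name:
            dest_by_name[name] = next_catno
            added.append((next_catno, name))
            next_catno += 1
        mapping[int(catno)] = dest_by_name[name]
    return mapping, added

dest_cats = [("1", "Test 1"), ("2", "Test 2"), ("3", "Test 3")]
source_cats = [("1", "Test 3"), ("2", "Test 1"), ("3", "Extra")]

mapping, added = build_catno_map(source_cats, dest_cats)
print(mapping)  # {1: 3, 2: 1, 3: 4}
print(added)    # [(4, 'Extra')]
```

Each html record's Catno list would then be rewritten through the mapping instead of by adding a fixed offset.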
KMandalia
Posts: 301
Joined: Fri Jul 09, 2004 3:50 pm

Post by KMandalia »

You got me excited about something that I thought would be complicated but that, from your response, sounded easy to do.

Let's say I have the categories 'Test 1', 'Test 2' and 'Test 3' in the destination db.

I design a profile that has only a single url in it.
QUESTION: for this profile, if I enter 'Test 1' with nothing in the url pattern textbox, then 'Test 2' with my single url pattern in the box, and then another category 'Test 3' with nothing in it, would it work? (I haven't tried it.) What happens when the url pattern box is blank and the category name box isn't?

Let's say I had 10 urls in one category and then deleted all 10 of them from list/edit urls. It wouldn't wipe out the category name, would it?

If it wouldn't then what I want to do with the profile with single url should work, shouldn't it?

If I have the same category names and I merge and everything works out fine, would the new url automatically show up listed under the categories?
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

You need to have *something* in the pattern box, even if it won't match anything. Then it should work as desired.

Urls in the database do not control categories, nor do categories control what's in the database. They are simply related to each other if/when a match occurs.
KMandalia
Posts: 301
Joined: Fri Jul 09, 2004 3:50 pm

Post by KMandalia »

OK, I got your point.

So, I shall use the same category names for the source profile that has only a couple of base urls, include the urls in the category I want to match, put some vague pattern that I am sure won't match anything in the other categories, and then simply merge it. Do you see any downside to this approach? I am thinking of using this approach from now on whenever I have new sites to include.

Also, this may not be relevant, but does the refresh walk pick up any new base urls?
mark
Site Admin
Posts: 5519
Joined: Tue Apr 25, 2000 6:56 pm

Post by mark »

Sounds good.

Yes, refresh picks up new base urls. If you're planning on refreshing the merged database make sure your added walks were run with the same options as the one you're merging to because those are the ones that will apply upon refresh.