Using Webinator to index, search and highlight content in a single ad hoc web page

barry.marcus
Posts: 288
Joined: Thu Nov 16, 2006 1:05 pm

Post by barry.marcus »

We have never used Webinator in our application (though we do have extensive Texis and Vortex experience), so this question may sound naive. I'm a complete neophyte with this product, so bear with me...

As part of our business workflow, we occasionally use several web-based services that serve up text-heavy result pages, the content of which is not part of our normal Texis database. When we need to perform metamorph-like searches on the data in these pages, what we currently do is copy and paste the data to be searched from the web page into a text field in our Vortex app and search/mark up the data using something like:

<strfmt "%mbs" $ourSearchCritera $theData>

The problem is not really the speed of the search (although the data is not indexed), but rather that the copy and paste itself is error-prone and time-consuming. It is not unusual for the result pages to run to more than 50,000 words. When we have to do this, we often have to do it repeatedly, across many pages, and the awkwardness of the process is often a real bottleneck in our efficiency.

What we would like to have instead is a way to apply a metamorph search to the content of a page that is already loaded in our browser, and to see the "hits" of the search as highlighted text (i.e., the page text with the matches marked up) in another browser window, in a PDF, or whatever. This functionality would necessarily have to be applied to single pages already rendered in the browser window, since the pages are constructed by the web-based service in response to ad hoc queries that we submit to their server. That is, the pages are built "on the fly" per request and are not really available out on the web for indexing via a Webinator indexing "walk."

The question then is: Is this something that Webinator can do, either out of the box or with some creative tweaking?

Thanks for your help.
John
Site Admin
Posts: 2597
Joined: Mon Apr 24, 2000 3:18 pm
Location: Cleveland, OH

Post by John »

Since the content exists only in the browser, it isn't something that Webinator can do directly. The simplest option would probably be to save the page to a directory that Webinator can access, rather than using copy and paste.

Another option would be to create a script that proxies all the requests and highlights the content on its way to the browser. That will be most successful on HTML and possibly PDF pages. A third option would be a browser plugin, although that would be browser-specific and might have difficulty handling PDFs.
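
For illustration only, here is a rough sketch of the proxy idea in Python rather than Vortex. It is not Thunderstone code and it does not do a Metamorph search: plain case-insensitive whole-word matching stands in for the real query engine. The `url` and `q` parameters, the port, and the names `Highlighter`, `highlight`, `ProxyHandler` and `serve` are all made up for the example. It also assumes the remote page can be fetched with a simple GET, which may not hold for services that need a login session or a POSTed query.

```python
import re
import urllib.request
from html.parser import HTMLParser
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urlparse, parse_qs


class Highlighter(HTMLParser):
    """Re-emit HTML, wrapping term matches found in text nodes in <mark>."""

    SKIP = {"script", "style"}  # never mark up text inside these elements

    def __init__(self, terms):
        # convert_charrefs=False so entities such as &amp; pass through untouched
        super().__init__(convert_charrefs=False)
        alternatives = "|".join(re.escape(t) for t in terms)
        self.pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
        self.out = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1
        self.out.append(self.get_starttag_text())  # original tag text, attributes intact

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth:
            self.skip_depth -= 1
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth:
            self.out.append(data)
        else:
            self.out.append(self.pattern.sub(r"<mark>\1</mark>", data))

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def highlight(html, terms):
    """Return html with each whole-word occurrence of a term wrapped in <mark>."""
    if not terms:
        return html
    parser = Highlighter(terms)
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


class ProxyHandler(BaseHTTPRequestHandler):
    """Handles GET /?url=<page address>&q=<space-separated terms> (hypothetical)."""

    def do_GET(self):
        params = parse_qs(urlparse(self.path).query)
        url = params.get("url", [""])[0]
        terms = params.get("q", [""])[0].split()
        if not url.startswith(("http://", "https://")):
            self.send_error(400, "expected ?url=http(s)://...&q=terms")
            return
        with urllib.request.urlopen(url) as resp:
            charset = resp.headers.get_content_charset() or "utf-8"
            page = resp.read().decode(charset, errors="replace")
        body = highlight(page, terms).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Run the sketch on localhost only; the port is an arbitrary choice."""
    HTTPServer(("127.0.0.1", port), ProxyHandler).serve_forever()
```

Calling `serve()` and opening `http://127.0.0.1:8080/?url=<page address>&q=<terms>` would return the fetched page with matches wrapped in `<mark>`. Only text nodes are rewritten, so tag attributes, scripts and styles are left alone. In a real deployment the matching step is where the Texis/Vortex Metamorph markup would go.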
John Turnbull
Thunderstone Software