I apologize if this has been covered previously - I haven't been able to find anything definitive in the docs or forums on this.
I am using a document management system (DMS) called KnowledgeTree, as well as a content management system (CMS) called Drupal. Both let you define a taxonomy and apply taxonomy terms (basically, keyword metadata) to the items they manage: documents in the case of the DMS; pages, articles, etc. in the case of the CMS. So, in the DMS, I can upload a document (pdf, doc, xls, graphic, audio, video, etc.) and then tag it with terms from the taxonomy. The terms are external to the document, so they wouldn't be picked up on a crawl of the document itself. They display on a 'summary' page, and are of course accessible via a database query.
My task is to figure out how to associate this external metadata with the document data. Say a document talks about apples and pears, and I tag it with 'hybrids' even though hybrids are never mentioned in the text. When I search for 'hybrids', I want to get that document as a result. Additionally, if I search for 'apple' or 'pear', I want to be able to categorize the resulting document via its applied taxonomy terms, such as 'hybrids'.
Any thoughts, ideas, or suggestions? It sounds like a 'connector' is ideally what I want, but there don't seem to be connectors for Drupal or KnowledgeTree, and this process will need to be duplicated for other applications, some of which will be custom. The data load API also looks like an option: I could conceivably query the DMS/CMS databases and then feed Keywords, Category, etc. to it.
Also, on the data load API: is it required to pass the actual body of the item in the Body tag, or can I simply pass a URL to it, so that the appliance indexes the document via the URL? It seems excessive to pass the content of every item we want to indexed-by-hand when it is already accessible via URL.
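To make the idea concrete, here is a rough sketch of what I have in mind: query the DMS database for each document's taxonomy terms and build one record per document carrying the URL and the terms as keywords. Everything specific here is an assumption on my part. The `documents` and `document_terms` tables are made up (not the real KnowledgeTree or Drupal schema), and the `Record`, `Url`, and `Keywords` element names are placeholders, not a confirmed data load format. Whether a record with a URL and no Body is accepted at all is exactly the question above.

```python
import sqlite3
import xml.etree.ElementTree as ET


def build_records(conn):
    """Return one XML string per tagged document, with its terms as keywords.

    Assumes hypothetical tables documents(id, url) and
    document_terms(document_id, term). Documents with no terms are skipped
    because of the inner join.
    """
    rows = conn.execute(
        "SELECT d.url, t.term FROM documents d "
        "JOIN document_terms t ON t.document_id = d.id "
        "ORDER BY d.id, t.term"
    ).fetchall()

    # Group the terms by document URL, preserving query order.
    terms_by_url = {}
    for url, term in rows:
        terms_by_url.setdefault(url, []).append(term)

    records = []
    for url, terms in terms_by_url.items():
        # Element names are placeholders, not a documented format.
        rec = ET.Element("Record")
        ET.SubElement(rec, "Url").text = url
        ET.SubElement(rec, "Keywords").text = ", ".join(terms)
        records.append(ET.tostring(rec, encoding="unicode"))
    return records


def demo():
    """Build records from a tiny in-memory stand-in for the DMS database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE documents (id INTEGER PRIMARY KEY, url TEXT);
        CREATE TABLE document_terms (document_id INTEGER, term TEXT);
        INSERT INTO documents VALUES (1, 'http://dms.example/doc/1');
        INSERT INTO document_terms VALUES (1, 'hybrids');
        INSERT INTO document_terms VALUES (1, 'fruit');
        """
    )
    return build_records(conn)


if __name__ == "__main__":
    for record in demo():
        print(record)
```

The same shape should carry over to Drupal or a custom application by swapping the query, which is why I'd prefer something like this over a per-application connector if the data load API supports it.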
Thanks for the help, and please forgive my newbie-ness!