Our Webinator appliance supports search across hundreds of sites, all managed from our proprietary CMS, which defines a set of standard "content types" such as photos, videos, news releases, and events. We would like to train Webinator to recognize these content types so that search results can be filtered or sorted by type. The general plan is to add type-identifying markup to the webpages, configure Webinator to recognize and store this information as a "type" field, and then use that field as a parameter in the search interface.
At the moment we are considering marking up our content with schema.org tags because (a) it is an accepted standard with an extensive taxonomy of object types, and (b) it allows multiple object types per page -- e.g. a news release with two embedded photos and a video = four objects on one HTML page.
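To make the idea concrete, here is a rough sketch of the kind of type extraction we have in mind. It is plain Python using only the standard library, not anything Webinator-specific. It assumes the pages use schema.org microdata (itemscope/itemtype attributes); the names ItemTypeCollector, extract_types and SAMPLE are made up for illustration, and the sample markup is hypothetical rather than taken from our real pages.

```python
from html.parser import HTMLParser


class ItemTypeCollector(HTMLParser):
    """Collect schema.org type names from microdata itemtype attributes."""

    def __init__(self):
        super().__init__()
        self.types = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # Only elements that open a new microdata item carry a type.
        if "itemscope" in attr_map and attr_map.get("itemtype"):
            # itemtype may hold several space-separated URLs.
            for url in attr_map["itemtype"].split():
                # Keep the last path segment, e.g. "NewsArticle".
                self.types.append(url.rstrip("/").rsplit("/", 1)[-1])


def extract_types(html):
    """Return the schema.org type names found in html, in document order."""
    collector = ItemTypeCollector()
    collector.feed(html)
    return collector.types


# Hypothetical page: one news release with two photos and a video.
SAMPLE = """
<article itemscope itemtype="https://schema.org/NewsArticle">
  <h1 itemprop="headline">Example release</h1>
  <div itemprop="image" itemscope itemtype="https://schema.org/ImageObject"></div>
  <div itemprop="image" itemscope itemtype="https://schema.org/ImageObject"></div>
  <div itemprop="video" itemscope itemtype="https://schema.org/VideoObject"></div>
</article>
"""

if __name__ == "__main__":
    print(extract_types(SAMPLE))
```

For the sample page this yields four type names (one NewsArticle, two ImageObject, one VideoObject), which is the "four objects on one page" case described above. How such a list would be mapped into a Webinator field is exactly the part we are unsure about.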
Has anyone integrated schema.org markup into a Webinator crawl before? Any tips for how we might go about it? Or is there perhaps a completely different way to accomplish this goal?
Thanks,
Rob