Semantic Analysis Technology

8 November 2008 at 23:23

I attended “Semantic Analysis Technology: in Search of Categories, Concepts & Context”, the fourth ISKO UK KOnnecting KOmmunities event, on 3 November 2008 at University College London.

First up were presentations from two vendors, Luca Scagliarini and Jeremy Bentley.

Scagliarini argued that information discovery suffers from both information overload and information underload because text processing is not meaning-based. Free text search and shallow automatic linguistic analysis do not do the job, he said, whereas ‘deep semantic analysis’, which works from the meaning encoded in the relationships between verbs, prepositions and nouns, shows potential.

Bentley reviewed key information organisation issues – unstructured information, the doubling of the number of resources every 19 months, ‘findability’ problems and how black box solutions may not do the job. He discussed the relevance of metadata and of taxonomies built specifically to reflect the way an organisation works.

Later, practitioners presented – Rob Lee and Helen Lippell, Karen Loasby and Silver Oliver.

Lee talked about Muddy Boots, a BBC project that supports the BBC’s remit to link to more external sources. He illustrated how structured datasets in the public domain could be used to contextualise and index BBC resources, and how their semantic richness could be exploited to find meaningful external links.

Lippell, Loasby and Oliver discussed three different implementations of auto-categorisation systems, demonstrating advantages and issues with each approach. The approaches were:

  • using Verity Intelligent Classifier (VIC) and a taxonomy with a set of rules that could be finely tuned
  • applying a rule-based automatic classification system combined with the author’s review and corrections to produce BBC content that could be described in detail. The approach
  • a “statistical-based auto-categorisation” project designed to connect and cross-reference distributed BBC content and resources horizontally
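
To make the rule-based idea concrete, here is a minimal sketch of keyword rules attached to a taxonomy. It is not how VIC or any of the BBC systems work; the category names, the terms, the `categorise` function and the `threshold` parameter are all assumptions made up for illustration. Raising the threshold is one crude way of ‘tuning’ a rule set, and the returned categories could be shown to an author as suggestions to review and correct.

```python
import re

# Hypothetical taxonomy: each category is described by a few keyword rules.
# These names and terms are illustrative only, not taken from any real system.
RULES = {
    "politics": ["election", "parliament", "minister"],
    "sport": ["football", "cricket", "olympics"],
}


def categorise(text, rules=RULES, threshold=1):
    """Return the categories whose rules match at least `threshold` distinct terms."""
    # Lower-case the text and reduce it to a set of distinct words.
    words = set(re.findall(r"[a-z]+", text.lower()))
    # Score each category by how many of its rule terms appear in the text.
    scores = {cat: len(words & set(terms)) for cat, terms in rules.items()}
    # Keep categories that reach the threshold, in alphabetical order.
    return sorted(cat for cat, score in scores.items() if score >= threshold)


if __name__ == "__main__":
    print(categorise("The minister spoke before the election"))
```

A statistical approach would replace the hand-written term lists with weights learned from documents that have already been categorised.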

Entry filed under: categorisation, semantic web.
