Ooffee2 Proposal

From IKS Project

Early Adopter Proposal

Ooffee has been an adopter of FISE/Stanbol from the start.

During the last EA workshop (Paris), Florent André demonstrated an integration of Stanbol with Lenya.

This proof-of-concept integration was developed for the R&D department of the EDF group.

This first use case involved:

  • A domain-specific (energy) thesaurus embedded as an Entityhub referenced site
  • A UIMA engine developed specifically for Stanbol
  • A desktop widget that used Apache Stanbol

The UIMA engine will be contributed to Apache Stanbol (Florent is now an Apache Stanbol contributor). The desktop widget may be contributed in the future if relevant.


From this first experience, Ooffee identified the management, improvement and maintenance of an enterprise's own thesaurus as one of the key parts of information enhancement.


With the European-funded Linked Heritage project, Ooffee will go a step further in this direction: one of the main goals of the project is to provide a website where users can import their thesauri, modify them, and map them to any other thesaurus that has already been imported. The users of this application will be European culture-related organizations (museums, libraries, ...).


In the end, all these thesauri will be used by Europeana for content and search processing.

The target CMS has not yet been decided, but Apache Stanbol components will definitely be used.

Use-Case

User A has a specific thesaurus in CSV, text or RDF format.

User A imports this thesaurus, which is then stored as SKOS in Stanbol.
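
To make the import step concrete, here is a minimal sketch of how a CSV thesaurus could be turned into SKOS (serialized as Turtle). The column names `id`, `label` and `broader`, the base URI and the function name are assumptions made for this illustration; the actual import format and the way the result is handed to Stanbol are not specified in this proposal.

```python
import csv
import io

SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n"


def csv_to_skos_turtle(csv_text, base_uri):
    """Convert CSV rows (hypothetical columns: id, label, broader) to SKOS Turtle."""
    blocks = [SKOS_PREFIX]
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{base_uri}{row['id']}>"
        # Escape backslashes and double quotes so the label is a valid Turtle literal.
        label = row["label"].replace("\\", "\\\\").replace('"', '\\"')
        statements = ["a skos:Concept", f'skos:prefLabel "{label}"']
        # The broader column is optional; an empty cell means a top-level concept.
        if row.get("broader"):
            statements.append(f"skos:broader <{base_uri}{row['broader']}>")
        blocks.append(f"{subject} " + " ;\n    ".join(statements) + " .\n")
    return "\n".join(blocks)


# Illustrative data only, not taken from a real thesaurus.
SAMPLE = "id,label,broader\nc1,Energy,\nc2,Wind power,c1\n"
turtle = csv_to_skos_turtle(SAMPLE, "http://example.org/thesaurus/")
print(turtle)
```

A real import would also need to handle language tags, alternative labels and the text and RDF input formats mentioned above.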

Once the import has succeeded, the user gets an easy-to-use interface to visualize, modify and enhance the thesaurus.


Here are some first-draft screenshots of the thesaurus editing tool:


Once the improvements and modifications are done, the user can create mappings between this graph and another graph that has already been imported (say, by user B).

In the user interface, a mapping takes just two clicks: the user clicks on a concept in the left-hand thesaurus and then on a concept in the right-hand thesaurus, and a mapping is created between these two concepts.
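
The two-click interaction could be modelled as follows. This is only a sketch: the class names are hypothetical, and the choice of `skos:exactMatch` as the mapping relation is an assumption, since the proposal does not say which SKOS mapping property will be used.

```python
from dataclasses import dataclass

SKOS = "http://www.w3.org/2004/02/skos/core#"


@dataclass(frozen=True)
class Mapping:
    source: str
    target: str
    relation: str = SKOS + "exactMatch"  # assumed default relation

    def to_ntriples(self):
        """Serialize the mapping as a single N-Triples statement."""
        return f"<{self.source}> <{self.relation}> <{self.target}> ."


class MappingSession:
    """Collects the left click, then creates a mapping on the right click."""

    def __init__(self):
        self._pending_source = None
        self.mappings = []

    def click_left(self, concept_uri):
        self._pending_source = concept_uri

    def click_right(self, concept_uri):
        if self._pending_source is None:
            raise ValueError("select a concept in the left-hand thesaurus first")
        mapping = Mapping(self._pending_source, concept_uri)
        self.mappings.append(mapping)
        self._pending_source = None
        return mapping


# Illustrative URIs only.
session = MappingSession()
session.click_left("http://example.org/a/wind")
mapping = session.click_right("http://example.org/b/wind-energy")
print(mapping.to_ntriples())
```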

Once a certain number of thesauri have been imported, the website will offer automatic mapping and similarity detection for newly imported thesauri.
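
The proposal does not describe how automatic mapping will work. As one simple possibility, the sketch below suggests candidate mappings between concepts whose labels are identical after normalization (case and whitespace). The function names and the data layout (a dictionary from concept URI to label) are assumptions for this example; a real system would likely use richer similarity measures.

```python
def normalise(label):
    """Lower-case the label and collapse runs of whitespace."""
    return " ".join(label.casefold().split())


def suggest_mappings(new_thesaurus, existing_thesaurus):
    """Return (new_uri, existing_uri) pairs whose normalised labels are equal.

    Both arguments are dicts mapping a concept URI to its preferred label.
    """
    by_label = {}
    for uri, label in existing_thesaurus.items():
        by_label.setdefault(normalise(label), []).append(uri)

    suggestions = []
    for uri, label in new_thesaurus.items():
        for candidate in by_label.get(normalise(label), []):
            suggestions.append((uri, candidate))
    return suggestions


# Illustrative data only.
new = {"a:1": "Wind  Power", "a:2": "Solar energy"}
existing = {"b:9": "wind power", "b:7": "Hydropower"}
print(suggest_mappings(new, existing))
```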


Other planned features are:

  • Multi-thesaurus search: when a user searches for a term, the system displays all thesauri that contain that term.
  • Inferencing on thesauri and mappings: offering the user new knowledge about existing thesauri and the mappings between them.
  • Semantic enhancement of content based on the managed thesauri.
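
The multi-thesaurus search in the list above can be illustrated with a small in-memory inverted index from terms to the thesauri that contain them. This is a sketch under assumptions: the function names and the input layout (thesaurus name mapped to a list of labels) are hypothetical, and the real website would presumably query the stored thesauri rather than an in-memory dictionary.

```python
from collections import defaultdict


def build_term_index(thesauri):
    """Map each case-folded label to the set of thesaurus names containing it."""
    index = defaultdict(set)
    for name, labels in thesauri.items():
        for label in labels:
            index[label.casefold()].add(name)
    return index


def find_thesauri(index, term):
    """Return the sorted names of all thesauri that contain the term."""
    return sorted(index.get(term.casefold(), set()))


# Illustrative data only.
thesauri = {
    "museum": ["Painting", "Sculpture"],
    "library": ["Painting", "Manuscript"],
}
index = build_term_index(thesauri)
print(find_thesauri(index, "painting"))
```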


Validation

The Stanbol components used and validated during this project will be:

  • OntoNet: ontology registration, management and modification
  • Reasoner/LD-path: proposing new knowledge from already imported thesauri
  • OntoNet - Enhancement Engines link: can thesauri managed by OntoNet be used for semantic enhancement?


Validation will be performed on the set of thesauri imported into the website.

For the first demo, three or four power-users* will be involved; by the end, more than 15 power-users are planned.

Test data will be existing thesauri currently managed by European cultural organizations (museums, libraries, ... associated with the Linked Heritage project). Some thesauri will be language-specific, others multilingual. Each thesaurus will contain between approximately 40 and 500 triples.

The WP manager of Linked Heritage will evaluate Stanbol's performance against the project's needs. Cultural content providers will evaluate Apache Stanbol's features with respect to thesaurus import and management.

* Power-users are users from different organizations who can import and modify thesauri.

Performance

The terms of the contract are:

  • Start of contract: 1 February 2012
  • IKS components for validation: Apache Stanbol OntoNet, Reasoner, and possibly Entityhub
  • Alpha version (thesaurus import, modification and mapping) available: 1 March 2012
  • Validation interview: 1 April 2012
  • Beta version (search, reasoning) available: 15 April 2012
  • End of contract: 1 May 2012
  • Total remuneration for this contract is 6500 Euro, excluding VAT.

Visibility

  • Stanbol will be one of the core components of the website.
  • The website is intended to be a public demo, so anyone can access it.
  • Code produced during this contract will be Apache licensed.