TB Telco Minutes 2011-10-12

From IKS Project


  • Alessandro Adamou
  • Ali Anil Sinaci
  • Suat Gonul
  • Bertrand Delacretaz
  • Reto Bachmann
  • Massimo Romanelli
  • Sebastian Germesin
  • Olivier Grisel
  • Rupert Westenthaler
  • Fabian Christ

3P Reports

From Ali

  • Progress:
    • Standalone search component is integrated into Contenthub. Initial version is committed.
    • Contenthub implementation is improved by integrating Embedded Solr.
    • Content management interface is improved. Content items can be submitted along with several constraints. They can be edited and removed now.
    • Faceted search is implemented.
    • Several bugs are resolved.
  • Problems: None
  • Perspectives:
    • Focus on semantic search.
    • Possible integration with LMF.
    • Fast response times for search.

From Olivier

  • Progress:
    • much better integration of enhancer engines + entityhub with Nuxeo (avoid useless calls + various UI improvements)
    • built a large multilingual DBpedia 3.7 index suitable for demos
    • worked on a generic connector for the proprietary Temis text analytics engine
  • Problems:
    • Various minor bugs but Rupert is quick at fixing them as we go
  • Perspectives:
    • Test analysis in Arabic with Temis (should be demoable during GAM)
    • Integrate topic categorization prototype as Stanbol engine + dedicated referenced site
    • Work on the switch to the Annotation Ontology (to be discussed)?

From Sebastian

  • Progress:
  • Problems:
    • We should feature VIE more prominently in one of the next IKS-RI releases.
  • Perspectives:
    • more widgets
    • table editing widget
    • tutorial for how to use VIE
    • tutorial for how to use VIE widget(s)
    • We started work on an extension for Google Chrome that leverages semantic annotations on web pages and presents entity-type-specific actions (similar to, e.g., http://www.google.com/related/), combining several VIE widgets.

From Bertrand and Reto

  • Progress:
    • Sling/Stanbol prototype [1]: using annotate.js/hello/VIE, successfully ported REST service from Stanbol.
    • Completed the industry evaluation questions at [2], and announced it on the iks-wip list. Deadline for industry partners feedback is November 10.
  • Problems:
    • We suspect we'll get "I have no idea what this does" responses for several components listed at [2]. I'm planning to use some of our Task 6.2 budget to help incrementally improve documentation and samples before running another evaluation round.
  • Perspectives:
    • Continue work on the Sling-Stanbol prototype as an example application for Stanbol, and write an article for the IKS blog.

From Fabian

  • Progress:
    • Updated Stanbol's JSON-LD output to support the JSON-LD 1.0 spec
  • Problems: none
  • Perspectives (ordered by priority):
    • Rupert pointed me to an issue on how to handle properties that have a list of values of multiple types in the JSON-LD API
    • Release IKS 5.1 with WebVIE and Reasoners
    • Plan and design what to do next with the Firefox extension
    • Blog more about IKS/Stanbol. Start with a handbook.
    • Improve performance of JSON-LD serialization impl by avoiding the many map copy actions that are currently taking place

From Alessandro and Enrico

  • Progress:
    • Committed stable version of improved reasoner modules (STANBOL-185)
    • Started implementation of reasoning background jobs (STANBOL-343)
    • Simplified dependencies and improved unit tests in Stanbol reengineer. Leap towards stability / IKS inclusion (STANBOL-301 STANBOL-309)
    • Added some documentation on Stanbol Rules to docs repository
    • Started optimizations in Clerezza storage for OntoNet (STANBOL-332 STANBOL-337)
    • Moved Explanation WIP component to Stanbol trunk (only built in kres profile for the time being) (STANBOL-347)
  • Problems: none that were not already solved along the way
  • Perspectives (ordered by priority):
    • Pre-configure IKS ontology network using ontologymanager registries
    • Native OntoNet implementation in Clerezza
    • Simplify OntoNet session architecture
    • Review FactStore code and establish integration endpoints with OntoNet and Reasoners
    • Schema-based summarization support in explanation
    • RESTful API for ontologymanager/registry
    • reasoners API to accommodate LMF features for Stanbol integration
    • Unify ontologymanager/web and ontologymanager/store/web RESTful APIs

From Suat

  • Progress:
    • Created a base bundle list that is extended by all launchers
    • Created partial bundle lists for Stanbol components that are included in the full launcher and updated launchers accordingly
    • Updated documentation for CMS Adapter
  • Problems:
    • Should partial bundle lists consider dependencies of the bundles within them? For example, some engines included in the partial bundle list of Enhancer have dependencies on Entityhub. We didn't reach a conclusion with Rupert on this issue.
  • Perspectives:
    • WAR file creation for Stanbol
    • A functionality for CMS Adapter that enables submission of documents from content repositories to Contenthub
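
The open question about partial bundle lists can be made concrete with a small sketch. It assumes a made-up dependency table and made-up bundle names (`PARTIAL_LISTS`, `DEPENDS_ON`, and `missing_dependencies` are all hypothetical and not part of the Stanbol launchers); it only shows how one might detect that a partial list is not self-contained.

```python
from typing import Dict, Set

# Hypothetical contents of two partial bundle lists.
PARTIAL_LISTS: Dict[str, Set[str]] = {
    "enhancer": {"enhancer.core", "enhancer.engine.entitytagging"},
    "entityhub": {"entityhub.core"},
}

# Hypothetical bundle-level dependencies (bundle -> bundles it needs).
DEPENDS_ON: Dict[str, Set[str]] = {
    "enhancer.core": set(),
    "enhancer.engine.entitytagging": {"enhancer.core", "entityhub.core"},
    "entityhub.core": set(),
}


def missing_dependencies(list_name: str) -> Set[str]:
    """Return bundles that the list's bundles need (transitively) but the list lacks."""
    bundles = PARTIAL_LISTS[list_name]
    required: Set[str] = set()
    seen: Set[str] = set()
    stack = list(bundles)
    while stack:
        bundle = stack.pop()
        if bundle in seen:
            continue
        seen.add(bundle)
        deps = DEPENDS_ON.get(bundle, set())
        required |= deps
        stack.extend(deps)
    return required - bundles
```

With this toy data, `missing_dependencies("enhancer")` reports the Entityhub bundle, which is exactly the situation described in the problem above; whether the list should then pull that bundle in or leave it to the launcher remains the undecided design question.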

From Rupert, Andreas and Szaby

  • Progress (Rupert):
    • Reimplementation of the TaxonomyLinkingEngine to make it more modular (STANBOL-303)
    • Multilinguality (NLP in all languages supported by OpenNLP, language-based lookup for Entities, ...)
    • corrected remaining encoding issues (STANBOL-329)
    • Started improving multilingual indexing for the SolrYard (STANBOL-331)
    • Implementation of CORS (allows cross-site requests from scripts) (STANBOL-105)
    • Reimplementation of the "application/json+rdf" serialiser for Clerezza to improve performance dramatically (will create Issue/submit patch after further testing)
  • Progress (Andreas)
    • All Engines Documented
    • Some Blogs about Stanbol/IKS usage scenarios (Europeana, multi-linguality)
    • Testing of multilingual keyword extraction.
  • Progress (Szaby)
  • Problems (Andreas):
    • Restructuring the Overview Page for Documentation
    • Missing README.rd for some Components
  • Perspectives (Rupert):
    • Stanbol Enhancement Structure: This will replace the FISE enhancement structure that is still in use; the current idea is to base it on the Annotation Ontology [1]. (STANBOL-3)
      • Write a document that describes how the Stanbol Enhancer will use this ontology to encode enhancements
      • Define necessary extensions (e.g. we will need to add support for Suggestions)
    • Stanbol Enhancer Base Infrastructure: At the GAM meeting in Istanbul we outlined some nice extensions to the Enhancer base infrastructure that were never implemented
      • Content Adapter pattern: Implementation based on Apache Tika, but also allowing the use of external services (e.g. a CMS that already has the different versions)
      • Multiple Enhancement Chains: Create "named" configurations that are then exposed with the same REST API under /engines/{name}/
      • Improve Error Handling: A single error of an engine should not cancel the whole enhancement process (rather include errors and warnings within the metadata)
    • LMF Stanbol Integration: Current plan is that this will be the main topic during the next Developer Meeting (co-located with the GAM)
      • converting the existing LMF to OSGi
      • integration of the LMF semantic search with the current ContentHub implementation
    • Better support for NLP:
      • To share metadata as created by NLP engines (Tokens, POS tags, Chunks, Stemmed Tokens ...) we need to provide a much more efficient infrastructure than the current in-memory RDF graph. I think the Lucene Analyzer pipelines are a good example of what such a model might look like. Currently I do not have an idea how this could be nicely integrated into the Stanbol Enhancer; however, the AnalyzedText class [2] can be used as an example of how such data could look.
      • Improve NLP by using Lucene language analyzing capabilities [3]
  • Perspectives (Andreas):
    • Completing Documentation for Components (Modules like Rules, Reengineer, Reasoning, ...)
    • Creating more usage scenarios
    • Proper Benchmarking of the Stanbol Enhancer (Engines, Multilinguality ...)
  • Perspectives (Szaby):
    • making VIE, annotate.js and the bookmarklet more stable.
    • further improvements to Proggis
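
The "Multiple Enhancement Chains" and "Improve Error Handling" items in Rupert's perspectives can be illustrated with a small sketch. It is written in Python for brevity rather than Stanbol's Java, and every name in it (`CHAINS`, `run_chain`, the two toy engines) is hypothetical; it is not the Stanbol Enhancer API, only one possible reading of the idea.

```python
from typing import Callable, Dict, List

# Hypothetical engine type: takes the content, returns a list of enhancements.
Engine = Callable[[str], List[str]]


def language_engine(content: str) -> List[str]:
    """Toy engine that always succeeds."""
    return ["language=en"]


def failing_engine(content: str) -> List[str]:
    """Toy engine that always fails, to exercise the error path."""
    raise RuntimeError("index not available")


# Hypothetical "named" chains, analogous to /engines/{name}/ in the minutes.
CHAINS: Dict[str, List[Engine]] = {
    "default": [failing_engine, language_engine],
    "language-only": [language_engine],
}


def run_chain(name: str, content: str) -> Dict[str, List[str]]:
    """Run every engine of the named chain; record failures instead of aborting."""
    metadata: Dict[str, List[str]] = {"enhancements": [], "errors": []}
    for engine in CHAINS[name]:
        try:
            metadata["enhancements"].extend(engine(content))
        except Exception as exc:  # one failing engine must not cancel the rest
            metadata["errors"].append(f"{engine.__name__}: {exc}")
    return metadata
```

In the "default" chain the failing engine runs first, yet the language engine still contributes its result, and the failure ends up in the returned metadata rather than cancelling the whole run.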

Online Minutes

[14:07:44] Fabian Christ: Components need a proper README to make documentation easier
[14:08:34] Reto Bachmann: the clerezza gets entitied and updates cache using the dbpedia sparql-endpoint, but this often doesn't work
[14:09:57] Reto Bachmann: Reto Bachmann thinks this could be something for the hackathon
[14:10:07] Sebastian Germesin: regarding the missing documentation: can we specify a list of all components that have missing or poor documentation and then have persons specified responsible for adding the documentation?
[14:10:40] Fabian Christ: Olivier raises the problem that the provided default dbpedia index is much too small for realistic use case scenarios
[14:10:50] Alessandro Adamou: did I get it correctly that Andreas has a template in mind for structuring component documentation?
[14:10:53] Reto Bachmann: Sebastian: see the questionnaire on wiki
[14:11:14] Fabian Christ: Add a big index by replacing the default small one automatically
[14:11:15] Bertrand Delacretaz: Reto means http://wiki.iks-project.eu/index.php/Adoption-Questionare
[14:13:32] Sebastian Germesin: @reto, bertrand: thanks, but I'm not sure whether that helps the developer of the corresponding tool. in my case: I can answer question 1, whether I know what the VIE widgets main goal is, with a "yes", don't know if an early adopter can....
[14:13:45] Reto Bachmann: if all else fails we could use tinybundles to create a bundle on the fly
[14:22:57] Fabian Christ: http://wiki.iks-project.eu/index.php/IntegrationHackathonSalzburg
[14:24:57] Fabian Christ: everyone should add/edit the hackathon planning in the wiki
[14:27:46] Fabian Christ: Olivier asked for status on CROS (?)
[14:28:20] Olivier Grisel: CORS
[14:28:28] Olivier Grisel: Cross Origin Resource Sharing
[14:31:33] Olivier Grisel: https://code.google.com/p/annotation-ontology/
[14:32:13] Reto Bachmann: +1
[14:42:47] Fabian Christ: discussed launcher partial lists

Olivier created a new issue for adding support for ordered result sets in JSON-LD: https://issues.apache.org/jira/browse/STANBOL-350 (useful for content hub and entity hub).