Geonames.org-LocationEnhancementEngine
From IKS Project
This engine creates fise:EntityAnnotations based on the http://geonames.org dataset. It does not work directly on the parsed content, but processes named entities extracted by an NLP (natural language processing) engine. For each such named entity, it creates EntityAnnotations for the Features found in the geonames.org dataset. In addition, it adds EntityAnnotations for the continent, country and administrative regions of entities with a high confidence level.
Processed Annotations (Input)
This engine consumes fise:TextAnnotations of type dbpedia:Place. More concretely, it filters for enhancements that conform to the following requirements and consumes the text selected by the matching TextAnnotations:
?textAnnotation rdf:type fise:TextAnnotation .
?textAnnotation dc:type dbpedia:Place .
?textAnnotation fise:selected-text ?text .
Here is an example of such a TextAnnotation, selecting the text "Vienna" from the content "The IKS community Workshop will take place in Vienna".
urn:enhancement:text-enhancement:id1
a fise:TextAnnotation , fise:Enhancement ;
dc:type
dbpedia:Place ;
fise:selected-text
"Vienna"^^xsd:string ;
fise:selection-context
"The IKS community Workshop will take place in Vienna"^^xsd:string ;
fise:start
"46"^^xsd:int ;
fise:end
"52"^^xsd:int ;
fise:confidence
"0.9773640902587215"^^xsd:double ;
fise:extracted-from
urn:content-item:id1 .
Typically, such enhancements are created by engines that provide named entity extraction, such as the openNLP-NamedEntityExtractionEnhancementEngine.
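The filter described in this section can be illustrated with a small Python sketch. It is not the engine's implementation: the function name select_place_texts and the representation of the enhancement graph as plain (subject, predicate, object) string triples with prefixed names are assumptions made only for this illustration.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical, simplified triple representation: prefixed names as plain strings.
Triple = Tuple[str, str, str]

EXAMPLE_TRIPLES: List[Triple] = [
    ("urn:enhancement:text-enhancement:id1", "rdf:type", "fise:TextAnnotation"),
    ("urn:enhancement:text-enhancement:id1", "dc:type", "dbpedia:Place"),
    ("urn:enhancement:text-enhancement:id1", "fise:selected-text", "Vienna"),
    # A TextAnnotation of another type, which the filter must ignore.
    ("urn:enhancement:text-enhancement:id2", "rdf:type", "fise:TextAnnotation"),
    ("urn:enhancement:text-enhancement:id2", "dc:type", "dbpedia:Person"),
    ("urn:enhancement:text-enhancement:id2", "fise:selected-text", "Mozart"),
]


def select_place_texts(triples: Iterable[Triple]) -> List[Tuple[str, str]]:
    """Return (annotation, selected text) pairs for place TextAnnotations."""
    # Group all property values by subject so each pattern line can be checked.
    by_subject: Dict[str, Dict[str, Set[str]]] = {}
    for subject, predicate, obj in triples:
        by_subject.setdefault(subject, {}).setdefault(predicate, set()).add(obj)

    selected: List[Tuple[str, str]] = []
    for subject, props in by_subject.items():
        if "fise:TextAnnotation" not in props.get("rdf:type", set()):
            continue  # not a TextAnnotation
        if "dbpedia:Place" not in props.get("dc:type", set()):
            continue  # not typed as a place
        for text in sorted(props.get("fise:selected-text", set())):
            selected.append((subject, text))
    return selected


if __name__ == "__main__":
    print(select_place_texts(EXAMPLE_TRIPLES))
```

Only annotations that satisfy both type requirements and carry a fise:selected-text value are returned, so the "Mozart" annotation in the sample data is skipped.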
Created Enhancements (Output)
The LocationEnhancementEngine creates two types of EntityAnnotations. First, it suggests Entities for the processed TextAnnotations. Second, it creates EntityAnnotations for the hierarchy of regions in which the suggested Entities are located. Suggested Entities are connected by the dc:relation attribute to the TextAnnotation they enhance. EntityAnnotations representing the hierarchy define a dc:requires attribute pointing to the EntityAnnotation of the suggested Entity they describe.
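As a rough illustration of how these two link types chain together, the following Python sketch resolves any enhancement back to the TextAnnotation it ultimately refers to. The function name resolve_text_annotation and the two dictionaries standing in for the dc:requires and dc:relation statements are hypothetical; the identifiers are the ones used in the examples on this page.

```python
from typing import Dict, Optional

# dc:requires: hierarchy enhancement -> suggested EntityAnnotation
REQUIRES: Dict[str, str] = {
    "urn:enhancement:entity-hierarchy-enhancement:id1": "urn:enhancement:entity-enhancement:id1",
    "urn:enhancement:entity-hierarchy-enhancement:id2": "urn:enhancement:entity-enhancement:id1",
}

# dc:relation: suggested EntityAnnotation -> TextAnnotation
RELATION: Dict[str, str] = {
    "urn:enhancement:entity-enhancement:id1": "urn:enhancement:text-enhancement:id1",
    "urn:enhancement:entity-enhancement:id2": "urn:enhancement:text-enhancement:id1",
}


def resolve_text_annotation(
    enhancement: str,
    requires: Dict[str, str],
    relation: Dict[str, str],
) -> Optional[str]:
    """Follow dc:requires (if present) and then dc:relation to the TextAnnotation."""
    # A hierarchy enhancement first points to the suggestion it depends on;
    # a suggestion has no dc:requires entry and is used as-is.
    suggestion = requires.get(enhancement, enhancement)
    # The suggestion points to the TextAnnotation it enhances.
    return relation.get(suggestion)


if __name__ == "__main__":
    print(resolve_text_annotation(
        "urn:enhancement:entity-hierarchy-enhancement:id2", REQUIRES, RELATION))
```

A consumer could use such a lookup to group all suggestions and region enhancements under the text span that triggered them.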
Entity Suggestions
Entity suggestions are EntityAnnotations that suggest Features of the geonames.org dataset for a processed TextAnnotation. These suggestions are currently calculated based only on the fise:selected-text of the TextAnnotation. The following example shows three EntityAnnotations for the TextAnnotation used in the example above. Note the dc:relation statement at the end of each of the three EntityAnnotations.
TODO: link to the used WebService
The first Entity found in the geonames.org dataset is the capital city of Austria, with a confidence level of 1.0:
urn:enhancement:entity-enhancement:id1
a fise:EntityAnnotation , fise:Enhancement ;
fise:confidence
"1.0"^^xsd:double ;
fise:entity-label
"Vienna"^^xsd:string ;
fise:entity-reference
http://sws.geonames.org/2761369/ ;
fise:entity-type
geonames:Feature , dbpedia:Place , dbpedia:Settlement , dbpedia:PopulatedPlace , geonames:P.PPLC ;
fise:extracted-from
urn:content-item:id1 ;
dc:relation
urn:enhancement:text-enhancement:id1 .
The geonames.org dataset also contains many other populated places named "Vienna". These are suggested with lower confidence levels:
urn:enhancement:entity-enhancement:id2
a fise:EntityAnnotation , fise:Enhancement ;
fise:confidence
"0.42163702845573425"^^xsd:double ;
fise:entity-label
"Vienna"^^xsd:string ;
fise:entity-reference
http://sws.geonames.org/4496671/ ;
fise:entity-type
geonames:Feature , dbpedia:Place , dbpedia:Settlement , dbpedia:PopulatedPlace , geonames:P.PPL ;
fise:extracted-from
urn:content-item:id1 ;
dc:relation
urn:enhancement:text-enhancement:id1 .
urn:enhancement:entity-enhancement:id3
a fise:EntityAnnotation , fise:Enhancement ;
fise:confidence
"0.42163702845573425"^^xsd:double ;
fise:entity-label
"Vienna"^^xsd:string ;
fise:entity-reference
http://sws.geonames.org/4825976/ ;
fise:entity-type
geonames:Feature , dbpedia:Place , dbpedia:Settlement , dbpedia:PopulatedPlace , geonames:P.PPL ;
fise:extracted-from
urn:content-item:id1 ;
dc:relation
urn:enhancement:text-enhancement:id1 .
Entity Hierarchy Enhancements
Entity Hierarchy Enhancements describe the regions that contain the suggested Features, based on the geonames.org dataset. Enhancements describing this hierarchy are added for all suggested entities with a confidence level above the value of "eu.iksproject.fise.engines.geonames.locationEnhancementEngine.min-hierarchy-score". The default value for this property is 0.8. The hierarchy web service provided by geonames.org is used to calculate the regions.
The following example shows the entity hierarchy enhancements for the suggested entity for Vienna (Austria). Please note the dc:requires relation to this EntityAnnotation at the end of each of the following enhancements.
First the enhancement for the continent Europe:
urn:enhancement:entity-hierarchy-enhancement:id1
a fise:EntityAnnotation , fise:Enhancement ;
fise:confidence
"0.42163702845573425"^^xsd:double ;
fise:entity-label
"Europe"^^xsd:string ;
fise:entity-reference
http://sws.geonames.org/6255148/ ;
fise:entity-type
geonames:Feature , dbpedia:Place , geonames:L.CONT ;
fise:extracted-from
urn:content-item:id1 ;
dc:requires
urn:enhancement:entity-enhancement:id1 .
Next, the enhancement for the country "Austria", classified as an independent political entity within geonames.org:
urn:enhancement:entity-hierarchy-enhancement:id2
a fise:EntityAnnotation , fise:Enhancement ;
fise:confidence
"0.42163702845573425"^^xsd:double ;
fise:entity-label
"Austria"^^xsd:string ;
fise:entity-reference
http://sws.geonames.org/2782113/ ;
fise:entity-type
geonames:Feature , dbpedia:Place , dbpedia:AdministrativeRegion , geonames:A.PCLI ;
fise:extracted-from
urn:content-item:id1 ;
dc:requires
urn:enhancement:entity-enhancement:id1 .
Finally, three enhancements describe the different levels of administrative regions within Austria: first the "Bundesland", next the "Stadtteil", and last the "Gemeindebezirk".
urn:enhancement:entity-hierarchy-enhancement:id3
a fise:EntityAnnotation , fise:Enhancement ;
fise:confidence
"0.42163702845573425"^^xsd:double ;
fise:entity-label
"Vienna"^^xsd:string ;
fise:entity-reference
http://sws.geonames.org/2761367/ ;
fise:entity-type
geonames:Feature , dbpedia:Place , dbpedia:AdministrativeRegion , geonames:A.ADM1 ;
fise:extracted-from
urn:content-item:id1 ;
dc:requires
urn:enhancement:entity-enhancement:id1 .
urn:enhancement:entity-hierarchy-enhancement:id4
a fise:EntityAnnotation , fise:Enhancement ;
fise:confidence
"0.42163702845573425"^^xsd:double ;
fise:entity-label
"Politischer Bezirk Wien (Stadt)"^^xsd:string ;
fise:entity-reference
http://sws.geonames.org/2761333/ ;
fise:entity-type
geonames:Feature , dbpedia:Place , dbpedia:AdministrativeRegion , geonames:A.ADM2 ;
fise:extracted-from
urn:content-item:id1 ;
dc:requires
urn:enhancement:entity-enhancement:id1 .
urn:enhancement:entity-hierarchy-enhancement:id5
a fise:EntityAnnotation , fise:Enhancement ;
fise:confidence
"0.42163702845573425"^^xsd:double ;
fise:entity-label
"Gemeindebezirk Innere Stadt"^^xsd:string ;
fise:entity-reference
http://sws.geonames.org/2775259/ ;
fise:entity-type
geonames:Feature , dbpedia:Place , dbpedia:AdministrativeRegion , geonames:A.ADM3 ;
fise:extracted-from
urn:content-item:id1 ;
dc:requires
urn:enhancement:entity-enhancement:id1 .
The last two hierarchy levels are no longer valid for the meaning of "Vienna" as selected by the TextAnnotation. They are added because the geonames.org dataset places the Feature of a city exactly at its center. However, if the TextAnnotation described a precise address, such hierarchy levels would make perfect sense.
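The hierarchy lookup described in this section might be sketched as follows in Python. Several things here are assumptions rather than statements about the engine: the request URL is the public geonames.org "hierarchyJSON" endpoint with its geonameId and username parameters (this page does not say which endpoint or format the engine calls), the sample response is hand-written and abridged, and the helper names hierarchy_url and hierarchy_entries are hypothetical. The sketch also shows how a type such as geonames:L.CONT can be composed from a feature class and a feature code.

```python
from typing import Any, Dict, List
from urllib.parse import urlencode


def hierarchy_url(geoname_id: int, username: str) -> str:
    """Build a request URL for the geonames.org hierarchy service (assumed endpoint)."""
    query = urlencode({"geonameId": geoname_id, "username": username})
    return "http://api.geonames.org/hierarchyJSON?" + query


# Hand-written, abridged sample in the shape assumed for the JSON response.
SAMPLE_RESPONSE: Dict[str, Any] = {
    "geonames": [
        {"name": "Europe", "geonameId": 6255148, "fcl": "L", "fcode": "CONT"},
        {"name": "Austria", "geonameId": 2782113, "fcl": "A", "fcode": "PCLI"},
        {"name": "Vienna", "geonameId": 2761369, "fcl": "P", "fcode": "PPLC"},
    ]
}


def hierarchy_entries(response: Dict[str, Any], geoname_id: int) -> List[Dict[str, str]]:
    """Turn a hierarchy response into label/reference/type records for the parent regions."""
    entries: List[Dict[str, str]] = []
    for item in response.get("geonames", []):
        if item["geonameId"] == geoname_id:
            continue  # skip the suggested feature itself; only its parents are wanted
        entries.append({
            "fise:entity-label": item["name"],
            "fise:entity-reference": "http://sws.geonames.org/%d/" % item["geonameId"],
            # Feature class and feature code joined with a dot, e.g. geonames:L.CONT
            "geonames-type": "geonames:%s.%s" % (item["fcl"], item["fcode"]),
        })
    return entries


if __name__ == "__main__":
    print(hierarchy_url(2761369, "demo"))
    for entry in hierarchy_entries(SAMPLE_RESPONSE, 2761369):
        print(entry)
```

Each returned record would still need a fise:confidence value and a dc:requires link to the suggestion before it resembles the enhancements shown above; how the engine derives these is not covered by this sketch.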
Configuration
The LocationEnhancementEngine currently provides three configuration properties:
- min-score (default = 0.5): The minimum score (confidence) required for entity suggestions.
- max-location-enhancements (default = 5): The maximum number of entity suggestions added, regardless of whether there are more results with a score > min-score.
- min-hierarchy-score (default = 0.8): The minimum score (confidence) required for hierarchy enhancements to be added for a suggested entity. To add hierarchy enhancements for all suggested entities, min-hierarchy-score needs to be set to a value less than or equal to min-score.
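How the three properties interact can be sketched in Python as below. This is only an illustration under assumptions: the function name select_suggestions is hypothetical, candidates are modelled as (reference, score) pairs, and both thresholds are treated as inclusive (score >= threshold), which this page does not state explicitly.

```python
from typing import List, Tuple

Candidate = Tuple[str, float]  # (entity reference, confidence score)

# Scores taken from the Vienna example on this page.
VIENNA_CANDIDATES: List[Candidate] = [
    ("http://sws.geonames.org/2761369/", 1.0),
    ("http://sws.geonames.org/4496671/", 0.42163702845573425),
    ("http://sws.geonames.org/4825976/", 0.42163702845573425),
]


def select_suggestions(
    candidates: List[Candidate],
    min_score: float = 0.5,
    max_location_enhancements: int = 5,
    min_hierarchy_score: float = 0.8,
) -> Tuple[List[Candidate], List[str]]:
    """Return the kept suggestions and the references that also get hierarchy enhancements."""
    # Best candidates first; sorted() is stable, so ties keep their input order.
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    # Drop low-scoring candidates, then cap the number of suggestions.
    kept = [c for c in ranked if c[1] >= min_score][:max_location_enhancements]
    # Only sufficiently confident suggestions receive hierarchy enhancements.
    with_hierarchy = [ref for ref, score in kept if score >= min_hierarchy_score]
    return kept, with_hierarchy


if __name__ == "__main__":
    print(select_suggestions(VIENNA_CANDIDATES))
    print(select_suggestions(VIENNA_CANDIDATES, min_score=0.4))
```

With the defaults, only the Austrian capital passes min-score in this sketch; lowering min-score to 0.4 keeps all three suggestions, while hierarchy enhancements are still added only for the capital. Setting min-hierarchy-score to 0.4 as well would add them for all three.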

