Speed up your algorithms (python)

Run Forrest! Run!

Overview

This is the last tutorial! When implementing algorithms, we all know that scaling matters and that performance can become a bottleneck. Furthermore, when an external knowledge base is reached through a communication layer, the amount of data to transfer can become an issue. In this respect, using strings to interact with Ontologenius is not the most suitable way to get the best possible performance. This is why Ontologenius offers an interface based on integer indexes! Beyond reducing the amount of data to transfer, it provides a significant gain in Ontologenius' internal algorithms and lets you do the same on your side.

The mini-project for this tutorial is to adapt the algorithm of the second tutorial to use indexes.

The ontology manipulator

Until now, we have used an OntologyManipulator to access the different methods for exploring the ontology, using the concepts' identifiers (strings).

To access the index version of the API, we will now use an OntologyManipulatorIndex. If it looks exactly like the original one, that is expected: the interface is the same (same member names and same function names), but all the exploration queries use indexes instead of identifiers, that is, integers instead of strings. All the index-based classes can be found at the bottom of the left sidebar.

In multi-instance mode, you still have to use an OntologiesManipulator, but instead of using the get method to extract an OntologyManipulator, you use the getIndex method to extract an OntologyManipulatorIndex.

Passing from identifiers to indexes

Before moving ahead, you may ask yourself "But how can I know the index of a concept?", and this is an excellent question...

Indexes are assigned by Ontologenius to every concept. An index is a kind of internal identifier that Ontologenius uses to ease the manipulation of concepts. As a consequence, there is no guarantee that a given concept keeps the same index from one run to another. It is thus important to first query Ontologenius for the indexes.

The only actual difference between an OntologyManipulator and an OntologyManipulatorIndex is the presence of a "conversion" member. This member is an instance of a ConversionClient, which allows you to pass from identifiers to indexes and back.

To better understand, let's take an easy example. Launch ontologenius with the launch file of the second tutorial and in a Python terminal execute the following:

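from ontologenius import OntologyManipulatorIndex

onto = OntologyManipulatorIndex()
# Close the ontology before sending exploration queries, as in the previous tutorials
onto.close()

# Identifier (string) -> index (integer) for the "path" class
path_index = onto.conversion.classesId2Index('path')
print(path_index)

# Query the individuals of that type using the index only
path_indexes = onto.individuals.getType(path_index)
print(path_indexes)

# Indexes -> identifiers, only to display the result
path_identifiers = onto.conversion.individualsIndexes2Ids(path_indexes)
print(path_identifiers)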

In this example we have asked for the index of the "path" concept, then we have queried all the individuals of this type, but using the index, and finally, we have converted the result into identifiers to display it to the user! Easy!

Literals and names

With indexes, there are two special cases: literals and natural-language names.

First of all, names in natural language do not have indexes and should thus be handled as strings, as usual.

In contrast, the literals used with data properties do have indexes associated with them. However, these indexes are negative! This makes it easy to distinguish literals from usual concepts.

So, if concepts have positive indexes and literals have negative ones, what about zero? Zero plays the same role as the empty string in the identifier-based queries: it means no answer.
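
The sign convention above can be checked without any call to Ontologenius. The snippet below is only a small sketch in plain Python: the helper names classify_index and split_results are hypothetical and are not part of the Ontologenius API, and the index values are made up for the illustration.

```python
def classify_index(index):
    """Classify an index from its sign (hypothetical helper, not an Ontologenius call)."""
    if index > 0:
        return "concept"   # positive: a usual concept
    if index < 0:
        return "literal"   # negative: a literal used with a data property
    return "none"          # zero: no answer, like the empty string


def split_results(indexes):
    """Separate concept indexes from literal indexes, dropping the zeros."""
    concepts = [i for i in indexes if i > 0]
    literals = [i for i in indexes if i < 0]
    return concepts, literals


# Made-up values standing for the result of some index-based query
mixed = [12, -3, 0, 7, -8]
concepts, literals = split_results(mixed)
print(concepts)
print(literals)
```

A simple comparison with zero is thus enough to tell the two kinds of results apart, with no extra query.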

Time to code!

It is now your turn to code. Take the final version of tutorial 2 and modify it to use indexes. A solution is provided on the next page of the tutorial.

While coding, keep in mind that conversions between identifiers and indexes should stay at the interface of your algorithm (initialisation and result display) and should be avoided inside it.

As Python is dynamically typed, the algorithm itself will not have to be modified...
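
To make this advice concrete, here is a minimal sketch of the pattern, runnable without Ontologenius. Everything in it is an assumption made for the illustration: ToyConversion is a hypothetical stand-in backed by a dictionary (it is not the ConversionClient), and the graph and the traversal are toy data, not the algorithm of tutorial 2.

```python
class ToyConversion:
    """Hypothetical stand-in for a conversion service, backed by plain dicts."""

    def __init__(self, identifiers):
        # Indexes start at 1 because 0 is reserved for "no answer"
        self._id_to_index = {name: i for i, name in enumerate(identifiers, start=1)}
        self._index_to_id = {i: name for name, i in self._id_to_index.items()}

    def id_to_index(self, identifier):
        return self._id_to_index.get(identifier, 0)

    def indexes_to_ids(self, indexes):
        return [self._index_to_id.get(i, "") for i in indexes]


def reachable(graph, start):
    """Depth-first traversal that only ever handles integer indexes."""
    seen = set()
    stack = [start]
    order = []
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        # Push neighbours in reverse so that the smallest index is visited first
        stack.extend(sorted(graph.get(node, ()), reverse=True))
    return order


conversion = ToyConversion(["a", "b", "c", "d"])
graph = {1: [2, 3], 2: [4], 3: [], 4: []}    # toy edges, expressed with indexes

start = conversion.id_to_index("a")           # conversion at initialisation
result = reachable(graph, start)              # no conversion inside the algorithm
print(conversion.indexes_to_ids(result))      # conversion for display only
```

The traversal never sees a string: the two conversions happen once before it and once after it, which is the structure to aim for when adapting your own code.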