Semantic ontologies in KoBo

Dear KoBo Community,

We are working on a project that aims to make collected data reusable beyond the context of the project it was collected for. We use semantic ontologies to bridge the gap between otherwise isolated datasets: the data collected with our implementation is semantically annotated using established ontologies such as OBOE and OM2. We would like to add this functionality to KoBoToolbox, where it would also benefit the framework as a whole by making it easier to include semantic information. Please let us know the best way to implement it. We see three options: work with the KoBo community to implement it in the core KoBo framework; develop a plugin mechanism that would let us achieve the same goal; or, as a last resort, fork the KoBo repos and implement the functionality independently.

The example below shows what the exported data looks like in XML format. While creating a form in the form builder, each question in the form is semantically annotated. This semantic information is saved along with the form. When the final data is exported, this semantic information is also attached to the data.

The example export below has two elements: name and height. The name is just a string. The height element is numeric and carries ‘length’ as its semantic property: its characteristic attribute references the OBOE ontology to declare that the field is a ‘length’, and its unit attribute references the OM2 ontology to declare that the value (158) is given in ‘centimeters’.

<root xmlns="http://example.org">
	<dataset>
		<row>
			<name characteristic="Identifier" unit="None" creator="Khan">Khan</name>
			
			<height characteristic="http://ecoinformatics.org/oboe/oboe.1.2/oboe-characteristics.owl#Length" unit="http://www.ontology-of-units-of-measure.org/resource/om-2/centimetre" creator="Khan">158</height>
		</row>
	</dataset>
</root>
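
As a rough illustration of how a consumer of such an export might read the annotations back, here is a minimal Python sketch using only the standard library. The function name `read_rows` and the dictionary layout are assumptions made for this example; they are not part of KoBo, ODK, or our fork. The sample is the export shown above, including its default namespace.

```python
import xml.etree.ElementTree as ET

# The sample export from this post, embedded so the sketch is self-contained.
EXPORT = """<root xmlns="http://example.org">
  <dataset>
    <row>
      <name characteristic="Identifier" unit="None" creator="Khan">Khan</name>
      <height characteristic="http://ecoinformatics.org/oboe/oboe.1.2/oboe-characteristics.owl#Length" unit="http://www.ontology-of-units-of-measure.org/resource/om-2/centimetre" creator="Khan">158</height>
    </row>
  </dataset>
</root>"""

# The export declares a default namespace, so lookups must be qualified.
NS = {"ex": "http://example.org"}


def read_rows(xml_text):
    """Return one dict per <row>, keeping each value with its annotations.

    Hypothetical helper for illustration only.
    """
    root = ET.fromstring(xml_text)
    rows = []
    for row in root.findall("ex:dataset/ex:row", NS):
        record = {}
        for field in row:
            # ElementTree reports tags as "{namespace}local"; keep the local part.
            local_name = field.tag.split("}", 1)[-1]
            record[local_name] = {
                "value": field.text,
                "characteristic": field.get("characteristic"),
                "unit": field.get("unit"),
            }
        rows.append(record)
    return rows


rows = read_rows(EXPORT)
print(rows[0]["height"])
```

Because the characteristic and unit travel with every value, a downstream tool can decide how to interpret or convert a field without needing the original form definition.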

This semantically annotated export can make survey results easier to discover and reuse. We have already implemented this functionality in a fork of the ODK framework. Please find the extended repos here: Semantic Data Collection · GitLab

I would like to ask the KoBo community and developers to help us figure out the most efficient way forward for implementing this functionality.

@khansaifmohd93, please note that KoBoToolbox is currently working on dynamic data updating, which should address your query. We will let you know when it is available.

Hi,

I’d be really interested in this functionality, and already sort of ‘hack it together’ in my forms by including a specific ontology link in the ‘description’ section of questions in my library. Not ideal, but better than nothing.

Ideally, I would have at least two semantically annotated fields:

  1. The measurement itself (e.g. plant height, crop growth stage, disease score, etc - can you tell I work in agriculture? :slight_smile: )
  2. The units for the measurement (mm, cm, etc) which I can also add to my library. For example, I measure many things in mm…

The third thing which would be the cherry on the cake would be an ability to link a question to a DOI so I can refer to a published method. For example, there are many ways of measuring crop growth stage and disease score above, and I have questions in my library for all of them - each with their own individual row names…

Just a few thoughts. Hope this helps… Is there any way I can follow your progress?

Thanks,

Aislinn

Welcome to the community, @aislinnpearson! We will keep you updated as this takes shape.

Hey @Kal_Lam,

Can you please explain a little about the dynamic data updating that you have been working on? Or you can also point me to a document related to the development and implementation. I want to know some details and check if that is really what we want here.

Regards
Saif

@khansaifmohd93, you could follow it through this GitHub issue:

Hey @Kal_Lam,

Thanks for the prompt reply. Dynamic data attachments are about reusing questions among different forms, but our requirement is a bit different. We want to semantically annotate the data and make that data reusable in different scenarios.

For example, take an attribute “weight”. It can be semantically annotated by setting its characteristic to ‘Weight’ and recording the unit information on the same attribute. Please refer to the XML snippet in the original post for a better understanding. The data can then be exported in various formats such as CSV, XML, and TTL while preserving the semantic annotations.
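
To make the multi-format idea more concrete, here is a minimal Python sketch of one way an annotated record could be written out as CSV and as Turtle. It reuses the height field from the original post, since the IRIs for it are given there. Everything else is an assumption for illustration: the function names `to_csv` and `to_turtle`, the `_characteristic` and `_unit` column suffixes, and the `ex:` vocabulary are hypothetical and do not reflect any existing KoBo or ODK export format.

```python
import csv
import io

# IRIs taken from the XML example in the original post.
LENGTH = "http://ecoinformatics.org/oboe/oboe.1.2/oboe-characteristics.owl#Length"
CENTIMETRE = "http://www.ontology-of-units-of-measure.org/resource/om-2/centimetre"

records = [
    {"height": {"value": "158", "characteristic": LENGTH, "unit": CENTIMETRE}},
]


def to_csv(records):
    """Flatten each field into three columns: value, characteristic, unit."""
    fields = sorted({name for rec in records for name in rec})
    header = []
    for name in fields:
        # Hypothetical column naming convention.
        header += [name, f"{name}_characteristic", f"{name}_unit"]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    for rec in records:
        row = []
        for name in fields:
            cell = rec.get(name, {})
            row += [
                cell.get("value", ""),
                cell.get("characteristic", ""),
                cell.get("unit", ""),
            ]
        writer.writerow(row)
    return out.getvalue()


def to_turtle(records):
    """Emit one blank-node measurement per field, using a made-up ex: vocabulary."""
    lines = ["@prefix ex: <http://example.org/> ."]
    for i, rec in enumerate(records, start=1):
        for name, cell in rec.items():
            lines.append(
                f'ex:row{i} ex:{name} [ ex:value "{cell["value"]}" ; '
                f'ex:characteristic <{cell["characteristic"]}> ; '
                f'ex:unit <{cell["unit"]}> ] .'
            )
    return "\n".join(lines) + "\n"


csv_text = to_csv(records)
ttl_text = to_turtle(records)
print(csv_text)
print(ttl_text)
```

The point of the sketch is only that the same annotations can follow the value into each format; a real implementation would presumably map to OBOE and OM2 terms directly rather than to placeholder predicates.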

Please let me know if this kind of functionality is in the pipeline; otherwise, please advise us on how we could proceed with implementing it.

Hi @khansaifmohd93, as far as I know, this kind of functionality is not in the pipeline. However, if this is a feature that will benefit the wider community, then we can certainly look at integrating it in the future. I’ll bring in @jnm here for further comments on that.

You are of course welcome to develop this feature independently and submit a PR that we can review — or even just run a local version of kobo with your custom changes if they are unlikely to be merged into master.

It’s not in the pipeline, but I’ll pull in @tinok and @ig_rebollo. With any new feature like this, we need to understand:

  1. Who is going to do the work? Ultimately, unless you maintain a fork—which is nobody’s preference—the core team will have to review the changes and merge them into the main KoBo repositories, but perhaps most of the UX and coding could be done elsewhere. This depends on what resources are at your disposal, e.g. design/technical labor vs. funding. The first task is clarifying the requirements and estimating; then, there’s the further work of fine-tuning UX design, doing the software development, and testing.

  2. What’s the user experience going to be like? I think this summary gets the basic idea through clearly enough:

    While creating a form in the form builder, each question in the form is semantically annotated. This semantic information is saved along with the form. When the final data is exported, this semantic information is also attached to the data.

    …but we have to work through the details of how this will be presented in the form builder, and how it can be made useful to the largest audience. Can we allow for arbitrary tagging of questions? Can we take this opportunity to improve our treatment of HXL, which currently can only be specified in XLSForm? What are the exports going to look like: is this for XML exports only? (Currently, we don’t really have an XML export mechanism the way we do for XLSX or CSV; we basically just cough up the XML submissions as they were received from the client, i.e. Enketo or Collect).

  3. How do we build it, and how long will it take? Which code changes are needed and where? Does this affect the shared XLSForm or XForm specifications that are maintained by the larger ODK community that extends beyond KoBo?

These are questions we’d answer together, but perhaps, @khansaifmohd93, you could give an idea of your thoughts regarding (1) to begin. This is the best place to discuss the details of any new feature, but if you’d like a more private venue to discuss the business aspects, then you are welcome to write to info@kobotoolbox.org.

I have emailed a document to info@kobotoolbox.org that outlines our organizational structure and summarizes the requirements of the functionality we want to develop.
I would ask the KoBo team to take a look at the document, which might help to explain our development plans. Although we don’t have direct funding, we will contribute to requirements management and code development. We will need some insight and expertise from KoBo developers while developing and testing the code. It will be a valuable contribution that benefits the community as a whole.
We would like to discuss the development in an online meeting, and we can provide the meeting platform as well. Please let us know your preferred time and platform for the discussion.
