Teaching AI to Think Like an Expert: Knowledge Graphs and RAG in Climate Networks


A global climate funding program needed a more strategic way to capture and take up learnings. As part of a broader knowledge management and digital transformation effort, we had the chance to pilot AI-powered approaches to accelerate the discovery and sharing of insights from project reports throughout the stakeholder network.

The more information we get, the less knowledge we have

We assessed three persistent knowledge management challenges across the stakeholder ecosystem.

  1. Structuring knowledge. How do we organise knowledge so it can be easily found and reused in the stakeholder network?
  2. Finding learnings. Is the knowledge I need already available somewhere in the stakeholder network?
  3. Locating expertise. Who holds the relevant knowledge in the stakeholder network?

Structuring context for document intelligence: a practical perspective

Professional Generative AI use stands or falls on one thing: can experts rely on it? The deeper one moves into expert knowledge, the more important it becomes to carefully structure and provide the right context to an LLM. In a test together with monitoring and evaluation (M&E) experts — covering around 250 project evaluation reports — we explored how knowledge graphs and Retrieval-Augmented Generation (RAG) can become core skills of AI agents: finding the right knowledge, in the right documents, for the right task and the right people.

Phase 1 – Conceptual Modeling and Domain Scoping

We began by collecting and analyzing strategic documentation related to funding priorities, project life-cycle processes, and internal tagging systems. Existing glossaries and keyword lists used by various business units were reviewed and consolidated into a preliminary conceptual map, which was refined in consultation with senior management and domain leads (Monitoring & Evaluation and Knowledge Management units). A selected fragment of this model was then formalized as the starting point for terminology development.

The image shows a sneak peek of the board modelling the knowledge sources gathered from the analysis and review of the strategic documentation.
A part of the concept model realized after review of strategic documentation.

Phase 2 – Formalization of Terminology using SKOS

Screenshot of the Protégé user interface taken during SKOS vocabulary creation.
Terminology creation in Protégé.

The team selected the Simple Knowledge Organization System (SKOS) standard for formalizing the fragment. The terminology was first developed using Protégé and then imported into Skosmos to facilitate interactive refinement with domain experts.
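To give a feel for what such a formalization looks like, here is a minimal SKOS concept in Turtle serialization. The namespace, URIs, and labels are hypothetical placeholders for illustration, not the pilot's actual vocabulary; only the `skos:` properties are standard.

```
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:   <https://example.org/climate-terms/> .  # hypothetical namespace

ex:energyEfficiency a skos:Concept ;
    skos:prefLabel  "Energy Efficiency"@en ;
    skos:altLabel   "Efficient energy use"@en ;
    skos:broader    ex:energy ;
    skos:definition "Using less energy to deliver the same service."@en ;
    skos:inScheme   ex:climateThesaurus .
```

Tools like Protégé and Skosmos operate directly on this kind of representation, which is what makes the later integration steps possible.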

Screenshot of the terminology as rendered by Skosmos focusing on the Energy Efficiency concept.
Terminology browsed in Skosmos.

Phase 3 – Thesaurus Integration and Mapping

A domain-specific thesaurus was developed from scratch and validated in workshops with experts. To enhance interoperability and avoid duplication, we integrated two publicly available SKOS thesauri commonly used in the domain, the SDG Thesaurus and the CRS/OECD Thesaurus. Mappings across the three thesauri were established by processing glossaries used by domain experts.
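The cross-thesaurus links rely on the standard SKOS mapping properties (`skos:exactMatch`, `skos:closeMatch`, etc.). A hedged sketch of what such a mapping might look like in Turtle; all URIs below are illustrative placeholders, not the identifiers actually used in the SDG or CRS/OECD thesauri:

```
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:   <https://example.org/climate-terms/> .   # hypothetical URIs
@prefix sdgt: <https://example.org/sdg-thesaurus/> .
@prefix crs:  <https://example.org/crs-thesaurus/> .

ex:energyEfficiency
    skos:exactMatch sdgt:target7-3 ;        # same concept in the SDG Thesaurus
    skos:closeMatch crs:purposeCode23183 .  # near-equivalent CRS purpose code
```

Because these are plain triples, the three thesauri stay independently maintainable while still forming one navigable graph.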

Screenshot of the CRS terminology as rendered by Skosmos.
CRS terminology.
Screenshot of the SDG terminology as rendered by Skosmos.
SDG terminology.

Phase 4 – Augmenting Information Retrieval with Automatic Classification

We needed a way to connect document parts (i.e. chunks) to the vocabularies. We adopted SDGBert, a fine-tuned bert-base-uncased model trained on the OSDG Community Dataset (32,000+ labeled entries) and deployed with the Hugging Face Text-Embeddings-Inference (TEI) toolkit.

The classification pipeline included:

  1. Splitting documents into textual chunks.
  2. Assigning each chunk to one or more SDGs based on a probability threshold (>0.5).
  3. Aggregating chunk-level predictions to assign document-level SDG relevance scores.
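The steps above can be sketched in a few lines of Python. This is a simplified illustration of the thresholding and aggregation logic only: `document_scores` treats a document's relevance to an SDG as the share of its chunks passing the threshold (one plausible aggregation, not necessarily the exact one used in the pilot), and the per-chunk probabilities stand in for real SDGBert outputs served via TEI.

```python
THRESHOLD = 0.5  # probability cutoff for assigning an SDG to a chunk

def assign_sdgs(chunk_probs: dict[str, float]) -> list[str]:
    """Keep every SDG whose predicted probability exceeds the threshold."""
    return [sdg for sdg, p in chunk_probs.items() if p > THRESHOLD]

def document_scores(chunks: list[dict[str, float]]) -> dict[str, float]:
    """Aggregate chunk-level predictions into document-level relevance:
    the fraction of chunks in which each SDG passes the threshold."""
    counts: dict[str, int] = {}
    for probs in chunks:
        for sdg in assign_sdgs(probs):
            counts[sdg] = counts.get(sdg, 0) + 1
    return {sdg: n / len(chunks) for sdg, n in counts.items()}

# Hypothetical model outputs for three chunks of one document
chunks = [
    {"SDG7": 0.91, "SDG13": 0.62},
    {"SDG7": 0.40, "SDG13": 0.71},
    {"SDG7": 0.88, "SDG13": 0.30},
]
print(document_scores(chunks))  # each SDG passes the threshold in 2 of 3 chunks
```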

The knowledge graph derived from the mappings between the three vocabularies was then used to enrich the exploration of project evaluation reports and question-answering sessions with contextual information.
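One way this enrichment can work, sketched here under stated assumptions: a chunk tagged with a concept from one vocabulary is expanded with its mapped counterparts from the others, and the expanded tag set then feeds retrieval as extra keywords or filters. The mapping table and concept identifiers below are hypothetical, not the pilot's actual data.

```python
# Hypothetical cross-vocabulary mapping table (SDG target -> CRS purpose codes)
MAPPINGS: dict[str, list[str]] = {
    "sdg:7.3": ["crs:23183"],
    "sdg:13.2": ["crs:41010"],
}

def expand_tags(tags: list[str]) -> list[str]:
    """Return the original tags plus every mapped concept, deduplicated
    while preserving order, for use as retrieval filters/keywords."""
    expanded = list(tags)
    for tag in tags:
        for mapped in MAPPINGS.get(tag, []):
            if mapped not in expanded:
                expanded.append(mapped)
    return expanded

print(expand_tags(["sdg:7.3"]))  # ['sdg:7.3', 'crs:23183']
```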

The pilot's technology stack was a Laravel (PHP) application backed by MariaDB, which let users browse documents. Search for the RAG component was powered by Meilisearch, while the relations forming the knowledge graph were resolved from the SKOS files using EasyRDF, the same library that powers Skosmos. Skosmos itself was used for terminology navigation and editing.

Phase 5 – Demonstration and Results

A set of 250 documents selected by various stakeholders was processed using the pipeline. Key outcomes included:

  • Significant reduction in time required for document organization and classification.
  • Enhanced multilingual tagging and concept clarity through the SKOS-based terminology.
  • Technical feasibility of integrating controlled terminologies and trained models to enhance knowledge discovery across complex organizational networks.

Takeaways and Learnings

  • The Primacy of Context: This pilot suggests that AI applications for knowledge management within expert networks require rigorous data preparation and the ability to frame user knowledge requests within ‘meaningful’, AI-ready contexts.
  • The Agentic Framework: The combined use of interoperable terminologies (SKOS) and Retrieval-Augmented Generation (RAG) proved to be a promising skill set for agents, enabling them to assist experts in identifying and connecting relevant knowledge across fragmented, multilingual, and multi-organizational document landscapes.

From a technical standpoint, these approaches can be highly effective for knowledge workers—but their impact ultimately depends on how the organizational context adapts around them. Close collaboration between M&E and KM experts and technical teams, as well as awareness and endorsement from top management, is essential to make AI-driven workflows work in practice. This is a broader topic that goes beyond this post and warrants its own discussion.

AI Transparency

Human · AI Assisted

The content was produced by humans with AI providing minor help (e.g. grammar, translation) or generated segments (e.g. rephrasing or structuring) integrated by the author.