Knowledge Graphs – Part III: Build a Knowledge Graph in Ten Steps


Read - Part I: What is a Knowledge Graph?
Read - Part II: What are Knowledge Graphs Good For? 

The first two articles in this series introduced the concept of a knowledge graph and explained some key use cases. This entry offers more practical advice. Many guides exist to teach engineers how to build knowledge graphs. They cover the nitty-gritty details of query languages, ontology design, and data integration. What they don’t explain is what it takes to create a successful knowledge graph at an organizational level. Today, I intend to fill that void.

Consider this article the businessperson’s how-to guide to knowledge graphs. I will explain in 10 steps the people, processes, and technology needed to construct an effective knowledge graph. Although these steps form a loose sequence, ultimately building a knowledge graph is an iterative process. You may need to cycle between a few steps multiple times before proceeding to the next.

Step 1. Confirm that you need a knowledge graph

Every knowledge graph should start with a business use case. Let the need drive the solution, not the other way around. Knowledge graphs have many applications, but not every use case requires one. At the end of my last article, I proposed four questions that can be used to determine whether a particular use case would benefit from a knowledge graph:

  1. Do the entities I want to know about naturally form networks, i.e., is the data graph-y?

  2. Does this use case require linking metadata or additional context to the data itself?

  3. Are there insights in my data that will only emerge if it's interlinked?

  4. Are the relationships between the data just as important as the data itself?

If the answer to any of the questions is yes, proceed to the next step.
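If "graph-y" feels abstract, a tiny sketch may help. The Python below is only an illustration: the names and facts are invented, and a real graph would live in a graph database rather than a list. It stores a few relationships as (subject, relationship, object) triples and then finds the chain of connections between two entities, which is the kind of insight that only appears once the data is interlinked.

```python
from collections import defaultdict, deque

# Hypothetical facts, invented for illustration: (subject, relationship, object).
TRIPLES = [
    ("Alice", "purchased", "Laptop"),
    ("Bob", "purchased", "Laptop"),
    ("Bob", "works_for", "Acme"),
    ("Laptop", "made_by", "Contoso"),
]


def build_adjacency(triples):
    """Index every entity by its neighbors, ignoring edge direction."""
    neighbors = defaultdict(set)
    for subject, _relationship, obj in triples:
        neighbors[subject].add(obj)
        neighbors[obj].add(subject)
    return neighbors


def shortest_path(triples, start, goal):
    """Breadth-first search for the shortest chain of entities from start to goal."""
    neighbors = build_adjacency(triples)
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in sorted(neighbors[path[-1]]):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # the two entities are not connected


if __name__ == "__main__":
    print(shortest_path(TRIPLES, "Alice", "Acme"))
```

If the questions you care about look like "how is Alice connected to Acme?", the relationships matter as much as the records, and a graph is probably a good fit.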

Step 2. Decide on a minimum viable product

Many knowledge graph initiatives fail because the initial vision is too ambitious. Yes, the ultimate goal might be to link all of your organization’s data in a single enterprise knowledge graph, but that’s not where to start. Work with end-users to consider what the graph needs to do to solve the business problem you selected, and decide how small the graph could be while still accomplishing the objective. Strictly limit the scope of your graph to that minimum viable level, and prove its value before trying to expand.

Step 3. Determine what information needs to be modeled

Great, now through conversation with business stakeholders you know what the graph needs to do. The next step is to figure out what information it must contain to accomplish that goal. Consulting with domain experts is critical at this stage. Only they know the kinds of data most relevant to the use case. Ask what data they look for most often, what features drive their decision making, and what they’ll need in the graph to do their jobs. Ultimately, the knowledge graph will need to mirror how they already visualize the information in their domain.

Step 4. Find out what work has already been done

Once you know what data you’ll need, see if you can benefit from someone else’s work. The knowledge graph community was founded on open-source principles, so public resources often exist to give you a jump start. These might include domain-specific, graph-compatible data sets, industry standards, or pre-defined ontologies (more on that in Step 7). Don’t reinvent the wheel if you don’t have to. Researching available resources before you get started will save enormous amounts of time.

Step 5. Select people to build the graph

At a minimum, building a knowledge graph means filling two key roles. If this is your first knowledge graph, the odds are high no one on your team has specific knowledge graph expertise. Although knowledge graphs have become increasingly mainstream, knowledge of the relevant languages and technologies remains somewhat rare.

The first persona needed for a successful knowledge graph initiative is a designer. They need to be an expert in knowledge management and ontology development. This person will decide how to organize the data in the graph. They will create the theoretical framework over which your data will be laid. People with this expertise often come from the world of library science, which has been thinking about taxonomies, ontologies, and other knowledge management concepts for longer than any other discipline.

The second role is an engineer. This person will be responsible for realizing the designer’s vision. They need to be fluent in SPARQL (for RDF graphs) or one of the many property graph languages such as Gremlin, GSQL, or Cypher. They should understand the technical differences between graph databases and be familiar with their user interfaces.

Step 6. Choose the appropriate tools

With your team assembled, it’s time to choose your tech. At minimum you will need:

  • A data store, ideally a graph database that’s already optimized for graph data structures

  • An ontology management tool, which helps designers create and manipulate ontologies

Often dedicated knowledge graph platforms will bundle these capabilities, combining a storage solution with the tools needed to actually construct the graph. You can also take a best-of-breed approach and purchase stand-alone solutions as long as they integrate with one another. Select tools that complement the experience and expertise of your team. If they’re already comfortable with the tools you select, you’ll realize faster time to value.

Step 7. Design an ontology 

Your knowledge graph designer should work closely with domain experts and business users on this step. An ontology is the basic framework of the graph. It defines how concepts relate to one another. Unless this piece is spot on, the graph will be a pain to use. It should reflect the understanding of end-users so that it is intuitive for those who aren’t ontology experts. Once the ontology exists at a theoretical level, the engineer will encode it into the graph.
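For readers who want to see what "defining how concepts relate" can look like, here is a deliberately small sketch in Python. It is not how any particular ontology tool stores things (real ontologies are usually written in a standard such as OWL or RDFS), and every class and relationship name is hypothetical. It lists the kinds of entities, the relationships allowed between them, and renders each relationship as a plain-language sentence that a domain expert could review.

```python
# Hypothetical ontology: entity classes and the relationships allowed between them.
ONTOLOGY = {
    "classes": {"Customer", "Product", "Supplier"},
    "relationships": {
        # relationship name: (class of the subject, class of the object)
        "purchased": ("Customer", "Product"),
        "supplied_by": ("Product", "Supplier"),
    },
}


def check_ontology(ontology):
    """Return a list of problems; an empty list means every relationship uses defined classes."""
    problems = []
    for name, (subject_class, object_class) in ontology["relationships"].items():
        for cls in (subject_class, object_class):
            if cls not in ontology["classes"]:
                problems.append(f"{name} refers to undefined class {cls}")
    return problems


def describe(ontology):
    """Render each relationship as a sentence for review with domain experts."""
    return [
        f"A {subject_class} can be linked to a {object_class} by '{name}'."
        for name, (subject_class, object_class) in sorted(ontology["relationships"].items())
    ]


if __name__ == "__main__":
    print(check_ontology(ONTOLOGY))
    for sentence in describe(ONTOLOGY):
        print(sentence)
```

If the sentences that come out don't match how business users talk about their domain, that is a sign the ontology needs another iteration before the engineer encodes it.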

Step 8. Populate the graph

When the ontology is ready, the engineer should add the actual data to the graph. Although simple from a conceptual standpoint, this step can be very time- and labor-intensive. This is the piece most other how-to guides focus on.
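As a rough illustration of why this step takes effort, the following Python sketch loads a small, made-up CSV export into nodes and edges. The column names, the `purchased` relationship, and the sample rows are all assumptions for the example; a real pipeline would write to a graph database and handle far messier data. The point is that every source row has to be mapped onto the ontology, and incomplete rows have to be caught rather than silently loaded.

```python
import csv
import io

# Hypothetical ontology fragment: each relationship maps to (subject class, object class).
RELATIONSHIPS = {"purchased": ("Customer", "Product")}

# Hypothetical source export; in practice this would come from an operational system.
SOURCE_CSV = """customer,product
Alice,Laptop
Bob,Laptop
,Monitor
"""


def load_purchases(csv_text):
    """Turn source rows into typed nodes and 'purchased' edges, setting aside incomplete rows."""
    subject_class, object_class = RELATIONSHIPS["purchased"]
    nodes, edges, rejected = {}, set(), []
    for row in csv.DictReader(io.StringIO(csv_text)):
        customer, product = row["customer"].strip(), row["product"].strip()
        if not customer or not product:
            rejected.append(row)  # keep bad rows for human review
            continue
        nodes[customer] = subject_class
        nodes[product] = object_class
        edges.add((customer, "purchased", product))
    return nodes, edges, rejected


if __name__ == "__main__":
    nodes, edges, rejected = load_purchases(SOURCE_CSV)
    print(len(nodes), "nodes,", len(edges), "edges,", len(rejected), "rejected rows")
```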

Step 9. Integrate the graph with other systems and processes

Congratulations! You now have a graph. But a graph in isolation is useless. Remember, your end-user is a businessperson, not the knowledge graph engineer. Once the graph exists, the engineer needs to integrate it with business systems so users can actually take advantage of it. Depending on the use case, the graph should connect to business intelligence tools, customer relationship management systems, enterprise resource planning platforms, and/or content management solutions.
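One simple form of integration is to flatten an answer from the graph into a table that a business intelligence tool can read. The Python sketch below assumes a handful of invented `purchased` edges and writes a per-product count as CSV text. It is only meant to show the shape of the hand-off; many graph platforms offer connectors or APIs that would replace this kind of manual export.

```python
import csv
import io

# Hypothetical edges, invented for illustration: (subject, relationship, object).
EDGES = [
    ("Alice", "purchased", "Laptop"),
    ("Bob", "purchased", "Laptop"),
    ("Bob", "purchased", "Monitor"),
]


def purchases_per_product(edges):
    """Count 'purchased' edges pointing at each product."""
    counts = {}
    for _subject, relationship, obj in edges:
        if relationship == "purchased":
            counts[obj] = counts.get(obj, 0) + 1
    return counts


def to_csv(counts):
    """Serialize the counts as CSV text with a header row, sorted by product name."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["product", "purchases"])
    for product in sorted(counts):
        writer.writerow([product, counts[product]])
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(purchases_per_product(EDGES)))
```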

Step 10. Check that the business need is met

Does the graph accomplish what it needs to?

  1. If not, determine why not and repeat the relevant steps.

Diagnose what went wrong and try again. This time it won’t be from scratch, though. Try to figure out at which step things went wrong, then go back and iterate. The work you’ve already done should give you a significant boost.

  2. If so, celebrate, then look for other business use cases that could benefit from the graph or extensions of it, and repeat the steps from the beginning.

You did it! You built a successful knowledge graph! Since you’ve proven the value of the approach, it’s time to see what other business problems you can solve. Some may only require small tweaks to your existing graph; others could warrant significant expansions. Just make sure you don’t get ahead of yourself, and keep it grounded in what the business needs.

I’ll be back next month with the final installment of this series, in which I will explore the trajectory of the graph industry. Hint: whether you build your own knowledge graph or not, you’ll soon be using them under the hood of other tools.

Joe Hilleary
