How graph data models and databases can improve PLM software.
Throughout history, eras have been defined by market forces. During the 17th-century Dutch Golden Age, the worth of tulip bulbs skyrocketed to unprecedented levels. For decades, energy resources, particularly oil, dominated as some of the most influential assets. However, in 2017, The Economist posited a paradigm shift with their article, “The world’s most valuable resource is no longer oil, but data.” This assertion sparked widespread interest and intrigue about the value and power of data. While opinions vary, one thing remains clear: data possesses immense potential to shape the future.
In the manufacturing business, data remains a largely untapped goldmine. Industrial companies are sitting on gigantic amounts of data that represent products (design, production, supply chain), business activities (customers, sales, support and maintenance), product usage (a growing segment of connected products) and much more. Companies that can harness the full potential of this information, understanding both its value and its application, stand to gain immense rewards.
Harnessing the full potential of data is no trivial activity. Just as extracting oil requires a well-organized process and lifecycle, spanning from drilling to delivery for consumption, data follows a similar complex path. Data is collected across various silos, each corresponding to distinct activities and operations. It must then be transformed into a consumable format and integrated with other data sources to extract meaningful insights. Ultimately, this refined data should be accessible to end-users to aid them in their activities and decision-making processes.
PLM, Data Management and Databases
Let’s talk about the history of PLM data management technology and how it aligns with modern data management trends.
From Proprietary Databases to Structured Query Language (SQL) and Relational Database Management System (RDBMS)
Over the last 20 to 30 years, the PDM and PLM industry has come a long way in improving data management and developing scalable platforms. The data management architecture of these solutions goes back to a time when PDM/PLM developers didn’t trust and couldn’t rely on commercial database products. Early PDM and PLM systems therefore used proprietary solutions, building a variety of data stores from file formats, embedded databases and custom management tools. However, the end game of these experiments with proprietary data management tools was a switch to the industry standards adopted by large manufacturing companies. The decision was not only technical but also political. IT oversaw technology adoption in a company, and PDM/PLM needed to pass muster. That is easier to do when the system runs on top of industry-standard databases from vendors such as IBM, Oracle and Microsoft.
Cloud Was the Cambrian Explosion in Data Management
Over the course of the last 10 to 15 years, we have seen explosive growth in the variety of data management solutions and related technologies. It started with global web platforms and other cloud developments that separated the technology used to build a solution from the way enterprise software products are delivered. This change was the biggest contribution to the way companies started to manage product data and develop product lifecycle management systems to support business processes.
Massive development of a variety of data management solutions, databases, data processing tools, data storage, analytics, machine learning and others created an ecosystem contributing to new PLM system development. A combination of new databases and cloud contributed to the foundation of polyglot persistence data architectures that enabled the use of multiple databases at once.
Today, database and data management technology is going through a Cambrian explosion of different options and flavors. It is the result of the massive amount of data management development done over the last 20 years. The database is moving from “solution” into “toolbox” status. A single database (mostly RDBMS) is no longer the obvious choice for all development tasks. This raises the question of how different data management systems can be used efficiently to support the development of new PLM capabilities.
What is a Graph Data Model?
One very important data management trend is the development of graph models and graph databases. The graph data model is a structural representation of data where both entities and their relationships are highlighted. There are three important elements of a graph data model:
- Nodes (or vertices)
- Edges (or relationships)
- Properties
In this model, nodes represent entities or instances of entities. Edges represent connections between nodes, symbolizing relationships. Finally, properties are key-value pairs that can be attached to both nodes and edges to carry additional information, which gives the model a flexible data management structure.
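To make the three elements concrete, here is a minimal sketch in Python. The class names, identifiers and sample data (a bike assembly that uses two wheels) are assumptions made up for illustration; they are not taken from any particular PLM system or graph database.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Node:
    """An entity in the graph, for example a part or an assembly."""
    node_id: str
    label: str
    properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    """A directed relationship between two nodes."""
    source: str
    target: str
    rel_type: str
    properties: dict[str, Any] = field(default_factory=dict)


# Hypothetical example data: a bike assembly that uses two wheels.
nodes = {
    "A-100": Node("A-100", "Assembly", {"name": "Bike"}),
    "P-200": Node("P-200", "Part", {"name": "Wheel", "material": "aluminum"}),
}
edges = [Edge("A-100", "P-200", "USES", {"quantity": 2})]

# Properties live on both sides: the name on the nodes, the quantity on the edge.
for edge in edges:
    parent = nodes[edge.source].properties["name"]
    child = nodes[edge.target].properties["name"]
    print(f"{parent} -[{edge.rel_type} x{edge.properties['quantity']}]-> {child}")
```

Note that the quantity sits on the relationship rather than on either part, which is the kind of information a node-only model has nowhere natural to put.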
Although graphs have been known for many years, recent development of graph tools is very promising from a PLM development perspective. The model can evolve, allowing the addition of new relationships or properties. Unlike traditional data models, graph models prioritize operations that involve traversing through connections, like determining paths between nodes or querying connected entities.
This model is especially suited for scenarios where the relationships between entities are as crucial as the entities themselves, making it a fit for situations like product structure, configurations, data dependencies and flexible knowledge representations.
What is a Graph Database?
A graph database is a type of database that uses graph structures for semantic queries, with nodes, edges and properties used to represent and store data. Unlike traditional relational databases, which arrange data in tables, graph databases focus on the relationships between data and use the graph data model as the foundation of the data structure.
Because graph databases build on relationships and the mathematical foundations of graphs, they provide interesting capabilities that can be highly valuable in PLM. Here is a short list of examples:
- Flexible Schema: Unlike relational databases, which require a strict schema, graph databases are more flexible. This makes them more adaptable to evolving datasets.
- Relationship-centric: Graph databases excel at managing highly connected data. They are specifically designed to highlight and navigate intricate relationships in data.
- Performance: For tasks that involve traversing relationships (like social networks or recommendation engines), graph databases can be much faster than relational databases. Following a relationship from one node to the next costs roughly the same regardless of total data volume, so query time depends on the portion of the graph that is visited rather than on the size of the whole dataset.
- Intuitive Data Modeling: For many real-world problems, data models can be more naturally represented as graphs. Think about social networks, organizational hierarchies, or transportation networks.
- Real-time Insights: Given their ability to efficiently traverse many nodes and relationships, graph databases can provide near real-time insights, which are crucial for applications like fraud detection.
- Advanced Analytics: With graph algorithms, one can uncover patterns that are challenging to discern in other types of databases, such as shortest paths, cluster identification or recommendation paths.
- Integration of Diverse Data Sources: Given the flexible nature of graph databases, they are particularly well-suited to integrate data from different sources and of various types.
- Agility: Iterative development is easier with graph databases. As requirements evolve, making changes to the database schema, queries or the application layer can be more straightforward than with other database models.
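As an illustration of the "Advanced Analytics" item above, the following sketch finds a shortest path with a breadth-first search over a small in-memory graph. It uses plain Python rather than a graph database, and the node names and links are hypothetical. A real graph database would expose this kind of algorithm through its own query language or library.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Return the shortest list of nodes from start to goal, or None."""
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        current = path[-1]
        if current == goal:
            return path
        for neighbor in graph.get(current, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None


# Hypothetical dependency graph: each key points to the items it is linked to.
graph = {
    "Requirement-7": ["Assembly-A"],
    "Assembly-A": ["Part-1", "Part-2"],
    "Part-2": ["Supplier-X"],
    "Part-1": [],
    "Supplier-X": [],
}

print(shortest_path(graph, "Requirement-7", "Supplier-X"))
# ['Requirement-7', 'Assembly-A', 'Part-2', 'Supplier-X']
```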
How Graph Databases Accelerate PLM Software Development
Graph databases can offer transformative value to Product Lifecycle Management (PLM) due to their unique ability to absorb, manage and query highly interconnected data. Here are some specific use cases illustrating the unique value of graph databases in PLM.
Flexible data modeling of relationships
Management of connected data has always been a big problem for traditional data management techniques in PLM. The value of graph databases for such a use case lies in their flexible data model and rich relationship capabilities. Traversing multi-level product structures with graph database queries can be much more efficient than with equivalent SQL queries, which typically need a join for every level of the structure.
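A multi-level BOM explosion is a typical product structure traversal. The sketch below walks a small, hypothetical structure and rolls up total quantities. It is plain Python over an in-memory dictionary, not the query syntax of any specific graph database, and it assumes the structure has no cycles.

```python
from collections import defaultdict

# Hypothetical product structure: parent -> list of (child, quantity per parent).
bom = {
    "Bike": [("Frame", 1), ("Wheel", 2)],
    "Wheel": [("Rim", 1), ("Spoke", 32)],
}


def explode(item, multiplier=1, totals=None):
    """Walk the structure below item and total the quantity of every component.

    Assumes the structure is acyclic; a cycle would recurse forever.
    """
    if totals is None:
        totals = defaultdict(int)
    for child, qty in bom.get(item, []):
        totals[child] += qty * multiplier
        explode(child, qty * multiplier, totals)
    return dict(totals)


print(explode("Bike"))
# {'Frame': 1, 'Wheel': 2, 'Rim': 2, 'Spoke': 64}
```

Each step simply follows the edges of the current item, so the amount of work grows with the depth and breadth of this one product, not with the number of other products stored alongside it.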
Merging of complex data structure
Data coming from multiple silos during product development, manufacturing and maintenance must be combined. Graph databases provide a unique capability to merge heterogeneous data structures into a single connected data set, with the ability to query relationships, perform impact analysis and more.
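The sketch below shows the idea on a toy scale: records from three hypothetical systems (engineering, purchasing and service) are merged into one list of typed edges, and a simple upstream walk answers an impact question such as "what is affected if this supplier has a problem?". All names and relationship types are assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical records from three separate systems.
engineering = [("Bike", "Wheel"), ("Wheel", "Rim"), ("Scooter", "Wheel")]  # parent, child
purchasing = [("Rim", "Supplier-X")]  # part, supplier
service = [("Ticket-42", "Bike")]  # ticket, product

# Merge everything into one list of typed edges: (source, relationship, target).
edges = []
edges += [(parent, "USES", child) for parent, child in engineering]
edges += [(part, "SUPPLIED_BY", supplier) for part, supplier in purchasing]
edges += [(ticket, "REPORTED_ON", product) for ticket, product in service]

# Index edges by target so we can walk "upstream" from any node.
incoming = defaultdict(list)
for source, rel, target in edges:
    incoming[target].append((source, rel))


def impacted_by(node):
    """Return every node that directly or indirectly points at the given node."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for source, _rel in incoming[current]:
            if source not in seen:
                seen.add(source)
                stack.append(source)
    return seen


print(sorted(impacted_by("Supplier-X")))
# ['Bike', 'Rim', 'Scooter', 'Ticket-42', 'Wheel']
```

The walk crosses all three source systems in a single pass because, once merged, the origin of an edge no longer matters to the traversal.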
Data analytics using graph data science
The ability to perform various analytical queries provides unique value for PLM systems because it brings analytics to active data sets. There is no need for ETL or for converting data to another platform. Graph databases can run graph-specific algorithms, such as path finding, centrality and clustering, directly on the stored data.
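One of the simplest graph measures is in-degree: how many relationships point at a node. Applied to a product structure, it ranks components by how widely they are reused. The sketch below computes it in plain Python over hypothetical data; a graph database would typically offer this and more advanced algorithms as built-in or library functions.

```python
from collections import Counter

# Hypothetical USES edges: (parent assembly, child component).
uses = [
    ("Bike", "Wheel"), ("Bike", "Frame"),
    ("Scooter", "Wheel"), ("Scooter", "Deck"),
    ("Trailer", "Wheel"), ("Trailer", "Frame"),
]

# In-degree: how many USES relationships point at each component.
in_degree = Counter(child for _parent, child in uses)

for component, count in in_degree.most_common():
    print(component, count)
# Wheel 3
# Frame 2
# Deck 1
```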
PLM Data Modeling – File, Table, Tree, Graph
The power of the graph is the ability to retain the rich semantics of the data while providing a very powerful way to query data.
Three of the most popular data paradigms in CAD and PLM are Files, Tables and Trees (hierarchies). However, this information can be transformed into a graph that captures not only the data but also the relationships and semantic dependencies between data. Capturing these relationships and dependencies allows for building analysis and decision-support tools to help manufacturing companies.
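As a rough sketch of that transformation, the code below takes a flat BOM table, a folder tree of CAD files and a part-to-file mapping, and turns all three into edges of one graph. The data, file names and relationship types are hypothetical and chosen only to show the shape of the conversion.

```python
# Hypothetical flat BOM table, as it might be exported from a spreadsheet.
table_rows = [
    {"parent": "Bike", "child": "Wheel", "qty": 2},
    {"parent": "Bike", "child": "Frame", "qty": 1},
]

# Hypothetical folder tree of CAD files, as nested dictionaries.
file_tree = {"bike_project": {"wheel.step": {}, "frame.step": {}}}

# Hypothetical mapping from parts to the CAD files that describe them.
part_files = {"Wheel": "wheel.step", "Frame": "frame.step"}

edges = []  # each edge is (source, relationship, target, properties)

# Table: every row becomes a USES edge that keeps the quantity as a property.
for row in table_rows:
    edges.append((row["parent"], "USES", row["child"], {"qty": row["qty"]}))


def tree_to_edges(tree, parent=None):
    """Turn nested dictionaries into CONTAINS edges, one per parent/child pair."""
    for name, children in tree.items():
        if parent is not None:
            edges.append((parent, "CONTAINS", name, {}))
        tree_to_edges(children, name)


# Tree: every folder/file nesting becomes a CONTAINS edge.
tree_to_edges(file_tree)

# Files: link each part to the file that describes it.
for part, filename in part_files.items():
    edges.append((part, "DESCRIBED_BY", filename, {}))

for edge in edges:
    print(edge)
```

Once the three sources share one edge list, a question that spans them, such as which files are involved in a given assembly, becomes a single traversal.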
The Future Value of Graph Databases for Product Development and PLM
Graphs and graph databases are quickly becoming a very powerful product data management paradigm. Meanwhile, industrial companies are sitting on a goldmine of data with a limited set of options for turning it into value and competitive business advantage.
Graph databases can be used for a variety of these tasks. They are most powerful in solutions that manage relationships, and they really shine in relationship analysis. Connections between people, things and companies are where value can be created. Although not every problem is a graph problem, PLM applications, and BOM management in particular, offer many opportunities to use a graph database, especially where other data management solutions are inefficient.
PLM systems are transforming. Modern online platforms are focusing more on how to manage data and less on the applications. The latter has a shorter lifecycle compared to data. Because of that, there is a need for a more flexible and robust data management foundation capable of scaling and supporting data semantics while being a source of analytics on product data.
Oleg Shilovitsky is Co-founder and Chief Executive Officer at OpenBOM, and author of the Beyond PLM blog.