Home » Technical Topics » Knowledge Engineering

One Big Graph and the Interorganization

  • Alan Morrison 
Source: Max Delbrück Center for Molecular Medicine

Warning: I’ll be mentioning web3 in this post.

To understate the obvious, vendors, reporters, and enthusiasts often overpromise when it comes to new data tech such as web3.

I’m using the term web3 in a hopeful way with an expansive definition in mind, one that’s not familiar to most people. When I mention web3, it’d be good if you could put aside the associations with cryptocurrency, blockchain, and an endless, speculation-driven hype cycle for a moment. 

Think instead about the advantages of an open, boundary-free web – shared, intricately connected, automatically extensible infrastructure with fewer hassles, more built-in advantages, and no inherent siloing.

A web where you can post on demand and simultaneously control access–in a way built into the architecture – to all the data you’re posting.

A web that supports serendipitous discovery and rapid community building in many new and different ways.

One big, less centralized peer-to-peer graph allows meta-organizations and inter-organizations to form, evolve, merge, grow and reform organically. 

Unlike the “metaverse” (which won’t be real until boundary-crossing, malleable and interacting organizations become the norm), the technology for this one big graph is here and now. Most folks just haven’t learned how to use it to their best advantage yet, if they’ve learned to use it at all.

One Big Innovation Graph

Let’s sidestep the hype and avoid discussing “decentralized autonomous organizations” or DAOs (neither decentralized nor autonomous). Instead, think about dynamic, machine-assisted, human-in-the-loop inter-organizations–what IDC called the Innovation Graph five years back.

What I found compelling about IDC’s conceptualization of the Innovation Graph when first learning about it was that with data literacy and maturity comes a new form of agility – the ability for individuals and organizations both to skate along edges to different industry nodes of value they hadn’t been able to pursue or capture before.

For example, companies can become data providers when they haven’t been. Consultancies can also use their relationships across industry boundaries to help clients cross those boundaries.

IDC’s Innovation Graph

The most data-literate and mature attack on all fronts now, in the same way that Amazon began to stake out territory in healthcare, logistics, and subscription cloud services ten or fifteen years ago.

Web3 as an Evolution of P2P Networking + Shared File Systems

P2P data networks like the Interplanetary File System (IPFS) are key to the feasibility of an upgrade to peer-to-peer networking. Unlike the W3C standard web, IPFS offers logically derived unique identifiers and content addressing.

Two example advantages of content identifiers (CIDs) and content addressing on their own: 

Eliminating duplication. “The CID created with IPFS provides a digital fingerprint that can ensure authenticity and uniqueness. This simplifies deduplication by creating a single data instance with an immutable CID. And because there is only one copy of each resource, authentication is much simpler.

Since there’s no duplication, IPFS minimizes the storage space consumed by data backups and archives. This creates a massive advantage for any organization archiving their data.”

– From IPFS hardware optimization provider Silicon Mechanics’ site

Eliminating link rot. “Thanks to content addressing, link rot may become a thing of the past. A content-addressed system such as web3.storage is like our key-value-based DNS, with one significant difference: You no longer get to choose the keys. Instead, the keys are derived directly from the file contents using an algorithm that will always generate the same key for the same content.”

– From IPFS provider web.storage’s site

Keep in mind that file systems in general are gaining in capability. The implication is that more developers don’t use databases as much as they used to; they use file systems instead.

In that sense, they can sidestep the disadvantages of databases (such as their inherent siloing, centralization and feature bloat). They can go with one big graph with its built-in file system instead. IPFS in that sense, provides one automated file system for all.

All these freed-up capabilities are available on a different kind of open web now. Although ease of use isn’t often among those capabilities yet, a critical mass of user experience that’s good enough should materialize over the next few years.

Legacy: The Gift that Keeps on Giving (When We Wish It Wouldn’t)

With the scaling advantages of IPFS and the growing ecosystem around it, collaboration could scale at a much more rapid and dynamic clip than it has. What could hold us back? It’s what market analysts call the installed base. 

The installed base in software implies technical debt and inertia on a giant, monolithic scale. It’s the Software Wasteland, all the code that’s the single purpose and just slightly different enough not to interoperate, all the architecture that still gets in the way and keeps cropping up in today’s cloud services. It’s all the methodologies that our educational system keeps reteaching, the provincial siloing in the umpteenth editions of relational database textbooks that students still have to rent, the dearth of .edu coursework on the commonalities of content, data, and knowledge graphs that should all be treated as one….. 

It’s the packaged software monoliths who adopted microservices a decade or more ago and were reborn as….cloud monoliths, with cloud software suites and SaaS and duplicated data sprawl and all the other modern dilemmas.

The biggest challenge innovators face today is to maintain their integrity and continue to progress despite the temptations. It’s easy to give in and take the bigger paychecks in legacy markets rebranded as new. The real web3 of One Big Graph without all the inhibitions of incumbency is there for the taking–if we want to fight hard enough for it.