Understanding The Building Blocks of a Distributed Ledger System

Introduction to DLTs

Distributed ledger technology (DLT) is being hailed as a transformative technology, with comparisons being drawn to the Internet in its potential to transform and disrupt industries. As a “platform” technology for decentralized, trust-based peer-to-peer computing, DLT helps shape new “domain” capabilities, just as computer networking enabled the Internet and the creation of capabilities across communication, collaboration and commerce. Like the Internet, it will have far-reaching consequences for enterprise architectures of the future. Not only will DLT transform the technology stack of established domains (witness how Blockchain is transforming identity management infrastructure in the enterprise), but it will also give rise to new architecture paradigms as computing moves to decentralized trust-based networks, for example, in how an enterprise interacts with its business partners, suppliers and buyers. The Internet took 30 years to have disruptive effects in the enterprise, and DLT’s full impact is expected to play out over similar time frames.

DLT represents a generic class of technologies (Blockchain is a prominent example), but all DLTs share the concept of the distributed ledger: a shared, immutable database that is the system of record for all transactions, current and historic, and that is maintained by a community of participating nodes that have some sort of incentive (usually a token or a cryptocurrency) to maintain the ledger in good standing. The emergence of DLTs can be traced back to the original blockchain applications, Bitcoin and Ethereum. Various other distributed ledger applications have emerged to solve specific industry or domain issues: R3’s Corda in financial services, Ripple for payments, etc. Innovation in the DLT space is proceeding at a feverish pace. The well-established DLT-based networks can essentially be segmented along two dimensions: how ledger integrity is guaranteed through validation, and whether the ledger is private or public.

DLT and Enterprise Architecture

As participants in DLT-based networks developed by industry utilities or consortiums, organizations may not have a strong need to master the internal architecture design and trade-offs associated with such a platform. However, the architecture community in those organizations will still be required to understand how the networks they are participating in work, to the extent required to understand the implications for their organizations. Furthermore, as intra-company applications of DLT become mainstream, enterprise architects will increasingly be called upon to provide perspectives on the optimal design of the underlying technology. As DLT moves from innovation labs into the mainstream enterprise, architects will need to start preparing their organizations for accepting DLT-based applications into the organizational landscape. A good place for enterprise architects to start is to understand just what the DLT technical architecture encompasses. This involves understanding what building blocks comprise a DLT system, and what architectural decisions need to be made.

The Building Blocks of a DLT System

To understand a complex technology such as DLT, it may be helpful to draw parallels to the TCP/IP stack for computer networking, which Blockchain has been compared to in the past (The Truth About Blockchain). While there may not be a straight one-to-one correspondence between the OSI networking model and the DLT architecture, drawing the parallel helps one understand conceptually how the building blocks fit together. The OSI model is a generic architecture that represents the several flavors of networking that exist today, ranging from closed, proprietary networks to open, standards-based ones. Similarly, the DLT building blocks provide a generic architecture that represents the several flavors of DLTs that exist today, and ones yet to be born.

In theory, it should be possible to design each building block independently, with well-defined interfaces, so that the whole DLT system comes together as one, with higher-level building blocks abstracted from the lower-level ones. In reality, architectural choices in one building block influence those in others; e.g., the choice of a DLT’s data structure influences the consensus protocol most suitable for the system. As common industry standards for DLT architecture and design develop (Hyperledger is an early development spearheaded by The Linux Foundation) and new technology is proven in the marketplace, a more standardized DLT architecture stack will perhaps emerge, again following how computer networking standards emerged. There is value, nevertheless, in being able to conceptually view a DLT system as an assembly of these building blocks, in order to understand the key architecture decisions that need to be made.

Key Architectural Tradeoffs in DLT Systems

Architecting a DLT system involves making a series of decisions and tradeoffs across key dimensions.  These decisions optimize the DLT for the specific business requirement: for some DLT applications, performance and scalability may be key, while for some others, ensuring fundamental DLT properties (e.g., immutability and transparency) may be paramount.   Inherent in these decisions are architectural tradeoffs, since the dimensions represent ideal states seldom realized in practice.  These tradeoffs essentially involve traversing the triple constraint of Decentralization, Scalability, and Security.

Decentralization reflects the fundamental egalitarian philosophy of the original Bitcoin/Blockchain vision, i.e., that the distributed ledger should be accessible, available and transparent to all at all times, and that all participating nodes in the network should validate the ledger and thus hold the full ledger data. Decentralization enables mutually untrusting parties to participate in the network without the need for central authorization. Scalability refers to the goal of having an appropriate level of transaction throughput, enough storage capacity in the DLT to record transaction data, and acceptable latency for a transaction to be validated and recorded once it is submitted. Scalability ensures that appropriate performance levels are maintained as the size of the network grows. Finally, Security is the ability to maintain the integrity of the ledger by warding off attacks or making it impossible to maliciously change the ledger for one’s benefit. Fundamentally, this dimension reflects a security design that is built into the fabric of how the ledger operates and does not rely on external ‘checking’ to ensure safety.

Bringing It Together: DLT Building Block Decisions and Architectural Tradeoffs

Applying the architectural decisions to the DLT system allows one to come up with different flavors of DLT systems, each making tradeoffs to navigate the triple constraint described above. If the triple constraint is pictured as a triangle, traversing its sides moves one between different DLT architecture styles, with the vertices denoting the purest architectural states, which are seldom realized in practice. For example, systems like Bitcoin and Ethereum tend toward Vertex A, maximizing Decentralization through their decentralized, trustless P2P model, and Security through consensus building and validation methods that prevent malicious attacks (although both Bitcoin and Ethereum have been shown to have other security vulnerabilities), but they sacrifice much in terms of Scalability (Bitcoin’s scalability woes are well known, and Ethereum is only slightly better). On the other hand, permissioned DLTs, such as Corda, tend toward Vertex C, maximizing Scalability and guaranteeing Security, but sacrificing Decentralization (by definition, permissioned DLTs are not transparent, since they restrict access, and validation is provided only by a set of pre-authorized validating nodes); they may also suffer from other security issues (both the trusted nodes and the central authority in a permissioned DLT system can be attacked by a nefarious party). DLT variations such as Bitcoin’s Lightning Network and Ethereum’s Raiden tend toward Vertex B, using off-chain capabilities to improve the Scalability of the Bitcoin and Ethereum networks while preserving Decentralization (despite some recent concerns that these networks have a tendency to become centralized in the long run), although their off-chain capabilities may require additional Security capabilities (they also partially move away from the Blockchain’s decentralized security apparatus). Let’s examine how these tradeoffs come into play at the level of DLT building blocks.

Layer 3: Ledger Data Structure

Ledger Data Structure encapsulates decisions around how the distributed ledger is actually structured and linked at a physical level, e.g., as a chain of blocks or a graph. Additionally, it captures decisions around how many ledger chains there are, and specifies whether nodes carry the entire ledger or just a part of it. In traditional Blockchain, the ledger is structured as a global sequential linked list of blocks, instances of which are replicated across all participating nodes. This design goes hand in hand with the Proof of Work consensus protocol of traditional Blockchain in ensuring high levels of Decentralization and Security, since each node has a current instance of the global ledger chain and block validation relies on decentralized consensus building (although a few security vulnerabilities with Blockchain have come to the forefront, and Proof of Work is susceptible to centralization due to economies of scale in mining). As we know, this design takes a toll on Scalability: Bitcoin’s Blockchain can process only a handful of transactions per second, and the time required for processing a block is high (Bitcoin generates a new block roughly every 10 minutes).
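To make the "linked list of blocks" idea concrete, here is a minimal Python sketch of a hash-linked chain. It is an illustration only, not Bitcoin's actual block format: the field names (`index`, `prev_hash`, `transactions`) and the helper functions are assumptions made for this example.

```python
import hashlib
import json


def block_hash(block):
    """Hash a block's contents deterministically (sorted keys, UTF-8)."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain, transactions):
    """Append a block that commits to the hash of the previous block."""
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "prev_hash": prev, "transactions": transactions})


def is_valid(chain):
    """Check that every block still points at the hash of its predecessor."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


# Hypothetical sample data.
chain = []
append_block(chain, ["alice pays bob 5"])
append_block(chain, ["bob pays carol 2"])
append_block(chain, ["carol pays dave 1"])
```

Because each block stores the hash of its predecessor, editing an earlier block changes its hash and breaks the link held by the next block, so `is_valid` fails. In a replicated ledger, every node can run this kind of check independently on its own copy.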

Some new designs are arriving with alternative data structures that improve Scalability and performance. Examples include NXT’s and SPECTRE’s DAG (directed acyclic graph) of blocks, in which DAG blocks are mined in parallel to allow for more throughput and lower transaction time, and IOTA’s Tangle, one of the so-called “blockless” DLTs that get rid of block mining altogether and rely on a DAG of transactions to maintain system state and integrity. These new designs have yet to be implemented and used at scale, and many of them have their own set of challenges (some claim they will continue to rely on some form of centralization to gain scale, and they also have security-related challenges). However, the DLT community’s interest has been high: IOTA’s Tangle has been creating a buzz in DLT circles as a possible serious contender in the IoT world (since its data structure and protocol are well suited to handling continual, high-volume streams of data), and several blockless DLT startups have been born lately.
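A DAG of transactions can be sketched in a few lines of Python. This is a toy model under stated assumptions, not IOTA's implementation: the `TxDag` class, its method names, and the truncated SHA-256 identifiers are hypothetical, and real systems add tip-selection rules, signatures, and spam protection on top.

```python
import hashlib


class TxDag:
    """Toy 'blockless' ledger: each transaction approves earlier transactions."""

    def __init__(self):
        # Maps a transaction id to the ids it approves (its parents).
        self.parents = {"genesis": []}

    def add(self, payload, approves):
        """Attach a transaction that approves the given existing transactions."""
        for parent in approves:
            if parent not in self.parents:
                raise KeyError(f"unknown parent transaction: {parent}")
        material = payload + "|" + ",".join(approves)
        tx_id = hashlib.sha256(material.encode("utf-8")).hexdigest()[:12]
        self.parents[tx_id] = list(approves)
        return tx_id

    def tips(self):
        """Return transactions that no later transaction has approved yet."""
        approved = {p for ps in self.parents.values() for p in ps}
        return sorted(t for t in self.parents if t not in approved)


dag = TxDag()
a = dag.add("sensor reading 1", ["genesis"])
b = dag.add("sensor reading 2", ["genesis"])  # attached in parallel with a
c = dag.add("sensor reading 3", [a, b])       # approves both earlier transactions
```

Here `a` and `b` are attached independently, without waiting for a block to be mined, and `c` later confirms both. That parallelism is where the throughput gain described above comes from.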

Tinkering with how the ledger data is stored across nodes represents another opportunity for gaining Scalability. For example, sharding, a concept fairly well established in the distributed database world, is coming to DLTs. Applied to DLTs, sharding enables the overall Blockchain state to be split into shards, which are then stored and processed in parallel by different nodes in the network, allowing higher transaction throughput (Ethereum’s scaling roadmap, which also includes the Casper proof-of-stake protocol, plans to use sharding to drive scalability and speed). Similarly, Scalability can be improved by having multiple chains, possibly private, to enable separation of concerns: “side chains” enable processing to happen on a separate chain without overloading the original main chain. While such designs improve Scalability, they move away from DLT’s vision of enabling democratic access and availability to all participants at all times, and they also present Security-related challenges, which is part of the reason why widespread adoption of sidechains has been slow.
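As a rough illustration of sharding, the sketch below assigns each transaction to a shard by hashing the sender's account name. The shard count, the function names, and the assignment rule are assumptions for this example and do not describe any particular DLT's scheme.

```python
import hashlib
from collections import defaultdict

NUM_SHARDS = 4  # hypothetical shard count


def shard_for(account):
    """Map an account name to a shard index using a stable hash."""
    digest = hashlib.sha256(account.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS


def partition(transactions):
    """Group (sender, receiver, amount) tuples by the sender's shard."""
    shards = defaultdict(list)
    for sender, receiver, amount in transactions:
        shards[shard_for(sender)].append((sender, receiver, amount))
    return dict(shards)


# Hypothetical sample data.
transactions = [
    ("alice", "bob", 5),
    ("bob", "carol", 2),
    ("carol", "dave", 1),
    ("dave", "alice", 3),
]
shards = partition(transactions)
```

Each group in `shards` could be processed by a different set of nodes in parallel. The sketch deliberately ignores the hard part: a transaction whose sender and receiver live on different shards needs cross-shard coordination, which is one source of the Security challenges noted above.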

Layer 2: Consensus Protocol

Consensus protocol determines how transactions are validated and added to the ledger. Decision-making in this building block involves choosing a specific protocol based on the underlying data structure and the objectives related to the triple constraint. Proof of Work, the traditional Blockchain consensus protocol, requires transactions to be validated by all participating nodes; it enables a high degree of Decentralization and Security, but suffers on Scalability. Alternative protocols, such as Proof of Stake, provide slightly better Scalability by changing the incentive mechanism to align more closely with the good operation of the ledger. Protocols based on Byzantine Fault Tolerance (BFT), which have been successfully applied to other distributed systems, are applicable to private ledgers and depend upon a collection of pre-trusted nodes. Such protocols sacrifice Decentralization to gain Scalability.
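The asymmetry at the heart of Proof of Work (expensive to produce, cheap to check) can be shown with a small Python sketch. It is a simplification: Bitcoin hashes a block header twice with SHA-256 and compares the result against a numeric target, whereas this example, with its hypothetical `mine` and `verify` functions, just counts leading zero hex digits of a single hash.

```python
import hashlib


def _digest(block_data, nonce):
    """Hash the block data together with a candidate nonce."""
    return hashlib.sha256(f"{block_data}{nonce}".encode("utf-8")).hexdigest()


def mine(block_data, difficulty):
    """Search for a nonce whose hash starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = _digest(block_data, nonce)
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def verify(block_data, nonce, difficulty):
    """Any node can re-check the work with a single hash computation."""
    return _digest(block_data, nonce).startswith("0" * difficulty)


# Difficulty 3 needs about 16**3 = 4096 attempts on average, so this runs quickly.
nonce, digest = mine("block 1: alice pays bob 5", 3)
```

Each extra zero digit multiplies the expected mining effort by 16, while verification stays at one hash. This is why every node can afford to validate, and also why block production is slow by design.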

Ethereum’s Raiden and Bitcoin’s Lightning Network are innovations that bring scalability to Ethereum and Bitcoin respectively by securely moving transactions off the main chain to a separate transacting channel, and then moving back to the main chain for settlement: the so-called “Layer 2” innovations. This design moves load off the main ledger. However, since transactions occurring on the channel are not recorded on the ledger, it sacrifices Security, as the transacting channels need additional security apparatus that is not part of the original chain, as well as Decentralization (since channel transactions are not accessible to all participants).
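The channel pattern described above can be sketched as a toy Python class. This is only a model of the bookkeeping: the `PaymentChannel` class and its methods are hypothetical, and real channels such as Lightning or Raiden rely on signed state updates, timelocks, and dispute mechanisms that are omitted here.

```python
class PaymentChannel:
    """Toy two-party channel: deposits are locked on-chain, payments happen off-chain."""

    def __init__(self, deposit_a, deposit_b):
        self.balances = {"a": deposit_a, "b": deposit_b}
        self.updates = 0  # off-chain state updates, never written to the ledger

    def pay(self, sender, receiver, amount):
        """Update balances off-chain; nothing is recorded on the main chain."""
        if amount <= 0 or self.balances[sender] < amount:
            raise ValueError("invalid amount or insufficient off-chain balance")
        self.balances[sender] -= amount
        self.balances[receiver] += amount
        self.updates += 1

    def settle(self):
        """Return the single final state that would be posted to the main chain."""
        return dict(self.balances)


channel = PaymentChannel(deposit_a=10, deposit_b=10)
channel.pay("a", "b", 3)
channel.pay("b", "a", 1)
channel.pay("a", "b", 4)
final = channel.settle()  # three payments collapse into one on-chain settlement
```

Three payments result in a single settlement record, which is the Scalability gain. The individual payments are visible only to the two channel parties, which is the loss of transparency discussed above.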

A number of other protocols and schemes to improve scalability and security are in the works. Many of them are variations of the basic PoW and PoS, and they envision a future comprising not one single ledger chain, but a collection of chains. Examples include Kadena, which uses PoW on a braid of chains; EOS, which uses delegated PoS; and Cosmos Tendermint, which uses BFT-based PoS across a universe of chains.

Layer 1: Computation and App Data

DLT resources such as storage and computation come at a premium, and it costs real money to submit transactions in a DLT system. In the topmost layer, therefore, the architectural decisions deal with providing flexibility and functionality related to data storage and computation: essentially, how much of it should reside on-chain, and how much off-chain. Additionally, this layer deals with decisions around how to integrate the DLT with events from the real world.

For computation, the Bitcoin Blockchain and Ethereum both provide constructs for putting data and business logic on-chain for execution. Ethereum is far more advanced than Bitcoin in this respect, since it offers “smart contracts”, which are essentially code that is executed on the chain when certain conditions are met. There are obvious advantages to doing all computation on-chain: interoperability between parties and immutability of code, which facilitates trust building. There is, however, a practical limit to how complex smart contracts can be, and that limit is easily reached. Offloading complex calculation to off-chain capabilities allows one to leverage the DLT capabilities in a cost-effective and high-performing manner. TrueBit, an online marketplace for computation, enables a pattern in which complex, resource-intensive computation can be offloaded to a community of miners who compete to complete the computation for a reward and provide results that can be verified on-chain for authenticity. While this provides upside in terms of Scalability and Decentralization, there are Security-related implications of using off-chain computation, which is an area of active research and development.
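The general pattern of "compute off-chain, verify on-chain" works when checking a result is much cheaper than producing it. The Python sketch below illustrates that asymmetry with integer factoring. It is not TrueBit's protocol; the function names and the choice of problem are assumptions made purely for illustration.

```python
import math


def solve_off_chain(n):
    """Expensive search done off-chain: find a nontrivial factor pair by trial division."""
    for p in range(2, math.isqrt(n) + 1):
        if n % p == 0:
            return p, n // p
    return None  # n is prime (or too small to factor)


def verify_on_chain(n, claim):
    """Cheap check suitable for on-chain logic: range checks and one multiplication."""
    if claim is None:
        return False
    p, q = claim
    return 1 < p < n and 1 < q < n and p * q == n


n = 100160063  # hypothetical task posted on-chain
claim = solve_off_chain(n)          # thousands of divisions, done by an off-chain solver
accepted = verify_on_chain(n, claim)  # a handful of operations, done on-chain
```

The solver bears the cost of the search, while the ledger only pays for the check. The Security question raised above is what happens when a result cannot be checked this cheaply, or when solvers and verifiers collude.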

What applies to computation also applies to data storage in the DLT world. While the Bitcoin Blockchain and Ethereum provide basic capabilities for storing data elements, a more suitable design for managing large data sets in DLT transactions is to use off-chain data infrastructure providers or cloud storage providers, while maintaining hashed pointers to these data sets on-chain. Solutions like Storj, Sia, and IPFS aim to provide a P2P, decentralized, secure data management infrastructure that can hook into DLTs through tokens and smart contracts, and manage data and computation securely through technologies such as secure multi-party computation (MPC). Similar to off-chain computation, off-chain storage has upside in terms of Scalability and Decentralization; however, there are security- and durability-related implications.
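The "hashed pointer" design can be sketched as follows. The two dictionaries and lists below are stand-ins for a storage provider and a ledger, and the function names are hypothetical; the point is only that the ledger keeps a small fingerprint while the bulk data lives elsewhere.

```python
import hashlib

off_chain_store = {}   # stand-in for a storage provider, keyed by content hash
on_chain_records = []  # stand-in for ledger entries


def store_document(data):
    """Keep the bytes off-chain and record only their SHA-256 hash on-chain."""
    digest = hashlib.sha256(data).hexdigest()
    off_chain_store[digest] = data
    on_chain_records.append({"doc_hash": digest, "size": len(data)})
    return digest


def fetch_and_verify(digest):
    """Retrieve the off-chain bytes and check them against the on-chain hash."""
    data = off_chain_store[digest]
    if hashlib.sha256(data).hexdigest() != digest:
        raise ValueError("off-chain data does not match the on-chain hash")
    return data


doc_hash = store_document(b"invoice #1001: 250 units at 4.00 each")
```

The on-chain hash makes tampering detectable, but it cannot bring the data back if the storage provider loses it. That is the durability concern mentioned above.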

What provides immutability to the distributed ledger (its deterministic method of recording transactions) is also its Achilles’ heel: it is difficult for the ledger to communicate with and interpret data it gets from the outside, non-deterministic world. Oracles, services which act as middlemen between the distributed ledger and the non-DLT world, bridge that gap and make it possible for smart contracts to be put to real-world use. Various DLT oracle infrastructures with varying features are in development, including ChainLink, Zap, and Oraclize; choosing the right oracle architecture for the specific use case under consideration is thus crucial. Similar to off-chain data, oracles provide upside in terms of Scalability and Decentralization; however, there are security and data verifiability concerns.
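One common way to reduce dependence on a single data source is to combine several independent reports before a contract acts on them. The Python sketch below shows that idea with a median. It is a generic illustration, not the design of any of the oracle products named above, and the function names, threshold, and sample values are all hypothetical.

```python
from statistics import median


def aggregate_reports(reports, min_reports=3):
    """Combine independent oracle reports; the median damps a single bad value."""
    if len(reports) < min_reports:
        raise ValueError("not enough oracle reports to settle")
    return median(reports)


def settle_contract(price, strike):
    """Deterministic contract logic that runs once the external value is fixed."""
    return "pay_buyer" if price >= strike else "pay_seller"


# Hypothetical price reports from five sources; one is wildly off.
reports = [101.0, 99.5, 100.0, 250.0, 100.5]
price = aggregate_reports(reports)
outcome = settle_contract(price, strike=100.0)
```

The outlier of 250.0 does not move the median, so the contract settles on 100.5. Once the aggregated value is written to the ledger, every node can replay `settle_contract` and reach the same result, which keeps the on-chain part deterministic.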

Conclusion

These are still early days for DLT, and many of the improvements needed to make DLT commercially implementable are yet to come. Beyond scalability and security, DLTs face a number of hurdles in enterprise adoption, such as interoperability, complexity and a lack of developer-friendly toolkits. The future is probably going to involve not just one ledger technology here or there, but a multitude, each optimized for a specific use case within an organization, and even superstructures such as chains of chains connected with oracles, middleware and the like. These structures will not replace existing technology architecture either; they will exist alongside legacy technologies and will need to be integrated with them. Like networking, DLTs will give rise to new processes, teams, and management structures. Enterprise architects will play a central role in facilitating the development of DLT as a true enterprise technology.