
· 5 min read
Aravinda Kumar R C

How was Terraform built?


People think that Terraform was an original idea from HashiCorp, but it was not. The idea of Infrastructure as Code for the cloud was popularized by AWS when CloudFormation was introduced in 2011. HashiCorp took the idea and built Terraform, which was first released in 2014.

Mitchell Hashimoto himself has said that Terraform was inspired by AWS CloudFormation here. He says:

"In 2011, AWS introduced CloudFormation ... But what I thought we really needed was an open source, cloud-agnostic solution ..."

In 2017, Terraform became the most popular IaC tool. From there on, HashiCorp has been building a lot of products around Terraform and, more importantly, around their configuration language HCL. They are now valued at about 5 billion dollars. So why did they change the license of Terraform? The obvious and only answer is to stamp out competition to their HashiCorp Cloud Platform (HCP).

HashiCorp says that 95% of Terraform's commits come from their own staff. Still, a tool built entirely on open source Go libraries is now itself not open source, and that does not sit well with us. HashiCorp could have focused on marketing HCP instead of changing Terraform's license.

What is the new license?

On 11th August 2023, HashiCorp announced that they are changing the license of Terraform from the Mozilla Public License (MPL) to the Business Source License (BSL). They published a blog post about the change here. That rather confusing post made one thing clear: Terraform remains free to use for individuals and for almost all companies, including partners like Microsoft and AWS, but it will not be free for any company that competes with HashiCorp.

End users can continue to copy, modify, and redistribute the code for all non-commercial and commercial use, except where providing a competitive offering to HashiCorp. Partners can continue to build integrations for our joint customers. We will continue to work closely with the cloud service providers to ensure deep support for our mutual technologies. Customers of enterprise and cloud-managed HashiCorp products will see no change as well.

This seems like a fair position to take. The problem is the vagueness of the BSL's terms, in particular what counts as a "competitive offering", which effectively turns it into a proprietary HashiCorp license that can be reinterpreted or changed at any time.

Don't listen to the person who says "Fork it"!


Now all of HashiCorp's competitors might be thinking of forking Terraform. A few days ago, digger.dev created a new open source fork of Terraform called OpenTerraform, but it was taken down within a day for unknown reasons.

The problem with forking Terraform is not Terraform itself but the providers. Terraform has a huge ecosystem of providers built by the community and, more importantly, by the cloud providers themselves.

These providers are hosted in the hashicorp organization on GitHub. They are still MPL licensed, but what if HashiCorp decides to change those licenses too?

The question is: what can actually be forked?

What people tend to confuse is that Terraform is an application that runs HCL code written to describe infrastructure. HCL itself is a separate GitHub project and is still MPL licensed. Even if we build a new tool that can read HCL and run it, HashiCorp could make HCL a proprietary language in the future. Putting restrictive licensing around a language is not new; Oracle has done it with Java. This adds yet more uncertainty to any fork of Terraform.
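
To make the distinction concrete: HCL can be parsed entirely on its own with the standalone hashicorp/hcl Go library, no Terraform involved. The sketch below is only an illustration; the config schema and field names are made up.

```go
package main

import (
	"fmt"
	"log"

	"github.com/hashicorp/hcl/v2/hclsimple"
)

// Illustrative schema for a made-up config file; it has nothing to do
// with Terraform's own resource schema.
type Server struct {
	Name string `hcl:"name,label"`
	Size string `hcl:"size"`
}

type Config struct {
	Region  string   `hcl:"region"`
	Servers []Server `hcl:"server,block"`
}

const src = `
region = "us-east-1"

server "web" {
  size = "t3.micro"
}
`

func main() {
	var cfg Config
	// hclsimple infers the syntax from the file extension, so the name matters.
	if err := hclsimple.Decode("config.hcl", []byte(src), nil, &cfg); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%+v\n", cfg)
}
```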

Another problem is the Terraform Registry

The Terraform Registry is where all the providers are hosted, along with a large number of modules created by the community, companies, and partners. Any fork would have to clone this and host it somewhere else too.

Competitors are stirring

HashiCorp's competitors are stirring. They have created a manifesto, but it appears to be exactly the kind of thing HashiCorp is trying to avoid with the license change. I don't think HashiCorp will change their mind because of it. And I am glad they didn't call it "mein terraform".

What's the right thing to do?


As a competitor: Wait and watch. The only option is to wait for HashiCorp to change course; if they don't, there is not much you can do.

As a DevOps engineer: I think Terraform is a great tool and it is still free to use for me and my company, so I will continue to use it. But I will also keep an eye on the competitors and their offerings, and prepare my management for a change if needed.

As a Go developer: I think HCL is a cool language, with a parser and tooling written in Go. But the design of the language has not evolved much toward becoming the de facto standard for platform engineering, mainly because of HashiCorp's aversion to competitors profiting from it. So I do think forking HCL (along with Terraform) and creating a new language that is more ambitious and pledged to remain open source is the right thing to do.

Design of a Distributed Key-Value Store

· 4 min read

A distributed key-value store is a system that stores data in a key-value format across multiple nodes. It is a fundamental building block for distributed systems and is used in a variety of applications like caching, configuration management, and service discovery.

Single Node Key-Value Store vs Distributed Key-Value Store

A Single Node Key-Value Store is a simple system that stores key-value pairs in memory or on disk.
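
As a rough sketch (not the implementation in the linked repository), a single-node store can be as simple as a map guarded by a lock:

```go
package main

import (
	"fmt"
	"sync"
)

// KV is a minimal single-node, in-memory key-value store.
// A sync.RWMutex makes concurrent reads and writes safe.
type KV struct {
	mu   sync.RWMutex
	data map[string]string
}

func NewKV() *KV {
	return &KV{data: make(map[string]string)}
}

func (kv *KV) Put(key, value string) {
	kv.mu.Lock()
	defer kv.mu.Unlock()
	kv.data[key] = value
}

func (kv *KV) Get(key string) (string, bool) {
	kv.mu.RLock()
	defer kv.mu.RUnlock()
	v, ok := kv.data[key]
	return v, ok
}

func main() {
	kv := NewKV()
	kv.Put("color", "blue")
	fmt.Println(kv.Get("color")) // blue true
}
```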

Limitations:

  • Single point of failure
  • Only vertical scaling is possible

A Distributed Key-Value Store overcomes these limitations by distributing data across multiple nodes.

Advantages:

  • High availability and fault tolerance
  • Horizontal scaling
  • Load balancing

Challenges:

  • When there are multiple nodes, how do we decide which node to store the data on?
  • How do we ensure that the data is consistent across all nodes?
  • How do we handle node failures?
  • How do we handle network partitions?
  • In any distributed system, we need to consider the CAP theorem.

What is CAP theorem?

CAP theorem states that in a distributed system, we can only achieve two out of the following three guarantees:

  • Consistency: All nodes see the same data at the same time.
  • Availability: Every request gets a response, even if some nodes are down.
  • Partition tolerance: The system continues to operate despite network partitions.

System Requirements for a Distributed KV Store

  • Scalability: The system should be horizontally scalable, allowing the addition of new nodes to handle increased load.
  • Consistency: Ensure strong consistency guarantees across all nodes in the system, even in the presence of failures and network partitions.
  • Fault Tolerance: Implement mechanisms to handle node failures gracefully and maintain data availability and integrity.
  • Concurrency: Support concurrent read and write operations efficiently while maintaining data consistency.
  • Partitioning: Implement data partitioning strategies to distribute key-value pairs evenly across multiple nodes.

System Design

Components

  1. Client: The client sends read and write requests to the distributed key-value store.
  2. Load Balancer: Distributes incoming requests across multiple nodes to ensure even load distribution.
  3. Node: Each node in the distributed key-value store is responsible for storing a subset of the key-value pairs.
  4. Replication Manager: Handles replication of data across multiple nodes to ensure fault tolerance and data availability.
  5. Consistency Manager: Ensures strong consistency guarantees across all nodes in the system.
  6. Failure Detector: Detects node failures and triggers recovery mechanisms to maintain data availability and integrity.
  7. Partition Manager: Determines which node should store a given key-value pair based on a partitioning strategy (see the sketch after the note below).
Info: Points 3, 4, 5, and 6 can be implemented using a Consensus Algorithm like Raft or Paxos.
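
For the Partition Manager, one simple strategy (assumed here purely for illustration; the design above does not prescribe a specific one) is to hash the key and map it to a node:

```go
package main

import (
	"fmt"
	"hash/crc32"
)

// pickNode hashes the key and takes it modulo the number of nodes.
func pickNode(key string, nodes []string) string {
	h := crc32.ChecksumIEEE([]byte(key))
	return nodes[int(h%uint32(len(nodes)))]
}

func main() {
	nodes := []string{"node-0", "node-1", "node-2"}
	for _, k := range []string{"user:1", "user:2", "order:7"} {
		fmt.Printf("%s -> %s\n", k, pickNode(k, nodes))
	}
}
```

Hash-modulo is easy to reason about, but production systems usually prefer consistent hashing, since adding or removing a node then remaps only a small fraction of the keys.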

Algorithm of my choice: Raft

Raft is a consensus algorithm that ensures strong consistency guarantees in a distributed system. It elects a leader among the nodes, which is responsible for coordinating read and write operations. Raft ensures that all nodes in the system see the same data at the same time, even in the presence of failures.

Short Overview of Raft

  1. Each node in the system can be in one of three states: Leader, Follower, or Candidate. There is only one leader at any given time.
  2. The Leader sends heartbeats to Followers to maintain its leadership status.
  3. If a Follower does not receive a heartbeat within a certain time frame, it becomes a Candidate and initiates an election.
  4. The Candidate requests votes from other nodes in the system to become the Leader.
  5. Once a Candidate receives a majority of votes, it becomes the Leader and coordinates read and write operations.
  6. Each Node stores a log of all write operations, and the Leader replicates this log to all Followers to ensure data consistency.
  7. When a new node joins the system, it requests the log from the Leader to catch up with the current state.
  8. When a failed node recovers, it requests the missing log from the Leader to synchronize its state with the rest of the system.
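
A minimal sketch of the follower-to-candidate transition described in steps 1–4 could look like the following (heartbeat delivery, vote requests, and networking are left out; this is illustrative only, not the demo code):

```go
package main

import (
	"fmt"
	"math/rand"
	"time"
)

// State is the role a node plays in Raft.
type State int

const (
	Follower State = iota
	Candidate
	Leader
)

// Node holds only the fields needed for this sketch.
type Node struct {
	id          int
	state       State
	currentTerm int
	heartbeat   chan struct{} // signalled when a leader heartbeat arrives
}

// run waits for heartbeats; if none arrives before a randomized election
// timeout, the node becomes a Candidate and starts a new term.
func (n *Node) run() {
	for {
		timeout := time.Duration(150+rand.Intn(150)) * time.Millisecond
		select {
		case <-n.heartbeat:
			n.state = Follower // leader is alive, reset the timer
		case <-time.After(timeout):
			n.state = Candidate
			n.currentTerm++
			fmt.Printf("node %d: no heartbeat, starting election for term %d\n", n.id, n.currentTerm)
			return // a real implementation would now request votes
		}
	}
}

func main() {
	n := &Node{id: 1, state: Follower, heartbeat: make(chan struct{})}
	n.run() // with no heartbeats, the node times out and becomes a Candidate
}
```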

Data Structures Used in the System

  1. Key-Value Store: A simple HashMap data structure to store key-value pairs.
  2. Log: A simple index-based array to store write operations in the system.
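
Putting the two together, a hedged sketch of the store plus its index-based write log might look like this (field names are illustrative and not necessarily those used in dizkv):

```go
package main

import "fmt"

// LogEntry records a single write operation, in the order it was applied.
type LogEntry struct {
	Index int
	Term  int
	Key   string
	Value string
}

// Store pairs the key-value map with the index-based log described above.
type Store struct {
	data map[string]string
	log  []LogEntry
}

func NewStore() *Store {
	return &Store{data: make(map[string]string)}
}

// Apply appends a write to the log and then applies it to the map.
func (s *Store) Apply(term int, key, value string) {
	s.log = append(s.log, LogEntry{Index: len(s.log) + 1, Term: term, Key: key, Value: value})
	s.data[key] = value
}

func (s *Store) Get(key string) (string, bool) {
	v, ok := s.data[key]
	return v, ok
}

func main() {
	s := NewStore()
	s.Apply(1, "region", "us-east-1")
	fmt.Println(s.Get("region")) // us-east-1 true
	fmt.Println(len(s.log))      // 1
}
```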

Live Demo - Scenario 1: Replication of Data

Live Demo - Scenario 2: Scaling the System

Live Demo - Scenario 3: Network Partitioning

Demo not available for this scenario.

Conclusion

In this blog post, we explored the design of a distributed key-value store using Raft as the consensus algorithm.

Code Repository

https://github.com/aravindarc/dizkv