Introduction

What is virtual time and why do we need it?

As distributed systems have matured and been widely adopted over the last decade, numerous technologies in segments such as databases, caches, and message queues have been built on top of frameworks that abstract away the difficulty of managing distributed systems. One of the most important and difficult problems in distributed systems is maintaining synchronization using time.

Some common synchronization techniques in distributed systems are block-resume, abortion-retry, and lookahead-rollback. …


I was exploring Neo4j and came upon this video where Jim Webber, Chief Scientist at Neo4j, explains these numbers:

125x = 48y = 3z is the ratio of the cluster size (number of instances) required for similar data store functionality, where x = MongoDB, y = Cassandra, and z = Neo4j

and

20x = 50y = 0.33z is the ratio of the disk size required for similar data store functionality, where x = MongoDB, y = Cassandra, and z = Neo4j

This blog post covers the internals of how data is stored and accessed in Neo4j and why it is a serious contender for a certain type…


Suppose we have a C library in which we have defined various data structures and functions. Because of some requirement or constraint, a Go process needs to reuse the structures defined in that C library.

Apart from accessing the C-defined elements from Go code, a more important topic is understanding the differences between the C and Go runtimes.

Go’s runtime handles memory management tasks such as memory allocation and freeing (via garbage collection) for the process, while the compiler’s escape analysis decides which objects must be allocated on the heap. …


I worked on Elasticsearch back in 2015, when it was best known for its text search capabilities built on inverted indexes. When I picked it up again last year for another project, I saw that Elasticsearch had added core support for data types other than text, such as numbers, IP addresses, and geospatial data.

As I tried to understand the main differences that allow optimized search over such data types, I stumbled upon BKD trees. Surprisingly, not much has been written about BKD trees apart from a white paper and some blog posts. …


I recently heard about Maglev, the load balancer that Google uses in front of most of its services. I wanted a quick overview to understand why Google had to create its own load balancer and which optimizations it made in order to run a load balancer at Google’s scale. To my surprise, I couldn’t find many articles that actually brought out the main reasons for Maglev’s existence. I had no other option but to go through the research paper published by the Google team. …


Kubernetes is the de facto container management system for all sorts of distributed workloads. It is known for its extensibility and community support, and numerous plugins exist for a wide range of use cases.

The most unpredictable element of any distributed system is the network. A distributed system is only as strong as its weakest link. When networks go down, this leads to various well-known problems such as thundering herd and split brain. Most applications today are built with the assumption that anything and everything can go down. …


This post covers a concise implementation of how to open live pcap sessions on any network device and read the incoming packets on that interface. Finally, the post shows how to parse the packets to extract the required information.

We use libpcap in the implementation to listen for packets on the network device. The same library can also be used to read directly from a pcap file instead of a live session. …


Most full-fledged web frameworks come with ORMs built in. ORMs, or Object-Relational Mappings, help map a programming language’s data structures to actual data stores without having to worry about the underlying data source.

This abstracts the data store interfaces, which makes migrating to a different data store more of a configuration knob that doesn’t require any change to the actual codebase. ORMs also help with connection pooling, managing database connections, validations, etc.
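To make the mapping idea concrete, here is a minimal sketch in Python using the standard library's sqlite3 module. The `User` dataclass and the `TinyMapper` class are hypothetical names invented for illustration; they are not the API of any real ORM, and a real one would also handle types, migrations, and pooling.

```python
import sqlite3
from dataclasses import astuple, dataclass, fields


@dataclass
class User:
    id: int
    name: str


class TinyMapper:
    """Hypothetical mapper: one table per dataclass, one column per field."""

    def __init__(self, conn, model):
        self.conn = conn
        self.model = model
        self.table = model.__name__.lower()
        self.columns = [f.name for f in fields(model)]
        # SQLite allows untyped columns, which keeps the sketch short.
        conn.execute(f"CREATE TABLE {self.table} ({', '.join(self.columns)})")

    def save(self, obj):
        placeholders = ", ".join("?" for _ in self.columns)
        self.conn.execute(
            f"INSERT INTO {self.table} VALUES ({placeholders})", astuple(obj)
        )

    def find(self, **criteria):
        # Expects at least one keyword argument, e.g. find(name="ada").
        where = " AND ".join(f"{column} = ?" for column in criteria)
        rows = self.conn.execute(
            f"SELECT {', '.join(self.columns)} FROM {self.table} WHERE {where}",
            tuple(criteria.values()),
        )
        return [self.model(*row) for row in rows]


conn = sqlite3.connect(":memory:")
users = TinyMapper(conn, User)
users.save(User(1, "ada"))
matches = users.find(name="ada")
```

The application code only deals with `User` objects; swapping the connection for another data store would, in this sketch, be confined to the mapper.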

In this post, we will assume a basic knowledge of ORMs and look at how to integrate a Statistics…


This post mainly revolves around a comparison of different router implementations in HTTP-based frameworks.

Let’s first go over what routers are in the context of an HTTP framework.

Most frameworks today implement the MVC pattern or at least something similar to it. Even lightweight frameworks that don’t implement any design pattern have multiple built-in features that usually don’t require custom logic from the application developer.

When a request is received by the framework handler, the framework reads the URI path and dispatches the request to the user-defined action along with the request context…
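The dispatch step described above can be sketched as follows. This is a toy, exact-match router written for illustration only; the `Router` class, the `list_users` action, and the shape of the context dictionary are assumptions, not the design of any particular framework.

```python
from typing import Callable, Dict, Optional, Tuple

Handler = Callable[[dict], str]


class Router:
    """Toy router: exact-match lookup keyed on (HTTP method, URI path)."""

    def __init__(self) -> None:
        self._routes: Dict[Tuple[str, str], Handler] = {}

    def add(self, method: str, path: str, handler: Handler) -> None:
        self._routes[(method.upper(), path)] = handler

    def dispatch(self, method: str, path: str, context: Optional[dict] = None) -> str:
        handler = self._routes.get((method.upper(), path))
        if handler is None:
            return "404 Not Found"
        # Hand the request context to the user-defined action.
        return handler(context or {})


def list_users(context: dict) -> str:
    # Hypothetical user-defined action.
    return "users requested by " + context.get("client", "unknown")


router = Router()
router.add("GET", "/users", list_users)
response = router.dispatch("GET", "/users", {"client": "alice"})
```

A hash-map lookup like this only handles static paths; routers that support path parameters typically need a different structure, such as a tree or a list of patterns.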


Linux processes interact with virtual memory and not physical memory. Each process operates under the illusion that it is the only process running on the system and hence has unrestricted access to the system’s memory.

Different processes may use the same virtual addresses, but these don’t collide because the kernel takes care of mapping virtual memory to physical memory. One case in which a process has to share its virtual memory is when it spawns threads.
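As a rough illustration of why identical virtual addresses don’t collide, here is a toy model of per-process address translation. It is a deliberately simplified sketch: the `AddressSpace` class, the single-level page table, and the frame numbers are all made up for the example and do not reflect how the Linux kernel implements paging.

```python
PAGE_SIZE = 4096  # bytes per page, a common size and an assumption of this sketch


class AddressSpace:
    """Toy per-process page table: virtual page number -> physical frame number."""

    def __init__(self, page_table):
        self.page_table = dict(page_table)

    def translate(self, virtual_address):
        page, offset = divmod(virtual_address, PAGE_SIZE)
        if page not in self.page_table:
            # Stands in for a fault on an unmapped page.
            raise MemoryError(f"no mapping for virtual page {page}")
        return self.page_table[page] * PAGE_SIZE + offset


# Two hypothetical processes map the same virtual pages to different frames.
process_a = AddressSpace({0: 7, 1: 3})
process_b = AddressSpace({0: 2, 1: 9})

physical_a = process_a.translate(0x10)
physical_b = process_b.translate(0x10)
```

Both processes use virtual address 0x10, yet the translations land in different physical frames. Threads of one process would share a single `AddressSpace` in this model.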

The process doesn’t have permission to access certain parts of the address space which…

Gaurav Sarma

Currently working at VMware
