Containers are a disruptive technology and is being adopted by startups and enterprises alike. Whenever a new infrastructure technology comes along, two areas require a lot of innovation - storage & networking. Anyone who is adopting containers would have faced challenges in these two areas.
Flannel is an overlay network that helps to connect containers across multiple hosts. This blog provides an overview of container networking followed by details of Flannel.
What is Docker?
Docker is the world’s leading software container platform. Developers use Docker to eliminate “works on my machine” problems when collaborating on software with co-workers. Operators use Docker to run and manage apps side-by-side in isolated containers to get better compute density. Enterprises use Docker to build agile software delivery pipelines to ship new features faster, more securely and with repeatability for both Linux and Windows Server apps.
Need for Container networking
- Containers need to talk to the external world.
- Containers should be reachable from the external world so that the external world can use the services that containers provide.
- Containers need to talk to the host machine. An example can be getting memory usage of the underlying host.
- There should be inter-container connectivity in the same host and across hosts. An example is a LAMP stack running Apache, MySQL and PHP in different containers across hosts.
How Docker’s original networking works?
Docker uses host-private networking. It creates a virtual bridge, called docker0 by default, and allocates a subnet from one of the private address blocks defined in RFC1918 for that bridge. For each container that Docker creates, it allocates a virtual ethernet device (called veth) which is attached to the bridge. The veth is mapped to appear as eth0 in the container, using Linux namespaces. The in-container eth0 interface is given an IP address from the bridge’s address range.
Drawbacks of Docker networking
Docker containers can talk to other containers only if they are on the same machine (and thus the same virtual bridge). Containers on different machines cannot reach each other - in fact they may end up with the exact same network ranges and IP addresses. This limits the system’s effectiveness on cloud platforms.
In order for Docker containers to communicate across nodes, they must be allocated ports on the machine’s own IP address, which are then forwarded or peroxided to the containers. This obviously means that containers must either coordinate which ports they use very carefully or else be allocated ports dynamically.This approach obviously fails if container dies as the new container will get a new IP, breaking the proxy rules.
Real world expectations from Docker
Enterprises expect docker containers to be used in production grade systems, where each component of the application can run on different containers running across different grades of underlying hardware. All application components are not same and some of them may be resource intensive. It makes sense to run such resource intensive components on compute heavy physical servers and others on cost saving cloud virtual machines. It also expects Docker containers to be replicated on demand and the application load to be distributed across the replicas. This is where Google’s Kubernetes project fits in.
What is Kubernetes?
Kubernetes is an open-source platform for automating deployment, scaling, and operations of application containers across clusters of hosts, providing container-centric infrastructure. It provides portability for an application to run on public, private, hybrid, multi-cloud. It gives extensibility as it is modular, pluggable, hookable and composable. It also self heals by doing auto-placement, auto-restart, auto-replication, auto-scaling of application containers. Kubernetes does not provide a way for containers across nodes to communicate with each other, it assumes that each container (pod) has a unique, routable IP inside the cluster. To facilitate inter-container connectivity across nodes, any networking solution based on Pure Layer-3 or VxLAN or UDP model, can be used. Flannel is one such solution which provides an overlay network using UDP as well as VxLAN based model.
Flannel: a solution for networking for Kubernetes
Flannel is a basic overlay network that works by assigning a range of subnet addresses (usually IPv4 with a /24 or /16 subnet mask). An overlay network is a computer network that is built on top of another network. Nodes in the overlay network can be thought of as being connected by virtual or logical links, each of which corresponds to a path, perhaps through many physical links, in the underlying network.
While flannel was originally designed for Kubernetes, it is a generic overlay network that can be used as a simple alternative to existing software defined networking solutions. More specifically, flannel gives each host an IP subnet (/24 by default) from which the Docker daemon is able to allocate IPs to the individual containers. Each address corresponds to a container, so that all containers in a system may reside on different hosts.
It works by first configuring an overlay network, with an IP range and the size of the subnet for each host. For example, one could configure the overlay to use 10.1.0.0/16 and each host to receive a /24 subnet. Host A could then receive 10.1.15.1/24 and host B could get 10.1.20.1/24. Flannel uses etcd to maintain a mapping between allocated subnets and real host IP addresses. For the data path, flannel uses UDP to encapsulate IP datagrams to transmit them to the remote host.
As a result, complex, multi-host systems such as Hadoop can be distributed across multiple Docker container hosts, using Flannel as the underlying fabric, resolving a deficiency in Docker’s native container address mapping system.
Integrating Flannel with Kubernetes
Kubernetes cluster consists of a master node and multiple minion nodes. Each minion node gets its own subnet through flannel service. Docker needs to be configured to use the subnet created by Flannel. Master starts a etcd server and flannel service running on each minion uses that etcd server to registers its container’s IP. etcd server stores a key-value mapping of each containers with its IP. kube-apiserver uses etcd server as a service to get the IP mappings and assign service IP’s accordingly. Kubernetes will create iptable rules through kube-proxy which will allocate static endpoints and load balancing. In case the minion node goes down or the pod restarts it will get a new local IP, but the service IP created by kubernetes will remain the same enabling kubernetes to route traffic to correct set of pods. Learn how to setup Kubernetes with Flannel undefined.
Alternatives to Flannel
Flannel is not the only solution for this. Other options like Calico and Weave are available. Weave is the closest competitor as it provides a similar set of features as Flannel. Flannel gets an edge in its ease of configuration and some of the benchmarks have found Weave to be slower than Flannel.