What are virtual machines and virtualization?
A physical machine is a computer that has a CPU, memory, a hard disk, and a network connection. In the context of virtualization, the physical computer is called the host, and the virtual machine is called a guest. Virtualization is the process of creating an emulation of a physical machine using special software; the result is known as a virtual machine (VM). The special software used here is called a hypervisor. We can create as many VMs as we require, as long as the host's CPU, RAM, and other resources allow. RAM has almost always been the main limiting factor in virtualization. All the VMs on a single host share the host's resources, yet each VM works independently. A virtual machine can also be configured to use a different operating system, as well as a different type of CPU and storage drive.
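To make the point about RAM as the limiting factor concrete, here is a minimal sketch that estimates how many equally sized VMs fit on one host. The function name and all the figures (64 GB host, 4 GB kept back for the host and hypervisor, 8 GB per VM) are assumptions chosen for illustration, not measurements.

```python
def max_vms(host_ram_gb: int, reserved_for_host_gb: int, ram_per_vm_gb: int) -> int:
    """Estimate how many equally sized VMs fit in the host's RAM.

    Hypothetical helper: it only looks at RAM and ignores CPU, disk,
    and any memory overcommit features a hypervisor may offer.
    """
    if ram_per_vm_gb <= 0:
        raise ValueError("ram_per_vm_gb must be positive")
    usable_gb = host_ram_gb - reserved_for_host_gb
    if usable_gb <= 0:
        return 0
    return usable_gb // ram_per_vm_gb


# Assumed host: 64 GB RAM, 4 GB kept for the host and hypervisor, 8 GB per VM.
print(max_vms(64, 4, 8))  # 60 // 8 = 7
```

Under these assumptions the host runs out of memory after seven VMs, even if CPU capacity is still available.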
When it comes to cloud infrastructure, the virtual machine has been the go-to standard because of its many advantages. However, we have an alternative to the virtual machine that is more economical, more lightweight, and more scalable. That is where Docker comes into the picture.
What are Docker and Containerization?
Docker is a container-based technology that lets users develop distributed applications. It is a platform for containerizing software: we can build an application, package it together with the dependencies it requires into a container, and then easily ship and run that container on another machine. Docker simplifies the DevOps methodology by allowing developers to create templates called images, from which we can create lightweight, isolated environments called containers. This process is called containerization. Docker makes things easier for software organizations by giving them the capability to automate infrastructure, isolate applications, maintain consistency, and improve resource utilization.
Now, what was the need to introduce Docker when we already had virtual machines? Let's understand how they differ and how they complement each other!
The major differences between Docker containers and virtual machines lie in operating system support, security, portability, and performance.
Operating system support:
The basic architectures of virtual machines and Docker containers differ in how they support operating systems.
Containers are hosted on a single physical server whose host operating system is shared among them. Virtual machines, on the other hand, have a host operating system plus a separate guest operating system inside each virtual machine, independent of the host operating system. The guest operating system can be Linux, Windows, or any other operating system.
Docker containers are suited to situations where we want to run multiple applications over a single operating system kernel. If we have applications or servers that are required to run on different operating system flavours, virtual machines are recommended.
Sharing the host operating system among the containers makes them very light, which in turn helps them boot up in just a few seconds. Hence, the overhead of managing containers is very low compared to that of virtual machines.
Security:
Since the host kernel is shared among the containers in Docker, the container technology has access to the kernel subsystems. As a result, a single vulnerable application can compromise the entire host server. Giving applications root access and running them with superuser privileges is therefore not recommended in Docker containers. Virtual machines, on the contrary, are unique instances with their own kernel and security features, so they can run applications that need more privileges and security.
Portability:
Docker containers are like self-contained packages that can run the required application. Since they do not have a separate guest operating system, they can be ported across different platforms easily. Thanks to their lightweight architecture, containers can be started and stopped in a matter of seconds, much faster than VMs. This makes deploying Docker containers on servers easy and quick. Virtual machines, on the other hand, are individual server instances running their own operating systems, which makes them heavier to move. Docker containers are the clear winners here.
Performance:
Docker and virtual machines are intended for different purposes, so it is not fair to compare their performance directly. Still, the lightweight architecture makes Docker containers less resource-intensive than virtual machines, which is why containers can boot up very quickly compared to virtual machines. In containers, resource usage such as CPU, memory, and input/output varies with the load or traffic, unlike in virtual machines. There is no need to allocate resources to containers permanently. Duplicating and scaling up containers is also easy compared to VMs, as there is no need to install an operating system in them.
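The difference between permanently allocated and load-dependent resources can be illustrated with a toy model. This is only a sketch: the linear relationship between load and memory, the function names, and every figure below are assumptions made up for the example, not benchmarks of Docker or of any hypervisor.

```python
def vm_reserved_mb(vm_count: int, reserved_per_vm_mb: int) -> int:
    """Memory set aside for VMs: a fixed amount per VM, whatever the load."""
    return vm_count * reserved_per_vm_mb


def container_used_mb(load_percents: list[int], idle_mb: int, peak_mb: int) -> int:
    """Memory used by containers in this toy model.

    Each container is assumed to use idle_mb at 0% load and peak_mb at
    100% load, growing linearly in between.
    """
    total = 0
    for load in load_percents:
        if not 0 <= load <= 100:
            raise ValueError("load must be between 0 and 100")
        total += idle_mb + (peak_mb - idle_mb) * load // 100
    return total


# Hypothetical figures: three instances at 0%, 50%, and 100% load.
loads = [0, 50, 100]
print(vm_reserved_mb(len(loads), 2048))     # 3 * 2048 = 6144
print(container_used_mb(loads, 100, 500))   # 100 + 300 + 500 = 900
```

In this model the VM figure stays the same when traffic drops, while the container figure follows the load.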
Docker is often considered better than virtual machines, but we need to understand that, in spite of having a lot of functionality and being more efficient at running applications, Docker cannot replace virtual machines. Both containers and virtual machines have their own benefits and drawbacks, and the ultimate decision will depend on our specific needs. There are, however, some general rules of thumb. Virtual machines are a better choice for running applications that require all of the operating system's resources and functionality, when we need to run multiple applications on servers, or when we have a variety of operating systems to manage. Containers are a better choice when our priority is to maximize the number of applications running on a minimal number of servers. In many situations, the ideal setup is likely to include both: the flexibility of virtual machines and the minimal resource requirements of containers work together to provide environments with maximum functionality.
Limitations of Docker and Container Orchestration:
With Docker, we can run a single instance of an application with a simple docker run command. But what happens when the number of users increases and that instance is no longer able to handle the load? We have to deploy additional instances of our application by running the docker run command ourselves multiple times. We have to keep an eye on the load and performance of our application and deploy additional instances ourselves. Not just that: we also have to keep an eye on the health of these instances, and if a container fails, we should be able to detect that and run the docker run command again to deploy another instance of the application.
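The manual chore described above can be sketched as a small planning function. It does not talk to Docker at all: it only works out which docker run commands an operator would have to type to get back to the desired number of instances. The image name example/web:1.0, the app-N naming scheme, and the helper names are hypothetical; the -d (detached) and --name options are standard docker run options.

```python
import shlex


def docker_run_command(image: str, name: str) -> list[str]:
    """Build (but do not execute) a command that starts a detached, named container."""
    return ["docker", "run", "-d", "--name", name, image]


def plan_restarts(image: str, desired: int, healthy_names: set[str]) -> list[list[str]]:
    """Return the docker run commands needed to reach `desired` healthy instances.

    Assumes instances are named app-0, app-1, ... (a made-up convention).
    A stopped container that still holds one of these names would have to
    be removed first, which this sketch does not handle.
    """
    commands = []
    for index in range(desired):
        name = f"app-{index}"
        if name not in healthy_names:
            commands.append(docker_run_command(image, name))
    return commands


# Suppose three instances are wanted and app-1 has failed.
healthy = {"app-0", "app-2"}
for command in plan_restarts("example/web:1.0", 3, healthy):
    print(shlex.join(command))  # docker run -d --name app-1 example/web:1.0
```

Someone still has to run this check repeatedly and supply the list of healthy instances, which is exactly the work that becomes impractical as the number of instances grows.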
Container orchestration is a solution to that problem. It consists of a set of scripts and tools that help host containers in a production environment. A container orchestration solution typically consists of multiple Docker hosts that host containers in such a way that if one host fails, the application is still accessible through the others. It allows us to easily deploy hundreds or thousands of instances of our application. Some orchestration solutions can automatically scale up the number of instances when the number of users increases and scale it down when demand decreases. Some can even automatically add hosts to support the user load. Beyond clustering and scaling, container orchestration solutions also provide support for advanced networking between containers across different hosts, as well as for load-balancing user requests across different hosts. They also provide support for sharing storage between hosts, as well as for configuration management and security.
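The idea of scaling up and down with demand can be sketched as a simple rule: divide the current load by what one instance can handle, round up, and keep the result within configured limits. This is an illustrative rule under assumed numbers, not the algorithm of any particular orchestrator; the function name, the per-instance capacity of 100 requests per second, and the limits of 2 and 20 instances are all hypothetical.

```python
import math


def desired_instances(
    requests_per_second: float,
    capacity_per_instance: float,
    min_instances: int,
    max_instances: int,
) -> int:
    """Return how many instances to run for the given load.

    The count is the load divided by per-instance capacity, rounded up,
    then clamped to the range [min_instances, max_instances].
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(requests_per_second / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))


# Assumed setup: each instance handles 100 requests/s, between 2 and 20 instances.
for load in (50, 950, 5000):
    print(load, desired_instances(load, 100, 2, 20))
# 50 -> 2 (floor applies), 950 -> 10, 5000 -> 20 (ceiling applies)
```

The lower limit keeps the application available when traffic is low, and the upper limit caps how far a traffic spike can grow the deployment.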
Some of the orchestration solutions are Docker Swarm from Docker and Kubernetes, originally from Google.
Docker Swarm is really easy to set up and get started with, but it lacks some of the advanced auto-scaling features required for complex production applications.
Kubernetes is a bit more difficult to set up and get started with, but it provides a lot of options to customize deployments and has support for many different vendors. Kubernetes is supported by all major public cloud service providers, such as AWS, Azure, and GCP. Running an application has become easy with Kubernetes using the Kubernetes CLI, known as kubectl. We can run a thousand instances of the same application with a single command, and Kubernetes can scale it up by another thousand instances with another command. Kubernetes can even be configured to do this automatically, so that the instances and the infrastructure itself scale up and down to manage the user load. Kubernetes can upgrade all these instances of the application in a rolling-upgrade fashion, one at a time, with a single command. If something goes wrong, it also helps us roll back to the previous images with a single command.
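The text does not name the exact commands, so the following is one plausible reading rather than the author's own workflow. The sketch builds, but does not run, the Kubernetes CLI invocations for deploying, scaling, upgrading, and rolling back. The deployment name my-web, the container name web, and the image names are hypothetical; the subcommands themselves (create deployment, scale, set image, rollout undo) are standard ones.

```python
import shlex


def create_deployment(name: str, image: str, replicas: int) -> list[str]:
    """Command to start `replicas` instances of `image` as one deployment."""
    return ["kubectl", "create", "deployment", name,
            f"--image={image}", f"--replicas={replicas}"]


def scale_deployment(name: str, replicas: int) -> list[str]:
    """Command to change the number of running instances."""
    return ["kubectl", "scale", f"deployment/{name}", f"--replicas={replicas}"]


def upgrade_image(name: str, container: str, image: str) -> list[str]:
    """Command to switch a container to a new image, triggering a rolling update."""
    return ["kubectl", "set", "image", f"deployment/{name}", f"{container}={image}"]


def roll_back(name: str) -> list[str]:
    """Command to return the deployment to its previous revision."""
    return ["kubectl", "rollout", "undo", f"deployment/{name}"]


# Hypothetical names throughout; the commands are printed, not executed.
steps = [
    create_deployment("my-web", "example/web:1.0", 1000),
    scale_deployment("my-web", 2000),
    upgrade_image("my-web", "web", "example/web:2.0"),
    roll_back("my-web"),
]
for step in steps:
    print(shlex.join(step))
```

The container name passed to the upgrade step has to match the name used inside the deployment, which depends on how the deployment was created, so treat web as a placeholder.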