A region, is a geographical region on the planet, potentially multiple datacenters in close proximity, networked together. Those datacenters are sometimes called availability zones. An availability zone, has its own independent power and networking. It is set up to be an isolation boundary. If one availability zone goes down, the other continues working. The availability zones are typically connected to each other through very fast, private fiber-optic networks.

Within the availability zone, the VMs are deployed on machines, that are organized in racks. Each rack has its own router. The virtual machines on one single physical machine may run multiple containers.

When an incoming request comes to the endpoint, it is usually first delivered to a load balancer to route the traffic to an instance of a service. The goal is to run the code on different VMs that are not close to each other to reduce the chance of single point of failure. The unit of single point of failure is called a fault domain. With this hierarchy, when:

  • a region goes down, everything inside the region is down.
  • an availability zone goes down, everything inside the availability zone is lost.
  • a rack goes down, it is the PCs that are lost.
  • a PC goes down, it is the VMs on it that are lost.