Scaling
The book The Art of Scalability (http://theartofscalability.com/) uses a scale cube model to describe three primary scaling patterns for an application, as shown in the following diagram. The x-axis of the cube represents horizontal scaling: running multiple identical clones of the application behind a load balancer that distributes the load evenly among them. This scaling pattern is quite common for handling a high number of service requests. The z-axis of the cube addresses scaling by data partitioning. In this case, each application instance deals with only a subset of the data. This scaling pattern is particularly useful for applications where the persistence layer becomes a bottleneck.
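The difference between x-axis and z-axis scaling can be illustrated with a minimal sketch. It is not taken from the book: the class names `RoundRobinBalancer` and `ShardRouter`, the instance names, and the choice of round-robin rotation and hash-based partitioning are all assumptions made for illustration. With x-axis scaling, any clone can serve any request; with z-axis scaling, the data key decides which instance is responsible.

```python
import hashlib
from itertools import cycle


class RoundRobinBalancer:
    """X-axis (hypothetical sketch): identical clones behind a load balancer."""

    def __init__(self, instances):
        # Every instance is a full clone, so requests can simply rotate.
        self._instances = cycle(instances)

    def route(self):
        return next(self._instances)


class ShardRouter:
    """Z-axis (hypothetical sketch): each instance owns a subset of the data."""

    def __init__(self, shards):
        self._shards = list(shards)

    def route(self, key):
        # A stable hash ensures the same key always maps to the same shard.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._shards)
        return self._shards[index]


# X-axis: six requests spread evenly across three clones.
balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
x_axis_targets = [balancer.route() for _ in range(6)]

# Z-axis: a given customer key is always served by the same shard.
shards = ShardRouter(["shard-0", "shard-1"])
z_axis_target = shards.route("customer-42")
```

Simple modulo hashing is used here only to keep the sketch short; it remaps most keys when the number of shards changes, so a real system would likely choose a different partitioning scheme.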
The y-axis of the cube addresses scaling by splitting the application by function or service. This pattern relates directly to the microservices pattern. The face of the cube formed by the x and y axes combines the scaling best practices for a microservices-based architecture: microservices are identified by splitting an application along bounded contexts (y-axis), and each service is then scaled by cloning it (x-axis). For microservices, cloning is done by deploying multiple instances of a service container.
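The combination of y-axis and x-axis scaling can be sketched as a small routing table. This is an illustrative assumption rather than the book's implementation: the `ServiceRegistry` class, the service names, and the convention that the first URL path segment identifies the service are all hypothetical. The request is first routed to a service by function (y-axis) and then to one of that service's container instances (x-axis).

```python
from itertools import cycle


class ServiceRegistry:
    """Hypothetical router: y-axis split by service, x-axis clones per service."""

    def __init__(self, services):
        # services maps a service name to the names of its container instances.
        self._pools = {
            name: cycle(instances) for name, instances in services.items()
        }

    def route(self, path):
        # Y-axis: the first path segment selects the service (assumed convention).
        service = path.strip("/").split("/")[0]
        if service not in self._pools:
            raise LookupError(f"no service registered for {path!r}")
        # X-axis: rotate across the clones of the selected service.
        return next(self._pools[service])


registry = ServiceRegistry({
    "orders": ["orders-1", "orders-2"],  # two clones of the orders service
    "billing": ["billing-1"],            # a single billing instance
})

paths = ["/orders/17", "/billing/3", "/orders/18", "/orders/19"]
routed = [registry.route(p) for p in paths]
```

Because each service has its own pool, a heavily used service such as the hypothetical `orders` service can be given more instances without touching `billing`.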