Microservices Design Patterns
The core concepts and related design patterns
The Pipes and Filters pattern decomposes a task that performs complex processing into a series of discrete elements (filters) that can be reused and run in parallel.
This pattern can improve performance, scalability, and reusability by allowing task elements that perform the processing to be deployed and scaled independently.
A key advantage of the pipeline structure is that it provides opportunities for running parallel instances of slow filters, which enables the system to spread the load and improve throughput.
The filters that make up a pipeline can run on different machines, which enables them to be scaled independently and take advantage of the elasticity that many cloud environments provide. A filter that's computationally intensive can run on high-performance hardware, while other less-demanding filters can be hosted on less-expensive commodity hardware.
It supports parallel processing and resiliency, which reduces outages and increases availability.
It can introduce a level of complexity, so adopt this pattern only when your logic decomposes naturally and you actually need these benefits.
Using the Pipes and Filters pattern together with the Compensating Transaction pattern is an alternative approach to implementing distributed transactions. You can break a distributed transaction into separate, compensable tasks, each of which can be implemented via a filter that also implements the Compensating Transaction pattern. You can implement the filters in a pipeline as separate hosted tasks that run close to the data that they maintain.
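The filter chain described above can be sketched in a few lines of Python. The filter names (`parse`, `validate`, `enrich`) and the record format are illustrative assumptions, not part of any specific framework:

```python
# Minimal pipes-and-filters sketch: each filter is an independent callable
# that transforms its input and hands the result to the next stage.

def parse(record: str) -> dict:
    """Filter 1: parse a raw 'key=value' record into a dict."""
    key, _, value = record.partition("=")
    return {"key": key.strip(), "value": value.strip()}

def validate(record: dict) -> dict:
    """Filter 2: reject records with an empty key."""
    if not record["key"]:
        raise ValueError("record has no key")
    return record

def enrich(record: dict) -> dict:
    """Filter 3: add derived data."""
    record["length"] = len(record["value"])
    return record

def run_pipeline(records, filters):
    """Push each record through the filters in order, skipping failures."""
    for record in records:
        try:
            for f in filters:
                record = f(record)
            yield record
        except ValueError:
            continue  # one failed record does not stop the pipeline

results = list(run_pipeline(["a=1", "=bad", "bb=22"], [parse, validate, enrich]))
```

Because each filter is a plain callable, a slow filter could be replaced by several parallel instances (for example, worker processes reading from a queue) without changing the other stages.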
The Bulkhead pattern is a type of application design that is tolerant of failure. In a bulkhead architecture, elements of an application are isolated into pools so that if one fails, the others will continue to function. It's named after the sectioned partitions (bulkheads) of a ship's hull. If the hull of a ship is compromised, only the damaged section fills with water, which prevents the ship from sinking.
Partition service instances into different groups, based on consumer load and availability requirements. This design helps to isolate failures, and allows you to sustain service functionality for some consumers, even during a failure.
The benefits of this pattern include:
Isolates consumers and services from cascading failures. An issue affecting a consumer or service can be isolated within its own bulkhead, preventing the entire solution from failing.
Allows you to preserve some functionality in the event of a service failure. Other services and features of the application will continue to work.
Allows you to deploy services that offer a different quality of service for consuming applications. A high-priority consumer pool can be configured to use high-priority services.
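A minimal way to express the bulkhead idea in code is to give each consumer group its own bounded thread pool, so exhausting one pool cannot starve the others. The group names and pool sizes below are illustrative assumptions:

```python
# Bulkhead sketch: isolated, bounded worker pools per consumer group.
from concurrent.futures import ThreadPoolExecutor

bulkheads = {
    "high_priority": ThreadPoolExecutor(max_workers=4),
    "batch":         ThreadPoolExecutor(max_workers=2),
}

def submit(group: str, task, *args):
    """Route work to the caller's isolated pool (its bulkhead)."""
    return bulkheads[group].submit(task, *args)

def work(x):
    return x * 2

# A flood of batch work only saturates the 'batch' pool; the
# high-priority pool stays free to serve its own consumers.
batch_futures = [submit("batch", work, i) for i in range(10)]
hp_result = submit("high_priority", work, 21).result()
```

The same partitioning can be applied at the deployment level (separate service instances per consumer pool) rather than inside one process.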
The Service Registry pattern is a design approach where a central registry maintains a list of available service instances, enabling dynamic service discovery in a microservices architecture. Services register themselves upon startup and deregister when shutting down. Clients query the registry to locate and communicate with services without hardcoding their locations.
Dynamic Service Discovery – Automatically detect available service instances.
Decouple Services – Avoid hardcoded service endpoints.
Load Balancing – Distribute requests across multiple instances.
Resilience – Handle service failures by rerouting requests.
Scalability – Easily scale services up/down without manual reconfiguration.
Improved Fault Tolerance – Registry detects and removes failed instances.
Simplified Service Communication – Clients don’t need to know service locations.
Support for Auto-Scaling – Works well with Kubernetes/OpenShift scaling.
Reduced Configuration Overhead – No manual endpoint updates.
Service Registry Tools:
Eureka (Netflix) – Simple, self-contained service registry.
Consul (HashiCorp) – Offers service discovery + health checks.
Zookeeper (Apache) – Distributed coordination service.
OpenShift Service Mesh (Istio + Kiali) – Advanced service discovery.
OpenShift Features:
Kubernetes Service Discovery (Built-in DNS-based discovery).
OpenShift Service Mesh (Istio-based for advanced routing).
Service Registration:
When a microservice starts, it registers itself with the registry (e.g., Eureka).
OpenShift can auto-register pods via sidecar containers (if using Istio).
Service Discovery:
Clients query the registry to get the latest list of available services.
Load balancing (e.g., Ribbon, OpenShift’s built-in load balancer) routes requests.
Health Checks & Deregistration:
Registry periodically checks service health.
Unhealthy services are removed automatically.
Dynamic Updates:
As services scale up/down, the registry updates its records.
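The registration, discovery, and health-based deregistration steps above can be illustrated with a toy in-memory registry. This is only a sketch; a real deployment would use Eureka, Consul, or Kubernetes' built-in discovery, and the TTL value and service names are assumptions:

```python
# Minimal in-memory service registry sketch: instances register with a
# heartbeat timestamp, and discovery returns only instances whose last
# heartbeat is within the TTL (stale instances drop out automatically).
import time

class ServiceRegistry:
    def __init__(self, ttl_seconds: float = 30.0):
        self.ttl = ttl_seconds
        self.instances = {}  # (service, address) -> last heartbeat time

    def register(self, service: str, address: str):
        self.instances[(service, address)] = time.monotonic()

    def heartbeat(self, service: str, address: str):
        self.register(service, address)  # refresh the TTL

    def deregister(self, service: str, address: str):
        self.instances.pop((service, address), None)

    def discover(self, service: str):
        """Return addresses whose heartbeat is still within the TTL."""
        now = time.monotonic()
        return [addr for (svc, addr), seen in self.instances.items()
                if svc == service and now - seen <= self.ttl]

registry = ServiceRegistry(ttl_seconds=30)
registry.register("orders", "10.0.0.1:8080")
registry.register("orders", "10.0.0.2:8080")
registry.deregister("orders", "10.0.0.2:8080")  # graceful shutdown
```

A client would call `discover("orders")` before each request (or cache the list briefly) instead of hardcoding an endpoint.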
The Service Discovery pattern enables microservices in a distributed system to dynamically locate and communicate with each other without hardcoded configurations. It consists of:
A Service Registry (central database of available services).
A Discovery Client (used by services to register and discover others).
This pattern is essential in cloud-native architectures (like OpenShift/Kubernetes) where services scale dynamically.
Dynamic Location Tracking – Automatically detect service instances.
Decoupling – Remove hardcoded IP/URL dependencies.
Load Balancing – Distribute traffic across healthy instances.
Resilience – Handle failures by rerouting requests.
Scalability – Support auto-scaling without manual updates.
Fault Tolerance – Unhealthy services are automatically removed.
Reduced Configuration – No need to manually update endpoints.
Cloud-Native Compatibility – Works seamlessly with Kubernetes/OpenShift.
Traffic Optimization – Enables client-side or server-side load balancing.
Consul
OpenShift Service Mesh (Istio + Kiali)
Registration:
Services register themselves in the registry (e.g., Eureka) on startup.
In Kubernetes, pods are automatically registered as endpoints via labels.
Discovery:
Clients query the registry to get a list of available instances.
Load balancing (e.g., Ribbon, Istio, or OpenShift’s built-in LB) routes requests.
Health Checks:
The registry periodically checks service health (HTTP, TCP, or custom probes).
Failed instances are removed from the registry (using readiness/liveness).
Dynamic Updates:
As services scale up/down, the registry updates in real time.
Load balancing algorithms:
Round Robin: equal distribution across servers.
Weighted Round Robin: distributes according to each server's configured capacity; adaptive and dynamic.
IP Hash: maps a given client IP to the same server, providing session stickiness.
Random: random selection.
Least Connections: routes to the server with the fewest active connections; adaptive and dynamic.
Least Response Time: routes to the fastest-responding server; adaptive and dynamic.
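Two of the strategies above are easy to sketch. The server names and connection counts are illustrative assumptions:

```python
# Round Robin vs. Least Connections, side by side.
import itertools

servers = ["s1", "s2", "s3"]

# Round Robin: hand out servers in a fixed rotation.
rr = itertools.cycle(servers)
rr_picks = [next(rr) for _ in range(4)]  # wraps back to s1 on the 4th pick

# Least Connections: pick the server with the fewest active connections.
active = {"s1": 5, "s2": 2, "s3": 7}

def least_connections():
    choice = min(active, key=active.get)
    active[choice] += 1  # the new request counts as a live connection
    return choice

lc_first = least_connections()   # s2 is least loaded (2 connections)
lc_second = least_connections()  # s2 again (now 3) vs s1 (5) and s3 (7)
```

Round Robin is stateless and predictable; Least Connections adapts to uneven request durations at the cost of tracking per-server state.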
Select a leader; if it dies, elect another one.
Coordinate the actions performed by a collection of collaborating task instances in a distributed application by electing one instance as the leader that assumes responsibility for managing the other instances.
This pattern can help to ensure that tasks do not conflict with each other, cause contention for shared resources, or inadvertently interfere with the work that other task instances are performing.
There are several strategies for electing a leader among a set of tasks in a distributed environment, including:
Selecting the task instance with the lowest instance or process ID.
Racing to acquire a shared, distributed mutex. The first task instance that acquires the mutex is the leader. However, the system must ensure that, if the leader terminates or becomes disconnected from the rest of the system, the mutex is released to allow another task instance to become the leader.
Implementing one of the common leader election algorithms such as the Bully Algorithm or the Ring Algorithm. These algorithms assume that each candidate in the election has a unique ID, and that it can communicate with the other candidates reliably.
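The simplest strategy above, electing the live instance with the lowest ID, can be sketched in a few lines. The instance IDs are illustrative; a production system would use a coordination service (e.g., ZooKeeper) rather than this toy:

```python
# Lowest-ID leader election sketch: whenever the membership set changes,
# the surviving instance with the smallest ID becomes leader.

def elect_leader(live_instances):
    """Return the lowest-ranked live instance, or None if none remain."""
    return min(live_instances) if live_instances else None

instances = {3, 7, 1, 9}
leader = elect_leader(instances)      # instance 1 leads
instances.discard(leader)             # the leader dies or disconnects...
new_leader = elect_leader(instances)  # ...and instance 3 takes over
```

The hard part in practice is not the `min` call but agreeing on the membership set: every candidate must reliably learn which instances are alive, which is what algorithms like Bully and Ring formalize.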
The accompanying diagram shows the four pillar resiliency and stability patterns that should be applied to improve solution stability:
Retry Pattern: defines how many times to retry the same operation before treating it as a failure.
Circuit Breaker: stops unlimited retries of the same requests/jobs, preventing the system from getting stuck or deadlocked.
Scheduler Job: compensates for failed and incomplete transactions left behind when retries are exhausted or the circuit breaker opens.
Leader Election: applied at the infrastructure level.
These patterns should work in collaboration, especially in large-scale solutions.
The deployment stamp pattern involves provisioning, managing, and monitoring a heterogeneous group of resources to host and operate multiple workloads or tenants. Each individual copy is called a stamp, or sometimes a service unit, scale unit, or cell. In a multi-tenant environment, every stamp or scale unit can serve a predefined number of tenants. Multiple stamps can be deployed to scale the solution almost linearly and serve an increasing number of tenants. This approach can improve the scalability of your solution, allow you to deploy instances across multiple regions, and separate your customer data.
Deployment stamps can apply whether your solution uses infrastructure as a service (IaaS) or platform as a service (PaaS) components, or a mixture of both. Typically IaaS workloads require more intervention to scale, so the pattern might be useful for IaaS-heavy workloads to allow for scaling out.
Because of the complexity that is involved in deploying identical copies of the same components, good DevOps practices are critical to ensure success when implementing this pattern. Consider describing your infrastructure as code, such as by using Bicep, JSON Azure Resource Manager templates (ARM templates), Terraform, and scripts. With this approach, you can ensure that the deployment of each stamp is predictable and repeatable. It also reduces the likelihood of human errors such as accidental mismatches in configuration between stamps.
Geodes
Deploy the service into a number of satellite deployments spread around the globe, each of which is called a geode. The geode pattern harnesses key features of Azure to route traffic via the shortest path to a nearby geode, which improves latency and performance. Each geode is behind a global load balancer, and uses a geo-replicated read-write service like Azure Cosmos DB to host the data plane, ensuring cross-geode data consistency. Data replication services ensure that data stores are identical across geodes, so all requests can be served from all geodes.
It's very important for governmental and IoT products, which have large volumes of data that should be distributed.
Sharding
Divide a data store into a set of horizontal partitions called shards.
Each shard contains unique rows of information that you can store separately across multiple computers, called nodes. All shards run on separate nodes but share the original database's schema or design.
This pattern can improve scalability when storing and accessing large volumes of data, such as a country's citizen records.
Data can be partitioned by area, which enhances performance and preserves security, but requires extra effort in analytical models and solution upgrades.
It's a special type of distributed database, as each shard manages different content (different rows), not different objects.
Sharding types
Directory (lookup) sharding, first image
Range sharding, second image
Hash sharding, third image
Geo-sharding
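Hash sharding can be sketched as follows; the node names are illustrative assumptions, and a production system would typically use consistent hashing so that adding a node does not remap most keys:

```python
# Hash sharding sketch: a stable hash of the shard key selects the node,
# so the same key always routes to the same shard.
import hashlib

NODES = ["node-0", "node-1", "node-2"]

def shard_for(key: str) -> str:
    """Map a key deterministically to one of the nodes."""
    digest = hashlib.sha256(key.encode()).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]

# The same citizen ID always lands on the same shard.
assert shard_for("citizen:12345") == shard_for("citizen:12345")
```

Range and directory sharding differ only in the routing function: a sorted range lookup or an explicit key-to-shard mapping table instead of the hash.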
You should consider the following:
1. Consolidation model
2. Analytical model
3. Sync intervals
Alternatives
Partitioning
Replication
Vertical scaling
Throttler: in the simplest form of API throttling, the throttler is part of the API server; it monitors the number of API requests per second and minute, per user or per IP address, based on user authentication.
Rate limiter: the practice of limiting the number of requests that can be made to an API within a specific time period.
Control the consumption of resources used by an instance of an application, an individual tenant, or an entire service. This pattern can allow the system to continue to function and meet service level agreements, even when an increase in demand places an extreme load on resources.
As a guideline, do not serve data that was not requested or that exceeds the customer's/user's needs.
In the cloud, processing consumes metered resources and can become more expensive than an on-premises model.
There are many strategies available for handling varying load in the cloud, depending on the business goals for the application. One strategy is to use auto-scaling to match the provisioned resources to the user needs at any given time. This has the potential to consistently meet user demand, while optimizing running costs. However, while auto-scaling can trigger the provisioning of additional resources, this provisioning isn't immediate. If demand grows quickly, there can be a window of time where there's a resource deficit.
An alternative strategy to auto-scaling is to allow applications to use resources only up to a limit, and then throttle them when this limit is reached. The system should monitor how it's using resources so that, when usage exceeds the threshold, it can throttle requests from one or more users.
This pattern should be used for governance and be known to the infrastructure team, because it affects the operational cost, which is shared between teams.
The advantages of this pattern are that it controls cost and helps prevent DDoS attacks.
If your solution does not require throttling, do not enable it.
Preventing overloading of servers: Helps prevent overloading of servers by controlling the rate at which requests are received. By restricting the number of requests made within a certain time frame, you can maintain the stability and responsiveness of your servers.
Protecting against malicious attacks: Protects against malicious attacks, such as denial of service (DoS) attacks, which are intended to flood servers with excessive requests. By limiting the rate at which requests can be made, you can prevent these types of attacks from causing damage.
Managing resources and costs: Manages resources and costs by controlling the usage of APIs. By limiting the number of requests that can be made, you can use your resources in the most efficient way and avoid incurring unnecessary costs associated with excessive API usage.
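A common rate-limiting mechanism is the token bucket: each client holds up to `capacity` tokens that refill at `rate` tokens per second, and a request is rejected when no token is available. The capacity and rate values below are illustrative assumptions:

```python
# Token-bucket rate limiter sketch.
import time

class TokenBucket:
    def __init__(self, capacity: int, rate: float):
        self.capacity = capacity      # burst size
        self.rate = rate              # refill rate, tokens per second
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        """Consume one token if available; otherwise reject the request."""
        now = time.monotonic()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(capacity=3, rate=1.0)
decisions = [bucket.allow() for _ in range(5)]  # burst of 3, then throttled
```

In a real API gateway one bucket would be kept per user or per IP address, matching the throttler description above.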
Plan for extending your functionality beyond the core.
Deploy components of an application into a separate process or container to provide isolation and encapsulation. This pattern can also enable applications to be composed of heterogeneous components and technologies.
This pattern is named Sidecar because it resembles a sidecar attached to a motorcycle. In the pattern, the sidecar is attached to a parent application and provides supporting features for the application. The sidecar also shares the same lifecycle as the parent application, being created and retired alongside the parent.
Example:
Infrastructure API. The infrastructure development team creates a service that's deployed alongside each application, instead of a language-specific client library to access the infrastructure. The service is loaded as a sidecar and provides a common layer for infrastructure services, including logging, environment data, configuration store, discovery, health checks, and watchdog services.
Applications and services often require related functionality, such as monitoring, logging, configuration, and networking services. These peripheral tasks can be implemented as separate components or services.
If they are tightly integrated into the application, they can run in the same process as the application, making efficient use of shared resources. However, this also means they are not well isolated, and an outage in one of these components can affect other components or the entire application. Also, they usually need to be implemented using the same language as the parent application. As a result, the component and the application have close interdependence on each other.
A sidecar service is not necessarily part of the application, but is connected to it. It goes wherever the parent application goes. Sidecars are supporting processes or services that are deployed with the primary application.
The sidecar pattern is often used with containers and referred to as a sidecar container or sidekick container.
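The division of labor can be sketched in a single file: the "app" only emits events, while a "sidecar" running beside it handles the peripheral concern (here, log formatting). This is a conceptual sketch only; in a real deployment the two would be separate containers in the same pod, and the channel and event names are assumptions:

```python
# Sidecar sketch: the application emits raw events; a companion worker
# consumes them and applies the cross-cutting concern (formatting/forwarding).
import queue
import threading

log_channel = queue.Queue()
collected = []

def sidecar():
    """Peripheral concern: consume and format the app's raw events."""
    while True:
        event = log_channel.get()
        if event is None:
            break  # shutdown signal: the sidecar shares the parent's lifecycle
        collected.append(f"[app] {event}")

worker = threading.Thread(target=sidecar)
worker.start()

# The application itself stays free of logging logic.
log_channel.put("started")
log_channel.put("request handled")
log_channel.put(None)  # the sidecar retires alongside the parent
worker.join()
```

The key property shown is isolation of the concern, not the transport: swapping the in-process queue for a localhost socket or shared volume turns this into the container form of the pattern.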
Dr. Ghoniem Lawaty
Tech Evangelist @TechHuB Egypt