These are best practices for developing web applications that guarantee smooth implementation, scaling, and operation of the solution.
It has three phases of evolution:
The 12-Factor app for building a scalable solution
The 15-Factor app for meeting cloud-native and microservices architecture requirements
Moving forward to cloud-native architecture
The methodology comes from Heroku, the PaaS platform, as principles for building cloud-native, scalable applications, so it comes from real practice, not a theoretical point of view.
Scalability
Environment consistency
Flexibility
Cloud Native
Operationalization
Continuous improvement
Have one single repository for the source code, manage the branches that should be deployed to each environment, and use feature-based branches to keep the proper level of traceability.
Old approach: multiple codebases for different environments, instead of one codebase tracked in version control with many deploys.
Simplify Version control
Simplify collaboration
Enablement for release train
Master branch for production
Development branch
Staging branch
Feature-based branch
Fixes branch
Ad-Hocs branch
NA
Isolate the solution's dependencies instead of using the system's internal components; every dependency should be explicitly declared and isolated.
It can be:
Operating systems components
Solution dependencies components
As implementation practices, you have package.json in Node.js, pom.xml in Java (Maven), build.gradle in Java (Gradle), and .csproj in C# projects.
Old approach: relying on system-level packages and manual installs, instead of declaring all dependencies explicitly and isolating them.
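For example, in Node.js all dependencies of the solution can be declared explicitly in package.json; the package names and versions below are illustrative:

```json
{
  "name": "orders-service",
  "version": "1.0.0",
  "dependencies": {
    "express": "^4.18.0"
  },
  "engines": {
    "node": ">=18"
  }
}
```

Nothing is assumed from the host system: the runtime version and every library are declared, so any environment can rebuild the same dependency tree.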
Docker and containerization
Immutable infrastructure: no change after deployment; you should rebuild and redeploy.
Ensure consistency across environments.
Readiness for portability and shifting the solution from one environment to another
Messaging components at OS level.
Java SDK version
NA
Store configuration in the environment, not the code.
Old Approach: Hardcoded configs in code or files (e.g., web.config, .env), instead of storing configuration in environment variables.
Separates config from code; enhances security.
Increase maintainability
Increase operationalization
Environment independent configuration
Configurations: In config maps
Database connection strings
Average price calculation strategy
Default currency
Secrets: in secret maps, vaults, with auto-rotation enablement
Passwords
Secret Keys
Encryption Keys
External configuration design pattern
Realtime reconfiguration design pattern
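A minimal Python sketch of storing configuration in the environment instead of the code; the variable names and defaults are illustrative:

```python
import os

# Configuration comes from environment variables, not from code or config files.
# Secret-like values (connection strings) are injected by the platform.
DB_CONN = os.environ.get("DB_CONNECTION_STRING")

def get_default_currency() -> str:
    # Behavior setting with a safe default. Re-reading the environment on
    # each call keeps the door open for realtime reconfiguration.
    return os.environ.get("DEFAULT_CURRENCY", "USD")
```

The same build runs in every environment; only the injected environment variables differ.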
Replacing any service that you are using should not take effort.
For example, if you have to replace the SMS gateway service, you should not need to reimplement it in the code, which implies applying the configuration practice.
This principle, called interchangeability, means change with zero coding effort.
Hardwired service bindings (e.g., fixed DB address in code), instead of treating backing services as interchangeable resources.
Facilitates swapping services without code changes.
Decrease coupling
Switch from local SQL Server to cloud DB without code change.
Switching the SMS gateway without code change
Use Redis instead of in-memory cache
Switching Kafka Vs RabbitMQ
Think of it this way: Config = “How should I behave?”, Backing Services = “Who should I connect to?”
Both use configuration
Config relates to runtime behavior, while backing services relate to dependencies
Config is not about swappability, while backing services focus exactly on that
The config is owned by the application team, while backing services are owned by external systems
Change effect: config affects behavior, while backing services affect what the app connects to
External configuration store
Realtime reconfiguration
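The interchangeability idea can be sketched in Python: the SMS gateway is resolved from configuration behind a common contract, so swapping providers needs no code change. The provider names and the environment variable are illustrative:

```python
import os

class SmsGateway:
    """Common contract every SMS backing service must fulfil."""
    def send(self, to: str, text: str) -> str:
        raise NotImplementedError

class TwilioGateway(SmsGateway):
    def send(self, to, text):
        return f"twilio:{to}"        # stand-in for a real provider call

class LocalTelcoGateway(SmsGateway):
    def send(self, to, text):
        return f"telco:{to}"

# Which concrete provider is used is a config decision, not a code decision.
_PROVIDERS = {"twilio": TwilioGateway, "telco": LocalTelcoGateway}

def sms_gateway() -> SmsGateway:
    name = os.environ.get("SMS_PROVIDER", "telco")
    return _PROVIDERS[name]()
```

Switching from the local telco to Twilio is one environment-variable change; the calling code never sees the difference.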
In this principle, we segregate between three phases:
Build:
Convert code into a runnable package, which includes: Compilation, bundling assets, and resolving dependencies.
Developer responsibilities
CI takes place here to make sure that the build passes the related governance practices according to the agreed quality gate
Release:
Release = Build + configuration
Combine the build with config (env vars) to create a release, Including Assign version, attach config/secrets.
DevOps responsibilities/CI Pipeline
CD:
To be manual, especially in critical solutions like banking
Run:
Run = Release + environment execution
Execute the app in an environment using the release package.
Like Kubernetes or Docker run
DevOps responsibilities
Old approach: manual steps with no separation between build, release, and runtime, instead of separating build (compile), release (package), and run (execute)
Separation of concerns
Enhances deployment reliability
Rollback capabilities
Secure deployment process, by eliminating the ability to deploy at any time
Pipeline CICD
Docker Image build
NA
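The build/release/run separation can be sketched as a conceptual Python model (an illustration only, not a real pipeline): a release is an immutable, versioned combination of a build artifact and configuration, and the run stage only executes an existing release.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)          # frozen = immutable: no change after creation
class Build:
    artifact: str                # e.g. a container image digest

@dataclass(frozen=True)
class Release:
    version: str                 # every release gets a unique version
    build: Build
    config: dict = field(default_factory=dict)

def cut_release(build: Build, config: dict, version: str) -> Release:
    # Release = build + environment configuration, assigned a version.
    return Release(version=version, build=build, config=dict(config))

def run(release: Release) -> str:
    # The run stage only executes a release; it never rebuilds or reconfigures.
    return f"running {release.build.artifact} as {release.version}"
```

Because a release is frozen, a rollback is simply running the previous release again.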
When the application is stateful, it relies on saved user session data.
Your solution should be implemented with stateless sessions, in order to serve multiple requests without additional overhead on the server.
Old approach: long-running monolithic apps with internal session state, instead of stateless processes that are disposable and scalable.
The stateful model will live at the database level, and the handling of the state will be injected at the implemented business layer, in order to:
Guarantee that the solution will run properly in case it needs to be restarted.
Application of the retry pattern
Application of scheduler jobs pattern
Resiliency, especially in microservices architecture
In stateless architecture, the state is not recorded in the code; it's stored in a different layer:
Enterprise Caching layer
Database layer
Web APIs for integration with the mobile solution
Background scheduler jobs
Transaction compensation design patterns
Retry
Circuit breaker
Scheduler agent
Cache-Aside
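Among the patterns above, the retry pattern can be sketched in Python; the attempt count and backoff delays are illustrative assumptions:

```python
import time

def with_retry(operation, attempts=3, delay=0.1):
    """Call `operation`; on failure, retry up to `attempts` times.

    This is safe only because the process is stateless: a failed or
    restarted call leaves no partial state behind in the process itself.
    """
    last_error = None
    for i in range(attempts):
        try:
            return operation()
        except Exception as exc:          # real code should catch transient errors only
            last_error = exc
            time.sleep(delay * (2 ** i))  # exponential backoff between attempts
    raise last_error
```

A transient failure (e.g. a dropped database connection) is absorbed by retrying, instead of being surfaced to the caller.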
Move forward to self-contained services, without the need for an application server as a parent process (like Apache or IIS); the old approach requires an application server, and it binds the application to a specific IP and port.
In this principle, the application/services are self-exposed via specific port binding, either hardcoded or configured from external configuration.
No potential conflicts, as the architecture will be implemented as follows:
Each container has its own IP
Each container is self-contained with a port
Ingress routing reroutes the external routes to internal routes:
https://api.myapp.com → goes to service-api (port 8080)
https://auth.myapp.com → goes to service-auth (also 8080)
Old approach: depends on external servers like Apache/Nginx to serve the app, instead of a self-contained service exposing a port (e.g., via Kestrel in ASP.NET Core and container-based microservices)
Simplify selecting communication ports: for example, if you have 3 servers with 3 apps and lots of microservices, use port binding for external calls to these self-contained services' ports, using routing in the load balancer
Simplifies service discovery and interaction.
Portability: with hundreds of microservices you will not be able to configure each one manually, which makes port binding suitable for containerization
Increase security: because you have specified which ports can talk to this container
Node.js apps: app.listen(5000) in Express
Service discovery
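A minimal Python sketch of the same idea: the service is self-contained and binds its own port, taken from configuration with a default fallback. The PORT variable name and defaults are illustrative:

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

def bound_port() -> int:
    # The port is external configuration, with a default fallback.
    return int(os.environ.get("PORT", "8080"))

class HealthHandler(BaseHTTPRequestHandler):
    # Minimal handler: the service answers on its own bound port.
    def do_GET(self):
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

def serve():
    # Self-contained: the process itself listens; no Apache/IIS parent process.
    HTTPServer(("0.0.0.0", bound_port()), HealthHandler).serve_forever()
```

Each container runs one such process on its own IP, and ingress routing maps external hostnames onto these internal ports.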
This principle focuses on the ability of the software to scale out and scale in according to the workload, without modification of the solution architecture.
Within OpenShift, you can configure the minimum pods, the maximum pods, and when to scale out.
So, the infrastructure level detects the workload and creates new instances of the application to serve operational needs.
Scaling by adding threads or servers manually (Vertical Scaling), instead of scaling out using multiple processes/instances per workload type(Horizontal Scaling).
Enhances application performance and availability
Enhance scalability
OpenShift MinMax scaling
Scaling trigger: CPU utilization > 70%
Design Phase
Business SLA
Use architecture assumptions
Givens
Expected concurrent requests per second: 5000 requests
Business SLA: 10 milliseconds (0.01 seconds) response time
Request size: 1 KB
Resolution
Concurrency = RPS × Response time (sec) = 5000 × 0.01 = 50 concurrent requests
Define pod capacity:
H = 500 (pod CPU, in millicores) / 50 (millicores per request) = 10 concurrent requests per pod
(based on 500m CPU and 512MB RAM — a typical light app)
Number of pods = Concurrency / H = 50 / 10 = 5 pods
Estimate bandwidth: Bandwidth = RPS × Request size = 5000 × 1 KB = 5 MB/sec
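The sizing above can be checked with a short Python calculation, using the same assumed numbers (5000 RPS, 10 ms response time, 10 concurrent requests per pod, 1 KB request size):

```python
rps = 5000              # expected requests per second
response_time_s = 0.01  # business SLA: 10 ms per request
pod_capacity = 10       # concurrent requests one pod handles (500m CPU, 512MB RAM)
request_size_kb = 1

concurrency = rps * response_time_s            # in-flight requests (Little's law)
pods = concurrency / pod_capacity              # pods needed to carry the load
bandwidth_mb_s = rps * request_size_kb / 1000  # inbound bandwidth estimate

print(concurrency, pods, bandwidth_mb_s)
```

With these assumptions the result is 50 concurrent requests, 5 pods, and 5 MB/sec of bandwidth, matching the figures above.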
Development phase
Load testing for 1 POD
Resize the model
Production phase
Load testing
Growth and resize
Service discovery: New replicas are registered dynamically to the load balancer
The objectives of this principle are:
Easy to start up
Easy to scale out
Easy to scale down
Easy to shut down, with cleaning of resources
Slow startup
Unsafe shutdown
Poor resilience, instead of fast startup and graceful shutdown
Cost of outage and shutdown for upgrades and hot fixes
Zero-downtime deployment, as you stop sending requests to old pods once the new pods are up and running (graceful approach), using SIGTERM and SIGKILL
Robust Horizontal scaling
App responds to termination signals (SIGTERM)
SIGTERM → App can intercept, finish its work, and clean up resources (gracefully).
SIGKILL → App is immediately killed by OS. No time for cleanup.
Stops accepting new requests
Finishes in-flight work (requests, jobs, etc.)
Closes open resources (DB, queues, etc.)
Exits cleanly within allowed time (e.g. 30s)
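The graceful-shutdown steps above can be sketched in Python with a SIGTERM handler (a minimal illustration; a real service would also drain its HTTP server and close database and queue connections):

```python
import signal

shutting_down = False

def handle_sigterm(signum, frame):
    # SIGTERM can be intercepted: stop taking new work, finish in-flight
    # work, and release resources before exiting. SIGKILL cannot.
    global shutting_down
    shutting_down = True

signal.signal(signal.SIGTERM, handle_sigterm)

def accept_request() -> bool:
    # New requests are rejected once shutdown has started;
    # in-flight work is allowed to finish.
    return not shutting_down
```

Kubernetes sends SIGTERM first and only sends SIGKILL after the grace period (e.g. 30s) expires.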
Stateless design
Preload dependencies
Use liveness to ensure self-healing.
Use readiness to ensure a graceful startup and deployment.
Liveness probe in the YAML file: if it fails, the pod is considered unhealthy, so it is killed and a new one is created
livenessProbe:
httpGet:
path: /healthz
port: 8080
initialDelaySeconds: 15 #first inspection for the pod will be after 15 seconds
periodSeconds: 10 #next inspection every 10 seconds
Readiness property in YAML file: Means pod is ready to receive requests
readinessProbe:
httpGet:
path: /ready
port: 8080
initialDelaySeconds: 5 #wait 5 seconds till receiving first request
periodSeconds: 5 #next after 5 seconds
Cache aside
Service Discovery
Pod Created
→ Container(s) start launching.
Startup Probe Executes (if defined)
→ Waits for the app to fully initialize.
→ Readiness/Liveness are paused during this phase.
Startup Probe Passes → Liveness & Readiness Start
Readiness Probe: Controls traffic routing (is the app ready?).
Liveness Probe: Ensures health over time (is the app alive?).
If Readiness fails → pod removed from service
If Liveness fails → pod is restarted
Keep development, staging, and production as similar as possible
Make the environments similar, with the same dependencies, tools, and database; otherwise, use containerization.
The objective of this principle is to prevent risks when moving the code from development to the production environment, which result from variation between environments, different components, and factors.
Different tools/versions/configs between dev and production, instead of Keep development, staging, and production environments as similar as possible
Reduces environment-specific bugs.
Same containers
Same Backing services (Like database type, message queues, ..)
NA
Treat logs as event streams
In a large-scale solution, in order to confirm that your solution is alive and in a healthy state, you should have a continuous stream of logs that captures what is running in your solution.
It should be implemented independently, like EFK.
Event streaming
Realtime analysis
Centralized logs
Can be handled as data pipeline
Collect: Using FluentD
Transport: Kafka/Redis stream
Analysis: Logstash
Store: Elasticsearch
Visualize/Alert: Kibana/Grafana/Prometheus
Retention
Store logs in local files or ignore them, instead of Stream logs to stdout/stderr and aggregate externally
Facilitates centralized monitoring and debugging
Visualization
Transparency and control
Governance
EFK (Elasticsearch, FluentD, Kibana)
ELK (Elasticsearch, Logstash, Kibana)
Loki + Grafana
Sidecar design pattern:
Collect Docker logs
Application logs
Errors
Info
Traces
Access Logs
Security logs
Traffic Logs
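A minimal Python sketch of treating logs as an event stream: the application only writes structured events to stdout, and shipping and retention are left to the platform. The field names are illustrative:

```python
import json
import sys
import datetime

def log_event(level: str, message: str, **fields):
    """Emit one structured log event to stdout.

    The app does not manage files or retention; a collector such as
    FluentD picks up the stream and forwards it to the log pipeline.
    """
    event = {
        "ts": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "level": level,
        "message": message,
        **fields,
    }
    line = json.dumps(event)
    print(line, file=sys.stdout)
    return line  # returned to ease testing
```

Because each line is self-describing JSON, the same stream feeds centralized search, visualization, and alerting.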
Run admin/management tasks as one-off processes.
Short-lived, separate from the main app.
These are jobs that run once and are no longer needed afterwards, like:
Data migration
Update the database structure for the new deployment
Data cleansing
You have 2 types of functionalities:
Business functionalities, which should be embedded within your business modules
Administrative functionalities, which should be implemented as separate one-off processes, to enable the operationalization of the solution, like:
Managing users
Data cleansing
Backup/restore operations
Old approach: manual admin tasks on production machines (risky), instead of running admin tasks as one-off processes inside the app context.
Keeps admin tasks consistent with the application environment.
Fast startup and scale-up for the solution.
Scaling the PODs
Related Design patterns:
Scheduler jobs: Can run one-off jobs
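A one-off admin process can be sketched in Python as a small entry point that shares the app's code and configuration but runs once and exits; the task names are illustrative:

```python
import argparse

def migrate_data():
    # One-off: runs once per deployment, then the process exits.
    return "migration done"

def cleanse_data():
    return "cleansing done"

TASKS = {"migrate": migrate_data, "cleanse": cleanse_data}

def main(argv):
    # Runs in the same release/environment as the app itself,
    # e.g. launched as a Kubernetes Job rather than by hand on a server.
    parser = argparse.ArgumentParser(description="One-off admin tasks")
    parser.add_argument("task", choices=TASKS)
    args = parser.parse_args(argv)
    return TASKS[args.task]()
```

Because the task runs in the app's own context, it sees the same dependencies and configuration as production code, avoiding the drift of manual scripts.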
Adding 3 new factors:
API-First architecture/strategy is an approach that shifts the software development mindset to consider APIs as an independent product, with its own journeys to achieve business objectives, independent from the UX journey that may mislead/abuse the API architecture and design.
You can imagine it by thinking that your final product is the APIs, and that you will commercialize them alone, without UX.
It means, you have API as a product, so you have:
Product manager
Product Roadmap
And product lifecycle
As an example, consider the target personas of your product (the APIs), such as the mobile app, web app, and smart watch.
So, we start with the API, not the application.
So, it's about "how you will imagine the solution based on the API first", not "how to build the API upon a design-first solution".
Async Development
Reduce development Cost
Reduce time-to-market
Just like the software engineering lifecycle, the API passes through all the journeys shown in the opposite diagram, considering that your product is the APIs, including their documentation and journeys.
As a prerequisite, we should build the culture, which implies the following principles:
Your API is a product
Foundational design, not ad hoc retrofit
Team collaboration and impact
API-first supports microservices
The API contract
Step 1: Create the API microservice architecture according to DDD, in which each object has the following attributes:
Who am I? I'm the Employee class
What do I know? I know my code, name, birthdate, current salary, and current role.
What do I do? I can create, delete, activate, deactivate, update, block, and unblock an employee.
What is my state? Like: blocked, unblocked, active, and inactive employee
Step 2: Determine key domains
Step 3: Determine each domain lifecycle
Step 4: Model your architecture domains
Step 5: Model your architecture sequence diagrams
Step 6: Model your architecture state-chart diagrams
Step 7: Develop CRUD APIs according to data models characteristics
Step 8: Create the journeys according to the sequence diagrams, with validating the pre-requisites
Step 9: Create APIs to validate product states
Step 10: Create Security model for the APIs
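Step 1's object attributes (who am I, what do I know, what do I do, what is my state) can be sketched as a Python domain class; this is a simplified illustration of the Employee example:

```python
class Employee:
    """Who am I? An Employee domain object."""

    def __init__(self, code: str, name: str):
        # What do I know: my identifying and descriptive data.
        self.code = code
        self.name = name
        # What is my state: active/inactive and blocked/unblocked.
        self.active = False
        self.blocked = False

    # What do I do: the operations the domain allows on me.
    def activate(self):
        self.active = True

    def deactivate(self):
        self.active = False

    def block(self):
        self.blocked = True

    def unblock(self):
        self.blocked = False
```

The CRUD APIs of Step 7 and the state-validation APIs of Step 9 are then derived directly from these attributes, operations, and states.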
API Improvement model
As the opposite figure shows, we have the consumer and the publisher; they share in improving the model, due to the improvement required after design, as is the nature of any product lifecycle.
Are you an API-first company?
If you have the following characteristics, you can consider yourself an API-first company:
You have APIs that operate on and maintain your data models
You provide your APIs as an independent product
You make APIs available to your customers and partners as a source of your revenue stream
You know how to manage and discover your APIs
You have standardized processes to build APIs
Your APIs are independent from any UX design
API versioning is a practice in software development that involves managing and maintaining different versions of an Application Programming Interface (API). An API is a set of rules and protocols that allows one software application to interact with and request services or data from another software component, such as a web service or library. API versioning is essential to ensure that changes and updates to an API do not break existing clients or applications that rely on it.
Compatibility
Preventing Breaking Changes
Client Isolation
Sunset enabling
Business security and governance (Like creating API banking platform using API first, and then building the application upon)
So, it meets the strategic approach of the enterprise, driven by the business transformation team, according to the following perspectives:
How will you build your upcoming products
How will your external entities integrate with your product
How will you monetize your product
How APIs will cover your overall business
Different deployment
Headers
URI routing
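The URI-routing strategy can be sketched in Python: the version prefix in the path selects which handler serves the request. The paths and payloads are illustrative:

```python
# Each version keeps its own handler, so v1 clients are isolated
# from breaking changes introduced in v2.
def get_employee_v1(emp_id: str) -> dict:
    return {"id": emp_id, "name": "Sara Ali"}

def get_employee_v2(emp_id: str) -> dict:
    # v2 made a breaking change: the name is now structured.
    return {"id": emp_id, "name": {"first": "Sara", "last": "Ali"}}

ROUTES = {
    "/v1/employees": get_employee_v1,
    "/v2/employees": get_employee_v2,
}

def dispatch(path: str, emp_id: str) -> dict:
    return ROUTES[path](emp_id)
```

Header-based versioning works the same way, except the version is read from a request header (e.g. Accept) instead of the path.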
Level 01:
Change management
Control integration touch points by TL
Announcement model for integration touch points
Application of unit testing for integration packages
Level 02:
Apply the selected versioning strategy
Enable a support period for minor versions (quarter-based)
Backward compatibility according to the release management process, which may be monthly, quarterly, or yearly
Quarterly code refactoring process to reset minor-version backward compatibility
Handle legacy versions: responses and stoppage
Level 03:
Application of design patterns: the aggregation pattern, to reduce the touched services, which gives the team the freedom to change the signature
API catalog model completeness, which should be deployed on an API management platform, like Swagger, including its documentation
Level 04:
Provisioning model for monetization: used in authentication and authorization for the given tokens, to control who can do what and to enrich the monetization of your exposed APIs
The practice called API Version Consolidation (or API Version Aggregation) is mandatory when applying the API-first approach.
Reduce maintenance overhead: Maintaining many versions leads to duplicated logic and higher testing costs.
Improve developer experience: Consolidated APIs are easier to understand, use, and document.
Encourage standardization: Forces consistency across versions, reducing divergence.
Simplify security and governance: Fewer versions mean fewer attack surfaces and simplified auditing.
Deprecate old versions with clear timelines
Review versions different scenarios
Study scenarios consolidation approach
Announce
Cut-off date
Staging testing
Deploy together
Use semantic versioning (v1, v2, etc.).
Support backward compatibility within reason.
Communicate changes transparently with API consumers.
Telemetry means the automatic gathering of measurements that are used in data analysis during the application journey on the operations platform, so it is different from logging.
In telemetry, we have different measurements and analytics at each layer.
The objectives:
Observability
Incident Response
Performance Optimization
Security Monitoring
Tools:
Logging: ELK (Elasticsearch + Logstash + Kibana), Loki
Metrics: Prometheus, Grafana
Tracing: Jaeger, Zipkin, OpenTelemetry
All-in-One: Datadog, New Relic, Dynatrace, Azure Monitor
Why is it different from the 11th factor, Logs?
The Logs factor focuses only on collecting logs, not metrics
Telemetry focuses on full observability of the infrastructure and the solution
Logs focus on what happened, while telemetry focuses on why, how often, and where
Telemetry
Logs: EFK
Metrics: Prometheus style in Java Spring Boot (micrometer + Prometheus)
Enable distributed tracing: OpenTelemetry + Jaeger
Traces visualization: Jaeger UI
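A minimal sketch of a Prometheus-style counter in plain Python, exposing metrics in the text exposition format that Prometheus scrapes. This is a conceptual illustration, not the official client library:

```python
class Counter:
    """Monotonic counter in the spirit of Prometheus counters."""

    def __init__(self, name: str, help_text: str):
        self.name = name
        self.help_text = help_text
        self.value = 0.0

    def inc(self, amount: float = 1.0):
        if amount < 0:
            raise ValueError("counters can only go up")
        self.value += amount

    def exposition(self) -> str:
        # Prometheus text format: HELP line, TYPE line, then the sample.
        return (f"# HELP {self.name} {self.help_text}\n"
                f"# TYPE {self.name} counter\n"
                f"{self.name} {self.value}\n")

requests_total = Counter("http_requests_total", "Total HTTP requests served")
requests_total.inc()  # one request served
```

In a real Spring Boot or Python service, a client library (Micrometer, prometheus_client) maintains such metrics and serves them on a /metrics endpoint for Prometheus to scrape.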
There are five new approaches:
Containers (Docker)
Orchestration
Microservices architecture
Serverless architecture
DevSecOps
In software engineering, containerization is operating-system-level virtualization or application-level virtualization over multiple network resources, so that software applications can run in isolated user spaces called containers in any cloud or non-cloud environment, regardless of type or vendor.
Kubernetes orchestration allows you to build application services that span multiple containers, schedule containers across a cluster, scale those containers, and manage their health over time. Kubernetes eliminates many of the manual processes involved in deploying and scaling containerized applications
Adding security at the pipeline level, which includes SAST (Static Application Security Testing), in order to detect potential vulnerabilities before going to production.
One of the most well-known tools is SonarQube (SonarCloud), which provides on-premises and cloud-based services that can be injected into the pipeline.
How does it work?
Enable on code branch
Create a quality gate
Analyse code manually or on build
Manage the build pipeline according to the quality gate
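As an illustration, a SonarQube analysis is typically driven by a small properties file in the repository; the property keys below follow the standard sonar-project.properties format, while the values are assumptions:

```properties
# sonar-project.properties — read by sonar-scanner in the CI pipeline
sonar.projectKey=my-org_orders-service
sonar.sources=src
# fail the pipeline step when the quality gate fails
sonar.qualitygate.wait=true
```

The pipeline invokes the scanner on each build of the enabled branch, and the quality-gate result then gates whether the build may proceed to release.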
Dr. Ghoniem Lawaty
Technology Evangelist