A data architecture describes how data is managed during its journey, from collection through to transformation, distribution, and consumption. It sets the blueprint for data and the way it flows through data storage systems. It is foundational to data processing operations and artificial intelligence (AI) applications.
It includes the following:
Data Mesh: a modern, decentralized data architecture paradigm that treats data as a product, with domain-oriented ownership and self-serve infrastructure. It is especially useful in large, complex organizations. As an organization, you should model your data architecture clearly: which databases you have, how they relate to each other, and which nodes host them.
Conceptual data models: Database level:
They are also referred to as domain models and offer a big-picture view of what the system will contain, how it will be organized, and which business rules are involved.
Logical data models: Schema and relations level:
They are less abstract and provide greater detail about the concepts and relationships in the domain under consideration.
Integration between different data models
Physical data models: Entities level:
They are the least abstract of the three and describe how the data will be physically stored in a specific database: tables, columns, data types and their corresponding lengths, keys, and indexes. By contrast, logical data models follow one of several formal data modeling notation systems, indicate data attributes, and show the relationships among entities, but they don't specify any technical system requirements.
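As a minimal sketch of how the three levels relate, assume a hypothetical domain with customers and orders (the entity, table, and column names are illustrative and not taken from the text above). The conceptual level names entities and rules, the logical level lists attributes and relationships without any engine details, and the physical level is DDL for one specific engine, here SQLite through Python's standard `sqlite3` module.

```python
import sqlite3
from dataclasses import dataclass

# Conceptual level: entities and a business rule, stated in plain words.
CONCEPTUAL = {
    "entities": ["Customer", "Order"],
    "rules": ["A customer places zero or more orders"],
}

# Logical level: attributes and relationships, no engine-specific details.
@dataclass
class Customer:
    customer_id: int
    name: str

@dataclass
class Order:
    order_id: int
    customer_id: int  # refers to Customer.customer_id
    total: float

# Physical level: concrete types, lengths, and keys for one engine (SQLite).
PHYSICAL_DDL = """
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        VARCHAR(100) NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    total       NUMERIC(10, 2) NOT NULL
);
"""

def build_physical_schema() -> sqlite3.Connection:
    """Create the physical tables in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(PHYSICAL_DDL)
    return conn

def table_names(conn: sqlite3.Connection) -> list:
    """List the tables that exist in the database, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]
```

Calling `table_names(build_physical_schema())` lists the two tables created from the DDL. Only the physical level would change if the same logical model were moved to another engine.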
Data always serves a use case: it supports a solution with specific objectives, and it passes through the following lifecycle:
User inputs
Integration landscape
Content Crawling
Anti-Corruption Layer
To keep independent databases that work together running smoothly, place an anti-corruption layer between them so that one system's model does not leak into another's
Integration considerations should be addressed as part of the design
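One way to picture an anti-corruption layer is as a small translator that sits at the boundary and converts an external record into your own model. The sketch below assumes a hypothetical external CRM whose field names (`CUST_NO`, `FNAME`, `LNAME`, `EMAIL_ADDR`) are invented for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Customer:
    """Shape used inside our own domain model."""
    customer_id: int
    full_name: str
    email: str

def from_legacy_crm(record: dict) -> Customer:
    """Translate a record from the (hypothetical) external CRM into our model.

    Naming and formatting quirks of the external system are handled here,
    so the rest of the application never sees them.
    """
    return Customer(
        customer_id=int(record["CUST_NO"]),
        full_name=f'{record["FNAME"].strip()} {record["LNAME"].strip()}',
        email=record["EMAIL_ADDR"].strip().lower(),
    )
```

If the external system later renames a field, only `from_legacy_crm` should need to change.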
Data health checker
How do you incrementally check the integrity of your data?
How do you report integrity issues?
How do you resolve them?
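The first two questions can be sketched as a check that only looks at rows added since the last run and collects its findings in a report. This is a minimal illustration: the order and customer shapes are hypothetical, and it assumes ids grow monotonically so that the highest checked id can serve as a checkpoint. Resolution is left out because it depends on business policy.

```python
from dataclasses import dataclass, field

@dataclass
class HealthReport:
    checked: int = 0
    issues: list = field(default_factory=list)

def check_orders_since(orders, known_customer_ids, last_checked_id):
    """Check only orders with id > last_checked_id.

    Returns the report and the new checkpoint to store for the next run.
    """
    report = HealthReport()
    checkpoint = last_checked_id
    for order in orders:
        if order["id"] <= last_checked_id:
            continue  # already verified in an earlier run
        report.checked += 1
        if order["customer_id"] not in known_customer_ids:
            report.issues.append(
                f"order {order['id']}: unknown customer {order['customer_id']}"
            )
        checkpoint = max(checkpoint, order["id"])
    return report, checkpoint
```

The returned checkpoint would be persisted between runs, and the report's `issues` list could feed whatever alerting or ticketing channel the team already uses.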
Business rules
Validation rules
Consistency check
Data quality techniques
Data format:
Relational: such as PostgreSQL, Oracle, SQL Server, and MySQL
Non-relational: such as MongoDB, DynamoDB, and Oracle NoSQL
Hybrid model
Location
Cloud
On-premises
Hybrid mode
Architecture Model
Single database
Multiple distributed databases
Containerized DB
Data analysis:
What tools, reports, and analytics types are expected?
Or will built-in business components be designed in the application architecture phase?
How will this data support decision-making?
ML/DL Models
Analysis types:
Real-time: requires different tooling than periodic analysis
Periodic: requires considering the data warehouse design
To access data models, we have different approaches:
Database objects: expose the data at the database level through views, functions, procedures, or synonyms, whichever technique fits
APIs: provide a communication layer over the database that can be exposed to external parties
Exported formats: such as tab-delimited and comma-delimited files
Web services
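Two of the approaches above, a database view and a delimited export, can be combined in a short sketch. It uses SQLite and the standard `csv` module; the `customer` table, its columns, and the `customer_public` view are hypothetical names chosen for illustration.

```python
import csv
import io
import sqlite3

def make_db() -> sqlite3.Connection:
    """Create a sample database with a base table and a restricted view."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, ssn TEXT);
        INSERT INTO customer VALUES (1, 'Ada', '111-22-3333');
        INSERT INTO customer VALUES (2, 'Linus', '444-55-6666');
        -- The view exposes only the columns consumers are allowed to see.
        CREATE VIEW customer_public AS SELECT id, name FROM customer;
    """)
    return conn

def export_public_customers(conn: sqlite3.Connection, delimiter: str = ",") -> str:
    """Export the view as delimited text, header row first."""
    cursor = conn.execute("SELECT id, name FROM customer_public ORDER BY id")
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)
    return buffer.getvalue()
```

Passing `delimiter="\t"` produces the tab-delimited variant. Consumers read from the view, so the sensitive `ssn` column never appears in the export.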
Replication Models
Database segregation model:
Separate database
Virtual private database
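The difference between the two segregation models can be sketched with SQLite. Note the assumption: in engines that offer a virtual private database feature, the filtering is enforced by the database itself through security policies, whereas the `SharedDatabase` class below only imitates the idea at the application level. All class, table, and tenant names are hypothetical.

```python
import sqlite3

class SeparateDatabases:
    """One database per tenant (here, one in-memory SQLite database each)."""

    def __init__(self):
        self._dbs = {}

    def connection_for(self, tenant: str) -> sqlite3.Connection:
        if tenant not in self._dbs:
            conn = sqlite3.connect(":memory:")
            conn.execute(
                "CREATE TABLE invoice (id INTEGER PRIMARY KEY, amount REAL)"
            )
            self._dbs[tenant] = conn
        return self._dbs[tenant]

class SharedDatabase:
    """One shared table; every read is forced through a tenant predicate."""

    def __init__(self):
        self._conn = sqlite3.connect(":memory:")
        self._conn.execute(
            "CREATE TABLE invoice ("
            "id INTEGER PRIMARY KEY, tenant TEXT NOT NULL, amount REAL)"
        )

    def add_invoice(self, tenant: str, amount: float) -> None:
        self._conn.execute(
            "INSERT INTO invoice (tenant, amount) VALUES (?, ?)", (tenant, amount)
        )

    def invoices_for(self, tenant: str) -> list:
        rows = self._conn.execute(
            "SELECT amount FROM invoice WHERE tenant = ? ORDER BY id", (tenant,)
        )
        return [row[0] for row in rows]
```

Separate databases give stronger isolation at a higher operational cost, while the shared model is cheaper to run but depends on the filter never being bypassed.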
Data access and security model from both the application layer and database layer
Encryption for sensitive data
Connection security
API Security
Crawling prevention
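As one possible illustration of API security, a data API can require each request body to carry a signature computed with a shared secret. This is a sketch of HMAC signing with Python's standard `hmac` and `hashlib` modules. The choice of HMAC-SHA256 and the function names are assumptions for the example, not something the outline prescribes.

```python
import hashlib
import hmac

def sign(secret: bytes, body: bytes) -> str:
    """Return the hex HMAC-SHA256 signature of a request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()

def verify(secret: bytes, body: bytes, signature: str) -> bool:
    """Check a signature using a constant-time comparison."""
    return hmac.compare_digest(sign(secret, body), signature)
```

A client would send `sign(secret, body)` alongside the request, and the server would reject any request for which `verify` returns `False`.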
How do we keep versions of the data as it grows over time?
How do we test it?
Do we rely only on backup and restore?
Or do we keep historical changes to the data in a logs database?
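Keeping historical changes can be sketched with a trigger that writes every update to a history table. The example uses SQLite and keeps the history in the same database for brevity, whereas a real setup might write to a separate logs database. The `product` and `product_history` tables are hypothetical.

```python
import sqlite3

SCHEMA = """
CREATE TABLE product (id INTEGER PRIMARY KEY, price REAL NOT NULL);
CREATE TABLE product_history (
    product_id INTEGER NOT NULL,
    old_price  REAL,
    new_price  REAL,
    changed_at TEXT DEFAULT CURRENT_TIMESTAMP
);
-- Record the old and new value every time a price is updated.
CREATE TRIGGER product_price_audit AFTER UPDATE OF price ON product
BEGIN
    INSERT INTO product_history (product_id, old_price, new_price)
    VALUES (OLD.id, OLD.price, NEW.price);
END;
"""

def demo() -> list:
    """Apply two price changes and return the recorded history."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO product (id, price) VALUES (1, 10.0)")
    conn.execute("UPDATE product SET price = 12.5 WHERE id = 1")
    conn.execute("UPDATE product SET price = 11.0 WHERE id = 1")
    return conn.execute(
        "SELECT product_id, old_price, new_price "
        "FROM product_history ORDER BY rowid"
    ).fetchall()
```

Unlike backup and restore, this approach makes it possible to answer what a value was at a given time without restoring a whole snapshot, at the cost of extra storage.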
Data architecture model
Database types
Open data governance
Open data execution approach
Open data format
This means giving different personas, including non-technical people, smooth and easy access to data so that they can meet business objectives.
It therefore needs self-service analytics tools, education about the data through data dictionaries for the model, and data sets built around business objectives rather than around the complexity of the underlying data models.
Well-defined objects
Well-defined data model details:
Object type
Attributes
Data types
Domain models
Sample
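The details listed above (object type, attributes, data types, domain, and a sample) are the ingredients of a data dictionary entry. A minimal sketch of such an entry follows. The class names, fields, and rendering format are illustrative assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class AttributeDoc:
    name: str
    data_type: str
    description: str
    sample: str

@dataclass
class ObjectDoc:
    name: str
    object_type: str  # for example "table" or "view"
    domain: str
    attributes: list = field(default_factory=list)

def render(doc: ObjectDoc) -> str:
    """Render a dictionary entry as plain text for non-technical readers."""
    lines = [f"{doc.name} ({doc.object_type}, domain: {doc.domain})"]
    for attr in doc.attributes:
        lines.append(
            f"  {attr.name}: {attr.data_type} - {attr.description} "
            f"(e.g. {attr.sample})"
        )
    return "\n".join(lines)
```

Entries like this can be generated for each object in the model and published alongside self-service analytics tools.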
Data model attributes:
Business Rules Management:
Anti-Corruption Layer:
Data Quality practices:
Completeness
For text values
For numeric
For dates
Correctness
Domains
Anomaly detection
Consistency
Cross models validation
Cross Modules validation
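The data quality practices above can be sketched as small, independent checks. The functions below are a minimal illustration in plain Python. The sample rows, the z-score threshold of 2.0, and all names are assumptions made for the example.

```python
from datetime import date
from statistics import mean, pstdev

def is_complete(value) -> bool:
    """Completeness: None and blank text are missing; 0 and dates are present."""
    if value is None:
        return False
    if isinstance(value, str):
        return value.strip() != ""
    return True

def completeness_ratio(rows, column) -> float:
    """Share of rows in which the column holds a usable value."""
    if not rows:
        return 1.0
    return sum(is_complete(row.get(column)) for row in rows) / len(rows)

def in_domain(value, allowed) -> bool:
    """Correctness: the value belongs to the set of allowed domain values."""
    return value in allowed

def anomalies(values, threshold=2.0) -> list:
    """Flag values more than `threshold` population std devs from the mean."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []
    return [v for v in values if abs(v - mu) / sigma > threshold]

def orphan_references(child_rows, parent_ids, key) -> list:
    """Consistency across models: child rows whose key has no parent."""
    return [row for row in child_rows if row[key] not in parent_ids]

# Hypothetical sample covering text, numeric, and date columns.
SAMPLE_ROWS = [
    {"name": "Ada", "age": 36, "joined": date(2020, 1, 1)},
    {"name": "  ", "age": 0, "joined": None},
]
```

With the sample rows, `completeness_ratio(SAMPLE_ROWS, "name")` treats the blank name as missing, while an `age` of 0 still counts as present. A simple z-score test like `anomalies` is only a starting point and can miss outliers in small or skewed samples.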
Data Democratization
Enterprise Architect
Archimate
Dr. Ghoniem Lawaty
Tech Evangelist