Data Architecture

Entity-Relationship Diagrams for Financial Services: From Customer to Trade

April 17, 2026

The entity-relationship diagram is one of the oldest data modelling artefacts still in professional use, and it remains one of the most useful. In financial services, the ERD is where domain complexity and regulatory demand collide. A diagram that looks clean in a non-financial domain collapses under the weight of party hierarchies, product variations, time-variance, and the many-to-many relationships that real-world finance demands.

This post covers the ERD patterns that work in regulated financial services, the pitfalls that recur, and the specific modelling decisions that either enable or prevent downstream regulatory reporting. The framing applies to banking, insurance, asset management and payments. The details differ but the patterns are the same.

Why the ERD matters in regulated scope

Regulators do not ask for ERDs directly. They ask for consistent regulatory figures, complete trade reporting, defensible risk calculations and auditable customer journeys. Every one of those requests depends on a coherent data model. When the model is right, the data feeds are straightforward and the numbers reconcile. When the model is wrong, the firm spends its life reconciling between systems whose representations of the same real-world entity disagree.

The ERD is the conceptual artefact that makes the disagreements visible before they become expensive. It sits alongside the data dictionary and the data lineage as the core data governance triangle. For a broader discussion of why the data layer is the binding constraint on AI enablement, see data layer as constraint on enterprise AI.

Modelling parties: the first hard problem

Every financial services data model has to represent parties (customers, counterparties, intermediaries, beneficiaries, related parties). The naive approach models parties as a single entity with a type code. The real world does not fit this shape.

The party entity pattern

The pattern that scales models the party as an abstract entity with three specialised sub-entities: natural person, legal entity and arrangement (trust, partnership, fund, syndicate). Each sub-entity has attributes specific to its type. The relationships between parties (employer, beneficial owner, controller, authorised signatory, directorship) are separate relationship entities with their own attributes: start date, end date, jurisdiction, evidence source.
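To make the pattern concrete, here is a minimal sketch in Python. The class names, field names and sample data are illustrative assumptions rather than a prescribed schema; the point is only that the specialised attributes sit on the sub-entities and that a relationship is a record of its own with dates and evidence.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Party:
    """Abstract party: only what every party type has in common."""
    party_id: str


@dataclass(frozen=True)
class NaturalPerson(Party):
    full_name: str
    date_of_birth: date


@dataclass(frozen=True)
class LegalEntity(Party):
    registered_name: str
    incorporation_country: str


@dataclass(frozen=True)
class Arrangement(Party):
    name: str
    arrangement_type: str  # e.g. "trust", "partnership", "fund", "syndicate"


@dataclass(frozen=True)
class PartyRelationship:
    """Relationship entity: a record of its own, not a column on Party."""
    from_party_id: str
    to_party_id: str
    role: str  # e.g. "beneficial_owner", "authorised_signatory"
    start_date: date
    end_date: Optional[date]  # exclusive; None means still in force
    jurisdiction: str
    evidence_source: str

    def in_force_on(self, as_at: date) -> bool:
        return self.start_date <= as_at and (
            self.end_date is None or as_at < self.end_date
        )


def parties_in_role(relationships, to_party_id, role, as_at):
    """Ids of parties holding `role` towards `to_party_id` on `as_at`."""
    return sorted(
        r.from_party_id
        for r in relationships
        if r.to_party_id == to_party_id and r.role == role and r.in_force_on(as_at)
    )


# Made-up sample data for illustration only.
fund = Arrangement("A1", "Example Growth Fund", "fund")
relationships = [
    PartyRelationship("P1", "A1", "authorised_signatory",
                      date(2020, 1, 1), date(2023, 1, 1), "GB", "board minute"),
    PartyRelationship("P2", "A1", "authorised_signatory",
                      date(2022, 6, 1), None, "GB", "mandate form"),
]
print(parties_in_role(relationships, "A1", "authorised_signatory", date(2024, 1, 1)))
```

Because the relationship carries its own dates, "who was authorised to act for this fund on a given day" becomes a filter rather than a reconstruction.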

This model handles the questions a regulated firm actually has to answer: who ultimately owns this account, who is authorised to act on behalf of this fund, who benefits economically from this structure. None of these can be answered without a proper party model, and all of them are demanded by AML regulations, the FATF recommendations and sanctions regimes.

Corporate hierarchies

Legal entities form hierarchies: parent, subsidiary, branch, joint venture. The modelling decision is whether to represent hierarchy as a direct parent relationship on the legal entity or as an external relationship entity. The relationship-entity approach wins because hierarchies change: acquisitions, divestments, restructurings. A relationship entity with effective dates supports time-travel queries. A direct parent column loses history.
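A small sketch of the relationship-entity approach, assuming a hypothetical OwnershipLink record with effective dates and made-up entity names. It shows how the same question ("who is the ultimate parent?") returns different answers before and after an acquisition.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class OwnershipLink:
    child_id: str
    parent_id: str
    effective_from: date
    effective_to: Optional[date] = None  # exclusive; None means open-ended


def parent_as_at(links, child_id, as_at):
    """Return the parent of `child_id` on `as_at`, or None if it had none."""
    for link in links:
        if (
            link.child_id == child_id
            and link.effective_from <= as_at
            and (link.effective_to is None or as_at < link.effective_to)
        ):
            return link.parent_id
    return None


def ultimate_parent_as_at(links, entity_id, as_at):
    """Walk up the hierarchy as it stood on `as_at`."""
    current = entity_id
    seen = {current}
    while True:
        parent = parent_as_at(links, current, as_at)
        if parent is None or parent in seen:  # stop at the top, or on a cycle
            return current
        seen.add(parent)
        current = parent


# Made-up example: SubCo is sold from OldParent to NewParent on 1 July 2024.
links = [
    OwnershipLink("SubCo", "OldParent", date(2015, 1, 1), date(2024, 7, 1)),
    OwnershipLink("SubCo", "NewParent", date(2024, 7, 1)),
    OwnershipLink("NewParent", "GroupCo", date(2010, 1, 1)),
]
print(ultimate_parent_as_at(links, "SubCo", date(2023, 12, 31)))
print(ultimate_parent_as_at(links, "SubCo", date(2025, 12, 31)))
```

With a direct parent column, the first of those two queries could not be answered once the acquisition had been recorded.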

The LEI system provides globally unique identifiers for legal entities, and the LEI should be a required attribute on every legal entity in the model. Cross-reference to LEI first, internal IDs second.
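If the LEI is a required attribute, it is worth validating on the way in. An LEI is 20 alphanumeric characters whose last two are check digits under the ISO 7064 MOD 97-10 scheme. The sketch below assumes that structure; the function names are hypothetical and the prefix in the example is made up, not a real entity's LEI. A passing check says only that the string is well formed, not that the LEI is registered.

```python
import re

_LEI_SHAPE = re.compile(r"[A-Z0-9]{18}[0-9]{2}")


def _to_number(text: str) -> int:
    # Digits stay as they are; letters map to A=10 ... Z=35.
    return int("".join(str(int(ch, 36)) for ch in text))


def lei_check_digits(prefix18: str) -> str:
    """Compute the two check digits for an 18-character LEI prefix."""
    return f"{98 - _to_number(prefix18 + '00') % 97:02d}"


def is_valid_lei(lei: str) -> bool:
    """True if `lei` has the LEI shape and passes the mod-97 check."""
    return bool(_LEI_SHAPE.fullmatch(lei)) and _to_number(lei) % 97 == 1


# Made-up prefix for illustration; the check digits are computed, not looked up.
MADE_UP_PREFIX = "123400ABCDEFGHIJKL"
candidate = MADE_UP_PREFIX + lei_check_digits(MADE_UP_PREFIX)
print(candidate, is_valid_lei(candidate))
```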

Modelling products: the second hard problem

Financial products are the other source of modelling complexity: a credit card, a mortgage, a fund share class, a derivative, a structured note. The naive model has one table per product type, which works until a corporate customer holds ten different product types and the customer view needs a union across ten tables.

The product hierarchy pattern

Model product as a generalisation hierarchy. A top-level product entity with common attributes (product type, currency, opening date, closing date, status). Specialised sub-entities for product families (loan, deposit, derivative, investment) and further specialisation for specific products (mortgage, overdraft, fixed-income, equity). The depth depends on the firm, but two or three levels are typical.

This lets you answer portfolio-level questions (total exposure to counterparty X) without forcing product-specific detail into the cross-cutting query. The specialised attributes live on the specialised entity where they belong.
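As a rough illustration, the hierarchy can be sketched with Python dataclasses. The exposure field and the counterparty_id on the product are assumptions made to keep the example short (in a fuller model the link to the party would itself be a relationship entity), and the sample figures are invented.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Product:
    """Top level: attributes every product shares."""
    product_id: str
    counterparty_id: str  # simplification for this sketch
    product_type: str
    currency: str
    opening_date: date
    closing_date: Optional[date]
    status: str
    exposure: Decimal  # hypothetical common measure, not from the article


@dataclass(frozen=True)
class Loan(Product):  # product family
    interest_rate: Decimal


@dataclass(frozen=True)
class Mortgage(Loan):  # specific product
    loan_to_value: Decimal


@dataclass(frozen=True)
class Derivative(Product):  # product family
    underlying: str
    notional: Decimal


def total_exposure(products, counterparty_id, currency):
    """Portfolio-level query: touches only the common attributes."""
    return sum(
        (
            p.exposure
            for p in products
            if p.counterparty_id == counterparty_id
            and p.currency == currency
            and p.status == "open"
        ),
        Decimal("0"),
    )


# Invented sample portfolio.
portfolio = [
    Mortgage("M1", "CP-X", "mortgage", "GBP", date(2021, 3, 1), None, "open",
             Decimal("250000"), Decimal("0.0425"), Decimal("0.80")),
    Derivative("D1", "CP-X", "derivative", "GBP", date(2024, 1, 15), None, "open",
               Decimal("100000"), "SONIA", Decimal("5000000")),
    Loan("L1", "CP-Y", "loan", "GBP", date(2023, 6, 1), None, "open",
         Decimal("75000"), Decimal("0.06")),
]
print(total_exposure(portfolio, "CP-X", "GBP"))
```

The exposure query never needs to know what a loan-to-value ratio or a notional is, which is the property the hierarchy is there to protect.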

Time-variance in products

Product terms change. Interest rates reset, margins reprice, structures amend. The model must handle this through a versioning or bi-temporal pattern. Every product instance has a state that is valid from one timestamp to another. Queries specify "as at" a reporting date, and the query returns the state that was in force at that date.

This is essential for regulatory reporting. A FINREP report as at 31 December 2025 must use the product terms in force on 31 December 2025, not the terms as they are today. Without temporal modelling, this is a reconstruction exercise every quarter.
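A minimal "as at" lookup might look like the following. It is a sketch on a single time axis (when the terms were in force); a full bi-temporal model would add a second axis for when each version was recorded. The record name, fields and rates are made up for illustration.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class ProductTermsVersion:
    product_id: str
    valid_from: date
    valid_to: Optional[date]  # exclusive; None means currently in force
    interest_rate: Decimal


def terms_as_at(versions, product_id, as_at):
    """Return the version in force on `as_at`, or None if there was none."""
    matches = [
        v
        for v in versions
        if v.product_id == product_id
        and v.valid_from <= as_at
        and (v.valid_to is None or as_at < v.valid_to)
    ]
    if len(matches) > 1:
        raise ValueError(f"overlapping versions for {product_id} on {as_at}")
    return matches[0] if matches else None


# Invented history: the rate resets on 1 February 2026.
versions = [
    ProductTermsVersion("LOAN-1", date(2024, 2, 1), date(2026, 2, 1), Decimal("0.0425")),
    ProductTermsVersion("LOAN-1", date(2026, 2, 1), None, Decimal("0.0375")),
]
year_end = terms_as_at(versions, "LOAN-1", date(2025, 12, 31))
print(year_end.interest_rate)
```

The year-end report reads the terms that applied on the reporting date, regardless of what has been repriced since.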

Modelling transactions: the third hard problem

Transactions are where the volume is, which means modelling decisions here have the biggest performance and regulatory consequences.

Transaction as an event

Model transactions as immutable events with a clear lifecycle. A payment is a sequence of events (initiated, authorised, cleared, settled, reconciled). Each event is a distinct record with its own timestamp, actor and evidence. Amending a transaction does not overwrite the original; it creates a new event that references the original.

This supports the audit trail that regulators expect. Under MiFIR transaction reporting, MiFID II order record keeping, and PSD3 dispute resolution, the firm must be able to reconstruct the complete state of a transaction at any point in its lifecycle. Event modelling makes this trivial. Mutating a transaction record makes it impossible.
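The event pattern can be sketched as an append-only log of frozen records. The class names, the "amended" event type and the sample payment are assumptions for illustration; a production system would persist the log rather than hold it in memory.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class PaymentEvent:
    event_id: str
    payment_id: str
    event_type: str  # "initiated", "authorised", "cleared", "settled", ...
    occurred_at: datetime
    actor: str
    evidence_ref: str
    amends_event_id: Optional[str] = None  # set when this event amends another


class PaymentEventLog:
    """Append-only: events are added, never updated or deleted."""

    def __init__(self):
        self._events = []

    def append(self, event: PaymentEvent) -> None:
        if any(e.event_id == event.event_id for e in self._events):
            raise ValueError(f"duplicate event id {event.event_id}")
        self._events.append(event)

    def history_as_at(self, payment_id, as_at):
        """Every event known for the payment up to and including `as_at`."""
        return sorted(
            (e for e in self._events
             if e.payment_id == payment_id and e.occurred_at <= as_at),
            key=lambda e: e.occurred_at,
        )

    def status_as_at(self, payment_id, as_at):
        """Latest lifecycle stage reached by `as_at`, ignoring amendments."""
        stages = [e for e in self.history_as_at(payment_id, as_at)
                  if e.event_type != "amended"]
        return stages[-1].event_type if stages else None


# Invented payment for illustration.
log = PaymentEventLog()
log.append(PaymentEvent("E1", "PAY-1", "initiated", datetime(2026, 1, 5, 9, 0), "customer", "msg-001"))
log.append(PaymentEvent("E2", "PAY-1", "authorised", datetime(2026, 1, 5, 9, 1), "auth-service", "msg-002"))
log.append(PaymentEvent("E3", "PAY-1", "amended", datetime(2026, 1, 5, 9, 30), "ops-user", "ticket-17",
                        amends_event_id="E1"))
log.append(PaymentEvent("E4", "PAY-1", "settled", datetime(2026, 1, 6, 8, 0), "settlement", "msg-003"))
print(log.status_as_at("PAY-1", datetime(2026, 1, 5, 9, 45)))
```

The amendment sits in the history alongside the event it corrects, so both the original and the correction remain visible to an auditor.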

Transaction classification

Classification is where policies bite. A transaction is a "suspicious activity" or not, is "high-risk" or not, is "reportable under MiFIR" or not. Do not encode classification as attributes on the transaction entity, because classifications change. Model classification as separate entities with effective dates and version control. The transaction record stays immutable. The classification evolves.
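One way to picture this, as a sketch with hypothetical names (Classification, scheme, rule_version) and invented data: the classification is its own record keyed by transaction and scheme, and reclassifying closes the current record and opens a new one.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Classification:
    transaction_id: str
    scheme: str        # e.g. "aml_risk"
    value: str         # e.g. "standard", "high"
    rule_version: str  # which version of the policy produced this value
    effective_from: date
    effective_to: Optional[date] = None  # exclusive; None means current


def classification_as_at(classifications, transaction_id, scheme, as_at):
    """Return the classification in force on `as_at`, or None."""
    for c in classifications:
        if (
            c.transaction_id == transaction_id
            and c.scheme == scheme
            and c.effective_from <= as_at
            and (c.effective_to is None or as_at < c.effective_to)
        ):
            return c
    return None


def reclassify(classifications, transaction_id, scheme, value, rule_version, change_date):
    """Return a new history: the open record is closed, a new one is added."""
    updated = []
    for c in classifications:
        is_open = (
            c.transaction_id == transaction_id
            and c.scheme == scheme
            and c.effective_to is None
        )
        updated.append(replace(c, effective_to=change_date) if is_open else c)
    updated.append(
        Classification(transaction_id, scheme, value, rule_version, change_date)
    )
    return updated


# Invented example: a policy change on 1 September 2025 raises the risk rating.
history = [Classification("T1", "aml_risk", "standard", "v1", date(2025, 1, 10))]
history_v2 = reclassify(history, "T1", "aml_risk", "high", "v2", date(2025, 9, 1))
print(classification_as_at(history_v2, "T1", "aml_risk", date(2025, 6, 30)).value)
print(classification_as_at(history_v2, "T1", "aml_risk", date(2025, 12, 31)).value)
```

Nothing on the transaction itself is touched, and the rule version records which policy produced each value.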

Time-variance everywhere

If there is a single lesson from regulatory data modelling, it is that time-variance is not optional. Customers change name. Products change terms. Classifications change. Policies change. Counterparty legal structure changes. Jurisdiction of residence changes. Every one of these changes must be modelled with effective dates, not overwritten.

The practical approach is either temporal tables (database-native time versioning) or explicit effective_from and effective_to columns on every entity where temporal behaviour is required. Either approach is more verbose than overwriting in place, but it pays back when the regulator asks a "point in time" question.
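For the explicit-column variant, a minimal sketch using Python's built-in sqlite3 module follows. The table name, the '9999-12-31' open-ended sentinel and the sample names are assumptions chosen for the example; dates are stored as ISO 8601 text so that string comparison matches date order.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE customer_name (
    customer_id    TEXT NOT NULL,
    legal_name     TEXT NOT NULL,
    effective_from TEXT NOT NULL,                       -- inclusive
    effective_to   TEXT NOT NULL DEFAULT '9999-12-31',  -- exclusive
    PRIMARY KEY (customer_id, effective_from)
);
""")


def change_name(db, customer_id, new_name, change_date):
    """Close the current row and add a new one; the old name is kept."""
    with db:  # both statements commit together or not at all
        db.execute(
            "UPDATE customer_name SET effective_to = ? "
            "WHERE customer_id = ? AND effective_to = '9999-12-31'",
            (change_date, customer_id),
        )
        db.execute(
            "INSERT INTO customer_name (customer_id, legal_name, effective_from) "
            "VALUES (?, ?, ?)",
            (customer_id, new_name, change_date),
        )


def name_as_at(db, customer_id, as_at):
    """Point-in-time lookup: the name in force on `as_at`, or None."""
    row = db.execute(
        "SELECT legal_name FROM customer_name "
        "WHERE customer_id = ? AND effective_from <= ? AND ? < effective_to",
        (customer_id, as_at, as_at),
    ).fetchone()
    return row[0] if row else None


# Invented customer for illustration.
change_name(db, "C1", "Jane Smith", "2019-03-01")
change_name(db, "C1", "Jane Jones", "2024-08-15")
print(name_as_at(db, "C1", "2023-12-31"))
print(name_as_at(db, "C1", "2025-12-31"))
```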

Common ERD failure modes

Failure: the monolithic customer entity

The customer entity has 180 attributes, half of them null for any given row. The reason is that the same entity is trying to represent retail customers, corporate customers, prospects and counterparties. Fix: split the entity along the natural type boundaries. The cost of a second entity is smaller than the cost of a table where every query has to filter on type code and null-handling logic.

Failure: many-to-many collapsed into one-to-many

A common shortcut. A customer has many accounts. The model puts customer_id on the account entity. Fine, until a joint account has two customers. Now the model needs a relationship entity, and every query that assumed a single customer per account has to be revisited. Fix: model many-to-many relationships as relationship entities from the start, even if the immediate use case is one-to-many. The cost of premature generalisation is smaller than the cost of retrofitting.
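The fix can be sketched with an in-memory SQLite database. The table and column names (account_holder, role) and the sample rows are illustrative assumptions; the shape to notice is that the account carries no customer_id at all.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE customer (customer_id TEXT PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE account  (account_id  TEXT PRIMARY KEY, currency TEXT NOT NULL);

-- Relationship entity: one row per (account, customer, role).
CREATE TABLE account_holder (
    account_id  TEXT NOT NULL REFERENCES account(account_id),
    customer_id TEXT NOT NULL REFERENCES customer(customer_id),
    role        TEXT NOT NULL,
    PRIMARY KEY (account_id, customer_id, role)
);

INSERT INTO customer VALUES ('C1', 'Alex'), ('C2', 'Sam');
INSERT INTO account  VALUES ('ACC-SOLO', 'GBP'), ('ACC-JOINT', 'GBP');
INSERT INTO account_holder VALUES
    ('ACC-SOLO',  'C1', 'holder'),
    ('ACC-JOINT', 'C1', 'holder'),
    ('ACC-JOINT', 'C2', 'holder');
""")


def holders(db, account_id):
    """Names of every customer linked to the account."""
    rows = db.execute(
        "SELECT c.name FROM account_holder h "
        "JOIN customer c ON c.customer_id = h.customer_id "
        "WHERE h.account_id = ? ORDER BY c.name",
        (account_id,),
    )
    return [r[0] for r in rows]


def accounts_for(db, customer_id):
    """Ids of every account the customer is linked to."""
    rows = db.execute(
        "SELECT account_id FROM account_holder "
        "WHERE customer_id = ? ORDER BY account_id",
        (customer_id,),
    )
    return [r[0] for r in rows]


print(holders(db, "ACC-JOINT"))
print(accounts_for(db, "C1"))
```

A sole account is simply the case where the relationship entity holds one row, so the joint account arrives later as data rather than as a schema change.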

Failure: products as a type code

One table, product_type column. Credit cards and interest-rate swaps in the same table. Unworkable. Fix: generalisation hierarchy. Common attributes on the parent, specialised attributes on the children.

Failure: relationships with no cardinality

The ERD shows a line between customer and account with no cardinality. Is it one-to-one? One-to-many? Many-to-many? The line is ambiguous. Fix: explicit cardinality on every relationship, with optionality indicated. If a relationship is one-to-many but the "many" side is often zero, that is a different model from a relationship where the "many" side is always at least one.
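To see why optionality is a modelling decision and not a drawing detail, here is a small sketch that states a relationship end as a minimum and maximum and checks data against it. The RelationshipEnd type and the sample counts are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RelationshipEnd:
    minimum: int            # 0 = optional, 1 = mandatory
    maximum: Optional[int]  # None = many

    def allows(self, count: int) -> bool:
        return count >= self.minimum and (
            self.maximum is None or count <= self.maximum
        )

    def notation(self) -> str:
        upper = "*" if self.maximum is None else str(self.maximum)
        return f"{self.minimum}..{upper}"


def violations(counts_by_key, end):
    """Keys whose related-row count breaks the declared cardinality."""
    return sorted(k for k, n in counts_by_key.items() if not end.allows(n))


# Invented data: number of accounts per customer record.
accounts_per_customer = {"C1": 2, "C2": 1, "PROSPECT-9": 0}

zero_or_many = RelationshipEnd(0, None)  # a customer may have no account yet
one_or_many = RelationshipEnd(1, None)   # a customer always has an account

print(zero_or_many.notation(), violations(accounts_per_customer, zero_or_many))
print(one_or_many.notation(), violations(accounts_per_customer, one_or_many))
```

The same data is valid under one declaration and invalid under the other, which is exactly the ambiguity an unlabelled line leaves open.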

Failure: the ERD that does not show attributes

Some teams present ERDs without attributes to reduce clutter. At the conceptual level this is fine. At the logical level it hides the decisions that matter. Fix: logical ERDs include attributes, data types and key indicators. If the diagram is too cluttered, split it into subject-area diagrams rather than hiding information.

Tooling

ERDs can be built in a dozen tools: Lucidchart, draw.io, ERwin, SqlDBM, Sparx EA, dbdiagram.io. The tool is less important than the discipline. What matters is that the ERD is version-controlled, that it is treated as the source of truth for the logical data model, and that changes to the schema flow through the ERD rather than around it.

For teams practising model-driven development, tools like ERwin and Sparx EA let the ERD generate DDL and documentation automatically. This is the pattern that scales: the diagram is the model, and the database is derived from the diagram.
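The principle can be shown with a toy generator. This is not how ERwin or Sparx EA work internally; it is a minimal sketch, with hypothetical Entity and Attribute types, of a model definition being the thing you edit and the DDL being derived from it.

```python
import sqlite3
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Attribute:
    name: str
    sql_type: str
    nullable: bool = False
    primary_key: bool = False


@dataclass(frozen=True)
class Entity:
    name: str
    attributes: Tuple[Attribute, ...]


def to_ddl(entity: Entity) -> str:
    """Derive a CREATE TABLE statement from the model definition."""
    columns = []
    for a in entity.attributes:
        parts = [a.name, a.sql_type]
        if not a.nullable:
            parts.append("NOT NULL")
        columns.append(" ".join(parts))
    keys = [a.name for a in entity.attributes if a.primary_key]
    if keys:
        columns.append(f"PRIMARY KEY ({', '.join(keys)})")
    body = ",\n    ".join(columns)
    return f"CREATE TABLE {entity.name} (\n    {body}\n);"


# Illustrative entity; names are assumptions, not a recommended schema.
account_holder = Entity(
    "account_holder",
    (
        Attribute("account_id", "TEXT", primary_key=True),
        Attribute("customer_id", "TEXT", primary_key=True),
        Attribute("role", "TEXT"),
        Attribute("effective_to", "TEXT", nullable=True),
    ),
)

ddl = to_ddl(account_holder)
print(ddl)

# The derived DDL is executable, so the database follows the model.
scratch = sqlite3.connect(":memory:")
scratch.execute(ddl)
```

A schema change then starts as a change to the model definition, which can be reviewed and version-controlled like any other source file.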

Connecting the ERD to the rest of the artefact stack

The ERD does not stand alone. It connects to:

The data dictionary: every entity attribute has a dictionary entry.
The data lineage: every entity attribute is sourced from a specific system.
The data flow diagrams: show the movement of entities between processes.
The business rules catalogue: business rules reference entities and attributes by their ERD names.
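Coherence between these artefacts can be checked mechanically once each one is available as a list of attribute names. The sketch below assumes fully qualified "entity.attribute" names and uses invented sample data; the function name and the gap categories are hypothetical.

```python
def coherence_gaps(erd_attributes, dictionary_entries, lineage_sources):
    """Compare attribute names across the ERD, dictionary and lineage."""
    erd = set(erd_attributes)
    dictionary = set(dictionary_entries)
    lineage = set(lineage_sources)
    return {
        "missing_dictionary_entry": sorted(erd - dictionary),
        "missing_lineage": sorted(erd - lineage),
        "orphan_dictionary_entry": sorted(dictionary - erd),
    }


# Invented sample artefacts.
erd_attributes = [
    "legal_entity.lei",
    "legal_entity.registered_name",
    "account.currency",
]
dictionary_entries = [
    "legal_entity.lei",
    "account.currency",
    "account.branch_code",  # defined in the dictionary, absent from the ERD
]
lineage_sources = {
    "legal_entity.lei": "onboarding system",
    "legal_entity.registered_name": "onboarding system",
}

gaps = coherence_gaps(erd_attributes, dictionary_entries, lineage_sources)
for kind, names in gaps.items():
    print(kind, names)
```

Run as part of a build or review step, a check like this surfaces disagreements between artefacts while they are still cheap to fix.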

When the artefacts are coherent, the data governance stack is coherent. When they disagree, the disagreements compound. Our business analysis service covers the artefact stack as an integrated set rather than individual documents.

External references

The DAMA-DMBOK body of knowledge is the canonical reference for data modelling practice. The FpML standard and the ISO 20022 data dictionary provide industry-standard models for specific financial domains. The BIAN service domain model provides a reference service architecture for banking that can anchor the logical ERD.

The short version

ERDs in financial services are harder than ERDs in generic domains because the parties are more complex, the products are more varied, and time-variance is not optional. The patterns that work are abstract party entities with typed specialisations, product hierarchies with time-versioned terms, transactions as immutable events, and temporal modelling everywhere.

Get the ERD right and the downstream artefacts (dictionary, lineage, reporting) become tractable. Get it wrong and every downstream artefact is subtly inconsistent, and those inconsistencies surface as regulatory findings years later.

Our business analysis service and regulatory compliance transformation offerings cover the ERD work as part of the broader data governance stack for regulated firms.
