Module 1 of 7 (14% through)
Module 1

Reporting Data vs Action Data — The Distinction That Decides Everything

Why the data your enterprise has is fine for dashboards and useless for AI, and what changes when you re-architect for action.

Module 1 — 90-second video overview

The single distinction that decides whether your AI compounds

If you remember nothing else from this course, remember this distinction:

There are two kinds of data inside your organisation, and the difference between them decides whether AI compounds for you or stays stuck at the surface.

The two kinds are:

  • Reporting data — captured to describe what happened, so people can analyse it and decide.
  • Action data — captured to drive what happens next, inside a workflow, in real time, without a human translation step.

These look superficially similar. They often live in the same warehouses. They are often owned by the same data team. They often pass through the same pipelines. But they are designed for fundamentally different purposes, and most enterprise data is built for the first kind. AI needs the second.

This module is about why that matters and how to recognise the gap.

What reporting data tolerates

Reporting data is built for human consumption. It powers dashboards, board packs, finance reconciliations, regulatory submissions, audit trails, and business intelligence. Because humans are the consumers, reporting data can tolerate a long list of imperfections:

  • Latency. Daily, weekly, or monthly refreshes are fine. Nobody is reading the dashboard in real time.
  • Missing fields. Nulls can be hidden, defaults can be applied, footnotes can be added.
  • Inconsistent definitions. Each report can normalise its own view. "Active customer" can mean three different things in three different reports as long as each report is internally consistent.
  • Manual reconciliation. The end-of-month close exists precisely because data needs human cleanup before it's reportable.
  • Multiple sources of truth. Different reports can use different sources for the same concept and nobody minds, because the human reading them can sort it out.

None of this is bad. It's how reporting data is supposed to work. The problem is that data built to these tolerances isn't suitable for AI.

What action data requires

Action data is built for system consumption. It powers automated decisions, in real time, inside live workflows. Because systems are the consumers, action data has to meet a much harder set of requirements:

  • Predictable, low-latency freshness. Measured in seconds or low minutes, not "by the next business day."
  • Complete fields, every time. Missing values either fail the decision or produce garbage. There is no human to sanity-check the result.
  • Standardised definitions, enforced at capture. "Active customer" must mean exactly one thing across every system that touches it. If two systems disagree, the workflow produces contradictions.
  • Single source of truth, traceable. Every value used in an automated decision must be traceable to its origin, with full lineage observable in production.
  • Captured at the point of action. Created inside the application or workflow as the work happens, not reconstructed from logs or exports afterwards.

None of these are exotic. All of them are absent in most enterprise data layers, because the enterprise data layer was built for reports and reports never demanded them.
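The "enforced at capture" requirements above can be sketched in a few lines of Python. This is a minimal illustration, not a production schema: the `ComplaintEvent` type, the three-entry product taxonomy, and the field names are all hypothetical, and a real capture layer would live inside the application itself.

```python
# Sketch of required fields and a single taxonomy enforced at the point
# of capture. All names here are illustrative assumptions.
from dataclasses import dataclass, field
from datetime import datetime, timezone

# One product taxonomy, shared by every system that touches complaints.
# Any value outside this set is rejected before the event can be saved.
PRODUCT_TAXONOMY = {"current_account", "credit_card", "mortgage"}


@dataclass(frozen=True)
class ComplaintEvent:
    complaint_id: str
    customer_id: str
    product: str
    classification: str  # required: an automated workflow cannot act on a blank
    captured_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    source_system: str = "call_centre"  # lineage: where this value originated

    def __post_init__(self):
        # Fail at capture time, not at month-end reconciliation.
        for name in ("complaint_id", "customer_id", "classification"):
            if not getattr(self, name):
                raise ValueError(f"missing required field: {name}")
        if self.product not in PRODUCT_TAXONOMY:
            raise ValueError(f"unknown product: {self.product!r}")
```

The point of the sketch is where the check runs: a blank classification raises an error the moment the agent tries to save, instead of surfacing months later as a 35% null rate that no model can work around.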

A worked example

Imagine a bank that wants to deploy an AI system to triage incoming customer complaints, route them to the right team, and flag the ones at risk of becoming regulatory complaints under FCA Consumer Duty.

With reporting data, the bank has:

  • A complaints database refreshed nightly from the call centre system
  • A customer database refreshed every six hours
  • A products database with definitions that differ slightly from the call centre's product taxonomy
  • A complaints classification field that is sometimes filled in by the agent and sometimes left blank
  • A regulatory complaints register maintained by the compliance team in a separate system
  • Lineage that exists on paper but not in production

The AI use case looks great on slides. In practice, the model can't be deployed because:

  • The data is too stale to triage incoming complaints in real time
  • The product taxonomy mismatch produces nonsense matches in 20% of cases
  • The classification field is missing for 35% of records
  • The regulatory complaints register can't be linked back to the source complaint with confidence
  • Nobody can trace why the model produced a specific recommendation, which fails the FCA's expectation of decision auditability

With action data, the same bank would have:

  • Complaints captured into a normalised event stream the moment the customer reports them
  • A single product taxonomy enforced at the point of capture across all systems
  • Required fields enforced before the complaint can be saved, with structured override paths for edge cases
  • Customer context joined into the complaint at capture, not reconstructed later
  • A regulatory classification path that is part of the same event stream, not a parallel system
  • Full lineage observable in production, so any decision can be reconstructed end-to-end

The first version produces a stalled AI project. The second version produces a working triage system that compounds over time. Same bank. Same complaints. Same AI capability. Different data layer.
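The gap between the two versions of the bank can be expressed as a rough diagnostic over a dataset profile. The numbers come from the example above; the profile fields, the 60-second freshness threshold, and the function name are illustrative assumptions, not a standard.

```python
# Rough "built for reports or built for action?" diagnostic.
# Profile fields and thresholds are illustrative assumptions.

def built_for_action(profile: dict) -> list[str]:
    """Return the action-data requirements this dataset profile fails."""
    failures = []
    if profile["refresh_latency_seconds"] > 60:
        failures.append("freshness")            # seconds, not next business day
    if profile["required_field_null_rate"] > 0.0:
        failures.append("completeness")          # no human to fill the gaps
    if profile["definitions_per_concept"] > 1:
        failures.append("standardised definitions")
    if not profile["lineage_in_production"]:
        failures.append("traceability")
    return failures


reporting_profile = {
    "refresh_latency_seconds": 24 * 3600,   # nightly refresh
    "required_field_null_rate": 0.35,       # classification blank 35% of the time
    "definitions_per_concept": 2,           # two competing product taxonomies
    "lineage_in_production": False,         # lineage on paper only
}

action_profile = {
    "refresh_latency_seconds": 2,           # captured as the complaint happens
    "required_field_null_rate": 0.0,        # enforced before save
    "definitions_per_concept": 1,           # single enforced taxonomy
    "lineage_in_production": True,          # every decision reconstructable
}
```

Run against the reporting profile, the diagnostic fails on all four counts; against the action profile, it returns an empty list. Same complaints, same model, different data layer.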

Why this is invisible to most executives

The reporting/action distinction is invisible to most executives because both kinds of data look the same from the C-suite. They live in the same warehouse. They are owned by the same CDO. They produce the same dashboards. The reports come out on time, the BI team is functioning, the data quality scores look fine.

The gap only becomes visible when something tries to act on the data instead of read it. By then, the AI initiative is already underway, the budget is already committed, and the team is already heads-down trying to make a use case work on a data layer that cannot support it.

This is the moment when executives say "I thought our data was good." It is good — for the use it was designed for. It is not good for the use you are now putting it to.

What you'll learn next

In Module 2 we'll cover the five characteristics of an AI-ready data layer in detail, with concrete patterns and anti-patterns from real engagements. By the end of Module 2 you'll have a working diagnostic for any dataset in your organisation — is this data built for reports or is it built for action?

Module Quiz

5 questions — Pass mark: 60%

Q1. What is the primary purpose of reporting data?

Q2. Why is reporting data inadequate for AI workflows that need to act in real time?

Q3. What does 'data captured at the point of action' mean?

Q4. Which of these is a sign that your data is built for reporting, not action?

Q5. Why is the reporting/action distinction often invisible to executives?
