Blackstone&, UK AI Lab Project | Proposal for Serco Limited

Section 1 Executive Summary

We have prepared this proposal for Serco using our tried and tested AI Adoption Framework, which is referenced throughout this response.

Why a framework, not a point solution

The AI market is moving too fast for point solutions to remain current. There are countless publicly available prototypes, patterns, and tools emerging every week, and any specific technology choice made today risks being overtaken tomorrow.

What organisations actually need is a framework: a structured, repeatable approach to discovering, evaluating, and deploying AI capabilities that keeps pace with an unconstrained market.

Built for Serco, owned by Serco

The AI Adoption Framework and all of its components would be designed to fit Serco's internal environment: your data classifications, your governance requirements, and your existing platforms.

It is left as a capability that Serco can operate independently, without ongoing reliance on Blackstone&. The framework allows organisations to leverage the fast-moving AI market and adopt those technologies in a sensible, responsible, and effective way within the constraints of their organisation.

Key assets referenced in this response

Live Prototype

Collaboration Hub

Working prototype of the AI-powered collaboration use case being tendered.

serco-pulse.lovable.app →

Live Asset

AI Capability Library

156 capabilities across 14 domains, mapped against 6 data classification levels.

ai-capability-library.pages.dev →

Live Asset

Agile Contracting Toolkit

Interactive commercial model showing cost, risk, and scope management.

agile-contracting-toolkit.pages.dev →

Live Asset

Use Case Ingestion

Build Readiness Pack generated from the structured use case intake process.

use-case-ingest.blackstoneand.com →

Section 2 Approach & Methodology

Dual-Track AI Delivery Model

Figure summary:

Use Case Layer phases: Discover & Frame → Decision Gate → Experiment & Validate → Decision Gate → Production Delivery
Use Case Layer stages and artefacts: Use Case Intake (Use Case Card), Experiment Engine (Experiment Log), Build Readiness (Build Readiness Pack), MVP (Prove Value), Scale (Prove Reliability), Operate (Sustain Value)
Portfolio Layer: Map Demand Across Use Cases, Assess Current Capability Maturity, Capability Gap Analysis, Capability-Led Roadmapping

Introduction

Most organisations struggle with AI adoption not because the technology fails, but because they build the wrong things, in the wrong order, without knowing what's already available.

The result is a graveyard of disconnected proofs of concept, each one built in isolation, none of them connecting to shared infrastructure, and no discipline for stopping work that isn't delivering value.

At the same time, many organisations struggle to translate the unconstrained potential of AI (what the technology could do) into solutions that can operate within the constraints of an enterprise environment: security, governance, data access, and operational support.

We take a different approach. Our model explicitly bridges these two worlds:

Explore AI capabilities without artificial constraint during ideation and experimentation

Shape solutions that are secure, governed, and scalable in production

Our end-to-end AI delivery model is structured, artefact-driven, and designed to move consistently from idea to production, while building reusable, enterprise-grade AI capabilities that make every subsequent use case cheaper and faster to deliver.

Two layers, running concurrently

Steps 0–4: The Use Case Layer

Governs how each individual idea is discovered, validated, and prepared for build. Is this worth building, and how should we build it?

Cross-cutting: The Portfolio Layer

Governs what to build and when across the entire estate. Given everything we know about demand, capability, and infrastructure, what is the most valuable thing to invest in next?

These layers run concurrently. The portfolio view continuously reprioritises as new use cases are submitted, experiments produce evidence, and capabilities are built.

The Use Case Layer

Step 0: Human-Led Discovery & Framing

We start with the real problem, not just the idea.

Before any formal intake, we conduct targeted stakeholder conversations to understand the operational context behind each AI opportunity. This is deliberate: the best use cases come from people who understand the problem deeply but may not frame it in AI terms.

Understand context

The operational reality and the true business problem, not just the surface-level request

Capture outcomes

Jobs-to-be-done, pains, and desired outcomes from the people who live with the problem daily

Identify constraints

Data availability, system dependencies, regulatory considerations, organisational readiness

Surface prior efforts

What's already been tried, what worked, what didn't, avoiding duplication of effort

This ensures every use case that enters the pipeline is grounded in real user needs and operational reality, not abstract ideas or technology-first thinking. It also builds the relationship between the AI Lab and the business units it serves: the Lab is a partner in solving problems, not a ticket queue.

Output: A clear understanding of the opportunity, ready to be structured through the formal intake process.

Step 1: Use Case Intake

We create clarity from day one.

All use cases, whether newly discovered or pre-existing, enter through a single, governed front door. This is a deliberate design choice. A single intake point standardises inputs across the organisation, ensures every opportunity is assessed on the same basis, prevents duplication of effort, and provides full visibility of the pipeline.

Each use case is systematically captured across five pillars:

Pillar	What It Captures	Why It Matters
Value	Problem statement, business impact, strategic alignment, scale and frequency of the problem	Ensures we're solving problems worth solving
Understanding	Current process maturity, workflow clarity, whether success metrics are defined	Reveals whether the problem is well-enough understood to act on
Data	Data types needed, where data lives, data quality, sensitivity level	Determines what's technically feasible and what governance is required
Capability	AI pattern required (classification, RAG, forecasting, etc.), platform capabilities needed, integration requirements	Maps the use case to the technical capabilities it demands
Readiness	Infrastructure readiness, governance requirements, team capability, blockers and dependencies	Shows whether the organisation is ready to support this use case

The Front Door is initially human-led: the AI Lab team works directly with business users to capture and structure each opportunity. Over time, this evolves into a self-serve portal where business users can submit use cases directly, guided by an AI-assisted intake process that asks the right questions and ensures completeness.

Output: A standardised Use Case Card, scored across all five pillars, with a dependency category assigned, required capabilities identified, and blockers surfaced.
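
As an illustration only, a Use Case Card could be represented as a small data structure. The sketch below assumes a score of 1 to 5 per pillar and an unweighted average as the overall score; the field names, the scale, the scoring rule, and the example values are all hypothetical and are not taken from the intake tooling itself.

```python
from dataclasses import dataclass, field

# The five intake pillars named above.
PILLARS = ("value", "understanding", "data", "capability", "readiness")


@dataclass
class UseCaseCard:
    """Illustrative Use Case Card. Field names and the 1-5 scale are assumptions."""
    name: str
    scores: dict[str, int]            # one score per pillar, 1 (weak) to 5 (strong)
    dependency_category: str          # hypothetical free-text label
    required_capabilities: list[str] = field(default_factory=list)
    blockers: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Every pillar must be scored, and only the five pillars are allowed.
        if set(self.scores) != set(PILLARS):
            raise ValueError(f"scores must cover exactly these pillars: {PILLARS}")
        if any(not 1 <= s <= 5 for s in self.scores.values()):
            raise ValueError("each pillar score must be between 1 and 5")

    def overall_score(self) -> float:
        """Unweighted mean across pillars (an assumed scoring rule)."""
        return sum(self.scores.values()) / len(PILLARS)


# Hypothetical example card.
card = UseCaseCard(
    name="Example knowledge search",
    scores={"value": 4, "understanding": 3, "data": 2, "capability": 4, "readiness": 2},
    dependency_category="needs shared capability",
    required_capabilities=["RAG"],
    blockers=["data sensitivity review outstanding"],
)
print(card.overall_score())  # 3.0
```

Keeping the card as structured data, rather than free text, is what would allow cards to be compared on the same basis across the pipeline.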

Step 2: Experiment Engine

We prove value before we build.

Prioritised use cases do not go straight to development. They enter a structured experimentation process designed to validate, or invalidate, the core assumptions before any meaningful investment is made. This process operates as two connected cycles:

Business Design Cycle

Is this worth building?

Test Cycle

Can this actually work?

The Business Design Cycle

Ideate

We start with a clear understanding of the problem, user context, and desired outcomes, ensuring focus on real operational challenges.

Working sessions with stakeholders, domain experts, and delivery teams rapidly explore solution options. Problems are reframed into opportunity statements, AI intervention points identified, and multiple approaches explored in parallel using proven patterns such as RAG, classification, and workflow automation.

Ideas are quickly assessed against desirability and viability, creating a disciplined funnel from opportunity to testable concept.

Business Prototype

Promising ideas are translated into lightweight, working prototypes, not to prove technical perfection, but to test value.

Prototypes are built rapidly using reusable components aligned to the AI Capability Library (e.g. retrieval patterns, summarisation, speech interfaces), combined with representative data and simple interfaces grounded in real workflows. This allows us to assemble working solutions quickly, rather than building from first principles. Development is strictly time-boxed to maintain pace and avoid over-engineering.

Each prototype is designed to answer three questions:

Does this meaningfully solve the problem? Is the output useful and understandable? Would users adopt this in practice?

Stakeholders engage directly with the prototype, enabling fast feedback, refinement, or rejection before further investment.

Assess

Each use case is evaluated across three lenses:

Desirable: does this solve a real problem and will users adopt it?
Viable: does this deliver measurable business value?
Feasible: can this be built and scaled?

At this stage, desirability and viability take priority. Feasibility is not treated as a hard gate, allowing high-value opportunities to progress even if capabilities are not yet in place, and enabling informed, portfolio-level investment decisions.

The Test Cycle

Hypothesise

Each experiment is defined with precision: We believe that... To verify, we will... We will measure... We are right if...

Experiments use real data, with success criteria defined upfront.

To ensure consistency and repeatability, we apply a standardised evaluation test suite, aligned to common AI capability patterns (e.g. retrieval quality, summarisation accuracy, workflow outputs). This allows experiments to be assessed objectively, rather than relying on subjective judgement. All results are captured in an Experimentation Log, providing a transparent, auditable record of what was tested, learned, and decided.
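
To make the hypothesis template concrete, the sketch below records one experiment in a simple in-memory log. It is a minimal illustration, assuming a single numeric metric where higher is better; the class names, field names, and example values are hypothetical and do not describe the actual Experimentation Log format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hypothesis:
    """Mirrors the four-part template; the numeric threshold is an assumption."""
    we_believe_that: str
    to_verify_we_will: str
    we_will_measure: str
    we_are_right_if: float            # success threshold, agreed before the run


@dataclass
class ExperimentRecord:
    hypothesis: Hypothesis
    observed: Optional[float] = None  # filled in once the experiment has run

    def passed(self) -> Optional[bool]:
        """None until a result exists, then True if the threshold was met."""
        if self.observed is None:
            return None
        return self.observed >= self.hypothesis.we_are_right_if


experiment_log: list[ExperimentRecord] = []

# Hypothetical example entry.
record = ExperimentRecord(
    hypothesis=Hypothesis(
        we_believe_that="retrieval over policy documents answers routine queries",
        to_verify_we_will="run the evaluation suite on a sample of real queries",
        we_will_measure="share of answers rated correct by reviewers",
        we_are_right_if=0.8,
    )
)
experiment_log.append(record)
record.observed = 0.85
print(record.passed())  # True
```

Fixing the threshold before the result is recorded is the point of the structure: the success criterion cannot quietly move after the data arrives.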

Experiment & Learn

Experimentation builds confidence over time; it is not a single pass/fail step. After each experiment, confidence is updated across desirability and viability. Multiple targeted experiments are run to reduce uncertainty. Strong signals increase confidence; weak or negative signals trigger refinement or alternative approaches.

The evaluation test suite is reused and extended across experiments, ensuring results are comparable as the solution evolves. If confidence cannot be raised to an acceptable level, the use case is stopped.

Decide

We make evidence-based decisions.

Each use case reaches a formal decision point:

Kill: insufficient value or confidence. Work stops and investment is redirected.
Iterate: promising but inconclusive. Refine and re-test.
Use Case Validated: sufficient confidence to proceed.

Kill discipline is intentional. Stopping weak ideas early protects investment and prevents accumulation of low-value solutions.
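
The three outcomes can be sketched as a simple rule over confidence scores. This is an illustrative simplification, assuming desirability and viability confidence are each expressed as a number between 0 and 1 and that the weaker of the two drives the decision; the thresholds and the function name are hypothetical, and in practice the decision is made by people in the decision forum, informed by the evidence.

```python
from enum import Enum


class Decision(Enum):
    KILL = "Kill"
    ITERATE = "Iterate"
    VALIDATED = "Use Case Validated"


def decide(desirability: float, viability: float,
           kill_below: float = 0.3, validate_at: float = 0.7) -> Decision:
    """Map two confidence scores in [0, 1] to a decision.

    The thresholds are assumptions for illustration, not agreed values.
    """
    weakest = min(desirability, viability)  # the weaker lens limits confidence
    if weakest < kill_below:
        return Decision.KILL
    if weakest >= validate_at:
        return Decision.VALIDATED
    return Decision.ITERATE


print(decide(0.8, 0.75).value)  # Use Case Validated
print(decide(0.8, 0.2).value)   # Kill
print(decide(0.6, 0.9).value)   # Iterate
```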

The Operating Rhythm

Experimentation runs on a structured cadence to maintain pace and transparency:

Weekly

Define hypotheses & experiments

Daily

Standups, momentum & blockers

Weekly

Learning reviews & direction

Monthly

Decision forums with stakeholders

This ensures continuous learning, visible progress, and shared ownership of decisions.

Figure: Experiment Engine, with the Business Design Cycle and Test Cycle operating as two connected loops

Step 3: Build Readiness

We formalise before we scale.

Validated use cases are translated into a Build Readiness Pack, a structured, evidence-based handoff from experimentation to delivery, developed collaboratively with business, architecture, security, data protection, and operations teams.

Problem & Value

Validated problem statement and expected outcomes

Learnings & Approach

Key findings and recommended solution path

Capabilities

Required capabilities and platform alignment

Governance

Security, architecture, and compliance requirements

MVP Scope

Scope, risks, and ownership

Recommendation

Proceed, iterate, or stop

This ensures delivery begins with clarity, alignment, and agreed constraints, not assumptions.

Governance is by design, not added later. Existing forums (architecture, security, data protection) are used to validate decisions early, avoiding late-stage blockers.

Step 4: Production Delivery & Scaling

We move directly from validated use case to controlled build. Once approved, the use case enters delivery via the roadmap's "Now" horizon. The focus shifts from validation to execution.

Development begins with an MVP built on production-aligned architecture, governed data access, and reusable platform components. This is not a prototype; it is the foundation of a scalable solution. The Build Readiness Pack feeds directly into delivery, generating structured engineering epics covering data, models, applications, security, and operational readiness. Teams can begin work immediately with clear scope and ownership.
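
One way to picture the handoff from pack to backlog is sketched below. It assumes the Build Readiness Pack holds requirements grouped by the five epic areas named above; the class, the field names, the `generate_epics` helper, and the example content are hypothetical and are not the actual pack schema.

```python
from dataclasses import dataclass

# The five areas that delivery epics cover, as listed in the text.
EPIC_AREAS = ("data", "models", "applications", "security", "operational readiness")


@dataclass
class BuildReadinessPack:
    """Illustrative pack; field names are assumptions."""
    use_case: str
    owner: str
    requirements: dict[str, list[str]]  # epic area -> requirements captured in the pack


def generate_epics(pack: BuildReadinessPack) -> list[dict]:
    """Create one epic per area so that no area is silently skipped."""
    return [
        {
            "title": f"{pack.use_case}: {area}",
            "owner": pack.owner,
            "stories": list(pack.requirements.get(area, [])),
        }
        for area in EPIC_AREAS
    ]


# Hypothetical example pack.
pack = BuildReadinessPack(
    use_case="Example knowledge search",
    owner="Product team",
    requirements={
        "data": ["Ingest approved document sources"],
        "security": ["Apply access controls by data classification"],
    },
)
epics = generate_epics(pack)
print(len(epics))          # 5
print(epics[0]["title"])   # Example knowledge search: data
```

An epic with no stories is still emitted, which makes a gap in the pack visible at planning time rather than later in delivery.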

Delivery progresses through three distinct stages:

MVP: Prove it works in reality

The solution is deployed to real users with real data. The objective is to demonstrate measurable value in a live context, not a simulated one.

Scaling: Prove it works reliably

Before wider rollout, the solution is validated with enterprise stakeholders (architecture, security, DPO, operations). It is then scaled progressively using controlled release strategies. Performance, reliability, and adoption are monitored closely, with decisions to expand, refine, or halt based on evidence.

Scaling is addressed across two dimensions:

Vertical scaling

Increasing volume, performance, and reliability within the use case

Horizontal scaling

Extending the solution across users, business units, and additional use cases

This ensures solutions are not only technically robust, but capable of delivering value at enterprise scale.

Operate: Prove it continues to deliver value

The solution transitions into a managed product with defined service levels, monitoring, incident management, and clear ownership across teams. All solutions are built using reusable, standardised components, including shared pipelines, integration patterns, and observability frameworks.

The same discipline applied in experimentation continues through delivery. If a solution does not demonstrate expected value, adoption, or performance, it is refined or stopped, not scaled.

Delivery Acceleration & Reusable Assets

Delivery is accelerated through a set of reusable assets embedded across each stage of the lifecycle. These are not standalone tools, but integrated components used during discovery, experimentation, and production delivery.

Stage	Reusable Assets	Purpose
Discovery & Intake	Use Case Card template, structured intake framework	Standardises inputs, ensures consistent evaluation and prioritisation
Experiment Engine	Prompt templates, RAG prototypes, evaluation harness, Experiment Library	Rapidly test ideas using proven patterns and measurable criteria
Business Prototype	Low-code UI patterns, workflow templates, sample datasets	Quickly create interactive prototypes aligned to real workflows
Build Readiness	Build Readiness Pack template, architecture patterns	Translate validated ideas into delivery-ready specifications
MVP Build	Reference architectures, reusable pipelines, orchestration patterns	Accelerate development using production-aligned components
Scaling & Operate	Monitoring frameworks, logging standards, evaluation pipelines	Ensure reliability, performance, and continuous improvement in production

The Portfolio Layer

We don't prioritise use cases in isolation; we invest at the capability level.

While the Use Case Layer validates individual opportunities, the Portfolio Layer determines what to build and when across the estate. It brings all use cases into a single view, enabling informed, evidence-based investment decisions.

Maximise value by investing in the capabilities that unlock the most impact.

Map Demand Across Use Cases: Aggregate capability requirements

Each validated use case defines a set of required capabilities, such as retrieval (RAG), classification, forecasting, or workflow automation.

When aggregated, these requirements create a clear, structured view of demand across the organisation. This allows us to identify common patterns, shared dependencies, and opportunities for reuse, shifting the focus from individual solutions to underlying capabilities.
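
The aggregation step can be illustrated in a few lines. The use case names and their capability lists below are placeholders, not Serco's actual portfolio.

```python
from collections import Counter

# Hypothetical mapping of validated use cases to the capabilities they require.
use_case_capabilities = {
    "Use case A": ["RAG", "classification"],
    "Use case B": ["RAG", "workflow automation"],
    "Use case C": ["forecasting", "RAG"],
}

# Count how many use cases need each capability (set() avoids double counting
# a capability listed twice for the same use case).
demand = Counter(
    capability
    for capabilities in use_case_capabilities.values()
    for capability in set(capabilities)
)

print(demand.most_common(1))  # [('RAG', 3)]
```

Even in this toy example the shared dependency stands out: one capability appears in every use case, while the others each appear once.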

Assess Current Capability Maturity: Evaluate what already exists

In parallel, we assess the current technology landscape, including existing AI solutions, data platforms, integrations, and infrastructure.

This is not a static inventory. Capabilities are evaluated for scalability, governance, and reusability, providing a clear view of what can be leveraged, what requires enhancement, and what is missing entirely.

Capability Gap Analysis

By comparing demand with supply, we identify the capability gap: the set of capabilities that must be built, enhanced, or standardised to support the portfolio.

This reframes the investment question:

"Which use case should we build next?"

→

"Which capability unlocks the most value across multiple use cases?"

Feasibility is assessed at this level, considering dependencies such as data platforms, infrastructure, security, and governance.
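
A minimal sketch of the gap analysis follows. It assumes that "value unlocked" can be approximated by the number of use cases waiting on a missing capability, which is a deliberate simplification; the function name and the example data are hypothetical.

```python
def capability_gaps(demand: dict[str, set[str]],
                    available: set[str]) -> list[tuple[str, int]]:
    """Return missing capabilities, ranked by how many use cases need them.

    Counting use cases is an assumed proxy for value; a real assessment
    would also weigh business impact, cost, and dependencies.
    """
    unlocks: dict[str, int] = {}
    for required in demand.values():
        for capability in required - available:   # demand minus supply
            unlocks[capability] = unlocks.get(capability, 0) + 1
    # Highest count first; ties broken alphabetically for stable output.
    return sorted(unlocks.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical demand and supply.
demand = {
    "Use case A": {"RAG", "classification"},
    "Use case B": {"RAG", "workflow automation"},
    "Use case C": {"forecasting", "RAG"},
}
available = {"classification"}

print(capability_gaps(demand, available))
# [('RAG', 3), ('forecasting', 1), ('workflow automation', 1)]
```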

Business Case Development & Value Bundling

Use cases are not progressed in isolation. As part of portfolio management, we translate prioritised opportunities into structured business cases, aligned to Serco's investment and governance processes.

Now (Short-term)

Clear, near-term value using existing capabilities. Focused on quick wins, efficiency gains, and early adoption.

Next (Medium-term)

Requires targeted capability investment (e.g. data readiness, integration, orchestration). Combines delivery of use cases with capability build.

Later (Long-term)

Dependent on more advanced or emerging capabilities. Positioned as strategic opportunities, not immediate commitments.

Each business case includes expected business outcomes (e.g. time saved, cost reduction, improved contract performance), delivery scope and dependencies, required capability investments, indicative cost vs value profile, and success metrics and adoption assumptions.

Value Bundling

Where multiple use cases rely on the same underlying capabilities, we group them into investment bundles rather than assessing them independently.

Shared RAG capability → supports reporting, risk identification, and knowledge access

Shared orchestration layer → enables multiple operational use cases

This allows capability costs to be amortised across multiple use cases, produces stronger, more compelling business cases, and avoids duplicated investment. Instead of funding isolated use cases, Serco invests in capabilities that unlock multiple outcomes.
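
The amortisation effect is simple arithmetic, sketched below with made-up figures. It assumes the shared capability's cost is split evenly across the bundle, which is only one possible allocation rule; the function name and all numbers are hypothetical.

```python
def cost_per_use_case(capability_cost: float,
                      use_case_costs: list[float]) -> list[float]:
    """Add an equal share of a shared capability's cost to each use case."""
    if not use_case_costs:
        raise ValueError("a bundle needs at least one use case")
    share = capability_cost / len(use_case_costs)
    return [cost + share for cost in use_case_costs]


# Hypothetical figures in arbitrary cost units.
standalone = cost_per_use_case(120.0, [30.0])
bundled = cost_per_use_case(120.0, [30.0, 40.0, 50.0])

print(standalone)  # [150.0]
print(bundled)     # [70.0, 80.0, 90.0]
```

In this example the first use case carries 150 units when funded alone but 70 when bundled with two others that rely on the same capability.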

Capability-Led Roadmapping

The Portfolio Layer continuously integrates new evidence (validated use cases, experiment results, and capability maturity assessments) into an evolving, evidence-based roadmap.

Summary

This approach enables Serco to move beyond isolated AI initiatives and instead build a coherent, scalable AI capability. This end-to-end model delivers five outcomes:

Systematic identification and prioritisation

Every AI opportunity is captured, assessed, and compared on the same basis. Nothing falls through the cracks. Investment goes where the evidence points.

Validation before commitment

No use case reaches production without passing through structured experimentation and evidence-based decision gates. This protects against the most common AI failure mode: building something nobody needs.

Scalable, production-ready delivery

Solutions are built on reusable components and shared infrastructure, not as isolated projects. Each use case strengthens the platform for the next one.

Maximum return on capability investment

The AI Capability Library and Capability Gap Analysis ensure that infrastructure investments are strategic, building Enabler capabilities that unlock the broadest set of future opportunities, not just solving one problem at a time.

Internal capability, not external dependency

Knowledge transfer is embedded in every ceremony, every artefact, and every handoff point. The methodology is designed to be owned and operated internally. Our success is measured by whether you can run the next use case without us.

Section 3 AI Foundations & Infrastructure

1. Implementation Approach
2. Operating Model
3. Reference Architecture
4. Data Classification
5. Core Capabilities
6. Recommendations

Focusing Statement

What are the key considerations for establishing the target architecture and implementation approach for Serco's global AI infrastructure? Responses should, as a minimum, address the areas below and state any assumptions and prerequisites.

Operating model
Evolving reference architecture
Responsible AI controls
RAG / knowledge layer
Data foundations
Model strategy
LLMOps / MLOps
Security and compliance
Monitoring and observability

Implementation Approach

Our high-level implementation approach moves from analysis, to solution delivery, and finally to platform expansion:

Understand what exists

A critical input to the target architecture is identifying what we can scale from existing Serco infrastructure (the bright spots) versus what must be newly implemented.

Our maturity assessment (Section 1) evaluates these bright spots across all four divisions.

→

Deliver a key use case

We leverage those bright spots and add additional capabilities by delivering a prioritised use case into production.

This helps us understand exactly what it takes to ship an AI product at Serco, through our defined approach and methodology.

→

Build out the platform

Through the assessment and delivery, we build out repeatable patterns and core infrastructure.

The delivery team evolves into the AI Platform Team, supporting accelerated delivery of use cases that adhere to core standards and guardrails.

The recommended future-state operating model (who owns what, and how the platform serves product teams building AI agents and solutions) is defined first in this section. Combined with our approach and methodology for delivering use cases, it is what will enable Serco's vision for this engagement.

We then introduce the AI Capability Library, which operates as the backbone for our evolvable reference architecture. Building on our experience of establishing Internal Developer Platforms for central government and beyond, this is a key enabler for accelerating product delivery while maintaining key guardrails.

Finally, we outline how we evaluate the core capabilities of the reference architecture, and the questions we may need to ask along the way.

Operating Model: Products & Platform

Adopting a Products & Platform Structure

Our recommended operating model is driven by our experience in deploying Product and Platform teams across the public and private sectors. This approach provides product team autonomy and accelerated product delivery while still operating within the required guardrails. An example of how this could be introduced at Serco is visualised below:

Figure summary:

Product teams (examples: Collaboration Hub, Bid Agent, Contracts Agent, Security Agent): select, build, and iterate. Initially self-serve, with key touchpoints from build readiness onwards and continuous feedback.
AI Capability Library (Agentic Patterns, Atomic Capabilities): consumed by product teams and curated by the AI Platform Team.
AI Platform Team: Build (infrastructure), Operate (shared services), Govern (guardrails + quality), Evaluate (tech radar + lifecycle signal), Enable (sandboxes + DX + training).
Global AI Infrastructure, built and operated by the AI Platform Team: Gateways, Stores, Orchestration, Guardrails / Evals, Monitor / Observe / Alert.
Common Global Infrastructure, consumed by the Global AI Infrastructure: IAM, APM, Data, Cloud, ITSM.
Underpinning all layers: immutable principles (ethical, architectural, operational).

There are clearly demarcated ownership boundaries and interaction modes in this approach.

Global product teams will own their specific AI products: the user experience, the business logic, and the domain knowledge.
The AI platform team owns the shared infrastructure and the AI Capability Library that surfaces it.
Existing or adapted Serco technology teams can continue to own the common capabilities the AI platform may need to consume, such as IAM, API management, and SecurityOps.

The product teams will consume capabilities from the library, aligning to an InnerSource model. The platform team curates the library, operates the infrastructure, and embeds governance structurally so that product teams operate within guardrails by default. Product teams can choose NOT to use a specific capability, but in general that approach will be slower for them and they will need to justify their rationale at Build Readiness.

The AI Platform Team

The Serco AI Platform Team will have five core responsibilities:

Build AI infrastructure
Operate this infrastructure as common capabilities with defined availability, performance, and cost targets
Govern by embedding security, responsible AI, data classification, and audit into the platform itself
Evaluate emerging technologies via the tech radar and process demand signals from product teams
Enable the product teams through sandboxes, documentation, onboarding, and a frictionless experience

AI Platform Lead

Roadmap ownership
Library curation
Key relationships
Facilitation

ML/AI Engineers

Model gateway
Agent orchestration
Evaluation harnesses
Deployment pipelines

Data Engineers

Embedding pipelines
Vector stores
Knowledge graphs
Document ingestion

Security

Compliance controls
Data classification
Responsible AI
Audit and logging

The team would be a joint partnership with Serco. Blackstone& will provide expertise and accelerators, including the AI Capability Library; Serco will provide domain knowledge and additional architectural resources. Over the engagement, the ratio shifts through paired delivery: we lead, then co-lead, then support, then step back. The end state is a Serco-owned platform team operating without external dependency. This is outlined further in Knowledge Transfer.

Ways Of Working with Serco Teams

Product Teams interact with the Platform Team and Capability Library throughout the AI Adoption Framework outlined in Section 1.

At hypothesis and shape, the interaction is lightweight, with product teams browsing the capability library and accessing sandboxes to run their experiments.
At build readiness, the platform team actively engages. We help teams to map their needs against the library, identifying what's on the golden path versus what doesn't exist yet, and ensuring architecture and governance alignment. This is where we jointly decide whether a gap should be fast-tracked into the golden path, added as niche, or treated as product-specific.
Through build, iterate, and validate, product teams build on the golden path where possible. Deviation is permitted where the use case requires it, but it is governed: the platform team knows about it, tracks it, and learns from it. Sprint reviews and demos create shared visibility.
At live and operate, the platform team monitors whether the capabilities are providing the required value and performing to expectations.

This creates a continuous feedback loop. Product teams surface real-world needs, the platform team evaluates and responds, and the library evolves from delivery experience, not from a purely theoretical architecture.

Support

AI products can degrade gradually through quality drift, retrieval relevance decay, or model provider changes that alter output characteristics. Monitoring tools such as Datadog and LangSmith provide capabilities that capture not just token usage and latency but also drift, which we track as part of our delivery pipelines.

The platform team monitors and maintains shared infrastructure: gateway availability, vector store performance, model routing, guardrails enforcement, and pipeline reliability. Platform incidents are managed through Serco's existing ITSM processes with defined SLAs.
The product team monitors and maintains product-specific quality: response accuracy, user satisfaction, domain relevance, and business value.

The platform provides the monitoring and evaluation tooling; the product team defines what "good" looks like for its domain and acts on the signals.

Escalation between levels is defined: if product-level quality degrades and the root cause is a platform capability (e.g., vector search latency, model performance regression) then it escalates to the platform team. If the root cause is product-specific (e.g., outdated grounding data, prompt drift) then the product team resolves it using the platform's evaluation and versioning tools.
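
The escalation rule above can be sketched as a simple lookup. The root-cause labels are taken from the examples in the text; the function name and the "joint triage" fallback for unrecognised causes are assumptions added for illustration.

```python
# Root causes from the examples above, grouped by owning team.
PLATFORM_CAUSES = {"vector search latency", "model performance regression"}
PRODUCT_CAUSES = {"outdated grounding data", "prompt drift"}


def route_incident(root_cause: str) -> str:
    """Return the team that owns resolution for a given root cause."""
    if root_cause in PLATFORM_CAUSES:
        return "platform team"
    if root_cause in PRODUCT_CAUSES:
        return "product team"
    # Assumption: anything unrecognised is triaged jointly before assignment.
    return "joint triage"


print(route_incident("vector search latency"))  # platform team
print(route_incident("prompt drift"))           # product team
```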

Post go-live, the platform team also manages the evolution cycle: quarterly tech radar reviews, golden path updates, and capability deprecation with migration support, ensuring that live products are not disrupted by platform evolution.

Reference Architecture

A static reference architecture document can become outdated the moment it is published, regardless of how well it is designed. Model capabilities are advancing on a quarterly basis and new patterns (agentic workflows, tool-use orchestration, multi-modal reasoning) emerge faster than an ARB (Architecture Review Board) can evaluate them. The reference architecture must be a living, consumable, evolvable artefact rather than a static document.

Reference Architecture Evolution

An architecture that maintains currency means Serco will never be locked into yesterday's decisions as new opportunities emerge. For this to be truly effective, developed solutions need to maintain evolvability as a key architectural principle. This enables solutions to, for example, swap out models via a simple config change.
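
A minimal sketch of what a config-driven model swap could look like follows. It assumes application code asks for a model by task name and never hard-codes a provider; the task names, provider names, model names, and the `resolve_model` helper are all placeholders rather than real products or an agreed design.

```python
import json

# Hypothetical routing config; in practice this would live outside the code.
CONFIG_JSON = """
{
  "summarisation": {"provider": "provider-a", "model": "model-small"},
  "retrieval_qa":  {"provider": "provider-b", "model": "model-large"}
}
"""


def resolve_model(task: str, config: dict) -> str:
    """Look up which model serves a task; callers never name a model directly."""
    entry = config[task]
    return f"{entry['provider']}/{entry['model']}"


config = json.loads(CONFIG_JSON)
print(resolve_model("summarisation", config))  # provider-a/model-small

# Swapping the model is a config edit; the calling code is unchanged.
config["summarisation"]["model"] = "model-medium"
print(resolve_model("summarisation", config))  # provider-a/model-medium
```

Because product code depends only on the task name, a change of model or provider does not ripple into the products that consume it.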

Golden Path

Assessed, approved, production-ready.

The recommended, supported way to build AI products. Pre-approved security posture, established pipelines, shared documentation. Deviation permitted but governed.

Surfaces

↔

AI Capability Library

The living reference architecture.

Agentic patterns and atomic capabilities with data classification governance. What the platform team builds is surfaced here for product teams to consume.

Feeds

↔

Radar

Top-down: continuous evaluation of emerging models, tools, and patterns.

Bottom-up: demand signals from product teams at every lifecycle stage (sandbox experiments, build gaps, live performance).

Monitoring, Feasibility and Alerting

Our Capability Library enables organisations to act on what's new without destabilising what's already working. Top-down and bottom-up radars capture what's emerging and what's relevant to Serco's context. The radar helps to filter signals from noise.

Product demand

As new product teams come through the adoption lifecycle, they surface capability gaps at build readiness. Some gaps are fast-tracked into the golden path. Some are added as niche capabilities. Some are product-specific and stay that way. The platform team makes a deliberate decision for each.

Technology radar

In parallel, the platform team systematically evaluates emerging models, tools, and patterns on a quarterly cycle (assess, trial, adopt, hold). This ensures the platform stays current with advances in the field, not just reactive to product team requests.
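
For illustration, radar entries could be tracked as simple records. The four rings come from the cycle named above; the class and field names, the example entries, and the idea that "adopt" entries become golden path candidates are assumptions rather than a description of the actual radar.

```python
from dataclasses import dataclass
from enum import Enum


class Ring(Enum):
    ASSESS = "assess"
    TRIAL = "trial"
    ADOPT = "adopt"
    HOLD = "hold"


@dataclass
class RadarEntry:
    name: str
    ring: Ring
    source: str  # "top-down" (platform evaluation) or "bottom-up" (product demand)


def golden_path_candidates(entries: list[RadarEntry]) -> list[str]:
    """Assumed rule: entries in the adopt ring are proposed for the golden path."""
    return [entry.name for entry in entries if entry.ring is Ring.ADOPT]


# Hypothetical radar contents.
radar = [
    RadarEntry("Example retrieval approach", Ring.ADOPT, "bottom-up"),
    RadarEntry("Example new model", Ring.TRIAL, "top-down"),
    RadarEntry("Example legacy tool", Ring.HOLD, "top-down"),
]

print(golden_path_candidates(radar))  # ['Example retrieval approach']
```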

Over time, the capability library and golden path become richer, the agentic patterns become more mature, and new product teams get to production faster because more of what they need already exists.

The Blackstone& Capability Library

Blackstone& developed the AI Capability Library as a curated catalogue of composable, pre-approved patterns and capabilities that product teams can browse, select from, and build on.

The library serves the same purpose as Spotify Backstage does for standard software development: the AI Platform Team curates a set of "golden paths" that help teams navigate the complexity of their solution build.

AI Capability Library: browsable catalogue of agentic patterns and atomic capabilities

When the Platform Team operationalises a new capability (for example a new model via the gateway, a new retrieval approach, or a new guardrails capability), it appears in the library as something teams can consume. The library is always the current state of what is available and approved.

The library is an existing accelerator we will bring to this engagement.

Two levels of abstraction

Agentic Patterns

Reusable templates for building agents. Each pattern pre-wires the orchestration, memory, tool access, and guardrails an agent type needs. A product team picks a pattern, configures it for their domain, and gets a working agent with governance already embedded. Based on the provided use cases, possible starting points could be:

Pattern	What It Does	What Comes Pre-Wired	Serco Use Cases It Suits
Knowledge Worker	Answers questions from a document corpus	RAG orchestration, source citation, confidence scoring, human escalation	Collaboration Hub, Resource Mapping Agent
Analytical	Monitors, analyses, and reports on data	Scheduled triggers, dashboard integration, threshold alerting, reporting	Finance Genie, Contract Risk Agent, Operational Management Genie
Process Automation	Executes multi-step workflows with system integrations	Approval gates, audit logging, rollback, human-in-the-loop at defined decision points	HR Agent, Complaints Processing, Smart Payroll
Scanner	Continuously watches sources and surfaces relevant information	Continuous ingestion, relevance scoring, notification routing	Bid Scanner, Regulation Scanner, Market Scan Agent

These are suggested starting points. As Serco delivers more use cases, patterns will be refined and new ones will emerge from delivery experience.

Atomic Capabilities

The individual building blocks that agentic patterns compose from. These are what the infrastructure surfaces as consumable services and are outlined in more depth in the Core Capabilities section.

Category	Capabilities	Powered By
Retrieval & Reasoning	Document Q&A, semantic search, summarisation, multi-document reasoning	RAG / knowledge infrastructure
Extraction & Action	Structured data extraction, classification, tool calling	Models and orchestration
Safety Enablers	PII redaction, data classification, anonymisation	Enabler capabilities that unlock others safely
Agent Runtime	Agentic workflows, agent memory, state management	Orchestration and knowledge infrastructure

Each capability is classified by maturity tier (Enabler, Foundational, Desirable or Niche) reflecting how broadly proven and supported it is.

Data Classification & Governance

Every capability in the library is tagged with the data classification levels it's approved for. The same capability (document Q&A, for example) works differently depending on how sensitive the data is:

For lower-sensitivity data, a cloud-hosted commercial model via API may be fine
For higher-sensitivity data, a sovereign-hosted model with stricter guardrails, audit logging, and access controls may be required

This means security and compliance decisions are built into the library, not managed through separate review processes. In practice:

A product team declares their data classification level
The library shows them what's approved for that level
The model gateway enforces it at runtime with no manual checks needed
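
The three steps above can be sketched in a few lines. This is a minimal illustration only: the four classification levels, the capability entries, and the names `approved_for` and `gateway_route` are hypothetical, and Serco's actual scheme would come from discovery.

```python
from dataclasses import dataclass
from enum import IntEnum


class Classification(IntEnum):
    # Hypothetical levels for illustration; the real scheme comes from discovery.
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


@dataclass(frozen=True)
class Capability:
    name: str
    hosting: str                         # e.g. "commercial-api" or "sovereign-hosted"
    max_classification: Classification   # highest level this variant is approved for


# Hypothetical library entries: the same capability, approved at different levels.
LIBRARY = [
    Capability("document-qa", "commercial-api", Classification.INTERNAL),
    Capability("document-qa", "sovereign-hosted", Classification.RESTRICTED),
    Capability("summarisation", "commercial-api", Classification.INTERNAL),
]


class PolicyViolation(Exception):
    """Raised when a request is not approved for the declared classification."""


def approved_for(level: Classification) -> list[Capability]:
    """What the library shows a team that has declared `level`."""
    return [c for c in LIBRARY if c.max_classification >= level]


def gateway_route(capability: str, hosting: str, level: Classification) -> Capability:
    """Runtime check: refuse any variant that is not approved for `level`."""
    for c in approved_for(level):
        if c.name == capability and c.hosting == hosting:
            return c
    raise PolicyViolation(f"{capability} via {hosting} is not approved for {level.name}")
```

In this sketch the library view and the runtime check share one source of truth, so what a team can browse and what the gateway will allow cannot drift apart.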

Serco's specific classification scheme will be established through an initial discovery period and aligned to the library in the first weeks of the engagement.

The Golden Path

The golden path concept originates from platform engineering, as pioneered by companies like Spotify through their Backstage internal developer platform. It's the recommended, supported way to build. Not a mandate, but a strong default that makes the right thing the easy thing.

For Serco, the golden path will be the combination of capabilities, patterns, and infrastructure that the platform team has validated and proven through delivery. Teams who follow it get:

Pre-approved security and compliance posture
Established pipelines and monitoring
Shared documentation and community support
Faster time to production

This matters in Serco's operating environment where building outside proven, governed patterns creates risk. The golden path reduces that risk by default.

Deviation is permitted. A computer vision use case has genuinely different needs to a document Q&A agent. But deviation is governed: product teams must justify their rationale at build readiness, and the platform team tracks each deviation. If multiple teams deviate in the same direction, that's a signal to update the golden path.

Immutable Principles

Before any technology choices are made, we agree a set of principles with Serco that govern all subsequent decisions. The architecture evolves continuously; the principles do not.

Category	Principle	What it means in practice
Ethical	Human oversight for consequential decisions	Agents don't make high-impact decisions alone and humans stay in the loop where it matters
Ethical	Transparency proportionate to risk	The higher the stakes, the more explainable the AI must be
Ethical	Fairness and bias monitoring	Outputs are continuously checked for bias, not just at launch
Ethical	Privacy by design	Data protection is built in from the start, not bolted on later
Architectural	Composability over monoliths	Small, swappable components, not large, tightly coupled systems
Architectural	Abstraction at interfaces	Product teams are insulated from infrastructure changes happening underneath
Architectural	Data sovereignty by default	Data stays in-jurisdiction unless explicitly approved otherwise
Architectural	Open standards preferred	Avoid vendor lock-in; make it possible to change direction
Operational	Everything observable	If it runs, it's monitored; no black boxes
Operational	Everything auditable	If an agent acts, there's a record of what it did and why
Operational	Progressive rollout	Changes are deployed gradually, not all at once
Operational	Capability building over dependency	Serco teams own the outcomes; we build capability, not reliance

Establishing Core Capabilities

Build the platform through delivery

The fastest and most reliable way to establish Serco's core AI capabilities is to build them through the delivery of a real use case, starting with, for example, the Collaboration Hub.

The AI platform team begins by delivering the use case end-to-end. Through that delivery, our team, working in partnership with Serco, makes real technology choices, builds real infrastructure, and solves real problems.

The components that emerge (RAG pipelines, vector stores, model routing, guardrails, monitoring) become the first capabilities in the library and the foundation of the golden path.

This approach means:

Technology choices are validated through delivery, not theoretical evaluation
The first agentic pattern is proven before other product teams need it
The team builds operational muscle on a real product before needing to scale it to support multiple teams
Serco engineers are embedded from day one, building capability through the work itself

Once the use case is live, the team pivots from product delivery to platform operation. It supports the next wave of product teams (Bid Agent, Contract Risk, etc.), building out additional patterns and capabilities as demand requires. The capability library grows from delivery experience, not from a theoretical roadmap.

In parallel, the platform team establishes the technology radar for systematic evaluation of emerging capabilities beyond what current delivery demands.

Considerations when building agents

Serco's use case list is heavily agent-focused. Deploying agents at enterprise scale introduces cost and quality challenges that aren't addressed in a standard infrastructure checklist, but matter enormously in production.

Agent Cost Economics

A single agent request can trigger 3 to 10 model calls. Without active management, costs spiral. We have seen implementations where expensive models were hardwired in so the cost to serve made no economic sense.

Our approach: intelligent model routing, prompt caching, and cost-per-outcome tracking rather than cost-per-token.

In practice, these techniques reduce agent operating costs by 70 to 90 percent compared to naive implementations.
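
As an illustration of model routing and cost-per-outcome tracking, here is a minimal sketch. The prices, step names, and the `route_model` rule are hypothetical placeholders rather than benchmarked values, and prompt caching is not modelled.

```python
from dataclasses import dataclass

# Hypothetical prices per 1,000 tokens, for illustration only.
PRICE_PER_1K_TOKENS = {"small": 0.001, "large": 0.03}


def route_model(step: str) -> str:
    """Send simple steps to the cheaper model; reserve the large one for the rest."""
    simple_steps = {"classify", "extract", "rerank"}
    return "small" if step in simple_steps else "large"


@dataclass
class ModelCall:
    step: str
    tokens: int

    @property
    def cost(self) -> float:
        return self.tokens / 1000 * PRICE_PER_1K_TOKENS[route_model(self.step)]


def cost_per_outcome(requests: list[tuple[list[ModelCall], bool]]) -> float:
    """Total spend divided by the number of requests that achieved their outcome."""
    total = sum(call.cost for calls, _ in requests for call in calls)
    successes = sum(1 for _, succeeded in requests if succeeded)
    if successes == 0:
        raise ValueError("no successful outcomes to attribute cost to")
    return total / successes
```

Dividing by successful outcomes rather than by tokens means failed or abandoned requests still count as spend, which is the point of tracking cost per outcome.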

Context Engineering

Prompt engineering gets the attention, but context engineering determines quality. How retrieval results, system instructions, memory, and user input are assembled matters as much as the prompt itself.

Poor context assembly is the most common cause of hallucination. Our platform provides tooling to inspect, debug, and optimise context construction.
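
To make the idea of inspectable context construction concrete, the sketch below assembles a context from prioritised parts and returns a trace of what was included. It is an assumption-laden illustration: `ContextPart`, `assemble_context`, and the use of word counts as a stand-in for model tokens are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ContextPart:
    source: str     # "system", "memory", "retrieval" or "user"
    text: str
    priority: int   # lower number = kept first when the budget is tight


def assemble_context(parts: list[ContextPart], budget_words: int) -> tuple[str, list[dict]]:
    """Build the context within a word budget and return an inspection trace."""
    trace = []
    kept = []
    used = 0
    # Visit parts in priority order; ties keep their original order (stable sort).
    for index, part in sorted(enumerate(parts), key=lambda p: p[1].priority):
        size = len(part.text.split())
        included = used + size <= budget_words
        if included:
            used += size
            kept.append((index, part))
        trace.append({"source": part.source, "words": size, "included": included})
    # Restore the caller's ordering so the assembled context reads naturally.
    kept.sort(key=lambda p: p[0])
    context = "\n\n".join(f"[{part.source}]\n{part.text}" for _, part in kept)
    return context, trace
```

The trace is what makes the assembly debuggable: when an answer is poor, it shows which retrieval results or memory entries were dropped for lack of budget.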

Self-Adaptive Feedback Loops

Static agents degrade over time. We advocate architectures where agents observe their own outcomes, detect what's working, and refine strategies through structured feedback.

This relies on persistent memory across sessions, reflexion patterns, and continuous evaluation against quality baselines. The goal is agents that get better with use, not worse.

Agent Memory Architecture

Serco's agents need more than single-session context: for example, a Bid Agent that remembers successful patterns, or a Contract Risk Agent that builds knowledge of recurring risks.

We propose a layered memory system: working (current session), episodic (past interactions), and semantic (learned knowledge). Memory management is an active design concern, not an afterthought.
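
The working, episodic, and semantic layers can be sketched as a single structure. This is a simplified illustration under stated assumptions: `AgentMemory` and its methods are hypothetical, and a production design would persist the long-term layers rather than hold them in process.

```python
from collections import Counter, deque
from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    # Working memory: turns from the current session only, capped in size.
    working: deque = field(default_factory=lambda: deque(maxlen=20))
    # Episodic memory: summaries of past interactions.
    episodic: list = field(default_factory=list)
    # Semantic memory: learned knowledge, represented here as simple tallies.
    semantic: Counter = field(default_factory=Counter)

    def observe(self, turn: str) -> None:
        self.working.append(turn)

    def end_session(self, summary: str, learned: list[str]) -> None:
        """Promote the session into the long-term layers, then clear working memory."""
        self.episodic.append(summary)
        self.semantic.update(learned)
        self.working.clear()

    def recurring(self, min_count: int = 2) -> list[str]:
        """Knowledge recorded at least `min_count` times, e.g. recurring contract risks."""
        return sorted(item for item, count in self.semantic.items() if count >= min_count)
```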

These are considerations we will address through delivery of the initial use cases, building the right patterns into the platform from the start rather than retrofitting them later.

Capability Recommendations

Ultimately, our recommendations will acknowledge Serco's existing infrastructure (AWS, Databricks) and be validated through discovery and our initial use case delivery.

Capability area	What it provides	Decisions to validate
RAG / knowledge layer	Answer questions from documents. Find relevant content by meaning. Summarise and reason across multiple sources.	Vector store: Databricks Vector Search vs pgvector vs OpenSearch. Knowledge graph: Neo4j vs Amazon Neptune. Chunking strategy tuned per document type.
Data foundations	AI-ready data from Serco's existing estate. Classification before data enters the AI layer.	Integration boundary with Databricks programme. Classification taxonomy aligned to Serco's governance. Which source systems to connect first.
Model strategy	Right model for the task. Governed by data classification level. Product teams never choose a model directly.	Hosting per classification level. Cost/performance benchmarks on Serco tasks. Sovereignty requirements per jurisdiction.
LLMOps / MLOps	Version-controlled prompts and chains. Automated evaluation before deployment. Rollback if quality drops.	Evaluation criteria per domain. Prompt and context engineering tooling. Integration with Serco's existing CI/CD.
Security and compliance	Protection embedded in every layer. Every interaction logged and auditable. IP controls on what reaches external models.	Serco's data classification scheme. Compliance requirements per jurisdiction. IP policies per model provider. Supplier access model.
Responsible AI	Ethical controls built into the platform, not bolted on. Human-in-the-loop where it matters. Guardrails on every interaction.	Policy alignment with Serco's risk and ethics teams. Autonomy boundaries per domain (justice, health, defence). EU AI Act applicability. Guardrails baseline configuration.
Monitoring and observability	Visibility into quality, cost, reliability, and drift. Early warning before users notice degradation.	Quality baselines established through early delivery. SLO targets (start internal, tighten over time). Alerting thresholds per capability. Integration with Serco's observability stack (e.g. Datadog, to be confirmed).

Section 4 Collaboration Hub Use Case

This section details how we would deliver the Collaboration Hub as the first use case through our methodology, from problem definition through to cost estimate. Each area below addresses a specific RFP requirement, demonstrating how our approach applies in practice to a real use case that we have already prototyped.

1. Problem Definition
2. Data & Integration
3. Technical Approach
4. Delivery Approach
5. Responsible AI
6. Knowledge Transfer
7. Cost Estimate

Collaboration Hub

Problem Definition

The Problem

Serco's contract portfolio represents a unique institutional knowledge base built over decades. Today, that knowledge is fragmented across contracts, teams, and regions, and is inaccessible when it matters most.

→

The Solution

The Collaboration Hub transforms this into a searchable, clearance-aware intelligence platform, enabling knowledge reuse, faster decision-making, and cross-contract learning at scale.

The objective is not just to surface information, but to enable users to synthesise insight, apply reasoning, and take action across fragmented systems. This aligns to our Discovery & Framing and Use Case Intake steps, ensuring opportunities are assessed consistently and aligned to Serco's AI platform and capability model.

Note on assumptions

As described in our methodology, we would typically engage with a broad set of stakeholders to properly assess build readiness, intended value, and the most effective way to build, understanding data, integrations, readiness, and capability dependencies.

In the absence of that discovery, we have made assumptions based on the use case documentation provided in the RFP and the Databricks data programme architecture. What follows would be validated and refined through the Discovery & Framing phase.

Output

Based on our assumptions about the outcome of discovery for this use case, we have created the Collaboration Hub Build Readiness Pack, available here: Collaboration Hub Build Readiness Pack (PDF)

Data & Integrations

What We Connect

Unifies access to distributed enterprise knowledge, enabling users to retrieve, synthesise, and act on information across systems through a single interface.

→

How We Build It

Built as part of Serco's AI platform, using reusable ingestion, retrieval, and access control capabilities rather than point-to-point integrations.

Primary Sources

SharePoint: contract reports, client documentation, governance artefacts
Microsoft Teams: discussions, risk logs, operational updates
Databricks / Data Platform: structured datasets (KPIs, SLA metrics, contract metadata)

Additional Sources (as capability scales)

Document management systems and file storage
Internal knowledge bases
Email (where appropriate and governed)

Ingestion & Processing

Extraction

API-based ingestion from SharePoint, Teams, and data platforms. Support for PDF, Word, Excel, structured tables.

→

Transformation

Document parsing, chunking, metadata enrichment (type, contract, date, owner), and entity normalisation.

→

Indexing & Knowledge Layer

Vector store embedding for semantic retrieval with metadata for filtering, security trimming, and traceability.

→

Update Model

Batch ingestion for MVP. Event-driven or scheduled updates for near real-time refresh as the solution scales.
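
The transformation step (chunking with metadata enrichment) can be illustrated with a short sketch. It assumes a simple overlapping word-window strategy, which is only one of several options; `Chunk`, `chunk_document`, and the default sizes are hypothetical, and the actual chunking strategy would be tuned per document type.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    text: str
    metadata: dict


def chunk_document(text: str, metadata: dict, max_words: int = 200, overlap: int = 20) -> list[Chunk]:
    """Split text into overlapping word windows, copying document metadata onto each chunk."""
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be non-negative and smaller than max_words")
    words = text.split()
    chunks = []
    step = max_words - overlap
    for start in range(0, len(words), step):
        window = words[start:start + max_words]
        # Each chunk carries the document metadata plus its own position,
        # so filtering, security trimming, and traceability work at chunk level.
        chunks.append(Chunk(" ".join(window), {**metadata, "chunk_index": len(chunks)}))
        if start + max_words >= len(words):
            break
    return chunks
```

Carrying type, contract, date, and owner on every chunk is what lets the knowledge layer filter and trace results without going back to the source document.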

Identity & Access Control

Access is governed through enterprise identity systems, aligned to Serco's security model:

Integration with Azure AD / Entra ID for authentication
Role-based access control (RBAC) applied at query time
Retrieval constrained to documents the user is authorised to access
Alignment with existing SharePoint and system-level permissions

Access control is enforced before retrieval and before agent execution. It is implemented as a deterministic policy layer (not prompts or model inference), ensuring reliability and auditability.
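
A minimal sketch of enforcing access before retrieval follows. It is illustrative only: group membership is passed in directly rather than resolved from an identity provider, keyword overlap stands in for vector search, and `Document`, `authorised_corpus`, and `retrieve` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    allowed_groups: frozenset   # groups permitted to read, mirrored from the source system


def authorised_corpus(docs: list[Document], user_groups: set) -> list[Document]:
    """Deterministic policy check: keep only documents sharing a group with the user."""
    return [d for d in docs if d.allowed_groups & user_groups]


def retrieve(query: str, docs: list[Document], user_groups: set, top_k: int = 3) -> list[Document]:
    """Rank by naive keyword overlap, but only over the already-trimmed corpus."""
    terms = set(query.lower().split())
    candidates = authorised_corpus(docs, user_groups)
    scored = [(len(terms & set(d.text.lower().split())), d) for d in candidates]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [d for _, d in ranked[:top_k]]
```

Because trimming happens before ranking, an unauthorised document can never reach the model, whatever the prompt says.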

Integration Patterns

API-based

Standard connectors to SharePoint, Teams, and data platforms. Reusable services exposed as platform capabilities.

Event-driven

Trigger updates when documents are created or modified. Enable timely refresh of indexed knowledge.

Application

Expose via UI (as in the prototype). Optional integration into Teams, internal portals, and existing tools.

Data Considerations & Constraints

We explicitly account for real-world enterprise data challenges: variability in document quality, inconsistent metadata, permission complexity (e.g. SharePoint inheritance), latency between updates and availability, and inconsistent classification across systems.

These challenges are addressed through metadata enrichment, controlled initial scope (thin slice), progressive pipeline refinement, entity normalisation, and caching strategies.

Outcome: This approach enables a unified, secure view of enterprise knowledge with grounded, traceable AI outputs. It supports both retrieval-based responses and agent-driven workflows, scales across additional data sources without re-architecture, and establishes a reusable data foundation for future AI capabilities across Serco.

Technical Approach

The Collaboration Hub is implemented as a modular, scalable AI capability that enables users to retrieve, synthesise, and act on enterprise knowledge.

It is delivered as a reusable, agent-driven architecture, forming part of Serco's broader AI platform and capability library.

End-to-End Architecture Overview

The solution follows a layered architecture to ensure separation of concerns, scalability, and reuse.

Key Architectural Principles

Grounded and governed by design: outputs are based on enterprise data and subject to deterministic access control
Agent-oriented (not prompt-oriented): capabilities are implemented as reusable agent patterns
Modular and reusable: core components are shared across use cases
Separation of capability and use case: avoids duplication and accelerates future delivery
Security and identity-first: access is enforced through enterprise systems, not the model
Cost-aware model usage: model selection is optimised for efficiency

Acceleration from Prior Delivery

Delivery is accelerated using proven components: pre-built ingestion and retrieval pipelines, agent orchestration patterns, structured output templates (summaries, risks, actions), evaluation frameworks, and reusable UI patterns. This enables rapid progression from prototype to MVP.

Foundation for Serco's Global AI Infrastructure

The Collaboration Hub establishes reusable platform capabilities: common ingestion framework, shared knowledge and retrieval layer, standard agent orchestration pattern, and centralised monitoring and evaluation. Result: faster delivery of future use cases, reduced duplication, and consistent governance and control.

Outcome: This approach enables the Collaboration Hub to deliver immediate value through contract reporting, risk identification, and client outputs, support both retrieval and action-oriented workflows, scale across users, data sources, and use cases, and form the foundation of a reusable, enterprise AI platform.

Delivery Approach

The Collaboration Hub is delivered through a structured, iterative approach that moves from problem definition to production in controlled stages.

Delivery is centred on proving value early through thin, end-to-end slices, using real workflows and data, before scaling. All delivery aligns to Serco's AI platform, leveraging reusable capabilities and contributing back to the capability library.

1. Discovery & Framing

We define the Collaboration Hub in operational terms with Serco stakeholders.

Activities: Engage contract managers, analysts, and commercial teams. Map workflows (reporting, risk identification, client communication). Identify friction points. Assess across five pillars. Define success criteria.

Outputs: Use Case Card, initial desirability & viability scoring.

2. Ideation & Business Prototype

We rapidly design and demonstrate how the Collaboration Hub supports real workflows.

Activities: Define key interaction patterns. Explore agent-based solution patterns (retrieval, orchestration, workflows). Build a lightweight prototype demonstrating querying enterprise knowledge, generating summaries, risks, and actions.

Outputs: Clickable prototype, early user feedback, validation of desirability and usability.

3. Experimentation

We validate that the solution delivers reliable, governed outputs using real data.

Activities: Test retrieval across selected data sources. Validate orchestration. Verify access control and governance enforcement. Evaluate output quality. Run targeted experiments to reduce uncertainty.

Outputs: Experimentation Log, updated confidence, Build Readiness Pack, decision to proceed / iterate / stop.

4. MVP Build

We deliver a working Collaboration Hub through incremental, end-to-end slices.

Approach: Build vertical slices delivering complete user workflows. Test-first development. AI-augmented engineering with human oversight. Leverage platform golden path capabilities.

Example slices: Slice 1: query + summary (SharePoint). Slice 2: source traceability. Slice 3: risk identification. Slice 4: structured outputs (reports, emails).

Outputs: Working MVP deployed to a controlled user group, early usage and feedback data.

5. Pilot

The Collaboration Hub is deployed to pilot users within live operational contexts.

Activities: Enable contract managers and analysts to use the solution in real workflows. Monitor usage, output quality, and performance. Gather structured user feedback. Refine prompts, retrieval, and workflows.

Outputs: Validated solution in real-world usage, evidence of adoption and user satisfaction.

6. Scale

The Collaboration Hub is scaled progressively across users, teams, and use cases.

Approach: Controlled release (phased rollout, environment promotion). Ongoing monitoring of performance, reliability, and adoption.

Scaling: Vertical (performance, reliability, cost optimisation). Horizontal (new users, contracts, and use cases). Each new use case builds on the same platform capabilities, not a new solution.

Agile Delivery Cadence

Weekly planning: define slices and priorities
Daily standups: maintain momentum and resolve blockers
Weekly demos: show working capabilities
Regular decision points: assess progress and adjust direction

Stakeholders see working software early and continuously, not at the end.

Dependencies & Enablers

Access to enterprise data sources (SharePoint, Teams)
Identity integration (Azure AD / Entra ID)
Environment provisioning (Azure infrastructure)
Availability of SMEs and pilot users

These dependencies are identified early and managed as part of the delivery plan.

How We Demonstrate Early Progress

Progress is demonstrated through working functionality, not documentation:

Days: clickable prototype
Weeks: first thin slice delivered
Weekly: demonstrations of new capabilities
Weeks, not months: pilot usage with real users

Outcome: This delivery approach ensures that the Collaboration Hub delivers value early through real workflows, is validated with users before scaling, evolves incrementally into a robust, enterprise-grade capability, and contributes to and benefits from Serco's shared AI platform and capability library.

Responsible AI & Quality

The Collaboration Hub is designed to deliver trusted, auditable, and high-quality outputs. Given its role in supporting contract performance, risk identification, and client communication, structured controls are applied across the full lifecycle, from generation through to ongoing operation.

Grounded and Controlled Outputs

All outputs are grounded in enterprise data and subject to deterministic controls.

Retrieval ensures responses are based on approved enterprise sources (e.g. SharePoint, Teams, Databricks)
Outputs include source traceability, enabling users to validate the origin of insights
Responses are constrained to retrieved evidence and structured prompts

Critically: access control is enforced before retrieval and before agent execution. Policies are applied through deterministic controls (identity, RBAC, data policies), not model inference. This ensures outputs are both evidence-based and compliant by design.

Quality Evaluation Framework

We implement a structured evaluation framework to continuously measure and improve output quality.

Evaluation dimensions:

Accuracy: correctness of retrieved and summarised information
Relevance: alignment to user intent and context
Consistency: stability of outputs across repeated queries
Usefulness: ability to support real user decisions

Approach: Define evaluation datasets based on real Collaboration Hub scenarios (e.g. contract reporting, risk identification). Combine automated evaluation (pattern and metric-based scoring) with human review (SMEs validating outputs in context). This ensures quality is measured systematically and tied to real operational use.
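
The combination of automated scoring and human review can be sketched as a small harness. This is a minimal illustration under stated assumptions: `EvalCase`, `score_case`, `run_eval`, the substring-based accuracy check, and the 0.8 threshold are hypothetical, and real evaluation criteria would be agreed per domain.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    question: str
    must_mention: list[str]   # facts an SME expects in a correct answer
    expected_source: str      # document the answer should cite


def score_case(case: EvalCase, answer: str, cited_sources: list[str]) -> dict:
    """Pattern-based checks: are expected facts present, and is the right source cited?"""
    hits = [m for m in case.must_mention if m.lower() in answer.lower()]
    return {
        "accuracy": len(hits) / len(case.must_mention) if case.must_mention else 1.0,
        "grounded": case.expected_source in cited_sources,
    }


def run_eval(cases: list[EvalCase],
             system: Callable[[str], tuple[str, list[str]]],
             threshold: float = 0.8) -> dict:
    """Score every case; low-scoring or ungrounded answers are queued for human review."""
    if not cases:
        raise ValueError("evaluation dataset is empty")
    results = [score_case(c, *system(c.question)) for c in cases]
    needs_review = [c.question for c, r in zip(cases, results)
                    if r["accuracy"] < threshold or not r["grounded"]]
    mean_accuracy = sum(r["accuracy"] for r in results) / len(results)
    return {"mean_accuracy": mean_accuracy, "needs_review": needs_review}
```

Here `system` is any callable that takes a question and returns an answer with its cited sources, so the same dataset can be re-run whenever prompts, retrieval, or models change.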

Continuous Monitoring & Feedback

Once deployed, the Collaboration Hub is actively monitored to detect issues and drive improvement.

Logging of queries, retrieved data, agent decisions, and outputs
Monitoring for hallucination or unsupported responses, degradation in relevance or accuracy, and latency and performance issues
User feedback embedded in the interface (e.g. ratings, issue flagging)

Insights are fed back into prompt refinement, retrieval tuning, and orchestration and workflow adjustments. This creates a continuous improvement loop aligned to the broader experimentation and operating model.

Risk Management & Guardrails

We implement explicit guardrails to manage risks associated with AI-driven outputs.

Deterministic access control: responses are limited to data the user is authorised to access. Identity and permissions are enforced through enterprise systems, not the model.
Execution controls: agent actions are governed and constrained by predefined policies. Only approved tools, data sources, and workflows can be invoked.
Prompt and output constraints: responses are limited to available evidence. Speculative or unsupported conclusions are avoided.
Human-in-the-loop controls: critical outputs (e.g. client-facing reports) can be reviewed before use. Users remain accountable for final decisions.
Fallback behaviour: where insufficient data exists, the system signals uncertainty rather than generating misleading outputs.
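
Two of the guardrails above, execution controls and fallback behaviour, lend themselves to a short sketch. The tool names, the `APPROVED_TOOLS` set, and both function names are hypothetical; in practice the approved set would be defined by platform policy rather than hard-coded.

```python
# Hypothetical tool names; the approved set would be defined by platform policy.
APPROVED_TOOLS = {"search_sharepoint", "summarise", "draft_report"}


class ToolNotApproved(Exception):
    """Raised when an agent tries to invoke a tool outside the approved list."""


def invoke_tool(name: str, registry: dict, **kwargs):
    """Execution control: only registered tools on the approved list can be invoked."""
    if name not in APPROVED_TOOLS or name not in registry:
        raise ToolNotApproved(name)
    return registry[name](**kwargs)


def answer_or_fallback(evidence: list[str], generate, min_evidence: int = 1) -> dict:
    """Fallback behaviour: signal uncertainty instead of generating without evidence."""
    if len(evidence) < min_evidence:
        return {"status": "insufficient_evidence", "answer": None}
    return {"status": "answered", "answer": generate(evidence)}
```

Both checks run in ordinary code outside the model, which keeps them deterministic and auditable.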

Model Management & Optimisation

Models and prompts are actively managed to maintain performance, reliability, and cost efficiency.

Prompt versioning: track and manage changes to output behaviour
Model routing: select appropriate models for classification, execution, and synthesis
Performance optimisation: balance accuracy, latency, and cost
Continuous refinement: incorporate new data, feedback, and usage patterns

This ensures the Collaboration Hub remains effective as data, usage, and requirements evolve.

Auditability & Governance

All interactions with the Collaboration Hub are traceable and auditable.

Logging of user queries, retrieved data sources, agent decisions and actions, and generated outputs
Version control of prompts, workflows, and configurations
Alignment with Serco's governance, security, and compliance standards

This provides a clear audit trail, supporting both internal assurance and external scrutiny.

Outcome: This approach ensures that the Collaboration Hub delivers reliable, evidence-based outputs, operates within clear governance and risk controls, is continuously measured and improved, and builds user trust, critical for sustained adoption and business impact.

Knowledge Transfer & Operating Model

Knowledge transfer is embedded throughout delivery, not treated as a final handover. The approach is designed to enable Serco to independently operate, extend, and scale the Collaboration Hub, while building the internal capability to deliver future AI use cases on the shared AI platform.

Embedded Delivery Model

Delivery is executed through a blended team model, where Serco teams work alongside our delivery team across all phases, from discovery through to scaling.

For the Collaboration Hub, this includes:

Product / Transformation leads shaping the use case
Data and platform teams supporting ingestion and integration
Engineering teams contributing to slice delivery
Business users participating in testing and feedback

This ensures knowledge is transferred through active participation in real delivery, not documentation alone, and builds confidence in operating AI-enabled workflows in practice.

Structured Knowledge Transfer Through the Methodology

Each stage of the methodology is designed to build specific capabilities within Serco:

Discovery & Use Case Intake: teams learn how to identify and prioritise AI opportunities, apply the five-pillar assessment model, and define clear problem statements and success criteria.
Experimentation (Lab): teams learn how to define and structure hypotheses, design and run experiments using real data, evaluate results, and make evidence-based decisions.
Build Readiness: teams learn how to translate validated use cases into production-ready designs, engage with governance (architecture, security, DPO), and align solutions to platform standards and approved patterns.
Delivery (Thin Slice Build): teams learn how to build in vertical slices delivering end-to-end value, apply AI-augmented engineering practices, implement and test agent-based workflows and integrations, and deliver working functionality iteratively.
Operate & Scale: teams learn how to monitor performance, usage, and quality, manage prompts, models, and orchestration behaviour, and scale capabilities across new users, data sources, and use cases.

This creates a repeatable model that Serco can apply beyond the Collaboration Hub.

Collaboration Hub-Specific Capability Transfer

For this use case, we focus on transferring the capabilities required to operate and extend the Hub as a reusable enterprise service. This includes:

Agent orchestration patterns (classification, routing, execution workflows)
Retrieval and grounding capabilities (RAG pipelines and hybrid retrieval)
Prompt engineering and structured output design
Model routing and optimisation strategies
Data ingestion and integration patterns (e.g. SharePoint, Teams, Databricks)
Access control and governance implementation (deterministic enforcement)
Monitoring, logging, and evaluation frameworks

All components are delivered in a way that is transparent, documented, and reusable, forming part of Serco's AI capability library.

Artefacts and Assets Handover

We provide full access to all artefacts created during delivery, including:

Use Case Cards and Experimentation Logs
Build Readiness Packs
Architecture designs and integration patterns
Source code, pipelines, and deployment configurations
Prompt libraries, agent workflows, and evaluation datasets
Monitoring dashboards and operational runbooks

These artefacts are structured for reuse across future use cases and aligned to platform standards.

Transition to Independent Operation

As the Collaboration Hub moves from MVP to scale, ownership progressively transitions to Serco teams. This includes:

Defined ownership across product (use case prioritisation and roadmap), engineering (delivery and extension of slices), platform (shared capabilities and infrastructure), and operations (support and monitoring)
Handover of support processes (monitoring, incident management)
Enablement of internal teams to deliver new slices and use cases independently

We support Serco in establishing a Centre of Excellence (CoE) or equivalent function to govern and scale AI delivery.

Platform Contribution & Continuous Evolution

The Collaboration Hub is not only a consumer of platform capabilities, but a contributor to their evolution. As part of delivery: reusable patterns are contributed to the AI Capability Library, gaps and constraints are fed into the platform backlog, and evaluation datasets and learnings are shared across use cases. This ensures that each delivery strengthens the overall platform, accelerating future AI initiatives.

End-State Operating Model

The target state is for Serco to operate the Collaboration Hub as a managed, evolving product within a broader AI platform. In this model:

New use cases are identified and fed through the same methodology
The Collaboration Hub acts as a shared capability layer across business units
Internal teams continuously extend functionality through new slices
Platform teams maintain and evolve shared capabilities
Governance, quality, and performance are managed centrally

Outcome: Serco builds internal capability rather than dependency, can extend the Collaboration Hub independently, establishes a repeatable model for AI delivery at scale, and continuously evolves its AI platform through real-world usage and feedback.

Delivery Phases & Estimated Effort

The Collaboration Hub is delivered incrementally, with effort aligned to each phase.

Phase	Duration	Key Activities
Discovery & Framing	2 weeks	Stakeholder engagement, workflow mapping, use case definition
Ideation & Prototype	1 week	Interaction design, prototype build, early validation
Experimentation	2 weeks	Data validation, retrieval testing, orchestration validation
MVP Build (Thin Slices)	6-8 weeks	Slice-based delivery, continuous deployment, weekly demos
Pilot	4-6 weeks	Real user usage, monitoring, refinement
Scale (Initial rollout)	4-8 weeks	Controlled rollout, performance optimisation

Total initial delivery: 19-27 weeks (roughly 4.5-6 months)
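As a quick check on the total, the phase ranges in the table can be summed directly. The sketch below assumes the phases run back to back with no overlap and uses an average month of 52/12 weeks; both are illustrative assumptions rather than commitments in this proposal.

```python
# Phase durations in weeks as (minimum, maximum), copied from the table above.
PHASES = {
    "Discovery & Framing": (2, 2),
    "Ideation & Prototype": (1, 1),
    "Experimentation": (2, 2),
    "MVP Build (Thin Slices)": (6, 8),
    "Pilot": (4, 6),
    "Scale (Initial rollout)": (4, 8),
}

WEEKS_PER_MONTH = 52 / 12  # assumption: average calendar month


def total_weeks(phases):
    """Return (min, max) total weeks, assuming phases run sequentially."""
    low = sum(shortest for shortest, _ in phases.values())
    high = sum(longest for _, longest in phases.values())
    return low, high


low, high = total_weeks(PHASES)
print(f"{low}-{high} weeks, about "
      f"{low / WEEKS_PER_MONTH:.1f}-{high / WEEKS_PER_MONTH:.1f} months")
# prints: 19-27 weeks, about 4.4-6.2 months
```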

This approach ensures early value delivery (within weeks), controlled risk reduction before scaling, and predictable progression from concept to production.

Cost Estimate

Please see the Commercials section for team shape, rate card, and cost details.

Section 5 Team & Experience

Our Team

Blackstone& is a senior delivery team. Every person on the engagement delivers work directly; there are no management layers between the team and the output. The team is structured in two tiers: a fractional engagement team providing strategic direction, governance, and specialist advisory; and a full-time delivery team embedded in Serco's AI Lab on a day-to-day basis.

Fractional Engagement Team

Senior leadership available on a fractional basis, providing strategic direction, methodology oversight, operating model design, and data platform advisory without the cost of full-time senior rates.

Role	Person	Location	Basis	Primary Focus
Engagement Lead / AI Strategist	Kieran Blackstone	UK / UAE	Fractional	Strategy, methodology, stakeholder engagement, quality assurance
Target Operating Model / Delivery Lead	Wayne Palmer	UK	Fractional	Operating model design, delivery governance, capability building
Data Strategy Advisor	Suranga Fernando	UK	Fractional	Data platform strategy, Databricks architecture, data engineering advisory

Full-Time Delivery Team

Embedded in Serco's AI Lab, delivering day-to-day. Between them, Ras and Don cover the full delivery stack, from business analysis and product ownership through to production infrastructure.

Role	Person	Location	Basis	Primary Focus
AI Product Lead / Business Analyst	Ras	UK	Full-time	Business analysis, AI product ownership, experimentation, use case lifecycle
Data/ML DevOps Engineer	Don	UK	Full-time	Machine learning engineering, data engineering, DevOps, CI/CD, LLMOps

Security clearance: All four fully UK-based team members (Wayne, Ras, Don, and Suranga) have been SC cleared within the last few years. None currently hold active SC/DV. Blackstone& is able to provide SC cleared resources at scale through its vetted subcontractor network.

Key Individual Profiles

Kieran Blackstone

Fractional

Engagement Lead / AI Strategist

Location: UK / UAE

Availability: Fractional, strategic direction, methodology oversight, quality assurance

On-site: Available for key workshops, steering, and stakeholder engagement

Responsibilities

Strategy, methodology, stakeholder engagement, quality assurance. Owns the engagement relationship and overall delivery quality. Ensures the methodology is applied consistently and the team delivers against Serco's objectives.

Comparable AI Platform & Use Case Experience

Kanad Hospital, Abu Dhabi: Defined AI adoption strategy using the same methodology proposed for Serco. Built the hospital's first AI prototype. Currently delivering production use cases: customer support agents (AWS Bedrock) and a website development agent, both integrating with Microsoft Fabric. Designed an AI roadmap aligned to UAE healthcare regulation and sensitive patient data governance.

HMRC, Hawk Platform: Built a microservices platform giving businesses in trade a single self-serve interface. Components were pre-approved by governance, security, data, and architecture boards, which makes this directly comparable to the AI Capability Library's golden path approach.

HMRC, GVMS: Oversaw delivery of the UK imports/exports trade system for UK ports post-Brexit. Introduced agile contracting principles. Matured the platform to Critical National Infrastructure standards.

DWP, DevOps Capability Delivery: Delivered DWP's first DevOps maturity assessment, leading to capability delivery in the Fraud, Error & Debt directorate. Built capability directly for DWP rather than creating consulting dependency.

DSIT, GenAI Product: Created DSIT's first GenAI-powered product on Salesforce Einstein. First use of generative AI in a production-facing government context.

Three Mobile, AI Labs Function: Engagement lead for the AI Lab service rollout, working alongside Ras Fernando and Don Capito on the build. Established the AI Labs methodology and delivery framework that forms the foundation of the approach proposed for Serco.

DfE, MOD (via DESA): Software development, DevOps services, and Salesforce rescue/transition across multiple government departments.

Founded Blackstone&: Built the Collaboration Hub prototype, AI Capability Library, and Agile Contracting Toolkit before this bid using the same rapid delivery approach proposed for Serco.

Wayne Palmer

Fractional

Target Operating Model / Delivery Lead

Location: UK

Availability: Fractional, operating model design, delivery governance, capability building

On-site: Available for on-site collaboration with Serco digital and business operations teams

Responsibilities

Deep experience in governance, delivery frameworks, and organisational design for technology functions across UK government. Involved in the majority of the engagements listed above, bringing delivery and operating model expertise that complements Kieran's programme leadership. Current focus is designing and deploying AI-native product and platform operating models that improve employee engagement and accelerate productivity.

Comparable AI Platform & Use Case Experience

Morae Global, Augmented Product & Platforms Operating Model: Designed and rolled out a globally distributed operating model for this legal technology organisation. Brought multiple time zones into a cohesive way of working and introduced augmented engineering techniques that allowed teams to accelerate delivery while working within highly regulated domains. Adoption was initially driven by regional hackathons to build awareness and secure buy-in.

Morae Global, Legal Intelligence Product Mobilisation: Working with a legal technology company to define and validate the team, governance, and architecture for an AI intelligence layer. This includes a multi-agent system (orchestrator, contract analysis, eDiscovery, and reporting agents) built on Azure AI Search, a Neo4j knowledge graph, Databricks, and LangGraph orchestration. This system is the first product from which their AI platform will develop.

DSIT, Product & Platforms Operating Model: Analysed, designed, and rolled out the Target Operating Model for DSIT as part of their departmental restructure. With a heavy focus on change management and role definition, created core work management backbones and set up core events to manage and route work effectively across the organisation.

Mastercard, DevOps Transformation: Led the transformation of their faster payments product to be orientated towards cross-functional teams with a focus on engineering excellence and fast flow. Stopped a failing re-architecture programme and pivoted resources to a modern architecture.

Cross-Government Delivery: Transformation roles across HMRC, DWP, DfE, and MOD engagements. Consistent focus on governance structures, delivery frameworks, and the organisational design required to make technology functions work after the consultants leave.

Ras Fernando

Full-time

AI Product Lead / Business Analyst

Location: UK

Availability: Full-time, embedded in Serco's AI Lab

On-site: On-site collaboration with UK-based Serco digital and business operations teams

Responsibilities

Business analysis, AI product ownership, experimentation, use case lifecycle. Runs the day-to-day delivery, from stakeholder discovery through experimentation to production handoff. Owns the Use Case Cards, Experiment Logs, and Build Readiness Packs.

Comparable AI Platform & Use Case Experience

Turner & Townsend (Current): Supporting AI-enabled contract workflows with embedded guardrails, human-in-the-loop decisioning, and scalable product design within commercial processes. Directly comparable to the Collaboration Hub's contract intelligence use case.

Three Mobile, AI Labs Function: Developed the Blackstone AI Labs methodology, a structured approach to identifying, validating, and scaling AI use cases across an enterprise. This methodology forms the foundation of the approach proposed for Serco's AI Lab.

HSBC, AI Labs Function: Built and delivered an AI Labs function supporting a global customer base and distributed engineering teams. Covered governance, risk, and controlled experimentation within a heavily regulated, globally distributed organisation.

DWP, DevOps Capability & Delivery: Led the capability assessment and development programme in the Fraud, Error & Debt directorate. Translated transformation strategy into measurable delivery outcomes. The challenge was the same as Serco's: building internal capability alongside external delivery.

Don Capito

Full-time

Data/ML DevOps Engineer

Location: UK (Huntingdon)

Availability: Full-time, embedded in Serco's AI Lab

On-site: On-site collaboration with UK-based Serco digital and business operations teams

Responsibilities

Machine learning engineering, data engineering, DevOps, CI/CD, LLMOps. Builds and operates the AI platform infrastructure. Bridges the gap between data science teams and production systems. Upskills Serco engineers through paired delivery.

Comparable AI Platform & Use Case Experience

Genentech (Roche), AI/HPC Platform Engineering: Built and scaled cloud-native AI/ML infrastructure on AWS supporting 200+ data scientists across the US and Europe. Deployed a next-generation AI/HPC platform replacing on-premise clusters, achieving a 3x cost reduction and providing Nvidia GPU instances (B200, H200, H100) for deep learning. Established observability with Grafana, Prometheus, and OpenTelemetry. Upskilled L1/L2 support engineers on AI platform operations.

IAVI, Trusted Research Environment: As lead DevSecOps engineer, delivered a Trusted Research Environment (TRE) for medical research institutions globally, handling personal and sensitive data. Applied the AWS Well-Architected Framework with security controls, automated with Terraform/CDK. Led a cross-functional team of 10, including data scientists.

Three Mobile, AI Labs Function: Automated CI/CD for Databricks, Unity Catalog, and Delta Live Tables in Azure. Delivered a scalable Alteryx Server cluster integrated with Snowflake. Achieved 10x horizontal scaling and moved deployment frequency from monthly to weekly. Upskilled DevOps and Data Engineers while delivering.

Imperial College London, Research Computing: Led a multi-disciplinary team to deliver a Trusted Research Environment in AWS, from design through to MVP delivery for researchers.

Security clearance: SC cleared (lapsed). UK-based.

Reusable Assets & Accelerators

Every asset listed below is working software. Not a slide deck. Not a template. Evaluators can click through each one.

Asset	What It Does	How It Accelerates Delivery
Collaboration Hub Prototype	Working prototype of the exact use case being tendered. Cross-border contract intelligence, AI-powered search, quality scoring, agentic enrichment.	Stakeholders interact with the solution concept on day one. No weeks of discovery before anything is visible.
AI Capability Library	156 AI capabilities mapped across 14 domains and 6 data classification levels. Living reference architecture with strategic data exposure analysis.	Maps infrastructure requirements for any use case. Identifies Enabler capabilities that reduce data classification for downstream deployments, cutting cost and expanding scope.
Build Readiness Backlog	Interactive tool mapping Serco's 34 identified use cases against infrastructure maturity levels.	Shows what is buildable now versus what is blocked by infrastructure gaps. Auto-generates a prioritised roadmap.
Use Case Submission Portal	5-pillar intake tool for structured use case assessment. Produces scored Use Case Cards.	Standardises inputs across business units. Evolves to self-serve intake as Serco's AI function matures.
Experimentation Hub	Library of premade experiments and test harnesses for AI use case validation.	Accelerates hypothesis validation. Reduces time from idea to evidence. Reproducible and auditable.
Agile Contracting Toolkit	Interactive commercial model demonstration showing how cost, risk, and scope are managed in agile delivery.	Builds commercial trust through transparency. Demonstrates the hybrid fixed-price/agile model proposed for this engagement.
AI Adoption Framework	End-to-end methodology from discovery through production, with working tooling at every step.	Not just a methodology document: a structured, artefact-driven process backed by the tools listed above.

Live Demo

Collaboration Hub

Working prototype of the exact use case being tendered. Click through to explore.

serco-pulse.lovable.app →

Live Demo

AI Capability Library

156 capabilities, 14 domains, 6 data classification levels. Browse the full catalogue.

ai-capability-library.pages.dev →

Live Demo

Agile Contracting Toolkit

Interactive commercial model showing cost, risk, and scope management.

agile-contracting-toolkit.pages.dev →

How We Scale

The core team of three senior professionals is the foundation. The framework agreement's three-year term allows the team to scale as workload demands, using two mechanisms:

Specialist subcontractors. For specific capability needs (Databricks engineering, UX research, domain-specific data engineering), Blackstone& brings in vetted specialists. All subcontractors are assessed for security clearance eligibility and delivery quality before engagement.

Progressive ownership. Serco's own team is the primary scaling mechanism. As capability transfers through paired delivery and structured knowledge transfer, Serco engineers take on delivery directly. The Blackstone& team shifts from hands-on delivery to advisory and quality assurance. This is by design: the goal is a self-sustaining AI Lab, not a permanent consulting dependency.

What we do not do: fill seats with junior staff to meet a headcount target. Every person on this engagement adds delivery value from day one. Every day rate buys output, not overhead.

Section 6 Knowledge Transfer & Capability Building

Our Philosophy

The measure of this engagement is not what we build. It is whether Serco can build the next one without us.

Knowledge transfer is not a phase that happens at the end of delivery. It is a property of how we work, embedded in every sprint, every ceremony, every artefact, and every decision from day one. We do not transfer knowledge of what we built. We transfer the capability to build, adapt, and evolve independently.

This distinction matters. Technology changes. Models improve. New use cases emerge. If we hand over documentation of a system we built, Serco has a snapshot. If we hand over the methodology, the tools, and the institutional knowledge to adapt them, Serco has a capability that compounds over time.

Transfer Capability Tracker

Proposed Use Case, agentic capability monitoring

Four Transfer Mechanisms

We structure knowledge transfer around four mechanisms that work together. No single mechanism is sufficient on its own: embedded working builds skills, the Centre of Excellence provides structure, reusable components reduce reinvention, and progressive ownership creates accountability for independence.

1. Embedded Working

Our team works alongside Serco engineers, product leads, and business stakeholders within delivery squads. There is no isolated consultancy layer. If we are in a meeting, Serco is in that meeting. If we are writing code, a Serco engineer is pairing on it. If we are making an architecture decision, Serco's technical lead is in the room.

This is not observation. Every sprint includes Serco team members as active participants, contributing to hypotheses, making design choices, reviewing outputs, and owning artefacts. The work is shared from the start, which means there is nothing to "hand over" later.

2. Centre of Excellence Development

We support Serco in establishing a central AI capability function: a Centre of Excellence that owns standards, tooling, governance, and the methodology for delivering AI use cases across the organisation.

This includes defining:

Roles and responsibilities: who owns what in the AI delivery lifecycle, from intake through to production operations
Governance structures: how use cases are assessed, approved, monitored, and retired
Standards and reuse: how patterns, components, and learnings from one use case accelerate the next
Platform ownership: how the CoE relates to the existing Databricks DSML platform programme

Training is tailored by audience, because a Serco engineer needs different capabilities than a business stakeholder:

Audience	Transfer Method	Capability Developed
Engineering	Pair delivery, code reviews, architecture decision records	Build and operate AI products independently
Data	Pipeline development, data quality frameworks, Databricks integration patterns	Design and maintain data foundations for AI workloads
Product	Learning ceremonies, experimentation interpretation, roadmap formation	Identify, validate, and prioritise AI use cases using evidence
Risk & Governance	Responsible AI framework, HITL design, risk assessment methodology	Evaluate AI risk proportionately and govern responsibly
Business Users	Decision forums, Use Case Card submission, interpreting AI outputs	Commission AI work, make evidence-based investment decisions, apply critical judgement to AI-generated insights

3. Reusable Components & Standards

Every method, tool, and asset we use in delivery is designed for reuse and handed over to Serco. These are not locked behind our IP; they become Serco's operational toolkit:

Experimentation framework: hypothesis templates, test cards, experiment logs
Build Readiness Packs: the structured assessment that determines whether a use case is ready for build
Pipeline patterns: reusable ingestion, processing, and deployment workflows
Prompt libraries: tested, versioned prompts for common AI tasks
Architecture patterns: reference designs for RAG, agentic workflows, classification, and monitoring
Documentation and runbooks: operational procedures for every component we build
AI Capability Library: Serco's own governed instance of our 156-capability reference library, covering 14 domains with data classification guidance at every level

These compound. Each use case delivered adds patterns, prompts, and learnings to the shared library. By the fourth use case, Serco's teams are drawing on a substantial internal knowledge base that did not exist before.

4. Progressive Ownership Transfer

Transfer follows a defined four-stage model. Each use case progresses through these stages, with clear criteria for transition. This is not a theoretical framework; it is how we structure every engagement, and how we hold ourselves accountable for making ourselves replaceable.

Two things make this work in practice:

Transition is per-capability, not big-bang. Some capabilities transfer faster than others. A Serco engineer may reach Stage 3 on pipeline development while still at Stage 2 on architecture decisions. We track this at the individual and team level, so we can target support where it is genuinely needed.

Transition criteria are observable, not subjective. "Serco team contributing to experiments" is verifiable in sprint artefacts. "Running sprints independently" is visible in ceremony records. We do not declare transfer complete based on hours of training delivered; we declare it complete based on what Serco's team can demonstrably do.

The Operating Rhythm as Transfer Mechanism

Our five delivery ceremonies are not just project management. Each one is designed to build a specific capability in Serco's team:

Ceremony	Frequency	What Serco Learns by Participating
Planning	Weekly	How to identify testable hypotheses, decompose work, and prioritise experiments based on evidence and risk
Standup	Daily	How to surface blockers early, coordinate across disciplines, and maintain delivery momentum
Learning	Weekly	How to interpret experimental evidence, generate insights, identify patterns, and revise strategy. This is the primary transfer mechanism.
Retrospective	Bi-weekly	How to build continuous improvement habits: what to keep, what to change, what to try
Decision Forum	Monthly	How to make evidence-based AI investment decisions. When to kill work that is not delivering value. When to scale what is.

The Weekly Learning ceremony deserves emphasis. This is not a status report. It is a shared analysis session where Serco team members, alongside us, examine what the latest experiments have revealed, debate what the evidence means, and decide what to do next. This builds the analytical and decision-making capability that Serco needs to run the AI Lab independently. Over time, Serco's team leads these sessions. We move from facilitator to participant to observer.

The Monthly Decision Forum transfers the hardest capability of all: the discipline to stop work that is not delivering value. In our experience, organisations that succeed with AI at scale are not the ones that start the most initiatives; they are the ones that kill the wrong ones early and double down on the right ones. The Decision Forum builds this muscle in Serco's leadership team, using real evidence from real experiments, with real consequences.

Measuring Transfer Success

Transfer is not complete when we have trained people. It is complete when they can do it without us.

We track transfer through observable indicators, not training hours or satisfaction scores:

Indicator	What It Demonstrates	Target Stage
Serco team members leading sprint planning	Delivery capability	Stage 3
Engineers making architecture decisions with support, not direction	Technical independence	Stage 3
Build Readiness Packs authored by Serco with Blackstone& review	Assessment capability	Stage 3
New use cases entering the pipeline without Blackstone& involvement in intake	Discovery and prioritisation capability	Stage 4
Decision Forums running with Blackstone& in advisory, not facilitator role	Governance and kill discipline	Stage 4
AI Capability Library updated by Serco team, new capabilities evaluated, classified, and added	Evolving market knowledge	Stage 4
Next use case deployed end-to-end without external support	Full independence	Stage 4

These indicators are reviewed monthly. When all Stage 4 indicators are met, the engagement has succeeded on its own terms.
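The monthly review described above amounts to a simple rule: the engagement has succeeded once every Stage 4 indicator is met. A minimal sketch of that rule follows; the `Indicator` class, the function names, and the met/not-met values are hypothetical and shown only to make the check concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Indicator:
    name: str
    target_stage: int  # 3 or 4, from the "Target Stage" column
    met: bool          # hypothetical review outcome, for illustration only


def outstanding(indicators, stage):
    """Names of indicators at the given stage that are not yet met."""
    return [i.name for i in indicators if i.target_stage == stage and not i.met]


def engagement_succeeded(indicators):
    """True only when Stage 4 indicators exist and every one of them is met."""
    stage4 = [i for i in indicators if i.target_stage == 4]
    return bool(stage4) and all(i.met for i in stage4)


# Illustrative snapshot of one monthly review (values are invented).
review = [
    Indicator("Serco team members leading sprint planning", 3, True),
    Indicator("Build Readiness Packs authored by Serco", 3, True),
    Indicator("New use cases entering the pipeline without Blackstone& intake", 4, True),
    Indicator("Next use case deployed end-to-end without external support", 4, False),
]

print(engagement_succeeded(review))
print(outstanding(review, 4))
```

Keeping the rule this explicit means the success condition can be read off the review record rather than argued about.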

Accelerating Capability Beyond Delivery

Formal transfer through delivery and ceremonies builds deep capability in the core team. But Serco's AI ambition is organisational: 700+ contracts, four divisions, thousands of potential users. We recommend investing in broader capability acceleration alongside core delivery:

Internal community building. Hackathons, show-and-tells, and lighthouse demonstrations create energy and awareness beyond the delivery team. When a contract manager in the Middle East sees what a team in UK & Europe built in two sprints, that is more powerful than any training programme.

Working out loud. We recommend recording key sessions (architecture decisions, learning ceremonies, experiment reviews) and making them available through a simple, searchable internal knowledge base. A lightweight AI agent can handle PII scrubbing and indexing. This turns tacit knowledge into institutional knowledge, and it means new team members can onboard by watching real decisions being made, not reading sanitised documentation.

Role-specific learning journeys. Not everyone needs the same depth. A business user submitting a Use Case Card needs a 20-minute walkthrough. An engineer joining the delivery squad needs a structured onboarding path covering architecture, tooling, and ways of working. We design these paths and hand them over as part of the CoE toolkit.

Builder profiles. We track what capabilities each team member has developed, across technical skills, methodology understanding, and domain knowledge. This makes capability gaps visible and manageable, and gives Serco's leadership a clear picture of where the organisation is strong and where it needs investment.

Our Use Case for Serco: The Knowledge Transfer Agent

Everything described above is how we transfer knowledge. But we also want to propose a use case of our own, one that would enter Serco's AI pipeline alongside the 34 identified opportunities, validated through the same methodology, and delivered using the same approach.

The progressive ownership model relies on human judgement to assess readiness. That works, but it requires our team to be present, observing, and making those calls. What if an AI system could do this continuously, independently, and at a level of detail that no human observer can sustain?

The use case: an agentic system that monitors the engagement itself. It tracks not just what is being built, but the capabilities required to build, operate, and evolve it, and compares those requirements against the real capability profiles of every team involved.

How It Works

The system operates across three layers:

Layer 1 Capability Demand Mapping

The agent monitors delivery artefacts in real time: architecture decision records, ceremony recordings and transcripts, sprint outputs, code commits, infrastructure configurations, Build Readiness Packs, and operational runbooks.

From these, it extracts the specific capabilities being exercised: which AI patterns are in use, which data engineering techniques, which governance frameworks, which operational practices. Every capability is tagged and tracked against our AI Capability Library taxonomy.

Layer 2 Builder Profile Comparison

Every team member (ours, Serco's, and those of any third-party consultancy) has a builder profile: a structured record of demonstrated skills, domain knowledge, methodology familiarity, and delivery experience.

The agent continuously compares the capability demands of the engagement against the builder profiles of the internal Serco team. The output is a live, evolving gap analysis: not what Serco's team was trained on, but what they can demonstrably do versus what the work actually requires.
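The comparison in Layer 2 can be pictured as a lookup of required level against demonstrated level for each capability. The sketch below is illustrative only: it assumes the four-level rating scale used in the example builder profiles (Aware, Familiar, Proficient, Expert, in that order), and the function names and sample data are hypothetical rather than part of any existing tool.

```python
# Assumed ordering of the rating scale, lowest to highest.
LEVELS = ["Aware", "Familiar", "Proficient", "Expert"]


def rank(level):
    """Position on the scale; None (no evidence yet) ranks below every level."""
    return LEVELS.index(level) if level is not None else -1


def gap_analysis(demand, profile):
    """Map each under-met capability to (demonstrated level, required level)."""
    return {
        capability: (profile.get(capability), required)
        for capability, required in demand.items()
        if rank(profile.get(capability)) < rank(required)
    }


# Hypothetical inputs for illustration.
demand = {
    "RAG Architecture": "Proficient",
    "Agentic Workflows": "Proficient",
    "Python": "Proficient",
}
profile = {"RAG Architecture": "Familiar", "Python": "Expert"}

for capability, (have, need) in gap_analysis(demand, profile).items():
    print(f"{capability}: demonstrated={have}, required={need}")
```

A capability with no recorded evidence is treated as a gap, which matches the emphasis on demonstrated rather than trained capability.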

Layer 3 Gap Closure Recommendations

For each identified gap, the system recommends one of three paths:

Train

Judgement-intensive, context-dependent, or requiring stakeholder trust. E.g. architecture decision-making, responsible AI assessment, stakeholder negotiation.

Automate

Repetitive, pattern-based, or requiring consistency at scale. E.g. pipeline monitoring, prompt evaluation, documentation generation, data quality checks.

Train + Automate

Human understands the capability for oversight and edge cases, but day-to-day execution is agent-assisted. E.g. code review with AI analysis, security scanning with human sign-off, experiment log analysis.
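One way to read the three paths is as a decision on two properties of the capability: whether it is judgement-intensive and whether it is repetitive. The sketch below encodes that reading. The two boolean flags and the default to Train when neither applies are assumptions made for illustration; the real system would derive its recommendation from richer evidence.

```python
from enum import Enum


class Path(Enum):
    TRAIN = "Train"
    AUTOMATE = "Automate"
    TRAIN_AND_AUTOMATE = "Train + Automate"


def recommend_path(judgement_intensive, repetitive):
    """Pick a gap-closure path from two hypothetical capability flags."""
    if judgement_intensive and repetitive:
        # Human keeps oversight and edge cases; routine execution is agent-assisted.
        return Path.TRAIN_AND_AUTOMATE
    if repetitive:
        return Path.AUTOMATE
    # Assumption: default to Train when the capability is not repetitive.
    return Path.TRAIN


# Examples taken from the descriptions above.
print(recommend_path(judgement_intensive=True, repetitive=False).value)   # architecture decision-making
print(recommend_path(judgement_intensive=False, repetitive=True).value)   # pipeline monitoring
print(recommend_path(judgement_intensive=True, repetitive=True).value)    # code review with AI analysis
```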

Why This Matters

Independent accountability

This system removes the single biggest risk in any knowledge transfer engagement: the consultancy marking its own homework. The agent tracks capability transfer independently of us. It does not rely on our assessment of whether Serco's team is ready. It watches what Serco's team actually does, in ceremonies, in code, in decisions, and measures that against what the work demands.

Capability drift protection

It also solves a problem that no amount of traditional training addresses: capability drift. As the AI landscape evolves, the capabilities required to operate Serco's AI estate will change. New model architectures, new security requirements, new regulatory frameworks. The agent continuously updates the demand side of the equation, so Serco always knows where the gaps are, even after we have left.

Real-time capability view

Finally, it provides Serco's leadership with something they cannot get from timesheets or training records: an honest, real-time view of organisational AI capability, showing who can do what, where the dependencies are, and what it would take to close each gap.

This is not a standard offering. It is a capability we would build during the engagement, using the same methodology and platform we use for any other use case. It would enter the pipeline through the AI Front Door, be validated through the Experiment Engine, and, if it proves value, be deployed as a production tool that Serco owns and operates independently.

What a Builder Profile Looks Like

The builder profile is the foundation of the entire system. Every person involved in the engagement (our team, Serco's engineers, business stakeholders) gets one. It is not a CV. It is a living calibration tool that tracks what someone can demonstrably do, how they work, and where their boundaries are.

Below are two examples showing how profiles work for very different roles.

Example 1 Ravi Sharma, Senior AI Engineer

Serco, UK AI Lab (Internal) | Deep Databricks/ML experience, growing into LLMs and agentic architectures

Profile Created: 2026-05-12 | Last Updated: 2026-07-18

Background: 8 years across data engineering and machine learning, joining the AI Lab from Serco's Data Platform team. Deep experience with Databricks, Python, and traditional ML pipelines. Newer to LLM-based systems, agentic architectures, and RAG patterns, areas of active growth through the engagement.

Tech Stack Competence:

Tool / Capability	Level	Notes
Python	Expert	Primary language, 8 years
Databricks	Expert	3 years, built production ML pipelines on Serco's DSML platform
SQL	Expert	Complex queries, performance tuning, data modelling
LLM APIs (Claude, GPT)	Proficient	Comfortable with prompt engineering and API integration
RAG Architecture	Proficient	Led the chunking strategy redesign for the contract ingestion pipeline (ADR-017); the earlier prototype build needed step-by-step guidance
Agentic Workflows	Aware	Understands the concept, hasn't built one independently
LLMOps / Evaluation	Familiar	Can run basic evaluations, needs guidance on systematic eval harnesses
Responsible AI Frameworks	Familiar	Understands principles, has not led a risk assessment independently

Responsible Building Controls, Competence Boundary Detection:

"This work involves agentic workflow architecture, which is beyond your current rated level (Aware). This isn't a problem; it's how skills grow. But to make sure the output is production-safe, it should be reviewed by someone rated Proficient or above in this area before it goes further."

Suggested reviewers for agentic workflow architecture:
Alex Chen (Expert), available, same squad
Wayne Sheridan, Blackstone& (Expert), available for async review

Stretch work is encouraged, not blocked. For critical domains (security architecture, data classification, responsible AI guardrails), stretch work requires sign-off before deployment, not just review after the fact.
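
As an illustration only, the boundary check described above could be sketched along these lines. The ordering of the rating scale, the `check_boundary` function, the `RATINGS` layout, and the domain identifiers are all assumptions made for this sketch; it is not the implementation behind the builder profile.

```python
from dataclasses import dataclass
from enum import IntEnum


class Level(IntEnum):
    # Assumed ordering of the rating scale shown in the profile tables.
    NONE = 0
    AWARE = 1
    FAMILIAR = 2
    PROFICIENT = 3
    EXPERT = 4


# Critical domains named in the text; the exact strings are illustrative.
CRITICAL_DOMAINS = {
    "security architecture",
    "data classification",
    "responsible AI guardrails",
}


@dataclass
class BoundaryCheck:
    is_stretch: bool             # work is beyond the builder's rated level
    reviewers: list              # people rated at or above the required level
    signoff_before_deploy: bool  # critical domain: sign-off, not just review


def check_boundary(builder, capability, ratings, required=Level.PROFICIENT):
    """Flag stretch work and suggest reviewers; never blocks the work."""
    own = ratings.get(builder, {}).get(capability, Level.NONE)
    if own >= required:
        return BoundaryCheck(False, [], False)
    reviewers = sorted(
        name
        for name, caps in ratings.items()
        if name != builder and caps.get(capability, Level.NONE) >= required
    )
    return BoundaryCheck(True, reviewers, capability in CRITICAL_DOMAINS)


# Hypothetical data mirroring the example profile above.
RATINGS = {
    "Ravi Sharma": {"agentic workflows": Level.AWARE},
    "Alex Chen": {"agentic workflows": Level.EXPERT},
    "Wayne Sheridan": {"agentic workflows": Level.EXPERT},
}

result = check_boundary("Ravi Sharma", "agentic workflows", RATINGS)
print(result.is_stretch, result.reviewers, result.signoff_before_deploy)
```

In this sketch the check returns advice rather than raising an error, which reflects the principle that stretch work is encouraged and only critical domains add a sign-off requirement.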

Growth & Learning Log (extract):

Date	Progression	Evidence
2026-07-18	RAG Architecture: Familiar → Proficient	Led chunking strategy redesign for contract ingestion pipeline. Architecture decisions documented in ADR-017.
2026-06-22	LLM APIs: Familiar → Proficient	Built production prompt pipeline for contract summarisation. Versioned prompts, fallback handling, runbook written independently.
2026-06-22	Responsible AI: Aware → Familiar	Participated in risk assessment for Collaboration Hub. Contributed to HITL design.
2026-06-01	RAG Architecture: First hands-on build	Prototype search pipeline using Databricks vector store + Claude API. Needed step-by-step guidance.

Ownership Stage:

Capability Area	Current Stage	Evidence
Data pipeline development	Stage 4, Independent	Deployed contract ingestion pipeline end-to-end without external support
RAG architecture	Stage 3, Lead with Support	Leading chunking redesign, architecture decisions reviewed
Agentic workflows	Stage 1, Learning	Observing and contributing, hasn't led independently

Example 2 Sarah Hartley, Director of Contract Operations

Serco, UK & Europe Division | 20 years programme delivery, AI commissioner and investment decision-maker

Profile Created: 2026-05-05 | Last Updated: 2026-07-15

Background: 20 years in large-scale programme delivery, responsible for a portfolio of 200+ contracts spanning justice, health, and citizen services. No technical background in AI or software development, and doesn't need one. Her role is as a commissioner of work, an investment decision-maker, and a champion of AI adoption across her division.

Tech Stack Competence:

Tool / Capability	Level	Notes
AI concepts (LLMs, RAG, agents)	Familiar	Understands what they do and where they apply, cannot build or configure them
Use Case Card submission	Proficient	Has submitted 6 use cases through the AI Front Door
Experimentation interpretation	Familiar	Can read experiment summaries, sometimes needs help distinguishing signal from noise
AI Capability Library	Familiar	Can navigate and filter, understands tier structure
Commercial modelling for AI	Proficient	Can model ROI, build business cases, assess cost-benefit at portfolio level
Technical architecture	None	Not her role, routes to engineering leads

Responsible Building Controls, Commission-Safe Defaults:

"This decision involves evaluating RAG architecture trade-offs, which is outside your technical expertise (rated: None for technical architecture). That's expected, your role is the commercial and strategic lens. But this decision has technical implications that should be validated by someone rated Proficient or above before it's committed."

Suggested technical validators:
Ravi Sharma (Expert in Databricks; Proficient in RAG), same programme
Alex Chen (Expert, agentic architecture), available for 30-min review

For any use case Sarah submits through the AI Front Door, technical feasibility assessment is automatically routed to engineers rated Proficient or above in the relevant capability areas.

Growth & Learning Log (extract):

Date	Progression	Evidence
2026-07-15	Kill discipline milestone	First time Sarah voted to kill a use case she had personally championed (contract compliance scanning: viable, but low impact relative to alternatives).
2026-06-20	AI Capability Library: Aware → Familiar	Used the library independently to assess a proposal from the Middle East division. Correctly identified missing Enabler-tier infrastructure and recommended sequencing.
2026-06-05	Use Case Card: Familiar → Proficient	5th submission. Problem statements specific, impact estimates grounded in contract data, data sensitivity classifications correct without review.

Ownership Stage:

Capability Area	Current Stage	Evidence
Use case identification & submission	Stage 3, Lead with Support	Submitting quality use cases independently
Evidence-based investment decisions	Stage 3, Lead with Support	Leading Decision Forum discussions, killed a use case on evidence
Portfolio prioritisation	Stage 2, Co-Deliver	Uses Capability Library for sense-checking, not yet leading independently
Technical feasibility assessment	Stage 1, Learning	Appropriately routes to engineers; she is not expected to assess this herself

Our Track Record

This is how we work. It is not a section we added to a proposal; it is the model we have applied repeatedly across UK government.

Department for Work and Pensions: During our DevOps Capability Delivery programme, we trained DWP apprentices alongside experienced engineers in live delivery. The apprentices were not observers. They were building production services, supported by our team, developing capability that remained in DWP long after we left.

Ministry of Defence: We transitioned responsibility from an underperforming incumbent by first documenting the dependency the existing supplier had created, then building a streamlined service model that reduced external reliance. The goal was not to replace one supplier with another; it was to give MOD the ability to operate independently.

Department for Science, Innovation and Technology: Our team designed and rolled out the Target Operating Model for DSIT's technology function, defining the roles, responsibilities, and ways of working that the department continues to operate under.

In each case, the engagement ended with the client's team running the capability. That is the only outcome we consider successful, and it is the outcome we commit to delivering for Serco.

Section 7 Commercials

Team Shape

Our proposed team shape for the Serco engagement combines fractional strategic leadership with full-time embedded delivery. This model provides senior expertise without the cost of full-time senior rates, while ensuring day-to-day delivery is consistent and embedded within Serco's operations.

The team scales with the delivery phase: lighter during discovery and experimentation, fuller during MVP build and scaling. As capability transfers to Serco, the Blackstone& team contracts and Serco's internal team expands.

Initial Engagement Team

Name	Role	Basis	Day Rate (£)
Kieran Blackstone	Engagement / Delivery Lead	Fractional (collectively up to 3 days/week)	1,200*
Wayne Palmer	Operating Model / Strategy Execution	Fractional (collectively up to 3 days/week)	1,200*
Suranga Fernando	Data Strategy & Databricks Advisor	Fractional (collectively up to 3 days/week)	1,200*
Ras Fernando	AI Product Lead / Business Analyst	Full-time (5 days/week)	900*
Don Capito	Data/ML DevOps Engineer	Full-time (5 days/week)	900*

Component	Days/Week	Who	Weekly Cost
Fractional Leadership	3	Shared across Kieran, Wayne & Suranga	£3,600/week
Full-Time Delivery	10	Ras (5 days) + Don (5 days)	£9,000/week
Weekly Total	13	5-person blended team	£12,600/week

* Discount of 12.5% on standard rate card

Collaboration Hub Cost Estimate

The Collaboration Hub use case will be delivered across six phases over approximately 18 weeks. The initial core team (outlined above) will be present throughout the full engagement. From the MVP phase onwards, we add a nearshore tester to support quality assurance through build, pilot, and scale, covering the final 13 weeks.

Phase	Duration	Team
1. Discovery	1 week	Core team
2. Ideation & Prototyping	2 weeks	Core team
3. Experimentation	2 weeks	Core team
4. MVP Build	3 weeks	Core team + Nearshore Tester
5. Pilot	4 weeks	Core team + Nearshore Tester
6. Scale	6 weeks	Core team + Nearshore Tester

Weekly Cost Breakdown

Phases	Team	Weekly Cost
Phases 1-3 (5 weeks)	Core team only	£12,600/week
Phases 4-6 (13 weeks)	Core team + Nearshore Tester	£12,600 + £1,500 (tester @ £300/day) = £14,100/week

Total Estimated Cost

Phases 1-3: Core team (5 weeks × £12,600)	£63,000
Phases 4-6: Core team (13 weeks × £12,600)	£163,800
Phases 4-6: Nearshore Tester (13 weeks × £1,500)	£19,500
Total Collaboration Hub Delivery	£246,300
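
The arithmetic behind these totals can be reproduced with a short sketch. The figures come from the tables above; the variable and function names are illustrative, and the tester's 5 days/week is inferred from £1,500/week at £300/day.

```python
# Day rates and weekly days taken from the Team Shape and cost tables.
FRACTIONAL_DAYS, FRACTIONAL_RATE = 3, 1_200   # shared leadership
FULL_TIME_DAYS, FULL_TIME_RATE = 10, 900      # two full-time roles
TESTER_DAYS, TESTER_RATE = 5, 300             # inferred: 1,500 / 300 = 5 days

CORE_WEEKLY = FRACTIONAL_DAYS * FRACTIONAL_RATE + FULL_TIME_DAYS * FULL_TIME_RATE
TESTER_WEEKLY = TESTER_DAYS * TESTER_RATE

# (phase name, duration in weeks, nearshore tester present?)
PHASES = [
    ("Discovery", 1, False),
    ("Ideation & Prototyping", 2, False),
    ("Experimentation", 2, False),
    ("MVP Build", 3, True),
    ("Pilot", 4, True),
    ("Scale", 6, True),
]


def estimate(phases):
    """Return (core cost, tester cost, total) in GBP, exclusive of VAT."""
    core_weeks = sum(weeks for _, weeks, _ in phases)
    tester_weeks = sum(weeks for _, weeks, has_tester in phases if has_tester)
    core = core_weeks * CORE_WEEKLY
    tester = tester_weeks * TESTER_WEEKLY
    return core, tester, core + tester


core_cost, tester_cost, total_cost = estimate(PHASES)
print(f"Core team: £{core_cost:,}")      # Core team: £226,800
print(f"Tester:    £{tester_cost:,}")    # Tester:    £19,500
print(f"Total:     £{total_cost:,}")     # Total:     £246,300
```

The core team figure of £226,800 is the sum of the two core team lines in the table (£63,000 + £163,800).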

Rate Card

Role	UK Day Rate (£)	Nearshore Day Rate (£)
Engagement / Delivery Lead	995	-
Product Lead	995	-
Security Architect	1,100	-
Data Architect	1,100	-
Platform Architect	1,100	-
Principal AI Engineer	1,100	605
Senior AI Engineer	950	495
AI Engineer	875	500
Principal Full Stack Engineer	895	485
Senior Full Stack Engineer	825	465
Full Stack Engineer	750	435
Principal Data Engineer	1,150	610
Senior Data Engineer	910	505
Data Engineer	875	500
Principal DevOps Engineer	950	515
Senior DevOps Engineer	850	480
DevOps Engineer	750	435
Principal QA Engineer	825	430
Senior QA Engineer	700	385
QA Engineer	600	345
Senior Business Analyst	800	435
Business Analyst	700	395
Junior Business Analyst	500	290
Fractional Subject Matter Experts (AI Strategy, Data Strategy, TOM SME, Security SME)	1,500	-

Potential Roadmap to Agile Contracting

We propose a phased commercial approach that allows both parties to build confidence in the engagement, the partnership, and the ways of working before committing to a long-term commercial structure.

Supporting tool: agile-contracting-toolkit.pages.dev

1. Discovery & Foundation

Days 1-90 | Time & Materials

Pure T&M for the first 90 days. Scope is clarified, dependencies mapped, and ways of working established.

Why T&M

Scope still being defined. Dependencies not yet understood. Teams need time to establish rhythms. Premature milestones would create friction.

Focus

Building a shared understanding of the problem space, delivery landscape, and the partnership itself.

2. Dual-Track Commercial

Days 91-180 | T&M (primary) + Hybrid Agile (shadow)

Primary Track

T&M remains the billing model. All invoicing continues on a day-rate basis. No change to the commercial relationship.

Shadow Track

Hybrid Agile model runs as a comparison. Sprint milestones defined, delivery tracked, costs calculated, without money changing hands differently.

What the shadow track proves

Can we define meaningful milestones together? How does the hybrid model perform financially against T&M? What is the right milestone allocation percentage based on real delivery, not assumptions? At the end, both parties have 90 days of comparative data: a concrete, evidence-based foundation for the long-term model.

3. Long-Term Model

Day 181+ | Hybrid Agile (if agreed) or T&M (if preferred)

If both parties are comfortable

Transition to Hybrid Agile Contracting. Milestone allocation, sprint cadence, and deliverable definitions informed by six months of real engagement data.

If either party prefers T&M

That remains a perfectly valid choice. The dual-track phase ensures the decision is informed, not forced.
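
The three-stage roadmap can be summarised as a simple lookup from engagement day to commercial model. This is a sketch for illustration: the function name, the return structure, and the `both_parties_agree` flag are assumptions, not part of the Agile Contracting Toolkit.

```python
def commercial_model(day, both_parties_agree=False):
    """Map an engagement day (1-based) to the billing and shadow models."""
    if day < 1:
        raise ValueError("engagement days start at 1")
    if day <= 90:
        # Days 1-90: pure Time & Materials, no shadow model.
        return {"stage": "Discovery & Foundation", "billing": "T&M", "shadow": None}
    if day <= 180:
        # Days 91-180: T&M is still invoiced; Hybrid Agile is tracked for comparison only.
        return {"stage": "Dual-Track Commercial", "billing": "T&M", "shadow": "Hybrid Agile"}
    # Day 181+: Hybrid Agile only if both parties agree, otherwise T&M continues.
    billing = "Hybrid Agile" if both_parties_agree else "T&M"
    return {"stage": "Long-Term Model", "billing": billing, "shadow": None}


for d in (30, 120, 200):
    print(d, commercial_model(d, both_parties_agree=True))
```

The default of `both_parties_agree=False` reflects that T&M stays in place unless both sides actively choose to move.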

Section 8 Assumptions, Risk & Compliance

Clear boundaries, shared accountability

What Serco Provides

Assumption	Impact If Not Met
Access to contract data within first 2 weeks	Discovery delayed; prioritisation based on incomplete picture
Cloud infrastructure and Databricks workspace access provisioned	Build phase cannot begin; team idle time
SSO/IAM integration available or scheduled within discovery	Workaround needed for access control
Named business stakeholders available 2-4 hrs/week	Decisions deferred; sprint velocity reduced
Security clearance guidance and sponsorship in first week	Team access to classified environments delayed
Existing governance and classification frameworks shared at start	Duplicate effort defining controls already in place

General Assumptions

Databricks is a separate workstream: we integrate with it; we do not build or manage it
Initial delivery focuses on UK & Europe; global rollout is phased by division
Serco's existing governance frameworks apply; we operate within them, not alongside them
Day rates quoted in GBP, exclusive of VAT
Remote-first with on-site for workshops, reviews, and key sessions

Exclusions

Exclusion	Clarification
Databricks platform build or management	We integrate; we do not own it
Data cleansing or migration	We work with data as provided; flag quality issues for Serco's data team
Custom hardware procurement	All delivery uses Serco-approved cloud infrastructure
Legal or regulatory advice	We identify requirements; Serco legal provides interpretation
Microsoft Copilot configuration	Separate programme
Penetration testing or formal security certification	We build to standards; formal testing is Serco's responsibility

Jurisdictional Compliance

Division	Key Regulations	Considerations
UK & Europe	UK GDPR, Data Protection Act 2018, EU AI Act	Majority of initial use cases; well-understood landscape
Middle East	Data localisation requirements (UAE, KSA, Qatar each distinct)	In-country hosting likely required
North America	ITAR (potential), PIPEDA, state privacy laws	US defence contracts may restrict model/hosting choices
Australia & NZ	Privacy Act 1988, Australian Government ISM	Five Eyes alignment simplifies some cross-border considerations

Data Handling

No data on Blackstone& systems. All processing on Serco-approved infrastructure.
In-situ access only. No copies, exports, or transfers to external environments.
Cloud models: DPAs in place, UK/EU data residency, zero-retention API policies.
Classified data: Self-hosted models within Serco's secure boundary. No data leaves.
Full audit trail within Serco's infrastructure for compliance and monitoring.

IP & Confidentiality

Category	Ownership	Detail
Bespoke outputs	Serco	All deliverables created specifically for Serco, full ownership, unrestricted use
Methodology & frameworks	Blackstone&	Perpetual, royalty-free licence to Serco for internal use
Open-source components	Per licence terms	Identified, catalogued, licence-compatible. No copyleft contamination.

Open Items for Negotiation

Item	Our Position
Uncapped liability provisions	Seek cap proportionate to contract value
Breadth of supplier IP licence	Clarify scope: bespoke outputs, not pre-existing IP
Termination notice period	Align with sprint cadence for orderly handover

These are standard negotiation points; they do not represent objections to Serco's Terms and Conditions.

Section 9 Environmental, Social & Governance

Environmental

Blackstone& operates as a remote-first, lean consultancy. Our delivery model minimises environmental impact by design:

Low-carbon delivery model. No permanent office footprint. Team members work remotely by default, with on-site presence at Serco locations only when delivery requires it, reducing commuting and facilities overhead.
Responsible compute. We actively select AI models and infrastructure that balance capability against energy consumption. Our approach to model strategy (detailed in AI Foundations) includes cost and compute efficiency as selection criteria, avoiding the default of running the largest available model when a smaller, fine-tuned model delivers equivalent results at a fraction of the energy cost.
Cloud provider alignment. Our recommended infrastructure runs on hyperscaler platforms (AWS, Azure, GCP) that publish verified carbon intensity data and have committed to 100% renewable energy targets, aligning with Serco's own target of 100% renewable-sourced electricity.

We commit to procuring an EcoVadis Sustainability Assessment within 6 months of contract execution, as required under the Framework Agreement, and to sharing results via the EcoVadis portal.

Social

Our engagement model directly supports Serco's social value objectives:

Capability building over dependency. Knowledge transfer is embedded in every sprint, not bolted on at handover. The explicit goal of this engagement is for Serco's own teams to operate AI products independently, creating sustainable, skilled roles within the organisation.
Inclusive AI design. Our Responsible AI approach (detailed separately) includes bias testing, fairness evaluation, and diverse stakeholder input during use case design, ensuring AI products serve all user groups equitably across Serco's four global divisions.
Skills development. Our delivery model pairs Serco engineers directly with our team from day one. This is practical, on-the-job upskilling, not a training course delivered after the fact.

Governance

AI governance framework. We propose establishing a Responsible AI governance structure as part of the AI Foundations workstream, including policy development, human-in-the-loop controls, risk assessment processes, and red-teaming protocols. This gives Serco a reusable governance capability, not just a one-off compliance exercise.
Transparent ways of working. All code, documentation, architecture decisions, and delivery artefacts are owned by Serco from day one (per the Framework Agreement IP terms). No proprietary lock-in, no black boxes.
Data handling. We operate under clear data handling principles: Serco data is processed only within agreed environments, with no subcontractor access unless explicitly agreed. Our data handling disclosure is provided separately in the Assumptions, Risk & Compliance section.