Good knowledge = good engineering

How leading engineering firms turn messy data into measurable AI returns

AI Blueprint

The global construction industry may lose as much as $1.85 trillion a year to bad data.¹ That figure comes from a 2021 Autodesk/FMI survey of 3,900 professionals — and later surveys give no indication the picture has improved. Workers spend 35% of their time on non-productive activities like searching for information, resolving conflicting documents, and managing rework caused by errors that lived in the data long before anyone asked an AI model to read it.² Meanwhile, approximately 95% of enterprise generative AI pilots fail to deliver measurable financial impact, and the determining variable is not model capability — it is the learning gap between tools and organizations, including flawed enterprise integration of the information the model depends on.³ Engineering firms that deploy AI on top of disorganized, unclassified, and ungoverned information don't just fail to get ROI. They systematically amplify the errors, rework, and knowledge loss that were already bleeding them dry.

Key Points

  • Bad data may cost the global construction industry as much as $1.85 trillion annually, with 30% of professionals reporting that more than half their project data is unreliable.¹

Lessons Learned

  • Audit your information infrastructure before evaluating any AI tool. If you cannot tell a model where your documents are, what they contain, and whether they are current, the model cannot help you.

What is the real problem with AI in engineering?

Three weeks. That is how long it took Samsung Semiconductor to turn a reasonable productivity tool into a global data security crisis. In March 2023, Samsung lifted its internal ban on ChatGPT. Within 20 days, engineers had pasted proprietary source code, code for identifying defective equipment, and confidential meeting transcripts directly into the model.¹¹ Samsung responded with a 1,024-byte input cap. In May 2023, the company issued a full company-wide ban.

The engineers were not reckless. They were doing exactly what generative AI promises: accelerating debugging, analysis, and documentation. The failure was not the AI and it was not the engineers. It was the absence of any data classification, data-loss prevention, or governance infrastructure between the workforce and the tool.

Samsung's episode is fast and dramatic, but the underlying condition is universal. The construction and engineering industry runs on information — drawings, specifications, inspection records, HAZOP studies, commissioning documents, meeting minutes — and that information is overwhelmingly unclassified, ungoverned, siloed, or missing entirely. AI does not fix that. AI reads it, trusts it, and gives it back to you at scale, errors included.

How do you recognize information infrastructure failure?

The symptoms are well documented, even if the diagnosis is not. Peter Love's January 2026 study in the ASCE Journal of Construction Engineering and Management found that pre-completion rework averages 0.38% of contract value — a figure that sounds manageable until you learn that rework costs are systematically underreported by 300%, and post-completion rework averages 0.76% with peaks at 7.34%.⁸

The Get It Right Initiative (GIRI), tracking UK construction since 2015, puts the direct cost of errors at roughly 5% of project value — approximately £5 billion per year in the UK alone — with total costs (including indirect and consequential) reaching 10–25% of project value. That range has, in GIRI's own framing, "remained stubbornly high" for a decade.¹² The figure is roughly seven times the industry's average profit margin.

A 2023 study by Khadim and colleagues in Quality & Quantity quantified what most project teams sense but cannot prove: hidden quality-failure costs run 3.67 times higher than visible costs. Across the 25 projects studied, the total cost-of-quality figure averaged 12.76%, yet only 2.34% was visible. A further 8.59% was buried in schedule absorption, workarounds, and unreported fixes.⁷

The Construction Industry Institute established the 1-10-100 principle decades ago: an error caught in design costs $1 to fix; the same error caught during construction costs $10; caught after completion, $100.¹³ The information failures that feed AI systems today are the design-phase errors of tomorrow — except AI delivers them faster and with more confidence than any draftsperson ever could.

Why is this problem getting worse?

Because spending on AI tools is accelerating while the information those tools depend on is not improving. Cisco's 2024 AI Readiness Index, surveying 7,985 business leaders across 30 markets, found that only 13% are fully ready for AI — down from 14% the previous year. Ninety-eight percent report increased urgency to deploy AI. Eighty percent acknowledge data shortcomings. Only 32% rate their data infrastructure as ready.¹⁴

The gap is widening, not narrowing. And the firms that fall behind on information infrastructure do not simply miss out on AI benefits — they actively compound existing problems. Sambasivan and colleagues, in a 2021 study of 53 AI practitioners working in high-stakes domains, found that 92% had experienced "data cascades" — downstream failures triggered by upstream data quality issues that are "pervasive, invisible, delayed, but often avoidable."¹⁵

What do engineers and executives actually believe about AI?

Engineers already know. The Omni Calculator 2026 survey of 400+ engineers found that 86% use AI — but only 6% trust its output without verification. Eighty-five percent cite accuracy as their primary concern. Fifty-two percent perform manual checks on every AI-generated output.⁶

Autodesk's 2025 State of Design & Make report, surveying 5,594 professionals, found that trust in AI across design-and-make industries dropped 11 percentage points year-over-year. The share who believe AI will "enhance" their industry fell from 78% to 69%. The share who believe it will "destabilize" rose from 41% to 48%.¹⁶

Executives see a different picture. BCG and Columbia, in a study of 1,400 professionals published in Harvard Business Review, found that 76% of executives believe their employees are enthusiastic about AI. The actual figure among individual contributors: 31%.¹⁷ Accenture's 2024 workforce study found a similar disconnect: 95% of workers see value in generative AI, but roughly 60% are concerned about job displacement, fewer than one-third of C-suite leaders share that concern, and only 5% of organizations are training employees at scale — despite 94% of workers saying they are ready to learn.¹⁸

The people doing the work do not trust the tools. The people buying the tools do not know that.

---

What does AI failure actually cost engineering firms?

The case studies follow a pattern: information infrastructure degrades slowly, invisibly, over years — then surfaces as a crisis that costs orders of magnitude more than the infrastructure would have.

Plant Vogtle: tens of thousands of missing records and a $16 billion overrun

Plant Vogtle Units 3 and 4 were supposed to cost $14 billion. They came in above $30 billion, seven years late. Among the factors: tens of thousands of missing inspection and quality records that had to be recreated, contributing to an estimated $920 million in delay costs from the documentation backlog alone.¹⁹ Tom Fanning, then-CEO of Southern Company, acknowledged publicly that the project's document management failures were a primary driver of schedule and cost escalation. The drawings existed. The inspections had been performed. The records connecting them were gone.

Freeport LNG: a HAZOP study that sat in a system and saved nothing

On June 8, 2022, an explosion at the Freeport LNG facility in Texas took 17% of U.S. LNG export capacity offline. The root cause analysis revealed that a 2016 HAZOP study had failed to evaluate the specific blocked-in piping scenario that caused the rupture. Alarms that should have triggered were orphaned — present in the system but disconnected from operational response. The facility faced a $1.54 million fine. Market losses were estimated at $6–8 billion.²⁰ The safety analysis existed. The data was in the system. The information infrastructure connecting analysis to action had decayed.

Crossrail: management data as the failure

London's Crossrail project was reported as roughly 90% complete when leadership told the public it would open on schedule. Thirty percent of milestones had been marked as complete without meeting their acceptance criteria. Bond Street station alone went from a £110 million budget to £660 million. The total project cost grew from £14.8 billion to over £19 billion, with more than £4 billion in overruns and full services arriving over four years late.²¹ The project's own data — the status reports, the milestone tracking, the progress dashboards — was the failure mode. Leadership was making decisions on information that did not reflect reality.

What do the surveys confirm?

The pattern holds at population scale. The HKA CRUX Insight reports, analyzing thousands of disputed capital projects, find that 33.2% of total capital expenditure ends up in dispute, with an average disputed value of $83.1 million per project. Time overruns average 16 months — 66.5% of planned schedule. Design-related issues account for the largest share of claims.²² Arcadis's annual construction disputes report finds that "Errors and Omissions in Contract Documents" has been the number-one cause of disputes in six of the last nine years, with an average dispute value of $60.1 million and 12.5 months to resolve.²³

Fully 91.5% of major projects exceed their budget or schedule. Fewer than 1% finish on time, on budget, and deliver the promised benefits.²⁴

---

How does information failure compound across an organization?

The problem is not a single bad document. It is a system that makes bad documents inevitable.

PlanGrid and FMI's 2018 study of roughly 600 construction professionals found that workers spend 35% of their time — over 14 hours per week — on non-productive activities. Of that, 5.5 hours per week goes to searching for project information. The annual cost: $31.3 billion in the U.S. alone.² A 2004 NIST study estimated that poor interoperability between building information systems cost the U.S. capital facilities industry $15.8 billion per year, with $10.6 billion of that burden falling on owners and operators.²⁵ McKinsey Global Institute estimated in 2012 that knowledge workers spend 19–20% of their workweek searching for information, with a 35% efficiency gain available through better information systems.²⁶ These studies span two decades. The numbers have not materially improved.

What do users actually complain about?

The complaints are remarkably consistent across roles and industries. RICS's 2025 AI in Construction report, surveying over 2,200 professionals, found that 45% have not adopted any AI, and only 1% describe their AI usage as "scaled." The top three barriers: lack of understanding of AI tools (46%), data quality and availability (37%), and cost of implementation (30%). Yet 70% believe AI will be helpful.²⁷

The Dodge/CMiC 2025 SmartMarket Brief found that 87% of contractors believe AI will transform the industry — but only 19% have adapted their workflows. Ninety-two percent report that AI is effective for automated proposal generation, one of the most contained, data-controlled use cases available.²⁸ The pattern is clear: where the data is clean and the scope is narrow, AI works. Where the data is messy and the scope is broad, it does not.

How often does information-driven failure actually occur?

More often than anyone reports. Love's 2026 study demonstrated the 300% underreporting gap.⁸ KPMG's 2023 Global Construction Survey found that only 50% of projects finish on time and 37% miss budget or schedule targets due to inadequate risk management.²⁹ The Deloitte 2025 AI ROI study, surveying 1,854 executives across 14 countries in Europe and the Middle East, found that the expected ROI timeline for AI is 2–4 years, with only 6% achieving payback in under a year — yet 91% plan to increase spending.⁵

The math is uncomfortable. If you deploy a generative AI tool to 1,000 engineers and each asks 10 questions per day, even a 1% hallucination rate produces 100 fabricated answers daily — answers that look authoritative, cite plausible-sounding sources, and enter your project documentation with no flag, no asterisk, and no audit trail.
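That arithmetic is easy to verify. The inputs below are the article's illustrative assumptions (1,000 engineers, 10 questions per day, a 1% hallucination rate), not measurements from any deployment:

```python
# Back-of-envelope check of the hallucination arithmetic above.
# All inputs are the article's illustrative assumptions, not measurements.
engineers = 1_000
queries_per_day = 10
hallucination_rate = 0.01  # 1% of answers fabricated

daily_queries = engineers * queries_per_day               # 10,000 questions/day
fabricated_per_day = daily_queries * hallucination_rate   # 100 fabricated answers/day
fabricated_per_year = fabricated_per_day * 250            # assuming ~250 working days

print(fabricated_per_day)   # 100.0
print(fabricated_per_year)  # 25000.0
```

At the assumed usage level, that is roughly 25,000 authoritative-looking fabrications entering project documentation per year — before any of them is copied forward.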

---

What happens when engineering firms deploy AI without fixing their data?

Gartner predicted in July 2024 that 30% of generative AI projects would be abandoned after proof of concept by the end of 2025.³⁰ MIT's Project NANDA, based on 150 interviews with leaders, a survey of 350 employees, and an analysis of 300 publicly disclosed AI initiatives, found that approximately 95% of generative AI pilots fail to deliver measurable financial impact — but the failure mode is instructive. Purchased tools from domain-specific vendors succeed roughly three times as often as internal builds, not because the vendor's model is better, but because the vendor handles the data pipeline.³

Gartner's February 2025 analysis was blunter: 63% of organizations lack or are unsure about the data practices required for AI, and through 2026, 60% of AI projects unsupported by AI-ready data will be abandoned.³¹

How does the problem compound?

The feedback loop is vicious. An engineer asks an AI assistant a question. The model retrieves an outdated specification from an unclassified document store. The engineer, under time pressure, incorporates the answer into a design package. The design package passes review because the reviewers are also under time pressure and the answer looks right. The error propagates into procurement, then fabrication, then installation. Months later, it surfaces as rework — and the rework report, if it is filed at all, attributes the cause to "design error," not to the information infrastructure failure that made the error inevitable.

The AI did not cause the error. The AI found the error that was already in your system and delivered it with confidence, at speed, to everyone.

McKinsey's "Reinventing Construction" analysis found that 98% of megaprojects exceed their budgets by more than 30%, and 77% are at least 40% late.³² When every project already runs over, adding a tool that amplifies existing information quality at machine speed does not produce efficiency. It produces faster, more expensive failure.

---

Is the problem really the data — not the AI model?

Yes. And the evidence is not ambiguous.

Andrew Ng, in a widely cited 2022 statement, argued that model architectures are "basically a solved problem" and that data quality is now the primary lever for AI performance improvement.³³ The academic literature agrees. Alawadhi and Abbas, optimizing retrieval-augmented generation for electrical engineering applications in 2025, found that the key differentiator was "tailored datasets, advanced embedding models, and optimized chunking strategies" — not the choice of language model.⁹ Barnett and colleagues, cataloging seven failure points in RAG systems, found that the majority trace to information infrastructure: missing content, incorrect chunking, poor retrieval, failure to extract relevant context.³⁴ The model sat at the end of a pipeline, and the pipeline was broken long before the model was asked to generate an answer.

This is the reframe. The industry conversation has centered on which AI tool to buy, which model to deploy, which vendor to trust. The actual variable is none of those things. It is whether the information the model will read — your drawings, your inspection records, your design specifications, your project correspondence — is classified, current, governed, and retrievable. If it is not, no model on earth will save you. If it is, even a modest model will deliver value.

Precisely and Drexel University's 2026 survey found that 87% of leaders express confidence in their data integrity — while 43% simultaneously cite data readiness as their biggest AI obstacle.³⁵ The gap between confidence and reality is where the money disappears.

What does a successful implementation look like?

Mott MacDonald, a global engineering consultancy with 20,000 employees, built its AI assistant — EMMA — on a foundation of systematic document classification and curated knowledge bases covering 40+ engineering disciplines. Before deploying the model, the firm classified its document estate using automated tools, tagging 1,000 documents every 10 minutes across the work product of 16,000+ employees.³⁶ The result: 220,000 queries in nine months, 60% workforce adoption, £1.4 million in annual savings, and ROI in under a year.¹⁰ ³⁶

The contrast with Samsung is exact. Same class of organization — global, tens of thousands of employees, engineering-intensive. Opposite infrastructure investment. Opposite outcome.

Thames Tideway, London's £4.2 billion super sewer project, built a unified common data environment housing 80,000+ documents (685 GB) accessible to 300 users across 12 disciplines before any advanced analytics were layered on.³⁷ The project achieved an 82% first-review stage-gate approval rate, a 32% reduction in design resources, and a 90-day schedule savings through 4D construction modeling.³⁸ ³⁹ The information architecture was a BS 1192-compliant, COBie-ready system designed for a 120+ year asset life.⁴⁰

At industrial scale, Bechtel deployed a governed digital torque-management system on the Shell Pennsylvania Chemicals project, capturing 173,800+ torque records across 120,000+ flanged connections. The result: a 0.1% leak rate against an industry average commonly cited at 10% — a 100-fold improvement.⁴¹ Rolls-Royce built a digital thread connecting design, manufacturing, and in-service operations on Microsoft Azure and Databricks, achieving a 30% increase in machine utilization and preventing approximately 400 unplanned maintenance events per year.⁴² ⁴³ In every case, the data infrastructure preceded the AI deployment.

---

What does information failure actually cost?

The direct costs are large. The indirect costs are larger. And most organizations measure neither.

Gartner estimated the average annual cost of poor data quality at $12.9 million per organization.⁴⁴ Autodesk/FMI put the global construction figure at $1.85 trillion, with rework attributable to bad data costing $88.69 billion.¹ The HKA CRUX reports show 33.2% of capital expenditure in dispute, averaging $83.1 million per project.²² Arcadis finds $60.1 million average dispute values with 12.5 months to resolution.²³

But the hidden costs dwarf these figures. Khadim's 3.67× hidden-to-visible ratio means that for every dollar of rework you can see in your project controls system, $3.67 more is absorbed invisibly — in workarounds, schedule float consumption, unreported fixes, and knowledge loss.⁷ The 1-10-100 escalation principle means that information errors left uncaught in design become 100× more expensive at completion.¹³
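The two multipliers compose in a way worth making explicit. A minimal sketch, using the article's cited ratios on a hypothetical visible-rework figure:

```python
# Applying the two multipliers cited above to a hypothetical rework figure.
# The ratios come from the article's sources; the dollar input is illustrative.
HIDDEN_TO_VISIBLE = 3.67  # Khadim et al.: hidden costs per dollar of visible cost
ESCALATION = {"design": 1, "construction": 10, "post_completion": 100}  # 1-10-100 rule

def true_rework_cost(visible_rework: float) -> float:
    """Visible rework plus the hidden cost it implies."""
    return visible_rework * (1 + HIDDEN_TO_VISIBLE)

# A $1M visible rework line implies roughly $4.67M of total quality-failure cost.
print(round(true_rework_cost(1_000_000), 2))

# An error that costs $1,000 to fix in design costs 100x after completion.
print(1_000 * ESCALATION["post_completion"])  # 100000
```

The composition is the uncomfortable part: a design-phase information error that escapes to completion is both 100× more expensive to fix and, once fixed, reported at roughly a quarter of its true cost.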

McKinsey's "Rewired" research found that successful digital transformations — the roughly 6% of organizations that qualify as AI high performers — achieve 3.6 times the enterprise value of other transformations. But 70% of the effort in those transformations is data work.⁴⁵ The EY 2025 AI Pulse Survey found that 96% of organizations investing in AI report productivity gains, with organizations investing $10 million or more far more likely to see significant returns (71% vs. 52% at lower investment levels).⁴⁶ The returns are real. They are just downstream of an investment that most firms skip.

ASCE adopted Policy Statement 573 on July 18, 2024 — the first formal AI policy from a major U.S. engineering society — establishing that the licensed professional engineer retains full responsibility for AI-assisted work product, and that this responsibility requires verified, validated knowledge bases.⁴⁷ The policy does not mention model selection. It centers entirely on data governance, verification, and professional accountability. The profession's own standards body has named the problem.

---

How do you fix AI information infrastructure for engineering?

The thesis is straightforward: the pipeline matters more than the model. Every failure case in this article traces to information that was unclassified, ungoverned, outdated, or missing. Every success case traces to an organization that treated information infrastructure as a first-class engineering deliverable. The fix is not a better chatbot. It is a better pipeline — one that ensures AI models only access information that is classified, current, and citable.

Tricky Wombat builds that pipeline. The system is designed around three requirements that most AI deployments get wrong.

1. Classification before retrieval

Most enterprise AI systems ingest documents in bulk and rely on the model to figure out what is relevant. This produces the failure mode Barnett and colleagues documented: the model retrieves the wrong document, or the right document at the wrong version, or a chunk of a document stripped of its context.³⁴ Tricky Wombat classifies documents before they enter the retrieval pipeline — by discipline, document type, revision status, and project applicability. The model never sees an unclassified document. This is not preprocessing. It is a continuous operation that runs on every ingestion cycle.
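The gate described above can be sketched in a few lines. This is an illustrative model only — the class names, fields, and store are hypothetical, not Tricky Wombat's actual API:

```python
# Minimal sketch of a "classification before retrieval" gate, assuming a
# hypothetical document store. All names and fields are illustrative.
from dataclasses import dataclass

@dataclass
class Document:
    doc_id: str
    discipline: str    # e.g. "structural", "electrical"
    doc_type: str      # e.g. "specification", "drawing", "inspection"
    revision: str
    superseded: bool   # a newer revision exists
    classified: bool   # set by the ingestion-time classifier

def retrievable(doc: Document) -> bool:
    """Only classified, current documents may enter the retrieval index."""
    return doc.classified and not doc.superseded

docs = [
    Document("SPEC-001", "structural", "specification", "C", False, True),
    Document("SPEC-001", "structural", "specification", "B", True, True),  # superseded
    Document("MTG-884", "unknown", "unknown", "-", False, False),          # unclassified
]
index = [d for d in docs if retrievable(d)]
print([d.doc_id for d in index])  # only revision C of SPEC-001 passes the gate
```

The point of the gate is ordering: the filter runs at ingestion, so the model's retrieval index never contains the superseded revision or the unclassified meeting transcript in the first place.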

2. Structured domain knowledge, not generic embeddings

Alawadhi and Abbas showed that domain-specific embeddings and chunking strategies outperform generic approaches regardless of model size.⁹ Tricky Wombat builds engineering-domain knowledge structures — ontologies for disciplines, taxonomies for document types, relationship maps between specifications and the drawings they govern — and uses these to inform both retrieval and generation. A question about a structural steel connection retrieves the governing specification, the relevant drawing, and the inspection record, in that order, because the system knows how those document types relate.
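The retrieval ordering described above — specification, then drawing, then inspection record — amounts to ranking by document-type relationship before similarity score. A simplified sketch, with hypothetical names and a hand-built priority map standing in for a real ontology:

```python
# Sketch of type-aware result ordering: a relationship map ranks document
# types ahead of raw similarity. The structure is illustrative, not the
# product's actual ontology.
TYPE_PRIORITY = {"specification": 0, "drawing": 1, "inspection": 2}

results = [
    {"doc": "INSP-3302", "type": "inspection", "score": 0.91},
    {"doc": "DWG-S-101", "type": "drawing", "score": 0.88},
    {"doc": "SPEC-05120", "type": "specification", "score": 0.86},
]

# Order by governing relationship first, similarity second, so the
# specification that governs the connection outranks a higher-scoring
# inspection record.
ranked = sorted(results, key=lambda r: (TYPE_PRIORITY[r["type"]], -r["score"]))
print([r["doc"] for r in ranked])
```

A generic embedding search would have returned the inspection record first on similarity alone; the relationship map is what encodes "the specification governs the drawing" into the ordering.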

3. Citation-level traceability on every output

The 6% trust-without-verification figure from the Omni Calculator survey is not a problem to solve — it is the correct professional posture.⁶ Engineers should verify. The system's job is to make verification fast. Every Tricky Wombat output includes a citation chain: the specific document, the specific section, the specific revision. If the source document has been superseded, the system flags it. If the retrieval confidence is low, the system says so. No hallucinated references. No "based on your documents" without naming which one.
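A citation chain of this kind is easy to represent concretely. The payload below is a sketch under assumed field names — illustrative structure, not the system's actual output format:

```python
# Sketch of a citation-chain payload attached to an answer. Field names are
# hypothetical; the point is that each claim resolves to a document, section,
# and revision, with explicit flags instead of silent confidence.
answer = {
    "text": "Bolted connection capacity per the governing specification ...",
    "citations": [
        {
            "document": "SPEC-05120",
            "section": "3.4.2",
            "revision": "C",
            "superseded": False,          # flagged if a newer revision exists
            "retrieval_confidence": 0.93,
        }
    ],
}

LOW_CONFIDENCE = 0.70  # illustrative threshold

def flags(ans: dict) -> list[str]:
    """Surface superseded sources and low-confidence retrievals."""
    out = []
    for c in ans["citations"]:
        if c["superseded"]:
            out.append(f'{c["document"]} rev {c["revision"]} is superseded')
        if c["retrieval_confidence"] < LOW_CONFIDENCE:
            out.append(f'low retrieval confidence for {c["document"]}')
    return out

print(flags(answer))  # nothing to flag for this answer
```

The verification workflow this enables is the point: an engineer checking the output opens one named section of one named revision, rather than re-deriving the answer from scratch.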

The system monitors its own retrieval performance, reprocesses documents when new revisions are ingested, and flags citation chains that break when source documents are updated. The information infrastructure does not degrade over time. It improves — because every query, every correction, and every new document makes the classification and retrieval more precise.

---

The bottom line

The pattern across every case in this article is the same. Plant Vogtle's missing records, Freeport LNG's orphaned alarms, Crossrail's fictional milestones, Samsung's unclassified source code — these are not technology failures. They are information infrastructure failures that existed long before anyone mentioned artificial intelligence, and they will persist long after the current generation of AI models is obsolete.

The firms that break the pattern — Mott MacDonald, Bechtel, Rolls-Royce, Thames Tideway — do not start with the model. They start with the documents. They classify, govern, structure, and curate their engineering information until it is reliable enough to serve as a foundation. Then they deploy AI on top of it. And then, consistently, they get the returns that everyone else is spending money to chase.

The window matters. Cisco's AI readiness data shows that organizational readiness is declining even as investment accelerates.¹⁴ Autodesk's trust data shows that practitioner confidence is eroding.¹⁶ Deloitte's data shows that payback timelines are stretching.⁵ Every quarter that an engineering firm spends buying AI tools without fixing the information those tools depend on is a quarter of compounding waste. The organizations that get the infrastructure right now will not just catch up. They will make it structurally impossible for those that did not to compete.

---

By Tricky Wombat

Last Updated: Mar 29, 2026