Every week there is another headline about artificial intelligence transforming an industry. Behind the scenes, most projects never make it past the pilot stage.
Studies from Gartner and McKinsey put the failure rate at around 70 to 85 percent. The problem is not usually the algorithm. What works in a lab often cannot survive the leap into the messy, costly, and regulated world of real operations.
Why Pilots Fail
- Data gaps: AI models are trained on curated, high-quality data. Real-world data is messy, incomplete, and always changing.
- Hidden costs: Pilots look cheap because they run at a small scale. Once traffic grows, cloud bills spike, and compliance, monitoring, and guardrails add new costs that were never planned for (see the sketch after this list).
- Workflow friction: Even if the AI is accurate, adoption stalls when it slows people down. Doctors, bankers, and factory operators do not have time for tools that add friction.
- Governance roadblocks: In healthcare, finance, and other regulated industries, projects stall without clear approvals, audits, and explainability.
- No owner: Many pilots are experiments without a real business sponsor. When the demo ends, the project has no one to carry it forward.
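To make the hidden-cost point concrete, here is a minimal back-of-the-envelope sketch. Every figure in it (request volume, inference price, compliance and monitoring overhead) is a hypothetical placeholder, not a benchmark from any real deployment.

```python
# Back-of-the-envelope: why a cheap pilot becomes an expensive product.
# Every figure below is a hypothetical placeholder -- plug in your own.

PILOT_REQUESTS_PER_DAY = 500            # a small internal pilot
PROD_REQUESTS_PER_DAY = 250_000         # the same workload at company scale
COST_PER_1K_INFERENCES = 0.40           # assumed cloud inference price in USD
FIXED_MONTHLY_OVERHEAD = 12_000         # assumed monitoring, guardrails, compliance reviews

def monthly_cost(requests_per_day: float, overhead: float = 0.0) -> float:
    """Rough monthly spend: inference volume plus fixed operating overhead."""
    inference = requests_per_day * 30 * COST_PER_1K_INFERENCES / 1_000
    return inference + overhead

pilot = monthly_cost(PILOT_REQUESTS_PER_DAY)
production = monthly_cost(PROD_REQUESTS_PER_DAY, FIXED_MONTHLY_OVERHEAD)

print(f"Pilot:      ${pilot:,.0f}/month")                                    # ~$6/month
print(f"Production: ${production:,.0f}/month ({production / pilot:,.0f}x)")  # ~$15,000/month
```

Even with generous assumptions, the jump from pilot to production spend is orders of magnitude, and it is exactly the bill that never shows up in the pilot's budget.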
Stories That Show the Pattern
- Healthcare misstep: IBM’s Watson for Oncology cost over 60 million dollars at a major cancer center but never treated a single patient. The tool could not integrate into existing workflows and its recommendations were often unsafe.
- Healthcare success: In Denmark, an AI system for mammogram screening reduced radiologist workload by a third and caught more cancers. It worked because it fit into existing tools and processes.
- Trust broken: A mental health app tested AI-written responses to user messages. People rated the messages as higher quality, but once they found out the words came from a machine, trust collapsed and the project ended.
What Success Looks Like
The projects that survive share three qualities:
- Applicability: They solve a real problem with a clear business or human outcome.
- Deployability: They run reliably in the environment where people actually use them.
- Sustainability: They scale without runaway costs or compliance risks.
How to Improve the Odds
- Start with a business metric that matters, not just what a model can do.
- Involve the people who will use the system early. Adoption depends on trust.
- Treat governance as a design requirement, not an afterthought.
- Budget for the full costs of scaling, including compliance and monitoring.
- Use TAHO to avoid high cloud bills and unmanageable operating costs.
The Bottom Line
Most AI pilots fail not because the technology is weak, but because the bridge from lab to production was never built. The projects that succeed focus less on flashy demos and more on viability. That means AI that is applicable, deployable, and sustainable in the real world.
Related posts

The Biggest Mistake Business Leaders Are Making About Edge Computing

Here's the thing about revolutions: they never look the way people expect them to.
When personal computing arrived, the establishment thought it was about smaller machines. When the internet emerged, they thought it was about faster communication. When cloud computing took hold, they thought it was about cheaper servers. Every single time, the industry fixated on the hardware - the physical thing you could point at, and each time it completely missed the fundamental shift happening underneath.
I believe edge computing is experiencing that exact same misunderstanding right now. And it's costing businesses dearly.
The pervasive narrative that I am hearing frames edge computing as a hardware story. Devices. Gateways. Physical locations. Racks in closets and micro data centers tucked behind factory floors. It's neat, it's tangible, and it's almost entirely beside the point.
I've spent my entire career at the intersection of design and technology, from pioneering sophisticated immersive environments with Fortune 100 brands, to building custom software solutions and novel IoT environments using edge intelligence. Throughout most of my career, I've designed systems where the technology had to disappear into the background so the experience could come alive in the foreground for the end user. And if there's one pattern I've seen repeat itself across nearly every project I've worked on, it's this: The moment you define a technology by its physical form, you've already limited what it can become.
This is the reason that I believe that edge computing isn't a hardware topology. It's a business topology shift. And that distinction changes everything.
Intelligence Belongs Where Value Is Created
Think about where value actually happens in your business. It's not in a data center two thousand miles away. It's in the moment a customer interacts with your product. It's at the transaction boundary where a decision gets made. It's on the factory floor, at the retail shelf, inside the vehicle. It exists at the point of care. With this as our frame, we must recognize that the truest value is fundamentally created at the edges of your organization, and for far too long, we've been routing all the intelligence back to some centralized brain and asking it to make sense of what already happened.
That's like mailing a letter to your own brain every time you need to pull your hand off a hot stove. By the time the signal arrives, the damage has been done.
The work our team is focusing on today helps address this by delivering computation where the context is richest. Not where it's most convenient for IT departments to manage, but where the data is freshest, where the latency matters most, and where the outcome has the greatest impact.
This is the very principle that sparked TAHO, and with the rise of AI and high-performance computing workloads, our team is working on it with a great sense of urgency.
Three Things Edge Actually Unlocks
When you stop thinking about edge as infrastructure and start thinking about it as an execution philosophy, three massive opportunities come into focus.
First: Experience. Lower latency doesn't just mean faster response times on a spec sheet. It means fundamentally better engagement for the end user interacting with your brand. It means the interaction feels alive, responsive, and intuitive. In a past life, when I designed and built next-gen experiences, a fifty-millisecond delay could shatter the illusion and squander an experience built to demonstrate innovation and earn trust. In that period I learned that latency isn't a technical metric. It's a human one. People don't consciously notice when things are fast, but they absolutely feel it when things aren't. Edge processing, when executed well, creates experiences that feel almost magical in their responsiveness. The thing is - that magic isn't accidental. It's architectural.
Second: Compliance. This is something that many business leaders underestimate. When you process data locally, at the edge, you dramatically simplify your regulatory exposure. Data doesn't have to traverse jurisdictions. It doesn't have to pass through third-party hands. It stays close to its origin, which is exactly where most privacy frameworks want it to be. In an era where regulatory landscapes are shifting constantly and the penalties for non-compliance are becoming existential, local execution isn't just operationally elegant. It's strategically essential.
Third: Resilience. Centralized systems have a single point of truth, which also means they have a single point of failure. Distributed edge architectures don't just improve performance - they fundamentally reduce systemic risk. When one node goes down, the system adapts. When connectivity drops, local processing continues. This is exactly the kind of architecture we're building at TAHO, with our distributed execution layer using our Magnetic PeerMesh™ technology. We set out to build TAHO because we kept seeing the same pattern: the cloud was supposed to liberate us, but it chained too many organizations to complexity, fragility, and spiraling costs. TAHO is designed so that powerful computing - AI, HPC, or whatever the workload is - can ultimately run closer to where it matters, on existing hardware, without the overhead and waste that legacy orchestration demands.
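To make the "local processing continues" idea tangible, here is a tiny store-and-forward sketch. It is a generic pattern, not TAHO's or PeerMesh's actual interface: the node acts on every event locally and syncs results upstream only when the link allows.

```python
# Generic store-and-forward sketch of edge resilience (not TAHO's or PeerMesh's API):
# decisions happen locally, and results sync upstream only when the link allows.
from collections import deque

class EdgeNode:
    def __init__(self):
        self.outbox = deque()                 # results waiting to sync upstream

    def handle_event(self, event, link_up: bool):
        result = self.process_locally(event)  # the decision is made at the edge
        self.outbox.append(result)            # never blocked on the network
        if link_up:
            self.flush()
        return result                         # action is taken immediately

    def process_locally(self, event):
        return {"event": event, "action": "handled-at-edge"}

    def flush(self):
        while self.outbox:
            self.send_upstream(self.outbox.popleft())

    def send_upstream(self, result):
        pass  # placeholder: swap in your real sync call

node = EdgeNode()
node.handle_event("sensor-reading", link_up=False)  # link is down; work continues
node.handle_event("sensor-reading", link_up=True)   # link returns; backlog flushes
```

The decision never waits on the network; only the reporting does.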
The Real Mistake: Treating Edge as an Add-On
Here's where most leaders go wrong, and I say this with the utmost respect, as someone who has watched brilliant people stumble on this exact point… Many technical teams are treating edge as an addition to their existing architecture rather than a rethinking of it.
They bolt edge devices onto cloud-centric execution models and wonder why costs balloon and utilization stays low. They deploy edge nodes but still route decision-making back to centralized systems, effectively negating the entire point. They invest in the hardware without redesigning the software, the workflows, or the organizational logic that sits on top of it. And as a result, they get beat up by the technical debt.
This is like buying a sports car and then towing it behind your minivan. You've spent the money, but you've captured none of the value.
The companies seeing transformative returns from edge computing are the ones treating it as a first-class execution environment. Not secondary. Not supplementary. Strictly first-class, because again - they know where the true value lives. That means rethinking how applications are built, how workloads are distributed, how your data is flowing, and how decisions get propagated. It means trusting the edge to do real work; not just collect data for somewhere else to process.
At TAHO, this is a core philosophy. We built our Magnetic PeerMesh™ technology because we believe workloads should intelligently flow to wherever they can be processed most efficiently. The infrastructure should adapt to the work, not the other way around. And the results speak for themselves. We’re seeing dramatically more output from the same machines, lower energy consumption, and a platform that fits into existing stacks without demanding a wholesale rip-and-replace.
The Future Disappears Into the Background
I've always believed that the most profound technologies are the ones that eventually become invisible. The best interface is no interface. The best infrastructure is the kind you never think about because it simply works… Elegantly, efficiently, and at the speed of your ambition.
That is the real promise of edge computing. Not more boxes in more places, but a world where intelligence is simply there, wherever and whenever it's needed. Where the processing happens so close to the point of value creation that the gap between insight and action effectively disappears.
We're truly standing at the beginning of something remarkable. The convergence of edge computing, artificial intelligence, and distributed systems is going to reshape how every industry operates into the future. But only for the leaders who see edge for what it truly is… Not as an infrastructure decision, but a strategic one for their businesses. Not a deployment, but a redesign for their customers. Not an add-on to the old world, but the foundation for the future.
The question isn't whether your business needs edge computing. The question is whether you're willing to rethink your entire execution model to actually capture its truest value.
From everything I've seen, across immersive media, IoT, enterprise infrastructure, and now distributed HPC, the answer should be an unequivocal yes. The future belongs to those who build intelligence at the edge. And that future is closer than most people realize.

Get Compute Out of Traffic: Federated Orchestration Beats the Control Plane

The problem with modern infrastructure
Modern infrastructure is powerful but inefficient. Stacks rely on layers of orchestration that add latency, waste resources, and slow teams down. A central control plane makes every decision. That creates queues, bottlenecks, and idle capacity that you still pay for.
From dispatchers to a compute fabric
Traditional scheduling works like an old taxi service. A dispatcher sits in the middle and tells each driver where to go. It is reliable but rigid. When traffic spikes, you get a long line of waiting passengers. Cars sit nearby with fuel in the tank, yet the line does not move faster.
TAHO works like a modern rideshare network. There is no single dispatcher. Every node participates. When a request appears, available nodes advertise themselves, and the system picks the best fit. Fastest. Least busy. Closest to the data. If one node cannot take the job, another does without delay.
This creates a compute fabric. A mesh of secure peers that discover each other and cooperate as one. The result is a system that is self-healing, self-scaling, and efficient across data centers, edge sites, and multiple clouds. There is no single point of control and no central bottleneck.
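As a rough illustration of the rideshare-style selection described above, here is a minimal sketch of dispatcher-less scheduling. The Node fields and scoring weights are invented for the example; they are not TAHO's published selection logic.

```python
# Minimal sketch of dispatcher-less scheduling: nodes advertise their state and
# the best fit takes the job. Fields and weights are illustrative, not TAHO's logic.
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    load: float          # 0.0 = idle, 1.0 = saturated
    latency_ms: float    # round-trip time to the caller
    has_data: bool       # is the input data already on this node?

def best_fit(nodes: list[Node]) -> Node:
    """Prefer nodes that already hold the data, then the least busy and closest."""
    def score(n: Node) -> float:
        data_penalty = 0.0 if n.has_data else 10.0
        return data_penalty + n.load * 5.0 + n.latency_ms / 100.0
    available = [n for n in nodes if n.load < 1.0]   # saturated nodes drop out
    return min(available, key=score)

fleet = [
    Node("edge-a", load=0.2, latency_ms=8, has_data=True),
    Node("edge-b", load=0.1, latency_ms=40, has_data=False),
    Node("cloud-1", load=0.6, latency_ms=120, has_data=False),
]
print(best_fit(fleet).name)   # -> "edge-a": lightly loaded and closest to the data
```

The point is structural: any node can run this selection locally, so there is no central queue to wait behind.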
What makes the fabric work
Portable components
Applications are built from lightweight, portable components that can run anywhere in the fabric. A component can be referenced and invoked from any node without the caller needing to know where it lives. Once a component is published, it can start, move, or scale near-instantly as demand changes.
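Here is a small sketch of what location-transparent invocation can look like. The names (Fabric, publish, call, report.generate) are hypothetical stand-ins, not TAHO's actual API.

```python
# Hypothetical sketch of location-transparent component calls (not TAHO's API).
# The caller names a component; the fabric decides which node actually runs it.

class Fabric:
    def __init__(self):
        self.registry = {}                    # component name -> (node, callable)

    def publish(self, name, node, fn):
        self.registry[name] = (node, fn)

    def call(self, name, *args, **kwargs):
        node, fn = self.registry[name]        # the caller never sees `node`
        return fn(*args, **kwargs)            # could execute locally or remotely

fabric = Fabric()
fabric.publish("report.generate", node="edge-7", fn=lambda month: f"report for {month}")

# Any node can do this without knowing where "report.generate" lives:
print(fabric.call("report.generate", "June"))
```

The caller only ever names the component; where it runs is the fabric's problem.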
A simple example
One node needs a report. Another node already has the data and the tool. The system lets the second node do the work and share the result with the rest. From that moment on, there is no coordination overhead. Only results that any node can use.
Security by design
Every component runs in its own secure sandbox. There is no shared memory and no implicit access to files or networks. If one component fails or is compromised, it stays contained while the rest of the system keeps running.
Instant hot reload
When you push new code, the fabric swaps in new components without restarting. Existing versions finish their current work while the new ones take over. If something goes wrong, the fabric automatically rolls back to the last known good version.
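A minimal sketch of the swap-and-roll-back behavior described above; the shape is illustrative and says nothing about how TAHO implements it.

```python
# Illustrative hot-swap with automatic rollback; not TAHO's implementation,
# just the shape of the behavior: the new version takes over, bad versions roll back.

class HotSwap:
    def __init__(self, component):
        self.active = component            # last known good version

    def deploy(self, new_component, smoke_test_input):
        previous = self.active
        self.active = new_component        # new requests now hit the new version
        try:
            self.active(smoke_test_input)  # quick health check on a real input
        except Exception:
            self.active = previous         # roll back to the last known good
            raise

slot = HotSwap(lambda x: x.upper())
slot.deploy(lambda x: x[::-1].upper(), "hello")   # healthy, so it stays active
print(slot.active("taho"))                        # -> "OHAT"
```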
Peer-to-peer networking
Nodes use secure decentralized protocols for service discovery, workload distribution, and shared state. There is no need for centralized load balancers or control planes. Coordination and performance emerge from the collective behavior of all nodes working together.
How it compares to a central control plane
A central control plane concentrates decision-making in one place. That creates a queue and a single locus of complexity. You scale the controller. You tune the controller. You wait on the controller. The system is stable, but work piles up when demand spikes.
A compute fabric distributes decision-making. Nodes advertise capacity, claim work quickly, and keep work moving. There is no queue to jam up. There is no master to fail. Work finds the fastest path through the environment every time.
The takeaway
A central control plane was a useful step for the last decade of cloud. The next step is federation. If you want to see the fabric in action, start with a small service and let TAHO run it across a few nodes in your environment. You will feel the difference the first time you push code and watch it go live with no blip.

Engineering Breakthroughs That Crush AI Bubble Fears

Amid fears of an AI bubble, these advancements in AI infrastructure are concrete engineering wins and will form the basis of a sustainable AI-driven economy in the U.S.
In the swirling market excitement that has defined the AI era, it is natural to be concerned that investors may be inflating a bubble. Many of us who lived through dot-com mania look at Nvidia surging past $5 trillion in market cap with a skeptical eye. One prominent voice pegged the current AI hype as 17 times larger than the dot-com boom, fueled by trillions in projected spending that may never yield commensurate returns. OpenAI's revenue forecasts tripling to $12.7 billion next year sound triumphant, but come amid warnings from firms like Ark Invest's Cathie Wood about potential market corrections. The BBC has spotlighted a "tangled web of deals" in Silicon Valley, where valuations do not match up to profits.
Yet amid these valid concerns, infrastructure advancements based on hard science and engineering are converting AI's inflated expectations into a robust productivity engine, particularly in the United States. Innovations in both compute hardware and infrastructure software promise to address the core bottlenecks of scaling: energy-hungry data centers, memory walls that choke model performance, and supply chains vulnerable to geopolitics. To give just two examples from different parts of the stack: startups like Substrate are working on X-ray lithography techniques that could reclaim U.S. semiconductor dominance, while TAHO, a U.S.-engineered compute software platform, unlocks far more data-center capacity and reduces inference costs on existing infrastructure without new silicon.
By 2030, global data centers could demand $3.7 trillion to $5.2 trillion in investments, but with U.S.-led efficiencies, this spend translates into productivity gains that could add trillions to GDP, echoing McKinsey's early projections for AI's potential. When energy demands are projected to rival entire nations' power consumption, these concrete wins are setting the stage for the U.S. to take a leading role in the transformation of the global economy.
Hardware Advancements
Today, it’s widely assumed that AI's scaling challenge lies primarily with the speed and cost of chip production. For years, the U.S. has ceded ground in semiconductor manufacturing to Taiwan's TSMC and the Netherlands' ASML, whose extreme ultraviolet (EUV) lithography tools hold a near-monopoly on producing chips at the 2-3 nanometer scale essential for AI.
Enter Substrate, a San Francisco startup that emerged from stealth this month with an audacious claim: the ability to use particle accelerators to etch features finer than 2 nanometers, surpassing the state of the art. The new technique also costs a tenth as much as in-market solutions: $40 million per tool versus $400 million. Backed by over $100 million from Peter Thiel's Founders Fund and In-Q-Tel, Substrate has successfully etched silicon wafers at U.S. national labs like Oak Ridge in my home state of Tennessee.
However, to compete in the global AI race, chips alone will not suffice. Data centers will form the backbone of daily productivity, and data centers are hungry – for energy, water, and real estate. Energy constraints loom large, with AI's power consumption possibly hitting 123 gigawatts in the U.S. by 2035. That would be enough to power about 100 million U.S. homes simultaneously. There's a limit to how far chip design alone can push energy efficiency, at which point software architecture becomes a key lever.
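A quick sanity check on that homes comparison, assuming an average U.S. household uses roughly 10,500 kWh per year (an assumption for the arithmetic, not a figure from this article):

```python
# Rough check that 123 GW is on the order of 100 million U.S. homes.
# Assumes roughly 10,500 kWh per household per year (an assumption, not from the article).
ANNUAL_KWH_PER_HOME = 10_500
HOURS_PER_YEAR = 8_760

avg_kw_per_home = ANNUAL_KWH_PER_HOME / HOURS_PER_YEAR     # about 1.2 kW average draw
homes_powered = 123e9 / (avg_kw_per_home * 1_000)          # 123 GW in watts / watts per home
print(f"~{homes_powered / 1e6:.0f} million homes")         # ~103 million
```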
Software Advancements
While energy and hardware provide raw potential, it is the software we run on it that ultimately decides whether we are maximizing the use of scarce compute cycles. As an example, TAHO, a stealthy infrastructure software layer that claims to increase effective compute without new hardware, could slash inference costs by 90% and launch processing jobs 30 times faster by creating a shared memory fabric across fleets.
Unlike Kubernetes, which often leaves 70% to 80% of cloud capacity idle due to orchestration overhead, suboptimal scheduling, and queuing delays, TAHO acts as a compute-efficiency layer that eliminates redundant work and cold starts, reclaiming idle capacity into coherent AI pipelines. The framework sits atop existing stacks, turning $371 billion in annual data center spend into twice the ROI by optimizing for the AI supercycle's underbelly.
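To see why reclaiming idle capacity maps to roughly double the return, a simple utilization calculation is enough. The 25% and 50% figures below are illustrative, chosen to match the 70% to 80% idle range cited above, not measured results.

```python
# Illustrative utilization arithmetic (example figures, not measurements).
# If roughly 75% of paid-for capacity sits idle, lifting utilization from 25%
# to 50% doubles the useful compute extracted from the same spend.
ANNUAL_SPEND_B = 371          # annual data-center spend cited above, in billions USD

baseline_utilization = 0.25   # consistent with "70% to 80% idle"
improved_utilization = 0.50   # illustrative target after reclaiming idle capacity

useful_today_b = ANNUAL_SPEND_B * baseline_utilization   # spend currently doing useful work
gain = improved_utilization / baseline_utilization

print(f"Useful work today: about ${useful_today_b:.0f}B of ${ANNUAL_SPEND_B}B")
print(f"Gain from better utilization: {gain:.1f}x on the same spend")   # -> 2.0x
```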
As hyperscalers like Meta project notably larger capital expenditures in 2026, software-side innovations will ensure these investments yield higher returns. Innovative architectures like TAHO could transform Substrate's already dense chips into supercomputers, making compute "feel infinite" without ballooning power consumption. Deloitte predicts that over 50% of data will be generated at the edge, and performance optimization software like TAHO will facilitate that trend, ensuring efficient scaling and reducing supply chain risk.
Concrete hardware and software advancements are shaping a path to sustainable growth in the AI sector, and these gains are quantifiable regardless of whether AI investment is momentarily overheated. When foundational technologies like Substrate's lithography, TAHO's efficiency alchemy, and others are combined, trillion-token models that don’t fry the grid become practical – leading to AI abundance that will improve the quality of life for all.
For more, follow Dave Birnbaum @ contrarymo on X.

