What happens when engineering teams reorganize around AI agents 8 May 2026, 11:05 pm
I counted at least 10 events in San Francisco last night aimed at matching AI startups with VCs. Just another Thursday.
But what made Camp AI’s “Agents at Work” event (hosted by Auth0) stand out was its showcase of companies that are in various stages of reorganizing their engineering processes around AI agents. Browserbase, Mastra, Fireworks AI, Drata, Mya, MindFort, and Corridor are all part of the vendor ecosystem trying to enable secure and performant agentic AI, but the most revelatory stories were their own successes and the challenges they faced restructuring their engineering orgs for agents.
Agentic AI is reshaping team structures
Paul Klein IV, founder and CEO of Browserbase, delivered the night’s most memorable line while discussing the speed of AI adoption inside engineering teams. “If AI is not doing your whole job it’s a skill issue at this point,” said Klein.
Abhi Aiyer, founder and CTO of Mastra, said the result is dramatically smaller teams capable of executing much larger scopes of work. “You can have one person run a whole feature project because they have an army of one to infinity AI agents behind them,” said Aiyer.
The AI code-generation bottleneck has moved
Several panelists argued that AI systems are now generating software faster than organizations can safely review and operationalize it. Aiyer said that engineering teams are opening significantly more pull requests while review throughput becomes the new bottleneck.
Klein stressed the importance of throttling experimental AI output to appropriately lower risk in deployment environments. “If you are in the critical path and customer facing, no slop,” he said. “If you are not critical path, not customer facing, slop away.”
Trust and ownership are common stumbling blocks
Speakers repeatedly emphasized observability and accountability as challenge areas for autonomous agents. Rob Ferguson, VP of technology and strategy at Fireworks AI, argued that ownership cannot disappear simply because AI generated the output. “It doesn’t matter if you typed it or prompted it, you own it,” Ferguson said.
Bhavin Shah, VP of AI product at Drata, said enterprise AI systems increasingly require detailed auditability. “The agent is constantly telling the user, here is the action I’m taking, here is what I’ve done,” he said.
Securing the agentic workflow
Auth0’s demos focused heavily on authentication, authorization, and runtime controls for AI agents interacting with APIs and Model Context Protocol (MCP) servers. The company’s new MCP authentication product, which reached general availability this week, is designed to secure how agents interact with MCP servers and APIs.
Monica Bajaj, SVP of engineering at Okta, emphasized the importance of minimizing risk exposure as agents operate autonomously across enterprise systems. “How do we make sure that those tokens are not long-lived tokens?” she asked, adding, “We make sure that the blast radius is minimum.”
Infrastructure is becoming the real AI differentiator
Klein argued that many AI limitations today are no longer about the underlying models themselves. “The overhang of AI capabilities is actually an infrastructure problem, not a model quality problem,” he said.
Klein noted that orchestration, tooling, permissions, and training data pipelines increasingly determine whether AI systems succeed in production.
Other interesting company demos
Mya demonstrated an AI program manager that aggregates Slack, Gmail, Jira, GitHub, and meeting notes to automatically track project risk and operational status. MindFort showed autonomous penetration testing agents designed to continuously probe enterprise applications for vulnerabilities during development and runtime. And Corridor demonstrated AI security guardrails that pre-index codebases and inject secure coding guidance directly into AI coding workflows.
Mastra discussed redesigning developer documentation and frameworks specifically for AI agents rather than human developers.
Python isn’t always easy 8 May 2026, 9:00 am
It’s harder than it might seem to create a stand-alone Python app. It’s also harder than you might think to reliably back up SQLite databases, but Python has the tools for it. And while it’s not easy to install Python on an air-gapped machine, it absolutely can be done.
Top picks for Python readers on InfoWorld
Why it’s so hard to create stand-alone Python apps
Python’s dynamism is one of its most powerful features. It’s also why making stand-alone apps from Python programs is such a bear.
How to back up SQLite databases the right way (not by copying them!)
SQLite databases are single files, so backing them up just means copying them, right? Wrong. Make backups the proper way by using SQLite’s own backup mechanisms.
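For the impatient, the core of the approach is tiny. A minimal sketch using the standard library’s sqlite3 module and its backup API (the file names are hypothetical):

import sqlite3

src = sqlite3.connect("app.db")         # the live database
dst = sqlite3.connect("app-backup.db")  # the backup target
with dst:
    src.backup(dst)  # consistent online backup via SQLite's backup API
dst.close()
src.close()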
Python’s new frozendict type, demonstrated
A long-desired and long-debated core addition to the language, a “frozen” (immutable) dictionary, is coming in Python 3.15. See where it’ll be most useful in our live demo.
How to set up Python on an air-gapped system
Stuck working with a machine that has limited or no network connectivity, but still needs a Python installation? Such feats are possible — just tricky!
More good reads and Python updates elsewhere
Python 3.15 is getting sentinel values
No more abusing object() for creating alternatives to None or booleans. sentinel() offers a better, and native, alternative.
Package MATLAB programs for deployment as Python packages
A great way to bridge the worlds of MATLAB and Python, the Python Package Compiler takes MATLAB programs and makes them deployable as pip-installable Python packages.
Choosing a Python logging library in 2026
From the Python standard library’s logging module to the Microsoft-backed, C-based picologging library, there may be more options for logging in your Python apps than you realized.
Semi-off-topic: NetHack 5.0
Great news: The great-granddaddy of dungeon crawlers is getting its first new version in six years! Not-so-great news: Your older saved games won’t work. Go back to Level 1 and get grinding.
12 model-level deep cuts to slash AI training costs 8 May 2026, 9:00 am
Optimizing artificial intelligence pipelines requires moving beyond surface-level hardware adjustments to fundamentally alter how models process data. While engineers often implement basic, easily toggled efficiencies inside the training loop, achieving permanent cost reductions requires architectural changes inside the neural network itself. As I have previously argued, the science is solved, but the engineering is broken; true FinOps maturity demands deep, model-level interventions. The following 12 architectural cuts will drastically improve the unit economics of your AI pipeline.
Redesigning the training foundation
1. Fine-tune, don’t train from scratch
Training a foundation model from scratch is computationally prohibitive and rarely necessary for standard enterprise applications. Instead of burning millions of dollars on raw compute, engineering teams should download highly capable, publicly available open-weight models. This baseline transfer learning approach is the mandatory first step when building internal corporate chatbots or domain-specific classifiers. Leveraging existing neural architectures instantly bypasses the massive energy and financial costs associated with initial pre-training phases.
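As a minimal sketch of that first step, assuming the Hugging Face transformers library (the model name is a placeholder, not a recommendation); this also supplies the base_model reused in the LoRA example below:

python
from transformers import AutoModelForCausalLM, AutoTokenizer

# "org/open-weight-model" is a placeholder for any capable open-weight model.
base_model = AutoModelForCausalLM.from_pretrained("org/open-weight-model")
tokenizer = AutoTokenizer.from_pretrained("org/open-weight-model")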
2. Parameter-efficient fine-tuning (LoRA)
Even standard fine-tuning of a massive language model requires immense VRAM to store optimizer states and gradients. To solve this hardware bottleneck, engineers must implement parameter-efficient fine-tuning (PEFT) techniques like low-rank adaptation (LoRA). By freezing 99 percent of the pre-trained weights and injecting small trainable adapter layers, LoRA drastically reduces memory overhead. This mathematical shortcut is ideal for deploying highly customized generative AI features, allowing teams to fine-tune models with billions of parameters on a single consumer-grade GPU.
python
from peft import LoraConfig, get_peft_model

# Wrap the frozen base model (loaded in step 1) with small trainable adapters.
config = LoraConfig(r=8, lora_alpha=32, target_modules=["q_proj", "v_proj"])
efficient_model = get_peft_model(base_model, config)
3. Warm-start embeddings/layers
When you must train specific network components from scratch, importing pre-trained embeddings ensures that only the remaining layers require heavy computational lifting. This warm-start approach slashes early-epoch compute because the model does not have to relearn basic, universal data representations. It should be used immediately in specialized domains, similar to how healthcare startups leverage AI to bridge the health literacy gap using pre-existing medical vocabularies.
python
# PyTorch warm-start example: copy in pre-trained embeddings, then freeze them
model.embedding_layer.weight.data.copy_(pretrained_medical_embeddings)
model.embedding_layer.weight.requires_grad = False  # freeze the parameter itself
Memory optimization and execution speed
4. Gradient checkpointing
Memory constraints are the primary reason engineers are forced to rent expensive, high-VRAM cloud instances. Introduced by Chen et al., gradient checkpointing saves memory by recomputing certain forward activations during backpropagation rather than storing them all. Engineers should deploy this technique when facing persistent out-of-memory errors, as it allows networks that are 10 times larger to fit on the same GPU at the cost of approximately 20 percent extra compute time.
python
# Enable in Hugging Face / PyTorch
model.gradient_checkpointing_enable()
5. Compiler and kernel fusion
Modern deep learning frameworks frequently suffer from memory bandwidth bottlenecks as data is constantly read and written across the hardware. Graph-level compilers such as XLA or PyTorch 2.0’s torch.compile fuse multiple operations into a single GPU kernel. This architectural optimization yields massive throughput improvements and faster execution speeds without requiring manual code changes. Engineers should enable compiler fusion by default on all production training runs to maximize hardware utilization.
python
import torch
# PyTorch 2.0 compiler fusion
optimized_model = torch.compile(model)
6. Pruning and quantization
Deploying a massive, full-precision 16-bit neural network into production often requires renting top-tier cloud instances that destroy an application’s profit margins. Algorithmic pruning removes mathematically redundant weights, while quantization compresses the remaining parameters from 16-bit floating-point values down to 8-bit or 4-bit integers. For instance, if a retail enterprise deploys a customer service chatbot, quantizing the model allows it to run on significantly cheaper, lower-memory GPUs without any noticeable drop in conversational quality. This physical reduction is critical for financially scaling high-traffic applications, directly lowering the carbon cost of an API call when serving thousands of concurrent users.
python
import torch
import torch.nn.utils.prune as prune
# 1. Prune 20% of the lowest-magnitude weights in a layer
prune.l1_unstructured(model.fc, name="weight", amount=0.2)
# 2. Dynamic Quantization (Compress Float32 to Int8)
quantized_model = torch.ao.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
Smarter learning dynamics
7. Curriculum learning
Feeding highly complex, noisy datasets into an untrained neural network forces the optimizer to thrash wildly, wasting expensive compute cycles trying to map chaotic gradients. Curriculum learning solves this by structuring the data pipeline to introduce clean, easily classifiable examples first before gradually scaling up to high-fidelity anomalies. For example, when training an autonomous driving vision model, engineers should initially feed it clear daytime highway images before spending compute on complex, snowy nighttime city intersections. This phased approach allows the network to map core mathematical features cheaply, reaching convergence much faster and with significantly less hardware burn.
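A minimal PyTorch sketch of the idea, assuming dataset is an indexable Dataset, difficulty holds one precomputed score per sample, and train_step is your existing training step (all three names are hypothetical):

python
import torch
from torch.utils.data import DataLoader, Subset

order = torch.argsort(torch.tensor(difficulty))  # easiest examples first

# Phase 1: train on the easiest half; phase 2: train on the full dataset.
easy_half = Subset(dataset, order[: len(order) // 2].tolist())
for phase_data in (easy_half, dataset):
    for batch in DataLoader(phase_data, batch_size=32, shuffle=True):
        train_step(batch)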
8. Knowledge distillation
Deploying a massive 70-billion parameter model for simple, repetitive tasks is a severe misallocation of enterprise compute resources. Knowledge distillation resolves this by training a highly efficient, lightweight “student” model to strictly mimic the predictive reasoning of the massive “teacher” model. Imagine an e-commerce company needing to run real-time product recommendations directly on a user’s smartphone, where battery and memory are strictly limited. Distillation allows that tiny mobile model to perform with the accuracy of a massive cloud-based architecture, permanently cutting inference costs and avoiding the AI accuracy trap.
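The standard recipe blends soft teacher targets with hard labels. A sketch, assuming both models produce raw logits:

python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: the student mimics the teacher's tempered distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: the student still learns the ground-truth labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard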
9. Bayesian optimization and hyperband
Standard grid search algorithms waste massive amounts of cloud budget by blindly running doomed network configurations to completion. Smarter hyperparameter search methods, like Bayesian optimization and Hyperband, act as a ruthless financial governor by mathematically predicting and pruning bad trials during the very first epochs. For instance, if a bank is tuning a fraud detection model, Hyperband will instantly kill configurations that show poor early accuracy, redirecting all compute power only to the most promising setups. To further bound these costs, teams can integrate my RES-Cost-Aware-Retraining-Framework, which is based on recent peer-reviewed IEEE research.
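A sketch using the Optuna library, one of several tools that implement Hyperband-style pruning (train_one_epoch is a hypothetical stand-in for your training and validation step):

python
import optuna

def objective(trial):
    lr = trial.suggest_float("lr", 1e-5, 1e-1, log=True)
    acc = 0.0
    for epoch in range(20):
        acc = train_one_epoch(lr)    # hypothetical train/validate step
        trial.report(acc, epoch)
        if trial.should_prune():     # kill doomed configs after early epochs
            raise optuna.TrialPruned()
    return acc

study = optuna.create_study(
    direction="maximize", pruner=optuna.pruners.HyperbandPruner()
)
study.optimize(objective, n_trials=50)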
Infrastructure and data efficiency
10. Model vs. data-parallel right-sizing
Improper cluster configuration creates massive network bottlenecks. If you split a moderately sized model across too many GPUs (model parallelism), the processors will spend more time waiting for data to travel across the network cables than actually doing math. Conversely, replicating the entire model across nodes (data parallelism) is highly efficient for processing massive datasets, provided the batch sizes are tuned correctly. A real-world FinOps team must dynamically right-size these parallel strategies based on the specific architecture, ensuring GPUs are never left idling while the network catches up.
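For the common data-parallel case, a minimal PyTorch sketch (model is assumed to already exist, and the processes are launched with torchrun):

python
import os
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Data parallelism: one process per GPU, each holding a full model replica.
dist.init_process_group(backend="nccl")
local_rank = int(os.environ["LOCAL_RANK"])
ddp_model = DDP(model.cuda(local_rank), device_ids=[local_rank])
# Pair with DistributedSampler so each replica sees a distinct data shard.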
11. Asynchronous evaluation
Standard training pipelines constantly pause the primary, expensive GPU cluster just to run routine validation checks on the model’s progress. Stopping a massive hardware cluster for twenty minutes every epoch to calculate accuracy metrics is a catastrophic waste of hourly rental fees. By implementing asynchronous evaluation, engineers can offload these validation checks to a separate, much cheaper CPU or low-tier GPU instance. Keeping the primary high-cost GPUs 100 percent busy is a mandatory architectural separation that helps mitigate the hidden operational costs of AI governance.
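A sketch of the pattern: the training job only writes checkpoints, while a separate cheap instance polls for them and runs validation (the checkpoint path and evaluate function are hypothetical):

python
import glob
import time

import torch

# Runs on a cheap CPU or low-tier GPU box, never on the training cluster.
seen = set()
while True:
    for path in sorted(glob.glob("checkpoints/*.pt")):
        if path not in seen:
            state = torch.load(path, map_location="cpu")
            print(path, evaluate(state))  # hypothetical validation routine
            seen.add(path)
    time.sleep(60)  # the expensive training GPUs never pause for this loop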
12. Intelligent data sampling and selection
Blindly processing massive datasets forces the optimizer to waste expensive compute cycles on highly redundant, low-quality information. If a visual model has already seen ten thousand identical photos of a standard stop sign, processing the ten-thousand-and-first photo provides zero mathematical value. Using algorithmic sampling to curate an information-rich subset achieves the exact same model performance at a fraction of the hardware cost.
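One simple selection heuristic, sketched in PyTorch: score every example with a cheap forward pass, then keep only the ones the current model still finds hard (model and dataset are assumed to exist):

python
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader, Subset

losses = []
with torch.no_grad():  # scoring pass only; no gradients needed
    for x, y in DataLoader(dataset, batch_size=256):
        losses.append(F.cross_entropy(model(x), y, reduction="none"))
losses = torch.cat(losses)

# Train on the most informative 25% instead of the full redundant set.
hard_idx = torch.topk(losses, k=len(losses) // 4).indices.tolist()
informative_subset = Subset(dataset, hard_idx)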
Conclusion
Implementing these 12 model-level deep cuts transitions your AI strategy from a brute-force hardware approach to an elegant, software-defined discipline. By combining efficient training loop configurations with the architectural redesigns outlined here, engineering teams can stop throwing expensive GPUs at poorly optimized networks. However, even the most optimized training code will fail if the surrounding enterprise infrastructure is fragile. True operational maturity requires scaling these localized efficiencies across robust deployment architectures, which you can begin building today using the implementation scripts in my open-source git repository.
This article is published as part of the Foundry Expert Contributor Network.
When cloud giants meddle in markets 8 May 2026, 9:00 am
Hyperscale cloud providers are doing what any aggressive buyer with deep pockets would do: purchasing enormous volumes of DRAM and high-bandwidth memory to feed AI factories, new cloud regions, and expanding platform services. By securing supply ahead of competitors, they lock in favorable terms and ensure their growth is not constrained by component scarcity. From their perspective, this is smart business. From the enterprise market’s perspective, it is something else entirely.
When the largest infrastructure providers absorb a disproportionate share of a finite supply of memory, prices rise for everyone downstream. Enterprises attempting to refresh on-premises servers, expand private clouds, or maintain hybrid architectures suddenly face a distorted market. Hardware lead times grow. Budget assumptions fail. Planned refreshes become much more expensive than expected. In some cases, the cloud begins to look attractive not because it is strategically superior, but because the economics of self-hosting have been artificially degraded.
Normal market fluctuation or not
Large-scale, even aggressive procurement is not inherently illegal. Companies are allowed to buy what they want, negotiate volume discounts, and use their scale as leverage. However, it strays into murkier territory when the same firms that dominate public cloud demand benefit most from the rising cost of the hardware their customers need to remain independent. If nothing else, we should at least acknowledge the optics. If your business model profits when enterprise buyers cannot afford to build or refresh their own infrastructure, that go-to-market strategy deserves scrutiny.
While there is no suggestion of a secret conspiracy or an overt plot to deprive enterprises of memory modules, the reality is more mundane and more dangerous. Market manipulation in technology often does not arrive with a smoking gun. It arrives through incentives, asymmetry, and scale. One group of buyers can afford to overpurchase, precommit, and outbid the rest of the market. Another group cannot. The result is a lawful but highly consequential distortion that changes architecture decisions across the industry.
Forced architecture decisions
Too many enterprises still treat the debate of cloud versus on-premises as a purely technical decision. It is not. It is a business decision, an operating model decision, a governance decision, and, increasingly, a supply chain decision. If the price of memory rises as hyperscalers vacuum up supply to support AI expansion, the cloud may appear cheaper in the short term. But cheaper under those conditions does not mean better. It means the baseline has shifted.
This is the classic trap. A CIO sees a delayed server refresh, inflated memory prices, and a tight budget. The cloud vendor offers a quick fix: move workloads, consume on demand, and skip capital costs. That might suit some workloads, but if a distorted component market drives the decision, the enterprise isn’t choosing an architecture. Rather, it’s reacting to economic pressure from an ecosystem that benefits from that reaction.
That is not a strategy. That is coercion wearing the mask of efficiency.
Enterprises should take a step back to consider a tougher question: What if infrastructure component prices reflected a more balanced market? If memory were readily accessible, refresh cycles predictable, and hardware economics unaffected by hyperscale demand, would this workload still be suited for the cloud? The answer may vary—sometimes yes, sometimes no. What’s crucial is that this decision stems from an analysis of workload characteristics, business agility needs, compliance requirements, latency considerations, resilience objectives, and long-term economics. It should not be driven by panic over temporary or artificially created scarcity.
Mature architecture is critical now
A workload should go to the cloud because it benefits from elasticity, global reach, managed services, and rapid innovation. It should stay on-premises because data gravity, cost predictability, performance, sovereignty, or specialized operational requirements make it a better fit. Hybrid models should exist because the enterprise has intentionally optimized for choice and risk distribution. None of those decisions should be forced by a memory market that has tilted so far toward hyperscaler consumption that normal enterprise procurement starts to break down.
There is a broader strategic danger here. Allowing distorted prices to push enterprises into the cloud risks future leverage. Once workloads, data, and skills settle into a provider’s ecosystem, reversing becomes costly. What starts as a fix for expensive memory can lead to long-term dependence on a platform whose pricing power only increases as customer exit options decrease.
When pressure becomes lock-in
The correct response is not an anti-cloud ideology. Cloud remains a critical part of modern enterprise IT. The correct response is discipline. Enterprises should separate temporary market distortion from durable architectural truth. They should revisit total cost models using multiyear assumptions, not just current-quarter hardware quotes. Preserve optionality through hybrid patterns where practical. Negotiate harder with vendors. Diversify suppliers where possible. And build internal architecture teams strong enough to say no when the market is trying to bully them into a decision disguised as modernization.
The cloud should win when it is the best architecture. It should not win because enterprises have been priced out of independence.
That is the real issue here. If hyperscalers are using their scale to consume memory supply in ways that raise the cost of on-premises computing, the resulting wave of cloud migration is not entirely organic. It may not be illegal, but it is certainly worth questioning. Essential supply chains should not become indirect instruments of architectural coercion. Enterprises that surrender to the pressure without rigorous analysis will make expensive mistakes.
In the end, the smartest organizations will treat today’s memory crunch for what it is: a market condition to be managed, not a strategic truth to be obeyed.
13 new critical holes in JavaScript sandbox allow execution of arbitrary code 8 May 2026, 12:18 am
Thirteen critical vulnerabilities have been found in the vm2 JavaScript sandbox package that could allow an attacker’s code to escape the container and do nasty things to IT environments. As a result, developers using this library in their applications are urged to update the software to the latest version, which is currently 3.11.2.
The warnings come in advisories from vm2 maintainer Patrik Simek.
vm2 is an open-source VM/sandbox that can run untrusted code with whitelisted Node.js built-in modules.
One of the more serious of the 13 vulnerabilities is CVE-2026-26956, a full sandbox escape with arbitrary code execution. Attacker code running inside VM.run() can obtain the host process object and run host commands with no cooperation from the host.
However, researchers at Socket told us in an email that the advisory about this escape says it has been confirmed only on Node.js 25.6.1, and requires a Node.js version with WebAssembly exception handling and JSTag support.
The highest-risk scenario, they said, would be an application using vm2 version 3.10.4 on Node 25, where attacker-controlled JavaScript is passed into VM.run().
“This is a narrow but high-impact vulnerability,” Socket research engineer Wenxin Jiang said in an email. “It does not appear to affect every vm2 deployment, because the advisory points to a specific vulnerable version and a specific Node 25/WebAssembly combination. But when those conditions line up, the security boundary fails completely: code that was supposed to be confined to the sandbox can reach the host process and execute commands. That is why teams using vm2 for user-supplied JavaScript should patch quickly and review what the sandboxed process can access.”
UPDATE: A day after this story was published, Socket issued new guidance saying the package and runtime scope of this particular vulnerability are broader than the original advisory suggests. That means, Socket said, some dependency scanners may incorrectly mark vulnerable deployments as unaffected. Socket testing found that the vulnerability affects all vm2 versions before 3.10.5 on Node.js runtimes that expose WebAssembly.JSTag, including Node.js 24.x.
Although it is not a vm2 maintainer, Socket said it is issuing a patch for developers who can’t immediately upgrade to the latest, fixed version.
Another serious hole is CVE-2026-44007, an improper access control vulnerability in the vm2 Node.js library that allows sandbox escape and execution of arbitrary operating system commands on the underlying host. Its advisory says that the vulnerability is in how the nesting:true option interacts with the legacy module resolver. This was patched in vm2 version 3.11.1.
“For CSOs, both [vulnerabilities] deserve urgent attention,” said Jiang, “but the second [the NodeVM nesting issue] may be the one more organizations need to audit for immediately.”
Both flaws, said Socket researchers, can turn sandboxed JavaScript into command execution on the host system. The difference is in how many environments are likely to be exposed. The Node 25/WebAssembly issue appears narrower because it depends on a specific vm2 version and a specific newer Node.js runtime behavior. The NodeVM nesting issue may be broader because it affects more versions and is triggered by a configuration pattern that some developers may have used intentionally.
Jiang added that both advisories point to a broader lesson: JavaScript sandboxes are difficult to secure, and small differences in runtime behavior or configuration can have major security consequences. “The first issue appears tied to a narrow Node 25/WebAssembly path,” he said. “This second issue is a configuration-driven escape involving NodeVM and nesting:true.
In both cases, the highest-risk users are organizations that run untrusted JavaScript and assume vm2 is containing it. Those [application development] teams should patch immediately and add stronger isolation around sandboxed workloads.”
‘Fragile security model’
These sandbox escape vulnerabilities demonstrate why sandboxing untrusted code inside a trusted process is a fragile security model, Adam Reynolds, senior security researcher at Sonatype, said in an email. “Once untrusted code runs inside a process with access to credentials and secrets, the underlying filesystem, the network, or with deployment privileges, a sandbox bypass can easily lead to a full system compromise,” he said.
Simply having vm2 installed somewhere in the dependency tree is not enough to make some of these vulnerabilities exploitable, he added. For example, an attacker generally needs the ability to execute crafted JavaScript (and in the case of CVE-2026-26956, crafted WebAssembly) inside a vm2 sandbox controlled by the vulnerable application. If the application never instantiates vm2, only uses it for trusted internal scripts, or does not allow attacker-controlled code execution at all, then there may be no realistic exploit path despite the presence of the dependency.
If an organization is running any applications impacted by vm2, they should be upgraded immediately, he said. To mitigate risk until the upgrade is complete, users can avoid Node.js 25 runtimes, disable or block WebAssembly entirely inside untrusted sandboxes, and prevent user-controlled WASM compilation/execution.
“Since future runtime updates could lead to similar issues, vm2 should be viewed as a convenience isolation layer as opposed to a hard security boundary,” he added.
In addition, Robert Enderle of the Enderle Group said that IT leaders who are serious about security should stop relying on software-level sandboxing for untrusted code. Start looking at moving those processes into hardened Docker containers or V8 Isolates, he advised.
This article originally appeared on CSOonline. It has been updated with new guidance from Socket.
The best new features in Python 3.15 7 May 2026, 8:32 pm
The first full beta of Python 3.15 has arrived, and it’s one of the most feature-packed Python releases in many a moon. Here’s a rundown of the biggest, boldest, and most important innovations, changes, and fixes.
Lazy imports
A long-requested feature, lazy imports allow imports to be processed only when they’re actually used by the program. Thus, for slow-importing modules that impose a large cost on a program’s startup time, you can now easily defer that cost until the module’s code is actually executed.
You can use lazy imports explicitly using the new lazy import syntax, but you can also force code with conventional imports to behave lazily, either programmatically or by using an environment variable. This makes it easy to make existing code take advantage of this feature without tons of rewriting. Best of all, there’s no drawback to making imports lazy: they otherwise behave exactly as intended.
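Based on that description, usage looks roughly like this (a sketch; json is just an example of a slow-to-import module):

lazy import json  # nothing is loaded at this point

def handler(payload):
    return json.loads(payload)  # json is actually imported on first use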
The frozendict built-in type
Only rarely does Python add a new data type, but this is a long-debated and long-desired addition: the frozen dictionary. The frozendict behaves like a regular dictionary, except that it’s immutable (you can’t add, remove, or change elements) and it’s hashable (so you can use it as a key in another dictionary, for instance).
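A quick sketch of the behavior described, assuming frozendict accepts the same constructor arguments as dict:

config = frozendict({"host": "localhost", "port": 8080})
config["port"]          # reads work like a normal dict
hash(config)            # hashable, so usable as a dict key or set member
config["port"] = 9090   # raises an error: frozendict is immutable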
The sentinel() built-in type
Another new addition to the language is intended to replace a common and problematic Python pattern: creating a unique sentinel object (as an alternative to None where None could be a valid value, for example) by using object(). The new syntax, sentinel("NAME"), creates unique objects that compare only to themselves via the is operator. These objects can be type-checked properly, and they have an informative representation instead of just a random object descriptor.
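A sketch of the pattern it replaces, using the new built-in:

MISSING = sentinel("MISSING")

def lookup(mapping, key, default=MISSING):
    value = mapping.get(key, default)
    if value is MISSING:  # None remains a perfectly legal stored value
        raise KeyError(key)
    return value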
A statistical sampling profiler
The long-standing cProfile module profiles Python code deterministically—that is, it tracks and records every single call. That makes it precise, but it also means a cProfile-tracked program runs far slower than normal. A new profiling module in Python 3.15, profiling.sampling, uses statistical sampling methods to garner useful information about performance at a fraction of the impact on the program’s speed. The existing cProfile profiler is still available—it’s not going away—but has a new alternate name, profiling.tracing.
An upgraded JIT
CPython’s built-in just-in-time (JIT) compiler debuted in Python 3.13. Its long-term goals are to make Python programs run faster without any changes to code, in something of the same way the alternate Python runtime PyPy can speed things up. And it comes without the cost of changing to a totally different interpreter with some of its own limitations.
The first couple of revisions of the JIT didn’t promise, or deliver, a great deal of additional speed, as they were more about laying a foundation for future improvements. With Python 3.15, though, the JIT is now showing an 8% to 13% geometric mean performance improvement over standard CPython, depending on the platform and workload. The biggest changes include a new tracing front end (to enable more speedups on more kinds of code), the use of register allocation for faster and more memory-efficient work, better machine code generated by the JIT, and additional optimizations such as eliminating reference counts for some classes of objects.
Better error messages
Error messages in Python have been made more precise, detailed, and useful over the last couple of versions, and Python 3.15 continues that work. The highlights:
- Suggestions for missing names (“x has no attribute ‘y’. Did you mean ‘xyz’?”) now include suggestions from the members of a given object, and not just the object itself.
- Suggestions now also cover checks for deleting attributes, not just accessing them.
- If the interpreter can’t come up with a suggestion for a method based on fuzzy name matching via Levenshtein distance, it consults a list of names commonly used in other languages for such methods. For example, if you attempt to use list.push() (a JavaScript method), the interpreter suggests .append(), the proper method for Python lists.
Type system improvements
The TypedDict class, which lets you create dictionaries with a predefined, type-hinted set of keys and values, adds support for two new arguments in its definition. The closed argument lets you specify whether only the declared keys can be used at runtime. The extra_items argument lets you allow additional keys at runtime, as long as their values are of a specified type.
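A sketch of both arguments as described (the exact spelling follows the draft typing semantics, so treat this as illustrative):

from typing import TypedDict

class Point(TypedDict, closed=True):  # only the declared keys are allowed
    x: int
    y: int

class Headers(TypedDict, extra_items=str):  # extra keys allowed, values must be str
    host: str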
The TypeForm type definition lets you represent the value that results from evaluating a type expression. With this, type annotations can be used in places where the type itself is being used as a value—for instance, variations on operations like typing.cast or even isinstance, or as part of how a third-party type-checking tool works.
Unpacking in comprehensions
This is another long-requested feature. If you wanted to completely unpack or “flatten” a nested object using a comprehension, you used to need a function like itertools.chain() or you would have to write a nested comprehension with an ugly syntax:
x = [[1,2,3],[4,5],[6]]
y = [a for b in x for a in b]
>>> [1, 2, 3, 4, 5, 6] # y
Unpacking in comprehensions using the star operator lets you save yourself a step:
x = [[1,2,3],[4,5],[6]]
y = [*a for a in x]
>>> [1, 2, 3, 4, 5, 6] # y
Unpacking with ** also works, for instance as a way to flatten and combine dictionaries:
dicts = [{'a': 1}, {'b': 2}, {'a': 3}]
y = {**d for d in dicts}
>>> {'a': 3, 'b': 2}
Finally, this kind of unpacking can also be used to form generator expressions:
(*x for x in ["ab","cd","ef"])
The expression above creates a generator that yields:
['a', 'b', 'c', 'd', 'e', 'f']
Reverting the incremental garbage collector
Finally, Python 3.15 takes an important U-turn. Python 3.14 featured a major change to its garbage collection system—an incremental garbage collector intended to reduce the amount of program-stopping time needed to collect garbage. Unfortunately, many users reported that the new garbage collector increased process memory usage, sometimes dramatically. Python 3.15 will revert to the older generational garbage collector used in Python 3.13 and earlier. The incremental collector may return in a future version, but not without additional work to keep this problem from resurfacing.
Teradata launches platform for enterprise AI agents moving beyond pilots 7 May 2026, 2:30 pm
Teradata has launched its Autonomous Knowledge Platform, a new flagship offering that brings together data, analytics, AI development, agent orchestration, and governance across cloud, on-premises, and hybrid environments.
The target customer is an enterprise that has moved beyond testing AI assistants and is now asking harder questions: which data agents can use, what actions they can take, how much they will cost to run, and who is accountable when something goes wrong.
The company said the platform builds on its existing database engine and governance infrastructure, while adding new capabilities and more tightly integrating existing ones, including AI Studio, the Tera natural-language workspace, Tera Agents, Elastic Compute on Teradata Cloud, and the upcoming Teradata Factory for on-premises AI workloads.
Teradata is entering a competitive market with this. Snowflake, Databricks, Microsoft, Oracle, and Salesforce are all trying to persuade customers that their platforms should become the operating layer for enterprise AI agents.
Strategic consolidation
Teradata is positioning the Autonomous Knowledge Platform as a product evolution rather than a simple rebranding of existing tools.
AI Studio is designed to help enterprises build and govern AI workflows, while Tera serves as a natural-language workspace. Tera Agents are intended to handle operational tasks such as sizing, tuning, provisioning, telemetry, and FinOps. The company is also adding Elastic Compute to Teradata Cloud and plans to offer Teradata Factory for on-premises AI workloads in regulated environments.
The launch brings together several capabilities under one broader platform, according to Greyhound Research’s chief analyst Sanchit Vir Gogia.
He described the platform as “a meaningful strategic consolidation rather than a clean-sheet invention,” pointing to Tera, prebuilt platform agents, Elastic Compute, and the company’s Global Identity framing as the most clearly new or newly emphasized pieces.
The harder problem for buyers, he said, is whether these systems can remain governed once agents begin operating continuously across enterprise environments.
Gogia said the prebuilt Tera Agents may be one of the more interesting parts of the launch because they focus on infrastructure operations rather than user-facing assistants. If they work as described, agents that manage sizing, tuning, compute, telemetry, and FinOps could help Teradata make the cost and efficiency case for the broader platform.
Addressing governance requirements
Governance is a key part of the pitch that Teradata would want enterprise buyers to notice. The company said autonomous agents require different controls from traditional analytics users because their activity can extend from repeated data queries to tool use and actions across enterprise systems.
Sumeet Arora, Teradata’s chief product officer, said every tool call made by an agent passes through Enterprise MCP, which Teradata describes as its governed context interface. The company said the system includes authentication, role-based and attribute-based access controls, schema validation, and a full audit trail.
Agents can invoke only the systems they are authorized to access, Arora said, while enterprises can configure human-in-the-loop approval workflows for actions they consider sensitive or high risk.
Teradata is also tying that governance model to its Connected Data Foundation, which it says allows data to be stored once and accessed consistently. The company said the architecture is designed to make interactions traceable across analytics, AI, and autonomous agents, supporting auditability and compliance.
That control layer could become increasingly important as enterprises move from AI assistants that generate recommendations to agents that act on business data.
“Enterprises are ready to put tightly scoped, policy-governed, high-value agents into production, but they are not ready for open-ended autonomy with vague permissions and fuzzy accountability,” Gogia said. “Bounded autonomy is a deliberate, governed expansion of what software can do without supervision. Open-ended autonomy is an aspiration in search of a control plane.”
Teradata said the Autonomous Knowledge Platform will be available on Teradata Cloud in Q3. Teradata Factory is expected to follow later this year, while Tera Claw, the company’s multi-agent orchestration mode, is scheduled to enter research preview by the end of the year. AI Studio and AI Services are available now.
The hidden cost of front-end complexity 7 May 2026, 9:00 am
Front-end development has never been more capable. Modern frameworks offer fast rendering pipelines, component composition, powerful tooling, and a growing ecosystem of libraries that promise to make building sophisticated applications easier than ever.
Yet many teams experience exactly the opposite — increasing difficulty. Applications grow harder to reason about. Features interact in unexpected ways. Simple changes ripple through unrelated parts of the system. Debugging becomes an exercise in tracing invisible dependencies across the application.
The tools improved, but the complexity remained.
Front-end complexity never ends
For many years, front-end complexity was blamed on frameworks. Each generation of tooling promised to fix the limitations of the previous one. The transition from server-rendered pages to client-side frameworks introduced a wave of architectural experimentation. Then came virtual DOM engines, reactive libraries, and increasingly sophisticated component systems.
The expectation was that better frameworks would eventually tame the complexity of large front-end applications. But something else happened instead.
Modern frameworks largely solved the original problems they set out to address. Rendering performance improved dramatically. Component architectures became predictable. Tooling and developer experience matured. And yet front-end systems continued to become more complex.
Part of the reason is that the role of the front-end has quietly expanded far beyond what it once was.
Front-end development is no longer just about HTML, JavaScript, and a framework. Modern front-end engineers are expected to understand how entire systems are designed and operated. They work with distributed APIs, CI/CD pipelines, containerized deployments, design systems, and complex build infrastructure. They make architectural decisions about data flows, caching strategies, and how large client-side systems evolve over time.
In many ways, the front-end has absorbed responsibilities that once belonged to multiple layers of the stack.
It would be tempting to describe this shift as front-end becoming full-stack. But that description still understates the change. Modern front-end work often feels like full stack multiplied several times over. Engineers are expected to think about application architecture, CI/CD pipelines, containerized deployments, design systems, and distributed data flows — all while building the user-facing layer of the system.
What used to be a presentation layer has gradually turned into an entire application platform running inside the browser.
The complexity we don’t see
Much of this complexity remains invisible during early development. A small application with a handful of components appears straightforward. State flows feel manageable. Data dependencies seem obvious.
But as systems grow, hidden complexity begins to accumulate.
A user interaction triggers a network request. The response updates several pieces of state. Derived values recompute. UI components re-render. Background synchronization updates cached data. Another feature subscribes to the same state and triggers its own updates.
Each individual step may appear reasonable. But together they form a web of dependencies that becomes increasingly difficult to understand.
This is the same pattern I explored recently in the context of event-driven front-end systems, where behavior becomes spread across chains of reactions rather than expressed through visible structure.
The problem is not that these interactions exist. Modern applications must coordinate them, after all. The problem is that our architectural thinking often treats them as implementation details instead of system design concerns.
Complexity moves up the stack
One of the defining patterns of software evolution is that complexity rarely disappears. It moves.
When frameworks simplified rendering, complexity shifted toward application logic. When component architectures improved modularity, complexity moved into state coordination. As applications grew more dynamic, complexity migrated into data synchronization and derived state.
Today, front-end architecture is less about rendering techniques and more about managing relationships between pieces of application state. This shift is subtle but profound.
We often still talk about front-end architecture as if it were primarily about frameworks, component patterns, or routing strategies. In reality, those decisions now represent only a small portion of the system. The real architecture lives in the structure of state and the rules that govern how that state evolves over time.
This shift points toward what can be described as a state-first front-end architecture, where application state becomes the primary structure of the system and UI emerges as a projection of that state.
Architectural challenges emerge
The central architectural challenge in modern front-end development is no longer rendering the UI efficiently. It is modeling application state in ways that keep large systems understandable as they grow.
When state relationships are unclear, complexity multiplies quickly. Features begin to interact through hidden dependencies. Data flows become unpredictable. Engineers spend more time tracing behavior than building new capabilities.
But when state relationships are clear, much of that complexity becomes manageable.
This is why many of the most important front-end innovations today revolve around state modeling. Whether through reactive primitives, declarative data dependencies, or derived state systems, the industry is slowly converging on the idea that the shape of application state defines the structure of the application itself.
A critique of current front-end thinking
Despite this progress, much of the front-end ecosystem still focuses on the wrong problems.
Discussions often revolve around rendering performance, component syntax, or framework comparisons. These debates can be useful, but they rarely address the issues that make large systems difficult to maintain.
Too much of the front-end conversation still optimizes for how fast we render the UI while ignoring whether we understand the system we are building.
Most large front-end failures do not come from choosing the wrong framework. They come from systems where state relationships are unclear, responsibilities are poorly defined, and application logic spreads across many disconnected pieces of the codebase.
In other words, they are architectural failures. Treating front-end development primarily as a framework choice obscures the deeper challenge of designing systems that remain understandable as they grow.
How front-end architecture will evolve
Over the next decade, I believe, front-end architecture will increasingly revolve around explicit state modeling.
Instead of building applications as collections of components reacting to scattered events and asynchronous updates, teams will begin structuring systems around clearer representations of application state and the relationships between those states.
UI will become a projection of state rather than the place where application logic is orchestrated.
This change in focus also changes what we expect from engineers, moving the role away from coordinating behavior toward defining the structure and intent of the system itself.
We can see these developments already in emerging patterns across the ecosystem. Reactive primitives, derived state models, and signal-based architectures all point toward the same direction: systems where state relationships are explicit and observable.
As front-end systems continue to scale, this approach will likely become less of an optimization and more of a necessity. Over time, teams that fail to model state explicitly will find their systems increasingly unmaintainable, regardless of the frameworks or tools they choose.
The future of front-end architecture
The hidden cost of front-end complexity is not measured in rendering speed or bundle size. It appears in the cognitive load required to understand how the system behaves. When engineers cannot easily reason about how data moves through the application, development slows down. Bugs become harder to isolate. Features become riskier to implement.
Reducing that complexity requires a shift in perspective. Front-end architecture must move beyond frameworks and focus on system design. The most important decisions are no longer about which library renders the UI fastest. They are about how we structure application state, how it evolves, and how its relationships remain visible to the engineers building the system.
As applications continue to grow in scale and capability, the shift to state modeling will define the next phase of front-end architecture. The future of the front-end is not about more powerful rendering engines. It is about designing systems whose state structures make complexity understandable.
Three skills that matter when AI handles the coding 7 May 2026, 9:00 am
Writing code has always been the most time- and resource-intensive task in software development. AI is changing that, and faster than most engineering organizations are prepared for. Tools like Claude Code and Cursor are already handling significant parts of code construction, freeing developers to spend more time on requirements, architecture, and design.
But that shift creates a new challenge nobody is talking about enough. As AI takes on the heavy lifting, the skills that matter most are moving upstream: how to provide the right context for a prompt, how to evaluate what the model produces, and how to understand a problem deeply enough that you can’t be fooled by a confident but wrong answer.
This piece explores those three skills and why developers who master them will have a significant edge over those who don’t.
Beyond coding: Mastering the art of the prompt
Software translation tools such as compilers and assemblers map a high-level description of code to a lower-level representation suitable for execution. Layering such tools led to the first dramatic improvements in coding productivity. AI prompt engineering represents the next generation of layered translation software that sits above the compiler and assembler. With AI code generation, the focus will move from writing good code to writing good prompts.
What constitutes a good prompt? The answer is good context. But what provides the best context? Most importantly, the developer must have a good understanding of the task the software must perform. Consider what’s required to write a typical software module that is part of a larger system. The prompt should cover the following (a schematic example follows the list):
- Expected inputs and outputs, i.e., the software’s core functionality
- Errors and exception conditions and how they should be handled
- Performance expectations
- Existing frameworks the software is surrounded by and the programming language used
- Interface expected by the user
- Required storage, compute, and network resources
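Here is a schematic example of such a prompt, written as a Python template string; every concrete detail in it (the module, the latency target, the technology choices) is hypothetical:

# A prompt skeleton covering the context items above (all details hypothetical)
PROMPT = """
Task: Implement a rate-limiting module for our API gateway.
Inputs/outputs: takes (client_id, timestamp); returns allow or deny.
Errors: on storage timeout, fail open and log a warning.
Performance: p99 decision latency under 2 ms at 10,000 requests/sec.
Environment: Python 3.12 service inside our existing FastAPI codebase.
User interface: expose check(client_id) -> bool to calling services.
Resources: Redis for counters; no additional network hops.
"""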
How system design informs context
For new initiatives, the context for this module should be taken from a detailed system design. The system design is essentially the blueprint for the software, created by breaking down the overall design into smaller, separate parts called modules. Each of the modules is responsible for performing a specific function that the software needs to deliver. In microservice implementations, domain-driven design breaks the business requirements into distinct subdomains that can be mapped to microservices.
Good system designs have a coherent architecture that provides a concept of operations: how the modules work together to meet the functional requirements. The best system designs result when well-understood requirements are combined with the right architecture.
By working backwards from the context a prompt needs, we arrive at the most important phases in the development life cycle: requirements analysis (what the software has to do) and architecture and system design (how it does it).
An example of a linear software life cycle. (Image: Confluent)
Although one design pass might work, often developers will need to iterate on their design to get the best outcome. This has been emphasized by many software experts over the years, but perhaps best put by the famous computer scientist Fred Brooks: “Plan to throw one away, you will anyway.”
An example of an iterative software life cycle. (Image: Confluent)
Iterative life cycles like spiral and evolutionary prototyping build the “throw one away” part into the process. Throwing something away sounds wasteful, but each iteration builds a deeper understanding of the problem: user requirements, architecture limitations, risks, and opportunities. Learning from each iteration greatly reduces the cost and complexity of the final product.
How AI tools can impact developer productivity
AI translation tools have the potential to make us more productive, but they also introduce the risk that we will become lazy and dependent on them. A recent study found that LLM-assisted essay writing reduced the cognitive effort users invested in their work relative to those who wrote essays unassisted by LLMs. This effect was termed “cognitive debt.”
I work with a strength trainer because modern life is too easy. It doesn’t require heavy lifting or strenuous activity. So we have to simulate it to improve both our strength and health. AI coding tools are like robots that do the heavy lifting of code generation for us. Without different challenges for us to overcome, we’ll get weaker.
Modern coding and AI tools
We need to find ways to keep our brains working hard while using AI tools, so that we have the capacity to think through the hard problems in our software design and its development work.
Writing optimized assembly code is no longer considered a good use of anyone’s time because compilers are so good at it. But until recently, writing good code for a compiler or run-time engine in Java, Go, or Python has been an important skill. In fact, these skills will remain important even as LLMs support code generation because developers will still need to review the generated code and verify that the LLM output meets certain standards. Experienced developers who have been writing code for years already have these skills. Both new and existing developers will be able to learn and expand their knowledge via interaction with LLM tools that expose them to new techniques and ideas.
We need to find the equivalent of strength training for coding that replaces some coding directly but retains understanding and judgement for the code the LLM produces. Where can we put our brains to work to avoid cognitive debt?
Avoiding the cognitive debt danger
First, study and understand the code generated from your prompt. Then re-write your prompt to improve the generated code, or rewrite the generated code if it’s close enough to what you need. LLMs behave statistically, so the generated code might not meet design goals. LLM gaslighting is real: quite often what it generates won’t run or isn’t correct, but the LLM will insist confidently that all is well. Don’t trust. Always verify.
LLMs can generate alternative designs from the same or slightly different prompt. Many developers are already leveraging this capability to explore the design space. Make sure you put the effort into understanding and modifying the code generated, and you’ll retain your coding skills.
Second, the focus of prompt engineering is to provide context to an LLM. So the key becomes creating that context, and understanding and judging the code that is generated. In addition to retaining their existing language and coding skills, software professionals should focus on other life-cycle elements, especially requirements, architecture, and design, so they have high-quality context for prompts.
Third, learn new languages and data models, and understand where each one fits best.
Fourth, build an understanding of best practices in code construction and design, independent of languages, so you can judge generated code using best practices that work across many different languages.
To stay competitive, you should understand that the bar will be rising. Historically, research has shown that the most productive individual developers are already about 10 times more effective than the least productive ones, and the best teams are about five times better than the weakest teams. AI tools could increase these differences by two or three times more, further widening the productivity gap. Many of these highly productive teams will work for your competitors.
AI will allow developers and teams that can crystallize requirements, architecture, and design to rapidly apply and evaluate different languages and data models to their project. AI will make iterative life cycles like spiral and evolutionary prototyping even more effective by allowing parallel development paths during each iteration. The key to success is leveraging AI in a way that allows you to focus on higher-level design issues while not losing control over code complexity. If you don’t learn these higher-level skills, developers and teams that do will be far more productive than you are.

Iterative life cycle with parallel paths and feedback loops.
Confluent
Accidental vs. essential complexity – why AI cannot be a silver bullet
Some have argued that AI will significantly improve software productivity. They envision a future in which software developers need only write a few prompts and an LLM will produce software that can replace existing SaaS products. But as Fred Brooks argued in a famous 1986 paper, “No Silver Bullet,” this is still impossible because of the two types of complexity that remain—accidental complexity and essential complexity.
Accidental complexity (or ‘accidents’)
Accidents are not inherent to the problem itself, but to the production process: the tools, languages, hardware limits, and implementation details we use to build the software. Historically, most productivity gains have come from reducing accidental complexity. AI can reduce accidental complexity too, but it brings challenges of its own, including hallucinations and poor-quality generated code that must be detected.
Essential complexity (or ‘essence’)
Essence refers to the inherent, unavoidable complexity of the problem itself. It is the challenge of “fashioning the complex conceptual construct”: the abstract, interlocking ideas, data relationships, algorithms, and behaviors that accurately model the real-world problem the software must solve.
AI cannot be a silver bullet because of software’s inherent complexity. Even if you could reduce the time for all the accidental tasks to zero, the essential tasks would still be your biggest challenge and take up most of your effort. Nevertheless, AI is a powerful tool. When used properly to manage complexity and explore the design space, it can significantly increase the productivity of teams and the quality of the software developed.
—
New Tech Forum provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.
MongoDB targets AI’s retrieval problem 7 May 2026, 8:06 am
For all their technical capabilities, large language models (LLMs) still have a memory problem. They often cannot retain context across conversations, and they don’t always have the frameworks needed to access relevant data, ultimately making their results unreliable and untrustworthy.
NoSQL database pioneer MongoDB is taking on this problem, releasing new persistent memory, retrieval, embedding, and re-ranking features, all integrated into one platform. The company is also introducing new secure connectivity options, open-source plugins, and other framework integrations to support agentic AI workloads.
Supporting agentic memory
“Unlocking the power of agents requires memory,” Pete Johnson, MongoDB’s field CTO of AI, said during a press briefing. “Just like human memory, a good agentic memory organizes knowledge. It helps agents retrieve the right knowledge based on context and learn to make smarter decisions and take optimized actions over time.”
To advance automated retrieval and persistent agent memory, the company is adding Automated Voyage AI Embeddings in MongoDB Vector Search. The capability is now available in public preview.
Fragmented AI stacks present another challenge. As builders grapple with them, they are often stuck paying what Ben Cefalo, MongoDB CPO, called the “synchronization tax.” To make data agent-searchable, developers must stitch together vector search, operational data stores, embedding models, and caches, then take the time to build complex data pipelines that keep everything in sync across systems.
But by natively integrating Voyage AI into Atlas, MongoDB has turned a “multi-week engineering project into a two-minute configuration,” Cefalo claimed. Developers can ship reliable, trustworthy agents much more quickly and easily, and “without all the complex data plumbing.”
Dovetailing with this, the company is announcing the general availability of a LangGraph.js Long-Term Memory Store. Cefalo pointed out that JavaScript and TypeScript users comprise the world’s largest builder communities, but the company’s existing Python integration formerly limited these groups to short-term and single-threaded context.
Now, they can use MongoDB to give agents persistent long-term memory so they retain preferences and interaction history across conversations “on the data pipeline they already trust.” This underscores MongoDB’s longstanding “run anywhere” strategy, Cefalo explained.
Introducing embedding and re-ranking
Agents must be able to retrieve information based on context, and learn from and optimize that process, all while minimizing LLM token use as much as possible, Johnson pointed out, because without consistent, high-accuracy retrieval, users lose trust.
Most users incorrectly blame this lack of trust on the LLM, and “the instinct is to upgrade to the latest, most extensive, expensive model,” he said. Ultimately, though, it’s a retrieval problem: Models can only act on the information they are given; if data is inaccurate, out of date, or lacking context, the output will ultimately be wrong, leading to potentially “disastrous” business consequences.
“That’s exactly the sentiment that we hear from customers: They’re excited about AI agents, but they’re nervous about putting them in front of their customers if the results are inconsistent, irrelevant or flat out wrong,” Johnson said.
The solution is getting the LLM the right information it needs upfront; this is where embedding and re-ranking models come in. MongoDB has been integrating these technologies into its Voyage 4 family of models, building off the company’s acquisition of Voyage AI in February 2025.
As Johnson explained it, embedding models convert unstructured information like PDFs, images, videos, and audio into vectors, which capture and map data meaning and group related data. “That’s how you can get semantic-style searching for things that aren’t exact keyword search.”
Re-rankers take this a step further. After results are retrieved by the embedding model, re-rankers compare them to the user’s query. This provides more relevant, grounded responses. “Think of the embeddings as a wide net, and the re-ranker hand picks the best fish out of it,” Johnson explained.
Both embedding and re-ranking capabilities are natively integrated into MongoDB, so enterprise customers don’t have to switch between vendors and end up “Frankensteining a stack that creates an operational headache,” he said.
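To make the wide-net-then-hand-pick flow concrete, here is a minimal TypeScript sketch using the MongoDB Node.js driver and an Atlas Vector Search aggregation stage. The index name, field names, and the embedQuery and rerank helpers are illustrative assumptions standing in for whatever embedding and re-ranking provider you wire in; treat this as a sketch of the pattern, not official sample code.

import { MongoClient } from "mongodb";

// Stand-ins (assumptions) for a real embedding and re-ranking provider.
// In production these would call a service such as Voyage AI.
async function embedQuery(text: string): Promise<number[]> {
  throw new Error(`replace with a real embedding call for: ${text}`);
}
async function rerank(query: string, docs: string[]): Promise<number[]> {
  // Should return candidate indexes ordered from most to least relevant.
  throw new Error(`replace with a real re-ranking call for: ${query}`);
}

async function searchProducts(uri: string, query: string) {
  const client = new MongoClient(uri);
  try {
    const products = client.db("shop").collection("products");
    // Cast the wide net: vector search retrieves semantically similar docs.
    const candidates = await products
      .aggregate([
        {
          $vectorSearch: {
            index: "product_embeddings", // assumed index name
            path: "embedding",           // assumed vector field
            queryVector: await embedQuery(query),
            numCandidates: 100,
            limit: 20,
          },
        },
        { $project: { _id: 0, description: 1 } },
      ])
      .toArray();
    // Hand-pick the best fish: re-rank the candidates against the query.
    const order = await rerank(query, candidates.map((c) => c.description));
    return order.slice(0, 5).map((i) => candidates[i]);
  } finally {
    await client.close();
  }
}

The point is the shape of the flow: a broad vector retrieval stage feeding a narrower re-ranking pass, all against the operational database.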
Johnson also underscored the fact that the decisions technical leaders make about their data platform now will either accelerate their AI development or delay it by months or years. “This isn’t a question for the future, it’s a question for today, because the success of that development depends on the data platform they’re working with,” he said.
Database enhancements and new integrations
In addition to offering new memory capabilities, MongoDB is strengthening its data foundation. The latest version of its database, MongoDB 8.3, is now generally available, and represents a “deep architectural hardening” of its core offering to support faster AI workloads at lower cost, Cefalo explained.
Query expressions (instructions for retrieving and organizing data) are integrated natively into MongoDB, so developers don’t have to rely on external toolboxes; transformation logic stays inside the database. “It’s SQL-style data transformations for data engineers,” said Cefalo.
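The exact query-expression syntax may differ, but the idea of keeping transformation logic inside the database can be illustrated with MongoDB’s long-standing aggregation pipeline; the collection and field names in this TypeScript sketch are invented.

import { MongoClient } from "mongodb";

// A minimal sketch of in-database transformation: filter, group, and sort
// all run inside MongoDB, so raw documents never travel to application code.
async function revenueByRegion(uri: string) {
  const client = new MongoClient(uri);
  try {
    const orders = client.db("sales").collection("orders");
    return await orders
      .aggregate([
        { $match: { status: "complete" } },                          // WHERE
        { $group: { _id: "$region", revenue: { $sum: "$total" } } }, // GROUP BY
        { $sort: { revenue: -1 } },                                  // ORDER BY
      ])
      .toArray();
  } finally {
    await client.close();
  }
}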
Further, MongoDB is announcing an Atlas integration with Feast. The widely adopted open-source feature store provides AI and LLM apps with structured data during training and inference. This means machine learning (ML) teams can operate without having to play a “high stakes game of database musical chairs” that requires them to move data from their primary training database to a separate system for real-time inferencing, said Cefalo.
“This database sprawl doesn’t just create operational overhead, it creates drift, where the model trains on one version of reality but makes predictions on another,” he said. This can be complex and expensive, and a hurdle to scaling AI.
Finally, to support security and compliance, MongoDB is providing cross-region connectivity to MongoDB Atlas from AWS PrivateLink, which supports connectivity between AWS services, virtual private clouds (VPCs), and on-premises networks without exposing traffic to the public internet. This integration, now generally available, provides a “single, auditable model” that simplifies compliance and maintains strong security posture for organizations operating across multiple regions, Cefalo explained.
Designing front-end systems for cloud failure 6 May 2026, 9:00 am
Modern frontend applications rely on cloud services for far more than basic data fetching. Authentication, search, file uploads, feature flags, notifications and analytics often depend on APIs and managed services running behind the scenes. Because of that, frontend reliability is closely tied to cloud reliability, even when the frontend team does not directly own the infrastructure.
This is often one of the biggest mindset shifts for frontend engineers. We often think about failure as a total outage where the whole site is down. In practice, that is not what most users experience. More often, the interface is partially degraded: A dashboard loads but one panel is empty, a form saves but the confirmation never arrives, or a file upload stalls while the rest of the page still appears normal.
That is why I think frontend resilience deserves more attention in day-to-day engineering conversations. The goal is not to prevent every cloud issue. That is rarely realistic. The more practical goal is to build interfaces that stay usable, calm and understandable when cloud services or other dependencies hiccup. Reliability guidance from major cloud platforms is useful here because it frames reliability as the ability of a workload to perform correctly and recover from failure over time, not just remain available in ideal conditions. Those reliability design principles offer a broader cloud perspective that can inform frontend decisions.
Why cloud failures matter to frontend engineers
Cloud platforms are designed for scale and availability, but they still depend on many moving parts. Requests can fail because of temporary network instability, slow downstream services, expired credentials, rate limiting or short-lived infrastructure problems. Sometimes the issue is not in the primary API at all. It can be in storage, identity, messaging or another supporting service that the user never sees directly.
From a frontend perspective, the important lesson is that failures are often partial, not absolute. A product list may load correctly while recommendations fail. Login may work while user preferences do not. Search may return results, but analytics events may silently drop. When teams assume every dependency either succeeds together or fails together, they tend to create brittle interfaces that turn one bad response into a blank screen.
Resilient frontend systems often start with a simpler question: What is the minimum useful version of this screen if one dependency is unavailable? That question changes how you design loading states, component boundaries and recovery behavior. It also encourages a more honest relationship between frontend and backend teams, because the frontend is designed for real operating conditions instead of perfect demos.
Designing for graceful degradation in real products
One practical reliability habit in frontend systems is separating critical features from non-critical ones. Critical features are the parts users need to complete their main task. Non-critical features add richness, context or convenience, but the product can still provide value without them for a short period. On an account page, profile details and security settings may be critical. A recent activity panel or personalized recommendations may be useful, but not essential in the moment.
That distinction helps teams decide where to invest in stronger fallback behavior. If a non-critical feature fails, the interface can hide the section, show cached data or swap in a simpler default state. If a critical feature fails, the user needs a much clearer recovery path. That might mean preserving unsaved input, offering a visible retry action or falling back to a server-confirmed state instead of leaving the UI in limbo.
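As a sketch of what that fallback behavior might look like for a non-critical panel, the helper below tries a live fetch, falls back to a last-known copy in localStorage, and finally to an empty default; the endpoint and cache key are illustrative.

// Graceful degradation for a non-critical panel: live data if possible,
// cached data if not, and a clearly flagged empty state as a last resort.
async function loadRecentActivity(): Promise<{ items: string[]; stale: boolean }> {
  const CACHE_KEY = "recent-activity";
  try {
    const res = await fetch("/api/activity/recent");
    if (!res.ok) throw new Error(`HTTP ${res.status}`);
    const items: string[] = await res.json();
    localStorage.setItem(CACHE_KEY, JSON.stringify(items)); // refresh the fallback
    return { items, stale: false };
  } catch {
    // Last-known data, marked stale so the UI can say so honestly.
    const cached = localStorage.getItem(CACHE_KEY);
    return { items: cached ? JSON.parse(cached) : [], stale: true };
  }
}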
Retries are part of that picture, but they need to be used carefully. Common cloud reliability guidance emphasizes controlled retries, exponential backoff and jitter rather than aggressive repeated requests. That matches what I have seen from the frontend side as well. Retrying a read request after a short delay can smooth out transient failures. Retrying a write action without safeguards can create duplicate submissions, conflicting state or user confusion. A frontend should treat retries as a deliberate recovery tool, not a reflex.
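A minimal retry helper along those lines might look like the following sketch: exponential backoff with full jitter, a capped number of attempts, and no retries for client errors. The specific numbers are illustrative, not a recommendation.

// Retry an idempotent read with exponential backoff plus full jitter.
async function fetchWithRetry(url: string, maxAttempts = 4): Promise<Response> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const res = await fetch(url).catch(() => null); // network failure: retry
    if (res?.ok) return res;
    // A 4xx means the request itself is wrong; retrying won't help.
    if (res && res.status >= 400 && res.status < 500) return res;
    if (attempt < maxAttempts - 1) {
      // Random delay in [0, 250ms * 2^attempt) spreads out retry storms.
      const delay = Math.random() * 250 * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw new Error(`GET ${url} failed after ${maxAttempts} attempts`);
}

Note that this helper is only safe for reads; as the guidance above suggests, write actions need idempotency keys or server-side deduplication before any automatic retry.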
The user experience matters just as much as the retry policy. If the application is attempting recovery in the background, the interface should say so. Endless spinners are rarely reassuring. Clear language such as “Still trying to load your recent activity” or “We’re retrying your request” makes the system feel more transparent. It also gives users a reason to wait instead of assuming the product is frozen.
This is also where partial rendering becomes powerful. Interfaces are often more resilient when they isolate failures instead of spreading them. If one widget fails, the rest of the dashboard should still render. If one secondary API is unavailable, the page should still load primary content. A resilient frontend should not require every backend dependency to succeed perfectly before it shows something useful. That design choice often matters more than any individual recovery tactic.
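One way to express that isolation in plain TypeScript is to load each panel independently and let Promise.allSettled absorb individual failures, so one rejected dependency never blanks the page; the panel names and endpoints here are invented.

// Failure isolation: each panel loads on its own, and a rejected panel
// simply keeps its fallback markup instead of taking down the dashboard.
async function loadPanel(panel: string, url: string) {
  const res = await fetch(url);
  if (!res.ok) throw new Error(`${panel}: HTTP ${res.status}`);
  return { panel, html: await res.text() };
}

async function renderDashboard() {
  const results = await Promise.allSettled([
    loadPanel("account", "/api/account"),         // critical content
    loadPanel("activity", "/api/activity"),       // non-critical
    loadPanel("suggestions", "/api/suggestions"), // non-critical
  ]);
  for (const result of results) {
    if (result.status === "fulfilled") {
      const target = document.getElementById(result.value.panel);
      if (target) target.innerHTML = result.value.html;
    }
    // Rejected panels are left showing their default or cached state.
  }
}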
What resilient failure states look like in practice
Good failure handling is not only technical. It is also a communication problem. When users encounter an issue, they need to know what failed, what still works and what they can do next. Generic messages like “Something went wrong” usually fail on all three counts. They are vague, they do not reduce anxiety and they do not support recovery.
A better message is specific without becoming overly technical. For example: “We couldn’t load your recent activity right now. Your account details are still available. Please try again in a few minutes.” That kind of message reassures the user that the whole product is not broken and gives them a practical next step. It also reflects a more mature product mindset: Failures should be contained, explained and recoverable.
One area where this matters a lot is form-heavy workflows. Frontend systems can lose user trust quickly when a submission fails and the user loses everything they typed. Preserving user input should be a baseline expectation for critical flows. Even basic browser capabilities and web APIs can support better failure handling here. For example, the Fetch API and AbortController give frontend teams a cleaner way to manage request lifecycles, cancel stale requests and avoid leaving the interface stuck in outdated loading states. These are small implementation details, but they often shape whether the product feels reliable under stress.
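For example, a search box can cancel its previous in-flight request whenever a new query is issued, so a slow, stale response never overwrites fresher results. A sketch with an illustrative endpoint:

// Cancel the previous request before starting a new one, so responses
// can only arrive for the query the user currently cares about.
let inFlight: AbortController | null = null;

async function search(term: string): Promise<string[] | null> {
  inFlight?.abort();
  inFlight = new AbortController();
  try {
    const res = await fetch(`/api/search?q=${encodeURIComponent(term)}`, {
      signal: inFlight.signal,
    });
    if (!res.ok) throw new Error(`HTTP ${res.status}`);
    return await res.json();
  } catch (err) {
    // An aborted request is routine housekeeping, not a user-facing error.
    if (err instanceof DOMException && err.name === "AbortError") return null;
    throw err;
  }
}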
The same principle applies to fallback data. In some cases, showing cached or last-known information is more helpful than showing nothing at all. In others, it is better to hide a non-essential section until the dependency recovers. There is no single universal pattern. What matters is choosing a failure state that matches user intent. If the user is trying to complete a task, support task completion. If the user needs context, preserve as much trustworthy context as possible.
Cloud failures will continue to happen, even in mature environments. For frontend engineers, resilience is less about dramatic disaster handling and more about small design decisions made early: Isolating failures, protecting user work, controlling retries, rendering partial content and writing clearer recovery messages. When those decisions are made well, users may never know what failed behind the scenes. They only notice that the application remained usable, understandable and calm under pressure.
This article is published as part of the Foundry Expert Contributor Network.
No, AI won’t destroy software development jobs 6 May 2026, 9:00 am
I’m not even remotely worried about AI eliminating software development jobs. In fact, I’m pretty sure there will soon be a boom in both software development jobs and the amount of software available to everyone.
People have always worried about automation causing massive unemployment. Each time a breakthrough happens, folks are sure that “it will be different this time.” Only it never is different.
But the worriers persist.
It’s paradoxical
You can tell them all about the Jevons paradox — the observation that as something becomes more efficient, demand for that more efficient thing increases rather than decreases. In the mid-19th century, William Jevons noticed that the use of coal became more efficient. Humans figured out how to get more heat and energy out of less and less coal. The common belief was that, because less coal was needed for the same amount of energy or heat, there would be less demand for coal as a result. Everyone was concerned that coal miners would lose their jobs. But Jevons noticed that demand for coal actually went up, as the more efficient processes led to more widespread uses for coal.
The same thing happened half a century earlier with the introduction of the automated loom. Despite fears that the power loom would destroy jobs for weavers, it made the production of clothing and other textile products cheaper, increasing demand for those products and increasing employment in the textile industry.
This phenomenon can be seen over and over again. Spinning jennies, automobiles, computers, robotic manufacturing, tractors, sewing machines, and countless other inventions all caused widespread fears of job loss, but the fears were never really realized. When a company can suddenly produce 10 times more with the people it has, it has always chosen to produce 10 times more, not cut its workforce by 90%. Yet here we are, with everyone sure that AI is going to put us all out of work.
It’s not going to happen — especially in the software development realm. You know what is going to happen? The same thing that always happens. That which is automated and made more efficient will find new and different ways to express itself. Existing software will suddenly be vastly more useful as the backlog of features can be implemented. New software ideas that were previously too complex for humans to write and manage will be created.
Marc Andreessen was never so right as when he said that “software is eating the world.” Sure, software was eating a lot when humans wrote every line of code. But now that code can be written 10 or 100 times faster, software’s appetite will go from hungry to ravenous. The work that can be done has expanded rapidly. And that work will be done because there is too much money in building what we have always wanted but that humans alone could not deliver.
A positive-sum game
The world is never a zero-sum game, but humans seem hard-wired to view the world that way. Only now, with AI, we have what Daniel Jefferies delightfully calls “Fear Mongering as a Service,” running rife through our industry. Yet while all the Chicken Littles decry the job market falling out of the sky, job postings continue to actually increase, and it is becoming harder to fill those jobs.
Now that doesn’t mean the market isn’t shifting. The demand is strong for experienced engineers and weaker for entry-level jobs, a situation that is creating a bit of a paradox all by itself. The skills that worked for many years may not be as valuable going forward. Writing good code and getting an AI agent to write good code are two different but related skills.
Now, I recognize that the debate on this matter is strong and that there are many folks who will take the opposing side. Some will argue that software development shops overhired during Covid and that the resulting adjustments are going to put a damper on things. Others argue that the increase in job postings is illusory, with AI generating many of the new postings. Could be. But it doesn’t matter.
So go ahead and panic if you want — update your résumé, run around flapping your arms, and cry that the sky is falling. Me? I’ve seen the PC “destroy” mainframe jobs, the internet “destroy” off-the-shelf software, open source “destroy” commercial software, and offshoring “destroy” the American programming market. Things are going pretty well considering all this “destruction.” I can’t wait for AI to “destroy” our current developer market.
Building AI apps and agents with Microsoft Foundry 6 May 2026, 9:00 am
At first glance, Microsoft Foundry looks like a big grab bag of every AI-adjacent service that Microsoft has offered in the last decade, plus some new ones. In Microsoft’s own words, “Foundry consolidates several previous Azure AI services and tools into a unified platform” and “unifies agents, models, and tools under a single management grouping.”
Microsoft Foundry helps application developers to build and deploy agents, which may use models and tools. It also helps machine learning (ML) engineers and data scientists to fine-tune models, run evaluations, and manage model deployments. Finally, it helps IT administrators and platform engineers to govern AI resources, enforce policies, and manage access across teams. It isn’t quite a floor wax and a dessert topping, but it does try to serve three distinct audiences.
Key capabilities of Microsoft Foundry for building agents include multi-agent orchestration, workflows, a tool catalog, memory, knowledge integration, and publishing. Key capabilities for operation and governance include real-time observability, centralized AI asset management, and enterprise controls.
Microsoft Foundry competes directly with the Google Cloud Agent Development Kit (ADK), Amazon Bedrock AgentCore, and Databricks Agent Bricks. Additional competitors include the OpenAI Agents SDK, LangChain/LangGraph, CrewAI, and SmythOS.
Microsoft Foundry Agent Service
The Microsoft Foundry Agent Service is a helpful platform that guides you through the development, deployment, and scaling of AI agents. These agents use large language models (LLMs) to handle tricky requests, connect with other tools, and do tasks on their own.
The service groups agents into three main types: prompt agents, which are easy to set up and great for quickly trying out ideas; workflow agents, which are visual or YAML-based tools that make automating several steps easier; and hosted agents, which are containers that let you manage your own code as well as frameworks like LangGraph.
Microsoft Foundry also has a model catalog with both new and well-known models, and a tool catalog that includes web search, memory management, and code execution.
The platform uses guardrails and controls to keep things secure, such as stopping prompt injection. It also supports private networks, versioning, infrastructure management, and full monitoring.

The Microsoft Foundry Agent Service accepts inputs from user messages, system events, and agent messages. The agent combines a large language model with instructions and tool calls. Tools can retrieve data, perform actions, and provide memory. Agents can send agent messages and emit structured output.
Microsoft
Microsoft Foundry Models
Microsoft Foundry Models is a collection of AI models, including foundation models, reasoning models, and models tailored for specific domains, brought to you by Microsoft and other companies. These models are grouped into those you can buy directly from Azure and those shared by the community. This grouping helps you figure out how much direct support Microsoft will give you and how well the models will fit into your existing cloud setup.
Models from Microsoft come with official service level agreements and are well-integrated, while models from partners like Anthropic and Meta let you explore innovations under their own rules.
You can use the platform in two main ways: managed compute and serverless deployments. (You can check out Microsoft’s comparison table below.)
Managed compute means you get your own virtual machines where the model weights are stored, which is great for doing complex stuff like fine-tuning and keeping track of the model’s life cycle using Azure Machine Learning, but the VMs incur costs whenever they are active. Serverless deployments give you easy access to Microsoft’s models through APIs, and usually you pay based on how many tokens you use, not how much hardware you use.
To keep things safe, the platform has built-in content safety filters that watch out for anything bad, and you can (if necessary) lock down your data by turning off public network access and using private endpoints for all your hub-based project work.
When selecting models, you may want to consult the Foundry model leaderboard (screenshot below), which is found in the Discover/Models tab of “new” Foundry.

Comparison of managed compute and serverless deployment options for models on Microsoft Foundry. Managed compute deployments are billed by virtual machine core hours; serverless deployments are billed by usage measured in tokens.
Foundry

Foundry model leaderboard. You’ll note that the highest-quality models are not necessarily the safest, fastest, or cheapest. You can sort this chart by any column.
Foundry
Microsoft Foundry Control Plane
The Microsoft Foundry Control Plane is essentially a dashboard that helps you keep an eye on all your AI agents, models, and tools in one place. It brings together all the admin stuff from different projects into a single view, so you can easily see how everything is doing. Plus, it lets you keep tabs on performance, costs, and compliance from just one spot.
The Control Plane breaks down the work of running things into different areas, like the Assets pane. The Assets pane keeps a list of all your AI resources, so you can find them easily and see how they’re doing. It also looks at what’s happening when they’re running and gives you a health score to spot any problems early. The Compliance pane sets up rules for the whole company using Microsoft Defender and Purview. It collects security alerts and policy violations and helps you fix them all at once to make sure everyone’s using the agents safely and following the rules.
The Admin and Quota panes keep an eye on who can do what and how much they’re using. This helps you manage costs and make sure no one’s hogging the resources. The Control Plane also keeps things running smoothly by using tools that automatically check for weaknesses, like prompt injection, and gives you tips on how to improve your prompts based on what’s happening.
Observability, evaluation, and tracing
Observability in Foundry Control Plane is a toolkit for keeping an eye on and fixing systems as they run, all while making sure the outputs are high-quality and safe. In the Microsoft Foundry world, this is divided into three main areas: evaluation, production monitoring, and distributed tracing.
First up, evaluation is detective work, in which evaluators assess qualities such as response coherence and safety, checking for harmful content or hidden biases. You can add your own evaluators to make sure the checks fit your specific needs, and built-in tools give you an idea of how well your system is doing.
Then there’s production monitoring, which is like having a live camera on your apps. It connects with Azure Monitor to track what’s happening, including resource usage, latency, and output quality. If something goes wrong, you get alerts so the team can fix it fast.
Finally, distributed tracing uses OpenTelemetry to show you exactly how your AI agents are working. This gives you a clear picture, so you can follow complex reasoning chains or spot where the app is slowing down. You can use these tools from the start, checking your models and making sure everything is good before you launch, and even spotting changes after deployment.
Developer experience
Microsoft Foundry allows you to develop agent applications in four programming languages: Python, C#, TypeScript/JavaScript, and Java. That said, the vast majority of samples and solution templates are in Python, typically with Microsoft Bicep setup files for Azure. You can use Visual Studio Code or another IDE of your choice. You need project and AI permissions on Azure. You will also need the Azure CLI (az) and the Azure Developer CLI (azd) to use many of the solution templates. If you use Visual Studio Code, you’ll need the Foundry extension. In the unlikely event that you don’t already have Git installed in your environment, you should install it now, because you’ll want it to clone Foundry SDK sample repos.
If you wish, you can configure Claude Code for Microsoft Foundry. That lets you run the coding agent on Azure infrastructure while keeping your data inside your compliance boundary. In this configuration, unfortunately, you have to run Claude models through their Azure API and pay by the token, even if you have a flat-rate Claude subscription.
There are currently over a dozen AI templates (or 18, if you log into “new” Foundry and look at the Solution templates under Discover) available to help you get started with Microsoft Foundry. The Get Started with Chat template is a good first project. (See the architecture diagram below.)
You can use on-demand Foundry playgrounds for rapid prototyping, API exploration, and technical validation, to experiment with models, and to validate ideas. Experimenting with playgrounds is recommended prior to writing production code. There are four different playgrounds, one each for models, agents, video, and images.
LangChain is a framework for developing applications powered by language models. It enables language models to connect to sources of data, and also to interact with their environments. LangGraph extends LangChain’s capabilities for building multi-actor or agentic applications by orchestrating agents. You can combine LangChain and LangGraph with Microsoft Foundry models and other capabilities using the langchain-azure-ai Python package.
There are two kinds of Foundry agent workflows, declarative and hosted. Declarative agent workflows define predefined sequences of actions for your agents using YAML configurations rather than explicit programming logic; you can generate code from the YAML once you’ve tested it. Hosted workflows let multiple agents collaborate in sequence, each with its own model, tools, and instructions.
The Foundry MCP Server (preview) is a cloud-based version of the Model Context Protocol (MCP). It provides a collection of tools that allow your agents to interact with Foundry services by reading and writing data, all without needing to connect directly to the back-end APIs.
Fireworks AI is integrated with Microsoft Foundry on a preview basis. It allows you to use the latest open-source models and bring your own models onto Fireworks’ GPU-backed infrastructure.

This “Get started with AI chat” solution deploys a web-based chat application with AI capabilities running in Azure Container App. It uses Microsoft Foundry projects and Foundry Tools to provide intelligent chat functionality, and supports retrieval-augmented generation (RAG) using Azure AI Search. It lacks any significant security features.
Microsoft
Microsoft Foundry SDKs
Microsoft Foundry currently offers four SDKs, each implemented in four programming languages (Python, C#, TypeScript/JavaScript, and Java). When choosing the best development path for your project, select the Microsoft Foundry SDK if you are building applications that use agents, evaluations, or unique Foundry-specific features. If your priority is maintaining maximum compatibility with the OpenAI API or accessing Foundry direct models via Chat Completions, the OpenAI SDK is the better choice. For specialized tasks involving AI services such as Azure Vision, Azure Speech, or Azure Language, use the Foundry Tools SDKs. Implement the Agent Framework when your goal is to orchestrate multi-agent systems through local code.
Guardrails and Responsible AI
Implementing guardrails improves model and agent safety by detecting harmful content, enhancing user interactions, and reducing AI output risks. Microsoft Foundry currently offers guardrails that can be applied to one or many models and one or many agents in a project. As has been the case for years, the risks that are handled are categorized as hate, sexual, violence, and so on, and the severity level threshold settings for content risks range from off to high. Guardrails can be applied at four intervention points: user input, tool call, tool response, and output.
To conform with Microsoft’s Responsible AI policy, Microsoft recommends that Foundry developers discover agent quality, safety, and security risks before and after deployment; protect, at both the model output and agent runtime levels, against security risks, undesirable outputs, and unsafe actions; and govern agents through tracing and monitoring tools and compliance integrations.
Trying the Foundry Agents Playground
The 2024 predecessor to Microsoft Foundry was Azure AI Studio. One of the parts of AI Studio that I found most useful was the Playground, where you could find dozens of examples of effective instructions/prompt/model combinations in addition to the actual Playground for testing out your own. I wrote about this in my guide to generative AI development. The Playground has since evolved for agents, but the examples seem to have fallen by the wayside in the transition to Microsoft Foundry. The new playground is found under Build/Agents/Playground.
In the Foundry Agents Playground screenshot below, I provided the system instructions “You are a careful researcher who never makes up answers and always cites references,” and in my query asked it to summarize Kierkegaard’s massive “Concluding Unscientific Postscript,” a text I studied in college. Those system instructions tend to encourage models to stay on the straight and narrow, but don’t always prevent models from making up citations out of whole cloth. Hallucinated citations can seem legit even while being utter fabrications, as several lawyers have discovered to the detriment of their careers. If you use generative AI, you are still responsible for any answers you use, so you need to fact-check everything carefully, even if it sounds correct.
By the way, there’s a decent summary of prompt engineering techniques in the Microsoft Foundry documentation. It’s not as entertaining as the old examples, however.

The Microsoft Foundry Agents playground, found in the Build section, is a useful place to try out models, tools, guardrails, instructions, and prompts. Here I have asked the agent to summarize Kierkegaard’s “Concluding Unscientific Postscript,” with system instructions that say “You are a careful researcher who never makes up answers and always cites references,” using the open-weight mixture-of-experts (MoE) model gpt-oss-120b from OpenAI. The summary looks pretty good based on my memory of the text, although I have not checked the generated references for accuracy.
Foundry
Trying a Foundry Solution
I tried one of the 18 Microsoft Foundry solution templates, “Get started with AI agents.” The entire process took me about an hour, ran almost entirely in the cloud, and cost me a whopping $0.02. That’s right, two cents. You can find the code on GitHub in the Azure-Samples repository.

The README doc for “Getting started with agents using Microsoft Foundry,” a basic sample solution for deploying AI agents and a web app with Azure AI Foundry and SDKs. Note the solution architecture diagram two-thirds of the way down.
Foundry
Starting from the GitHub repo, you can click on the “Open in GitHub Codespaces” button or the “Dev Containers” button in the Getting Started section. I used the former, which essentially opens a VM-based Visual Studio Code environment in the Azure cloud. The latter opens the VS Code environment on your local machine and connects it to a development container in the Azure cloud.

The “Getting started with agents using Microsoft Foundry” solution opened and running in GitHub Codespaces. At this point the azd up command has completed and supplied an endpoint for the web interface.
Foundry
In this solution, the agent uses Azure AI Search for knowledge retrieval against a vector database, and includes built-in monitoring for troubleshooting and performance optimization. It’s essentially retrieval-augmented generation (RAG) in web agent form.

The running agent answering questions about the uploaded product catalog. This AI assistant can perform some of the tasks that would otherwise fall to a human customer service agent either talking on the phone or texting with a potential customer.
Foundry
The bottom line
Overall, Microsoft Foundry acquitted itself well in my test of one of its major use cases, helping application developers to build and deploy agents that use models and tools. I found the ease of use good, the selection of models solid, the Agents Playground excellent, and the agent types and framework support very good.
I liked Microsoft Foundry about as well as I liked the Google ADK (reviewed here), and better than I liked Amazon Bedrock AgentCore (reviewed here). I didn’t test Microsoft Foundry’s model fine-tuning or IT administration capabilities.
Cost
Platform is free; pricing occurs at the deployment level.
Platform
Microsoft Azure
Pros:
- Microsoft Foundry has many capabilities that application developers can use to build and deploy agents.
- The Microsoft Foundry Agents playground is a nice interactive way to develop and test agents.
- Microsoft Foundry offers about 18 solution templates to get you started.
- Pricing seems quite reasonable.
Cons:
- The Microsoft Foundry documentation is extensive enough to be forbidding.
- It takes a while to learn your way around the development surface.
Supply-chain attacks take aim at your AI coding agents 5 May 2026, 9:26 pm
Attackers too are looking to cash in on the AI coding craze, adapting their supply-chain techniques to target coding agents themselves.
Many AI agents autonomously scan package registries such as NPM and PyPI for components to integrate into their coding projects, and attackers are beginning to take advantage of this. Bait packages with persuasive descriptions and legitimate functionality have cropped up on such registries, while packages that target names that AI coding agents are likely to hallucinate as dependencies are another attack vector on the horizon.
Researchers from security firm ReversingLabs have been tracking one such supply-chain attack that uses “LLM Optimization (LLMO) abuse and knowledge injection” to make packages more likely to be discovered and chosen by AI agents. Dubbed PromptMink, the attack was attributed to Famous Chollima, one of North Korea’s APT groups tasked with generating funds for the regime by targeting developers and users from the cryptocurrency and fintech space.
“This campaign presents us with the new frontier in software supply chain security: AI coding agents manipulated into installing and using malicious dependencies in the code they generate,” the researchers wrote in their report. “The underlying problem is, in principle, not much different from the well established pattern of cybercriminals and malicious actors socially engineering developers to use malicious packages in their codebase. Where it differs is in the ability of the threat actors to test their lure before it is deployed.”
An evolving campaign
North Korean threat actors commonly use social engineering to trick developers into installing malware, whether through fake job interviews or by publishing rogue software components that could appeal to developers from specific industries.
The PromptMink campaign appears to have started last September with two malicious packages called @hash-validator/v2 and @solana-launchpad/sdk. The SDK was used as a bait package with legitimate functionality intended to be discovered by developers, while hash-validator, a dependency for the SDK, contained a JavaScript infostealer.
This combo of a lure package and a malicious dependency appears to be a central technique used by the group to make their campaigns more resilient. The bait packages have a better chance of remaining undetected for longer, accumulating downloads and history to appear more credible.
Multiple second-layer malicious packages were rotated over time as part of the campaign, including aes-create-ipheriv, jito-proper-excutor, jito-sub-aes-ipheriv, and @validate-sdk/v2. All were related to cryptocurrency networks, posing as tools to work with cryptographic hashes and functions. The bait packages were also diversified over time with @validate-ethereum-address/core and several others, expanding across multiple package registries and programming languages such as Python and Rust.
The attack later evolved to include additional obfuscation techniques and malicious actions — for example, deploying an attacker-controlled SSH key on victims’ machines for direct remote access, and archiving and exfiltrating entire code projects from compromised environments.
One notable development was the pivot to compiled payloads to complicate detection. For example, in February the @validate-sdk/v2 package started bundling Single Executable Applications (SEAs) — self-contained applications that include JS code with the full Node.js interpreter. SEAs aren’t typically distributed as part of NPM packages because users already have Node.js installed locally on their machines.
In March, the attackers pivoted from SEAs to pre-compiled malicious Node.js add-ons written in Rust with the NAPI-RS project. This was likely done to reduce payload size, as SEAs are unusually large, exceeding 100MB in some cases.
Using LLMs to trick LLMs
ReversingLabs’ researchers observed clear signs of vibe coding in the creation of these malicious components, including LLM-generated code comments. However, something else stood out: the level of detail in their README files and the way the documentation boasted about how effective these packages were at performing their tasks.
The researchers questioned whether this was intended to make the rogue components more appealing to developers, who are typically the target of such attacks. But the overly persuasive language made more sense if the intended targets were LLM-powered autonomous coding agents, and it wasn’t long before they confirmed this was likely the case.
In a January 2026 post on Moltbook, a Reddit-like platform where AI agents make posts and discuss topics autonomously, one bot described how it created a memecoin and used the @solana-launchpad/sdk package because it had one of the needed functions. It is possible the post was generated intentionally by an AI bot controlled by the attackers. But it wasn’t the only example of an AI agent falling for the bait package.
The researchers later found a legitimate project called openpaw-graveyard that was developed as part of the Solana Graveyard Hackathon and included the @solana-launchpad/sdk as a dependency. The repository history showed the dependency had been added in a commit co-authored by Claude Opus.
“This transforms the technique from social engineering to a combination of LLM Optimization (LLMO) abuse and knowledge injection,” the researchers concluded. “In the context of this campaign, the goal is to make the LLM likely to recommend using the malicious package by making the documentation as believable (knowledge injection) and as appropriate as possible in the project that the specific LLM coding agent is working on.”
‘Slopsquatting’
This AI agent supply-chain risk isn’t limited to specifically crafted package descriptions and documentation. Coding agents can also hallucinate package names entirely. Previous research has shown that this happens often and predictably enough to make it something attackers could abuse.
Back in January, Aikido Security researcher Charlie Eriksen registered an npm package called react-codeshift that was hallucinated by an LLM and subsequently made its way into 237 GitHub repositories.
It started with someone vibe coding a collection of agent skills back in October for migrating coding projects to different frameworks. That collection included two skills — react-modernization and dependency-upgrade — that invoked the hallucinated react-codeshift package via npx, a CLI tool bundled with npm for downloading and executing Node.js packages on the fly without installation.
Agent skills are markdown or JSON files that contain instructions, metadata, and code examples to teach AI agents how to perform certain tasks. They are automatically activated during agent operation when specific keywords are encountered in prompts.
Eriksen registered the react-codeshift package on NPM and immediately started seeing downloads, suggesting that skills with the hallucinated package names were being used in practice. And not just with npx but with other Node.js package installers as well, because the original skills were cloned and modified by other developers.
“The supply chain just got a new link, made of LLM dreams,” said Eriksen, who called the new threat “slopsquatting.”
“This was a hallucination. It spread to 237 repositories. It generated real download attempts. The only reason it didn’t become an attack vector is because I got there first,” he said.
Vibe coding agents need stronger security controls
As organizations rush to incorporate AI agents into business workflows and software development pipelines, their security controls need to keep pace with the novel attack vectors these agents introduce.
The US Cybersecurity and Infrastructure Security Agency, the US National Security Agency, and their Five Eyes partners recently published a joint advisory on the adoption of agentic AI services. Among the many recommendations, the agencies advise organizations to maintain trusted registries of approved third-party components, restrict AI agents to allow-listed tools and versions, and require human approval before high-impact actions.
“Poor or deliberately misleading tool descriptions can cause agents to select tools unreliably, with persuasive descriptions chosen more often,” the agencies warned, effectively confirming that LLMs can be socially engineered through documentation.
AI coding agents should not be allowed to install dependencies without developer review, and every suggested package should be treated as untrusted by default until it and its transitive dependencies have been reviewed. Development teams should implement Software Bill of Materials (SBOM) practices so they can track and audit the components used in their development pipelines.
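As a sketch of what “untrusted by default” can mean in practice, a pipeline gate might confirm that an agent-suggested npm package actually exists on the registry and isn’t brand new before any install is allowed. The age threshold below is illustrative, and a real gate would also review maintainers, download counts, and transitive dependencies.

// Vet an agent-suggested package against the public npm registry before
// allowing installation. A 404 is the slopsquatting tell: the name may
// have been hallucinated and could be registered by an attacker later.
async function vetNpmPackage(name: string, minAgeDays = 90): Promise<boolean> {
  const res = await fetch(`https://registry.npmjs.org/${encodeURIComponent(name)}`);
  if (res.status === 404) {
    console.warn(`${name}: not on the registry; possibly a hallucinated name`);
    return false;
  }
  if (!res.ok) throw new Error(`registry lookup failed: HTTP ${res.status}`);
  const meta = await res.json();
  const ageDays = (Date.now() - new Date(meta.time?.created).getTime()) / 86_400_000;
  if (!(ageDays >= minAgeDays)) {
    // Covers both brand-new packages and missing metadata (NaN age).
    console.warn(`${name}: only ${Math.round(ageDays)} days old; review before use`);
    return false;
  }
  return true;
}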
Oracle will patch more often to counter AI cybersecurity threat 5 May 2026, 3:40 pm
Oracle plans to issue security patches for its ERP, database, and other software on a monthly cycle, rather than quarterly, to respond to the increased pace of AI-enabled software vulnerability discovery.
Other software vendors, notably Microsoft, SAP, and Adobe, already release patches on a monthly beat, always on the second Tuesday of each month.
Oracle, though, is taking an off-beat approach: It will release the first of its monthly Critical Security Patch Updates (CSPUs) on May 28, the fourth Thursday, and after that, it will release its patches on the third Tuesday of each month — a week after the other vendors — with the next batches arriving on June 16, July 21, and August 18, it said earlier this week.
The new CSPUs “provide targeted fixes for critical vulnerabilities in a smaller, more focused format, allowing customers to address high-priority issues without waiting for the next quarterly release,” Oracle said.
It will issue a cumulative Critical Patch Update each quarter, on the same schedule as before. The first one this year came in January.
Oracle initially announced the switch to a monthly patching schedule last week, but did not provide the dates.
The new patching rhythm will primarily interest customers running Oracle applications on premises or in their own or third-party hosting environments. For customers using the software in an Oracle-managed cloud, Oracle applies the patches automatically.
Oracle is using artificial intelligence to identify and fix the vulnerabilities faster than before. It said it has access to OpenAI’s latest models through that company’s Trusted Access for Cyber program, and to Anthropic’s Claude Mythos Preview.
Mythos has contributed greatly to concerns that AI will uncover thousands of zero-day flaws in software, but as of mid-April, only one vulnerability report had been tied directly to it.
This article first appeared on CSO.
AI finds 20-year-old bugs in PostgreSQL and MariaDB 5 May 2026, 11:57 am
Open-source databases are facing a bit of a memory problem as AI helps surface decades-old buffer overflow issues in widely used components. Security researchers have disclosed a set of high and critical-severity vulnerabilities affecting PostgreSQL and MariaDB, with two bugs reportedly tracing their roots back more than 20 years.
At Wiz’s zeroday.cloud hacking event, researchers using the AI-powered security analysis tool “Xint Code” found a high-severity zero-day bug in PostgreSQL’s “pgcrypto” extension, and a heap buffer overflow in MariaDB’s JSON schema validation logic, both allowing remote code execution (RCE) on the respective database servers.
The Xint Code team also uncovered a missing validation bug in PostgreSQL, hidden for 20 years, that allows attackers to execute arbitrary code.
Patches have been released for all these flaws, with both PostgreSQL and MariaDB maintainers urging users to upgrade to fixed versions immediately.
More than one crack in PostgreSQL’s foundation
The more pressing of the PostgreSQL zero-day flaws is a heap-based buffer overflow issue, tracked as CVE-2026-2005, in the “pgcrypto” extension. By using specially crafted input, an attacker can trigger a size mismatch that leads to out-of-bounds writes on the heap, researchers said in a blog post.
In environments where pgcrypto processes user-controlled input, this can be leveraged to achieve remote code execution on the database server.
The flaw affected all supported versions, and has been fixed in updates including v18.2, v17.8, v16.12, v15.16, and v14.21. It received a high-severity rating of CVSS 8.8 out of 10. “The vulnerable code has been present since pgcrypto was first contributed in 2005, more than 20 years ago,” the researchers added.
This wasn’t the only flaw reported in PostgreSQL. Another group of researchers, competing as “Team Bugz Bunnies” at the Wiz event, found a missing validation bug, tracked as CVE-2026-2006, that allows execution of arbitrary code. The flaw was rated at a near-9 CVSS severity and was patched in the same updates that fixed CVE-2026-2005.
PostgreSQL maintainers urged customers to patch the flaws quickly, since the bugs are now public after going unnoticed for years and attackers have access to exploit code. The flaws were fixed in February, but a Wiz analysis found 80% of cloud environments using PostgreSQL, with 45% directly exposed to the internet.
Inadequate JSON parsing allowed RCE on the MariaDB server
In MariaDB, a buffer overflow bug, tracked as CVE-2026-32710, was found in the JSON_SCHEMA_VALID() function using Xint Code. The vulnerability allows an authenticated user to trigger a crash, which, under controlled conditions, could be escalated into remote code execution.
Compared to the PostgreSQL flaws, exploitation here is less straightforward. Successful code execution would require manipulation of memory layout, achievable only in “lab environments.” “Any user who can open a SQL session — whether through stolen credentials, SQL injection, or lateral movement — can reach this code path with a single function call,” Team Xint Code said in a separate blog post.
MariaDB versions 11.4.1-11.4.9 and 11.8.1-11.8.5 are affected, with fixes rolled out in 11.4.10 and 11.8.6, respectively. The flaw was assessed as high severity at 8.5 by GitHub, while NIST ranked it at a critical 9.9 out of 10 base CVSS.
The article originally appeared on CSO.
Cloud providers are blinded by agentic AI 5 May 2026, 9:00 am
I’ve been watching the cloud market long enough to know when a useful innovation becomes a strategic distraction. That’s what is happening now with agentic AI. The concept itself is not the issue. There is real value in autonomous and semi-autonomous systems that can coordinate tasks, assist developers, optimize workflows, and eventually reduce the amount of manual effort required to run complex businesses. However, just because a technology has promise does not mean it deserves to dominate the road map.
Right now, many cloud providers are acting as if agentic AI is the next unavoidable layer of enterprise computing, and therefore the best use of executive attention, engineering investment, and marketing energy. I think that is a mistake. In fact, I think it is the wrong priority at the wrong time.
The cloud providers are not operating from a position of solid fundamentals. They are still struggling with platform fragmentation, operational complexity, uneven service integration, confusing product overlaps, and, most importantly, resilience issues that have become far too visible. You can’t keep telling the market that fleets of intelligent agents are the future while the underlying infrastructure continues to wobble in ways that damage trust.
That is the part the market hype tends to ignore. Customers don’t buy cloud narratives. They buy cloud execution. They buy uptime, performance, support, predictability, governance, and a platform that does not require heroic effort just to hold it all together. If those basics are under pressure, putting agentic AI at the center of the road map is not visionary. It is evasive.
What customers actually notice
Cloud providers seem to believe that customers are waiting breathlessly for mature multi-agent deployment frameworks. Some might be. Most are not. Most customers, especially large enterprises, are still trying to get better control over costs, simplify operations, improve observability, modernize architectures, and reduce the blast radius when things go wrong.
This matters because recent outages have changed the conversation. When large cloud failures ripple across the internet, customers are reminded very quickly what matters most. They don’t care about the elegance of your agent framework in that moment. They care about whether their applications are available, whether transactions are processing, whether customer-facing systems are still online, and whether they can get clear answers from the provider.
This is why I think the current obsession with agentic AI is so badly timed. The industry should be using this moment to double down on resilience engineering, support quality, platform simplification, and better operational discipline. Instead, too many providers are trying to push the conversation upward into a more abstract layer of value. That might work in a keynote. It does not work in a post-outage executive review.
Enterprises are pragmatic. They will absolutely invest in AI where it creates real value. But they are not going to ignore infrastructure instability just because a provider can show a slick demo of coordinated AI agents booking meetings, routing tickets, or generating workflow suggestions. If the foundation is shaky, the innovation above it becomes harder to trust.
Chasing shiny objects
There is a pattern here, and we’ve seen it before. In enterprise technology, vendors often shift attention to the next strategic abstraction before fully stabilizing the current one. It happened with service-oriented architecture, with early cloud migrations, with containers, with serverless, and now with generative and agentic AI. The message is always some version of the same thing: Don’t focus on what is unfinished below, because the next layer above is where the future is headed.
Sometimes that works. Often it just compounds complexity.
Agentic AI, as it is being sold today, assumes a level of platform maturity that many cloud providers have not yet earned. These systems need dependable infrastructure, strong observability, well-managed identity and access controls, coherent data integration, policy enforcement, governance, and reliable runtime behavior. In other words, they require excellence in the basics. If the provider is still struggling to deliver a cohesive platform experience, adding autonomous behavior on top of that stack may create more moving parts, not more value.
I also worry that the economics are pushing providers in the wrong direction. AI has become the headline investment category, and every provider wants to prove it has a competitive story. That drives spending toward new AI services, developer tools, model integrations, and agent platforms. Meanwhile, the less glamorous work of improving reliability, reducing fragmentation, and preserving deep operational expertise gets treated as maintenance rather than strategy. That is exactly backward.
Fundamentals are strategic
Cloud providers would be much better off if they treated the fundamentals as a competitive differentiator again. That means resilience should move to the top of the road map, not the middle. Service consistency should matter more than feature count. Clearer integration paths should be highlighted rather than yet another branded AI abstraction layer. Customers should spend less time wiring products together and more time getting business value from stable platforms.
This is especially true now because customers are starting to look more closely at what they are really getting from their providers. If outages are more frequent, if support experiences are less satisfying, if service dependencies are harder to understand, and if the engineering lift to adopt new capabilities remains too high, then the provider is failing the basic value proposition. Agentic AI does not fix that. In some cases, it distracts from it.
I’m not arguing that providers should stop innovating around AI. They should not. I’m arguing that AI needs to sit on top of a stronger and more coherent infrastructure story. Right now, in too many cases, the infrastructure story is still incomplete. The resilience story is still incomplete. The simplification story is still incomplete. Yet the market is being told to focus on intelligent agents as if those gaps are secondary.
They are not secondary. They are the point.
Some advice for providers
The smart move for cloud providers is to put agentic AI in its proper place. Make it part of the road map but not the excuse for neglecting the rest of the platform. Reinvest in resilience. Simplify the product portfolio. Improve the connective tissue between services. Retain and empower experienced operators and architects. Reduce customer engineering lift. Be honest about where the platform still falls short.
That is what customers will remember. They will remember who helped them stay online, who reduced complexity, who communicated clearly during incidents, and who delivered real operational improvement instead of just more future-state messaging.
The cloud market has always rewarded innovation, but it rewards trust even more. Providers who forget that are going to learn a hard lesson. Before they ask enterprises to embrace multi-agent futures, they need to prove they can still deliver the dependable infrastructure those futures require.
Vibe coding or spec-driven development? How to choose 5 May 2026, 9:00 am
Vibe coding and spec-driven development (SDD) are two emerging approaches where devops teams use AI to develop all of an application’s code. There are discussions about which approach to use for different use cases, and there are many platforms to consider with varying capabilities and experiences. Some experts question whether AI delivers reliable, maintainable applications, while others suggest that, at some point, AI can lead the end-to-end software development process.
But one certainty IT organizations face is that demand for applications, integrations, and analytics outstrips the supply of agile teams and devops engineers. Compound this imbalance with business pressure to remediate application security vulnerabilities, modernize applications for the cloud, and pay down technical debt, and the result is tough choices about what work to prioritize and where to drive efficiencies in the software development life cycle.
Even before AI code generators emerged, IT leaders sought ways to improve developer productivity. Platforms such as 4GLs, low-code/no-code tools, and configurable SaaS helped IT deliver more applications, reduce the developer skill set required to release enhancements, and improve software quality. These tools enabled IT to develop entire classes of applications, analytics, and integrations that couldn’t be built easily or cheaply by coding in Java, .NET, and other programming languages.
“Software has long been treated like infrastructure: built to last, hard to change, and expensive to replace,” says Chris Willis, chief design officer and futurist at Domo. “That model is giving way to a future with more applications that are smaller, faster to build, and created to solve a specific job before getting out of the way.”
Code gen, vibe, or write a spec?
GenAI models are the next accelerators for software development. The first tools were copilots for coding assistance, followed by LLMs for generating code snippets. I used code-generation tools to develop regular expressions, extract information from web pages, and categorize data as steps in an app migration. They wrote code that I no longer had the time or skills to develop on my own, but it still required significant work to fix defects and integration issues.
We’re now in a second-generation phase of AI software development, with platforms like Amazon Q Developer, Appian AI-Assisted Development, Bolt, Claude Code, Cline, Cursor, Gemini Code Assist, GitHub Copilot, Kiro, Lovable, OpenAI Codex, Pave, and Replit.
All these platforms generate code, but they offer different developer experiences and are used to address different scopes of work. They can be broken down into three categories:
- Code-generating tools enhance the developer experience by writing code on request from engineers and are often integrated into existing development tools.
- Vibe coding generates prototypes, features, and production-ready applications through an iterative prompt-based experience.
- Spec-driven development (SDD) adds an intermediary step before code generation: the development team iteratively establishes product requirements and composes other design documents through prompts, and the platform then generates code from them.
If you are developing a new API, refactoring existing code, enhancing a workflow, or building a new feature, then a code generator may be all you need. The developer’s work shifts from writing code to expressing what code needs to be written, the requirements, the development platform, and other non-functional acceptance criteria.
But what if you want to develop a new application, integration, data pipeline, or a robust web service? For this article, I wanted to look beyond code generation and consider how development teams can use vibe coding and spec-driven development platforms to build and support applications.
What vibe coding does well
The vibe coding experience enables developers to prompt what they are looking to build and to observe the AI as it generates code.
Vibe coding platforms like Bolt, Lovable, and Replit can start developing from a single prompt, but they demonstrate more capabilities when the developer goes into plan mode. In planning, a vibe coding platform may repeat back the requirements it understands, ask questions to elaborate on them, and offer options when requirements aren’t specified.
The “vibe” you get from these platforms is that they want to help developers go from idea to a functioning application quickly. Developers can then prompt the platform to refine requirements and request changes. And it’s not just developers; business owners, non-technical startup founders, and other citizen developers are vibe coding, though they must learn the security best practices.
“Vibe coding enables groups within the organization to create minimal viable products or small-scale tools that greatly increase their productivity,” says Duncan Ng, vice president of solutions engineering at Vultr. “Examples span proofs of concept that you want to put in front of potential consumers to receive feedback on product market fit, to laborious processes that can be streamlined to generate efficiency gains and increase velocity.”
Are vibes a viable production path?
A proof of concept (POC) or minimal viable product may be all a developer needs, but some question whether vibe-coded applications are ready for production. Rajesh Padmakumaran, vice president and AI practice leader at Genpact, says, “Vibe coding accelerates POCs, rapid experimentation, and idea exploration, but it lacks deterministic behavior, making it fundamentally unsuitable for systems that need to be maintained, scaled, or supported long-term.”
The negative sentiment isn’t just targeted at vibe coding, but at AI-generated code in general. Low-code and no-code platforms faced similar concerns in their early years around security, architecture, performance, and operational resiliency. Successful platform vendors established trust through transparency, and IT departments learned what scaffolding, processes, and documentation were needed to scale low- and no-code development. A similar transition is likely to happen with vibe coding platforms.
“Vibe coding accelerates experimentation, but without clear architectural constraints, observability, and performance guardrails, it introduces variability that breaks downstream systems in devops and IT operations,” says Piyush Patel, chief ecosystem officer at Algolia. “CIOs should treat vibe coding as a front-end accelerator while anchoring systems in well-defined specs that act as the ‘prompt layer’ for both humans and AI.”
Start with requirements
Another approach for using AI to develop applications is spec-driven development. Rather than jumping right into prompts to steer AI’s application development, SDD platforms shift-left the process, helping engineers document requirements. Based on those requirements, the SDD platforms then develop the application.
“Spec-driven development is all about structure and accountability,” says David Yanacek, senior principal engineer of agentic AI at AWS. “You spend some time talking about what you want and what good looks like, and it responds with requirements, a technical design, and a breakdown of the development tasks.”
Yanacek is an advisor to AWS Kiro’s development team. Much like non-AI development projects start with designs, product requirement documents, and agile user stories, SDD reinforces the need to collaborate across business and technology stakeholders before jumping into code. Two successful use cases he cites are a drug-discovery AI agent deployed to production in three weeks and a technology company’s accelerated cloud migrations.
“Creating these documents keeps the AI focused on high-quality output, so I can go back and verify that it did what I asked it to,” adds Yanacek. “For example, the design document describes the system’s behavior in detail, including code snippets and the database schema. When you fully specify how a system or feature should behave, the agent can generate more and better tests to verify its output.”
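To make that concrete, here is a hypothetical excerpt of the kind of spec an SDD workflow might produce before any code is generated. The feature, endpoint names, and file layout are illustrative assumptions, not Kiro’s or any other platform’s actual format:

```
# Feature: Invoice-delay prediction API (hypothetical example)

## Requirements
- WHEN a client POSTs an invoice ID to /predictions,
  THE SYSTEM SHALL return a delay estimate in days within 500 ms.
- IF the invoice ID is unknown, THE SYSTEM SHALL return HTTP 404.

## Design
- Endpoint: POST /predictions with body {"invoice_id": string}
- Response: {"invoice_id": string, "predicted_delay_days": number}
- Data: read-only view over the invoices table (schema in design.md)

## Tasks
1. Define request/response models and validation.
2. Implement the prediction handler and error paths.
3. Generate unit tests from each SHALL clause above.
```

Each SHALL clause gives the agent a verifiable target, which is what makes the generated tests meaningful rather than decorative.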
SDD is gaining traction among devops teams that recognize the importance of collaborating with stakeholders on both feature and non-functional requirements.
“Spec-driven development is the natural maturation and evolution of vibe coding, where teams are fully maximizing the context window of their agent,” says Austin Spires, senior director of developer marketing at Fastly. “Spec-driven vibe coding forces engineers and teams to have a clearer vision, firmer requirements, and stronger writing than the first iterations of vibe coding.”
Nic Benders, chief technical strategist at New Relic, adds, “Production software doesn’t start with coding. It starts with thinking about the problem, figuring out what you want, and communicating that with your team. Spec-driven development puts a brand name on doing that thinking and writing, but with an AI tool as your team.”
Competing or complementary?
Are SDD and vibe coding competing approaches? Will an enterprise support two different methodologies? Or is SDD an evolution of the vibe coding experience? “Vibe coding and spec-driven development aren’t competing approaches; they’re complementary ones, each with a distinct role in the development life cycle,” says Ayaz Ahmed Khan, senior director of engineering at Cloudways by DigitalOcean. “Use vibe coding to explore and prototype, and spec-driven development with AI to harden and ship. The teams that succeed with genAI are the ones who mindfully guide it with constant feedback to build production-ready software.”
Others suggest that vibe coding and SDD will continue to serve different business needs and implementation strategies. “Vibe coding, especially with capable agentic systems, delivers extraordinary velocity for user-facing prototypes where the blast radius of a defect is small, like for internal tools or first POCs,” says Wiktor Walc, CTO at Tiugo Technologies. “But the moment you’re dealing with large production environments, distributed state, or transactional integrity, you start benefiting from spec-driven contracts between services—not because today’s models can’t reason about complex systems, but because no agentic workflow yet offers the kind of deterministic correctness guarantees that production-critical infrastructure demands.”
Focus on resilient releases
Planning and coding are just two steps in building and supporting applications. There are other opportunities to use AI in the software development life cycle for developing AI agents, including building in observability, integrating Model Context Protocol servers, and robust AI agent testing.
World-class IT departments need to consider how vibe coding and SDD drive business value, innovation, and reliability, not just how they improve the coding aspects of delivering applications. To what extent does AI develop solutions that meet business requirements and deliver exceptional user experiences?
“Both vibe coding and SDD assume that the hard work of getting business and IT stakeholders aligned on the right requirements is already done, and this is especially true as enterprises look to reimagine and redesign many of their core workflows to leverage AI,” says Don Schuerman, CTO and vice president of marketing and technology strategy at Pegasystems. “The real opportunity for AI is not just to accelerate how code gets written, but to provide a collaborative canvas where business and IT teams can generate the designs and requirements for a truly reimagined application together.”
Much of today’s excitement is around how AI accelerates application development and developer productivity. But what about the deployment process and the infrastructure to run AI-developed applications?
One emerging trend is AI application development platforms that come bundled with cloud deployment infrastructure and business process automation services. AI-Assisted Development from Appian supports spec-driven development through its business interface Appian Composer and development tools such as Claude, Codex, and Kiro. Pave is a vibe coding platform that deploys to the same secure infrastructure as Quickbase and leverages its governance capabilities. These two examples illustrate how low-code development and process management platforms are evolving to embrace AI capabilities.
Experts remind IT leaders that whether you code, vibe, or adopt SDD, the emphasis should be on delivering resilient applications.
“The focus should be on engineering discipline and system design rather than pitting vibe coding and spec-driven development against each other,” says Sergei Kondratov, director of development at Saritasa. “The success of any AI-assisted development today depends on how well tasks are broken down and controlled. If that is done poorly, both approaches fail.”
Other experts point out that the quality of AI-generated code and the ease of maintaining AI-generated applications are open questions.
“Spec-driven development orients teams toward the right business and technical outcomes, while AI coding increases velocity,” says Christian Stano, field CTO at Anyscale. “What matters is the interface where production software actually ships, where focus should solve the real bottleneck: whether review processes, infrastructure, and guardrails can keep pace. The key metric isn’t speed alone, but whether teams are accelerating without trading off reliability or accumulating hidden technical debt.”
Hannes Hapke, director of the 575 Lab at Dataiku, adds, “While vibe coding compresses the time to first demo, there are major concerns about debt, security, and auditability. Spec-driven preserves discipline but adds overhead, and the key opportunity is blending both. CIOs need to measure impact through time to release, bug rates, refactoring frequency, and developer satisfaction, not just velocity.”
There’s no doubt that vibe coding and SDD will evolve, and there’s a reasonable chance the two practices will converge into a generalized AI coding environment. One example is GitHub’s Spec Kit, which works with GitHub Copilot, Claude Code, and Gemini CLI, and treats spec writing as a prerequisite to vibe coding and code generation.
As AI’s development capabilities improve, IT will need to consider how to evolve the end-to-end development process and ensure new capabilities do more than improve velocity and productivity.
Diskless databases: What happens when storage isn’t the bottleneck 5 May 2026, 9:00 am
In 2021, I was developing software for an aerospace manufacturer and met with our machine learning team to discuss innovative approaches for tracking FOD (foreign object debris), a major safety and operational concern in the industry. What struck me wasn’t the algorithms or the tracking equipment, but the terabytes, at times petabytes, of data being produced.
Old-school problems of limited hardware resources and inefficient data compression were bottlenecking cutting-edge visual learning models and traditional tracking solutions alike. The team was smart and could fine-tune quickly, but the real challenge was making sure our infrastructure could scale with them.
In aerospace, performance hinges on how fast systems can absorb and interpret massive telemetry streams, and storage is often the silent limiter. When you’re generating terabytes to petabytes of data in a single test cycle, even a brief stall in the storage layer becomes a bottleneck. A few milliseconds of delay between what’s happening and what the system can write, index, or retrieve doesn’t just slow things down. It can compound through an entire run.
Traditional databases were built around disk constraints and batch workloads. But what happens when those limits no longer define what’s possible?
The diskless shift
Diskless architectures sidestep traditional constraints by separating compute from storage and removing local persistence from the critical path. Data is ingested and indexed in memory for immediate availability, while object storage provides the durable, elastic foundation underneath. The result is a database that accelerates both ingestion and retrieval without sacrificing persistence.
This design offers the best of both worlds: the elasticity and durability of object storage with the speed of in-memory caching. Compute and storage scale independently. Systems can scale continuously, recover automatically, and adapt to changing workloads without planned downtime or manual intervention.
Diskless design means data can be ingested, queried, and acted upon in real time, without trade-offs among cost, performance, and scale.
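As a rough illustration of that split between a fast in-memory path and a durable object-store foundation, here is a minimal sketch in Python. It assumes an S3-compatible object store reached through boto3; the DisklessStore class, the JSON segment format, and the flush threshold are illustrative assumptions, not any vendor’s actual design.

```python
# A minimal sketch of a diskless write/read path, assuming an
# S3-compatible object store reached through boto3. DisklessStore,
# the JSON segment format, and the flush threshold are illustrative
# assumptions, not any vendor's actual implementation.
import json

import boto3


class DisklessStore:
    def __init__(self, bucket: str, flush_threshold: int = 1000):
        self.s3 = boto3.client("s3")
        self.bucket = bucket
        self.flush_threshold = flush_threshold
        self.memtable: list[dict] = []  # recent writes, queryable immediately
        self.segment_id = 0             # monotonically increasing segment counter

    def write(self, record: dict) -> None:
        # Ingest into memory first: the record is readable right away,
        # with no local disk in the critical path.
        self.memtable.append(record)
        if len(self.memtable) >= self.flush_threshold:
            self._flush()

    def _flush(self) -> None:
        # Durability is delegated to the object store, not a local volume.
        key = f"segments/{self.segment_id:012d}.json"
        self.s3.put_object(
            Bucket=self.bucket,
            Key=key,
            Body=json.dumps(self.memtable).encode("utf-8"),
        )
        self.segment_id += 1
        self.memtable = []  # memory now acts purely as a cache

    def read_all(self) -> list[dict]:
        # Read path: merge durable segments with the in-memory tail.
        records: list[dict] = []
        paginator = self.s3.get_paginator("list_objects_v2")
        for page in paginator.paginate(Bucket=self.bucket, Prefix="segments/"):
            for obj in page.get("Contents", []):
                body = self.s3.get_object(Bucket=self.bucket, Key=obj["Key"])["Body"]
                records.extend(json.loads(body.read()))
        return records + self.memtable
```

The essential property is that compute nodes hold no durable state: a replacement node can rebuild its view from the object store, which is what makes the fault isolation and on-demand scaling described below possible.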
Why disks became the bottleneck
Traditional databases were built around disk constraints and transactional workloads, where latency between ingestion and retrieval doesn’t matter much. But for time series workloads, whether it’s telemetry, observability, IoT, industrial, or physical AI systems, that latency becomes the difference between insight and incident.
Diskless design combines the elasticity of cloud storage with the speed of in-memory indexing and caching. There is no complicated high-availability (HA) setup or heavy orchestration across a distributed system, just linear, predictable performance.
Diskless architecture brings several benefits out of the box:
- High availability: Multi-AZ durability without complex replication.
- Zero migration: No data movement when upgrading or moving instances.
- Fault isolation: If one node fails, another can continue servicing requests with no downtime.
- Simplified scaling: Add or remove nodes on demand for ingest or query load.
What changes when the disk disappears
When storage is no longer the constraint, the entire performance profile of the database shifts. Instead of planning around limits, teams can rely on a system that remains responsive as data volumes grow, with capacity expanding in the background and compute scaling alongside demand.
This separation of compute and storage also unlocks operational simplicity. There’s no need to manage replicas or create fault isolation per node; the object store itself is able to provide this redundancy automatically. Enterprises gain petabyte-scale storage, continuous uptime, and a deployment model that adapts seamlessly across environments, whether it’s on-prem, cloud, or hybrid.
A new foundation for real-time systems
Removing the disk isn’t just a performance optimization; it’s a paradigm shift.
Predictive maintenance systems can now analyze live sensor telemetry continuously instead of batching overnight. Industrial control systems can react instantly to anomalies instead of waiting for downstream processors. AI and machine learning models can train against live data streams that tell a story instead of static snapshots that lack context.
When you eliminate the dependency on local storage, you eliminate an entire class of operational drag. The database becomes an active, real-time engine, not just a place to store data.
Architecting for what’s next
Diskless design is not an end point, but a foundation. Over the next decade, databases will continue to evolve from managing persistence to powering intelligence. Diskless architectures are a step in that direction, making the database not just faster, but fundamentally more capable of keeping up with the pace of the physical world.
Because when your systems depend on real-time decisions, the slowest part of your stack can’t be your database.
—
New Tech Forum provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.
SAP to acquire data lakehouse vendor Dremio 5 May 2026, 3:03 am
SAP on Monday announced plans to acquire Dremio, which bills itself as an agentic lakehouse company, for an unspecified price. The move is complicated by similar offerings from existing SAP partners Snowflake and Databricks, but analysts point to key differences with Dremio, especially in its ability to work with data while it sits in the enterprise’s environment, rather than having to live externally.
One of SAP’s justifications for the acquisition is that it will theoretically make it easier for IT executives to combine SAP data with non-SAP data. But its strongest rationale involves Dremio’s ability to make complex data more AI-friendly, so that it can more quickly and cost-effectively be made usable.
“Most enterprise AI projects fail to deliver value not because of the AI itself, but because the underlying data is fragmented, locked in proprietary formats and stripped of the business context that makes it meaningful,” the SAP announcement said. “The result is a familiar and costly pattern: pilots that cannot scale, slow integration of new data sources, duplicated engineering work and compliance risk when organizations cannot explain how an AI-driven decision was reached. Dremio helps eliminate that data fragmentation and integration friction.”
While SAP cites the data quality argument, there are many elements of enterprise data quality, including data that is outdated, comes from unreliable sources, or lacks meaningful context, that aren’t addressed by Dremio.
However, SAP said, “With Dremio, SAP Business Data Cloud will become an Apache Iceberg-native enterprise lakehouse that unifies SAP and non-SAP data to power agentic AI at enterprise scale. Apache Iceberg is the industry-standard open table format, and SAP Business Data Cloud will natively support it as its foundation.” This means that there need be no data movement or format conversion; SAP and non-SAP data “can coexist on the same open foundation, with federated analytical reach across every enterprise data source.”
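For readers who haven’t worked with Iceberg, the following sketch shows what querying a table “in place” can look like using the open-source pyiceberg library. The catalog endpoint and table name are placeholder assumptions for illustration; this is not SAP’s or Dremio’s API:

```python
# A minimal sketch of reading an Apache Iceberg table in place with
# the open-source pyiceberg library. The catalog URI and table name
# below are hypothetical placeholders, not SAP or Dremio endpoints.
from pyiceberg.catalog import load_catalog

# Connect to an Iceberg REST catalog (assumed to be running locally).
catalog = load_catalog("default", **{"uri": "http://localhost:8181"})

# Loading the table reads only metadata; the underlying data files
# stay wherever they already live (S3, HDFS, on-prem object storage).
table = catalog.load_table("sales.orders")

# Scan with a filter and materialize the result as an Arrow table,
# with no prior copy into a proprietary warehouse format.
arrow_table = table.scan(row_filter="region = 'EMEA'").to_arrow()
print(arrow_table.num_rows)
```

This read-in-place pattern is the crux of the analysts’ argument below: the query engine comes to the data rather than the data moving to the engine.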
Complicated comparison
Analysts and consultants said that any comparison of Dremio to existing SAP partners Snowflake and Databricks is complicated. For example, Dremio is younger and less established than either Snowflake or Databricks, which suggests that it is a less ideal match for enterprises.
SAP strategy specialist Harikishore Sreenivasalu, CEO of Aarini Consulting in the Netherlands, said that both Snowflake and Databricks would have been ideal acquisition targets many years ago, but they would be far too expensive today.
“Databricks and Snowflake are better [for enterprise IT] for sure because they have a mature platform, they do multi cloud” whereas Dremio “is the new entrant in the market and they have to mature more to be enterprise ready. Their security aspects need to mature,” Sreenivasalu said.
But Sreenivasalu added that the situation could easily change after SAP invests and works with the Dremio team. He advised CIOs to “stick with where you are today but watch how technologies get integrated. Listen to the SAP roadmap.”
In a LinkedIn post, Sreenivasalu said the move still is very positive for SAP: “This is the missing piece. SAP has Joule. SAP has BTP. SAP has the business processes. Now it has the open data fabric to feed AI agents the context they need to act, not just answer. For those of us building on SAP BTP + Databricks + SAP BDC, this is a signal: the lakehouse and the ERP world are converging, fast. The future of enterprise AI just got a whole lot clearer.”
Addresses LLM limitations
During a news conference Monday morning, SAP executives focused on how this move potentially addresses some of the key large language model (LLM) limitations with enterprise data, especially with predictive analytics.
Philipp Herzig, SAP’s chief technology officer, said that LLMs have various limitations, noting, “LLMs don’t deal really well with numbers” and that they struggle with structured data “where we have a lot of differentiation.”
The practical difference is when systems try to predict the future as opposed to analyzing the past, such as when asking how well a retailer’s product will sell over the next 10 months, or predicting likely payment delays and their impacts on projected cash flow. “This is where LLMs struggle a lot,” Herzig said. He also stressed that Dremio’s ability to work with enterprise data while it still resides in that organization’s on-prem systems is critical for highly regulated enterprises.
Local data difference
Flavio Villanustre, CISO for the LexisNexis Risk Solutions Group, also sees the ability to handle data locally as the big draw.
Databricks and Snowflake both offer strong functionality, he pointed out, but users must move the data to their platform and reformat it. After this is complete, the result is a central data lake to address data access needs. “Dremio, on the other hand, provides easy decentralized data access, allowing users to access their data in place,” he said. “Of course, this could be at the expense of data processing performance, but the ease of use and flexibility could outweigh the performance loss.” Implementation speed in days versus weeks or months is another plus, he added. “There is a significant benefit to that.”
Sanchit Vir Gogia, chief analyst at Greyhound Research, agreed with Villanustre, but only to a limited extent.
“The distinction is not as clean as ‘Dremio lets data stay in place, while Snowflake and Databricks require everything to move,’” he noted. “Snowflake and Databricks have both invested significantly in external data access, sharing, open formats, governance layers, and interoperability. So it would be unfair to describe either as old-style ‘move everything first’ platforms.” But, he added, the broader argument is correct. “[Dremio] starts from the assumption that enterprise data is already distributed and that the first problem is often access, context, federation, and governance, not wholesale relocation. For SAP customers, that matters a great deal,” he said.
That’s because of the nature of many of SAP enterprise customers’ datasets.
“Most large SAP estates are not clean, centralized data environments,” he pointed out. “They are brownfield landscapes: SAP data, non-SAP data, legacy warehouses, departmental lakes, regional repositories, acquired systems, partner data, and industry-specific platforms.” While telling these customers that AI-readiness begins with moving everything into one central platform may be good for the vendor, it’s a lot of work for the buyer.
Dremio gives SAP “a more pragmatic story,” Gogia said. “It allows SAP to say: keep more of your data where it is, access it faster, apply more consistent catalogue and semantic controls, and bring it into Business Data Cloud and AI workflows without forcing a major migration program upfront.”
Aman Mahapatra, chief strategy officer for Tribeca Softtech, a New York City-based technology consulting firm, noted that an acquisition of either Snowflake or Databricks would obliterate SAP’s marketing message/sales pitch.
“SAP did not buy a data warehouse. They bought a position in the open table format wars, and the timing tells you exactly why Snowflake and Databricks were never realistic targets,” he said. “Acquiring either would have collapsed SAP Business Data Cloud’s neutrality story overnight and alienated half the customer base in either direction. SAP’s strategic position depends on sitting above the warehouse layer rather than inside it, and Dremio is the federated layer that talks to both Snowflake and Databricks without requiring SAP to pick a side.”
Assume things will change
Mahapatra urges enterprise CIOs to be extra cautious.
“For IT executives with active Snowflake and Databricks contracts this morning, nothing changes in the next two quarters, but by the first half of 2027, expect SAP to steer net-new AI workloads toward Business Data Cloud regardless of what the partnership press releases say today. The CIOs who plan for that trajectory now will negotiate from strength,” Mahapatra said.
Compute and storage that data warehouse vendors provide is rapidly becoming a commodity, he said, and the “defensible value” in enterprise AI is migrating up the stack to the semantic layer, the catalog, the lineage graph, and the business context that lets an agent know what ‘active customer’ means within an organization.
“SAP just bought the toolkit to own that layer for any company running SAP at the core,” he said. “If you are an SAP-heavy shop running analytics on Snowflake or Databricks, your warehouse vendors are about to feel less strategic and more like high-performance compute backends.”
Corrects a strategic error
Jason Andersen, principal analyst for Moor Insights & Strategy, noted that for quite some time, SAP has been relentlessly encouraging enterprises to host all of their data within SAP systems. SAP can’t reverse that position even if it wanted to.
What the Dremio deal does, Andersen opines, is instead address the pockets of data that many enterprise CIOs, especially in manufacturing and highly regulated verticals, have refused to turn over to SAP. The Dremio deal gives SAP a face-saving way to get an even higher percentage of its customers’ data, he said.
“Manufacturing is loath to put things in the cloud and [manufacturing CIOs] put up a violent protest [against] going into the cloud,” Andersen said. “This [acquisition] lets SAP access a lot of data that hasn’t yet moved to SAP.”
Shashi Bellamkonda, principal research director at Info-Tech Research Group, said he sees SAP’s Dremio move as fixing a strategic error that SAP made years ago, when it did not develop its own Apache Iceberg capabilities.
“Apache Iceberg is an open-source table format designed for large-scale analytical datasets stored in data lakes, a kind of bridge between raw data files and analytical tools,” Bellamkonda said. “[SAP] should have done this earlier rather than waiting till 2026.”
This article originally appeared on CIO.com.