When semiconductor materials misbehave

Original link: https://semiengineering.com/when-semiconductor-materials-misbehave/


Key Takeaways

  • Material behavior in production depends on the process context that no development environment can fully replicate.
  • In advanced packaging, the interactions that cross domain boundaries are increasingly where failures originate.
  • The most accurate materials data is also the most commercially sensitive, leaving simulation models calibrated against generic inputs rather than production reality.

It’s generally assumed that advanced materials will behave the same way in the lab as in production, but that assumption is now under serious pressure.

Typically, the lab result becomes the spec, which then becomes the baseline for qualification. That, in turn, becomes the standard against which field performance is judged. And for most of the industry’s history, this chain of inference held up well enough. Materials were fewer, stacks were simpler, and the interactions between layers were predictable enough that spec-sheet behavior was a reasonable guide to production reality.

But as heterogeneous integration evolves from engineering curiosity to the dominant architecture for high-performance computing, the number of materials in a single package has ballooned. The interactions between them are more complex and more consequential, and the environments in which those packages operate are more demanding than the test conditions designed to qualify them.

“It’s not like the good old days where, if you have a single die and you know its processes, you can just go to production,” said Mike Kelly, vice president of chiplets and FCBGA integration at Amkor. “Most of these packages are quite complicated mechanically, and certainly electrically. It takes a lot of test field development to get to a point where you’ve got a nice, reliable solution. That can’t be overstated.”

What a material does in isolation or in a controlled laboratory sequence is increasingly a poor guide to what it will do when surrounded by dissimilar materials, subjected to multi-step thermal histories, and required to perform reliably over millions of operating hours. The packages now required by advanced AI hardware are mechanically and electrically more complex than those of earlier generations, and the accumulated production experience that once made design decisions straightforward no longer applies as directly. Put simply, the gap between the lab and the fab isn’t new, but it’s getting wider.

The complexity problem
The most direct explanation for why materials misbehave in production is also the most uncomfortable one to admit. The systems being built today are too complex for anyone to fully model in advance, and the interactions that cause problems are often the ones that no single discipline thought to check.

“When you are integrating a bunch of different materials, a bunch of different pieces of silicon, all of this will bring inherent variability together,” said Tiago Tavares, program and project manager at Critical Manufacturing. “The idea that we can predict all of this and be in control of all of this from the design board is unrealistic. You would be simulating for decades to cover all cases. It doesn’t work anymore.”

Semiconductor manufacturing has always involved managing variation, but what has changed is the number of sources that now interact with one another within a single package, and the degree to which those interactions are coupled. A traditional monolithic die had a single material set, a single process flow, and a set of interactions that decades of production experience made reasonably predictable. A modern multi-die assembly with stacked memory, heterogeneous chiplets, and organic interposers has a combinatorial explosion of potential interactions that accumulates with every new material introduced into the stack.
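A back-of-the-envelope count makes the scale of that explosion concrete. The sketch below simply counts potential two-way and three-way material interactions as a stack grows; the material counts are hypothetical, chosen only to show the trend:

```python
# Illustrative only: how candidate material interactions grow with stack size.
# Material counts are hypothetical, not taken from any real package.
from math import comb

for n_materials in (5, 15, 30):
    pairs = comb(n_materials, 2)    # two-material interfaces
    triples = comb(n_materials, 3)  # three-way couplings
    print(f"{n_materials:>2} materials: {pairs:>4} pairwise, {triples:>5} three-way interactions")
```

Five materials give 10 pairwise interactions; thirty give 435 pairwise and more than 4,000 three-way combinations, before any process-order effects are counted.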

“You are using more and more exotic materials in between,” said Tavares. “It is a sandwich, and you don’t know exactly how the ham and the cheese are going to vary. That is why process enforcement and process design remain critical, but they are no longer sufficient. You need constant monitoring of what is going on.”

The monitoring challenge is compounded by a fundamental structural shift in how these packages are assembled. In a monolithic flow, a process engineer could treat each step as a largely independent optimization problem. Adjust the etch recipe, measure the result, adjust again. The degrees of freedom were manageable because changes in one step had limited downstream consequences. In a heterogeneous package, that independence no longer exists. Every process step inherits the mechanical, thermal, and chemical history of the preceding steps, and every adjustment has consequences that propagate forward in ways not always visible until much later.

“You cannot analyze a process like an island anymore,” added Tavares. “The interactions are more and more visible and growing. And therefore, you cannot just choose to change something on step A without wondering what is going to be happening in step B, C, and D afterwards.”
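A minimal Monte Carlo sketch shows why that coupling matters. The model below is entirely hypothetical: each step's deviation partly inherits the deviation of the step before it, and the spread of the final result grows well beyond what the same steps would produce independently:

```python
# Hypothetical sketch: variation compounding through coupled process steps.
# With coupling = 0, each step's deviation is independent; with coupling > 0,
# each step inherits part of the previous step's deviation (its "history").
import random

def final_spread(coupling: float, n_steps: int = 4, n_runs: int = 20_000) -> float:
    finals = []
    for _ in range(n_runs):
        carry, total = 0.0, 0.0
        for _ in range(n_steps):
            dev = coupling * carry + random.gauss(0.0, 1.0)
            carry = dev            # history handed to the next step
            total += dev           # accumulated deviation of the final part
        finals.append(total)
    mean = sum(finals) / n_runs
    return (sum((x - mean) ** 2 for x in finals) / n_runs) ** 0.5

print("independent steps, sigma ~", round(final_spread(0.0), 2))  # ~2.0
print("coupled steps,     sigma ~", round(final_spread(0.8), 2))  # noticeably larger
```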

What simulation misses
If the complexity problem were just a matter of running more comprehensive simulations, it would be solvable in principle, even if the computational costs were high. Simulation tools are built around explicit choices about which effects are treated as first-order, second-order, or negligible. Under most conditions, those choices are well-founded. But the conditions encountered in advanced packaging are not always typical, and a second-order effect in a simple package can become the dominant failure mechanism in a more complex one.

“Mechanical stress affects not only reliability, but also changes the electrical parameters of stressed devices and wires,” said Marc Swinnen, director of product marketing at Synopsys. “But mechanical and electrical are rarely considered together. Any simulator is based on fundamental choices as to which effects to include. The problem that arises is that in certain cases a minor effect actually becomes much bigger.”

As a result, a package can pass both electrical and mechanical simulations and still fail in production, because the interaction between the two effects was never modeled. This is a consequence of how simulation tools were developed historically: optimized for specific physics domains by teams whose expertise in adjacent domains is limited. Chip designers are not trained in electromagnetic simulation. Packaging engineers are not trained in static timing analysis. The boundaries between these domains have become the places where model and reality most frequently diverge.
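A back-of-the-envelope example shows how such a cross-domain term can matter. The coefficients below are illustrative placeholders, not measured values; the point is only that a stress-induced resistance shift feeds directly into timing:

```python
# Hypothetical numbers: mechanical stress shifts resistance, and the shift
# propagates into RC delay. All coefficients are illustrative placeholders.
R_nominal = 100.0      # ohms: nominal resistance of a wire/device
C_load    = 1e-13      # farads: load capacitance
pi_coeff  = 4e-10      # 1/Pa: assumed piezoresistive coefficient
stress    = 2e8        # Pa: assumed residual packaging stress

dR_over_R  = pi_coeff * stress                 # fractional resistance shift
R_stressed = R_nominal * (1.0 + dR_over_R)

delay_nom = 0.69 * R_nominal * C_load          # Elmore-style RC delay estimate
delay_str = 0.69 * R_stressed * C_load
print(f"resistance shift: {dR_over_R:.0%}")
print(f"delay: {delay_nom*1e12:.2f} ps -> {delay_str*1e12:.2f} ps")
```

An 8% resistance shift is minor in isolation, but on a timing path that is already consuming its margin it can be the difference between passing and failing, which is exactly the kind of interaction neither simulator models on its own.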

“Chip, package, and board design are often done separately, but they are significantly linked,” said Swinnen. “This linkage is often buried under generous safety margins that account for the unknown impacts of connecting the chip, package, and board. Safety margins are not free. They bog down performance and increase costs.”

The variability problem adds another dimension that simulation handles poorly, even when the physics is correctly specified. A design that performs within spec at a nominal temperature may behave differently when exposed to thermal gradients from a neighboring component. A material rated to a certain mechanical stress limit may encounter stresses during assembly that dwarf anything it will experience in the field. The combinations of these variables that can occur simultaneously in production are difficult to validate comprehensively, even with sophisticated simulation tools.

The materials data problem
Beneath the simulation challenge is a more fundamental one. The material property values used as inputs to simulations are often wrong, or at least incomplete, in ways that are difficult to correct without data that manufacturers are unwilling to share.

This intellectual property problem is one of the central obstacles to closing the gap between simulation and production reality. Simulation tools draw material properties from databases that aggregate published measurements, scientific literature, or foundry-supplied specifications. For well-characterized materials like silicon and copper, those databases are reasonably accurate. For novel materials such as new glass compositions, specialized dielectrics, and proprietary polymer adhesives, the entries are sparse, sometimes outdated, and occasionally incorrect.

“Simulation tools take some generic property from the internet or from scientific measurement data, or they take foundry-provided data,” said Lang Lin, product management principal at Synopsys. “Whoever is manufacturing has to give or disclose their secret material properties to our simulation tool, and then we can say the simulation result could be well-correlated. Without that, there is no correlation.”

The problem is that the most accurate material property data is also the most commercially sensitive. A glass substrate manufacturer that has spent years developing a specific composition and polishing process has no incentive to share the precise mechanical and thermal behavior of that material with the industry at large. The competitive advantage embedded in that data is exactly what justifies the development investment. The result is a structural mismatch. The engineers who most need accurate material data to build reliable simulations are working with the least accurate versions, while the organizations that hold the accurate data have legitimate reasons not to release it.

For novel materials at the frontier of what packaging processes aim to do, the problem is even more fundamental. The nonlinear behavior of material properties across different temperatures is well understood for established materials, but it is often less well understood for newer materials.

“You have to model the non-linear behaviors of how the mechanical properties of a material change with temperature,” said Lin. “We probably know pure copper well. But for glass with some kind of modified material properties, what will be the temperature dependence? It could be nonlinear in ways we don’t know.”
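The risk Lin describes can be sketched numerically. In the toy model below, every value is a hypothetical placeholder: a linear property model calibrated over a narrow lab temperature window tracks the nonlinear "true" behavior closely inside that window, then drifts away from it at a reflow-like excursion:

```python
# Hypothetical sketch: a linear material model calibrated over a narrow lab
# window versus nonlinear "true" behavior. All numbers are placeholders.
import numpy as np

def true_modulus(T):
    """Assumed nonlinear softening of a novel glass, in GPa (illustrative)."""
    return 75.0 - 0.02 * T - 1.5e-4 * T**2

T_lab = np.linspace(25, 125, 5)                 # lab calibration window, deg C
slope, intercept = np.polyfit(T_lab, true_modulus(T_lab), 1)

for T in (25, 125, 260):                        # 260 C ~ a reflow-like excursion
    linear = slope * T + intercept
    print(f"T={T:>3} C: linear model {linear:6.2f} GPa, 'true' {true_modulus(T):6.2f} GPa")
```

Inside the calibration window the two models agree to within a fraction of a GPa; at the excursion temperature they diverge by several GPa. When the true curve is unknown, as it is for genuinely novel materials, there is no way to know in advance how large that divergence will be.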

When the field finds what the lab missed
The consequences of these modeling gaps show up in production, and sometimes further downstream in field failures that are difficult to trace back to their origin. There is a consistent pattern in how failures reach the field. The dominant cause is rarely a material that failed to meet its nominal specification, but rather a latent defect introduced during manufacturing that the qualification process was never designed to catch.

“Many field issues come from latent defects introduced during manufacturing,” said Prasad Dhond, vice president for wire bond and BGA products at Amkor. “Contamination, process variations, and equipment excursions are sources of latent defects that can get exacerbated in the field. In addition to qualification, production control and how you run the factory and the assembly line are very critical.”

The difficulty is that latent defects do not always appear as defects at first. A signal that will eventually translate into yield loss can be present early in a process flow as something ambiguous: a slight color variation, an optical anomaly, or something that looks more like a nuisance than a failure mechanism. The connection between what is visible early and what will matter later is not obvious until enough data has accumulated to establish it.

This is a structural feature of complex manufacturing flows. The point at which a defect becomes visible, the point at which it becomes measurable, and the point at which it causes a failure are all different, often separated by weeks of processing and dozens of intervening steps. The qualification test sits at the end of that sequence and asks only whether the device passes or fails. It doesn’t ask where the failure originated, which is the question that would actually close the gap between what the lab modeled and what the fab produced.

“You see a defect, and sometimes it is hard to see, and it may show up as a discoloration in the analysis. If it is just a nuisance or cosmetic, it really doesn’t do anything,” said Errol Akomer, applications director at Microtronic. “But then when the lot gets to probe, it fails. That’s how you learn which defects hurt you and which ones don’t, which ones you can ignore, and which ones we better figure out, because there is a problem.”

The challenge is made more acute by the economics of failure analysis in production. When a chip fails in the field, the first instinct is often to replace it and continue rather than recover it for analysis. The data that would allow engineers to understand what went wrong and to build better models is discarded along with the failed part.

“Being able to pull data together to determine what is happening on failures is only useful when you have failures to add data to it,” said Amkor’s Kelly. “The fewer failures you have, the less data you have, and the less accurate your model. It is a catch-22. At some point you stop modeling and you start building, and then you have continuous improvement in early production to get to where you really want to be. There is still a gap.”

A case study in the gap
The introduction of molybdenum as a replacement for tungsten in middle-of-line metallization illustrates the lab-to-fab gap from a direction that has nothing to do with packaging, and everything to do with the fundamental difference between characterizing a material and integrating it.

Molybdenum offers meaningful resistivity advantages over tungsten at the small feature sizes now being targeted in logic, DRAM, and NAND. A shorter mean free path means it can achieve its full conductance benefit in smaller dimensions, whereas tungsten increasingly cannot. It also eliminates the need for a separate barrier and liner layer because it adheres directly to the oxide and doesn’t penetrate the dielectric, allowing more of the available volume to be filled with the functional metal rather than a higher-resistivity supporting material. In the lab, measured against the specifications that matter for unit process qualification, molybdenum performs well.
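A crude numeric sketch captures both effects at once. The numbers and the size-effect model below are rough placeholders (a toy scattering proxy, not a calibrated Fuchs-Sondheimer model), but they show why a barrier-free metal with a shorter mean free path wins as features shrink:

```python
# Crude, illustrative comparison of tungsten vs. molybdenum fill in a narrow
# feature. All values are placeholders; rho_bulk * (1 + mfp/width) is a toy
# proxy for mean-free-path scattering, not a calibrated size-effect model.

def resistance_per_nm(width_nm, rho_bulk, mfp_nm, barrier_nm):
    conductor_w = width_nm - 2 * barrier_nm     # barrier/liner eats fill volume
    if conductor_w <= 0:
        return float("inf")
    rho_eff = rho_bulk * (1 + mfp_nm / conductor_w)
    return rho_eff / conductor_w**2             # assume a square cross-section

for width in (20, 14, 10):
    r_w  = resistance_per_nm(width, rho_bulk=5.3, mfp_nm=19.0, barrier_nm=2.0)  # W + liner
    r_mo = resistance_per_nm(width, rho_bulk=5.5, mfp_nm=11.0, barrier_nm=0.0)  # barrier-free Mo
    print(f"{width} nm feature: W {r_w:.3f} vs Mo {r_mo:.3f} (arbitrary units)")
```

In this toy model the molybdenum advantage grows from roughly 2x at 20 nm to more than 5x at 10 nm, driven jointly by the reclaimed barrier volume and the shorter mean free path.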

But ramping a new material in production is a different problem. Developing the unit process – the deposition tool, film properties, uniformity, and particle behavior – requires collaboration between materials engineers and process engineers. What unit process development cannot address in advance of production data is how the new material will behave within a specific customer’s process flow, surrounding materials, and integration scheme.

“By the time we go to beta and customers are trying to adopt, the challenge is the integration of that film into their process flow,” said Kaihan Ashtiani, corporate vice president and general manager at Lam Research. “The unit film requirements, like how fast to run it, how well it fills the contacts, whether the resistivity meets the specs, and the uniformity, the particle behavior — that is our job to develop in the tools. But the integration into the customer’s existing flow is where the learning happens. The requirements are different between DRAM, NAND, and logic, and those are some of the challenges when we go into beta and eventually into production.”

The point is not that molybdenum misbehaves in any fundamental sense. It is that the behavior of any new material in production depends on its interactions with the specific process context around it, and that context cannot be fully replicated in the development environment where the material was characterized. Each customer’s integration brings its own thermal budget, adjacent materials, and process sequence constraints. A film property that looked like a minor variable in unit process development can become a first-order concern when it turns out to interact with a specific etch chemistry downstream or to behave differently than expected when deposited on a surface that has undergone a sequence of prior steps the lab never modeled. The years of unit process development that Lam invested in molybdenum bought a well-characterized film. It could not buy a pre-characterized integration because each integration differs by customer and device type. That last mile, where the lab result meets the production context, is where the gap lives.

Closing the gap
The industry is not sitting idle in the face of these challenges. A significant amount of engineering effort is now directed toward building better connections between the virtual and physical worlds by using machine learning to navigate parts of the design space that pure physics-based modeling can’t reach, and by treating the fab floor as a continuous source of model calibration rather than a downstream endpoint.

Still, unconstrained machine learning applied to manufacturing data has no inherent understanding of the physical space it is navigating, which means it can optimize aggressively within its training data while producing results that fail in production for reasons the model was never taught to consider.

“You can train into some data set you have, but machine learning really has no concept of the space that it is in or how to optimize within that,” said Joseph Ervin, managing director of Semiverse Solutions at Lam Research. “Using virtual silicon puts constraints and physics into the machine learning space to be able to guide where the process steps and parameters can actually achieve results.”

The approach involves building a three-dimensional virtual representation of the device under construction, aligning it with inline metrology data from the actual production process, and using the aligned virtual model to guide machine-learning optimization across multiple yield-failure modes simultaneously.
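In spirit, the approach resembles the sketch below. This is not Lam's actual tooling, just a minimal, hypothetical illustration of adding a physics-based penalty so a data-driven search cannot wander into physically infeasible recipes:

```python
# Hypothetical sketch of physics-constrained optimization: a data-driven
# surrogate objective plus a physics penalty that forbids infeasible recipes.
# Neither function reflects any real tool; both are illustrative stand-ins.
import numpy as np

def surrogate_yield(etch_time, temperature):
    """Stand-in for an ML surrogate trained on fab data (higher is better)."""
    return -(etch_time - 42.0)**2 / 50.0 - (temperature - 310.0)**2 / 800.0

def physics_penalty(etch_time, temperature):
    """Stand-in physical constraint: cap the total thermal budget."""
    thermal_budget = etch_time * temperature
    return 0.01 * max(0.0, thermal_budget - 12_000.0)

candidates = ((t, T) for t in np.linspace(20, 60, 81)
                     for T in np.linspace(250, 400, 151))
best = max(candidates, key=lambda p: surrogate_yield(*p) - physics_penalty(*p))
print(f"chosen recipe: etch_time={best[0]:.1f}, temperature={best[1]:.0f}")
```

The surrogate's unconstrained optimum violates the thermal-budget constraint, so the penalized search settles on a nearby feasible recipe instead: the data drives the optimization, but the physics bounds where it is allowed to go.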

The data problem remains harder to solve. The data needed to close the gap between lab and fab exists, at least in principle. The challenge is that collecting it, interpreting it, and connecting it to the right engineering decisions requires a level of institutional knowledge and collaborative willingness that is still developing.

“People are still learning the effects and the combinations,” said Critical Manufacturing’s Tavares. “This will take a while to settle in. The data is available, but first you need to know what you are looking for. Data is not equal to information. The ability to transform data into information is still a challenge.”

The tools for closing the lab-to-fab gap are improving. There are better simulation frameworks, physics-constrained machine learning, richer inline metrology, and more sophisticated digital twins. But the materials being asked to perform in these new environments are genuinely novel, the interactions between them are only partially understood, and the experience base needed to characterize their behavior reliably in production is still being built. The gap exists because the pace of materials adoption is outpacing the pace at which its consequences can be fully understood.


Related Stories

Every Atom Now Counts In Advanced Chip Manufacturing
How atomic-layer deposition and hybrid dielectrics are redefining reliability and scaling for AI-era semiconductors.

Semiconductor Virtual Fabrication And Its Applications
Easily and vividly visualize complicated 3D structures.

Big Changes Ahead For Semiconductor Manufacturing
Inside Chips Podcast: PDF Solutions’ CEO discusses AI in chip manufacturing, 3D-ICs, and shorter cycle times.

