Impact of Scaled Technology on Radiation Testing and Hardening

Kenneth A. LaBel
ken.label@nasa.gov
Co-Manager, NASA Electronic Parts and Packaging (NEPP) Program

Lewis M. Cohn
Lewis.Cohn@dtra.mil
Defense Threat Reduction Agency (DTRA)
Outline

- Emerging Electronics Technologies
  - What has changed and is changing in the commercial semiconductor world
- Radiation Effects and Sources
- Challenges to Radiation Testing and Modeling
  - TID Trends
  - Fault isolation
  - Scaled Geometry
  - Speed
- Summary/Comments

Note: the emphasis of this presentation is digital technologies and SEE. Some discussion of mitigation implications is included.
Changes in the Electronics World

- Over the past decade plus, much has changed in the semiconductor world. Among the rapid changes are:
  - Scaling of technology
    - Increased gate/cell density per unit area (as well as power and thermal densities)
    - Changes in power supply and logic voltages (<1V)
      - Reduced electrical margins within a single IC
    - Increased device complexity
      - More functions per chip: >1 billion gates in a single device
    - Speeds to >> GHz (CMOS, SiGe, InP…)
  - Changes in materials
    - Use of antifuse structures, phase-change materials, alternative K dielectrics, Cu interconnects (previous – Al), insulating substrates, ultra-thin oxides, etc…
  - Increased input/output (I/O) in packaging
    - Use of flip-chip, area array packages, etc
  - Increased importance of application specific usage to reliability/radiation performance

“Impact of Scaled Technology on Radiation Testing and Hardening” presented by Kenneth A. LaBel, GOMAC 2005, Las Vegas, NV, April 7, 2005
Radiation Effects and Spacecraft

- Critical areas for design in the natural space radiation environment
  - Long-term effects causing parametric and/or function failures
    - Total ionizing dose (TID)
    - Displacement damage
  - Transient or single particle effects (Single event effects or SEE)
    - Soft or hard errors caused by a proton (through nuclear interactions) or a heavy ion (through direct ionization) passing through the semiconductor material and depositing energy

An Active Pixel Sensor (APS) imager under irradiation with heavy ions at Texas A&M University Cyclotron
Typical Ground Sources for Space Radiation Effects Testing

- **Issue: TID**
  - Co-60 (gamma), X-rays, Proton

- **Issue: Displacement Damage**
  - Proton, neutron, electron (solar cells)

- **SEE (GCR)**
  - Heavy ions, Cf

- **SEE (Protons)**
  - Protons (E>10 MeV)

- **SEE (atmospheric)**
  - Neutrons, protons

TID testing can typically use a local source with nearby automated test equipment (ATE). All other sources require travel and shipping, which constrains how testing is done.

Wide Field Camera 3 E2V 2k x 4k n-CCD in front of the proton beam at UC Davis
Total Ionizing Dose – Summary trends

- Deep sub-micron (<0.25 μm) CMOS basic structures have shown increasing tolerance to TID (thinner oxides)
  - >100 krad(Si)
- However,
  - Complex structures and those that require higher voltage fields such as charge pumps in flash memories or FPGAs may be MUCH more TID sensitive
  - Bipolar devices do not scale as easily and are susceptible to enhanced low dose rate sensitivity (ELDRS)
    - Failure at << 100 krad(Si) at low space dose rates
    - Are scaled CMOS devices approaching a bipolar-like structure? (Fleetwood, et al.)
Radiation Test Challenge – Fault Isolation

- Issue: understanding what within the device is causing fault or failure.
  - Identification of a sensitive node.

- Technology complications
  - “Unknown” and increased control circuitry (hidden registers, state machines, etc.)
  - Monitoring of external events such as an interrupt to a processor limits understanding of what may have caused the interrupt
    - Example: DRAM
      » Hits in control areas can cause changes to mode of operation, blocks of errors, changes to refresh, etc...
    - Not all areas in a device are testable
Fault Isolation – (2)

- **Example: SRAM-based reprogrammable FPGA** measuring sensitivity of user-defined circuit
  - SEE in configuration area corrupts user circuitry function
    - Can cause halt, continuous misoperation, increased power consumption (bus conflicts), etc.
  - Often the sensitivity of the configuration latches overwhelms user circuitry sensitivity
    - Must have correct configuration to measure user circuit performance

- **Increased number of control structures in a device** drives an increasing rate of single event functional interrupts (SEFIs)

Complex new FPGA architectures include hard-cores: processing, high-speed I/O, DSPs, programmable logic, and configuration latches.
Fault Isolation –(3)

- **Macrobeam structure**: implies probabilistic chance of hitting a single node that may be sensitive
  - A typical heavy ion SEE test run goes to a fluence of $1 \times 10^7$ particles/cm$^2$.
    - Ex., SDRAM – 512 Mb ($5 \times 10^8$ bits plus control areas)
      - If all memory cells are the same, no issue. BUT if there are weak cells, how do you ensure that you identify them?
      - Control logic may be a very small area of the chip. If you fly 1000 devices, the total area is no longer “small”
  - Difficult to evaluate clock edge sensitivity of a node
- **Die access** (required for most single event testing)
  - Typical heavy ion single event macrobeam simulators have limited energy range
    - Implies limited penetration through packaged device
    - Access to die typically required
      - Overlayers, metalization, etc must be taken into account

<table>
<thead>
<tr>
<th>Facility</th>
<th>Ion (Energy)</th>
<th>LET (Si)</th>
<th>Range in Si (μm)</th>
<th>Peak LET</th>
</tr>
</thead>
<tbody>
<tr>
<td>NSCL</td>
<td>Xe (3.2 GeV)</td>
<td>40</td>
<td>272</td>
<td>69</td>
</tr>
<tr>
<td>TAMU</td>
<td>Ar (2 GeV)</td>
<td>5.9</td>
<td>390</td>
<td>18</td>
</tr>
</tbody>
</table>

Table assumes the ion traverses 1.5 mm of plastic. LET is given in MeV-cm$^2$/mg.
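
The coverage concern raised for macrobeam testing can be made concrete with a rough estimate. The sketch below treats particle arrivals as a Poisson process and computes the chance that a given node is struck at least once during a run to $1 \times 10^7$ particles/cm$^2$. The node areas are illustrative assumptions, not values from the presentation.

```python
import math

def hit_probability(fluence_per_cm2: float, node_area_um2: float) -> float:
    """Probability that a node of the given area is struck at least once.

    Assumes uniformly distributed, independent particle arrivals (Poisson).
    """
    area_cm2 = node_area_um2 * 1e-8  # 1 um^2 = 1e-8 cm^2
    expected_hits = fluence_per_cm2 * area_cm2
    return 1.0 - math.exp(-expected_hits)

FLUENCE = 1e7  # particles/cm^2, the typical heavy ion run quoted above

# Hypothetical node areas, chosen only for illustration.
for label, area_um2 in [("single memory cell", 0.1), ("small control block", 100.0)]:
    p = hit_probability(FLUENCE, area_um2)
    print(f"{label}: area {area_um2} um^2, P(hit) = {p:.4f}")
```

Under these assumptions only about 1% of cells of that size are struck in a single run, so a sparse population of weak cells can easily go unsampled.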
Fault Isolation –(4)

- Standard microbeam and laser test facilities have similar limitations for range of particle
  - On older technologies, these facilities are used to determine what structure within a device is causing fault/failure
  - New technique (two-photon absorption - TPA) with the laser is being developed, but is still in research phase
  - New test structures built specifically for test may be required
    - Reduced metalization, special packaging, etc.

TPA is a new technique to overcome some of the test limitations from packaged device and metalization issues. Courtesy Dale McMorrow, NRL
Radiation Test Challenge – Geometry

- Issue: the scaling of feature size and closeness of cells
- Technology complications
  - Multiple node hits with a single heavy ion track
    - Because of the closeness of transistors and thinness of the substrate material, a single particle strike can affect multiple nodes, potentially defeating hardening schemes.

<table>
<thead>
<tr>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
<th></th>
</tr>
</thead>
<tbody>
<tr>
<td>Logic Half Pitch (nm)</td>
<td>150nm</td>
<td>130nm</td>
<td>107nm</td>
<td>90nm</td>
<td>80nm</td>
<td>70nm</td>
<td>65nm</td>
<td>45nm</td>
<td>32nm</td>
<td>22nm</td>
</tr>
<tr>
<td>Logic Gate in Resist (nm)</td>
<td>90nm</td>
<td>70nm</td>
<td>65nm</td>
<td>53nm</td>
<td>45nm</td>
<td>40nm</td>
<td>35nm</td>
<td>25nm</td>
<td>18nm</td>
<td>13nm</td>
</tr>
<tr>
<td>DRAM Half Pitch (nm)</td>
<td>130nm</td>
<td>115nm</td>
<td>100nm</td>
<td>90nm</td>
<td>80nm</td>
<td>70nm</td>
<td>65nm</td>
<td>45nm</td>
<td>32nm</td>
<td>22nm</td>
</tr>
<tr>
<td>Contact in Resist (nm)</td>
<td>165nm</td>
<td>140nm</td>
<td>130nm</td>
<td>110nm</td>
<td>100nm</td>
<td>90nm</td>
<td>80nm</td>
<td>55nm</td>
<td>40nm</td>
<td>30nm</td>
</tr>
<tr>
<td>Overlay</td>
<td>45nm</td>
<td>40nm</td>
<td>35nm</td>
<td>32nm</td>
<td>28nm</td>
<td>25nm</td>
<td>23nm</td>
<td>18nm</td>
<td>13nm</td>
<td>9nm</td>
</tr>
</tbody>
</table>

Source: ITRS

Geometry Implications (2)

- Multiple node hits (cont’d)
  - Ex., memory array
    - A single particle strike can spread charge to multiple cells. If the cells are logically as well as physically adjacent:
      - Standard memory scrub techniques such as Hamming Code can be defeated
    - This is not new, simply exacerbated by scaling. Traditional SEU modeling considers particle strikes directly on a transistor
  - Charge spreading for strikes near but not on the transistor can generate errors
    - Measured error cross-sections may exceed physical cross-sections
  - Actual individual targets are smaller for a single particle; however,
    - More targets and the spread of non-target hits imply potentially increased error rates per device
  - The role of particle directionality and of secondaries requires future use of physics-based particle interaction codes coupled with circuit tools.
    - GEANT4, MCNPX, etc. are the type of codes required
      - Efforts have begun to turn these into tools and not just science codes
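
The way a multiple-cell hit defeats a Hamming code can be shown with a small sketch. It uses a Hamming(7,4) single-error-correcting code, chosen here only because it is compact; the word size, the bit layout, and the assumption that neighboring codeword bits sit in neighboring cells are illustrative, not taken from the presentation.

```python
def encode(data_bits):
    """Encode 4 data bits as a Hamming(7,4) codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]

def correct(word):
    """Apply single-error correction and return the 4 recovered data bits."""
    w = list(word)
    s1 = w[0] ^ w[2] ^ w[4] ^ w[6]
    s2 = w[1] ^ w[2] ^ w[5] ^ w[6]
    s3 = w[3] ^ w[4] ^ w[5] ^ w[6]
    syndrome = s1 + 2 * s2 + 4 * s3  # 1-based position of the suspected bad bit
    if syndrome:
        w[syndrome - 1] ^= 1
    return [w[2], w[4], w[5], w[6]]

data = [1, 0, 1, 1]
codeword = encode(data)

single = list(codeword)
single[2] ^= 1  # one upset cell

double = list(codeword)
double[2] ^= 1  # two neighboring cells upset
double[3] ^= 1  # by the same particle strike

print("single upset corrected:", correct(single) == data)
print("double upset corrected:", correct(double) == data)
```

The single upset is repaired, while the double upset produces a syndrome that points at a third, healthy bit, so the decoder returns wrong data without flagging a problem.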

Charge spreading from a single particle in an active pixel sensor (APS) array impacts multiple pixels
Geometry Implications (3)

- High-aspect ratio electronics
  - For “standard” devices, the direction of the secondary particles produced from a proton (or neutron) is considered omnidirectional
  - However, for electronics where there is a high-aspect ratio (very thin with long structure), this is not the case
    - The forward spallation of particles when the proton enters the device along the long structure increases the measured error cross-section
    - Test methods and error rate predictions need to consider this

Effects of protons in SOI with varied angular direction of the particle; Blue line represents expected response with “standard” CMOS devices.

after Reed, 2002
Geometry Implications (4)

- **Ultra-thin oxides provide two concerns**
  - Single particles rupturing the gate
    - This is a function of the thinness and the current across a gate oxide
  - The impact of oxide defects
    - Role for TID

- **Secondaries from packaging material**
  - Even on the ground, particle interaction with packaging materials can cause upsets to a sensitive device
    - Ex., a recent FPGA warning cited an expectation of up to 1 upset/spontaneous reconfiguration a day!

- **Small probability events have increased likelihood of occurring**
  - If 1 in $10^9$ particles causes a “larger” LET event or 1 in $10^6$ transistors can cause a more complex error
    - With billion plus transistor devices and potential use of >1000 of the same device (e.g., solid state recorders), small probabilities become finite
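
The arithmetic behind "small probabilities become finite" can be sketched as follows. The per-particle and per-transistor rates are the ones quoted above; the particle count per device is a hypothetical number used only to make the example concrete.

```python
import math

def prob_at_least_one(p_event: float, trials: float) -> float:
    """P(at least one occurrence) over independent trials with per-trial probability p_event."""
    # log1p/expm1 keep precision when p_event is tiny.
    return -math.expm1(trials * math.log1p(-p_event))

DEVICES = 1000                # ">1000 of the same device"
TRANSISTORS_PER_DEVICE = 1e9  # "billion plus transistor devices"

# Expected number of transistors able to cause the more complex error.
susceptible = 1e-6 * TRANSISTORS_PER_DEVICE * DEVICES
print(f"expected susceptible transistors across the fleet: {susceptible:.0f}")

# Hypothetical exposure: 1e6 particles through each device.
particles = 1e6 * DEVICES
p_large = prob_at_least_one(1e-9, particles)
print(f"P(at least one 'larger' LET event): {p_large:.3f}")
```

With these inputs a one-in-a-billion event has roughly a 63% chance of occurring somewhere in the fleet, which is why single-device test statistics can understate system risk.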

Sample 100 MeV proton reaction in a 5 um Si block.
Reactions have a range of types of secondaries and LETs.
(after Weller, 2004)
Radiation Test Challenge – Speed Implications

- Issue: the increasing device speeds (> GHz) impact testing, test capability requirements, and complicate effects modeling.

![Graph showing MPU Clock Frequency Actual vs ITRS](image)

- Actual Scaling Acceleration, or Equivalent Scaling Innovation Needed to maintain historical trend
- MPU Clock Frequency Historical Trend:
  - Gate Scaling, Transistor Design contributed ~ 17-19%/year
  - Architectural Design Innovation contributed additional ~ 21-13%/year

Sources: Sematech, 2001 ITRS ORTC
Speed (2)

- Technology Complications
  - Propagation of single event transients (SETs)
    - As opposed to a direct upset by a particle strike on a latch structure, the particle hit causes a transient (for example, a hit on combinatorial logic) that can propagate and change the state of a memory structure further down the chain.
      - The transient pulse width can be on the order of picoseconds to nanoseconds (or longer depending on circuit response)
        - Older, slower devices didn’t recognize the transient (i.e., the minimum pulse width required for circuit response was greater than that generated by a single particle)
        - Newer devices can now respond to these hits increasing circuit error rates
    - Transient size in analog devices has been seen to be a partial function of the range of the particle entering the device
      - Impacts facility usage choices
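
A first-order way to see why faster devices respond to transients is a latching-window estimate: a transient narrower than the minimum pulse the circuit can propagate is filtered out, and otherwise the chance that it overlaps a capturing clock edge grows with pulse width times clock frequency. This is a simplified model adopted here for illustration, and all numeric values are hypothetical.

```python
def latch_probability(pulse_width_s: float, clock_hz: float,
                      min_propagating_width_s: float) -> float:
    """First-order estimate of the chance that an SET is captured by a latch."""
    if pulse_width_s < min_propagating_width_s:
        return 0.0  # circuit is too slow to respond to the transient
    # Fraction of the clock period covered by the transient, capped at 1.
    return min(1.0, pulse_width_s * clock_hz)

PULSE = 200e-12  # hypothetical 200 ps transient

# Hypothetical older device: 50 MHz clock, needs at least 1 ns to respond.
print("older device:", latch_probability(PULSE, 50e6, 1e-9))

# Hypothetical newer device: 2 GHz clock, responds to 50 ps pulses.
print("newer device:", latch_probability(PULSE, 2e9, 50e-12))
```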

Speed (3)

- Propagation of SETs (cont’d)
  - Crossover appears in the ~400-500 MHz regime
  - Charge generation can now last for multiple clock cycles
    - Impact is to defeat hardening schemes that assume only a single clock cycle is affected
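
The multi-cycle concern reduces to comparing the duration of charge collection with the clock period. A minimal sketch, using integer picoseconds to avoid rounding surprises; the 1500 ps duration and both clock periods are hypothetical values.

```python
def cycles_affected(event_duration_ps: int, clock_period_ps: int) -> int:
    """Number of clock periods needed to span an event of the given duration."""
    return -(-event_duration_ps // clock_period_ps)  # ceiling division

EVENT_PS = 1500  # hypothetical charge collection time

for label, period_ps in [("100 MHz", 10_000), ("5 GHz", 200)]:
    print(f"{label}: {cycles_affected(EVENT_PS, period_ps)} cycle(s)")
```

At the slower clock the event fits inside one period, which is the assumption many hardening schemes make; at the faster clock the same event spans several periods.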
Speed (4)

The average number of errors observed per single particle event increases with speed and LET.

Effects of heavy ions on SiGe devices at 12 GHz speeds show anomalous charge collection in this high-speed technology; the drawn line represents the expected response with “standard” models.

Expected curve shape

Anomalous angular effects at low LET
Testing at a remote facility requires highly portable test equipment capable of high-speed measurements

- Tester needs to be near the device or utilize high-speed drivers
  - Cable runs between the device under test (DUT) and the tester can be up to 75 feet
- Simple devices like a shift register chain can be tested using bit error rate testers (BERTs)
  - BERTs can run to ~$1M and tend to be very sensitive to problems from shipping
    - At proton test facilities, secondaries (neutrons) are generated that can cause failures in the expensive test equipment if it is located near the DUT
- Self-test techniques are being developed for testing shift registers
  - Modern reconfigurable FPGA-based test boards are being developed to test more generic devices
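
The 75-foot cable run translates into a substantial delay relative to a GHz clock. A rough sketch, assuming a coaxial cable with a velocity factor of 0.66 (a typical value, not one given in the presentation):

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0
METERS_PER_FOOT = 0.3048

def cable_delay_ns(length_ft: float, velocity_factor: float = 0.66) -> float:
    """One-way signal propagation delay through a cable, in nanoseconds."""
    length_m = length_ft * METERS_PER_FOOT
    return length_m / (velocity_factor * SPEED_OF_LIGHT_M_PER_S) * 1e9

delay = cable_delay_ns(75.0)
print(f"one-way delay over 75 ft: {delay:.1f} ns")
print(f"clock periods in flight at 1 GHz: {delay / 1.0:.0f}")
```

Under this assumption more than a hundred clock periods elapse before a signal reaches the far end, which is why the tester either sits near the DUT or relies on high-speed drivers.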

Beware of stray neutrons impinging on your test equipment.
Here, Borax is shown on top of a power supply to absorb neutrons.
Speed (6)

- Testing in a vacuum chamber implies mechanical, power/thermal, and hardware mounting constraints
  - High-speed devices often mean high power consumption
    - Issue is mounting of DUT in vacuum chamber and removal of heat
      - Can also be a challenge NOT in a vacuum
      - DUT may need to be custom packaged to allow for thermal issues
    - Active system required for removal of heat

Brookhaven National Laboratory’s Single Event Upset Test Facility (SEUTF)
Specialty Packaging for Radiation Test

Front

Back
Summary and Comments

- We have presented a brief overview of SOME of the radiation challenges facing emerging scaled digital technologies
  - Implications on using consumer grade electronics
  - Implications for next generation hardening schemes
- Comments
  - Commercial semiconductor manufacturers are recognizing some of these issues as issues for terrestrial performance
    - Looking at means of dealing with soft errors
  - The thinned oxide has indicated improved TID tolerance of commercial products
    - Hardened by “serendipity”
      - Does not guarantee hardness or say if the trend will continue
    - Reliability implications of thinned oxides

The Top Five Research/Development Areas Required for Radiation Test and Modeling – Author’s Opinions

- 5 Understanding extreme value statistics as it applies to radiation particle impacts
- 4 System Risk Tools
- 3 High-Energy SEU Microbeam and TPA Laser
- 2 Portable High-Speed Device Testers
- 1 Physics Based Modeling Tool
Backup Slides
Mainstream digital – CMOS scaling

Semiconductor Roadmap

"Moore’s Law" continues to drive semiconductor roadmap
- ~30% reduction in transistor size with each new technology

From <10k in 1975 to >1B in 2010
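
The two figures on this slide can be cross-checked with a little arithmetic. The sketch assumes the counts are transistors per device, that growth is steadily exponential between the two endpoints, and that "size" refers to a linear dimension; none of these is stated explicitly on the slide.

```python
import math

def doubling_time_years(count_start, year_start, count_end, year_end):
    """Average time for the count to double, assuming steady exponential growth."""
    doublings = math.log2(count_end / count_start)
    return (year_end - year_start) / doublings

# Endpoints quoted on the slide, taken as exact for the estimate.
t_double = doubling_time_years(1e4, 1975, 1e9, 2010)
print(f"average doubling time: {t_double:.2f} years")

# A ~30% linear shrink per generation leaves ~0.7x the dimension,
# so the area per transistor scales by about 0.7 squared.
area_scale = 0.7 ** 2
print(f"area per transistor after one generation: {area_scale:.2f}x")
```

Because the slide gives the endpoints as bounds (<10k and >1B), the computed doubling time of roughly two years is an upper estimate, and the halving of area per generation is consistent with density doubling at each new technology.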
Total Ionizing Dose (TID)

- Cumulative long-term ionizing damage due to protons & electrons
- Effects
  - Threshold Shifts
  - Leakage Current
  - Timing Changes
  - Functional Failures
- Unit of interest is krad(material)
- Can partially mitigate with shielding
  - Low energy protons
  - Electrons

Erase Voltage vs. Total Dose for 128-Mb Samsung Flash Memory
Displacement Damage (DD)

- Cumulative long-term *non-ionizing* damage due to protons, electrons, and neutrons
- Effects
  - Production of defects which results in device degradation
  - May be similar to TID effects
  - Optocouplers, solar cells, CCDs, linear bipolar devices
- Unit of interest is particle fluence for each energy mapped to test energy
  - Non-ionizing energy loss (NIEL) is one means of discussing
- Shielding has some effect - depends on location of device
  - Reduce significant electron and some proton damage
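
Mapping a particle spectrum to a single test energy with NIEL can be sketched as a damage-weighted sum: each energy bin contributes its fluence times its NIEL, and the total is divided by the NIEL at the test energy. The NIEL values and the spectrum below are placeholders in arbitrary units, not physical data; a real calculation would use tabulated NIEL for the material and particle of interest.

```python
def equivalent_fluence(spectrum, niel, test_energy_mev):
    """Fluence at the test energy giving the same NIEL-weighted damage as the spectrum.

    spectrum: list of (energy_MeV, fluence_per_cm2) bins
    niel:     dict mapping energy_MeV to a NIEL value (any consistent unit)
    """
    damage = sum(fluence * niel[energy] for energy, fluence in spectrum)
    return damage / niel[test_energy_mev]

# Placeholder NIEL table (arbitrary units, NOT measured values).
niel_table = {10: 8.0, 50: 4.0, 100: 2.0}

# Placeholder mission spectrum, particles/cm^2 per energy bin.
mission = [(10, 1e10), (50, 5e9), (100, 2e9)]

phi_eq = equivalent_fluence(mission, niel_table, test_energy_mev=50)
print(f"equivalent 50 MeV fluence: {phi_eq:.2e} particles/cm^2")
```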
Single Event Effects (SEEs)

- An SEE is caused by a *single charged particle* as it passes through a semiconductor material
  - Heavy ions
    - Direct ionization
  - Protons for sensitive devices
    - Nuclear reactions for standard devices
    - Optical systems, etc are sensitive to direct ionization
- Effects on electronics
  - If the LET of the particle (or reaction) is greater than the amount of energy or *critical charge* required, an effect may be seen
    - Soft errors such as upsets (SEUs) or transients (SETs), or
    - Hard (destructive) errors such as latchup (SEL), burnout (SEB), or gate rupture (SEGR)
- Severity of effect is dependent on
  - type of effect
  - system criticality
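
The comparison between LET and critical charge can be illustrated by converting LET into deposited charge along a path in silicon. The sketch uses the density of silicon (about 2.33 g/cm$^3$) and about 3.6 eV per electron-hole pair; the 1 μm collection depth and the critical charge value are hypothetical, and the model ignores charge collected by diffusion or funneling.

```python
SI_DENSITY_MG_PER_CM3 = 2330.0  # silicon, about 2.33 g/cm^3
EV_PER_PAIR = 3.6               # approx. energy per electron-hole pair in Si
ELECTRON_CHARGE_C = 1.602e-19

def deposited_charge_pc(let_mev_cm2_per_mg: float, path_um: float) -> float:
    """Charge (pC) generated along a path in silicon by a particle of the given LET."""
    energy_mev = let_mev_cm2_per_mg * SI_DENSITY_MG_PER_CM3 * path_um * 1e-4
    pairs = energy_mev * 1e6 / EV_PER_PAIR
    return pairs * ELECTRON_CHARGE_C * 1e12

COLLECTION_DEPTH_UM = 1.0  # hypothetical sensitive depth
CRITICAL_CHARGE_PC = 0.05  # hypothetical critical charge of a node

for let in (1.0, 5.9, 40.0):
    q = deposited_charge_pc(let, COLLECTION_DEPTH_UM)
    print(f"LET {let}: {q:.3f} pC, upset possible: {q > CRITICAL_CHARGE_PC}")
```

The conversion works out to roughly 0.01 pC per μm for each MeV-cm$^2$/mg of LET, which matches the commonly quoted rule of thumb that about 97 MeV-cm$^2$/mg corresponds to 1 pC/μm in silicon.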
Total Ionizing Dose (TID) – Technology Trends (1)

- CMOS Digital Volatile Memory & Logic Technology
  - As CMOS has scaled in the past few years, the trend is for TID tolerance to increase
  - 0.25um feature size and below has shown 100krad(Si) tolerance and greater without any additional hardening
  - Thinner oxides are prime driver in this

![DRAM Cell Area History / 2001 ITRS Model](image)

- Actual Scaling Acceleration, Or Equivalent Scaling Innovation Needed to maintain historical trend

[Source: SEMATECH, 2001 ITRS Roadmap]
Total Ionizing Dose (TID) – Technology Trends (2)

- CMOS Programmable and Non-Volatile Memory Technologies
  - Both technologies show sensitivity to TID, < 100 krad (Si) in some cases, due to need for higher control voltages such as charge pumps and sensitivity of sense amps

Submicron FPGA TID Tolerance, 0.35 μm to 0.6 μm

- RT54SX16 Proto, 0.6 μm, 3.3V, MEC
- A54SX16 Proto, 0.35 μm, 3.3V, CSM
- A42MX09, 0.45 μm, 5.0V, CSM
- QL3025, 0.35 μm, 3.3V, TSMC
- XQR4000XL, 0.35 μm, 3.3V, 60
- RH54SX16 Proto, 0.6 μm, 3.3V, > 200

FPGA TID Response showing TID Sensitivity

Bipolar Linear Technologies
- Demonstrate extreme sensitivity to TID, with parametric & functional failures at << 100 krad
- Many modern devices subject to Enhanced Low Dose-Rate Sensitivity (ELDRS) Effects
  - It has been predicted that this effect may be seen in scaled CMOS as the scaling approaches a bipolar-like structure (Fleetwood, et al.)

ELDRS Enhancement Factor (EF) vs. dose rate for several bipolar linear circuits