



# Survey of Single Event Functional Interrupt Test and Prediction Methods

Student: Nicole E. Ogden

**Electrical Engineer, Ph.D. Student** 

Advisor: Dr. Brian D. Sierawski

Vanderbilt University

**NEPP Electronics Technology Workshop, 2020** 

This work was supported by NASA NEPP Grant #80NSSC20K0424





### Vanderbilt Engineering

- Description
  - Single event functional interrupts (SEFIs) are elusive soft errors that can unexpectedly debilitate a component in a radiation environment
- Objective
  - Survey the community's handling of accelerated test and on-orbit prediction of SEFIs in published literature
  - Characterize signature SEFI behaviors and their repercussions
  - Identify shortcomings and considerations for test, operation, measurement, and extrapolation to failure rates
- Outcome
  - Provide a source of reference for radiation effects engineers



### Overview



- What is a SEFI?
  - Working Definition
  - SEFI Importance
  - Component-Level SEFI Analysis
- Experimental Considerations
  - Testing
  - Preparation
  - Measurement
  - Prediction Model
- Future Work

# What is a SEFI?




- JESD57 Test Standard [1]
  - **Previous Definition (1996):** "The loss of functionality of the device that does not require cycling of the device's power to restore operability unlike SEL and does not result in permanent damage as in SEB."
  - **Latest Definition (2017):** "A non-destructive interruption resulting from a single ion strike that causes the component to reset, hang, or enter a different operating condition or test mode."



[1] JESD57 Test Standard, "Test Procedures for the Measurement of Single-Event Effects in Semiconductor Devices from Heavy Ion Irradiation"

[2] Texas Instruments, "Radiation Handbook for Electronics", Texas Instruments Incorporated, 2019

# **SEFI Importance**


- Causes errors to manifest over time in a part
- Errors can potentially propagate
  - Internally to other regions of the component
  - Externally to other parts in a complex system
- Component recovery method
  - Reset the part (e.g., power off/on cycle or issuing a soft reset signal)
  - Re-initialize the part if required
  - Data is lost at the component and system levels



SEFI Examples in a Memory

Figure 4-11. Schematic representation of a SEFI fail mode in a memory. A single bit corrupted in the control logic leads to erroneous behavior that causes many failures in the memory array (red bits) – SEFIs usually manifest as blocks, sections of rows or columns, depending on what logic was affected.

[2] Texas Instruments, "Radiation Handbook for Electronics", Texas Instruments Incorporated, 2019





### **Component Analysis**



- Reviewed
  - Processors
  - Graphics Processing Units (GPUs)
  - Field-Programmable Gate Arrays (FPGAs)
  - Microcontrollers
  - Non-Volatile Memory
- Ongoing
  - Sensors
  - Radios
  - Microelectronics
  - Box-level parts
  - Other avionics and space technology

### SEFI Example in SDRAM Memory



Fig. 3. Graphic display of SEU information grouped by row for one bank. Two band-type SEFIs are visible. Shade decoding information is in Table IV. [3]

[3] S. M. Guertin et al., "Programmatic Impact of SDRAM SEFI", IEEE REDW, Jul. 2012
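
A band-type SEFI like the ones in the figure can be separated from isolated upsets by grouping failing bits by row. The snippet below is a minimal sketch of that idea, not the method used in [3]: the function name `find_band_rows`, the `min_fraction` cutoff, and the sample data are all assumptions made for illustration.

```python
from collections import Counter


def find_band_rows(error_cells, columns_per_row, min_fraction=0.5):
    """Return {row: fraction_of_row_in_error} for rows that look like band SEFIs.

    error_cells     -- iterable of (row, column) pairs, one per failing bit
    columns_per_row -- number of bits in one row of the bank
    min_fraction    -- hypothetical cutoff; rows at or above it are flagged
    """
    # Deduplicate so a bit reported twice is only counted once.
    per_row = Counter(row for row, _column in set(error_cells))
    return {
        row: count / columns_per_row
        for row, count in sorted(per_row.items())
        if count / columns_per_row >= min_fraction
    }


# Illustrative data: three isolated upsets plus one fully corrupted row.
isolated_upsets = [(3, 17), (40, 5), (41, 100)]
band = [(12, column) for column in range(128)]
flagged = find_band_rows(isolated_upsets + band, columns_per_row=128)
print(flagged)
```

Rows that are flagged would be tabulated as one SEFI each rather than as many individual SEUs, which avoids inflating the SEU count. The right cutoff depends on the device and the fluence per run.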



# **Component-Specific SEFI Examples**




- Common SEFI Characteristics:
  - Data loss or corruption
  - Abrupt transition to an unknown/undefined/illegal state
  - Halted program operation
  - Normal functionality recoverable after a power cycle

| Processor | GPU | FPGA | Microcontroller | Non-Volatile Memory |
|---|---|---|---|---|
| Undefined state transition | Non-zero exit status | Communication disruption | Impulse on GPIO "Reset Indicator" line | Page Errors (PE) |
| Modified control flow | Scheduler errors | Phase-lock loop loss | Current consumption spike | Block Errors (BE) |
| Repetitive exceptions | PCIe bus errors | Sync loss | | Vertical Errors (VE) |

# **Testing Considerations**




- Equipment Usage & Setup
  - Common tools used in experiments
  - Setup and configurations
- Test Procedure
  - Device preparation
    - Package removal
    - Device programming
  - Sequence for executing irradiation test
  - Researcher response to an observed SEFI (e.g., pause irradiation and tabulate, or continue the irradiation experiment)



Figure 2. This is the layout of the test apparatus used. There are two test PCs (on left) which are connected via the bulk head adapter to the DUT board (on right). [4]

[4] S. M. Guertin et al., "Dynamic SDRAM SEFI Detection and Recovery Test Results", IEEE REDW, pp. 62-67, Jul. 2004

# **Preparation Considerations**





#### Component Operating States

+ Statically operated part – programmed DUT, with or without a voltage bias, executing no operations; idle state

+ Dynamically operated part – continuous execution of an operation



#### Functional Operation Executions

+ Performing a standard operation (e.g., read, write, erase) and/or a complex data processing operation (e.g., matrix multiplication, FFT)



#### Part Monitoring

+ Monitoring continuous communication with, and control of, the device under test (DUT)
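
One common way to monitor continuous communication is to have the DUT emit a periodic heartbeat and flag any gap that is too long as a candidate hang. The following is a minimal sketch under that assumption. The function name `find_heartbeat_gaps`, the period, the tolerance, and the timestamps are illustrative and not taken from any specific test setup.

```python
def find_heartbeat_gaps(timestamps_s, period_s, tolerance_s):
    """Return (last_seen, next_seen) pairs where the DUT went silent too long.

    timestamps_s -- sorted arrival times of DUT heartbeat messages, in seconds
    period_s     -- expected time between heartbeats
    tolerance_s  -- allowed jitter before a gap is treated as a candidate hang
    """
    gaps = []
    for previous, current in zip(timestamps_s, timestamps_s[1:]):
        if current - previous > period_s + tolerance_s:
            gaps.append((previous, current))
    return gaps


# Illustrative log: the DUT stops responding after t = 2.0 s and is
# recovered (for example by a reset) shortly before t = 7.5 s.
heartbeats = [0.0, 1.0, 2.0, 7.5, 8.5]
candidate_hangs = find_heartbeat_gaps(heartbeats, period_s=1.0, tolerance_s=0.5)
print(candidate_hangs)
```

Each reported gap still has to be examined by the test engineer, because a silent interval can also come from the test equipment or the communication link rather than from a SEFI in the DUT.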



#### DUT Programming

+ All 0's, all 1's, checkerboard pattern (e.g., 0x55 or 0xAA)

+ Input stimulus signals that create a worst-case scenario for producing SEFIs
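
The data patterns listed above can be generated and checked with a few lines of code. This is a minimal sketch: the helper names `make_pattern` and `count_bit_errors` and the pattern labels are hypothetical, and a real test would read `observed` back from the DUT instead of constructing it by hand.

```python
FILL_BYTES = {
    "zeros": 0x00,            # all 0's
    "ones": 0xFF,             # all 1's
    "checkerboard_55": 0x55,  # 01010101
    "checkerboard_AA": 0xAA,  # 10101010
}


def make_pattern(name, n_bytes):
    """Return n_bytes of the named fill pattern as a bytes object."""
    return bytes([FILL_BYTES[name]]) * n_bytes


def count_bit_errors(expected, observed):
    """Count the bits that differ between two equal-length byte strings."""
    if len(expected) != len(observed):
        raise ValueError("expected and observed must have the same length")
    return sum(bin(e ^ o).count("1") for e, o in zip(expected, observed))


expected = make_pattern("checkerboard_55", 4)
observed = bytearray(expected)
observed[2] ^= 0x08  # simulate a single flipped bit in the third byte
print(count_bit_errors(expected, bytes(observed)))
```

Writing 0x55 in one run and 0xAA in the next exercises every cell in both logic states, which is why the two checkerboards are usually used as a pair.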



# **Measurement Considerations**



- Factors
  - Delineating between SEFIs and other SEs
    - SEFIs can propagate additional errors over time
    - This can lead to overcounting of other SEs
  - Examining the ramifications of masked SEFIs:
    - An observed SEFI obscuring the presence of other SEFIs in a part
    - Whether the recovery method resolves only the observed SEFI or also hidden, non-observable SEFIs
    - Latency and the distinction between a new SEFI and a lingering SEFI
  - Tabulation method for observed SEFIs based on signature characteristics
- Reporting
  - Standard cross section with fitted Weibull curve
  - Calculations: failures in time (FIT), mean time to failure (MTTF), mean fluence to failure (MFTF)
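
The reporting quantities above are related by simple formulas, sketched below. The sketch assumes the commonly used four-parameter Weibull form for cross section versus LET and a constant event rate for MTTF and FIT. All function names and numeric values are illustrative, and obtaining the Weibull parameters from measured data would require a separate least-squares fit that is not shown.

```python
import math


def cross_section_cm2(n_events, fluence_per_cm2):
    """Measured cross section: observed events divided by beam fluence."""
    return n_events / fluence_per_cm2


def mean_fluence_to_failure(n_events, fluence_per_cm2):
    """MFTF: average fluence per event, the reciprocal of the cross section."""
    return fluence_per_cm2 / n_events


def weibull_cross_section(let, sigma_sat, let_onset, width, shape):
    """Four-parameter Weibull curve; zero at or below the onset LET."""
    if let <= let_onset:
        return 0.0
    return sigma_sat * (1.0 - math.exp(-(((let - let_onset) / width) ** shape)))


def mttf_hours(rate_per_hour):
    """MTTF for a constant event rate."""
    return 1.0 / rate_per_hour


def fit_rate(rate_per_hour):
    """FIT: expected failures per 1e9 device-hours."""
    return rate_per_hour * 1e9


# Illustrative run: 5 SEFIs observed over a fluence of 1e7 ions/cm^2.
print(cross_section_cm2(5, 1e7), mean_fluence_to_failure(5, 1e7))
# Illustrative predicted rate of 1e-6 events per device-hour.
print(mttf_hours(1e-6), fit_rate(1e-6))
```

Because SEFI counts per run are often small, the statistical uncertainty on each cross-section point can be large, and it is worth reporting alongside the fitted curve.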



# **Prediction Model Considerations**



#### **RPP-based Methods**

**Approach:** Sensitive volume pathlength distribution calculated per bit.

**Benefits:** Tilt-angular response is captured.

**Limitations:** Assumes the sensitive volume is uniform; requires that the total number of bits in the DUT be known in advance.
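
To illustrate why a rectangular parallelepiped (RPP) sensitive volume captures the tilt-angle response, the sketch below computes the path length of an ion through the center of a box and the charge it deposits. This is a simplification made for illustration: a full RPP calculation integrates over the whole chord-length distribution, not only chords through the center. The function names, the box dimensions, and the approximate silicon conversion factor of 0.01035 pC/µm per MeV·cm²/mg are assumptions of this example.

```python
import math

# Approximate charge deposited in silicon per unit path length and unit LET,
# in pC/um per MeV*cm^2/mg.
PC_PER_UM_PER_LET = 0.01035


def chord_through_center_um(dims_um, tilt_deg, roll_deg=0.0):
    """Path length through the center of a box with sides (a, b, c) in um.

    tilt_deg is measured from the surface normal (the c axis);
    roll_deg rotates the ion track around that normal.
    """
    theta = math.radians(tilt_deg)
    phi = math.radians(roll_deg)
    direction = (
        math.sin(theta) * math.cos(phi),
        math.sin(theta) * math.sin(phi),
        math.cos(theta),
    )
    # The track exits through whichever pair of faces it reaches first.
    return min(
        side / abs(component)
        for side, component in zip(dims_um, direction)
        if abs(component) > 1e-12
    )


def deposited_charge_pc(let_mev_cm2_mg, path_um):
    """Charge deposited along the path, assuming constant LET."""
    return PC_PER_UM_PER_LET * let_mev_cm2_mg * path_um


box = (2.0, 2.0, 1.0)  # illustrative sensitive volume, in um
for tilt in (0, 60, 80):
    path = chord_through_center_um(box, tilt)
    print(tilt, round(path, 3), round(deposited_charge_pc(10.0, path), 4))
```

At small tilt angles the path grows as 1/cos(tilt), but once the track starts exiting through a side wall the path length is capped by the lateral dimensions. A purely cosine-based model does not reproduce that cap.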

#### **Effective Flux-based Methods**

**Approach:** Converts an omnidirectional environment into a directional one. Critical charge input generates an effective-LET flux.

**Benefits:** Uses a macroscopic cross section for the DUT.

**Limitations:** Assumes DUT sensitive region(s) are planar. Tilt-angle response follows the cosine of the angle of incidence.
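
The cosine assumption mentioned above is usually applied as two scalings of beam data taken at a tilt angle: the LET is divided by the cosine of the angle and the fluence is multiplied by it. The snippet below is a minimal sketch of those two conversions. The function names and sample values are illustrative, and the scaling is only valid while the thin, planar sensitive-region assumption holds.

```python
import math


def _cosine(tilt_deg):
    """Cosine of the tilt angle, restricted to 0 <= tilt < 90 degrees."""
    if not 0.0 <= tilt_deg < 90.0:
        raise ValueError("tilt angle must be in the range [0, 90) degrees")
    return math.cos(math.radians(tilt_deg))


def effective_let(let_normal, tilt_deg):
    """Effective LET: a tilted ion crosses a longer path in a thin layer."""
    return let_normal / _cosine(tilt_deg)


def effective_fluence(fluence_beam, tilt_deg):
    """Effective fluence: a tilted DUT presents a smaller projected area."""
    return fluence_beam * _cosine(tilt_deg)


# Illustrative run: LET of 10 MeV*cm^2/mg, 1e7 ions/cm^2, DUT tilted 60 degrees.
print(effective_let(10.0, 60.0), effective_fluence(1e7, 60.0))
```

Comparing cross sections measured at normal incidence and at tilt for the same effective LET is one way to check whether the cosine assumption holds for a given part.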



# **Future Work**



- Develop a document for test engineers to reference
- Investigate family-specific descriptions of SEFIs that could aid in delineating them from other soft errors
- Assess the differing techniques used to detect, classify and tabulate SEFI events
- Evaluate methods for predicting SEFIs
- Seek collaborations with interested researchers and engineers