NEPP Electronic Technology Workshop
June 22-24, 2010

FPGAs—Working with a Commercial Consortium: Tales of Xilinx Virtex-4 and SIRF Radiation Testing

Greg Allen (JPL)

This research was carried out in part by the Jet Propulsion Laboratory, California Institute of Technology, under contract with the National Aeronautics and Space Administration under the NASA Electronic Parts and Packaging Program. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not constitute or imply its endorsement by the United States Government or the Jet Propulsion Laboratory, California Institute of Technology.
Introduction

• Historically, reconfigurable FPGAs have had relatively sensitive radiation responses
  – Altera (SEL)
  – Actel (TID/SEU)
  – Xilinx (SEU/SEFI)

• The aerospace community has traditionally used one-time-programmable FPGAs (e.g., antifuse) due to their relative SEE/TID robustness
  – Increasing interest in recent years in implementing reconfigurable devices (Xilinx QR in particular)
  – Has led to challenges in mitigation, verification, and system error rate calculations
SEE Mitigation—TMR and RHBD

- **EDAC (Virtex-4)**
  - TMR and scrubbing
    - Complicated implementation
    - Increased engineering cost
    - Complicated verification and error rate calculation

- **RHBD (Virtex-5)**
  - Transparent implementation from the designer's perspective
  - Complex radiation response requires new flight qualification methodologies
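
The TMR-and-scrubbing approach listed for Virtex-4 can be illustrated with a minimal Python sketch. It is a software toy, not a flight implementation: the `majority` function and the 8-bit `golden` value are hypothetical, and real TMR votes in fabric logic while scrubbing rewrites configuration memory from a stored image.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote across three redundant copies."""
    return (a & b) | (a & c) | (b & c)

golden = 0b10110010                 # hypothetical correct value
copies = [golden, golden, golden]   # three redundant domains

# A single upset in one copy is masked by the voter.
copies[1] ^= 0b00001000
single_ok = majority(*copies) == golden

# A second upset of the same bit in another copy, arriving before
# any repair, defeats the voter.
copies[2] ^= 0b00001000
double_ok = majority(*copies) == golden

# Scrubbing (modelled here as rewriting every copy from the golden
# value) removes accumulated upsets before they can pair up.
copies = [golden] * 3
```

The second case is why the scrub cycle time matters: the longer upsets are allowed to accumulate, the more likely two of them land in the same voted bit.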
NEPP Task

Description:

This task aims to enable reconfigurable FPGAs (field programmable gate arrays) to be used in critical applications as a replacement for custom ASICs (application-specific integrated circuits). Because these devices are so soft to upsets, accomplishing this requires the development of effective mitigation techniques and tools. Further, mission assurance methodologies are progressing to ensure that FPGA-based flight designs are as robust as intended. In particular, we intend to:

1. Further develop upset performance assessment techniques and guidelines targeting reprogrammable FPGA flight designs.
2. Participate in tests with the Xilinx Radiation Test Consortium.
3. Investigate reconfigurable FPGAs from other vendors, as available.
4. Develop and evaluate radiation tolerance and upset mitigation techniques, models, and tools.

FY10 Plans:

- Xilinx SIRF testing
  - Evaluation of the final SIRF product
  - SET studies
  - uBlaze/Leon3 characterization
  - IP block characterization
- SIRF Qualification report
- Exploratory radiation testing of current FPGA technology
  - SiliconBlue iCE65
  - Altera Stratix IV
- Explore available non-volatile memory technologies for configuration storage solutions

Schedule:

[Gantt chart: Reconfigurable FPGAs Tech. - Rad., cont., 2009-2010. Tasks shown:]

- Altera Heavy Ion test
- SIRF tests
- Test Development (all FPGA)
- Fault Injection comparison document
- Technology Comparison Document
- SiliconBlue Test

Deliverables:

- Test reports and technology comparison of current reconfigurable FPGA technology and non-volatile memory technologies
- Test reports on SIRF irradiations
  - Embedded processor solution
  - IP blocks

NASA and Non-NASA Organizations/Procurements:

- Xilinx Radiation Test Consortium (beam time)
- Xilinx, Inc. (beam time and SIRF device samples)
- Actel Corp. (potential samples)
Goals

- **Full static radiation characterization of the Xilinx XQR5VFX130 [Single Event Immune Reconfigurable FPGA (SIRF)] device in conjunction with the Xilinx Radiation Test Consortium**
  - Provide a methodology for NASA missions to determine error rates and mitigation methodologies (as necessary)
- **Evaluate other reconfigurable FPGA vendors for SEE/TID**
  - SiliconBlue iCE65
  - Altera Stratix IV/Stratix V
- **Evaluate non-volatile memory products as available**
  - SONOS devices
  - Mitigated flash
What is the XRTC?

- The Xilinx Radiation Test Consortium (XRTC) was founded in 2002 by NASA/JPL and Xilinx to evaluate Xilinx FPGAs for aerospace applications
- Open to US organizations
- Goals:
  - Provide the aerospace community with collaborative testing of radiation effects in Xilinx FPGAs
  - Provide mitigation methodologies for observed radiation effects in said FPGAs
Why Participate with a Commercial Consortium?

- Qualifying a new FPGA is like peeling the layers off an onion
- Extremely complicated and expensive devices to test
- Allows the organization to bring in and/or develop various “specialists”
- Different aerospace programmatic/engineering perspectives
- Provide “watchdog” role for the output of the Consortium

**Spreads the cost of very expensive device qualification across several aerospace entities**
Working Commercial Consortium Gotchas

- AKA...how to avoid a marketing campaign
- Balancing the forest and the trees
Expected Impact to Community

• Virtex-II and Virtex-4 devices are SEU-soft, complicated devices that require complex mitigation, verification, and error rate calculation
  – Mitigation schemes need to be evaluated on a per-implementation basis
• A “rad-hard” reconfigurable FPGA has the potential to be a game-changing technology
  – Relative radiation robustness needs to be completely understood
  – Implementation tradeoffs need to be characterized as a function of radiation robustness
• New Product Evaluation
Status/Schedule

• Preliminary characterization performed (engineering samples)
  – Preliminary SEL
  – Preliminary SEU characterization:
    • Configuration Cells
    • SET Filters
    • BRAM/BRAM ECC
    • DCM/PLL
    • MGT

• Just received product silicon and processed them (thinning/assembly)
• Purchased SiliconBlue and Altera devices (Q4 testing)
General FPGA Radiation Effects Evaluation Path

- Single-Event Latchup
- Static Characterization (Heavy Ion/Proton)
  - Configuration Elements, RAM, Registers, and Device-Level Single-Event Functional Interrupt
- Total Ionizing Dose Susceptibility
- IP Block Characterization (Dynamic Testing)
  - Clock Management, I/O, Processors, Multipliers, etc
Enabling yet SEU-sensitive devices require complex upset mitigation in most use cases.
Virtex-4 Mitigation and Verification Selection

- What is the underlying, unmitigated system error rate?
  - Fault injection, accelerator testing, or software estimation
- What is the probability of observing an error?
  - Error rate and operating period
- What is the level of mitigation that is going to be required?
  - Engineering vs. reliability
- What level of configuration correction is going to be required?
  - Level of error persistence
- How will this mitigation scheme be verified?
  - Fault injection or accelerator testing
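
The "error rate and operating period" question above can be made concrete with a short Python sketch. It assumes upsets arrive as a Poisson process with a constant system error rate, which the slide does not state; the function name and the example rate of 1e-8 errors per second are hypothetical placeholders, not measured values.

```python
import math

def prob_at_least_one_error(rate_per_s: float, period_s: float) -> float:
    """P(at least one error) = 1 - exp(-R * T) for a constant-rate Poisson process."""
    # expm1 keeps precision when R * T is very small.
    return -math.expm1(-rate_per_s * period_s)

# Hypothetical numbers for illustration only.
system_error_rate = 1.0e-8          # errors per second (assumed)
one_year = 365 * 24 * 3600          # operating period in seconds

p_error = prob_at_least_one_error(system_error_rate, one_year)
print(f"Probability of at least one error in a year: {p_error:.3f}")
```

Comparing this probability against the mission's tolerance is one way to frame the engineering-versus-reliability tradeoff in the next question.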

Highlights/Accomplishments
Virtex-4

\[ T_C = 0.669 \text{ sec} \]
\[ M = 8650 \]
\[ M_2 = M_3 = M_4 = 240 \]
\[ M_U = 4016 = \text{FI Errors} \]

For \( r = 3.2 \times 10^{-12} \) (GCR)
R 4016 configuration bits

Verification of complex devices with complex mitigation is facilitated by new models and test methodologies
# Virtex-5 Overview

## What’s Different?

<table>
<thead>
<tr>
<th>Description</th>
<th>Available Resources</th>
<th>Radiation-Hardened Implementation/Mitigation</th>
</tr>
</thead>
<tbody>
<tr>
<td>CFG* Configuration bits (millions)</td>
<td>49</td>
<td>12T cells</td>
</tr>
<tr>
<td>BRAM Block memory bits</td>
<td>10,985,472</td>
<td>EDAC</td>
</tr>
<tr>
<td>LOGIC Slices (2, 6-input lookup tables/slice)</td>
<td>20,480</td>
<td>12T FF, 12T cells, SET filtering</td>
</tr>
<tr>
<td>DSP** 18 × 18 MACs</td>
<td>320</td>
<td>None</td>
</tr>
<tr>
<td>PPC PowerPC405 processors</td>
<td>2</td>
<td>None</td>
</tr>
<tr>
<td>CMT*** Clock managers</td>
<td>6</td>
<td>None</td>
</tr>
<tr>
<td>MGT High-speed transceivers</td>
<td>18</td>
<td>None</td>
</tr>
<tr>
<td>IOBs Input/output blocks</td>
<td>840</td>
<td>12T registers, TMR’ed Digitally Controlled Impedance Controller</td>
</tr>
</tbody>
</table>
Highlights/Accomplishments
Virtex-5

New methodology developed for characterizing dual-node configuration cells
Highlights/Accomplishments

Virtex-5

• Other “Configuration” Upsets
  – INIT/Capture Bits
    • No design impact
  – Weakly Loaded Common Address Lines
    • SETs now affect “static” cross-section
  – Dynamic Reconfiguration Port (DRP) Bits
    • Allow users to dynamically update configuration values
    • Need to be separately monitored and scrubbed

SETs dominate the overall error mode at normal incidence, implying a paradigm shift in SEE characterization of Xilinx FPGAs
Highlights/Accomplishments
Virtex-5

- BRAM ECC

\[ R \approx M_U r + \frac{1}{2} T_C N_W N_{B/W} (N_{B/W} - 1) r^2 \]

- \( T_C = 0.03 \) sec
- \( N_W = 152576 \)
- \( N_{B/W} = 72 \)
- \( M_U = 350 \)

- For \( r = 3.2 \times 10^{-12} \) (GCR)
- \( R = 3.98 \times 10^{-12} \)
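
The BRAM ECC expression can be evaluated term by term. The sketch below is a minimal Python illustration, not the presenter's tool. The function name and the reading of the symbols are assumptions, since the slide does not define them: \( r \) is taken as a per-bit upset rate, \( T_C \) as the correction cycle time in seconds, \( N_W \) as the number of words, \( N_{B/W} \) as the bits per word, and \( M_U \) as the number of bits whose upset is not corrected. The code only separates the linear term from the quadratic term (two upsets in the same word within one cycle); depending on the units intended for \( r \), the total need not match the \( R \) value quoted on the slide.

```python
def bram_ecc_error_rate(r, t_c, n_w, n_bw, m_u):
    """Return (linear, quadratic) terms of
    R ~= M_U*r + 0.5*T_C*N_W*N_BW*(N_BW - 1)*r**2."""
    linear = m_u * r
    quadratic = 0.5 * t_c * n_w * n_bw * (n_bw - 1) * r ** 2
    return linear, quadratic

# Parameter values as listed on the slide; their units are assumed.
linear, quadratic = bram_ecc_error_rate(
    r=3.2e-12, t_c=0.03, n_w=152576, n_bw=72, m_u=350
)
total = linear + quadratic
print(f"linear term:    {linear:.3e}")
print(f"quadratic term: {quadratic:.3e}")
print(f"total:          {total:.3e}")
```

Note that \( N_W \times N_{B/W} = 152576 \times 72 = 10{,}985{,}472 \), which matches the block memory bit count in the Virtex-5 overview table.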

Combination of RHBD and ECC mitigation implies need for updated flight qualification methodologies.
Highlights/Accomplishments

Virtex-5

SEFI testing is an evolutionary process, significantly aided by a symbiotic relationship with the company. Six SEFIs identified, including the power cycle SEFI
Highlights/Accomplishments

Virtex-5

Efforts with NEPP, NRL, and XRTC to locate power cycle SEFI circuitry and eliminate it

Complicated, unfavorable SEE modes often require the full collaboration of the manufacturer to characterize
Plans (FY10/11)

- System fault characterization methodology for XQR5VFX130
  - Accelerator testing of SEFIs is complicated: cross-section dependence on LET, flux, rotation/tilt, and configuration monitor implementation
  - System-level qualification is convoluted:
    - Beam testing won’t express error rate from configuration bit upsets
    - Collaboration with Xilinx on proprietary software to locate SET-sensitive configuration bits
    - Will require fault injection methodology

- Unhardened IP characterization and qualification
- SEE testing of SiliconBlue and Altera FPGAs
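
The fault injection methodology called for in the plans above can be sketched as a simple loop: flip one configuration bit, exercise the design, record whether the output changed, then restore the bit. The Python below is a toy model under stated assumptions. `fault_injection_campaign`, `toy_design`, and the 1000-bit configuration are all hypothetical; a real campaign would inject upsets into device configuration memory through partial reconfiguration and run the actual design.

```python
from typing import Callable, List

def fault_injection_campaign(golden_config: List[int],
                             run_design: Callable[[List[int]], int]) -> List[int]:
    """Flip each configuration bit in turn and return the indices of
    bits whose upset changes the design output."""
    expected = run_design(golden_config)
    config = list(golden_config)      # working copy; golden image is untouched
    sensitive = []
    for bit in range(len(config)):
        config[bit] ^= 1              # inject a single-bit upset
        if run_design(config) != expected:
            sensitive.append(bit)
        config[bit] ^= 1              # restore the bit before the next injection
    return sensitive

# Hypothetical design: output is the parity of the bits the design uses.
N_BITS = 1000
USED_BITS = range(0, N_BITS, 20)      # 50 bits assumed to matter to this toy design

def toy_design(config: List[int]) -> int:
    return sum(config[i] for i in USED_BITS) % 2

golden = [0] * N_BITS
sensitive_bits = fault_injection_campaign(golden, toy_design)
print(len(sensitive_bits), "of", N_BITS, "bits produced an output error")
```

The count of error-producing bits plays the role of a fault injection error count, which can then feed a system error rate estimate.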

Complex SEE response will require flight qualification guidelines to be updated for this device