

### Class D Missions and CubeSats: EEE Parts Diatribe – A Starting Point for Discussion

Kenneth A. LaBel, ken.label@nasa.gov, 301-286-9936

Michael J. Sampson, michael.j.sampson@nasa.gov, 301-614-6233

Co-Managers, NEPP Program, NASA/GSFC

http://nepp.nasa.gov

#### Acknowledgment:

This work was sponsored by: NASA Office of Safety & Mission Assurance NASA Office of Chief Engineer

#### Unclassified



### Acronyms

| Acronym | Definition                                    | Acronym | Definition                                           |
|---------|-----------------------------------------------|---------|------------------------------------------------------|
| 3D      | Three Dimensional                             | M       | Mega                                                 |
| ADC     | Analog to Digital Converter                   | MER     | Mars Exploration Rover                               |
| Aero    | Aerospace                                     | MHz     | Megahertz                                            |
| ARC     | Ames Research Center                          | MIDEX   | Medium-Class Explorer                                |
| ASIC    | Application Specific Integrated Circuit       | MIL     | Military                                             |
| CMOS    | Complementary Metal Oxide Semiconductor       | MIPS    | Millions of Instructions per Second                  |
| COTS    | Commercial Off The Shelf                      | MP3     | Moving Picture Experts Group-I or II Audio Layer III |
| CSLI    | CubeSat Launch Initiative                     | MRO     | Mars Reconnaissance Orbiter                          |
| DIP     | Dual Inline Package                           | Msps    | Megasamples per second                               |
| DNL     | Differential Non-Linearity                    | NASA    | National Aeronautics and Space Administration        |
| DSP     | Digital Signal Processor                      | NEPP    | NASA Electronic Parts and Packaging                  |
| EDAC    | Error Detection and Correction                | NID     | NASA Interim Directive                               |
| EEE     | Electrical, Electronic, and Electromechanical | nm      | nanometer                                            |
| ENOB    | Effective Number of Bits                      | NMOS    | N-type Metal Oxide Semiconductor                     |
| EPI     | Epitaxial                                     | NPR     | NASA Procedural Requirements                         |
| ESSP    | Earth System Science Pathfinder               | NPSL    | NASA Parts Selection List                            |
| FCBGA   | Flip Chip Ball Grid Array                     | NRE     | Non-Recurring Engineering                            |
| FPGA    | Field Programmable Gate Array                 | PCB     | Printed Circuit Board                                |
| GAS can | GetAway Special can                           | POF     | Physics of Failure                                   |
| Gb      | Gigabit                                       | RF      | Radio Frequency                                      |
| Gbps    | Gigabits per Second                           | SAA     | South Atlantic Anomaly                               |
| GHz     | Gigahertz                                     | SCD     | Source Control Drawing                               |
| GSFC    | Goddard Space Flight Center                   | SDRAM   | Synchronous Dynamic Random Access Memory             |
| HST     | Hubble Space Telescope                        | SEE     | Single Event Effect                                  |
| IC      | Integrated Circuit                            | SERDES  | Serializer Deserializer                              |
| INL     | Integral Non-Linearity                        | SEU     | Single Event Upset                                   |
| IO      | Input Output                                  | Si      | Silicon                                              |
| ISS     | International Space Station                   | SMA     | Safety and Mission Assurance                         |
| JIMO    | Jupiter Icy Moons Orbiter                     | SMEX    | Small Explorer                                       |
| JPL     | Jet Propulsion Laboratory                     | SOC     | Systems on a Chip                                    |
| JWST    | James Webb Space Telescope                    | SOI     | Silicon on Insulator                                 |
| k       | Kilo                                          | SWaP    | Size, Weight, and Power                              |
| kb      | Kilobit                                       | TID     | Total Ionizing Dose                                  |
| LCC     | Leadless Chip Carrier                         | TMR     | Triple Modular Redundancy                            |
|         |                                               | um      | micron                                               |



### Workshop Agenda

- Introduction, Ground Rules, and Objectives
  - This Presentation Ken LaBel/GSFC
- EEE Parts Categories
  - Overview Shri Agarwal/JPL
  - Automotive Electronics Mike Sampson/GSFC
- NASA/Center/JPL Approaches to EEE Parts for Class D Missions and CubeSats
  - Risk Discussion Jesse Leitner/GSFC
  - JPL Approach TBD/JPL
  - **TBD Other Request ARC Presentation?**
- COTS EEE Parts Usage/Screening/Qualification and Fault-tolerant Architecture Discussion
  - Alternative methods? TBD
  - TBD
- Go-forward Discussion
  - What can we agree upon? Parts plan components? Minimum requirements and parts review? Guideline? COTS database? Reliability tools?



### Agenda

Tuesday, Sep 24, 2013

| Start               | Finish | Topic                                                                                  | Presenter                      | Notes                                                                |
|---------------------|--------|----------------------------------------------------------------------------------------|--------------------------------|----------------------------------------------------------------------|
| 8:30                | 9:00   | Coffee                                                                                 |                                |                                                                      |
| 9:00                | 9:30   | Introductory comments by NASA HQ                                                       | OSMA, OCE                      | We are expecting Mike Ryschkewitsch<br>(OCE) and Tom Whitmeyer (OCE) |
| 9:30                | 10:00  | Class D Missions and CubeSats: EEE Parts Diatribe – A<br>Starting Point for Discussion | Ken LaBel, Mike Sampson - GSFC | Background and charge to the workshop                                |
| 10:00               | 10:30  | The Various Shades of Microcircuits                                                    | Shri Agarwal - JPL             | Background on different EEE part<br>categories                       |
| 10:30               | 11:00  | Break                                                                                  |                                |                                                                      |
| 11:00               | 11:30  | Risk Discussion                                                                        | Jesse Leitner - GSFC           | General Class D Risk Discussion                                      |
| 11:30               | 12:00  | Class D/CubeSat Mission Assurance Approach at JPL                                      | Tim Larson - JPL               | General Class D Risk Discussion                                      |
| 12:00               | 13:00  | Lunch                                                                                  | On your own                    | Bldg 1 Cafeteria or other                                            |
| 13:00               | 13:30  | Current JPL Class D projects and Their Requirements                                    | Rob Menke - JPL                | Not a full presentation                                              |
| 13:30               | 14:00  | Class D Mission (GEMS): EEE Parts Lessons Learned                                      | Muzariatu Jah - GSFC           |                                                                      |
| 14:00               | 14:30  | Automotive Electronics                                                                 | Mike Sampson - GSFC            | More detailed automotive electronics background material             |
| 14:30               | 15:00  | COTS parts – Myth vs. Reality                                                          | Doug Sheldon - JPL             | COTS                                                                 |
| 15:00               | 15:30  | Break                                                                                  |                                |                                                                      |
| 15:30               | 16:00  | ARC Class D Philosophy and Examples                                                    | Josh Forgione, Kuok Ling - ARC |                                                                      |
| 16:00               | 16:30  | Alternative Qualification Approaches                                                   | Discussion                     | Discussion                                                           |
| 16:30               | 17:00  | Go-Forward Plan Development                                                            | Discussion                     | Discussion                                                           |



### Outline

- Class D Missions
  - NPR 8705.4
- CubeSats
- Assurance for Electronics
- Commercial Off The Shelf (COTS) Usage
- Testing at Board/Box Level?
- Summary and Discussion



Hubble Space Telescope courtesy NASA

### NPR 8705.4 Appendix B Classification Considerations for NASA Class A-D Payloads

| Characterization | CLASS A | CLASS B | CLASS C | CLASS D |
|------------------|---------|---------|---------|---------|
| Priority (Criticality to Agency Strategic Plan) and Acceptable Risk Level | High priority, very low (minimized) risk | High priority, low risk | Medium priority, medium risk | Low priority, high risk |
| National significance | Very high | High | Medium | Low to medium |
| Complexity | Very high to high | High to medium | Medium to low | Medium to low |
| Mission Lifetime (Primary Baseline Mission) | | | | |
| Cost | High | High to medium | Medium to low | Low |
| Launch Constraints | Critical | Medium | Few | Few to none |
| In-Flight Maintenance | N/A | Not feasible or difficult | May be feasible | May be feasible and planned |
| Alternative Research Opportunities or Re-flight Opportunities | No alternative or re-flight opportunities | Few or no alternative or re-flight opportunities | Some or few alternative or re-flight opportunities | Significant alternative or re-flight opportunities |
| Achievement of Mission Success Criteria | All practical measures are taken to achieve minimum risk to mission success. The highest assurance standards are used. | Stringent assurance standards with only minor compromises in application to maintain a low risk to mission success. | Medium risk of not achieving mission success may be acceptable. Reduced assurance standards are permitted. | Medium or significant risk of not achieving mission success is permitted. Minimal assurance standards are permitted. |
| Examples | HST, Cassini, JIMO, JWST | MER, MRO, Discovery payloads, ISS Facility Class Payloads, Attached ISS payloads | ESSP, Explorer Payloads, MIDEX, ISS complex subrack payloads | SPARTAN, GAS Can, technology demonstrators, simple ISS, express middeck and subrack payloads, SMEX |



#### NPR 8705.4 Appendix C Recommended SMA-Related Program Requirements for NASA Class A-D Payloads

|                                         | CLASS A                                                                                                                                                            | CLASS B                                                                                                                          | CLASS C                                                                                                                      | CLASS D                                                                                                   |
|-----------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------|
| EEE Parts<br>*http://nepp.nasa.gov/npsl | NASA Parts Selection List<br>(NPSL)* Level 1, Level 1<br>equivalent Source Control<br>Drawings (SCDs), and/or<br>requirements per Center<br>Parts Management Plan. | Class A requirements or<br>NPSL Level 2, Level 2<br>equivalent SCDs, and/or<br>requirements per Center<br>Parts Management Plan. | Class A, Class B or NPSL<br>Level 3, Level 3 equivalent<br>SCDs, and/or requirements<br>per Center Parts<br>Management Plan. | Class A, Class B, or Class C<br>requirements, and/or<br>requirements per Center<br>Parts Management Plan. |

Note that this is strictly based on mission priority, significance, and so forth, but has no delineation based on electronic system criticality or environment exposure (limited on lifetime).

### NASA's CubeSat Launch Initiative

- NASA's CubeSat Launch Initiative (CSLI) provides opportunities for small satellite payloads to fly on rockets planned for upcoming launches. These CubeSats are flown as auxiliary payloads on previously planned missions.
- CubeSats are a class of research spacecraft called nanosatellites. The cube-shaped satellites are approximately four inches long, have a volume of about one quart and weigh about 3 pounds. To participate in the CSLI program, CubeSat investigations should be consistent with NASA's Strategic Plan and the Education Strategic Coordination Framework. The research should address aspects of science, exploration, technology development, education or operations.

http://www.nasa.gov/directorates/heo/home/CubeSats\_initiative.html

No general guidance related to EEE parts and NASA CubeSats available



### **Assurance for Electronic Devices**

- Assurance is
  - Knowledge of
    - The supply chain and manufacturer of the product,
    - The manufacturing process and its controls, and,
    - The physics of failure (POF) related to the technology.
  - Statistical process and inspection via
    - Testing, inspection, physical analyses and modeling.
  - Understanding the application and environmental conditions for device usage.
    - This includes:
      - Radiation,
      - Lifetime,
      - Temperature,
      - Vacuum, etc., as well as,
      - Device application and appropriate derating criteria.



### **Reliability and Availability**

- Reliability (Wikipedia)
  - The ability of a system or component to perform its required functions under stated conditions for a specified period of time.
- Availability (Wikipedia)
  - The degree to which a system, subsystem, or equipment is in a specified operable and committable state at the start of a mission, when the mission is called for at an unknown, *i.e.*, a random, time. Simply put, availability is the proportion of time a system is in a functioning condition. This is often described as a *mission capable rate*.
- The question is:
  - Does it HAVE to work? Or
  - Do you just WANT it to work?
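
As a minimal sketch of the "proportion of time" definition quoted above, availability can be computed as uptime divided by total observed time. The function name and the sample hours are illustrative assumptions, not values from the workshop material.

```python
def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Fraction of total observed time the system is in a functioning condition."""
    total = uptime_hours + downtime_hours
    if total <= 0:
        raise ValueError("total observed time must be positive")
    return uptime_hours / total


# Hypothetical example: 8,700 hours operating, 60 hours in safe mode or recovery.
a = availability(8700.0, 60.0)
print(f"availability = {a:.4f}")
```

Whether a given figure is acceptable depends on which side of the "have to work" versus "want to work" question the system falls.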

## What does this mean for EEE parts?

- The more understanding you have of a device's failure modes and causes, the higher the confidence level that it will perform under mission environments and lifetime
  - High confidence = "have to work"
    - The key is operating without a problem when you need it to (appropriate availability over the mission lifetime)
  - Less confidence = "want to work"
    - This is not saying that it won't work, just that our confidence to be available isn't as high (or even unknown)

- Standard way of doing business
  - Qualification processes are statistical beasts designed to understand/remove known reliability risks and uncover unknown risks inherent in a part.
    - Requires a significant sample size and a comprehensive suite of piece-part testing (insight): a high-confidence method
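
One way to see why qualification needs a significant sample size is the zero-failure (success-run) relation $n = \ln(1 - C) / \ln(R)$, which gives the number of parts that must pass a test with no failures to demonstrate reliability $R$ at confidence $C$. This is a standard binomial result offered as an illustration, not a method taken from this presentation; the function name and the example targets are assumptions.

```python
import math


def zero_failure_sample_size(reliability: float, confidence: float) -> int:
    """Smallest n such that n passes with zero failures demonstrates
    `reliability` at `confidence` (binomial success-run relation)."""
    if not (0.0 < reliability < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("reliability and confidence must be in (0, 1)")
    return math.ceil(math.log(1.0 - confidence) / math.log(reliability))


# Hypothetical targets: demonstrate 90% and 99% reliability at 90% confidence.
for r in (0.90, 0.99):
    n = zero_failure_sample_size(r, confidence=0.90)
    print(f"R = {r:.2f} at 90% confidence: {n} parts, zero failures")
```

The required sample grows roughly tenfold for each additional "nine" of reliability, which is why small-lot COTS buys rarely support high-confidence claims from test alone.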


## **Screening <> Qualification**

- Electronic component screening uses environmental stressing and electrical testing to identify marginal and defective components within a "lot" of devices.
  - This is opposed to *qualification* which is usually a suite of harsher tests (and often destructive) intended to fully determine reliability characteristics of the device over a standard environment/application range
- *Diatribe*: what is a "lot"?
  - For the Mil/Aero system, it is devices that come from the same wafer diffusion (i.e., silicon lot from the same wafer)
  - For all others, it is usually the same "packaging" date
    - I.e., silicon may or may not be the same, but the devices were packaged at the same time. This raises a concern often known as "die traceability".
  - Device failure modes often have variance from silicon lot to silicon lot.



### The Trade Space Involved With Part Selection

- Evolution of IC space procurement philosophy
  - OLD: Buy Mil/Aero Radiation Hardened Devices Only
  - NEW: Develop Fault/Radiation Tolerant Systems
- This is now systems design that involves a risk management approach that is often quite complex.
- For the purposes of this discussion, we shall define ICs into two basic categories
  - Space-qualified which may or may not be radiation hardened, and,
  - Commercial (includes automotive)
- Understanding Risk and the Trade Space involved with these devices is the new key to mission success
  - Think size, weight, and power (SWaP), for instance



Performance Inside an Apple iPhone<sup>TM</sup> Courtesy EE Times Magazine

### The Challenge for Selecting ICs for Space

- Considerations since the "old days"
  - High reliability (and radiation tolerant) devices
    - Now a very small market percentage
  - Commercial "upscreening\*"
    - Increasing in importance
    - Assesses reliability, does not enhance
  - System level performance and risk
    - Hardened or fault tolerant "systems" not devices

\*upscreening – performing tests/analysis on electronic parts for environments outside the intended/guaranteed range of a device



System Designer Trying to meet high-resolution instrument requirements AND long-life

> ADC: analog-to-digital converter SDRAM: synchronous dynamic random access memory SerDes: serializer-deserializer ASIC: application-specific integrated circuit DSP: digital signal processor



### **IC Selection Requirements**

- To begin the discussion, we shall review IC selection from three distinct and often contrary perspectives
  - Performance,
  - Programmatic, and,
  - Reliability.



Each of these will be considered in turn; however, one must ponder all aspects as part of the *process*.

Graphic courtesy http://www.shareworld.co/



### **Performance Requirements**

- Rationale
  - Trying to meet science, surveillance, or other performance requirements
- Personnel involved
  - Electrical designer, systems engineer, other engineers
- Usual method of requirements
  - Flowdown from science or similar requirements to implementation
    - E.g., ADC resolution or speed, data storage size, etc.
- Buzzwords
  - MIPS/watt, Gbytes/cm<sup>3</sup>, resolution, MHz/GHz, reprogrammable
- Limiting technical factors beyond electrical
  - Size, weight, and power (SWaP)



Race Car courtesy http://wot.motortrend.com/

MIPS: millions of instructions per second



### Programmatic Requirements and Considerations

- Rationale
  - Trying to keep a program on schedule and within budget
- Personnel involved
  - Project manager, resource analyst, system scheduler
- Usual method of requirements
  - Flowdown from parent organization or mission goals for budget/schedule
    - E.g., launch date
- Buzzwords
  - Cost cap, schedule, critical path, risk matrix, contingency
- Limiting factors
  - Parent organization makes final decision



Burroughs Accounting Machine courtesy http://www.piercefuller.com/collect/before.html

> Programmatics: a numbers game



### **Risk Requirements**

- Rationale
  - Trying to ensure mission parameters such as reliability, availability, operate-through, and lifetime are met
- Personnel involved
  - Radiation engineer, reliability engineer, parts engineer
- Usual method of requirements
  - Flowdown from mission requirements for parameter space
    - E.g., SEU rate for system derived from system availability specification
- Buzzwords
  - Lifetime, total dose, single events, device screening, "waivers"
- Limiting factors
  - Management normally makes "acceptable" risk decision
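
The flowdown example above (an SEU rate derived from an availability specification) can be sketched with the steady-state relation A = MTBF / (MTBF + MTTR). This is a simplified illustration: it assumes every counted upset causes an outage of fixed recovery time and that upsets are the only source of downtime. The function name and the numbers are hypothetical.

```python
def max_upset_rate_per_day(availability_req: float, recovery_minutes: float) -> float:
    """Highest outage-causing upset rate (events/day) that still meets
    the availability requirement, assuming A = MTBF / (MTBF + MTTR)."""
    if not 0.0 < availability_req < 1.0:
        raise ValueError("availability requirement must be in (0, 1)")
    if recovery_minutes <= 0:
        raise ValueError("recovery time must be positive")
    mttr_days = recovery_minutes / (24.0 * 60.0)
    # A = MTBF / (MTBF + MTTR)  =>  MTBF = A * MTTR / (1 - A)
    mtbf_days = availability_req * mttr_days / (1.0 - availability_req)
    return 1.0 / mtbf_days


# Hypothetical requirement: 99% availability with a 30-minute recovery per upset.
rate = max_upset_rate_per_day(0.99, 30.0)
print(f"allowable rate: {rate:.3f} upsets/day")
```

A predicted device or system upset rate above this bound would point toward mitigation, a faster recovery path, or a relaxed requirement.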



SOHO/SWAN Ultraviolet Image courtesy http://sohowww.nascom.nasa.gov/gallery/Particle/swa008.html



### **Understanding Risk**

- The risk management requirements may be broken into three considerations
  - Technical/Design "The Good"
    - Relate to the circuit designs not being able to meet mission criteria such as jitter related to a long dwell time of a telescope on an object
  - Programmatic "The Bad"
    - Relate to a mission missing a launch window or exceeding a budgetary cost cap which can lead to mission cancellation
  - Radiation/Reliability "The Ugly"
    - Relate to mission meeting its lifetime and performance goals without premature failures or unexpected anomalies
- Each mission must determine its priorities among the three risk types





### The Risk Trade Space – Considerations for Device Selection (Incomplete)

- Cost and Schedule
  - Procurement
  - NRE
  - Maintenance
  - Qualification and test
- Performance
  - Bandwidth/density
  - SWaP
  - System function and criticality
  - Other mission constraints (e.g., reconfigurability)
- System Complexity
  - Secondary ICs (and all their associated challenges)
  - Software, etc...

- Design Environment and Tools
  - Existing infrastructure and heritage
  - Simulation tools
- System operating factors
  - Operate-through for single events
  - Survival-through for portions of the natural environment
  - Data operation (example, 95% data coverage)
- Radiation and Reliability
  - SEE rates
  - Lifetime (TID, thermal, reliability,...)
  - "Upscreening"
- System Validation and Verification

NRE: non-recurring engineering; IC: integrated circuit; SEE: single-event effect; TID: total ionizing dose

### **Systems Engineering and Risk**

- The determination of acceptability for device usage is a complex trade space
  - Every engineer will "solve" a problem differently
    - Ex., approaches such as synchronous digital circuit design may be the same, but the implementations are not
- A more omnidirectional approach is taken weighing the various risks
  - Each of the three factors may be assigned weighted priorities
    - The systems engineer is often the "person in the middle" evaluating the technical/reliability risks and working with management to determine acceptable risk levels
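
The idea of assigning weighted priorities to the three factors can be illustrated with a simple weighted score. Everything below is hypothetical: the weights, the 1-to-5 risk scores, and the candidate names are placeholders for values a project would set itself, and a real trade is rarely reducible to a single number.

```python
# Hypothetical mission weights for the three risk types (sum to 1.0).
WEIGHTS = {"technical": 0.5, "programmatic": 0.3, "reliability": 0.2}

# Hypothetical risk scores per candidate, 1 (low risk) to 5 (high risk).
CANDIDATES = {
    "rad-hard ADC": {"technical": 4, "programmatic": 4, "reliability": 1},
    "upscreened COTS ADC": {"technical": 1, "programmatic": 2, "reliability": 4},
}


def weighted_risk(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted sum of risk scores; lower means less overall risk."""
    return sum(weights[factor] * scores[factor] for factor in weights)


# Rank candidates from lowest to highest weighted risk.
ranked = sorted(CANDIDATES, key=lambda name: weighted_risk(CANDIDATES[name], WEIGHTS))
for name in ranked:
    print(f"{name}: {weighted_risk(CANDIDATES[name], WEIGHTS):.2f}")
```

Shifting weight toward reliability, as a longer or harsher mission might, can reverse the ranking, so the weights themselves are a decision to be made with management.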



### **Traditional Risk Matrix**





### An Example "Ad hoc" Battle

- Mission requirement: High resolution image
  - Flowdown requirement: 14-bit 100 Msps ADC
    - Usually more detailed requirements are used as well, such as Effective Number of Bits (ENOB), Integral Non-Linearity (INL), or Differential Non-Linearity (DNL)
  - Designer
    - Searches for available radiation hardened ADCs that meet the requirement
    - Searches for commercial alternatives that could be upscreened
    - Looks at fault tolerant architecture options
  - Manager
    - Trades the cost of buying a Mil/Aero part, which requires less aftermarket testing, against the cost of a purely commercial IC
    - Worries over delivery and test schedule of the candidate devices
  - Radiation/Parts Engineer
    - Evaluates existing device data (if any) to determine reliability performance and additional test cost and schedule
- The best device? Depends on mission priorities







### **NASA and COTS**

- NASA has been a user of COTS electronics for decades, *typically* when
  - Mil/Aero alternatives are not available (performance or function or procurement schedule),
  - A system can assume possible unknown risks, and,
  - A mission has a relatively short lifetime or benign space environment exposure.
- In most cases, some form of "upscreening" has occurred.
  - A means of measuring a portion of the inherent reliability of a device.
  - Discovering that a COTS device fails during upscreening has occurred in almost every flight program.

### Why COTS? The Growth in Integrated Circuit Availability

- The semiconductor industry has seen an explosion in the types and complexity of devices that are available over the last several decades
  - The commercial market drives features
    - High density (memories)
    - High performance (processors)
    - Upgrade capability and time-to-market
      - Field Programmable Gate Arrays (FPGAs)
    - Wireless (Radio Frequency (RF) and mixed signal)
    - Long battery life (low-power Complementary Metal Oxide Semiconductor (CMOS))

Integrated Cycling Bib and MP3



Zilog Z80 Processor, circa 1978: 8-bit processor

Intel 65nm Dual Core Pentium D Processor, circa 2007: dual 64-bit processors

Processor pictures courtesy NASA, Code 561

FPGA: field programmable gate array; RF: radio frequency; CMOS: complementary metal oxide semiconductor





## The Changes in Device Technology

- Besides increased availability, many changes have taken place in
  - Base technology,
  - Device features, and,
  - Packaging

DIP: dual in-line package; LCC: leaded chip carrier; FCBGA: flip chip ball grid array; SOI: silicon on insulator

- The table below highlights a few selected changes

| Feature                                  | circa 1990           | circa 2007                                                       |
|------------------------------------------|----------------------|------------------------------------------------------------------|
| Base technology                          | bulk CMOS/NMOS       | CMOS with strained Si or SOI                                     |
| Feature size                             | > 2.0 um             | 65 nm                                                            |
| Memory size - volatile (device)          | 256 kb               | 1 Gb                                                             |
| Processor speed                          | 64 MHz               | > 3 GHz                                                          |
| FPGA Gates                               | 2k                   | > 1 M                                                            |
| Package                                  | DIP or LCC - 40 pins | FCBGA - 1500 balls                                               |
| Advanced system on a chip (SOC) features | Cache memory         | > Gbps serial link, SerDes, embedded processors, embedded memory |

Now commercial technology is pushing towards 14 nm, 3D transistors, and substrates, etc.

## Suggested EEE Parts Usage Factors

#### **Environment/Lifetime**

| Criticality | Low | Medium | High |
|-------------|-----|--------|------|
| Low | COTS upscreening/testing optional; do no harm (to others) | COTS upscreening/testing recommended; fault-tolerance suggested; do no harm (to others) | Rad hard suggested. COTS upscreening/testing recommended; fault tolerance recommended |
| Medium | COTS upscreening/testing recommended; fault-tolerance suggested | COTS upscreening/testing recommended; fault-tolerance recommended | Level 1 or 2, rad hard suggested. Full upscreening for COTS. Fault tolerant designs for COTS. |
| High | Level 1 or 2 suggested. COTS upscreening/testing recommended. Fault tolerant designs for COTS. | Level 1 or 2, rad hard suggested. Full upscreening for COTS. Fault tolerant designs for COTS. | Level 1 or 2, rad hard recommended. Full upscreening for COTS. Fault tolerant designs for COTS. |

### Comments on the "Matrix" Wording

- "Optional" implies that you might get away without this, but there's risk involved
- "Suggested" implies that it is a good idea to do this
- "Recommended" implies that this really should be done
- Where just the item is listed (like "full upscreening for COTS"), this should be done to meet the criticality and environment/lifetime concerns

#### Good mission planning identifies where on the matrix the mission lies
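
For planning scripts, the matrix can be captured as a plain lookup table. The sketch below is only an illustration: the function name `parts_guidance` is hypothetical, the cell text is transcribed from the matrix above, and it assumes the rows are criticality and the columns are environment/lifetime severity.

```python
# Guidance matrix transcribed from the table above.
# Keys are (criticality, environment_lifetime); each is "low", "medium", or "high".
# Assumption: the table columns represent environment/lifetime severity.
GUIDANCE = {
    ("low", "low"): "COTS upscreening/testing optional; do no harm (to others)",
    ("low", "medium"): "COTS upscreening/testing recommended; fault-tolerance "
                       "suggested; do no harm (to others)",
    ("low", "high"): "Rad hard suggested. COTS upscreening/testing recommended; "
                     "fault tolerance recommended",
    ("medium", "low"): "COTS upscreening/testing recommended; fault-tolerance suggested",
    ("medium", "medium"): "COTS upscreening/testing recommended; "
                          "fault-tolerance recommended",
    ("medium", "high"): "Level 1 or 2, rad hard suggested. Full upscreening for COTS. "
                        "Fault tolerant designs for COTS.",
    ("high", "low"): "Level 1 or 2 suggested. COTS upscreening/testing recommended. "
                     "Fault tolerant designs for COTS.",
    ("high", "medium"): "Level 1 or 2, rad hard suggested. Full upscreening for COTS. "
                        "Fault tolerant designs for COTS.",
    ("high", "high"): "Level 1 or 2, rad hard recommended. Full upscreening for COTS. "
                      "Fault tolerant designs for COTS.",
}


def parts_guidance(criticality, environment):
    """Return the matrix cell for a (criticality, environment/lifetime) pair."""
    key = (criticality.strip().lower(), environment.strip().lower())
    if key not in GUIDANCE:
        raise ValueError(f"levels must be low, medium, or high; got {key}")
    return GUIDANCE[key]


if __name__ == "__main__":
    print(parts_guidance("medium", "high"))
```

Keeping the cell text verbatim means the "optional", "suggested", and "recommended" wording retains the meanings given in the comments above.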



### "How to Save on EEE Parts for a Payload on a Budget"

- First and foremost: SCROUNGE
  - Are there spare devices available at either your Center or elsewhere at the Agency?
    - NASA has already bought devices ranging from passives to FPGAs
  - Some may be fully screened and even be radiation hardened/tested
    - You may still have to perform some additional tests, but it's cheaper than doing them all!
- Engage parts/radiation engineers early to help find and evaluate designers' "choices"
  - Use their added value to help with the choices and even on fault tolerance approaches – you'll need them to "sign off" eventually
- If you can't find spares, try to use parts with a "history"
  - At a minimum, the hope is that your lot will perform similarly to the "history" lot – not guaranteed
  - Even riskier, choose devices built on the same design rules by the same company (i.e., different part, but on the same process/design as a part with "history")
- If you absolutely need something new, you will pay for the qualification or take the risk

### **Brief Diatribe:** Add Fault Tolerance or Radiation Hardening?

- Means of making a system more "reliable/available" can occur at many levels
  - Operational
    - Ex., no operation in the South Atlantic Anomaly (proton hazard)
  - System
    - Ex., redundant boxes/busses
  - Circuit/software
    - Ex., error detection and correction (EDAC) scrubbing of memory devices by an external device or processor
  - Device (part)
    - Ex., triple-modular redundancy (TMR) of internal logic within the device
  - Transistor
    - Ex., use of annular transistors for TID improvement
  - Material
    - Ex., addition of an epi substrate to reduce SEE charge collection (or other substrate engineering)

#### Good engineers can invent infinite solutions, but the solution used must be adequately validated
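
As a concrete instance of the device-level and circuit/software-level examples above, the sketch below shows a bitwise majority voter over three redundant copies of a word, plus a scrub step that rewrites all copies with the voted value. It is a minimal software illustration, not a flight implementation: the function names are hypothetical, and simple voting stands in here for a real EDAC code.

```python
def tmr_vote(a, b, c):
    """Bitwise majority of three redundant copies of an integer word."""
    return (a & b) | (a & c) | (b & c)


def scrub(copies):
    """Vote across three stored copies and rewrite all of them in place."""
    if len(copies) != 3:
        raise ValueError("expected exactly three redundant copies")
    voted = tmr_vote(copies[0], copies[1], copies[2])
    copies[:] = [voted, voted, voted]
    return voted


if __name__ == "__main__":
    word = 0b10110010
    # Simulate a single-bit upset in the second copy only.
    copies = [word, word ^ 0b00001000, word]
    assert scrub(copies) == word
    assert copies == [word, word, word]
```

The voter only masks an upset when at most one copy is wrong in a given bit position. If the same bit flips in two copies before a scrub pass, the wrong value wins the vote, which is one reason the chosen scheme still has to be validated against the expected upset rate.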

### **Discussion:** Is knowledge of EEE Parts Failure Modes Required To Build a Fault Tolerant System?

- This is NOT to say that the system won't work without the knowledge, but is our confidence in the system to work adequate?
  - What are the "unknown unknowns"? Can we account for them?
  - How do you calculate risk with unscreened/untested EEE parts?
  - Do you have common mode failure potential in your design? (i.e., an identical redundant string rather than having independent redundant strings)
  - How do you adequately validate a fault tolerant system for space?
- If we go back to the "Matrix", how critical is your function and how harsh is your environment/lifetime? This will likely drive your implementation "answers".
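
The common mode question above can be made quantitative with a simple model. The sketch below uses the beta-factor approach from reliability engineering, which is not part of this talk and is brought in only as an illustration: a fraction `beta` of each string's failure probability is assumed to come from a shared cause that takes out both strings at once. The function name and the example numbers are hypothetical.

```python
def redundant_pair_failure(p_string, beta=0.0):
    """Probability that both strings of a two-string redundant system fail.

    p_string: failure probability of a single string over the mission.
    beta: assumed fraction of that probability due to a common cause
          (0 = fully independent strings, 1 = strings always fail together).
    """
    if not (0.0 <= p_string <= 1.0 and 0.0 <= beta <= 1.0):
        raise ValueError("p_string and beta must both lie in [0, 1]")
    p_common = beta * p_string                        # one event fails both strings
    p_both_independent = ((1.0 - beta) * p_string) ** 2
    # System fails if the common-cause event occurs OR both strings fail independently.
    return 1.0 - (1.0 - p_common) * (1.0 - p_both_independent)


if __name__ == "__main__":
    for beta in (0.0, 0.1, 1.0):
        print(beta, redundant_pair_failure(0.05, beta))
```

Even a small common-cause fraction tends to dominate the result, which is the quantitative argument for independent rather than identical redundant strings. With unscreened parts, neither `p_string` nor `beta` is known, so the output is only as good as the assumed inputs.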



#### **Example:** Is Radiation Testing Always Required for COTS?

#### Exceptions for testing may include

- Operational
  - Ex., The device is only powered on once per orbit and the sensitive time window for a single event effect is minimal
- Acceptable data loss
  - Ex., System level error rate may be set such that data is gathered 95% of the time. This is data availability. Given physical device volume and assuming every ion causes an upset, this worst-case rate may be tractable.
- Negligible effect
  - Ex., A 2-week mission on a shuttle may have a very low Total Ionizing Dose (TID) requirement. TID testing could be waived.

Memory picture courtesy NASA, Code 561



A flash memory may be acceptable without testing if a low TID requirement exists or if it is not powered on for the large majority of the time.
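
The "acceptable data loss" and "operational" exceptions above amount to a back-of-the-envelope bound. The sketch below assumes every particle crossing the device causes an upset, scales by the fraction of time the device is powered, and converts the result into data availability. It uses exposed area times particle flux as a stand-in for the "physical device volume" argument. All function names and numbers are hypothetical placeholders, not mission data.

```python
SECONDS_PER_DAY = 86400.0


def worst_case_upsets_per_day(flux_per_cm2_day, area_cm2, powered_fraction=1.0):
    """Upper bound on upsets/day, assuming every incident particle upsets the device."""
    return flux_per_cm2_day * area_cm2 * powered_fraction


def data_availability(upsets_per_day, outage_s_per_upset):
    """Fraction of the day with valid data if each upset costs a fixed outage."""
    lost_s = upsets_per_day * outage_s_per_upset
    return max(0.0, 1.0 - lost_s / SECONDS_PER_DAY)


if __name__ == "__main__":
    # Illustrative inputs only; real values come from the mission environment definition.
    rate = worst_case_upsets_per_day(flux_per_cm2_day=100.0, area_cm2=1.0)
    availability = data_availability(rate, outage_s_per_upset=30.0)
    print(f"worst-case upsets/day: {rate:.1f}, availability: {availability:.3f}")
    print("meets a 95% availability goal:", availability >= 0.95)
```

If even this pessimistic bound meets the availability requirement, a case can be made for skipping the upset-rate test. The bound says nothing about destructive single event effects, which need separate consideration.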



**Evaluation Method of Commercial Off-the-Shelf (COTS) Electronic Printed Circuit Boards (PCBs) or Assemblies**



We can test devices, but how do we test systems? Or better yet, systems of systems on a chip (SOC)?





### Sample Challenges for the Use and Testing of COTS PCBs

- Limited parts list information
  - Bill-of-materials often does NOT include lot date codes or device manufacturer information
  - Lack of die information or, in some cases, lack of information on "datasheets"
  - Full PCB datasheet may not have sufficient information on individual device usage
  - The possibility of IC variances for "copies" of the "same" PCBs:
    - Equivalent form, fit, and function does not mean the same device from the same manufacturer
    - Lot-to-lot, device-to-device variance
- The limited testability of boards due to complex circuitry, limited I/O, and packaging issues ("*visibility*" issues), as well as the difficulty of achieving full-range thermal/voltage acceleration. This includes "fault coverage".
- The issue of piecepart versus board level tests
  - Board performance being monitored, not device
  - Error/fault propagation often time and application dependent
- The inability to simulate the space radiation environment with a single particle test
  - Potential masking of faults during radiation exposure (too high a particle rate or too many devices being exposed simultaneously)
- Statistics are often limited due to sample size
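
To show how much a small sample limits the statistics, the sketch below computes two standard zero-failure bounds: the upper confidence limit on per-unit failure probability when `n_tested` units all pass (binomial), and the upper limit on event cross-section when no events are seen at a given beam fluence (Poisson). The function names and example values are hypothetical, and the formulas apply only when zero failures or events were observed.

```python
import math


def zero_failure_upper_bound(n_tested, confidence=0.95):
    """Upper bound on per-unit failure probability after n units pass with 0 failures."""
    if n_tested < 1 or not (0.0 < confidence < 1.0):
        raise ValueError("need n_tested >= 1 and 0 < confidence < 1")
    # Solve (1 - p)^n = 1 - confidence for p.
    return 1.0 - (1.0 - confidence) ** (1.0 / n_tested)


def zero_event_cross_section_bound(fluence_per_cm2, confidence=0.95):
    """Upper bound on cross-section (cm^2) when 0 events are seen at this fluence."""
    if fluence_per_cm2 <= 0.0 or not (0.0 < confidence < 1.0):
        raise ValueError("need fluence > 0 and 0 < confidence < 1")
    # Poisson: P(0 events) = exp(-sigma * fluence) = 1 - confidence.
    return -math.log(1.0 - confidence) / fluence_per_cm2


if __name__ == "__main__":
    for n in (3, 10, 30):
        print(n, round(zero_failure_upper_bound(n), 3))
    print(zero_event_cross_section_bound(1e7))
```

With three boards tested and no failures, the 95% bound still allows a per-board failure probability of roughly 63%, so a clean result on a handful of COTS boards is weak evidence by itself.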



### **Workshop Discussion Points**

- Share Experiences and Plans
  - What rules have we put in place, what missions are we working on or planning, how have we (NASA) dealt with Class D and CubeSats?
  - How good is our information sharing? Too competitive between Centers/JPL?
    - Is the new EEE Parts Database a good "central" place to share what we are using and how to find spares?
  - Can we coordinate COTS testing better (working group)? Or utilize a central program like NEPP to focus on testing new COTS (budget limited)?
- Guidelines/Policy
  - Can we agree across the Agency on EEE Parts and Class D/CubeSats?
  - Do we need a NASA Interim Directive (NID) for adding environment and system criticality into EEE Parts Requirements?
  - Do we need a CubeSat EEE Parts Usage Guideline Document?
- Not yet discussed: counterfeits, Trojans, malware,...



### **Summary**

- In this talk, we have presented considerations for selection of ICs focusing on COTS for space systems
  - Technical, programmatic, and risk-oriented
    - As noted, every mission may view the relative priorities between the considerations differently
- As seen below, every decision type may have a process.
  - It's all in developing an appropriate one for your application and avoiding "buyer's remorse"!



#### Five stages of Consumer Behavior

http://www-rohan.sdsu.edu/~renglish/370/notes/chapt05/