Hardware Systems Engineer, AI NPI
Meta
**Summary:**
Hardware Systems Engineers in RTP work closely with Hardware/Software co-design teams, hardware designers, networking teams, system manufacturers, component vendors, capacity engineering, production engineering, production services, and data center operations teams to enable new systems that will be deployed in our production data centers. Ramping to production and solving the data center scaling and deployment challenges requires us to take a systems-based approach to the new product introduction (NPI) phase.
**Required Skills:**
Hardware Systems Engineer, AI NPI Responsibilities:
1. Drive and execute end-to-end system validation strategy (hardware and software), with a focus on various AI/HPC hardware systems in datacenter applications.
2. Lead the bring-up, validation, and deployment of cutting-edge hardware systems in large-scale deployments, with active hands-on participation.
3. Explore new use cases with customer teams and identify related test methodologies/test cases accordingly.
4. Investigate and troubleshoot complex failures potentially related to hardware systems with cross-functional teams, which may involve different stacks such as silicon, firmware, and software.
5. Triage failures and continue root-causing them while driving project development work forward.
6. Identify gaps and opportunities to improve test processes and test methodologies across the NPI space.
7. Guide automation efforts and data analysis for NPI projects through engagement with related cross-functional teams.
8. Communicate project progress and assessments to related internal and external teams.
**Minimum Qualifications:**
9. Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience.
10. 12+ years of experience in hands-on SW, FW or HW engineering to build any of the following products (AI Silicon, GPUs, TPUs, Autonomous cars, AI servers)
11. 7+ years of work experience in one or more domains such as: ASIC development (Silicon design, bringup, characterization, validation), board level debug, firmware validation, system validation.
12. 3+ years of experience with leading Silicon or System troubleshooting and debugging
13. 3+ years of experience in developing test specifications, procedures, and debug guides for test solutions.
14. 5+ years of experience with one or more of the following modules/domains: PCIe, NVLink, Networking, Flash, Memory, CPU, GPU, TPU, DRAM (DDR4/5 or HBM), AI silicon/AI accelerators.
15. 3+ years of experience in Linux environment
16. 3+ years of experience in Python, C/C++, Rust and/or similar languages (data structures, algorithms, and OOP).
**Preferred Qualifications:**
17. Proficiency in High-Performance Computing (HPC) or AI system architecture at rack level and at scale.
18. 10+ years of hands-on experience in software, firmware, and hardware engineering to develop systems/products for datacenter applications such as video processing, AI/ML, and networking.
19. 7+ years of experience in ASIC development/validation, including silicon bring-up, emulation, characterization or system-level testing.
20. 7+ years of experience in GPU/TPU-related system bring-up, testing, and debugging.
21. Proven history of optimizing software algorithms for performance and scalability.
22. Proven history in embedded systems architecture, components, and test development with a focus on automation.
23. Familiarity with lab debugging tools such as oscilloscopes, protocol analyzers, and traffic generators.
24. 7+ years of experience with debugging tools for SoCs (e.g., JTAG, GDB, Trace32) and knowledge of common bus protocols such as I2C, SPI, USB, and PCIe.
25. Proficiency in Linux environment and server system management.
26. Demonstrated history of exploring and authoring new test plans based on new test methodologies.
27. 7+ years of experience integrating lab tools for automated workflows and managing large-scale deployments.
28. 7+ years of experience in using continuous integration and version control tools for system development and testing.
**Public Compensation:**
$163,000/year to $225,000/year + bonus + equity + benefits
**Industry:** Internet
**Equal Opportunity:**
Meta is proud to be an Equal Employment Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, transgender status, sexual stereotypes, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. We also consider qualified applicants with criminal histories, consistent with applicable federal, state and local law. Meta participates in the E-Verify program in certain locations, as required by law. Please note that Meta may leverage artificial intelligence and machine learning technologies in connection with applications for employment.
Meta is committed to providing reasonable accommodations for candidates with disabilities in our recruiting process. If you need any assistance or accommodations due to a disability, please let us know at accommodations-ext@fb.com.