Shanghai, China
61 days ago
Deep Learning Performance Architect

We are now looking for a Deep Learning Performance Architect!

Are you passionate about exploring computer architectures for AI? Do you like building industry-leading products at the intersection of hardware and software? We are seeking world-class programmers and performance architects who love to squeeze every cycle of performance out of deep learning code. In this role, you will craft and maintain a library that ships our best-performing GPU kernels to NVIDIA's industry-leading AI products. This position offers the opportunity to have real impact in a fast-moving, technology-focused company.

What you'll be doing:

Design and develop the architecture, interface and features of the GPU kernel library

Keep improving the quality and performance of the library and its GPU kernels

Explore and push the boundaries of innovative technologies such as GPU code generation and fusion

Contribute to NVIDIA's AI business by collaborating closely with DL product teams as well as kernel development teams

What we need to see:

MS, PhD or equivalent in relevant fields (CS, EE, Math)

2+ years of relevant work or research experience

Strong programming skills in C, C++, and Python

Excellent problem-solving skills and the ability to learn quickly

Experience designing software architecture and interfaces and building test infrastructure

Good communication skills and a strong team player

Ways to stand out from the crowd:

Familiarity with CUDA programming and GPU architecture

Familiarity with TensorRT, cuDNN, cuBLAS, etc.

Background in DL fundamentals, frameworks, graph compilers, LLVM, MLIR, etc.

Hands-on development experience on Linux and Windows platforms, with C++ build tools such as CMake, and with DevOps tools such as Docker, Jenkins, and Kubernetes

Track record of mentoring junior engineers and leading projects and teams
