We are now looking for a Deep Learning Architect!
Are you passionate about exploring computer architectures for AI? Do you like building industry-leading products at the intersection of hardware and software? We are seeking world-class programmers and performance architects who love to squeeze every cycle of performance out of deep learning code. In this role, you will craft and maintain a library that ships our best-performing GPU kernels to NVIDIA's industry-leading AI products. This position on our team offers the opportunity to have real impact in a fast-moving, technology-focused company.
What you'll be doing:
Design and develop the architecture, interface and features of the GPU kernel library
Keep improving the quality and performance of the library and its GPU kernels
Explore and expand the boundary of innovative technologies like GPU code generation and fusion
Contribute to NVIDIA's AI business by collaborating closely with DL product teams as well as kernel development teams
What we need to see:
MS, PhD or equivalent in relevant fields (CS, EE, Math)
2+ years of relevant work or research experience
Strong programming skills in C, C++, and Python
Excellent problem-solving skills and learning capability
Experience designing software architecture and interfaces, and building testing infrastructure
Good communication skills and a great teammate
Ways to stand out from the crowd:
Familiar with CUDA programming and GPU architecture
Familiar with TensorRT/cuDNN/cuBLAS etc.
Background with DL fundamentals, frameworks, graph compilers, LLVM, MLIR etc.
Hands-on development experience on Linux and Windows platforms, with C++ build tools like CMake, and with DevOps tools including Docker, Jenkins, Kubernetes etc.
Track record of mentoring junior engineers and leading a project and a team