Shanghai, CHN
19 hours ago
GPU Kernel Dev & Perf Analysis Architect
NVIDIA is developing processor and system architectures that accelerate machine learning, automotive and high performance computing (HPC) applications. We are seeking a strong candidate to do GEMM kernel development and performance analysis for NVIDIA's new architectures. Your work will play a critical role in shaping the future of deep learning hardware and software, ensuring optimal performance for next-generation AI applications. This position offers the opportunity to make a meaningful impact in a fast-moving, technology-focused company.

What you'll be doing:
+ Design, develop, and optimize GEMM (General Matrix Multiply) kernels for NVIDIA's new architectures.
+ Implement and fine-tune kernels to achieve optimal performance on NVIDIA GPUs.
+ Conduct in-depth performance analysis of GPU kernels, including GEMM and other critical operations.
+ Identify bottlenecks, optimize resource utilization, and improve throughput and power efficiency.
+ Create and maintain workloads and micro-benchmark suites to evaluate kernel performance across various hardware and software configurations.
+ Generate performance projections, comparisons, and detailed analysis reports for internal and external stakeholders.
+ Collaborate with architecture, software, and product teams to guide the development of next-generation deep learning hardware and software.

What we need to see:
+ 4+ years of industry experience in GPU programming or performance optimization for DL applications.
+ Hands-on experience in developing and optimizing GEMM (General Matrix Multiply) kernels.
+ Demonstrated experience in analyzing and improving the performance of GPU kernels, with measurable results (e.g., performance improvements, efficiency gains).
+ Expertise in CUDA programming for GPU acceleration.
+ Experience with performance profiling tools (e.g., NVIDIA Nsight).
+ Excellent communication skills, both written and verbal.
+ Strong organizational and time management abilities, with the ability to prioritize tasks effectively.