Seattle, WA, US
15 days ago
Senior Software Development Engineer, AWS Kubernetes (K8s)
Utility Computing (UC)
AWS Utility Computing (UC) provides product innovations, from foundational services such as Amazon’s Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2) to consistently released new product innovations that continue to set AWS’s services and features apart in the industry. As a member of the UC organization, you’ll support the development and management of Compute, Database, Storage, Internet of Things (IoT), Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services.
Our software developers build the next-generation technologies that change how millions of AWS customers connect and interact with the AWS services ecosystem. We use ideas from every facet of computer science, including distributed computing, large-scale design, service-oriented architecture, networking, big data processing, machine learning, and artificial intelligence. We are looking for highly motivated and passionate engineers to build the next generation of network management for one of the largest networks in the world.

We are seeking an experienced software engineer with a background in building and operating an infrastructure platform responsible for training, fine-tuning, and inference of AI models.

Key job responsibilities
You will be working with customers spanning from industry-leading AI labs to Fortune 500 enterprise companies that are leveraging EKS to train and deploy models at scale. You will work with Amazon's AI teams to operationalize purpose-built models to improve the EKS experience for new and existing users.

Our ideal candidate will have experience with machine learning infrastructure such as GPUs or AWS silicon such as Trainium, as well as supporting networking infrastructure such as NCCL. They should be able to set technical strategy and oversee the development of high-scale, reliable infrastructure. They should possess a working knowledge of Kubernetes and associated machine learning frameworks and tools such as PyTorch, JAX, Kubeflow, and Kueue.

A day in the life
This is a cross-functional role where you will work across a team of Kubernetes experts, Product Managers, and Applied Scientists to build machine learning capabilities for external and internal use.

We believe in a flexible approach to work that empowers you to strike a balance tailored to your needs, fostering long-term happiness and fulfillment. It's not about the time spent in any one location, but rather about cultivating a harmonious equilibrium that enhances all aspects of your life. Our team is dedicated to supporting new members, regardless of experience level or tenure, in an environment that encourages knowledge sharing and mentorship.

This team plays a critical role in AI at Amazon. Every customer that uses AWS for large model training and inference will be utilizing your software, and the performance of that software is of paramount importance. If you are looking to make a significant impact in the AI industry, this is an excellent team to be a part of.

About the team
Our team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge-sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews. We care about your career growth and strive to assign projects that help you develop your engineering expertise so you feel empowered to take on more complex tasks in the future.