Design, develop, troubleshoot, and debug software programs for databases, applications, tools, networks, etc.
Department Description
Oracle Cloud Infrastructure (OCI) combines the elasticity and utility of public cloud with the granular control, security, and predictability of on-premises infrastructure to deliver high-performance, highly available, and cost-effective infrastructure services. Multiple compute options provide the flexibility to run both the most demanding workloads and less compute-intensive applications in a secure and highly available cloud environment. Customers can self-service provision virtual machines alongside bare metal servers and clusters on the same virtual cloud networks through a unified web console, APIs, CLI, or industry-standard tools such as Terraform and Chef. OCI's approach gives the customer choices for storage, such as industry-leading local NVMe storage or elastic network block storage.
The Oracle Kubernetes Engine (OKE) team builds the OCI service that runs our managed Kubernetes experience. The platform runs mission-critical workloads at large scale for large internal and external customers. We are currently investing across the board in features that build a better experience for running customer data planes and for running AI/ML workloads on top of GPUs.
Position Overview
We’re looking for hands-on engineers with expertise and passion for solving difficult problems in distributed systems, virtualized infrastructure, and highly available services. If this is you, at Oracle you can design and build innovative new systems from the ground up. These are exciting times in our space: we are growing fast, still at an early stage, and working on ambitious new initiatives. An engineer at any level can have a significant technical and business impact.
The ideal candidate for this team is an experienced architect and proficient programmer with a wide breadth of knowledge and experience, including areas such as containers, networking, storage, internet protocols, and operating systems. We write distributed, highly available systems to build, update, and deploy Kubernetes, plus automation and tooling for testing, deployments, and other needs.
Qualifications
BS degree in Computer Science or a related technical field involving coding, or equivalent practical experience.
Ten or more years of experience delivering and operating large-scale, highly available distributed systems.
History of working in large Java or Golang codebases and experience with scripting languages such as Python, Perl, etc.
Strong knowledge of data structures, algorithms, operating systems, and distributed systems.
Systematic problem-solving approach, strong communication skills, a sense of ownership, and drive.
Experience building large-scale, multi-tenant, virtualized infrastructure.
Experience in developing or managing containerized workloads using Kubernetes, along with one of the following two areas, will be considered a strong plus: container networking, or GPU AI/ML workloads and RDMA clusters.
Career Level - IC5