Boston, MA, US
Data Engineer, Amazon AGI, AGI Data Services
AI is the most transformational technology of our time, capable of tackling some of humanity’s most challenging problems. Amazon is investing in generative AI and the responsible development and deployment of large language models (LLMs) across all of our businesses. Come build the future of human-technology interaction with us.
We are looking for candidates who don't just think outside the box, but make the box they are in bigger. The future is now. Do you want to be a part of it? Then read on!


We’re looking for a Data Engineer on Amazon’s AGI team to build world-class data platforms and deploy scalable data ingestion tools, with a commitment to fostering the safe, responsible, and effective development of AI technologies. The ideal candidate is an expert in petabyte-scale data ingestion, data processing, data modeling, ETL/ELT design, and business intelligence tools, and passionately partners with the business to identify strategic opportunities where improvements in data infrastructure create outsized business impact. They are a self-starter, comfortable with ambiguity, able to think big while paying careful attention to detail, and enjoy working in a fast-paced team. The ideal candidate possesses exceptional technical expertise with large-scale lakehouses, distributed computing at the scale of thousands of hosts across multiple clusters, Spark, BI systems, and AWS services.


Core Responsibilities

· Design, implement, and support a platform providing ad hoc access to large datasets
· Interface with other technology teams to extract, transform, and load data from a wide variety of data sources using Spark or other state-of-the-art systems
· Implement data structures using best practices for lakehouses
· Model data and metadata for ad hoc and pre-built reporting, targeting read-, write-, and summary-optimized storage
· Interface with business customers, gathering requirements and delivering complete reporting solutions
· Build robust and scalable data integration (ETL) pipelines using Kotlin, Python, TypeScript, and Spark
· Build and deliver high-quality datasets to support business analysts' and customers' reporting needs
· Continually automate and simplify self-service data ingestion at scale for customers
· Participate in strategic & tactical planning discussions