Mexico
6 days ago
Data Engineer

Role Summary

Connect and model complex distributed data sets to build repositories, such as data warehouses and data lakes, using appropriate technologies. Lead teams in managing data-related work across small to large data sets and structured, unstructured, or streaming data, including extraction, transformation, curation, modelling, building data pipelines, identifying the right tools, and writing SQL/Java/Python code. Act as a leader within the Community of Practice/Center of Excellence to create and enhance standards and best practices.

 

Responsibilities

Partner with the Senior Data Solution Architect to create and maintain optimal solutions aligned to published standards, with a focus on automation and orchestration
Lead efforts to ensure the health and hygiene of platforms, including upgrades, migrations, etc.
Assemble large, complex data sets that meet functional/non-functional business requirements
Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Build the infrastructure required for optimal extraction, transformation, and loading of a wide variety of data
Develop models/prototypes to provide observations and identify trends and patterns, working with leadership to assess potential solutions
Develop the statistical models and algorithms needed for highly complex reporting and analytics
Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency, and other key business performance metrics
Work with stakeholders, including Management, Domain leads, and Teams, to assist with data-related technical issues and support their data infrastructure needs
Based on new/enhanced data security policies and procedures, build/enhance the technology footprint related to encryption, obfuscation, and role-based access
Create data tools for analytics and data scientist team members
Extensive knowledge of data and analytics frameworks supporting data lakes, warehouses, marts, reporting, etc.
Define data retention policies, monitor performance, and advise on any necessary infrastructure changes based on functional and non-functional requirements
In-depth knowledge of the data engineering discipline
Extensive experience working with Big Data tools and building data solutions for advanced analytics
Minimum of 7 years' hands-on experience with a strong data background
Solid programming skills in Java, Python, and SQL
Clear hands-on experience with database systems: Hadoop ecosystem, Cloud technologies (e.g. AWS, Azure, Google), in-memory database systems (e.g. HANA, Hazelcast, etc.), and other database systems, including traditional RDBMS (e.g. Teradata, SQL Server, Oracle) and NoSQL databases (e.g. Cosmos, MongoDB, DynamoDB)
Practical knowledge across data extraction and transformation tools: traditional ETL tools (e.g. Informatica, Databricks) as well as more recent big data tools
ML distributed computing, including parallel dataset processing for training
Python coding with GPU parallelism
Expertise in Docker and Kubernetes (k8s)
Familiarity with YAML for deployments
Extensive background in programming, databases, and/or big data technologies OR BS/MS in software engineering, computer science, economics, or other engineering fields