Job Description
You will report to the Engineering Manager and work with the Product Manager, other engineering team members, and a variety of talented teammates. You will be part of a team that is a driving force in building data and analytics solutions for Technology.
Responsibilities:
1. Design and build reusable data assets for the China data lake.
2. Anticipate, identify and solve issues concerning data management to improve data quality.
3. Clean, prepare and optimize data at scale for ingestion and consumption.
4. Implement complex automated workflows and routines using workflow scheduling tools.
5. Drive collaborative reviews of designs, code, test plans, and dataset implementations produced by other data engineers to maintain data engineering standards.
6. Troubleshoot complex data issues and perform root cause analysis to proactively resolve product and operational issues.
7. Mentor and develop other data engineers in adopting best practices.
Requirements:
1. More than 4 years of experience developing scalable data lakes / data warehouses on top of a big data platform.
2. Deep knowledge of and experience with Spark SQL / Hive SQL; good knowledge of Presto or other MPP databases.
3. Solid experience with Airflow or other data warehouse scheduling tools.
4. Solid experience with data warehouse modeling.
5. Good knowledge of AWS S3, EMR, Lambda, and other AWS components, or a similar tech stack on another cloud.
6. Experience with real-time or streaming data processing is a strong plus.
7. Experience with Python programming.
8. Strong skills in building positive relationships across Product and Engineering.
9. Able to quickly pick up new programming languages, technologies, and frameworks.
10. Experience working in Agile and Scrum development processes.
11. Fluent English, both spoken and written.