Position Summary

Big Data Engineers serve as the backbone of the Strategic Analytics organization, ensuring both the reliability and applicability of the team's data products to the entire Samsung organization. They have extensive experience with ETL design, coding, and testing patterns, as well as with engineering software platforms and large-scale data infrastructures. Big Data Engineers can architect highly scalable end-to-end pipelines using different open source tools, including building and operationalizing high-performance algorithms. They understand how to apply technologies to solve big data problems, with expert knowledge of languages and tools such as Java, Python, Linux, PHP, Hive, Impala, and Spark. Extensive experience working with both 1) big data platforms and 2) real-time streaming delivery of data is essential.

Big Data Engineers implement complex big data projects with a focus on collecting, parsing, managing, analyzing, and visualizing large sets of data to turn information into actionable deliverables across customer-facing platforms. They have a strong aptitude for deciding on the needed hardware and software design and can guide the development of such designs through both proofs of concept and complete implementations.

Additional qualifications should include:
• Tuning Hadoop solutions to improve performance and end-user experience
• Proficiency in designing efficient and robust data workflows
• Documenting requirements and resolving conflicts or ambiguities
• Experience working in teams and collaborating with others to clarify requirements
• Strong coordination and project management skills to handle complex projects

Key Responsibilities (MUST-HAVE SKILLS)
• Ownership of the platform framework and tools.
• Enhance, enable, and implement CI/CD.
• Code new operators and functions to automate ETL use cases.
• Implement container-based architecture, infrastructure as code, and configuration management for the platform.

Required Skills (MUST-HAVE SKILLS)
• Bachelor's degree in Computer Science.
• Exceptionally strong coding and optimized algorithm skills in Python.
• AWS administration skills with automation, architecture, and implementation experience.
• Experience with Apache Airflow implementation.
• Data Engineering/Data Operations and ETL development experience.
• Strong communication, troubleshooting, and coordination skills, including communication with platform stakeholders, data center partners, and vendors.

Preferred Skills
• 5 years of Python development experience
• Ability to write MapReduce jobs
• Knowledge of and ability to implement workflow/schedulers within Oozie and/or Airflow (a good understanding is needed)
• Understanding and implementation of Flume processes
• Good aptitude in multi-threading and concurrency concepts
• Good knowledge of database structures, theories, principles, and practices