Hadoop Developer/Data Engineer

Location: Curitiba, PR, BR

Company: ExxonMobil

Job Role Summary

Develop and support data pipelines, ETL feeds, and applications: reading data from external sources, merging and transforming data, performing data enrichment, and loading the results into target data destinations.
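A minimal Spark (Scala) sketch of that flow, for illustration only; the paths and table names (/landing/orders.csv, ref.customers, curated.orders_enriched) are hypothetical:

// Read from an external source, merge with reference data, enrich, and load.
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object EtlSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("etl-sketch")
      .enableHiveSupport()
      .getOrCreate()

    // Read data from an external source (placeholder path).
    val orders = spark.read.option("header", "true").csv("/landing/orders.csv")

    // Merge with reference data already in the lake (placeholder table).
    val customers = spark.table("ref.customers")
    val merged = orders.join(customers, Seq("customer_id"), "left")

    // Transform and enrich: normalize types, derive columns.
    val enriched = merged
      .withColumn("order_ts", to_timestamp(col("order_date")))
      .withColumn("is_priority", col("amount").cast("double") > lit(1000))

    // Load into the target destination (a Hive table here).
    enriched.write.mode("overwrite").saveAsTable("curated.orders_enriched")

    spark.stop()
  }
}

The sketch only fixes the shape of the work (read, merge, transform/enrich, load); real feeds would read from the configured source systems.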

Job Role Responsibilities

Primary Job Functions:
•    Provide technical support and development expertise for data pipelines such as Hadoop/Spark workflows and Kafka/Spark Streaming real-time pipelines (see the streaming sketch after this list).
•    Develop or enhance Hadoop workflows, including ingestion, extraction, and processing of structured and unstructured data.
•    Apply analytical database experience to write complex queries and to optimize, debug, and troubleshoot data issues on HBase and Hive.
•    Support ongoing activities: ingesting additional data or new data sources, processing and monitoring incremental flows, managing Solr collections, and reprocessing data.
•    Apply data lake knowledge and experience migrating use cases from on-premises HDP clusters to the CDP platform (Cloudera distribution migration).
•    Optimize existing Hadoop workflows and work with stakeholders on data-gap issues.
•    Engage in proofs of concept, technical demos, and interactions with customers and other technical teams.
•    Apply common CI/CD best practices and work with continuous integration tools for big data use cases.
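
A minimal sketch of such a real-time pipeline using Spark Structured Streaming over Kafka; the broker address, topic, event schema, and output/checkpoint paths below are all assumptions for illustration:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._

object StreamSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("stream-sketch").getOrCreate()

    // Assumed event payload: {"sensor_id": "...", "reading": 42.0}
    val schema = new StructType()
      .add("sensor_id", StringType)
      .add("reading", DoubleType)

    // Read the raw Kafka stream; broker and topic are placeholders.
    val raw = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "sensors")
      .load()

    // Parse the JSON payload and keep only well-formed records.
    val parsed = raw
      .select(from_json(col("value").cast("string"), schema).as("event"))
      .select("event.*")
      .filter(col("sensor_id").isNotNull)

    // Land the stream as Parquet; the checkpoint makes the flow restartable.
    val query = parsed.writeStream
      .format("parquet")
      .option("path", "/data/streams/sensors")
      .option("checkpointLocation", "/checkpoints/sensors")
      .start()

    query.awaitTermination()
  }
}

The checkpoint location is the key design choice: it lets the stream recover its Kafka offsets after a restart instead of reprocessing or dropping data.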

Expected Level of Proficiency

Job Requirements:
•    3-10 years of experience working with the Hadoop big data ecosystem.
•    Hands-on working experience with big data technologies such as Spark, Hive, HBase, Kafka, and Spark Streaming.

Primary skills: 
•    Experience with Spark components such as the DataFrame/Dataset API and Spark Streaming.
•    Experience with a programming language such as Scala and/or Python.
•    Experience with the Apache Solr search engine.
•    Experience with NoSQL databases such as HBase and Apache Phoenix (a query sketch follows this list).
•    Experience with Hive Query Language (HiveQL) and query optimization.
•    Sound knowledge of performance tuning and optimization (Spark, HBase, and Hive).
•    Sound working knowledge of Kafka and Spark Streaming.
•    Experience with job scheduling and monitoring using Oozie.
•    Solid experience with a SQL-based language.
•    Experience with UNIX or UNIX-like systems.
•    Ability to interface with customers, technology partners, and testing, architecture, and analysis groups.
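
As referenced above, an illustrative sketch of querying HBase through the Apache Phoenix JDBC driver; the ZooKeeper quorum (zk1:2181) and the SENSORS table and its columns are assumptions:

import java.sql.DriverManager

object PhoenixSketch {
  def main(args: Array[String]): Unit = {
    // Phoenix JDBC URLs take the form jdbc:phoenix:<zookeeper quorum>.
    val conn = DriverManager.getConnection("jdbc:phoenix:zk1:2181")
    try {
      // Parameterized SQL over an HBase-backed table (hypothetical schema).
      val stmt = conn.prepareStatement(
        "SELECT sensor_id, MAX(reading) AS peak FROM SENSORS " +
        "WHERE reading > ? GROUP BY sensor_id")
      stmt.setDouble(1, 100.0)
      val rs = stmt.executeQuery()
      while (rs.next()) {
        println(s"${rs.getString("sensor_id")} -> ${rs.getDouble("peak")}")
      }
    } finally {
      conn.close()
    }
  }
}

Phoenix keeps HBase's row-key-oriented storage underneath but exposes it as standard SQL over JDBC, which is what makes parameterized queries like this possible.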
Secondary Skills:
•    NiFi knowledge is a plus.
•    Familiarity with Azure DevOps and Git tooling.
•    Experience with Scrum or a similar Agile framework.
•    Strong verbal and written communication skills.
•    Ability to work with data scientists and develop seamless workflows.
•    Self-starter and individual contributor.
•    Strong problem-solving skills.
•    Able to create technical documentation for use cases.

Job Segment: Database, Engineer, SQL, Unix, Programmer, Technology, Engineering