Data Engineer
Job details
Job Title: PySpark Developer
Experience: 5+ Years
Location: Dubai
Notice Period: Immediate to 30 Days

About Us:
ValueLabs is a global technology services company empowering businesses through innovative solutions and services. With expertise in AI, cloud, and cybersecurity, we deliver cutting-edge solutions that drive growth and efficiency.

Education and Experience:
• Bachelor's or Master's degree in Computer Science, Data Engineering, Information Systems, or a related field.
• 3+ years of experience as a Data Engineer, with a strong focus on PySpark and the Cloudera Data Platform.

Responsibilities:
• Data Pipeline Development: Design, develop, and maintain highly scalable and optimized ETL pipelines using PySpark on the Cloudera Data Platform (CDP), ensuring data integrity and accuracy (see the illustrative sketch at the end of this posting).
• Data Ingestion: Implement and manage data ingestion processes from a variety of sources (e.g., relational databases, APIs, file systems) into the data lake or data warehouse on CDP.
• Data Transformation and Processing: Use PySpark to process, cleanse, and transform large datasets into meaningful formats that support analytical needs and business requirements.
• Performance Optimization: Conduct performance tuning of PySpark code and Cloudera components, optimizing resource utilization and reducing the runtime of ETL processes.
• Data Quality and Validation: Implement data quality checks, monitoring, and validation routines to ensure data accuracy and reliability throughout the pipeline.
• Automation and Orchestration: Automate data workflows using tools such as Apache Oozie, Airflow, or similar orchestration tools within the Cloudera ecosystem.
• Monitoring and Maintenance: Monitor pipeline performance, troubleshoot issues, and perform routine maintenance on the Cloudera Data Platform and associated data processes.
• Collaboration: Work closely with data engineers, analysts, product managers, and other stakeholders to understand data requirements and support data-driven initiatives.
• Documentation: Maintain thorough documentation of data engineering processes, code, and pipeline configurations.

Technical Skills:
• PySpark: Advanced proficiency in PySpark, including working with RDDs, DataFrames, and optimization techniques.
• Cloudera Data Platform: Strong experience with Cloudera Data Platform (CDP) components, including Cloudera Manager, Hive, Impala, HDFS, and HBase.
• Data Warehousing: Knowledge of data warehousing concepts and ETL best practices, and experience with SQL-based tools (e.g., Hive, Impala).
• Big Data Technologies: Familiarity with Hadoop, Kafka, and other distributed computing tools.
• Orchestration and Scheduling: Experience with Apache Oozie, Airflow, or similar orchestration frameworks.
• Scripting and Automation: Strong scripting skills on Linux.

Thank you,
Noorjahan Shaik
shaik.noorjahan@valuelabs.com
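Illustrative sketch: as an example of the kind of PySpark ETL work described in the Responsibilities above, here is a minimal batch pipeline step that ingests raw files from HDFS, cleanses and transforms them, runs a simple data quality check, and writes the result to a Hive table queryable from Hive or Impala. This is a sketch only; all paths, database, table, and column names are hypothetical, and it assumes a Spark installation with Hive support configured (as on CDP).

# Minimal PySpark ETL sketch; every name below is hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("orders_daily_etl")   # hypothetical job name
    .enableHiveSupport()           # assumes a Hive metastore is configured
    .getOrCreate()
)

# Ingest: read raw CSV files landed in HDFS (path is illustrative).
raw = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv("hdfs:///data/raw/orders/")
)

# Transform: cleanse and standardise the data.
cleaned = (
    raw.dropDuplicates(["order_id"])
       .filter(F.col("amount").isNotNull())
       .withColumn("order_date", F.to_date("order_date", "yyyy-MM-dd"))
       .withColumn("amount", F.col("amount").cast("double"))
)

# Data quality check: fail fast if any business key is missing.
null_keys = cleaned.filter(F.col("order_id").isNull()).count()
if null_keys > 0:
    raise ValueError(f"Data quality check failed: {null_keys} rows with null order_id")

# Load: write a partitioned Parquet table (assumes the 'analytics' database exists).
(
    cleaned.write
    .mode("overwrite")
    .partitionBy("order_date")
    .format("parquet")
    .saveAsTable("analytics.orders_daily")
)

spark.stop()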