TQuanta Technologies - Spark/Scala Developer - Data Pipeline
Job details
Job Description:
We are looking for an experienced Spark/Scala Developer to join our dynamic team. The ideal candidate will have strong technical expertise in Apache Spark and Scala, along with hands-on experience in building, managing, and optimising large-scale data pipelines. This role offers the opportunity to work in a fast-paced environment, tackle complex data challenges, and contribute to the development of cutting-edge solutions.

Key Responsibilities:
- Design, develop, and maintain scalable data pipelines using Apache Spark and Scala.
- Optimise and transform large datasets into structured formats for analytics and machine learning models.
- Integrate data from various sources such as APIs, relational databases, and NoSQL databases.
- Implement data validation and cleansing procedures to ensure data quality and consistency.
- Optimise Spark jobs for performance, ensuring efficient use of resources and cost-effective solutions.
- Analyse and resolve performance bottlenecks in distributed data processing systems.
- Work closely with data engineers, architects, analysts, and other stakeholders to understand business requirements and deliver data-driven solutions.
- Provide technical guidance to junior developers and contribute to code reviews and best practices.
- Write and maintain unit tests to ensure code reliability and functionality.
- Document system designs, workflows, and processes for knowledge sharing and future reference.
- Monitor and troubleshoot production issues related to data processing pipelines.
- Implement robust error-handling mechanisms to ensure data pipeline resilience.

Required Skills and Qualifications:
- Experience: 5-10 years in software development with a focus on big data technologies.
- Proficiency in the Scala programming language and functional programming concepts.
- Hands-on experience with Apache Spark (Core, SQL, Streaming, and MLlib modules).
- Strong understanding of distributed computing principles and architectures.
- Familiarity with data formats such as Avro, Parquet, JSON, and ORC.
- Experience with Hadoop, Hive, HDFS, and related tools is a plus.
- Familiarity with Kafka or similar message queue systems for real-time data processing.
- Proficiency in SQL and working knowledge of relational databases (e.g., MySQL, PostgreSQL) and NoSQL databases (e.g., Cassandra, MongoDB).
- Exposure to cloud-based data services such as AWS EMR, Azure Databricks, or GCP Dataflow is highly desirable.
- Familiarity with Git and CI/CD pipelines for automated deployments.

Soft Skills:
- Strong problem-solving abilities and an analytical mindset.
- Excellent communication skills to collaborate with technical and non-technical stakeholders.
- Ability to work independently and as part of a team in a dynamic, fast-paced environment.

Preferred Qualifications:
- Education: Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.

(ref:hirist.tech)
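To give candidates a feel for the functional data-validation and cleansing style this role calls for, here is a minimal plain-Scala sketch. It uses only the standard library (no Spark dependency); the `Record` type and the specific validation rules are hypothetical illustrations, not part of any actual codebase.

```scala
// Hypothetical record shape for illustration only.
case class Record(id: String, email: String, age: Int)

object Cleansing {
  // A validation rule returns None when the record passes,
  // or Some(errorMessage) when it fails.
  type Rule = Record => Option[String]

  val rules: List[Rule] = List(
    r => if (r.id.trim.nonEmpty) None else Some("blank id"),
    r => if (r.email.contains("@")) None else Some(s"bad email: ${r.email}"),
    r => if (r.age >= 0 && r.age <= 120) None else Some(s"age out of range: ${r.age}")
  )

  // Normalise fields, then split records into those that pass every rule
  // and the error messages collected from those that do not.
  def cleanse(records: Seq[Record]): (Seq[Record], Seq[String]) = {
    val normalised = records.map(r => r.copy(email = r.email.trim.toLowerCase))
    val (ok, bad)  = normalised.partition(r => rules.forall(_(r).isEmpty))
    val errors     = bad.flatMap(r => rules.flatMap(rule => rule(r)))
    (ok, errors)
  }
}
```

The same `map`/`partition`/`flatMap` style carries over directly to Spark `Dataset` transformations, which is why fluency with functional collection operations is listed alongside Spark experience.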
Apply safely
To stay safe in your job search, learn about common scams, and get free expert advice, we recommend visiting SAFERjobs, a non-profit joint industry and law enforcement organisation working to combat job scams.