Job Title: Big Data Developer

Company: ConsultNet

Location: Rockville, MD

Created: 2024-05-04

Job Type: Full Time

Job Description:

Title: Big Data Developer
Location: Fully remote
Type: Long-term Contract
Pay: Based on experience

Overview:
We are searching for a Big Data Developer to support our client, a large financial regulator, as part of a multi-year initiative to build a new system that assists with the collection and processing of up to 250 million records in support of financial regulations. The Big Data Developer will be responsible for ingesting, storing, validating, and disseminating data in a consumable format so that intelligence teams can gain insights.

Responsibilities:
- Understand complex business requirements
- Design and develop ETL pipelines for collecting, validating, and transforming data according to the specification
- Develop automated unit tests, functional tests, and performance tests
- Maintain optimal data pipeline architecture
- Design ETL jobs for optimal execution in the AWS cloud environment
- Reduce processing time and cost of ETL workloads
- Lead peer reviews and design/code review meetings
- Provide support for the production support operations team
- Implement data quality checks
- Identify areas where machine learning can be used to detect data anomalies

Requirements:
- BS degree in computer science or a related field
- 5+ years of programming experience in Java or Scala
- 5+ years of experience in ETL projects
- 3+ years of experience in Big Data projects
- 2+ years of experience with API development (REST APIs)
- Strong experience in Java or Scala
- Strong experience with big data technologies such as AWS EMR, AWS EKS, and Apache Spark
- Strong experience with serverless technologies such as AWS DynamoDB and AWS Lambda
- Strong experience processing JSON and CSV files
- Must be able to write complex SQL queries
- Experience in performance tuning and optimization
- Familiarity with columnar storage formats (ORC, Parquet) and various compression techniques
- Experience writing Unix shell scripts
- Unit testing using JUnit or ScalaTest
- Experience with CI/CD pipelines
- BPM/AWS Step Functions
- Python scripting
- Performance testing tools such as Gatling or JMeter