
Sr. Data Engineer – Spruce Technology

Job Information

  • No. of Openings: 1
  • Job Experience: 5-10 years

Job Description

Spruce Technology, Inc., a mid-size and rapidly growing technology services firm, seeks to hire a Senior Data Engineer. An award-winning firm (Inc 5000, SmartCEO, Entrepreneur of the Year) with a steadily growing portfolio of commercial and government clients, Spruce provides innovative technology solutions, specialized IT staff, and IT strategy consulting nationwide. Spruce maintains partnerships with major technology vendors and continually develops leading-edge offerings in service areas such as digital experience, data services, application development, infrastructure, cyber security, and IT staffing.

Please visit {removed} for additional information on our services.

Spruce Technology, Inc. is an Equal Opportunity Employer that does not discriminate on the basis of actual or perceived age, sex, pregnancy, race, creed, color, national origin, disability, marital status, sexual orientation, citizenship status, genetic information, religion, or any other characteristic protected by applicable federal, state or local laws.

Title:
Sr. Data Engineer

Job Type:
Full Time Position

Duration:
Full Time Employment

Location:
Remote Work

DESCRIPTION:

Design, develop, and maintain scalable data pipelines, and build out new API integrations for data transfer.

Perform data analysis to troubleshoot and resolve data-related issues.

Collaborate with Analysts to understand upstream data assets.

Own big data enrichment pipelines using AWS technologies such as Glue, PySpark, and MWAA, with data hosted in S3, Snowflake, and PostgreSQL.

Develop locally using Docker and tools such as VS Code, DBeaver, Jupyter Notebook, and Glue interactive sessions.

Work with QA Engineers to create and maintain tests for CI/CD gates and ETL validation.

Contribute beyond the data layer by developing application-layer code in Python.

Build resilient CI/CD pipelines alongside DevOps using GitHub Actions and Terraform.

Consult with Architects and other data engineers to define and follow best practices.

Knowledge and Skills

Must-have:

BS or MS degree in Computer Science or a related technical field

5+ years of extensive ETL development experience using PySpark/Glue on AWS

5+ years of experience with CSV, JSON, and Parquet file formats

5+ years of experience with S3, Athena, RDS, the Glue Data Catalog, and CloudFormation/Terraform

Experience querying nested/JSON data stored in Parquet files or tables

Strong understanding of ETL/data-pipeline/big-data architecture

Strong database/SQL experience in any RDBMS

Nice-to-have:

Experience with schema design and data ingestion on Snowflake (or an equivalent MPP platform)

Experience orchestrating data processing jobs using Step Functions, Glue workflows, or Apache Airflow (MWAA)

Experience with data analysis in Excel (formulas, VLOOKUP, pivot tables, slicers)

Education

Bachelor’s degree in Computer Science (or related field) or equivalent combination of education and experience

Typical Range of Experience

5-10 years

