Spruce Technology, Inc., a mid-size, rapidly growing technology services firm, seeks to hire a Senior Data Engineer. An award-winning firm (Inc. 5000, SmartCEO, Entrepreneur of the Year) with a steadily growing portfolio of commercial and government clients, Spruce provides innovative technology solutions, specialized IT staff, and IT strategy consulting nationwide. Spruce maintains partnerships with major technology vendors and continually develops leading-edge offerings in service areas such as digital experience, data services, application development, infrastructure, cybersecurity, and IT staffing.
Please visit {removed} for additional information on our services.
Spruce Technology, Inc. is an Equal Opportunity Employer that does not discriminate on the basis of actual or perceived age, sex, pregnancy, race, creed, color, national origin, disability, marital status, sexual orientation, citizenship status, genetic information, religion, or any other characteristic protected by applicable federal, state or local laws.
Title:
Sr. Data Engineer
Job Type:
Full Time Position
Duration:
Full Time Employment
Location:
Remote Work
DESCRIPTION:
Design, develop, and maintain scalable data pipelines, and build out new API integrations for data transfer
Perform the data analysis needed to troubleshoot data-related issues and assist in their resolution
Collaborate with Analysts to understand upstream data assets
Own big data enrichment pipelines using AWS technologies like Glue, PySpark and MWAA with data hosted in S3, Snowflake, and PostgreSQL
Develop locally using Docker and tools such as VS Code, DBeaver, Jupyter Notebook, and AWS Glue interactive sessions
Work with QA Engineers to create and maintain tests for CI/CD gates and ETL validation
Contribute beyond the data layer by developing application layer code in Python
Build resilient CI/CD alongside DevOps using GitHub Actions and Terraform
Consult with Architects and other data engineers to define and follow best practices
Knowledge and Skills
Must-have:
BS or MS degree in Computer Science or a related technical field
5+ years of extensive ETL development experience using PySpark/Glue on AWS
5+ years of experience working with CSV, JSON, and Parquet file formats
5+ years of experience with S3, Athena, RDS, Glue Data Catalog, and CloudFormation/Terraform
Experience querying nested/JSON data stored in Parquet files or tables
Strong understanding of ETL/Data-pipelines/BigData architecture
Strong Database/SQL experience in any RDBMS
Nice-to-have:
Experience in schema design and data ingestion on Snowflake (or an equivalent MPP platform)
Experience orchestrating data processing jobs using Step Functions, Glue workflows, or Apache Airflow (MWAA)
Experience in data analysis using Excel formulas, VLOOKUP, pivot tables, and slicers
Education
Bachelor’s degree in Computer Science (or related field) or equivalent combination of education and experience
Typical Range of Experience
Associated topics:
data administrator, data engineer, data integration, data manager, data warehousing, database administrator, etl, mongo database, sybase, teradata