Data Engineer
As a Data Engineer, you will design, develop, and optimize scalable data pipelines and workflows to support advanced analytics and business intelligence needs. You will collaborate with cross-functional teams to ensure data accessibility, integrity, and security.
Core Responsibilities:
- Design, develop, and implement robust data pipelines for data collection, transformation, and integration (a minimal sketch follows this list).
- Collaborate with senior engineers to architect scalable data solutions using Azure services, including Azure Data Factory and Databricks.
- Integrate data from SAP ERP systems and other enterprise platforms into modern cloud-based data ecosystems.
- Leverage Databricks for big data processing and workflow optimization.
- Work with stakeholders to understand data requirements, ensuring data quality and consistency.
- Maintain data governance practices to support compliance and security protocols.
- Support analytics teams by providing well-structured, reliable data for reporting and machine learning projects.
- Troubleshoot and resolve data pipeline and workflow issues.
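To make the pipeline work above concrete, below is a minimal sketch of an ingest-and-transform step, assuming a Databricks/PySpark environment with Delta Lake available; all paths, table names, and columns are hypothetical and not taken from this posting.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("orders_ingest").getOrCreate()

# Collect: read raw CSV files landed by an upstream process (path is illustrative).
raw = spark.read.option("header", "true").csv("/mnt/raw/orders/")

# Transform: enforce types, drop rows missing the business key, stamp the load time.
clean = (raw
         .withColumn("order_ts", F.to_timestamp("order_ts"))
         .withColumn("amount", F.col("amount").cast("decimal(18,2)"))
         .filter(F.col("order_id").isNotNull())
         .withColumn("_ingested_at", F.current_timestamp()))

# Integrate: append to a Delta table that downstream consumers query.
clean.write.format("delta").mode("append").saveAsTable("staging.orders")
```

In practice a step like this would typically be scheduled and orchestrated by Azure Data Factory or a Databricks job rather than run ad hoc.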
Qualifications:
- Bachelor’s degree in Computer Science, Data Engineering, Information Systems, or a related field.
- 3–5 years of experience in data engineering or a related role.
- Proficiency in Azure technologies, including Azure Data Factory, Azure SQL Database, and Databricks.
- Experience with SAP data integration is a plus.
- Strong SQL and Python programming skills for data engineering tasks.
- Familiarity with data modeling concepts (e.g., star and snowflake schemas) and best practices; a short schema sketch follows this list.
- Experience with CI/CD pipelines for deploying data workflows and infrastructure.
- Knowledge of distributed storage systems such as Azure Data Lake Storage or equivalent cloud storage solutions.
- Basic understanding of Apache Spark for distributed data processing.
- Strong problem-solving skills and a collaborative mindset.
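As a point of reference for the data modeling item above, here is a minimal star-schema sketch written as Spark SQL run from Python; it assumes a SparkSession named `spark` and Delta Lake support, and every table and column name is illustrative.

```python
# One fact table keyed to two denormalized dimensions (a classic star schema).
spark.sql("""
    CREATE TABLE IF NOT EXISTS dim_customer (
        customer_key  BIGINT,
        customer_name STRING,
        country       STRING
    ) USING DELTA
""")

spark.sql("""
    CREATE TABLE IF NOT EXISTS dim_date (
        date_key      INT,     -- e.g. 20240131
        calendar_date DATE,
        year          INT,
        month         INT
    ) USING DELTA
""")

spark.sql("""
    CREATE TABLE IF NOT EXISTS fact_sales (
        customer_key  BIGINT,  -- references dim_customer
        date_key      INT,     -- references dim_date
        quantity      INT,
        amount        DECIMAL(18,2)
    ) USING DELTA
""")
```

The fact table holds measures and foreign keys while each dimension stays denormalized; a snowflake schema would normalize the dimensions further (e.g., splitting country out of dim_customer).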
Technical Knowledge:
- Deep understanding of Azure cloud infrastructure and services, particularly those related to data management (e.g., Azure Data Lake, Azure Blob Storage, Azure SQL Database).
- Experience with Azure Data Factory (ADF) for orchestrating ETL pipelines and automating data workflows.
- Familiarity with Azure Databricks for big data processing, machine learning, and collaborative analytics, including cluster management and performance optimization for big data workloads.
- Expertise in Apache Spark for distributed data processing and large-scale analytics.
- Understanding of the Databricks medallion architecture (Bronze, Silver, and Gold layers); a short sketch follows this list.
- Understanding of distributed file systems such as HDFS and their cloud-based equivalents, such as Azure Data Lake Storage.
- Proficiency in SQL and NoSQL databases, including designing schemas, query optimization, and managing large datasets.
- Experience with data warehousing solutions such as Databricks, Azure Synapse Analytics, or Snowflake.
- Familiarity with connecting data lakehouses to Power BI.
- Understanding of OLAP (Online Analytical Processing) and OLTP (Online Transaction Processing) systems.
- Strong grasp of data modeling techniques, including conceptual, logical, and physical data models.
- Experience with star schema, snowflake schema, and normalization for designing scalable, performant databases.
- Knowledge of data architecture best practices, ensuring efficient data flow, storage, and retrieval.
- Knowledge of CI/CD pipelines for automating the deployment of data pipelines, databases, and infrastructure.
- Experience with infrastructure-as-code tools such as Terraform or Azure Resource Manager (ARM) templates to manage cloud resources.
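A compact sketch of the medallion (Bronze/Silver/Gold) flow named above, assuming a Databricks workspace with Delta Lake and pre-created bronze/silver/gold schemas; paths, tables, and columns are hypothetical.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("medallion_demo").getOrCreate()

# Bronze: capture raw data as-is, adding only ingestion metadata.
bronze = (spark.read.json("/mnt/landing/events/")  # illustrative landing path
          .withColumn("_ingested_at", F.current_timestamp()))
bronze.write.format("delta").mode("append").saveAsTable("bronze.events")

# Silver: deduplicate, enforce types, and filter out records missing a key.
silver = (spark.table("bronze.events")
          .dropDuplicates(["event_id"])
          .withColumn("event_ts", F.to_timestamp("event_ts"))
          .filter(F.col("event_id").isNotNull()))
silver.write.format("delta").mode("overwrite").saveAsTable("silver.events")

# Gold: business-level aggregate ready for BI or ML consumption.
gold = (spark.table("silver.events")
        .groupBy(F.to_date("event_ts").alias("event_date"))
        .agg(F.count("*").alias("event_count")))
gold.write.format("delta").mode("overwrite").saveAsTable("gold.daily_event_counts")
```

The value of the pattern is that Bronze keeps raw data replayable, Silver applies cleansing once in a single place, and Gold exposes only business-ready tables to analysts.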
Preferred Qualifications:
- Familiarity with workflow orchestration tools such as Apache Airflow (a minimal DAG sketch follows this list).
- Knowledge of Azure Monitor or similar tools for system performance tracking.
- Certifications in Azure Data Engineering or related cloud platforms.
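For orientation, here is a minimal Airflow DAG of the kind referenced above; the DAG id, schedule, and task bodies are hypothetical, and the `schedule` argument assumes Airflow 2.4+ (earlier releases use `schedule_interval`).

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    # Placeholder: pull data from a source system.
    print("extracting")

def transform():
    # Placeholder: clean and reshape the extracted data.
    print("transforming")

with DAG(
    dag_id="example_daily_pipeline",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    extract_task >> transform_task
```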
Location: Törökbálint, Tópark u. 9.