Lead Data Engineer

Cloudera

  • Costa Rica
  • Permanent
  • Full-time
  • 7 hours ago
Business Area: IT

Seniority Level: Mid-Senior level

Job Description:

At Cloudera, we empower people to transform complex data into clear and actionable insights. With as much data under management as the hyperscalers, we're the preferred data partner for the top companies in almost every industry. Powered by the relentless innovation of the open source community, Cloudera advances digital transformation for the world’s largest enterprises.

Cloudera Data Engineers are core contributors to Cloudera’s internal Data Platform, responsible for building scalable, resilient pipelines that support Business Operations, Analytics, and cutting-edge AI/ML initiatives. In this role, you’ll design, implement, and optimize data pipelines using a wide range of Big Data and cloud-native technologies, supporting use cases from reporting and compliance to LLM-enabled apps.

As a Lead Data Engineer, you will not only contribute technically, but also lead technical initiatives, guide team members, and ensure architectural consistency across projects. This role is ideal for someone who thrives on solving complex data problems and is deeply motivated to incorporate AI into modern data engineering practices.

You will work in a fast-paced, agile, product-driven environment, collaborating with cross-functional teams to build systems that power insights, automation, and innovation. We’re looking for someone with a passion for both traditional data engineering and the exploration of AI-assisted tools and GenAI-powered solutions in day-to-day development.

As a Lead Data Engineer you will:

  • Provide technical leadership and mentorship to other data engineers, helping them grow and succeed in their roles.
  • Lead design reviews, enforce coding standards, and guide architectural decisions to ensure platform scalability and maintainability.
  • Partner with senior stakeholders and engineering leadership to prioritize and plan technical work that aligns with strategic business goals.
  • Act as a technical point of contact for cross-team collaborations, participating in project planning, scoping, and retrospectives.
  • Demonstrate curiosity, initiative, and capability to integrate AI-powered tools (e.g., Copilot, GenAI code assistants, LLM-enhanced documentation search) into day-to-day engineering workflows.
  • Support AI/GenAI projects by building pipelines that feed LLM apps and recommendation systems.
  • Drive adoption of data engineering best practices across the team, including testing, CI/CD, and observability.
  • Design, build, and maintain batch and streaming data pipelines using Cloudera's native data platform (CDP).
  • Ingest data from various sources, including REST APIs, cloud services, and enterprise SaaS platforms (e.g., Salesforce, NetSuite, Workday).
  • Implement robust transformation logic using Python, PySpark, SQL, and Java across structured and semi-structured datasets.
  • Ensure data integrity, lineage, and performance across ingestion, transformation, and delivery layers.

We are excited about you if you have:

  • 6+ years of experience as a Data Engineer in large-scale data environments.
  • 2+ years of experience leading technical projects or mentoring engineers.
  • Experience leading design/architecture discussions in high-volume data environments.
  • Expert-level coding skills in SQL and Python, with working knowledge of PySpark or Java.
  • Strong understanding of data modeling, ETL/ELT processes, and data warehouse concepts (e.g., Kimball/Inmon).
  • Experience working with Hadoop ecosystem tools (e.g., Spark, Hive, Impala, Kafka).
  • Demonstrated experience designing and maintaining production-grade data pipelines under SLAs.
  • Familiarity with API-based ingestion, performance tuning, and schema evolution.

You might also have:

  • Hands-on experience with Apache NiFi, Apache Airflow.
  • Experience with containerized services (Docker), FastAPI, or internal developer tooling.
  • Experience supporting AI/ML platforms or developing pipelines for LLM and generative AI use cases, with a track record of contributing to internal AI tooling or automation within the data engineering lifecycle.

What you can expect from us:

  • Generous PTO Policy
  • Support work life balance with Flexible WFH Policy
  • Mental & Physical Wellness programs
  • Phone and Internet Reimbursement program
  • Access to Continued Career Development
  • Comprehensive Benefits and Competitive Packages
  • Employee Resource Groups

EEO/VEVRAA

