
QA Analyst: Data Reliability
- Heredia
- Permanent
- Full-time

Requirements:
- 3+ years of experience in data validation, data QA, analytics, or related reliability testing roles, with expertise in SQL for querying large-scale data (e.g., Databricks).
- Proficiency in Python or PySpark for validation scripting or data analysis, and experience with Kafka and stream-processing systems.
- Familiarity with NoSQL databases like MongoDB and validation of REST APIs using tools such as Postman or Swagger.
- Solid understanding of data pipelines, ETL/ELT processes, and transformation workflows, with the ability to analyze and troubleshoot large datasets.
- Strong communication and documentation skills to support collaborative projects and reporting.
- Knowledge of data validation tools like Great Expectations or dbt tests, and experience integrating tests into CI/CD pipelines.
- Exposure to GenAI tools or to building agentic workflows for automation and decision support, as well as creating dashboards for data quality monitoring.
- Undergraduate degree in Statistics, Computer Science, or another quantitative or technology field is required.

Responsibilities:
- Validate data transformations across the entire pipeline, from Kafka event streams to processed data in Databricks and MongoDB, through APIs and the UI.
- Design and execute validation strategies that ensure data accuracy, completeness, and consistency.
- Analyze large datasets to identify anomalies, data mismatches, and transformation issues.
- Build, curate, and maintain test datasets to support multiple QA use cases and test environments.
- Collaborate with developers and QA engineers to design test cases, validation rules, and data mapping documentation.
- Develop scripts, dashboards, or alerts to proactively monitor data quality.
- Participate in code reviews and provide input on schema design, transformation logic, and reliability strategies.
- Document validation plans, data exception reports, and reliability metrics.