We're seeking a Data Engineer to architect and build sophisticated data solutions with Spark, PySpark, Databricks, and EMR, advancing our mission to transform the cybersecurity breach readiness and response market.
Why Mitiga?
Mitiga preemptively detects and stops attacks before damage is done. Mitiga moves your security beyond configuration-focused prevention. In today’s cloud-first, AI-driven world, attackers inevitably get in. Mitiga promptly stops them.
Our platform connects Cloud, SaaS, AI, and Identity into one panoramic forensic system that gives SecOps total awareness, attack decoding, and autonomous containment. The result: attacks stop mid-flight, investigations are instant, and impact disappears. We replace the false promise of “zero breach” with a promise we can keep: Zero Impact.
When attackers get in, Mitiga ensures they get nothing.
Zero Impact Breach Mitigation. Mitiga is used by many well-known brands to reduce risk, enhance their SecOps, and improve business resilience.
What You’ll Do:
Join us in crafting cutting-edge solutions for the cyber world using Spark/PySpark ETLs and data-flow processes. You'll work across multi-cloud environments, collaborating closely with investigators to fine-tune PySpark performance, and use technologies like Databricks to scale our technical projects for efficiency. You'll research and implement new techniques, and grow as a key member of the Mitiga R&D team.
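To give a flavor of this work, here is a minimal sketch of the kind of PySpark ETL the role involves. The bucket paths and event schema below are hypothetical illustrations, not Mitiga's actual pipeline:

```python
# Minimal PySpark ETL sketch: normalize raw cloud security events.
# All paths and field names are hypothetical examples.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("security-event-etl").getOrCreate()

# Read raw JSON events from object storage (path is illustrative).
raw = spark.read.json("s3://example-bucket/raw/cloudtrail/")

# Normalize timestamps and keep only the fields downstream analysis needs.
events = (
    raw
    .withColumn("event_time", F.to_timestamp("eventTime"))
    .select("event_time", "eventName", "sourceIPAddress", "awsRegion")
    .filter(F.col("event_time").isNotNull())
)

# Write partitioned Parquet for efficient downstream queries.
(events.write
    .mode("overwrite")
    .partitionBy("awsRegion")
    .parquet("s3://example-bucket/curated/cloudtrail/"))
```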
Technical Impact:
- Design and implement complex data processing architectures for cloud security analysis
- Optimize and scale critical PySpark workflows across multi-cloud environments
- Develop innovative solutions for processing and analyzing massive security datasets
- Drive technical excellence through sophisticated ETL implementations
- Contribute to architectural decisions and technical direction
Core Responsibilities:
- Build robust, scalable data pipelines for security event processing
- Optimize performance of large-scale PySpark operations (see the sketch after this list)
- Implement advanced data solutions using Databricks and cloud-native technologies
- Research and prototype new data processing methodologies
- Provide technical guidance and best practices for data engineering initiatives
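For the performance-tuning responsibility above, one common PySpark pattern is broadcasting a small reference table so a join avoids shuffling the large fact table. This is a generic sketch with invented table names and columns, not a description of Mitiga's system:

```python
# Sketch of a common PySpark optimization: broadcast a small dimension
# table so the join avoids shuffling the large events table.
# Table names, columns, and paths are invented for illustration.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("join-tuning-sketch").getOrCreate()

events = spark.read.parquet("s3://example-bucket/curated/cloudtrail/")  # large
ip_reputation = spark.read.parquet("s3://example-bucket/ref/ip_rep/")   # small

# The broadcast hint ships the small table to every executor, so the
# large events table is joined in place instead of being shuffled.
enriched = events.join(
    F.broadcast(ip_reputation),
    on="sourceIPAddress",
    how="left",
)

# Repartition by region before writing so output files stay balanced.
enriched.repartition("awsRegion").write.mode("overwrite").parquet(
    "s3://example-bucket/enriched/cloudtrail/"
)
```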
Preferred Qualifications:
- Experience with security-focused data solutions
- Deep expertise with Splunk and AWS services (S3, SQS, SNS, Stream)
- Advanced understanding of distributed systems
- Strong Linux systems knowledge
- Experience with real-time data processing architectures
Who You Are:
- 4+ years of hands-on data engineering experience in cloud-based SaaS environments
- Deep expertise in PySpark, Python, and SQL optimization
- Advanced knowledge of AWS, Azure, and GCP cloud architectures
- Proven track record implementing production-scale data systems
- Extensive experience with distributed computing and big data processing
- Strong collaboration skills and technical communication abilities