About us:

At Phota Labs, we’re building visual GenAI that helps people capture, express, and relive their memories — in ways that feel effortless, personal, and emotionally resonant. Our core technology enables personalized image generation that faithfully reflects who you are and the moments you experienced. Our first goal is to bring visual GenAI into everyday photography.
We're a small team of researchers, engineers, and designers who have always been at the forefront of how people capture, edit, and share images and videos. We build with our hands and hearts. We believe GenAI is the next shift for photography, and are seeking builders who share this vision — people like us, like you. We're just getting started!
The role:

As our first ML Engineer specializing in inference and optimization, you'll bridge the gap between cutting-edge research models and production systems. Your expertise will transform PyTorch research code into highly optimized, low-latency inference solutions that power our user-facing applications. You'll work closely with our GenAI researchers, vision ML engineers, and backend team to deliver exceptional performance.
What you’ll do:

- Deploy and integrate researcher-trained model checkpoints into our cloud infrastructure and production pipelines.
- Conduct thorough performance profiling and benchmarking to identify and eliminate computational bottlenecks.
- Implement neural network optimization techniques including quantization, pruning, and architectural refinements while preserving model accuracy.
- Develop efficient training and fine-tuning strategies with optimal precision trade-offs and parallelism.
- Build and maintain scalable multi-GPU inference solutions with sophisticated model parallelism and serving architectures.
- Collaborate with the research team to ensure optimizations integrate smoothly with model development workflows.
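To give a concrete flavor of the optimization work described above, here is a minimal post-training dynamic quantization sketch in PyTorch. This is an illustration only, not part of the posting; the toy model is a hypothetical stand-in for a researcher-trained checkpoint.

```python
import torch
import torch.nn as nn

# Hypothetical toy model standing in for a researcher-trained checkpoint.
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10)).eval()

# Post-training dynamic quantization: Linear weights are stored as int8,
# and activations are quantized on the fly at inference time.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 128)
with torch.no_grad():
    out = quantized(x)

print(out.shape)  # torch.Size([1, 10])
```

Dynamic quantization like this typically shrinks Linear-heavy models and speeds up CPU inference with little accuracy loss; GPU serving usually relies on framework-specific int8/fp8 paths instead.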
You may be a strong fit if you:

- Have experience deploying and optimizing deep learning models for production environments, particularly with multi-GPU inference and large-scale model serving.
- Are well-versed in cutting-edge techniques for optimizing both inference and training workloads.
- Possess strong knowledge of efficient attention mechanisms and algorithms.
- Have hands-on experience implementing model quantization and working with inference frameworks.
- Can write production-quality code and successfully integrate ML models into robust inference pipelines.
- Are familiar with various cloud platforms, storage solutions, and modern training frameworks.
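On the efficient-attention point above, a small illustrative check (assuming PyTorch 2.x, where `torch.nn.functional.scaled_dot_product_attention` is available; the tensor shapes are arbitrary): the fused attention kernel should agree numerically with the naive formulation.

```python
import torch
import torch.nn.functional as F

# Arbitrary shapes: (batch, heads, seq_len, head_dim).
q = torch.randn(2, 8, 64, 32)
k = torch.randn(2, 8, 64, 32)
v = torch.randn(2, 8, 64, 32)

# Fused attention (dispatches to FlashAttention-style kernels when available).
fused = F.scaled_dot_product_attention(q, k, v)

# Naive reference: softmax(QK^T / sqrt(d)) V.
scores = (q @ k.transpose(-2, -1)) / (q.size(-1) ** 0.5)
naive = scores.softmax(dim=-1) @ v

print(torch.allclose(fused, naive, atol=1e-4))  # True
```

The fused path avoids materializing the full attention-score matrix, which is where most of the memory and latency savings come from at long sequence lengths.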
Logistics:

- This role is based in San Jose, where we work in person. We believe the best ideas come from being in the same room.
- We sponsor visas and are committed to working through the process together for the right candidates. If you're currently outside the US, we'll also help you relocate.
- We offer generous health, dental, and vision coverage, unlimited PTO, paid parental leave, and relocation support as needed.
- Don't meet every single qualification? That’s okay — we care more about your trajectory than checking every box. If the role excites you and the mission resonates, we'd love to hear from you.
Note: In the event your application is successful and an offer of employment is made to you, any offer of employment will be conditional on the results of a background check, performed by a third party acting on our behalf.
Top Skills: Cloud Platforms, Deep Learning, Model Quantization, Multi-GPU Inference, PyTorch

Phota Labs Office: San Jose, California, USA