Writer

Performance engineer

Sorry, this job was removed at 06:13 p.m. (PST) on Friday, Nov 21, 2025
In-Office or Remote
Hiring Remotely in USA

📐 About this role 
WRITER is seeking a highly skilled and motivated Principal performance engineer to lead the performance optimization of our cutting-edge Generative AI technology stack. This role is critical in ensuring the scalability, efficiency, and reliability of our Large Language Models (LLMs) and Retrieval Augmented Generation (RAG) systems. You will be a key driver in identifying and resolving performance bottlenecks, optimizing resource utilization, and ensuring a seamless user experience. You will work closely with our AI research, software engineering, and infrastructure teams to deliver world-class AI solutions.


🦸🏻‍♀️ Your responsibilities 

  • Performance leadership:

    • Define and implement performance engineering strategies for our full Generative AI stack, including services, applications, LLMs, RAG pipelines, and related infrastructure.

    • Lead performance testing, profiling, and analysis efforts to identify and resolve performance bottlenecks.

    • Establish and maintain performance benchmarks and SLAs for critical AI services.

    • Provide technical leadership and mentorship to performance engineering team members.

  • LLM capacity and tuning:

    • Analyze and improve LLM inference performance, including latency, throughput, and resource utilization.

    • Develop and implement strategies for LLM capacity planning and scaling.

    • Collaborate with AI researchers to evaluate and improve LLM architectures and training techniques for performance.

    • Optimize LLM inference through techniques such as quantization, distillation, and optimized kernel implementation.

  • RAG performance optimization:

    • Design and implement performance tests for RAG pipelines, including retrieval, ranking, and generation components.

    • Identify and optimize performance bottlenecks in RAG systems, such as database queries, vector search, and document processing.

    • Evaluate and optimize RAG system architectures for scalability and efficiency.

    • Tune vector databases for optimal recall and latency.

  • Infrastructure optimization:

    • Collaborate with infrastructure teams to optimize hardware and software configurations for AI workloads.

    • Evaluate and recommend new technologies and tools for performance monitoring and analysis.

    • Develop and maintain performance dashboards and reports to track key metrics.

    • Optimize GPU utilization and memory management for LLM inference.

  • Collaboration and communication:

    • Work closely with AI researchers, software engineers, and product managers to ensure performance requirements are met.

    • Communicate performance findings and recommendations to stakeholders at all levels.

    • Stay up to date with the latest developments in Generative AI and performance engineering.

⭐️ Is this you?

  • Education:

    • Bachelor's degree in Computer Science, Engineering, or a related field (Master's preferred).

  • Experience:

    • 10+ years of experience in performance engineering, with a focus on large-scale distributed systems.

    • 2+ years of experience working with AI/ML technologies.

    • Proven experience in performance testing, profiling, and analysis of complex software systems.

    • Deep understanding of NLP architectures, training, and inference.

    • Experience with vector databases and search technologies.

    • Experience with cloud computing platforms (e.g., AWS, Azure, GCP) and containerization technologies (e.g., Docker, Kubernetes).

    • Strong programming skills in Python.

    • Familiarity with Postgres and Elasticsearch.

    • Experience with performance analysis tools (e.g., profilers, debuggers, monitoring tools).

  • Skills:

    • Strong analytical and problem-solving skills.

    • Excellent communication and collaboration skills.

    • Ability to work in a fast-paced and dynamic environment.

    • Passion for AI and a desire to push the boundaries of performance engineering.


🍩 Benefits & perks (US Full-time employees)

  • Generous PTO, plus company holidays

  • Medical, dental, and vision coverage for you and your family

  • Paid parental leave for all parents (12 weeks)

  • Fertility and family planning support

  • Early-detection cancer testing through Galleri

  • Flexible spending account and dependent FSA options

  • Health savings account for eligible plans with company contribution

  • Annual work-life stipends for:

    • Home office setup, cell phone, internet

    • Wellness stipend for gym, massage/chiropractor, personal training, etc.

    • Learning and development stipend

  • Company-wide off-sites and team off-sites

  • Competitive compensation, company stock options, and 401(k)

WRITER is an equal-opportunity employer and is committed to diversity. We don't make hiring or employment decisions based on race, color, religion, creed, gender, national origin, age, disability, veteran status, marital status, pregnancy, sex, gender expression or identity, sexual orientation, citizenship, or any other basis protected by applicable local, state or federal law. Under the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records.

By submitting your application on the application page, you acknowledge and agree to WRITER's Global Candidate Privacy Notice.

Compensation Range: $195.9K - $292.4K


HQ

Writer San Francisco, California, USA Office

140 Geary St, San Francisco, CA, United States, 94110
