Applied AI Engineer

Arcade.dev

Reposted 6 Days Ago

In-Office

San Francisco, CA, USA

179K-240K Annually

Senior level

As an Applied AI Engineer, you will design and ship agentic tools, automate tool creation, and compose workflows. Responsibilities include running model-aware experiments and bringing rigorous applied ML practices to tool design.

Applied AI Engineer

Everyone's talking about AI. But here's the truth: ChatGPT can't send your emails. It can't book your flights. It can't even order you lunch.

Why? Because AI is trapped in a chat box. It can't take real actions in the real world.

We are changing that forever. We're not just building another AI company - we're creating the infrastructure that will power every AI application you'll use in the future.

The Revolution Needs You

Every AI app needs agentic "tools" - special functions that let AI models take real actions. Without tools, AI can only chat. With tools, AI can actually do things. We're building the actions runtime that allows AI agents to safely take real-world actions at enterprise scale. As an Applied AI Engineer on the Tools team, you'll push the boundary of what "a tool" even means at Arcade — designing agentic tools that go beyond deterministic API wrappers, building agents that build new tools, and composing tools into workflows that solve higher-level problems.

Why This Is The Opportunity of a Lifetime

Founder-Market Fit: Our CEO previously founded Stormpath (acquired by Okta), where he created the first Authentication API for developers. He's done this before - and this time the market is 10x bigger. Our CTO led the vector database team at Redis, shipped 100+ LLM applications, and is a contributor to LangChain and LlamaIndex. He knows this space better than anyone.
Dream Team: We've assembled authentication, integrations, distributed systems, and AI experts from Okta, Redis, Microsoft, Splunk, Ngrok, Google, Airbyte, Disney, and HPE who've built and founded multiple successful developer platforms.
Perfect Timing: We're at the inflection point of AI adoption. The biggest problem isn't better models - it's connecting AI to real-world actions. That's us.
Massive Market: We're building critical infrastructure for the biggest technological shift of our generation. Every AI app will need what we're building.
Backed By The Best: Our investors have backed Databricks, Clickhouse, MongoDB, Perplexity, Cohere, ScaleAI, Confluent, Elastic, and Firebase. They see what we see - this is going to be huge.

The Challenge

You'll report to the Engineering Manager for Tools and Growth. The Tools team owns Arcade's tool catalog — thousands of tools across many services, growing faster than any human can review by hand. The next leap in agent quality lives inside this team's work, and you'll be the applied-AI seat that pushes it forward.

Three real problems define the role.

Agentic tools vs. deterministic tools. Most tools today are deterministic: call X API with Y arguments, get Z result. That model breaks down for entire classes of agent work: research a topic, summarize a thread, decide which of three accounts to act on. Agentic tools, the ones that internally reason, plan, or call models, are the answer, but the design space is wide open. When is agentic better than deterministic? How do you make an agentic tool fast, reliable, and debuggable? You'll set the bar for what these look like at Arcade.

Agents that build tools. The toolkit catalog is too big for hand-crafting to scale. We need agent harnesses that can take a vendor's API and produce a high-quality toolkit (design, code, eval, docs), with a human in the loop only where the human is actually needed. There's early work on this already. You'll take it from a prototype into the production pipeline that produces the next thousand tools.

Workflows that compose tools. Individual tools solve narrow problems. Real customer outcomes such as "close the quarter," "triage the inbox," or "stand up the integration" need many tools, chained, with the right control flow. We need to figure out what the right primitive looks like above the tool layer, and you'll lead that design.

The most honest thing we can say about this work: most of the problems you'll be solving didn't exist three months ago. There's no prior art. There's no known solution. If that's the part of the job that makes you nervous, this isn't the right role. If that's the part that makes you lean in, it is.

We do real experiments. We form hypotheses. We publish learnings. Research is part of the job. But the role is built around shipping. If you want to spend six months proving an idea in a notebook before anything reaches a customer, this isn't the right role. If you want to ship the experiment and the writeup in the same quarter, it is.

What You'll Do

Design and ship agentic tools that go beyond deterministic API wrappers — and define the patterns the rest of the Tools team will use to build more.
Build the agent harness that automates tool creation — take a vendor's API, produce a high-quality toolkit end-to-end, keep humans in the loop only where humans add real value.
Design workflows that compose tools into higher-level abstractions customers can actually point at outcomes ("triage this inbox," "close out this account") rather than individual API calls.
Bring applied-ML rigor to tool design — evals, model-aware iteration, retrieval, tool description tuning, response shaping. Make decisions defensible with data.
Run model-aware experiments across Claude, GPT, Gemini — agentic tool behavior diverges across models in ways nobody else is studying, and we should.
Set the technical bar for what "good tool-building" looks like as the team scales — your patterns get inherited by every toolkit author after you.
Contribute back to the MCP and agent ecosystem where the conversation about agentic tools is forming.

Required Skills

5+ years of software engineering experience, with at least 2 years shipping production ML or applied-AI systems. Formal title matters less than the work.
Strong Python.
LLM application depth — prompting, retrieval, tool use, agent design. You've built non-trivial agent systems and know where the rough edges are.
Experience designing or composing multi-tool / multi-agent workflows that produced real outcomes.
You've built evals at scale — not "I ran a benchmark once," but a measurement system real engineering decisions were made against.
Statistics fluency — significance, confidence intervals, A/B test design. You can defend whether a small delta is real or noise.
Comfort across multiple frontier models and reasoning about their behavioral differences.
A doer, not a researcher-in-residence. You'd rather ship a working v0.5 next week than a polished v2.0 next quarter.
Comfort with ambiguity — early team, narrow charter that will expand. You make good decisions with incomplete data.
An insatiable desire to ship.

Bonus Points

You've built agents that build software (codegen agents, harness-style systems, meta-agents).
Prior work on tool-use specifically — BFCL, τ-bench, ToolBench, MCP eval work, or equivalent.
MCP ecosystem familiarity — extra bonus if you've filed an issue against the spec.
You've worked on agent frameworks (LangChain, CrewAI, AutoGen, Mastra) and have opinions about where they get tool use and workflow composition wrong.
Prior experience at an API platform, integrations-heavy product, or developer tools company.

Join The Movement

We're not just building a product - we're leading a movement to transform AI from just chatbots to agents that can take actions against real systems. This is your chance to be at the forefront of that revolution.

If you want to look back in 5 years and say, "I helped build that," then we want to talk to you. Ready to make AI actually useful? Apply now.

Compensation and Benefits

This role offers a competitive salary, equity, and benefits. Compensation is aligned with the range below and determined based on a candidate's background, experience, and performance.

Salary Range

$179,000-$240,000 USD
