Fine-tune, deploy, and maintain multi-modal vision-language and transformer models for production. Build end-to-end training and inference pipelines in Python, implement prompt engineering and RAG, evaluate and iterate on model performance, and optimize models through quantization for efficient, scalable deployment across real-world environments.
We're seeking a Computer Vision AI Engineer with deep experience in transformers, generative models, and vision-language models (VLMs) to push City Detect's products beyond traditional object detection. You'll fine-tune, deploy, and maintain multi-modal models that combine visual and language understanding to deliver intelligent, scalable solutions across heterogeneous real-world environments.
Responsibilities
- Fine-tune and deploy vision-language models (VLMs) and large language models for production use cases
- Design and maintain end-to-end pipelines for multi-modal model training, evaluation, and inference in Python
- Develop prompt engineering strategies, RAG architectures, and other techniques to maximize model performance
- Evaluate model outputs systematically and build feedback loops for continuous improvement
- Quantize large transformer models to improve inference efficiency
- Stay current with rapid advances in transformer architectures, fine-tuning methods, and multi-modal research
Requirements
- 3+ years of professional experience working with transformer-based architectures
- 2+ years of hands-on experience fine-tuning and deploying multi-modal models (VLMs)
- 2+ years of proven computer vision experience, with a strong preference for object detection
- Strong experience with LLMs, including fine-tuning, inference optimization, and production deployment
- 2+ years of Python proficiency for model development, training, and deployment
- Experience with deep learning frameworks such as PyTorch or TensorFlow
- Solid understanding of attention mechanisms, tokenization, transfer learning, and generative model fundamentals
- Proven experience taking models from experimentation through production-ready deployment
- SQL proficiency for querying detection results, labeling metrics, or model performance data
Preferred Qualifications
- Strong preference: experience with roadside or infrastructure object detection (signs, signals, debris, pavement markings)
- Background in GovTech, public sector, or smart city projects
- Experience in automated driving, ADAS, or autonomous vehicle perception systems
- Familiarity with model-assisted labeling, active learning, or human-in-the-loop workflows
- Experience with edge deployment or model optimization (TensorRT, ONNX, quantization)
The base pay range for this role is $100,000 – $120,000 per year.

