Lead Infra/DevOps Engineer

San Francisco | Remote
Sorry, this job was removed at 12:27 p.m. (PST) on Thursday, July 28, 2022

Who We Are
Cardless is a fintech startup. Our mission is to make consumer credit more accessible, useful, and engaging, and we do that by partnering with brands to help them launch financial products.
We've begun with a focus on co-branded credit cards. We've launched cards with Manchester United, the Boston Celtics, Liverpool F.C., and LATAM Airlines, and we have more exciting brands on the way! Our cards are digital-first, consumer-friendly, and tailored to both brands & fans with delightful discounts & rewards.
Read more about our business, our product, and what it's like to work at Cardless in these blog posts from our team and industry experts:
• How to make a credit card by Michael Spelfogel (Founder & President)
• Designing for dignity & joy by Scott Kazmierowicz (Founder & CEO)
• What it's like to celebrate a new card launch with Cardless by Zach Honig (Content & Loyalty)
• Here's how fintech could disrupt credit card rewards by View from the Wing
• Forget free flights, this card could make you an NBA benchwarmer by Bloomberg
We're headquartered in San Francisco, but we're building a remote-friendly team across the continental US. We were founded in 2019, and we last raised a $40M Series B in 2021. Our investors include top VCs like Activant, Accomplice, and Greycroft; founders from successful startups like Plaid, SeatGeek, and FiveStars; and executives from major sports teams like the Boston Celtics and Phoenix Suns.
The Role
Powering digital-first and consumer-friendly credit cards is no easy task! Behind the scenes, we need to integrate with a number of third-party services (like card networks, payment processors, and credit bureaus), solve tricky distributed systems problems (like eventual consistency and idempotent workflows), and balance moving fast with moving reliably, securely, and safely.
We've achieved much of that to date with a small but talented team of generalist engineers. We're now seeking a dedicated infra/devops engineer to own both our production infrastructure and our developer experience. In many ways, we'd love for this engineer to own the tech, tooling, and roadmap for our entire development+deployment lifecycle. And if it interests you, this role can transition into people management as we grow, though that'll never be a requirement.
Our Picture Today
Our backend stack today is a microservices architecture built primarily with Java & gRPC on AWS. We use several AWS tools & services, like EKS (hosted Kubernetes), API Gateway, Lambda, Step Functions, DynamoDB, Kinesis, Redshift, S3, and managed Airflow. We also use other tools & services like Very Good Security for tokenization, LaunchDarkly for feature flagging & experimentation, and Retool for internal admin UI.
We write some infra-as-code with Terraform, though we want to do more here. We run CI/CD to staging using CircleCI & Harness, and manually deploy to prod once a week. We've invested in unit tests, integration tests, and end-to-end tests, and we've also made progress towards a single monorepo. We monitor & observe our systems through Datadog & Sentry, and we work hard to keep our on-call rotation page-free and stress-free. We also maintain a data pipeline through tools & services like Fivetran & dbt Cloud.
Our Aspirations for Tomorrow - What You Could Work On
We'd love for you to bring your experience and help us get to the next level across these fronts:
• More infra-as-code ("cattle, not pets"). Is our current approach with Terraform still right, or should we revisit anything, e.g. reconsider AWS CloudFormation? Importantly, we'd love to broaden this to include things like database schema changes too. E.g. how can we achieve safe, code-reviewed schema migrations common in the SQL world with our NoSQL DynamoDB?
• Continuous deployment to production too, not just staging. Many of the best companies continuously deploy many times a day, both to shorten developers' feedback loop and (counterintuitively) to reduce risk (since each change is smaller and more isolated). We'd love to get here, but we know that doing so responsibly means we should have great testing, monitoring/alerting, and rollback stories. How close are we, and what should our roadmap be to getting there?
• Move to a true monorepo, using e.g. Bazel or Buck. We experienced the pain of having a separate repo per service, and we successfully consolidated most of our services to a single Java monorepo. But we still have a few other repos, like a Python one which houses Lambda functions, Airflow jobs, and integration tests. We'd love to finish the last mile here, but we know that'll require a bit more sophistication than we have today.
• Better observability & debuggability. We've invested in logging, tracing, and metrics; we've refined our on-call dashboards; and we've tuned our monitors & alerts to reduce false positives. But bugs still feel hard to figure out sometimes, and we still need to minimize false negatives too. Is there low-hanging fruit here, or should we revisit our approaches?
• Explicitly define & measure our reliability. By typical measurements of reliability (e.g. uptime/availability and latency), our systems are pretty reliable today. E.g. we almost never get paged for our own system issues today. But we also depend on third-party services, and our customers are also affected by bugs, not just downtime or latency. How should we define our reliability then, and how should we measure it to know if we're healthy or not?
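To make the last question concrete, here is a minimal Python sketch of one possible framing, not a description of Cardless's actual systems. It treats a request as "good" only if the customer got a correct result within a latency budget, regardless of whether a failure came from an internal outage, a third-party dependency, or a bug. The names `RequestOutcome`, `customer_sli`, and `error_budget_remaining`, the failure labels, the 500 ms latency budget, and the 99.9% target are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class RequestOutcome:
    """One customer-facing request (hypothetical record shape)."""
    ok: bool                  # did the customer get a correct result?
    latency_ms: float         # end-to-end latency seen by the customer
    failure_source: str = ""  # "", "internal", "third_party", or "bug" (illustrative labels)


def customer_sli(outcomes: Sequence[RequestOutcome],
                 latency_budget_ms: float = 500.0) -> float:
    """Fraction of requests that were both correct and within the latency budget."""
    if not outcomes:
        return 1.0  # no traffic means nothing was violated
    good = sum(1 for o in outcomes if o.ok and o.latency_ms <= latency_budget_ms)
    return good / len(outcomes)


def error_budget_remaining(sli: float, slo_target: float = 0.999) -> float:
    """Share of the error budget still unspent; a negative value means it is overspent."""
    allowed = 1.0 - slo_target
    spent = 1.0 - sli
    return (allowed - spent) / allowed
```

Under this framing, a failure caused by a third-party service or a bug counts against the same budget as downtime, because the customer experiences all three the same way. Breaking the failures down by `failure_source` afterwards would show where the budget is actually being spent.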
You may see a theme here: we hold ourselves to a high bar and value excellence, but we also value pragmatism and customer impact. If this resonates with you and our aspirations excite you, reach out!
The Requirements
Research has shown that women & underrepresented minorities read lists of requirements and consider themselves unqualified if they don't meet every single one. This list represents what we're ideally looking for, but we encourage you to apply even if you don't meet everything 100%. Everyone has unique strengths & weaknesses, and we hire for strength & potential, not lack of weakness.
• Experience shipping, extending, and maintaining production infra in AWS. You should be proficient in key AWS technologies like EC2, VPC, IAM, and security groups, and know best practices and tradeoffs. This is more important to us than a specific # of years of experience, but we expect most candidates to need at least 3-5 years in the industry to gain this expertise.
• At least some experience with modern infra abstractions & tools like Docker, Kubernetes, Terraform or similar, and Lambda. These things are learnable, but some existing exposure to concepts, details, and tradeoffs here will be valuable.
• Experience managing your own work and supporting others. We want this lead-level engineer to own and manage our infra & developer experience roadmap. Doing that effectively will require proactively communicating proposals, reasoning, and status. You'll also play a key role in teaching, mentoring, and leveling up other engineers.
• An ownership mindset, and a passion for both your craft (building things right) and achieving real-world impact (building the right things for our business & team).
Location
We're headquartered in San Francisco, CA, with a beautiful office in the Mission District (near Dolores Park). We welcome employees who want to work from this office; we offer additional benefits to those who do (see below), and relocation assistance to those who'd like to.
We are hiring remotely across the continental US for most roles. We work hard to create a first-class, remote-friendly environment, with an additional benefit for remote employees, too (see below).
We regularly bring our team together for offsites & trips, about every 2 months, both for fun and for work. We cover all travel & lodging in these cases.
A condition of employment at Cardless is being fully vaccinated and boosted against COVID prior to beginning your employment. We'll ask you to send us a copy of your vaccination card in advance of your first day.
Benefits & perks
For all employees, we offer:
Highly competitive salaries, significant equity, and a 401(k) plan
Top-of-the-line healthcare (medical, vision, and dental), with 100% primary coverage and 75% dependent coverage for medical
Unlimited paid time off, with a minimum of 15 days off per year
👶 Parental leave for birthing and non-birthing parents
Amazing team trips & offsites (e.g. rock climbing in Utah, watching a Cavs game in a suite, and staying in an Italian villa in Carmel Valley)
Access to amazing advisors & investors, like Gil Shklarski (Flatiron Health CTO), Dorothy Kilroy (Airbnb & Marqeta exec), and Frank Mastrangelo (Bancorp founder)
For in-office employees:
Catered lunch at the office every day, plus plenty of snacks & drinks
$250/month commuter assistance for public transit or our parking garage
For remote employees:
$500 remote setup stipend to help you do your best work
Our Values
We believe that the best companies are the ones that successfully align their vision and values, that match their decision making processes, hiring criteria, and overall operating principles to the change they're striving to make. We're willing to spend significant time and resources to be above average at the following things, because we believe that will lead to long term success.
- Be curious and be the solution. We work in a complex space. In order to be successful, we must think globally and collaborate cross-functionally. We ask thoughtful questions without judgment and propose comprehensive solutions that address the full picture. We are all owners at Cardless - mindful and empowered.
- Start with the customer. Our customers are our priority. Our care for them drives our actions. The first question we always ask is: how will this impact the customer? We make every decision, no matter how big or small, with the customer top of mind.
- Move fast and build things. We are a startup in a highly regulated space, so we must execute with both speed and precision. We achieve this through deliberate focus, methodical preparation, and disciplined execution. We build, learn, improve, and repeat.
- A place to do your best work. We are a team, and we achieve more together than apart. At Cardless, we want to see each other succeed: we take as much pride in each other's work as our own. We celebrate and seek out diversity of experience, thought, and background in order to accelerate both individual and company growth. We're hungry, we're passionate, and we're inspired to achieve great things together.
Compensation
This role has an annual starting salary range of $160,000 - $200,000 + equity + benefits (see above). Actual compensation is influenced by a wide array of factors including but not limited to skills, experience, and specific work location.


Location

Our office is located in the Mission District of San Francisco, within walking distance of multiple BART and Muni stations. Take a stroll through Mission Dolores Park to see the iconic views of downtown SF, or walk down the street to the trendy bars and restaurants of Valencia Street.
