Saturn (YC S24) Is Hiring Senior AI Engineer

Original link: https://www.ycombinator.com/companies/saturn/jobs/R9s9o5f-senior-ai-engineer

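Saturn is building an AI-powered operating system that aims to democratize financial advice for one billion people. They are looking for a **Senior AI Engineer** to own critical, customer-facing AI features in this highly regulated environment. This is a highly autonomous role that requires strong software engineering skills, large language model (LLM) expertise, and a focus on product quality. The engineer will own the full feature lifecycle, from architecture and development to deployment and monitoring, using "explicit orchestration" to build auditable AI agents. Key responsibilities include designing a robust evaluation framework (the "Evals Flywheel") together with financial experts, ensuring system reliability through defensive design, and raising engineering standards with clean, well-tested Python code. The ideal candidate has 5+ years of experience, a proven track record of scaling AI products (particularly with LLMs and agentic systems), and a strong bias toward action, ownership, and data-driven decisions. Saturn emphasizes building trustworthy, explainable AI, centered on quality and customer success.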

Original text

Why Saturn?

Saturn is revolutionizing financial services with AI, building the operating system for financial advisors. Our mission is to democratize financial advice for one billion people by providing the world's most trusted, intelligent platform for financial planning and compliance.

This is a rare chance to build a category-defining company in a high-stakes, regulated environment. We operate with a Dual Mandate: relentless Speed of Execution to deliver reliable, robust products today, and dedicated Speed of Learning to explore the frontier of AI and unlock the next generation of features.

If you are driven by the pursuit of greatness, thrive on end-to-end ownership, and want to build the gold standard for AI trust and reliability, we invite you to build with us.

Role Overview

As a Senior AI Engineer at Saturn, you are the single-threaded owner of critical, customer-facing AI features that form the backbone of the advisory operating system. This is a highly autonomous role requiring robust software engineering fundamentals, deep LLM intuition, and an obsessive focus on product quality in a regulated domain.

You will own the entire feature lifecycle: from defining the Gold Standard with our domain experts (Guardians), architecting the agentic workflow, designing and building the comprehensive evaluation suites, to deploying and operating the solution reliably in production. You are expected to move quickly, making pragmatic, data-backed decisions that drive measurable value.

What You'll Do

1. End-to-End Feature Ownership and Architecture:

Ownership: Take complete ownership of a product domain or complex feature, making architectural decisions independently and delivering high-quality results from concept through to long-term maintenance.
Defensive Design: Architect and implement fault-tolerant AI systems, incorporating robust fallbacks (via a model-agnostic gateway), retries, and comprehensive monitoring and tracing, driven by the Will to Care about system reliability.
Explicit Orchestration: Design and deploy complex, multi-step AI agents using explicit orchestration frameworks, ensuring state transitions are visible, testable, and auditable.
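To make the Defensive Design item above concrete, here is a minimal Python sketch of retries with fallback across model providers. It is an illustration only, not Saturn's gateway: the function `call_with_fallback`, the exception `AllProvidersFailed`, and the two stand-in providers are hypothetical names invented for this example, and the providers are plain functions rather than real model clients. Each attempt is recorded in a trace so the path taken stays visible and auditable.

```python
from __future__ import annotations

import time
from typing import Callable, Sequence


class AllProvidersFailed(Exception):
    """Raised when every provider has exhausted its retries (hypothetical name)."""


def call_with_fallback(
    prompt: str,
    providers: Sequence[tuple[str, Callable[[str], str]]],
    max_retries: int = 2,
    base_delay: float = 0.0,
) -> tuple[str, str, list[str]]:
    """Try each provider in order, retrying each one before falling back.

    Returns (provider_name, response, trace) so every attempt can be audited.
    """
    trace: list[str] = []
    for name, call in providers:
        for attempt in range(1, max_retries + 1):
            try:
                response = call(prompt)
                trace.append(f"{name} attempt {attempt}: ok")
                return name, response, trace
            except Exception as exc:  # sketch only: real code would catch narrower errors
                trace.append(f"{name} attempt {attempt}: {type(exc).__name__}")
                # Exponential backoff; base_delay defaults to 0 so the demo runs instantly.
                time.sleep(base_delay * 2 ** (attempt - 1))
    raise AllProvidersFailed("; ".join(trace))


# Stand-in providers (assumptions for the demo, not real model clients).
def flaky_primary(prompt: str) -> str:
    raise TimeoutError("primary model timed out")


def stable_backup(prompt: str) -> str:
    return f"echo: {prompt}"


name, response, trace = call_with_fallback(
    "Summarise the client's risk profile",
    [("primary", flaky_primary), ("backup", stable_backup)],
)
print(name, response)
print(trace)
```

In this sketch the primary provider fails twice, so the call falls back to the backup, and the trace lists all three attempts in order.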

2. Drive Evaluation and Quality Discipline:

Design Evaluation Strategy: Design, implement, and maintain the comprehensive, systematic evaluation framework (Evals Flywheel) specifically for your features to rigorously measure performance, manage regressions, and ensure quality compounds over time.
Domain Partnership: Work directly with our domain experts to translate complex financial and compliance requirements into executable evaluation rubrics and Gold Standard datasets.
Quality Feedback Loop: Instrument features end-to-end to rapidly diagnose probabilistic failures, converting production issues into high-priority regression tests.
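The evaluation items above can be sketched in a few lines of Python. This is a hedged illustration of the general idea, not the Evals Flywheel itself: `EvalCase`, `run_evals`, `pass_rate`, the phrase-matching rubric, the case IDs, and the toy model are all assumptions made up for this example. It shows a Gold Standard case alongside a case added after a hypothetical production failure, so the failure stays in the suite as a regression test.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class EvalCase:
    case_id: str
    prompt: str
    must_contain: tuple[str, ...]  # simplistic rubric: phrases the answer must include


def run_evals(model: Callable[[str], str], cases: list[EvalCase]) -> dict[str, bool]:
    """Return pass/fail per case; a case passes only if every rubric phrase appears."""
    results: dict[str, bool] = {}
    for case in cases:
        answer = model(case.prompt).lower()
        results[case.case_id] = all(p.lower() in answer for p in case.must_contain)
    return results


def pass_rate(results: dict[str, bool]) -> float:
    """Fraction of cases that passed; 0.0 for an empty suite."""
    return sum(results.values()) / len(results) if results else 0.0


# Toy model standing in for the real system (an assumption for the demo).
def toy_model(prompt: str) -> str:
    return "Past performance is not a guide to future returns. Source: fund factsheet."


cases = [
    # A Gold Standard case agreed with domain experts.
    EvalCase("gold-001", "Describe the fund's track record.", ("past performance",)),
    # A production failure converted into a regression test.
    EvalCase("prod-incident-17", "Cite the source for the fund's fees.", ("source:", "fees")),
]

results = run_evals(toy_model, cases)
print(results, pass_rate(results))
```

Here the toy model passes the Gold Standard case but fails the regression case because its answer never mentions fees, which is the kind of signal a suite like this is meant to surface before release.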

3. Elevate Engineering Standards:

Technical Excellence: Write clean, modular Python code that raises the bar for the team. Actively participate in code review, using the process to mentor peers and reinforce architectural standards.

What You Have

5+ years of professional experience in a highly demanding engineering environment.
Proven track record (3+ years) of building, shipping, and operating scaled, impactful products where Generative AI or LLMs are a core component.
Deep Experience with Agentic Systems: Expertise in RAG pipelines, systematic prompt engineering, agentic workflow orchestration, and defining reliability trade-offs for production systems.
Evaluation Focus: Direct, demonstrable experience designing, writing, and maintaining automated evaluation frameworks (evals) used to rigorously test and improve probabilistic systems.
End-to-End Ownership: A history of thriving in ambiguity, taking complete ownership of large features, and driving initiatives forward independently with a strong bias for action.
Engineering Excellence: Mastery of Python and modern backend development practices, including system design, testing, CI/CD, and robust production observability.
Product & User Focus: Strong product sense and the drive to quickly build domain expertise, translating user needs and compliance context into high-value technical solutions (the expression of Will to Care for the customer).

Saturn Values in Practice:

Earn Trust: Building verifiably correct, explainable systems (Citation-First, Adviser-in-the-Loop).
Pursue Greatness: Driving our Evaluation-Driven Development flywheel to compound quality daily.
Seek Truth: Relying on data, traces, and customer feedback (Guardians) to inform every decision.
Be Audacious: Taking decisive ownership and building intelligent agents that solve previously unsolvable problems in finance.
Will to Care: Obsessively anticipating customer needs and building systems with extreme attention to detail, ensuring long-term quality, reliability, and the success of our users and peers.