As a Dedicated AI Factory TAM at Together AI, you will serve as the named technical owner for one of our most strategic enterprise relationships. You will be the primary technical point of contact across all infrastructure domains — compute, networking, storage, and facilities — ensuring smooth delivery and operational health of large-scale GPU deployments. This role sits at the intersection of deep infrastructure expertise and high-stakes customer partnership, making you a critical driver of both customer success and company growth.
Serve as the named technical point of contact for a dedicated strategic customer, owning the end-to-end technical relationship across compute, networking, storage, and facilities
Lead issue lifecycle management, escalation, and RCA authorship across all infrastructure domains in partnership with Support, SRE, DC Ops, and Engineering teams
Own end-to-end RMA coordination and hardware lifecycle management, including acceptance testing, spare inventory management, and hardware health reporting for large-scale GPU deployments
Maintain deep technical expertise across the customer’s infrastructure stack — GPU compute, high-speed fabric, and large-scale storage systems — advising on configuration, operational best practices, and incident resolution
Own the observability strategy for the customer estate, including alert policy definition, dashboard development, and proactive health management across all infrastructure layers
Coordinate DC operations and facilities events in partnership with internal teams and hosting providers, ensuring SLA compliance and cluster availability
Act as project manager for all capacity expansions, owning the full node deployment lifecycle from freight receipt through production acceptance
Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers on our journey to build the next generation of AI infrastructure.
We offer competitive compensation, startup equity, health insurance, and other benefits, as well as flexibility in terms of remote work. The US base salary range for this full-time position is $260-290K OTE + equity + benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.
San Francisco, CA (Hybrid) or New York, NY (Hybrid)
Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.
Please see our Privacy Policy at https://www.together.ai/privacy