Manager (Site Reliability Operations)
- Tokyo
- Partial Remote
- Full-time
- January 6, 2026
Treasure Data:
At Treasure Data, we’re on a mission to radically simplify how companies use data and AI to create connected customer experiences. Our intelligent customer data platform (CDP) drives revenue growth and operational efficiency across the enterprise to deliver powerful business outcomes.
We are thrilled that Forrester has recognized Treasure Data as a Leader in The Forrester Wave™: Customer Data Platforms For B2C. It's an honor to be acknowledged for our efforts in advancing the CDP industry with cutting-edge AI and real-time capabilities.
Treasure Data employees are enthusiastic, data-driven, and customer-obsessed. We are a team of drivers: self-starters who take initiative, anticipate needs, and proactively jump in to solve problems. Our actions reflect our values of honesty, reliability, openness, and humility.
Your Role:
You will oversee our Japan-based Site Reliability Engineering team. Our SREs own our compute platform (AWS, Kubernetes, EC2, Lambda, ECS), our common tooling, and our overall site availability. They work directly with development teams to solve product challenges and provide education around best practices. As our SRE leader in Japan, you’ll work closely with your North-America-based counterparts to design and implement solutions to high-scale challenges.
Managers at Treasure Data prioritize solving people and communication challenges before technical problems, but remain active technical contributors. They are eager to build effective, dynamic teams that iteratively and rapidly deliver resilient systems. The role requires working across product and engineering teams on complex problems whose solutions demand in-depth analysis, evaluation of multiple competing factors, and identification of the best trade-offs for successful delivery.
This role requires leading by example and making regular individual contributions. You and the team will be directly responsible for platform solutions in these critical areas: availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning. Additionally, as a leader within the engineering organization, you’ll take part in broader planning and align your team with its outcomes.
Success in this role requires a passion for helping others and improving their lives. You do this by working with people to make team collaboration more effective and by helping them simplify complex systems so they are understandable and operable. You can communicate decisions, ideas, designs, and the operation of systems and services clearly and concisely and, more importantly, you derive a lot of satisfaction from teaching and enabling others to do the same.
Responsibilities & Duties:
- Managing a team of 5-8 Site Reliability Engineers by setting clear expectations and providing continuous feedback.
- Providing ongoing career coaching on both technical and non-technical areas of improvement.
- Working with Engineering and Product stakeholders to organize and execute on large projects.
- Planning and facilitating agile sprints and holding the team accountable to sprint deliverables.
- Improving processes by introducing metrics, experimenting with improvements, and implementing new ways of working.
- Assisting with incident coordination as part of our on-call rotation.
- Assisting with system design activities to make the right trade-offs between reliability and delivery speed, and communicating those decisions clearly.
Required Qualifications:
- Proven experience as a people manager for a technical team, including coaching, performance management, and delivering difficult feedback when necessary.
- Experience managing or supporting a distributed SRE or infrastructure team across multiple time zones.
- Hands-on experience with at least one major cloud provider such as AWS, Azure, or GCP.
- Working familiarity with infrastructure-as-code tools, including Terraform, CloudFormation, CDK, or Ansible.
- Working knowledge of at least one programming language, such as Python, Java, Ruby, or JavaScript.
- Experience leading or participating in production incident response, including incident command and post-incident review.
- Demonstrated ability to lead complex, cross-team software or platform initiatives from planning through delivery.
- Working knowledge of agile software development practices and backlog-driven delivery.
- Understanding of cloud governance fundamentals, including cost management, patching, and secure system design.
- Strong communication and leadership skills, with the ability to represent reliability concerns to engineering and senior leadership.
Language Requirements:
- The official language for written and verbal communication for this position is English, but Japanese fluency is strongly preferred.
Physical Requirements:
Hybrid: 3 days per week in the Tokyo office.
Travel Requirements:
At least once a year for the team onsite.
About Treasure Data:
Treasure Data is the Intelligent Customer Data Platform (CDP) built for enterprise scale and powered by AI. Recognized as a Leader by Forrester and IDC, Treasure Data empowers the world’s largest and most innovative companies to deliver hyper-personalized customer experiences at scale that increase revenue, reduce costs, and build trust.
Through unique capabilities such as the Diamond Record, AI Agent Foundry, and AI Decisioning with Real-Time Personalization, Treasure Data enables marketing and CX teams to personalize cross-channel engagement in real-time, optimize marketing spend while increasing ROI, and drive customer lifetime value through more intelligent retention and loyalty.
Our Dedication to You:
We value and promote diversity, equity, inclusion, and belonging in all aspects of our business and at all levels. Success comes from acknowledging, welcoming, and incorporating diverse perspectives.
Diverse representation alone is not the desired outcome. We also strive to create an inclusive culture that encourages growth, ownership of your role, and achieving innovation in new and unique ways. Your voice will be heard, and we will help amplify it.
About Treasure Data
Treasure Data is a best-of-breed enterprise customer data platform (CDP) that powers the entire business to shape customer-centricity in the age of the digital customer. We do this by connecting all data into one smart customer data platform, uniting teams and systems to power purposeful engagements that drive value and protect privacy for every customer, every time. Trusted by leading companies around the world, Treasure Data customers span the Fortune 500 and Global 2000 enterprises.