Senior Software Engineer - Query Engines & Storage
- Tokyo
- Partial Remote
- Full-time
- March 26, 2026
Treasure Data:
At Treasure Data, we’re on a mission to radically simplify how companies use data and AI to create connected customer experiences. Our intelligent customer data platform (CDP) drives revenue growth and operational efficiency across the enterprise to deliver powerful business outcomes.
We are thrilled that Forrester has recognized Treasure Data as a Leader in The Forrester Wave™: Customer Data Platforms For B2C. It's an honor to be acknowledged for our efforts in advancing the CDP industry with cutting-edge AI and real-time capabilities.
Treasure Data employees are enthusiastic, data-driven, and customer-obsessed. We are a team of drivers—self-starters who take initiative, anticipate needs, and proactively jump in to solve problems. Our actions reflect our values of honesty, reliability, openness, and humility.
Your Role:
The Plazma team at Treasure Data builds one of the essential components of our CDP solution. It is part of the Core Services group, which supports customer data ingestion and availability at a rate of 70B records per day. You will help the team develop the future of our Hadoop/Hive & Trino query engines and expand from there into our in-house storage solution. This includes maintaining technical excellence to address challenges that currently lack industry-wide solutions, and delivering the roadmap together with your team. Our team consists of Big Data experts across Japan, Korea, and Canada who are passionate about OSS contribution, and we take pride in the quality of service we offer.
Responsibilities & Duties:
- Design and develop Hadoop/Hive & Trino solutions, providing technical expertise for modern data architecture assessment and use case development
- Establish engineering standards for design, development, tuning, deployment, and maintenance of advanced data access frameworks and distributed systems
- Collaborate with your team to define product roadmaps based on operational needs and customer-requested features while mentoring and training new team members
- Own version and release management, including baseline evaluation, patch backporting, and deployment of customer-facing features
- Coordinate with Support and Product teams on release cycles and feature delivery
- Contribute to Hadoop/Hive & Trino OSS through bug fixes, new features, and technical documentation
- Partner with SRE to automate cluster operations, reducing operational overhead through automated lifecycle management and load balancing workflows
- Design and implement observability solutions, including health metrics, capacity planning tools, and automated failure detection and recovery systems
- Provide expert customer support, including on-call responsibilities, escalation handling, and in-depth troubleshooting of performance and defect issues
- Develop custom technical solutions, including user-defined functions (UDFs) and specialized tooling for Hadoop/Hive & Trino
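To give a flavor of the UDF work described above, here is a minimal sketch of the core logic of a scalar function that masks email addresses. This is an illustrative example, not Treasure Data code: the class name, method, and masking behavior are hypothetical. In a real deployment the `evaluate` method would live in a class extending Hive's `org.apache.hadoop.hive.ql.exec.UDF` or be registered via Trino's `@ScalarFunction` SPI annotation.

```java
// Hypothetical core logic for an email-masking UDF.
// In Hive, a class with this evaluate() signature (extending UDF)
// could be registered with: CREATE FUNCTION mask_email AS '...';
public class MaskEmail {

    // Masks the local part of an email, keeping the first character
    // and the domain, e.g. "alice@example.com" -> "a***@example.com".
    public static String evaluate(String email) {
        if (email == null) {
            return null; // UDFs must tolerate SQL NULL inputs
        }
        int at = email.indexOf('@');
        if (at <= 1) {
            return email; // nothing maskable; return input unchanged
        }
        return email.charAt(0) + "***" + email.substring(at);
    }

    public static void main(String[] args) {
        System.out.println(evaluate("alice@example.com"));
    }
}
```

A query engine UDF like this must be deterministic and null-safe so the planner can evaluate it safely per row across distributed workers.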
Required Qualifications:
- 5+ years building and operating distributed systems
- Strong Java and deep understanding of algorithms, data structures, and distributed systems fundamentals
- Solid understanding of cloud architecture and services in public clouds like AWS, GCP, or Microsoft Azure
- Strong capability in implementing new and improved data solutions for multi-tenant environments
- Experience developing use cases, functional specifications, and design specifications
- Experience working with distributed, scalable big data stores or NoSQL databases, such as HDFS, S3, Cassandra, or Bigtable
- Strong analytical and communication skills; able to influence across Product, SRE, and Support
It would be nice if you had:
- Understanding of the capabilities of Hadoop/Hive or Trino
- Proven experience operating production query engines at petabyte scale
- Experience with microservices architecture, data integration patterns, and extending OSS
- Experience with Infrastructure-as-Code, SRE practices, and advanced observability
- UDF development and familiarity with data visualization ecosystems
- Security and privacy-by-design expertise
- Experience with storage patterns and optimizations for massively parallel processing
Physical Requirements:
- 3 days per week at the Treasure Data office
About Treasure Data
Treasure Data is a best-of-breed enterprise customer data platform (CDP) that powers the entire business to shape customer-centricity in the age of the digital customer. We do this by connecting all data into one smart customer data platform, uniting teams and systems to power purposeful engagements that drive value and protect privacy for every customer, every time. Trusted by leading companies around the world, Treasure Data customers span the Fortune 500 and Global 2000 enterprises.