MLOps Engineer

  • Tokyo
  • Partial Remote
  • Full-time
  • April 10, 2026
Conditions

  • Salary: ¥8M ~ ¥20M /yr
  • Apply from Anywhere 👍
  • Relocation to Japan 👍 (overseas visa sponsorship supported)

Requirements

  • Language Requirements: Japanese (Business Level), English (Fluent)
  • Minimum Experience: Senior or above

MISSION

As the founding MLOps engineer, design and build Shizuku’s ML infrastructure from the ground up. Establish the complete pipeline — from data ingestion through training environments to model serving — creating an internal platform that empowers ML engineers to iterate on models at maximum velocity.

Replace individual, siloed development environments with a unified team-scale ML development platform, maximizing the speed of Shizuku’s evolution.


ABOUT SHIZUKU

Shizuku is a Japan-born AI companion actively engaging audiences on YouTube and X (formerly Twitter). Already running live streams and cultivating a growing community, Shizuku is now entering its next phase of rapid scale.

As the first Japanese startup to receive investment from a16z, we closed our seed round and are on a mission to bring Japanese entertainment × AI to the global stage.


TEAM STRUCTURE

You will work closely with founder Aki (ML engineer and researcher, ex-Meta, ex-Luma AI) and Engineering Director Ohno to drive the design and construction of our ML infrastructure. As the first MLOps engineer, you’ll have significant autonomy — from technology selection to operational design.

Once the platform foundation is in place, your career path can follow either a management track, leading a growing team, or an individual-contributor (IC) track, deepening technical expertise, according to your aspirations.


CURRENT STATE & WHAT YOU’LL BUILD

  • Infrastructure Status: Modern application infrastructure is in place, but ML training and MLOps tooling are not yet established. AWS adoption is planned
  • What You’ll Build: An internal platform for ML engineers developing Shizuku’s AI models. The goal: eliminate siloed, ad-hoc local workflows and code ownership by individuals, replacing them with a team-oriented ML development foundation


KEY RESPONSIBILITIES

  • Design, build, and operate the end-to-end ML training pipeline: data collection/preprocessing → training → evaluation → deployment
  • Design and build GPU training infrastructure on AWS (A100, L4, etc.) with cost optimization
  • Build an internal ML platform for engineers: experiment tracking, model versioning, and reproducibility guarantees
  • Design and build model serving infrastructure: inference APIs, auto-scaling, and latency management
  • Establish training data management and quality assurance pipelines
  • Design and implement CI/CD for ML: automated training, model testing/evaluation, and staged rollouts
  • Drive production integration of models in collaboration with the ML engineering and SWE teams
  • Build monitoring and visibility infrastructure for long-term compute cost and GPU utilization tracking


REQUIREMENTS

  • 3+ years of experience designing, building, and operating cloud infrastructure on AWS, GCP, or equivalent platforms
  • Experience building ML/DL pipelines and infrastructure
  • Hands-on experience designing and operating production environments using container technologies (Docker/Kubernetes)
  • Experience managing infrastructure as code (Terraform, Pulumi, etc.)
  • Strong Python skills for building tools and pipelines
  • Ability to work on-site at our Tokyo office (primarily in-office with flexible remote arrangements)


NICE TO HAVE

  • Experience building, operating, and cost-optimizing GPU clusters (A100, H100, L4, etc.)
  • Experience with ML platforms: SageMaker, Vertex AI, Ray, Kubeflow, etc.
  • Experience deploying and operating experiment tracking infrastructure: MLflow, Weights & Biases, DVC, etc.
  • Experience building model serving infrastructure: Triton Inference Server, TorchServe, vLLM, SGLang, etc.
  • Experience designing and building internal ML development platforms
  • Domain-specific knowledge of ML workloads in speech, NLP, or vision
  • Experience as a founding infrastructure/MLOps engineer at a startup
  • Technical communication skills in English (currently Japanese-first internally; transitioning to a global environment in the mid-term)


WHO YOU ARE

  • Founding Engineer Mentality — You don’t wait for established systems to improve — you define the design philosophy and build the foundation from zero. You’re energized by creating the system itself, not just refining one
  • ML-Literate Infrastructure Engineer — You understand the unique characteristics of ML training and inference workloads, and you translate that understanding into optimally designed infrastructure
  • Purpose-Driven Ownership — You reverse-engineer from “maximizing ML team velocity,” set your own priorities, and drive execution autonomously
  • Comfort with Ambiguity — You design for a world where model count, training frequency, and data volume are still being defined — starting small and scaling architecturally as the picture clarifies
  • Resilience & Respect — You engage as an equal partner with ML Engineers and SWEs, elevating the entire team’s productivity through collaboration

Shizuku AI is an AI-native entertainment company reimagining how people interact with technology through its AI companion, Shizuku.

The company builds real-time, interactive AI characters inspired by Japanese IP culture, blending generative AI, live streaming, and storytelling to create experiences where users can form meaningful relationships with AI.

As an early-stage startup, Shizuku AI is backed by top-tier investors including Andreessen Horowitz (a16z) and DeNA. The team is looking for individuals who are excited about building impactful new forms of entertainment powered by AI to join them.

The company’s mission is to create the world’s most lovable AI companion, supported by a global team based in San Francisco and Tokyo.
