The Platform Group is responsible for helping engineers at Mercari and its subsidiaries to build and deliver better products to our customers. We provide infrastructure and DevOps toolchains to increase the reliability of the service and make the work of engineers easier.
Currently, the Platform Group consists of four teams:
- Platform DX: Improving the developer experience by providing better abstractions and automated workflows
- CI/CD: Providing testing infrastructure, tooling, and the delivery system to make service delivery faster and more reliable
- Network: Responsible for end-to-end network infrastructure from the edge (CDN) to the cloud & service mesh (Istio) and physical data centers
- Platform Infra: Operating the base infrastructure as administrators of the cloud (GCP & AWS) and Kubernetes, as well as building the observability platform

You can see more details about the team structure in our tech blog post "How we reorganize the platform team".
Recent or in-progress projects
The following are some of the recent or in-progress projects which the Platform Group has been working on:
- Provide a temporary role-granting system for Kubernetes RBAC and Google Cloud Platform IAM (Platform DX)
  - With this system, developers and SREs do not hold modification roles by default and obtain them only when required. It moves us toward zero-touch production and makes the platform more secure.
- Provide a Kubernetes abstraction framework (Platform DX)
  - Currently, developers need to handle a wall of YAML to deploy services. With this framework, we abstract many fundamental configurations away from developers and reduce the cost of deploying new services.
- Introduce the Istio service mesh and stabilize it to support gradual adoption across 100+ microservices (Network)
- Build a tool to spin up pull request-based testing environments and flexible routing systems with Istio (CI/CD)
  - With this, developers can easily set up a QA environment with multiple nested microservices.
- Provide secure CI systems for automated infrastructure continuous delivery (Platform Infra)
  - We have a Terraform monorepo that manages different teams’ service infrastructure. Previously, CI ran under a single highly privileged account, which posed a huge risk if leaked. Now it runs under a delegated least-privileged account for each team.
- Migrate from a zonal, routes-based Kubernetes cluster to a regional, VPC-native Kubernetes cluster without any downtime and with minimal impact on product development (all teams)
  - This cluster is our main multi-tenant cluster, which runs 100+ microservices and receives 80,000+ req/sec. (See more details on our engineering blog.)
We are looking for a software engineer with a strong background (or interest) in platform or infrastructure system development to join one of the teams in the Platform Group. The ideal candidate is passionate about increasing developer productivity and has the pragmatic ability to release and migrate features on large-scale systems gradually.
What you will do
- Design, develop, and maintain platform features and tooling that support the entire software development cycle, from build and test through to deployment and operation
- Support migration to and adoption of new platform features and tooling
- Improve platform security and reliability with the SRE and Security teams
- Improve and automate daily platform operations and reduce toil
- Communicate with internal developers to understand their needs
You may be a fit if you
- Are passionate about improving developer productivity and experience
- Are passionate about infrastructure automation and building tooling
- Are neutral about specific technologies and take pragmatic approaches to problems
- Enjoy advocating for new tooling and systems and supporting others in using them
- Avoid reinventing the wheel and make use of existing tooling and ecosystems as much as possible
Since the platform and its tooling are used not only by Mercari JP but also by Merpay and Mercari US, changes and improvements can affect the whole organization’s performance, from development agility to system reliability. As a software engineer on the team, your work can have a truly significant impact.
The team needs to understand what developers are struggling with and what the Mercari Group’s product development requires. With this empathy and these requirements in hand, the team prioritizes the problems to solve and decides how to solve them. You can join this decision-making process and propose pragmatic solutions that draw on your knowledge and experience.
Requirements
- Shared understanding and belief in our company’s mission and values
- Experience in infrastructure management and automation
- Experience in infrastructure and system architecture design
- Experience in writing design docs or proposals and reaching agreements with stakeholders
- Experience using container management platforms (e.g., Kubernetes) in production
- Experience operating and administering a cloud platform (GCP or AWS) in production
- Good understanding of the common software development lifecycle (SDLC)
Platform DX team-specific requirements
- 2 years of experience using Go
- Experience in writing CLI tools and packages in Go
Network team-specific requirements
- Experience working with network proxies such as Envoy/HAProxy/Nginx
- Strong understanding of networking, especially OSI Layer 4 to 7: load balancers, proxies, API gateways, DNS, TLS, and HTTP protocol
- Good understanding of Linux networking
- Working knowledge of cloud and Kubernetes networking
Preferred experience
- Experience in a distributed system or microservices architecture
- Experience developing and supporting tools for internal customers
- Experience making technical decisions as a tech lead
- Experience working as an SRE
- Experience writing Go (and scripting with bash)
Platform DX team-specific preferred experience
- Experience in frontend development with React
- Experience in interface and UI/UX design
Network team-specific preferred experience
- Experience using service meshes such as Istio or Linkerd in production
- Experience applying network reliability practices such as circuit breaking and rate limiting
- Experience in designing cloud-based network architectures
Language
- English: Proficient (CEFR - C1)
- Japanese: Basic (CEFR - A2), optional