Job Overview
dotData is hiring a high-caliber Site Reliability Engineer (SRE). In this role, you will help architect, modify, improve, and support the platform running user-facing Software-as-a-Service (SaaS) and managed service offerings on top of various dotData products.
Using your expertise in SRE principles of automation and continuous improvement, you will help create an environment where availability, reliability, and security are threaded through the entire application life cycle. As an SRE, you will write new software as required to automate the building, testing, deployment, promotion, monitoring, alerting, and maintenance of dotData products on both our cloud sites and customer cloud sites.
Things You Will Do
Build and architect systems for managing clusters deployed on multiple cloud providers and regions securely and efficiently.
Develop systems, primarily in Python, to automate deployments and prevent outages through automatic scanning and remediation.
Investigate system performance, errors, and problems for incident response.
Provide architectural guidance and recommendations about dotData product deployments for customers who deploy products on their sites.
Work with internal teams to create solutions designed to meet customer expectations.
Job Requirements
Mandatory
Experience building and architecting systems on Amazon Web Services, Microsoft Azure, or both
Familiarity with DevOps toolsets such as Terraform, Ansible, and CI/CD tools
Experience with Linux system administration and troubleshooting
Familiarity with scripting languages (e.g., Python) and comfort developing deployment automation tools and their tests in at least one language
Strong troubleshooting skills and a logical approach to solving problems
Japanese communication skills
Nice to Haves
Operations experience with a production user-facing application
Experience with operating applications on top of Kubernetes or Hadoop
Experience working remotely and asynchronously
Excellent learner who loves trying out new things and thinks deeply about how things can be done better
English communication skills
What We’re Offering
Direct impact on the product
Possibility to improve your skills in a very experienced and highly technical team
Powerful hardware, 2 monitors
Friendly atmosphere, no stress, no dress-code
Flexible working hours: we understand if you have to leave at 3pm or prefer to work in the evenings
Personal development budget for conferences, books, online training, etc.
Very comfortable office in the heart of Tokyo
About dotData
dotData’s pioneering automated feature discovery and engineering platform solves the hardest challenge of AI/ML projects. Our Feature Factory technology discovers hidden gems for empowering your business as transparent, explainable features by connecting the dots within large-scale data sets in hours, without human bias. It enables data scientists to explore 100X more features, including those you’ve yet to imagine, and augments AI/ML projects in an agile manner to deliver business value faster. In an era of rapid change, AI-discovered insights can be a game changer for business growth and innovation across industries. The power of dotData’s platform and ability to provide game-changing insights is why Fortune 500 organizations across the globe use dotData.
dotData is a Silicon Valley-based startup focused on full-cycle Machine Learning and Data Science automation.
Our platform automates the entire process of building predictive models starting from raw business data through data and feature engineering to machine learning all the way to production.
We have offices in the USA, Japan, and Poland. Fortune 500 organizations around the world use dotData to accelerate their ML and AI projects.