Job description
- Mission
- Platform.sh is a groundbreaking hosting and development tool for web applications.
- To reinforce our technical prowess, we are looking to grow our operations team. If you’re looking for an exciting, high-growth opportunity with an award-winning, cutting-edge company, this could be just the job for you.
- For its PaaS solution, Platform.sh (https://platform.sh) is looking for an Operations and Service Reliability Engineer with a taste for Python and Go, a strong understanding of Linux systems, and a real hunger for the challenges of building robust distributed systems.
- Platform.sh is a PaaS shrouded in a lot of black magic (we can consistently clone a whole running cluster, with its state, databases, and indexes, in a matter of seconds). We want to get this down to the hundreds-of-milliseconds range. Interested? There is more...
- Our external API is pure Hypermedia REST + OAuth on top of Pyramid. It mechanizes the Git layer and needs more features.
- From the same manifest, we can consistently generate a Docker container, an LXC container, or VM disk images (AWS, Azure, OpenStack). We want more targets.
- We probably have the highest container density in the industry. We need to get it higher.
- We support Python, Ruby, Node.js, PHP, Java, and .NET. Time to roll out Elixir, of course (and Rust. We need Rust).
- Directly reporting to one of our Directors for the Operations Infrastructure Department and in close interaction with our Engineering and Customer Success teams, you will be responsible for:
- Cloud operations: configure clusters, deploy stuff, follow up on alerts, help customer support debug issues.
- Automating all of the above so they can instead drink margaritas (or non-alcoholic beverages, of course)
- Creating systems, tools & processes that will enhance our support and operations efficiency
- Improving service quality, discipline and reliability throughout the service lifecycle
- Monitoring operating objectives, and streamlining and automating intervention
- Continuously learning from operations experience and modeling it as software
This is a fully remote position for a candidate based in Canada.
- The ideal candidate
- Has proven successful experience in an operations role
- Has demonstrated the ability to successfully manage cloud-based infrastructure for a fast growing organization
- Has experience with containerization technologies
- Has had exposure to cloud services: AWS, Azure, GCP, etc.
- Understands how an OS works, how networking works, how Git works, and the constraints of a distributed system
- Has Puppet experience
- Is proficient in Python (Golang a plus)
Nice to have
- Knowledge of Magento Ecommerce, Symfony, Drupal, eZ Platform, or TYPO3.
- Note: we don't like stress, so we build everything to be robust and resilient, but stuff does break. This is a role with on-call duties, weekend work, and fire drills. If this fills you with dread... well, this might not be a fit for you.
This is a remote job. Work from anywhere!
- We are a worldwide distributed team and are looking for a candidate who can perform well working remotely. To be effective in this role at Platform.sh, you’ll need to collaborate across time zones while operating with a high level of independence and autonomy.
About Platform.sh
- Platform.sh is an idea-to-cloud application platform that simplifies cloud infrastructures.
- We give developers the tools they need to experiment, innovate, get rapid feedback and deliver better-quality features with speed and confidence thanks to our unique rapid cloning technology.
- We want people who are passionate, open, multicultural, friendly, humble and smart to join us and help this fast-growing, award-winning company to revolutionize the tech industry.