*Our roles are remote first, and can be based anywhere in India (#LI-Remote).

Responsibilities
- Monitor and continually improve the capacity of our production environment
- Design and implement scalable, reliable, and efficient infrastructure using Kubernetes, Terraform, and AWS resources
- Partner with development teams to improve services through rigorous testing and release procedures with CI pipelines (GitHub Actions, Dockerfiles)
- Gain a deeper understanding of RudderStack infrastructure and help debug incidents
- Proactively build software to help operations and support teams
- Identify opportunities for process improvements, automation, and cost savings

Requirements
- A Bachelor's or Master's degree in Computer Science, or equivalent experience
- 5+ years of experience as a Site Reliability Engineer, Internal Platform Developer, or in a similar role
- Strong understanding of cloud computing, containers, and DevOps practices
- Demonstrated Linux experience
- Excellent debugging skills
- Experience with scripting and infrastructure automation
- Familiarity with distributed systems design patterns using tools such as Kubernetes
- Familiarity with AWS, Azure, or Google Cloud Compute
- Excellent verbal and written communication skills
- Familiarity with networking concepts like VPCs, proxies, and CDNs

Here are examples of things we've worked on:
- Build and maintain a Kubernetes platform to deploy all our applications with high availability
- Build a Kubernetes operator to automate hundreds of deployments
- Manage hundreds of PostgreSQL instances with HA for our deployments
- Provision and manage air-gapped on-premise deployments in diverse environments
- Manage a multi-region, multi-cluster environment with hundreds of customer deployments in single-tenant and multi-tenant models
- Complete infrastructure as code, enforced using a GitOps model
- Automated migrations of complex, highly available services
- Work on compliance (e.g., SOC 2 Type 2, HIPAA), security, scalability, and many more aspects of delivering top-class, secure software
- We follow FinOps and continuously optimize our cloud costs

How we achieve results:
- Empathy for the problems encountered by our customers
- Collaboration with engineering teams to achieve results
- Deep care for the quality of your own and the team's code
- Curiosity and understanding, for investigating causes and finding effective solutions
- Output driven, to provide value to our customers in a significant, measurable, and positive way
- Focus on writing testable, performant, bug-free code that provides the right solutions to the problems

Please mention the word **SHARPEST** and tag RNDQuMjM0LjE0NS4xMTU= when applying to show you read the job post completely (#RNDQuMjM0LjE0NS4xMTU=). This is a beta feature to avoid spam applicants. Companies can search these words to find applicants who read this and see that they're human.
Job listing via RemoteOK.com




