Site Reliability Engineer (Apple Information Security)
Job details
Summary
Posted: Oct 7, 2024
Role Number: 200571917

Imagine what you could do here. At Apple, new ideas have a way of becoming great products, services, and customer experiences very quickly. Bring passion and dedication to your job and there's no telling what you can accomplish.

We are seeking an extraordinary individual who is passionate about reliability engineering, software development, privacy, and information security with a desire to work in hyper-scale environments. The ideal candidate will have a strong background in production monitoring, a deep understanding of development and operations, and a proven track record in managing large-scale production environments.

Description
Our team is highly collaborative, working closely with partner teams to deliver the best results for Apple. We strive to find the best solution while also considering the need to get things done efficiently for each engineering challenge we face. Good ideas are valued and rewarded.

As an SRE in Apple Information Security, you will:
- Operate, monitor, and triage all aspects of our production and non-production environments.
- Pioneer and implement the next-generation telemetry system for AIS services.
- Establish alert handling procedures, runbooks, and collaborate with our global security team.
- Automate deployment and orchestration of services into the cloud environment as well as other routine processes.
- Actively participate in capacity planning and disaster recovery exercises.
- Interact with and support partner teams across the enterprise.
- Cultivate and maintain relationships with internal and external third-party vendors.

Minimum Qualifications
- 7+ years of experience in Site Reliability Engineering, DevOps, or a related field.
- Bachelor’s degree in Computer Science, or a related field, or equivalent practical experience.
- Experience working with cloud compute environments like OpenStack, AWS, GCP or Azure.
- Experience with infrastructure as code (IaC), configuration management, CI/CD, and automation, e.g., Terraform, Pulumi, CloudFormation, Ansible, Chef, Puppet, Jenkins.
- Strong programming skills: Python and/or Go.
- Extensive experience administering and troubleshooting Linux systems (any distribution), including the usage of standard Linux utilities.
- Troubleshooting and debugging experience.

Preferred Qualifications
- Proficiency in implementing and coordinating telemetry using monitoring and observability tools like Splunk, Grafana, Prometheus, or similar.
- Experience in shell scripting (e.g., bash/zsh) and system administration.
- Experience with measuring, analyzing, and optimizing performance.
- Experience operating with Scrum/Agile development methodologies.
- Strong understanding of concurrency, parallelism, and distributed system concepts.
- Passion for high-quality code, tests, documentation, and production services.
- Participation in an on-call rotation.
- Building and operating container orchestrating systems (Docker, Kubernetes, Vagrant and micro-services).
Apply safely
To stay safe in your job search, find information on common scams, and get free expert advice, we recommend that you visit SAFERjobs, a non-profit, joint industry and law enforcement organization working to combat job scams.