Datadog Administrator and Monitoring Specialist
Job Details
About the Company: Lumen is guided by our belief that humanity is at its best when technology advances the way we live and work. With 450,000 route fiber miles serving customers in more than 60 countries, we deliver the fastest, most secure global platform for applications and data to help businesses, government and communities deliver amazing experiences.

Learn more about Lumen’s network, edge cloud, security, and communication and collaboration solutions, and our purpose to further human progress through technology, at news.lumen.com, LinkedIn: /lumentechnologies, Twitter: @lumentechco, Facebook: /lumentechnologies, Instagram: @lumentechnologies and YouTube: /lumentechnologies.

Experience: 5 to 10 years
Location: Bangalore and Noida

Job Description: We are seeking a skilled and experienced Datadog Administrator and Monitoring Specialist to join our team. The ideal candidate will have extensive experience in deploying, configuring, and managing Datadog for monitoring and observability across various environments. This role involves working closely with development, operations, and security teams to ensure the health, performance, and security of our infrastructure and applications.

Key Responsibilities:

Datadog Deployment and Configuration:
- Deploy and configure Datadog agents across various environments (cloud, on-premises, hybrid).
- Set up and manage Datadog integrations with different services and tools (e.g., AWS, Azure, Kubernetes, Docker, etc.).
- Configure Datadog dashboards, monitors, and alerts to provide visibility into system performance and health.
- Implement comprehensive monitoring solutions to track key performance indicators (KPIs) and service-level objectives (SLOs).
- Develop and maintain custom dashboards to visualize metrics, logs, and traces.
- Set up anomaly detection and alerting mechanisms to proactively identify and resolve issues.
- Analyze performance data to identify bottlenecks and optimize system performance.
- Collaborate with development and operations teams to implement performance improvements and best practices.
- Respond to and resolve monitoring alerts and incidents in a timely manner.
- Conduct root cause analysis and implement corrective actions to prevent recurrence.
- Document incident response procedures and maintain an incident response playbook.
- Ensure monitoring and observability practices comply with security and regulatory requirements.
- Implement security monitoring and alerting to detect and respond to potential threats.
Qualifications:
- Proven experience in deploying and managing Datadog in a production environment.
- Strong understanding of monitoring and observability principles and best practices.
- Experience with cloud platforms (e.g., AWS, Azure, GCP) and container orchestration (e.g., Kubernetes, Docker).
- Proficiency in configuring Datadog agents, integrations, dashboards, and alerts.
- Familiarity with scripting languages (e.g., Python, Bash) for automation and customization.
- Experience administering, deploying, and tuning cloud monitoring (Azure, AWS, and GCP).
- Knowledge of infrastructure-as-code tools (e.g., Terraform, Ansible) is a plus.