Solution Architect - Site Reliability Engineer - Azure
XenonStack Chennai
Job Description
- Gathering Project Requirements from Stakeholders along with Business Analysts and Project Managers
- Break down complex problems and projects into manageable goals
- Handle high-severity incidents and situations.
- Designing high-level schematics of the infrastructure, tools, and processes needed
- Performing an in-depth analysis of possible risks and their countermeasures
- Create a bridge between development and operations by applying a software engineering mindset to system administration topics
- Configuration management platform understanding and experience (Chef/Puppet/Ansible)
- Release engineering, which involves defining best practices to ensure software releases are consistent and repeatable.
- Alerting, being on-call, and troubleshooting, along with emergency and incident response and postmortems.
- Know how best to monitor systems and react when things go wrong, constantly writing and rewriting response playbooks to reduce the time to fix any breakdown which may occur
- Conducting postmortems: documenting an incident, understanding all contributing root causes, and implementing future preventive actions.
- Highly developed skills in managing 24x7 production support comprising incident, problem, and change management
- Troubleshooting Support Escalation
- On-Call Process Optimization
- Documenting Knowledge
- Optimizing SDLC (Software Development Life Cycle)
- Strong understanding of cloud-based architecture and cloud operations. Hands-on experience with Azure
- Experience in administration/build/management of Linux systems
- Foundational understanding of Infrastructure and Platform Technology stacks
- Strong understanding of networking concepts and theories, such as different protocols (TCP/IP, UDP, ICMP, etc.), MAC addresses, IP packets, DNS, OSI layers, and load balancing
- Working knowledge of Infrastructure and Application monitoring platforms
- Understanding of the core DevOps practices (CI/CD pipeline, release management, etc.)
- Ability to write code in at least one modern programming language (Python, JavaScript, Ruby, etc.). Additional scripting skills are preferred
- Prior experience with cloud management automation tools (Terraform, CloudFormation, etc.) is preferred
- Experience with source code management software and API automation is preferred.
- Deep understanding of the architecture and operations of container orchestration tools, e.g., Kubernetes
- Deep understanding of application stacks, e.g., Java, Node.js, Golang
- Deep understanding of Databases and SQL
- Strong understanding of BigData Infrastructure.
- Understanding of Incident management and Event Register Management
- Knowledge of SDLC methodologies and best practices including Waterfall Process, Agile methodologies, deployment automation, code reviews, and test-driven development
- Excellent communication skills
- Attention to detail
- Analytical mind and problem-solving aptitude
- Strong Organizational skills
- Visual Thinking
- Education: Technical graduates (BCA, BSc, B.Tech), MCA, MSc, and M.Tech with strong data structures and algorithm skills