Verizon is a leading provider of technology, communications, information and entertainment products, transforming the way we connect across the globe. We’re a diverse network of people driven by our ambition and united in our shared purpose to shape a better future. Here, we have the ability to learn and grow at the speed of technology, and the space to create within every role. Together, we are moving the world forward – and you can too. Dream it. Build it. Do it here.
What you’ll be doing...
Verizon is committed to providing the best omnichannel, personalized experience to its customers across all channels, including Digital, Retail and Indirect, and Customer Care. To enable this best-in-class customer experience, we are implementing Site Reliability Engineering (SRE) practices and principles across all customer interactions and applications. The role of the SRE team is to operate mission-critical applications in production and do whatever is necessary to keep the site up and running. The role is often described as a software engineer doing operations work. You will be responsible for establishing and maintaining the service levels agreed upon with the business and for managing error budgets for each system. You will be expected to balance your time between operational work (making sure systems work as expected) and improving those systems by writing software to automate processes and reduce toil.
We are looking for a Senior Site Reliability Engineer with six years of experience in distributed systems design and integration architectures for business applications using microservices, containers, and cloud. You will lead a team to build a new automation framework, or modify an existing one, for IT operations. You will apply an engineering mindset and development skills to IT operations to improve the overall observability of applications and infrastructure, and develop the automation framework in such a way that it reduces manual effort.
You should be a strong technical lead who can help execute our vision for Site Reliability Engineering (SRE) by determining how the systems relate to one another and using a breadth of tools to build automation that improves reliability for customers. Practices such as limiting time spent on operations and proactively identifying automation opportunities drive the iterative improvement that is key to both product quality and interesting, dynamic day-to-day work.
Implement SRE automation, develop automation across the stack, and optimize operations hours by reducing manual operations.
Eliminate toil through automation across all layers – infrastructure provisioning, configuration management, deployment, testing, and operation.
Work on retooling our infrastructure to provide an agile, cloud-based foundation with a common infrastructure management and automation framework.
Interface directly with senior staff members within the organization to discuss and assess compliance with IT policies, standards and procedures, suggest opportunities for improvement, and report on the status of specific. Work with development teams throughout the software life cycle to ensure sustainable software releases.
Practice sustainable incident response and blameless postmortems.
What we’re looking for...
You’ll need to have:
Bachelor’s degree or four or more years of work experience.
Six or more years of relevant work experience.
Strong experience with AWS cloud environments, including working knowledge of NLB/ALB, S3, EC2, Auto Scaling, EKS, and Lambda, with certification in appropriate areas.
Ability to build and drive adoption of SRE automation for IT operations and deployments.
Willingness to travel up to 25% of the time.
Even better if you have one or more of the following:
Bachelor’s degree or equivalent experience, Master’s degree.
Six or more years of experience with all phases of the Software Development Lifecycle, including system analysis, design, coding, testing, debugging and documentation.
Five or more years of experience working with middleware technologies such as WebLogic, Tomcat, IBM MQ, Kafka, RabbitMQ, Spring Boot, Redis, and Elasticsearch.
Automation experience and the ability to code or script at an advanced level.
Experience in cloud and container platform strategy, design, architecture, and migration.
Experience designing and implementing CI/CD DevOps solutions with Jenkins pipelines, using Python, Git, Shell, YAML, Kubernetes, and Docker.
Scripting experience with Ansible, CloudFormation, Jython, and UNIX shell.
Configuration Management experience with Chef, Puppet, Ansible or Python.
Experience serving as both a mentor and advocate for your team.
Experience performing analytics on previous incidents and usage patterns to better predict issues and take proactive actions.
Experience leading and participating in performance tests to identify bottlenecks, opportunities for optimization, and capacity demands for iconic phone launches.
Experience in IT Security and compliance, operations and network services, and application development.
Experience leading medium to large projects by bringing together the right perspectives, identifying roadblocks, and integrating feedback from clients and team members.
Working knowledge of APM tools like CA Wily, New Relic, or Datadog.
Certified Kubernetes Administrator (CKA) certification.
Equal Employment Opportunity
We're proud to be an equal opportunity employer, and we celebrate our employees' differences, including race, color, religion, sex, sexual orientation, gender identity, national origin, age, disability, and Veteran status. At Verizon, we know that diversity makes us stronger. We are committed to a collaborative, inclusive environment that encourages authenticity and fosters a sense of belonging. We strive for everyone to feel valued, connected, and empowered to reach their potential and contribute their best. Check out our diversity and inclusion page to learn more.