• Sr. Manager, Site Reliability

    Job Location US-CA-Pleasanton
    Information Technology
    Position Type
  • About Blackhawk Network:

    Blackhawk Network Holdings, Inc. (NASDAQ: HAWK) is a global financial technology company and a leader in connecting brands and people through branded value solutions. Blackhawk platforms and solutions enable the management of stored value products, promotions and rewards programs in retail, ecommerce, financial services and mobile wallets. Blackhawk’s Hawk Commerce division offers technology solutions to businesses and direct to consumers. The Hawk Incentives division offers enterprise, SMB and reseller partners an array of platforms and branded value products to incent and reward consumers, employees and sales channels. Headquartered in Pleasanton, Calif., Blackhawk operates in 26 countries. For more information, please visit blackhawknetwork.com, hawkcommerce.com, hawkincentives.com or our product websites GiftCards.com, giftcardmall.com, GiftCardLab.com and OmniCard.com.


    We are looking to hire an accomplished Sr. Manager, SRE to join the Blackhawk Network Technology Support Organization and lead the Site Reliability Engineering team responsible for defining metrics on platforms, services, processes and partners, while also driving the complete problem management life cycle. The desired candidate will have an excellent understanding of common development technologies and a solid understanding of infrastructure as it relates to supporting a high-volume online transaction processing platform.


    • Manage a team of engineers and data analysts towards company, team and personal goals.
    • Own career growth, hiring and feedback for the team through a continuous review process.
    • Continue to refine and drive the problem management lifecycle, including root cause investigation, reporting and the creation of corrective actions.
    • Articulate RCA results and corrective action requests to engineers in all areas of the company (Engineering, Linux/Windows, DBA, Network, Storage and Customer Service).
    • Define and report on improvement metrics, at the platform and company level.
    • Mentor the team to constantly push for improvement of our processes and mindset, with a passion for driving continuous improvement within the company.
    • Work directly with peers in other orgs to build common goals around availability of services.
    • Aggressively communicate solutions through data-driven analysis, painting a clear picture of problems and their technical and financial impacts.
    • Define monitoring standards and tooling requirements to report on availability of services.
    • Implement solutions within the confines of standard architectural, compliance and security requirements.
    • Work hand in hand with the Operations Control Center to bring better visibility of impacts and constantly improve Mean Time to Identify as well as Mean Time to Restore.


    • 3+ years of managing or leading a team of Production Support engineers in the realm of Software Engineering, Network Administration, Cloud Operations, Linux, Storage or Database.
    • 10+ years of progressively advancing, hands-on technical experience as a Software Engineer, Network Engineer, Cloud Engineer, or Storage or Database Administrator.
    • Strong project management skills with the ability to direct and influence technical professionals by planning, communicating and tracking deliverables.
    • Experience with process improvement methodologies such as Kanban, Lean and/or Six Sigma.
    • Experience with IT Service Management methodologies, specifically ITIL.
    • Hands-on Cloud/AWS experience with emphasis on scaling and high availability.
    • Hands-on experience with Linux and Windows troubleshooting and/or administration.
    • Bachelor’s degree in computer science, engineering or equivalent.

