About TemperPack
TemperPack is a rapidly growing company based in Richmond, Virginia, with locations in Las Vegas, Nevada, and Holt, Michigan. We design and manufacture innovative, sustainable packaging components for cold-chain shipping of food and pharmaceutical products. The markets we serve have an urgent need for disruptive, sustainable alternatives to the wide array of non-sustainable packaging materials conventionally used in the cold chain, such as Styrofoam coolers. Our aspiration is to provide packaging solutions that customers and end consumers feel great about using.
Job Title: Manager – Digital Services Reliability Engineering
Supervisor’s Title: Senior Manager - IT & Infrastructure
Department: IT
Location: Richmond
Job Description:
We’re looking for a hands-on leader to manage and grow our Digital Services Reliability Engineering practice. This person will be responsible for the performance, availability, and scalability of our digital systems, while fostering a culture of innovation and operational excellence. The role combines technical depth with team leadership and requires the ability to drive continuous improvement in collaboration with engineering and business teams.
Essential Job Functions:
· Lead and mentor a team of Reliability Engineers, creating a collaborative and high-performing team environment.
· Ensure the stability and performance of key systems through effective monitoring, capacity planning, and performance optimization.
· Implement and evolve observability solutions using tools like Datadog, Grafana, Prometheus, Splunk, AppDynamics, or OpenTelemetry.
· Collaborate with Product and Engineering teams to design and maintain reliable, scalable, and efficient architectures.
· Manage and improve incident response processes, including coordination, root cause analysis, and post-incident reviews.
· Track system health using SLIs/SLOs and other KPIs to inform decisions and report on system reliability.
· Promote automation to reduce manual work, leveraging Infrastructure as Code (e.g., Terraform, Kubernetes, Ansible).
· Support the evolution of CI/CD practices in close partnership with development teams.
· Manage and optimize cloud infrastructure, primarily within Azure, and support containerized environments (Kubernetes, Docker).
· Contribute to cloud security, compliance, and cost-efficiency initiatives in collaboration with InfoSec, Legal and Finance teams.
· As members of the IT team, we treat safeguarding information systems and ensuring data security as our highest priority. Every role within IT is responsible for upholding best practices in cybersecurity, compliance, and risk management to protect the integrity, confidentiality, and availability of our systems and data. All team members are expected to adhere to security policies, proactively identify and mitigate threats, and contribute to a culture of vigilance and continuous improvement in information security.