Title: SRE/Observability Engineer

He aha ai tātou – Why us?

At One NZ we’re not just imagining the future, we’re building it. Our purpose? A better-connected Aotearoa New Zealand. Kia renarena te taukaea i Aotearoa New Zealand.

 

Our ambition? To become the most AI-enabled telco on the planet. This isn’t just about our technology; it’s about you. We’re investing in our people like never before, empowering them to grow, lead, and shape what’s next.

 

Through our AI School, access to world-class learning platforms, and career pathways that evolve with you, we create an environment where your curiosity thrives and your skills accelerate.

 

Join us and be part of something extraordinary: connect with purpose and help redefine what’s possible.  

 

Uia mai koe te pātai, he aha te mea nui o tēnei ao? Māku koe e ki atu he tangata, he tangata, he tangata.

“If you asked me, what is the greatest thing in this world, I would say it is people, it is people, it is people.”

Ko tō tūranga – your role

As the SRE/Observability Engineer – Data & AI Platforms, you’ll lead the reliability, governance, and observability of One NZ’s Data & AI platforms to enable faster, incident-free delivery. You’ll embed best-practice telemetry, readiness gates, and cost-aware operations across the AI lifecycle, working cross-functionally to align support models, resilience strategies, and FinOps controls. This role combines hands-on enablement with strategic influence to ensure our platforms remain secure, scalable, and production-grade.

Ko tō mahi – what you’ll do

  • Define SLIs/SLOs for core services and build dashboards and alerts using logs, metrics, and traces.
  • Continuously improve alert quality to reduce noise and increase actionability.
  • Set and own outcome targets such as MTTD, MTTR, recurrence rates, and change success.
  • Define severity levels, taxonomy, and communication SLAs.
  • Ensure on-call readiness with partners, lead P0/P1 incident response and stakeholder updates, and drive post-incident actions to closure.
  • Maintain a Known Error Database (KEDB), trend reporting, and a prevention backlog tied to SLO breaches.
  • Standardise resilience patterns including health checks, timeouts, retries, backoff strategies, and autoscaling.
  • Lead capacity and performance testing and enforce error-budget policies.
  • Enforce tagging to surface product-level costs, monitor budget variances and anomalies, and drive safe optimisations such as caching, batching, and model sizing.
  • Co-define acceptance criteria for production and support readiness across security, privacy, observability, and FinOps.
  • Automate checks in CI/CD pipelines and enable fast, safe rollback.
  • Maintain current standards, diagrams, evaluation reports, and runbooks to ensure clean handovers and reduce single points of failure.

Nā tōu rourou – what you’ll bring

  • Experience with observability platforms such as Dynatrace or Datadog, and relevant platform certifications such as Snowflake SnowPro Core.
  • Familiarity with Salesforce Data Cloud and its integration with SRE and observability workflows.
  • Proven track record designing SLIs/SLOs and error-budget policies across multiple services, aligned to business criticality tiers and reporting standards.
  • Proven experience in building and operating logs, metrics, and traces at scale using OpenTelemetry pipelines and alert routing.
  • Skilled in improving alert signal-to-noise ratio and automating runbooks.
  • Strong understanding of data ingestion, identity resolution, harmonisation, segmentation, activation, and governance, with a focus on how Data Cloud supports AI/RAG use cases and observability.
  • Experience implementing CI/CD quality gates (security, readiness, tagging/showback), automated rollback, and evidence-based change control.
  • Skilled in capacity, load, chaos, and disaster recovery testing.
  • Ability to surface cost-to-serve and drive safe optimisations.
  • Comfortable coordinating BAU and after-hours operations with partners.

 

Nā mātou te rourou – what you’ll get

  • One New Zealand is leading the way by ensuring you can have a truly balanced life. Most roles allow flexibility to work from home and flex your hours to enjoy work & whānau commitments.
  • Fully subsidised Southern Cross health insurance cover for you and your family.
  • A laptop, an unlimited data plan, and a market-leading mobile phone.
  • Lifestyle leave, giving you the option to purchase an extra week or two of annual leave.
  • Discounts on One New Zealand products, services and much more!

 

Joining our whānau is more than starting a new job – it’s the beginning of a journey that will challenge and inspire you to play a role in something bigger.

 

Tū hikitia rā, tū hāpainga. Tū hāpainga, tū hikitia rā.

We stand to uplift, to support and to elevate others.

 
