The most efficient ad marketplace for media owners to monetize content and marketers to deliver relevant ad experiences
August 21
🏢 In-office - Toronto
• Lead efforts to ensure our systems and networks operate seamlessly.
• Oversee internal metrics, execute effective incident responses, and contribute to system optimizations.
• Own the monitoring and maintenance of the health, security, and performance of on-premises and hybrid-cloud infrastructure.
• Execute timely and effective incident responses, minimizing downtime and ensuring swift resolution.
• Build and update disaster recovery plans and security protocols, and drive the maintenance of system backups.
• Respond to alerts within our established SLOs and assist in incident triage, ensuring that the right teams are engaged to address issues promptly.
• Influence the team’s direction and foster accountability, trust, and focus on goals.
• Act as a primary contact for operational issues, providing technical support and ensuring issues are resolved efficiently.
• Collaborate with product and software engineering teams to provide operational insights and relay requirements.
• Foster a collaborative environment by bringing people together to come up with better designs and approaches to complex problems.
• Identify opportunities for system optimization and performance improvements, then engage technical leads and management to drive the change.
• Research and implement advancements in technology and industry best practices.
• Develop and maintain automation frameworks to streamline processes and reduce manual tasks.
• Own risk identification and mitigation to proactively address present or anticipated operational challenges.
• Identify and implement catalysts for future optimization, including provisioning techniques, deployment optimization, ancillary services, pipelines, Ansible playbooks, power usage, bandwidth, etc.
• Create and ensure maintenance of comprehensive documentation for system configurations, processes, and incident resolution procedures.
• Participate in knowledge sharing and provide cross-training to other departments.
• Maintain runbooks and technical documentation, ensuring familiarity with internal and external escalation pathways.
• In-depth understanding of the Linux operating environment: kernel tuning, network stack tuning, system observability & instrumentation, and security & access management.
• Solid understanding of layer 2-7 networking fundamentals, the relationship between servers & services, and the transit of their packets through network hardware.
• In-depth experience engineering and maintaining a private-cloud infrastructure: bare-metal, vSphere, KVM, Kubernetes.
• Experience with tools like Ansible, Terraform, Docker, Kafka, Nexus.
• Experience with observability platforms: Prometheus, ELK, Jaeger, Grafana, Nagios, Zabbix.
• Familiarity with Big Data tools: Hadoop, HDFS, Spark, HBase.
• Ability to write code in Go, Python, Bash, or Perl for automation.
• 7+ years of proven experience in one or more of the following roles: DevOps, Linux System Administration, Site Reliability Engineering.
• Built or maintained a private-cloud infrastructure running CentOS/Rocky Linux on a mix of bare-metal, virtualization, and containerization.
• Managed public cloud environments such as AWS, GCP, Azure and their federation into on-premises environments.
• Life-cycle management of bare-metal servers such as Dell and Supermicro in globally distributed data centers (e.g. break-fix, baseband/firmware updates).
• Built or maintained on-premises and cloud Kubernetes clusters: kubeadm, EKS, GKE.
• Built or operated automation & orchestration frameworks for deployment & maintenance pipelines (e.g. Kafka, StackStorm, Ansible, Argo CD, Terraform) to push out code or configuration updates and build new infrastructure systems.
• Comprehensive health, dental, and vision plans at no cost to you
• Time off and flexible work schedules
• Retirement plan with a 5% company match
• Stock options and equity packages
• Generous parental leave
• Monthly wellness stipend plus fitness discounts and quarterly wellness group activities
• Community engagement opportunities and donation-matching program
• Annual virtual company retreats and regular community-led team events
• One day off per year to volunteer
• A workplace that supports a diverse, equitable, and inclusive environment