I build reliable infrastructure

Cloud & Infrastructure Engineer building and maintaining production systems that teams depend on daily. Monitoring, automation, and migrations — done right.



What I focus on

Clear outcomes over buzzwords.

Reliability

Reduce incident noise and surprises with better monitoring, alerting, and runbooks that actually get used.

Automation

Remove toil with scripts and Ansible where it actually pays off. No automation for automation's sake.

Pragmatism

Simple, auditable changes that teams can own and maintain. Solutions that outlast my involvement.

Production Systems

Real infrastructure I've built and maintain in production.

Support Dashboard

Team was juggling 3 ticketing systems per shift. Built a unified platform that consolidated them into one view with 40x faster response times.

PHP-FPM | API Integration | Live Since Nov 2024

VIP Alerting & OOH Reporting

Critical client tickets were getting missed outside business hours. Built automated escalation that catches 5-10 real incidents per week.

VictorOps API | Automation | 19 VIP Clients

Zabbix Automation Modernisation

Monitoring onboarding took 1-2 hours per server and kept breaking on modern OSes. Rebuilt the automation — now 5 minutes, 500+ deployments, zero failures.

Ansible | Zabbix API | Production

Technical Documentation

All operational knowledge lived in one person's head. Built comprehensive docs that cut onboarding from weeks to days and eliminated single points of failure.

Knowledge Base | Runbooks | Team Enablement

View all projects

About me

I'm a Cloud & Infrastructure Engineer with 5+ years at THG Ingenuity, where I provide L3 support across bare metal, VPS, and cloud hosting. My day-to-day is a mix of keeping production systems healthy, building internal tools that make the team faster, and automating away the repetitive stuff so we can focus on the problems that actually need a human.

I gravitate towards the kind of work where reliability matters — monitoring that catches real issues instead of generating noise, automation that handles edge cases instead of just the happy path, and documentation that means I'm not the only person who can fix things at 3 AM.

Servers automated

Systems consolidated

Faster response times

Linux | Ansible | StackStorm | Zabbix | cPanel/WHM | Networking | DNS & SSL | Bash | Python

# Recent incident timeline
00:14 VIP ticket arrives OOH
00:14 Auto-detected, VictorOps fired
00:16 On-call engineer paged
00:22 Root cause identified
00:31 Resolution confirmed

# Before: noticed Monday morning
# After:  resolved in 17 minutes

Get in touch

Have a question about my work, want to discuss infrastructure, or just want to say hello? Drop me a message.

matthodges20@gmail.com | github.com/NexteraMatt | LinkedIn
