The MediaMachine Runbook for Traefik
MediaMachine.io is an IaaS platform for user-generated video content, and we use Traefik as the reverse proxy in our network layer.
In a previous post, we talked about how Traefik fits into our infrastructure and helps us serve requests with ease. This runbook extends that post: it is an overview of the things we found important for anyone who wants to run Traefik themselves, including useful configuration snippets.
#
The MediaMachine Traefik Runbook
Here is our runbook for getting set up with Traefik in your environment:
PS: Some of the configuration is AWS-centric, but the general concepts and the Traefik-specific configuration apply more broadly.
#
Monitoring + Metrics + Dashboard
We currently use Datadog for monitoring Traefik.
If you're running a Datadog agent on your instance, it is typically set to listen on port 8125.
Tip: This setup can work with other StatsD-compatible endpoints too.
You can drop this section into the static config to enable metrics and the UI dashboard:
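A minimal sketch of such a section, assuming Traefik v2, a Datadog agent listening locally on port 8125, and a hypothetical dashboard port of 8081 (the port and the push interval are illustrative choices, not values from our setup):

```toml
# Push metrics to the local Datadog agent.
[metrics]
  [metrics.datadog]
    address = "127.0.0.1:8125"   # assumed local agent address
    pushInterval = "10s"

# Enable the API and the UI dashboard.
[api]
  dashboard = true
  # Insecure mode serves the dashboard directly on the entrypoint named
  # "traefik"; only use it on a port that is not publicly reachable.
  insecure = true

# Give that entrypoint its own port, separate from regular traffic.
[entryPoints.traefik]
  address = ":8081"   # hypothetical port
```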

The Traefik dashboard is a well-done single pane of glass that lets us monitor our services and traffic at a glance. You can point an internal DNS name at the dashboard's port so that the entrypoint serving the dashboard stays isolated from the entrypoint serving regular traffic.
If you're using middlewares, you can also inspect which middlewares and routing rules apply directly in the UI.
#
High Availability & Failover
It was fairly straightforward for us to run multiple Traefik instances with support from Consul. Since we run Consul in our environment, our service containers running on Docker advertise their IP addresses (and routing metadata) to Consul.
Traefik has built-in support for pulling in updates from Consul at fixed intervals (https://doc.traefik.io/traefik/providers/consul-catalog/). This makes spinning up multiple Traefik instances for HA plug-and-play, since each instance can independently fetch updates from Consul without extra coordination overhead.
If you don't run a local Consul agent on the same node as Traefik, you can also set the endpoint to your Consul server address, like consul.internal.example.com.
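As a rough sketch, assuming Traefik v2 and that Consul answers on its default HTTP port 8500 (the port and scheme are assumptions; adjust them to your cluster), the endpoint setting in the static config could look like this:

```toml
[providers.consulCatalog]
  # Only expose services that opt in through their tags (optional choice).
  exposedByDefault = false

  [providers.consulCatalog.endpoint]
    # Point at a remote Consul server instead of a local agent.
    address = "consul.internal.example.com:8500"
    scheme = "http"
```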
Tip: Run Traefik instances in a dedicated auto-scaling group and wire it into the ALB. As the auto-scaling group scales up or down, the ALB keeps the routing information up to date and forwards traffic to new instances as they come up.
#
Make sure you're aware of the min "catalog" refresh time
The default interval between querying Consul for config updates is 15 seconds.
Make sure your service deploys are graceful by allowing enough time between each instance in your rolling deploy.
Tip: Waiting for ~30 seconds between each instance update should give Traefik+Consul enough time to propagate membership updates. Adjust the delay to also leave time for bootstrap operations to finish.
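If the default does not suit your deploy cadence, the interval can be set explicitly in the static config. A small sketch, assuming Traefik v2's Consul Catalog provider; the value shown is simply the default written out:

```toml
[providers.consulCatalog]
  # How often Traefik polls the Consul catalog for changes.
  # Lower values propagate membership changes faster, at the cost of
  # more frequent queries against Consul.
  refreshInterval = "15s"
```

Whatever value you pick, keep the pause between instances in a rolling deploy comfortably longer than this interval.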
#
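Advertising service routing with Docker+Nomad
If you're using Nomad, adding service metadata tags makes it easy to declare routing config right next to your service definitions.
For example, a service can carry tags such as traefik.enable=true along with a traefik.http.routers.<name>.rule tag that declares its routing rule.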
Nomad takes care of propagating these tags to Consul. Traefik leverages the Consul Catalog integration to periodically fetch updates and dynamically adjusts configuration. As our containers spin up (or down), the membership information is automatically kept up to date.
#
Improve load times by enabling compression
This was covered in the previous code example but is worth highlighting. You can easily enable gzip compression via a middleware for all the responses flowing back through it.
For declaring and using compression middleware in-line with consul catalog tags:
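As a minimal sketch of the same idea, here is the middleware written in file-provider TOML, with the equivalent Consul Catalog tag forms in the comments. The names compress-response and my-service are hypothetical, and Traefik v2 is assumed:

```toml
# Tag form (on the service registered in Consul):
#   traefik.http.middlewares.compress-response.compress=true
#   traefik.http.routers.my-service.middlewares=compress-response
#
# File-provider form of the same middleware declaration.
# An empty table is enough to enable compression with default settings.
[http.middlewares]
  [http.middlewares.compress-response.compress]
```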
#
Protect your servers with rate limits
We recommend setting a global rate limiter to protect your infrastructure from DDoS attacks and accidental for-loops in client-side code.
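A sketch of what such a declaration might look like in the dynamic config, assuming Traefik v2's rateLimit middleware. The numbers are placeholders to tune against your own traffic, not recommendations:

```toml
[http.middlewares]
  [http.middlewares.global-ratelimit.rateLimit]
    # Allowed request rate: `average` requests per `period`.
    average = 100
    period = "1s"
    # Maximum number of requests allowed through in a short burst.
    burst = 200
```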
This declares a global rate limit middleware. You can tweak the parameters of this rate limiter, such as the period and burst, dynamically (see the next section for live config updates).
You can set a convention to apply the global-ratelimit middleware to all routes and opt in to more specific rate limiters whenever specific service backends need them.
#
Split Dynamic config strategy (with Consul)
Traefik config can be divided into two logical sections:
- Static
- Dynamic
A static config section that doesn't change often (needs a Traefik redeploy)
We define things like where to find Consul, top-level config, UI dashboard access, etc. in the static section.
Let's call this file traefik_static_config.toml:
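A trimmed-down sketch of what this file could contain, assuming Traefik v2. The port and the file path are hypothetical; metrics and dashboard settings would live in the same file:

```toml
# traefik_static_config.toml (sketch)

# Well-known entrypoint that services bind their routers to.
[entryPoints]
  [entryPoints.httpsIngress]
    address = ":443"   # assumed port

# Service discovery through Consul.
[providers.consulCatalog]
  exposedByDefault = false

# Dynamic file provider: Traefik watches this file and applies
# changes without a restart.
[providers.file]
  filename = "/etc/traefik/traefik_dynamic_config.toml"   # assumed path
  watch = true
```

Changing anything in this file requires restarting Traefik, which is why it is best kept to settings that rarely move.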
The dynamic config section that Traefik can update on-the-fly
Since we added a dynamic file provider block, we can drop updates into the traefik_dynamic_config.toml file and Traefik will apply the changes without restarting.
Example of the dynamic config:
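The following is an illustrative sketch rather than our production file. It assumes Traefik v2, and the middleware name uploads-ratelimit and all numeric values are hypothetical:

```toml
# traefik_dynamic_config.toml (sketch)

[http.middlewares]
  # Baseline limiter meant to be attached to every route.
  [http.middlewares.global-ratelimit.rateLimit]
    average = 100
    burst = 200

  # Stricter, opt-in limiter for an expensive backend.
  [http.middlewares.uploads-ratelimit.rateLimit]
    average = 5
    period = "1s"
    burst = 10

    # Behind a load balancer, key the limit on the client IP taken
    # from the X-Forwarded-For header (first entry from the right).
    [http.middlewares.uploads-ratelimit.rateLimit.sourceCriterion.ipStrategy]
      depth = 1
```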
This file in conjunction with the dynamically updating service tags via Consul gives us pretty good coverage for applying most changes without having to do a full deploy. You can update these dynamic config sections live or add more middlewares and routers, and attach them to your services without restarting Traefik.
Using Nomad, we propagate updates to the dynamic file via Consul:
- Store the contents of traefik_dynamic_config.toml as a Consul key
- Use a Nomad template block to sync updates when the file is changed on Consul
- The [providers.file] watch=true config in Traefik will pick up changes dynamically
You can use this template block as a starting point:
#
Bonus: Identifying unique Traefik instances
It is sometimes helpful to know which Traefik instance handled a particular request while you're debugging. We found this simple hack to give unique names to our Traefik instances:
You can declare an entrypoint with a unique name that doesn't really route any traffic.
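A sketch of the idea in the static config, assuming Traefik v2. The instance name traefik-instance-01 and both ports are hypothetical; in practice the name would be generated per instance at deploy time:

```toml
[entryPoints]
  # The well-known entrypoint that services actually bind to.
  [entryPoints.httpsIngress]
    address = ":443"

  # Marker entrypoint: no router references it, so it carries no
  # traffic, but its unique name shows up in the dashboard.
  [entryPoints.traefik-instance-01]
    address = ":9999"
```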

This lets you configure your services to bind with a proper, well-known entrypoint as usual (httpsIngress in this example) while also getting a unique name displayed in the UI dashboard.
Similarly, you can also have a middleware that injects the unique ID of the Traefik instance into the response headers of every request.
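One way to sketch this is with Traefik v2's headers middleware in the dynamic config. The middleware name, the header name X-Traefik-Instance, and the value are all hypothetical, and the value would need to be rendered per instance when the file is generated:

```toml
[http.middlewares]
  [http.middlewares.traefik-instance-id.headers]
    # Added to every response that passes through this middleware.
    [http.middlewares.traefik-instance-id.headers.customResponseHeaders]
      X-Traefik-Instance = "traefik-instance-01"
```

Routers that should expose the header then list traefik-instance-id among their middlewares.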
#
[FIN]
We hope this runbook helps you with setting up Traefik in your own environment. If you have other cool tips and tricks, please share them with us and we'd be happy to update this runbook.