Varnish 4.1 & HAProxy: get the real IP by leveraging PROXY protocol support
Varnish has become an industry standard when it comes to caching. Varnish is a reverse caching proxy that you put in front of your webserver and that speeds up your website by caching your pages. Support for the PROXY protocol was introduced in Varnish version 4.1 to simplify the use of multiple layers of proxying.
Why would you need multiple layers of proxying?
Well, one big drawback is that Varnish doesn’t support SSL and only processes unencrypted HTTP traffic. The Varnish developers have very specific reasons to keep it that way. But in this day and age secure connections are a must, so we need to terminate SSL before the traffic reaches Varnish.
HAProxy to the rescue
HAProxy is open source loadbalancing & proxying software. Although Varnish has loadbalancing capabilities, HAProxy is far better at this and has more advanced features. However, Varnish is still our number 1 tool when it comes to caching.
Fortunately HAProxy can also terminate our SSL connection and forward the connection to Varnish where the caching happens.
The network diagram
If you’re confused about the setup, here’s the network diagram to show you the different layers of our setup.
- The HAProxy server sits in front, handles all incoming connections, terminates SSL, does loadbalancing, …
- Varnish sits behind HAProxy, does caching and caching only
- Finally there’s an Apache server that serves the page in case it’s not in cache
The user thinks he’s talking to the Apache webserver, but in reality, there are 2 layers of proxying in between that he is unaware of. If the page is in cache, the Apache server isn’t hit at all.
The problem: what is the real IP of the user?
If you’re using a proxy, the application doesn’t really know that and treats the proxy as a regular user. If you need the user’s IP address, you will just get the IP address of the proxy that sits in front of the webserver. The same deal applies to your log files: they will only contain the one IP.
Luckily, this problem has been tackled by introducing the so-called X-Forwarded-For header. This header contains a comma-separated list of IP addresses: the address of the original client, followed by the addresses of the proxies the request passed through. You can configure your webserver to use the value of that header instead of the IP address of the connection when logging to the access log or error log.
But if you have multiple layers of proxying, the X-Forwarded-For header might contain the IP address of the first proxy, instead of the user IP.
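To make the problem concrete, here is a minimal Python sketch of how the header grows as a request passes through the layers. The function names and the example addresses are made up for illustration; the only assumption about proxy behaviour is that each layer appends the address of the peer it received the connection from.

```python
def parse_forwarded_for(value):
    """Split an X-Forwarded-For value into a list of address strings."""
    return [part.strip() for part in (value or "").split(",") if part.strip()]


def append_hop(value, peer_ip):
    """Mimic a proxy that appends the address of the peer that connected to it."""
    return ", ".join(parse_forwarded_for(value) + [peer_ip])


# Hypothetical addresses: the user and the HAProxy server.
USER = "203.0.113.7"
HAPROXY = "10.0.0.1"

# Case 1: HAProxy adds the user's address, then Varnish appends HAProxy's address.
both_layers = append_hop(append_hop(None, USER), HAPROXY)

# Case 2: HAProxy adds nothing, so Varnish only knows HAProxy as its peer.
only_varnish = append_hop(None, HAPROXY)

print(parse_forwarded_for(both_layers)[0])   # 203.0.113.7 (the user)
print(parse_forwarded_for(only_varnish)[0])  # 10.0.0.1 (the first proxy)
```

In the second case the first entry in the list is a proxy, not the user, which is exactly the situation described above.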
Varnish and the X-Forwarded-For header
If you use Varnish 4, the X-Forwarded-For header is set automatically. You don’t need to worry about it or set it yourself by writing VCL code.
Unfortunately, Varnish doesn’t know if the IP address in the X-Forwarded-For header is the one of the user. The fact that there’s an HAProxy server in front isn’t really transparent to Varnish.
The solution: PROXY protocol support in Varnish
However, there is a solution: HAProxy invented the so-called PROXY protocol. It’s a protocol that HAProxy uses to connect to its backend by adding a preamble that contains information about the origin connection. Check out the full protocol spec on the HAProxy site.
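To give an idea of what such a preamble looks like, here is a small Python sketch of the human-readable version 1 format from the protocol spec: a single line that starts with `PROXY`, names the address family, lists source and destination address and port, and ends with CRLF. The helper names and the addresses are hypothetical, and the sketch only handles the `TCP4` and `TCP6` forms, not `PROXY UNKNOWN`.

```python
import ipaddress


def build_proxy_v1(src_ip, dst_ip, src_port, dst_port):
    """Build a PROXY protocol v1 preamble for a TCP connection."""
    family = "TCP6" if ipaddress.ip_address(src_ip).version == 6 else "TCP4"
    line = f"PROXY {family} {src_ip} {dst_ip} {src_port} {dst_port}\r\n"
    return line.encode("ascii")


def parse_proxy_v1(data):
    """Split received bytes into (connection info, remaining payload)."""
    line, sep, rest = data.partition(b"\r\n")
    fields = line.decode("ascii").split(" ")
    if not sep or len(fields) != 6 or fields[0] != "PROXY":
        raise ValueError("not a PROXY v1 TCP preamble")
    _, family, src_ip, dst_ip, src_port, dst_port = fields
    if family not in ("TCP4", "TCP6"):
        raise ValueError("unsupported address family in this sketch")
    info = {
        "family": family,
        "src_ip": src_ip,
        "dst_ip": dst_ip,
        "src_port": int(src_port),
        "dst_port": int(dst_port),
    }
    return info, rest


# The preamble travels in front of the regular HTTP request.
preamble = build_proxy_v1("203.0.113.7", "192.0.2.10", 56324, 443)
info, http_bytes = parse_proxy_v1(preamble + b"GET / HTTP/1.1\r\n\r\n")
print(info["src_ip"], info["dst_port"])
```

The receiving side reads the preamble first, learns the origin address from it, and then treats the rest of the stream as a normal HTTP connection.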
Varnish 4.1 now supports the PROXY protocol and does the following things for you:
- It sets the client.ip variable to the IP address that was sent via the PROXY protocol
- It sets the server.ip variable to the IP address of the server that accepted the initial connection
- It sets the local.ip variable to the IP address of the Varnish server
- It sets the remote.ip variable to the IP address of the machine that sits in front of Varnish
- It adds the IP address of the origin connection to the X-Forwarded-For header
Please see the Varnish VCL documentation section for more information about these variables.
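The relationship between these four variables can be modelled in a few lines of Python. This is not Varnish code, just a sketch of the behaviour listed above: `remote.ip` and `local.ip` always describe the actual TCP connection, while `client.ip` and `server.ip` take their values from the PROXY preamble when there is one. The function name and the addresses are made up for illustration.

```python
def vcl_ip_variables(remote_ip, local_ip, proxy_src=None, proxy_dst=None):
    """Model which address each VCL variable holds for one connection.

    remote_ip / local_ip: both ends of the TCP connection Varnish accepted.
    proxy_src / proxy_dst: addresses from the PROXY preamble, if any.
    """
    return {
        "client.ip": proxy_src if proxy_src is not None else remote_ip,
        "server.ip": proxy_dst if proxy_dst is not None else local_ip,
        "local.ip": local_ip,
        "remote.ip": remote_ip,
    }


# Hypothetical addresses: user, HAProxy's public side, HAProxy's internal side, Varnish.
USER, PUBLIC, HAPROXY, VARNISH = "203.0.113.7", "192.0.2.10", "10.0.0.1", "10.0.0.2"

with_proxy = vcl_ip_variables(HAPROXY, VARNISH, proxy_src=USER, proxy_dst=PUBLIC)
plain_http = vcl_ip_variables(HAPROXY, VARNISH)

print(with_proxy["client.ip"])  # 203.0.113.7
print(plain_http["client.ip"])  # 10.0.0.1
```

Without the preamble, `client.ip` falls back to the address of whatever connected to Varnish, which in this setup is HAProxy.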
Basically, PROXY protocol support makes sure that Varnish doesn’t have to care about other proxies, as long as Varnish sits directly in front of the web application.
How to configure Varnish
Varnish needs to be aware of the protocol used to communicate with it. By default this is plain old HTTP. If you want PROXY support, it is advised to add an extra listener on a separate port or IP. Add the keyword “PROXY” explicitly to that listener to enable the protocol.
The Gist below contains an example configuration where HTTP traffic is handled by port 6081, whereas PROXY traffic is handled over port 6083.
By separating the traffic, you can easily let Varnish determine whether or not we’re dealing with an HTTPS connection. You can of course let HAProxy set an X-Forwarded-Proto header that is passed to the application or that is even used by Varnish for cache variations.
You can also bind the primary listener directly to port 80 to handle HTTP traffic without having another proxy server in play. This will change the values of the IP variables (client.ip, server.ip, local.ip, remote.ip). The X-Forwarded-For header will stay intact.
HTTPS traffic will be handled by an SSL terminator.
For testing purposes I created a VCL file that sets a bunch of custom headers containing client.ip, server.ip, local.ip and remote.ip. These will be printed by our application to display all the IP information.
How to configure HAProxy
The configuration in HAProxy is very simple: just add the send-proxy-v2 keyword to the server definition on your backends and you’re good to go. The example below is a config that mixes HTTP and PROXY traffic.
Connections on port 80 are regular HTTP connections and connections on port 81 are PROXY connections.
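The send-proxy-v2 keyword selects version 2 of the protocol, which is binary rather than text. As a rough illustration, the Python sketch below builds and parses a version 2 header for a TCP over IPv4 connection as laid out in the protocol spec: a fixed 12-byte signature, a version/command byte, an address family/protocol byte, a 2-byte length, and then the addresses and ports. The helper names and addresses are hypothetical, and optional TLV extensions are ignored.

```python
import socket
import struct

# Fixed 12-byte signature that opens every PROXY v2 header.
SIGNATURE = b"\r\n\r\n\x00\r\nQUIT\n"
VERSION_2_PROXY = 0x21  # high nibble: version 2, low nibble: PROXY command
TCP_OVER_IPV4 = 0x11    # high nibble: AF_INET, low nibble: STREAM


def build_proxy_v2_ipv4(src_ip, dst_ip, src_port, dst_port):
    """Build a PROXY v2 header for a TCP over IPv4 connection."""
    addresses = (
        socket.inet_aton(src_ip)
        + socket.inet_aton(dst_ip)
        + struct.pack("!HH", src_port, dst_port)
    )
    header = struct.pack("!BBH", VERSION_2_PROXY, TCP_OVER_IPV4, len(addresses))
    return SIGNATURE + header + addresses


def parse_proxy_v2_ipv4(data):
    """Return ((src_ip, dst_ip, src_port, dst_port), remaining payload)."""
    if len(data) < 16 or data[:12] != SIGNATURE:
        raise ValueError("missing PROXY v2 signature")
    ver_cmd, fam_proto, length = struct.unpack("!BBH", data[12:16])
    if ver_cmd != VERSION_2_PROXY or fam_proto != TCP_OVER_IPV4 or length < 12:
        raise ValueError("only PROXY over TCP/IPv4 is handled in this sketch")
    if len(data) < 16 + length:
        raise ValueError("truncated PROXY v2 header")
    src, dst, src_port, dst_port = struct.unpack("!4s4sHH", data[16:28])
    info = (socket.inet_ntoa(src), socket.inet_ntoa(dst), src_port, dst_port)
    return info, data[16 + length:]


header = build_proxy_v2_ipv4("203.0.113.7", "192.0.2.10", 56324, 443)
info, rest = parse_proxy_v2_ipv4(header + b"GET / HTTP/1.1\r\n\r\n")
print(len(header), info)
```

For IPv4 the whole header is 28 bytes: 16 bytes of fixed fields followed by 12 bytes of addresses and ports.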
What about your application?
Your application just has to check the X-Forwarded-For header. If it is set and contains an IP, the web server is behind a proxy. If not, just rely on the regular remote address of the connection. If you want to be SSL aware, check if the X-Forwarded-Proto header is set.
The script below prints out the custom headers sent by Varnish. That way we can debug the IP addresses when we try different ways to access the script:
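As an illustration of that check, here is a minimal Python sketch. It is not the script from this post; the function names are made up, and `headers` is assumed to be a plain dict with exactly these header names as keys. The SSL check relies on the X-Forwarded-Proto header that HAProxy can be configured to set.

```python
def client_ip(headers, remote_addr):
    """Return the first X-Forwarded-For entry, or the connection's address."""
    forwarded = headers.get("X-Forwarded-For", "")
    chain = [part.strip() for part in forwarded.split(",") if part.strip()]
    return chain[0] if chain else remote_addr


def is_https(headers):
    """True when a proxy in front reported the original request as HTTPS."""
    return headers.get("X-Forwarded-Proto", "").lower() == "https"


# Behind the proxies: the header carries the user's address.
proxied = {"X-Forwarded-For": "203.0.113.7, 10.0.0.1", "X-Forwarded-Proto": "https"}
print(client_ip(proxied, "10.0.0.2"), is_https(proxied))

# Direct access: no headers, so fall back to the remote address.
print(client_ip({}, "198.51.100.23"), is_https({}))
```

Keep in mind that these headers can be sent by anyone, so this only makes sense when the application is reachable exclusively through proxies you control.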
- You can either access the application directly. In that case, all these headers will be empty.
- Or you can access the application by going through Varnish on port 6081
- You can also access the application through HAProxy using plain old HTTP
- And finally you can access the application through HAProxy via PROXY protocol connection to Varnish
Bottom line
I hope by now you understand why PROXY protocol support is so useful. In this example we focused on Varnish and HAProxy, but there is a lot more technology that supports the protocol.
By using the PROXY protocol you no longer have to worry about the origin IP and passing it through the different proxy layers of your setup.
I also created a video that summarizes the information of this blog post and that combines it with a demo. The demo is useful and also shows the output of the PHP script. I try a bunch of combinations and the output varies accordingly.