Varnishlog: measure your Varnish cache performance
Varnishlog is an essential tool to measure the performance of your Varnish cache. Why add a reverse caching proxy like Varnish in front of your application, if you can’t analyze how it performs or behaves?
This blog post is an in-depth tutorial on how to use the varnishlog tool to retrieve Varnish Shared Memory Log information. The goal is to filter the output and turn it into meaningful output that gives insight into the behavior of your Varnish cache.
Follow along for an in-depth tutorial. Use the Table Of Contents if you want to read specific sections of this long blog post.
Contents
Varnishlog
Varnishlog is a utility that reads the contents of the in-memory log that Varnish provides. This log is often referred to as the Varnish Shared memory Log (VSL). Because it is stored in memory, it’s not a persistent log.
The varnishlog utility offers you a way to have a look at what’s currently happening within your Varnish instance. By default, the output is grouped by transaction and is tag based. A tag has a value and varnishlog allows you to query and filter the output by tags and values.
Have a look at the man pages for full reference about the usage of the varnishlog binary.
The varnishlog is just a utility, an interface. The actual logs are generated by the varnishd
program. By default the VSL is 81 megabytes in size. However, the size VSL size can be set to a custom value by adding the -l
option to your Varnish startup options.
The VSL output contains log lines that belong to transactions. Every log line has a tag and a corresponding value. Here’s an example extract:
* << Request >> 10973258 - Begin req 10973257 rxreq - Timestamp Start: 1501507281.942533 0.000000 0.000000 - Timestamp Req: 1501507281.942533 0.000000 0.000000 - ReqStart 127.0.0.1 59753 - ReqMethod GET - ReqURL / - ReqProtocol HTTP/1.1 - ReqHeader Host: blog.feryn.eu
What does this output mean?
- The
Begin
tag indicates the start of a transaction- The
req
keyword indicates that this is a request transaction - The 10973257 value is the identifier of the parent transaction
- The
rxreq
keyword explains that this request is in fact a client request from the user to the Varnish instance
- The
- The
Timestamp
tags identify the starttime and endtime of the request. These are unix timestamps. - The
ReqStart
tag identifies the client IP and the client port number. In our case, the Varnish server sites behind another proxy, hence the 127.0.0.1 value - The
ReqMethod
tag states that this is a HTTP GET request - The
ReqURL
tag identifies the URL that was requested - The
ReqProtocol
tag indicates which version of the HTTP protocol was used. - And finally the
ReqHeader
tag displays the HTTP request headers that were sent by the client
This is only an extract. An actual transaction contains a lot more information. We’ll dig deeper throughout this blog post. But you’ll find the full reference for VSL on the Varnish documentation site.
Transactions
Transactions are items of work. Every transaction in the log is identified by a transaction ID (VXID). Log lines that share the same VXID belong to the same transaction.
There are 2 kinds of transactions:
- Sessions
- Requests
A session refers to the TCP connection that the client makes to the server. When keep-alive is supported, one connection can process multiple requests.
The request can have different incarnations:
- The client request to Varnish
- The backed request from Varnish to the webserver
- An ESI subrequest from Varnish to the webserver
Transactions can depend on one another, which means there’s a certain hierarchy:
- A session groups multiple requests
- A request can trigger a backend request
- The backend response can contain an ESI tag that requires a subsequent ESI subrequest
Basically: transactions initiate other transactions. These dependencies are visualized in the image below:

You see the Begin tag for the different transactions in the log output. You’ll notice the different types of transactions, the parent and child VXIDs, and the reason why this is transaction is occurring.
Here’s what happens in the example above:
- The session is started and identified by VXID 1 and doesn’t depend on any other transaction
- The initial request has VXID 2 and depends on the session
- Because the request causes a cache miss, a backend fetch is required. This is identified by VXID 3, which depends on the client request (VXID 2)
It may seem weird that a transaction that occurs later in the flow, is displayed first. Because of the dependencies between transactions, there is some kind of encapsulation going on. The image above illustrates this quite nicely: transaction 3 is shown first, because it is the inner most transaction in the flow, encapsulated by its parent transactions.
Transaction grouping
By default, transactions are grouped by their transaction ID (VXID): transactions are displayed in their entirety and separately, including sessions, and the different kinds of requests. This results in independent transactions with limited context about their correlation.
By adding the -g
argument to the varnishlog
binary, we can specify how we want to group our output. The following grouping modes are available:
- VXID: group by varnishlog transaction id. This is the default behaviour.
- Session: group the output by HTTP connection and link the correlated requests
- Request: group by request, and omit session information
- Raw: turn every log line into its own transaction. This omits any session and request correlation
Let’s illustrate these grouping modes with an example. The diagram below show the content composition of a website that has a header, a navigation, a footer, and the main content block.

The header and navigation are composed through 2 ESI requests:
- An ESI call to render the /header resource
- Ans ESI call to render the /nav resource
The homepage also loads the /footer resource through an AJAX call.
We inspect the shared memory log by executing the following command:
varnishlog -i Begin,ReqUrl,Link,BereqURL
This command displays the start of transactions, the request URL, linked transactions, and the backend request URL when a backend call is made.
This results in the following output:
* << BeReq >> 98318 - Begin bereq 98317 fetch - BereqURL / * << BeReq >> 98320 - Begin bereq 98319 fetch - BereqURL /header * << Request >> 98319 - Begin req 98317 esi - ReqURL /header - Link bereq 98320 fetch * << BeReq >> 98322 - Begin bereq 98321 fetch - BereqURL /nav * << Request >> 98321 - Begin req 98317 esi - ReqURL /nav - Link bereq 98322 fetch * << Request >> 98317 - Begin req 98316 rxreq - ReqURL / - Link bereq 98318 fetch - Link req 98319 esi - Link req 98321 esi * << BeReq >> 98324 - Begin bereq 98323 fetch - BereqURL /footer * << Request >> 98323 - Begin req 98316 rxreq - ReqURL /footer - Link bereq 98324 fetch * << Session >> 98316 - Begin sess 0 HTTP/1 - Link req 98317 rxreq - Link req 98323 rxreq
As you can see, there is no grouping, no hierarchy, an no indentation. Thanks to the Link
tag, we know which requests are triggered by what session, and which backend fetches are triggered by what request. But it’s not really intuitive.
We can also group by session. This would require us to use the following command:
varnishlog -i Begin,ReqUrl,Link,BereqURL -g session
The output would be the following:
* << Session >> 14 - Begin sess 0 HTTP/1 - Link req 65539 rxreq - Link req 65545 rxreq ** << Request >> 65539 -- Begin req 14 rxreq -- ReqURL / -- Link bereq 65540 fetch -- Link req 65541 esi -- Link req 65543 esi ** << Request >> 65545 -- Begin req 14 rxreq -- ReqURL /footer -- Link bereq 65546 fetch *** << BeReq >> 65540 --- Begin bereq 65539 fetch --- BereqURL / *** << Request >> 65541 --- Begin req 65539 esi --- ReqURL /header --- Link bereq 65542 fetch *** << Request >> 65543 --- Begin req 65539 esi --- ReqURL /nav --- Link bereq 65544 fetch *** << BeReq >> 65546 --- Begin bereq 65545 fetch --- BereqURL /footer **** << BeReq >> 65542 ---- Begin bereq 65541 fetch ---- BereqURL /header **** << BeReq >> 65544 ---- Begin bereq 65543 fetch ---- BereqURL /nav
In the example above, there’s indentation, and you can clearly see which session contains what requests, and which subrequests are triggered. This is the actual hierarchy of the example above:

When you group transactions based on the request using
varnishlog -g request
, session information will be omitted. When you perform raw grouping usingvarnishlog -g raw
, every log record is a transaction on its own. In this case there is no correlation between log records.
The VCL flow
Shared memory logs only provide meaning when you understand how they represent the VCL flow. Varnish uses the Varnish Configuration Language to extend its default behavior.
Here’s the diagram containing the flow:

It is essential that your understand the behavior, but that’s material for a blog post on its own. Basically, you should understand that Varnish receives requests and tries to return the cached response from memory, rather than fetching it from your backend.
Here’s a brief summary of the decision making process:
- Receive a request from the client
- Determine whether or not the request is cacheable
- If the request is not cacheable, directly fetch the response from the backend and feed it back to the client
- If the request is cacheable, look the object up in cache by composing a lookup hash
- If the object is found, use the object date to compose the response
- If the object is not in cache, fetch the response from the backend
- Store the object in cache if it is cacheable
- Return the response to the client
The differente stages of the flow are identified by the corresponding hook names. These names are useful to know when you come across them in the memory logs, as they provide context.
So far we’ve talked about loglines, about the correlation between log lines and transactions, and the correlation between transactions. In the previous examples, we already hinted towards tags. I showed you some basic tags like Begin, Link, ReqUrl, ReqHeader. Although these tags were useful in the context of transaction identification, the output gives little meaning from a broader perspective.
It’s about time we introduce some more tags and group them according to their meaning. The full reference can be found on the Varnish documentation website.
There are different kinds of tags:
- Request tags
- Response tags
- Backend tags
- Backend request tags
- Backend response tags
- Object tags
- VCL behavior tags
- Miscellaneous tags
Request tags are used to identify the different bits and pieces of an incoming HTTP request.
- ReqStart: indicates the start of request processing
- ReqMethod: the HTTP method used to perform the request
- ReqURL: the URL that is addressed by the request
- ReqProtol: the HTTP version used by the request
- ReqHeader: a set of request headers that are passed.
You’ll find these tags in the log lines below:
- ReqStart 127.0.0.1 56312 - ReqMethod GET - ReqURL / - ReqProtocol HTTP/1.1 - ReqHeader Host: localhost - ReqHeader Connection: keep-alive - ReqHeader Upgrade-Insecure-Requests: 1 - ReqHeader User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36 - ReqHeader Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8 - ReqHeader Accept-Encoding: gzip, deflate, br - ReqHeader Accept-Language: nl,en-US;q=0.8,en;q=0.6 - ReqHeader X-Forwarded-For: 127.0.0.1
Response tags identify log lines that give insight about the HTTP response that is returned to the client by Varnish. Here’s an overview of tags:
- RespProtocol: the HTTP version that is used in the response
- RespStatus: the HTTP response status code that indicates the status of the response
- RespReason: the meaning of the HTTP response status code
- RespHeader: a set of HTTP response headers that is returned by Varnish, primarily based on the HTTP response headers that were returned by the backend
- RespProtocol HTTP/1.1 - RespStatus 200 - RespReason OK - RespHeader Host: localhost - RespHeader Cache-Control: public, s-maxage=500 - RespHeader Date: Tue, 01 Aug 2017 08:56:44 GMT - RespHeader ETag: "c5afddc587599a72d467caca23e980bf" - RespHeader Vary: Accept-Language - RespHeader Content-Length: 3098 - RespHeader Content-Type: text/html; charset=UTF-8 - RespHeader X-Varnish: 32770 - RespHeader Age: 10
The example above returns a simple page where no errors occur, hence the 200 status code. It’s clear that this response is an HTTP 1.1 response. The response headers that the backend returned, are included, but Varnish also adds a couple of extra headers:
- X-Varnish: the VXID that can be used to retrieve this transaction from the logs
- Age: the amount of time the object has been in the cache
When a connection to the backend is made, the information about the backend is logged, as shown by the log extracts below:
- BackendOpen 19 boot.default 127.0.0.1 8080 127.0.0.1 62552 - BackendClose 19 boot.default
In this case we’re setting up a connection on the localhost to a webserver that is running on port 8080. The local port on Varnish that is used is port 62552. The backend is identified as “default”, and was loaded at boot time. Later on we’re closing the connection to that named backend.
In one of the next segments, I’ll show you how to monitor backend health using Backend_health
tags.
There are other backend tags available, you’ll find them on the Varnish documentation page.
When an object cannot be served from cache, Varnish makes a backend request. The backend request object contains the data of the original request object, but with potentially updated values. Typical backend request tags are:
- BereqMethod
- BereqURL
- BereqProtocol
- BereqHeader
Again, there’s more, but I won’t cover them all.
It will come as no surprise that the backend response tags are almost identical to the response tags. A backend response object contains payload and headers that were returned by the backend. The value of a backend response will be stored in a cache object, and later on copied into a response object.
These are the most common backend response tags:
- BerespProtocol
- BerespStatus
- BerespReason
- BerespHeader
Object tags represent cached objects. The value of a cache object will primarily come from a backend response. Here are the object tags, which are again quite similar to the response tags:
- ObjProtocol
- ObjStatus
- ObjReason
- ObjHeader
When examining logs, the input and output data is significant, but the behavior of the cache is just as important. The VCL related tags are injected at different stages in the logs to indicate which part of the flow has been reached by the transaction. Here are some example tags:
- VCL_Error: log error messages that were generated during the VCL execution
- VCL_Log: custom log messages that were inserted in the VCL by calling the
std.log()
function somewhere in the VCL code - VCL_acl: the results of an access control list validation that was set in the VCL code
- VCL_call: contains the name of the VCL method that is being executed. This method represents a state in the VCL flow.
- VCL_return: logs the VCL method termination statement. This indicates which decision the VCL makes and implies a transition to the next state.
If we perform another varnishlog
call on our example website, where we only display the URL, the backend request URL, and the relevant VCL tags, the output could be the following:
* << Request >> 5 - ReqURL / - VCL_call RECV - VCL_return hash - VCL_call HASH - VCL_return lookup - VCL_call MISS - VCL_return fetch - VCL_call DELIVER - VCL_return deliver ** << BeReq >> 6 -- BereqURL / -- VCL_call BACKEND_FETCH -- VCL_return fetch -- VCL_call BACKEND_RESPONSE -- VCL_return deliver
The 2 transactions give a clear indication of what’s going on:
- The request for the homepage is received (VCL_call RECV)
- We continue the flow and decide to compose a lookup hash (VCL_return hash)
- The hash is composed in the VCL_Hash method (VCL_call HASH)
- Once the hash is composed, we look the object up in cache (VCL_return lookup)
- Unfortunately, we didn’t find it, so we end up with a miss (VCL_call MISS)
- This means we need to fetch the data from the backend (VCL_return fetch)
- We continue to the second transaction where we execute a backend fetch for the homepage (VCL_call BACKEND_FETCH)
- We receive the fetched data (VCL_return fetch)
- We examine the backend response and store it in cache (VCL_call BACKEND_RESPONSE)
- We deliver the HTTP response back to the client thread (VCL_return deliver)
- In the main thread we receive the HTTP response from the backend and we decide to deliver the data to the end-user (VCL_call DELIVER)
- Eventually the HTTP response is returned to the end-user (VCL_return deliver)
The example above is quite extensive, because the result is a cache miss, which requires backend access. When the object would be in cache, the logs could look as follows:
* << Request >> 6 - ReqURL / - VCL_call RECV - ReqURL / - VCL_return hash - VCL_call HASH - VCL_return lookup - VCL_call HIT - VCL_return deliver - VCL_call DELIVER - VCL_return deliver
This is the decision make process for this request:
- The request for the homepage is received (VCL_call RECV)
- We continue the flow and decide to compose a lookup hash (VCL_return hash)
- The hash is composed in the VCL_Hash method (VCL_call HASH)
- Once the hash is composed, we look the object up in cache (VCL_return lookup)
- Luckily, the object is stored in cache, and we end up with a hit (VCL_call HIT)
- We receive the object from cache, and we decide to deliver the data to the end-user (VCL_return deliver)
- We enter the delivery stage where modifications to the output can still happen (VCL_call DELIVER)
- Eventually the HTTP response is returned to the end-user (VCL_return deliver)
There are plenty of other scenarios, but a hit and a miss are the most essential scenarios.
The
VCL_call
andVCL_return
tags are, in my opinion, the most useful tags. They provide meaning and context. Without them, it’s sometimes hard to determine what happened to certain requests or responses. Use them in conjunction with request and backend response information.
The Varnish Shared Memory Logs also contain some tags that are specifically cache related.
The ExpBan tag
The ExpBan
tags is used when the VCL bans an object from cache, a log line with the ExpBan
appears containing the VXID of the parent transaction.
When we would ban the /header
resource and call the varnishlog -i "ReqUrl,ExpBan"
command, the output could look like this:
* << Request >> 32789 - ReqURL /header - ExpBan 98355 banned lookup
The Hit
tag appears when a transaction results in a cache hit. The VXID of the parent transaction is returned. By executing varnishlog -i "ReqUrl,Hit"
, we’d get the following output:
* << Request >> 98369 - ReqURL /footer - Hit 65597
The HitPass tag
The HitPass
tag is used for log lines that are negatively cached in the Hit-For-Pass cache. This is caused when the vcl_backend_response
method inspects the backend responds and decides the response cannot be cached, for a variety of reasons. This means the decision not to cache is cached in a special Hit-For-Pass cache. This avoids unnecessary request queueing.
When you encounter a HitPass
tag appears with your request tags, this means the request will bypass the rest of the decision making process, and will immediately be sent to the backend. Here’s an example:
** << Request >> 6 -- ReqURL /footer -- HitPass 3
The log extract above indicates that the response for the /footer
resource was negatively cached. For the next 120 seconds, Varnish will bypass the decision whether or not to cache, and serve the data directly from the backend, regardless of other criteria.
The TTL tag
The TTL
tag is used to log the Time-To-Live that is assigned to an object that is about to be stored in cache. Here’s the format of the value for this tag:
%s %d %d %d %d [ %d %d %u %u ] | | | | | | | | | | | | | | | | | +- Max-Age from Cache-Control header | | | | | | | +---- Expires header | | | | | | +------- Date header | | | | | +---------- Age (incl Age: header value) | | | | +--------------- Reference time for TTL | | | +------------------ Keep | | +--------------------- Grace | +------------------------ TTL +--------------------------- "RFC" or "VCL"
The TTL
value provides a substantial amount of information regarding the decision making process in terms of the Time-To-Live:
If the appropriate Cache-control
or Expires
headers are provided, an order of precedence is respected:
- If the VCL has an explicit
beresp.ttl
value, this value will be the TTL - If the HTTP response has a
Cache-control
header containing ans-maxage
expression, this value becomes the TTL - If the HTTP response has a
Cache-control
header containing anmax-age
expression, this value becomes the TTL - If the HTTP response has an
Expires
header, this value becomes the TTL - If none of these are present, the default TTL is used, which is 2 minutes
If the TTL is set based on HTTP response headers, the value for the first column is RFC
. If the TTL is set by VCL, the value is VCL
.
The effective Time-To-Live is stored in the second column of the log line.
If the VCL has specific grace, or keep values, this is added to the Time-To-Live, and gives it some extra leeway. These values are added as the third and fourth column.
The fifth column contains the current timestamp.
When we’re dealing with a TTL that was set by the VCL, the last four columns aren’t displayed. Here’s such an example:
-- TTL VCL 120 10 0 1501597242
Let’s break it down:
- The TTL is decided by the VCL
- The effective TTL is 120 seconds (2 minutes)
- There’s 10 seconds of
Grace
in case the backend is down when the object expires - There’s no
Keep
value, which is the grace equivalent for conditional requests - The time of processing is 1501597242, which converts to the following timestamp: 2017-08-01 14:20:42
And here’s an example when the backend sets the TTL:
-- RFC 500 10 -1 1501598872 1501598872 1501598872 0 500
Let’s analyze the output:
- The backend sets the TTL to 500 seconds
- There’s 10 seconds of grace
- There’s no keep value
- The time of processing is 2017-08-01 14:47:52, which is also the value for the age, and the
Date
response header - There is no expires header
- However, there is a
Cache-Control
header that sets a TTL of 500. In this case this is through ans-maxage
expression
If you would add a custom Age
header to your HTTP response, you would in fact mess with your Time-To-Live: the amount of seconds that is stated would be deducted from your TTL. In the example below the response contains an Age=3
header which makes the object 3 seconds old from the go.
Instead of a 500 second TTL, the object will only be cached for 497 seconds. This is reflected in the Age
column, which is off by 3 seconds:
-- RFC 500 10 -1 1501598872 1501598869 1501598872 0 500
The Begin tag
The Begin
talk isn’t really new to us, I mentioned the tag earlier in this post. This tag indicates the beginning of a transaction, staging which kind of transaction it is, to which parent transaction it belongs, followed by a bit more context.
Here are a couple of examples:
- Begin bereq 98317 fetch - Begin req 98317 esi - Begin req 98316 rxreq - Begin sess 0 HTTP/1
As mentioned before: the transaction is either a session, a request, or a backend request. Backend requests could be regular requests, or ESI requests.
The Link tag
Whereas the Begin
tag links to its parent VXID, the Link
tag expresses links to a child VXID. Again, a couple of examples:
- Link bereq 98312 fetch - Link req 98311 rxreq - Link req 98318 esi
The Timestamp tag
At several points in the log, Timestamp
tags will occur. They contain timing information. This is the format:
%s: %f %f %f | | | | | | | +- Time since last timestamp | | +---- Time since start of work unit | +------- Absolute time of event +----------- Event label
Here’s an example containing timestamps:
* << Request >> 65539 - Timestamp Start: 1501601912.758662 0.000000 0.000000 - Timestamp Req: 1501601912.758662 0.000000 0.000000 - Timestamp Fetch: 1501601912.806733 0.048071 0.048071 - Timestamp Process: 1501601912.806750 0.048088 0.000017 - Timestamp Resp: 1501601912.806787 0.048125 0.000037 ** << BeReq >> 65540 -- Timestamp Start: 1501601912.758753 0.000000 0.000000 -- Timestamp Bereq: 1501601912.758952 0.000199 0.000199 -- Timestamp Beresp: 1501601912.806677 0.047924 0.047725 -- Timestamp BerespBody: 1501601912.806749 0.047996 0.000072
What can we learn from this example?
- Fetching the data took 0.048071 seconds
- Processing the backend response took 0.000017 seconds
- The full unit of work was finished in 0.048125 seconds
When we dig deeper, and analyze the backend request, we learn the following lessons:
- Sending the backend request to the backend took 0.000199 seconds
- It took the backend 0.047725 seconds to send the response
- Fetching the body from the backend took 0.000072 seconds
If you just want to know how long the full transaction lasted, the very last occurrence of the Timestamp
in the request suffices. The other occurrences are useful if you want to know where a possible delay happened.
Filtering output
At this point, we have figured out how transactions work and which tags can occur in our log lines. When you run the varnishlog
command, the unfiltered output tends to get very noise on a system that gets a reasonable amount of traffic.
The Asciicast below shows how it can quickly become a mess:
Let’s face it: most of the time, you don’t need to see all the data. You’ll filter the output, and only display the relevant tags for your use case.
Varnishlog offers 4 basic ways to filter your output:
- Include specific tags via the
-i
option - Exclude specific tags via the
-x
option - Include tags by regular expression match via the
-I
option - Exclude tags by regular expression match via the
-X
option
The most common type of filtering I do, is to include the URL of a request and its VCL behavior. The following command would do the trick for you:
varnishlog -i ReqUrl,VCL_call,VCL_return -g session
This could be the output you get:
* << Session >> 252394 ** << Request >> 252395 -- ReqURL / -- VCL_call RECV -- VCL_return hash -- VCL_call HASH -- VCL_return lookup -- VCL_call HIT -- VCL_return deliver -- VCL_call DELIVER -- VCL_return deliver *** << Request >> 252397 --- ReqURL /header --- VCL_call RECV --- VCL_return hash --- VCL_call HASH --- VCL_return lookup --- VCL_call HIT --- VCL_return deliver --- VCL_call DELIVER --- VCL_return deliver *** << Request >> 252399 --- ReqURL /nav --- VCL_call RECV --- VCL_return hash --- VCL_call HASH --- VCL_return lookup --- VCL_call HIT --- VCL_return deliver --- VCL_call DELIVER --- VCL_return deliver *** << BeReq >> 252396 --- VCL_call BACKEND_FETCH --- VCL_return fetch --- VCL_call BACKEND_RESPONSE --- VCL_return deliver **** << BeReq >> 252398 ---- VCL_call BACKEND_FETCH ---- VCL_return fetch ---- VCL_call BACKEND_RESPONSE ---- VCL_return deliver **** << BeReq >> 252400 ---- VCL_call BACKEND_FETCH ---- VCL_return fetch ---- VCL_call BACKEND_RESPONSE ---- VCL_return deliver
Including multiple tags can either be done by adding multiple -i
parameter, or by passing a comma-separated list of values.
These values can also be globs. We could short the command above using globs:
varnishlog -i "ReqUrl,VCL_*" -g session
We can do the exact opposite, and exclude tags we don’t care about. Let’s combine tag inclusion and exclusion in the following example:
varnishlog -i "Req*" -x ReqHeader,ReqUnset
In this example we include all tags that start with req, but we want to exclude the very verbose request headers, and potential unset actions for the request. This is our output:
* << Request >> 314125 - ReqStart 127.0.0.1 64585 - ReqMethod GET - ReqURL / - ReqProtocol HTTP/1.1 - ReqAcct 476 0 476 311 0 311 * << Request >> 314126 - ReqStart 127.0.0.1 64585 - ReqMethod GET - ReqURL /footer - ReqProtocol HTTP/1.1 - ReqAcct 370 0 370 309 0 309
Makes sense, right?
A header that occurs a lot in our output is the request header. Quite often you don’t want to have all the request headers in your output, just that one will do. The -I
option allows you to include log lines by regular expression match.
Image you’re interested the Accept-Language
value for a specific request, you can use the following command to get the URL and the language:
varnishlog -I "reqheader:Accept-Language" -i requrl
What we’re doing is including the ReqUrl
tag, and including all ReqHeader
tags that match Accept-Language
.
This is the output:
* << Request >> 374378 - ReqURL / - ReqHeader Accept-Language: nl,en-US;q=0.8,en;q=0.6 * << Request >> 374379 - ReqURL /footer - ReqHeader Accept-Language: nl,en-US;q=0.8,en;q=0.6
Similar deal here, but for exclusion. Image you want to see the HTTP response headers that will be returned to the client, but exclude the custom headers that start with an x- or X-.
The regular output, without the exclusion would be:
* << Request >> 374384 - ReqURL /footer - RespHeader Host: localhost - RespHeader X-Powered-By: PHP/7.0.15 - RespHeader Cache-Control: public, s-maxage=500 - RespHeader Date: Wed, 02 Aug 2017 11:27:21 GMT - RespHeader ETag: "d47ac09f5351f8f4c97c99ef5b3d2ecd" - RespHeader Vary: Accept-Language - RespHeader Content-Length: 80 - RespHeader Content-Type: text/html; charset=UTF-8 - RespHeader X-Varnish: 374384 15367 - RespHeader Age: 334 - RespHeader Via: 1.1 varnish-v4 - RespHeader Connection: keep-alive
Then we add the exclusion:
varnishlog -i RespHeader,ReqUrl -X "RespHeader:(x|X)-"
And this is the result:
* << Request >> 374384 - ReqURL /footer - RespHeader Host: localhost - RespHeader Cache-Control: public, s-maxage=500 - RespHeader Date: Wed, 02 Aug 2017 11:27:21 GMT - RespHeader ETag: "d47ac09f5351f8f4c97c99ef5b3d2ecd" - RespHeader Vary: Accept-Language - RespHeader Content-Length: 80 - RespHeader Content-Type: text/html; charset=UTF-8 - RespHeader Age: 334 - RespHeader Via: 1.1 varnish-v4 - RespHeader Connection: keep-alive
Yet again, we’re combining multiple filtering options to get the ideal output.
All-in-one tag filter example
Let’s finish this section with an all-in-one example that combines the 4 kinds of filters:
varnishlog -i "RespHeader,Req*" -X "RespHeader:(x|X)-" -I "timestamp:Resp" -x reqprotocol,reqacct -g request
This command does the following:
- Include all response headers
- Include all log lines where the tag starts with req
- Exclude response headers that start with x-, or X-
- Include the timestamp, but only the part where the response timestamp is measured
- Exclude request protocol log lines, and the request accountancy log lines
This is the output:
* << Request >> 59383 - ReqStart 127.0.0.1 53195 - ReqMethod GET - ReqURL / - ReqHeader Host: localhost - ReqHeader Connection: keep-alive - ReqHeader Cache-Control: max-age=0 - ReqHeader Upgrade-Insecure-Requests: 1 - ReqHeader User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36 - ReqHeader Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8 - ReqHeader Accept-Encoding: gzip, deflate, br - ReqHeader Accept-Language: nl,en-US;q=0.8,en;q=0.6 - ReqHeader If-None-Match: W/"27f341f8e459dd35f1087c55351cacda" - ReqHeader X-Forwarded-For: 127.0.0.1 - ReqUnset Accept-Language: nl,en-US;q=0.8,en;q=0.6 - ReqHeader accept-language: nl - ReqHeader Surrogate-Capability: key=ESI/1.0 - ReqUnset Accept-Encoding: gzip, deflate, br - ReqHeader Accept-Encoding: gzip - RespHeader Host: localhost - RespHeader Cache-Control: public, s-maxage=500 - RespHeader Date: Wed, 02 Aug 2017 11:46:49 GMT - RespHeader ETag: "27f341f8e459dd35f1087c55351cacda" - RespHeader Vary: Accept-Language - RespHeader Content-Length: 3098 - RespHeader Content-Type: text/html; charset=UTF-8 - RespHeader Age: 152 - RespHeader Via: 1.1 varnish-v4 - RespHeader ETag: W/"27f341f8e459dd35f1087c55351cacda" - RespHeader Connection: keep-alive - Timestamp Resp: 1501674561.472358 0.000068 0.000020
Powerful stuff!
VSL queries
So far we’ve been filtering the output, and only displaying log lines that match specific tags we’re interested in. The good thing is that we’re reducing the amount of output we’re getting by doing tag-based filtering.
But that still means all transactions are processed, and tags are stripped off. It doesn’t allow us to reduce the amount of transactions we’re getting.
That’s where VSL queries come into play.
The -q
option gives you the possibility to query for transactions that match patterns and values. The VSL query language can do this for you.
This is the format for VSL queries:
<record selection criteria> <operator> <operand>
The record selection criteria can be tags, taglists, tag prefixes, or even fields for tags prefixes. You can even decide at which transaction level you wish to query.
This is the format for the record selection criteria:
{level}taglist:record-prefix[field]
Here’s a list of operators you can use for record selection comparison:
==
the compared value equals the record value (numerical)!=
the compared value doesn’t equal the record value (numerical)<
the compared value is more than the record value (numerical)<=
the compared value is more than or equal the record value (numerical)- > the compared value is less than the record value (numerical)
>=
the compared value is less than or equal the record value (numerical)eq
the compared value equals the record value (string comparison)ne
the compared value doesn’t equal the record value (string comparison)~
the compared value matches the record value (regular expression)!~
the compared value doesn’t match the record value (regular expression)
VSL queries can be combined using
and
,or
, andnot
boolean functions
VSL query examples
Let’s throw some examples into the mix, to show you how it works:
varnishlog -i VCL_call,VCL_return -g request -q "ReqURL eq '/'"
This command returns the VCL_call
and VCL_return
values for the homepage.
This is the output:
* << Request >> 374400 - VCL_call RECV - VCL_return hash - VCL_call HASH - VCL_return lookup - VCL_call HIT - VCL_return deliver - VCL_call DELIVER - VCL_return deliver
We notice the homepage is served from cache, which is good news!
Another common example is displaying all URLs that aren’t stored in cache. They are either stored in the blacklist because they’re not cacheable, or they’re not yet in cache. The following command would display this information:
varnishlog -i ReqUrl -q "VCL_call eq 'MISS' or VCL_call eq 'PASS'"
A final example is a time measurement example:
varnishlog -i ReqUrl -I "Timestamp:Resp" -q "Timestamp:Resp[2] > 1.0"
This command displays the URL, and the response timestamp for all requests that take longer than 1 second. The query looks for the Timestamp
tag, and takes the second column for any occurence that has Resp
as the prefix. If that value exceeds 1 second, it gets reported.
Note that timestamp values are floats. Integers are invalid for this type of comparison
Miscellaneous command line options
Throught this article, a lot of useful varnishlog
command line options were covered. There are other options you’ll probably never use, but I want to mention some of them anyway.
Getting the full memory log
By default we read the tail end of the log, because this contains the latest information. If you add the -d
option, varnishlog will start at the beginning of the log. This means you’ll see older log items that are sent to stdout.
This could be useful to measure certain assumptions on a larger sample.
varnishlog -d
Writing the varnishlog output to a file
Varnishlog prints a continuously updated list of log items, or dumps the full log to stdout, depending on your output settings.
But varnishlog has the command line options to redirect the output to a file.
varnishlog -w file
The command above saves the varnishlog output to a file in binary format. This file can be replayed using the following command:
varnishlog -r file
You can also decide to append logs to the file by adding the -a
option:
varnishlog -a -w file
And finally you can decide to store the output in plain text by adding the -A
option:
varnishlog -A -a -w file
You can daemonize varnishlog to continuously log output to a file. You ‘ll add the -D
option and you can save the PID to a file by adding the -P
option. You can also add filtering criteria to continuously log items you care about:
varnishlog -i "ReqUrl,VCL_*" -D -a -A -w /var/log/varnish/custom.log -P /var/run/custom_varnishlog.pid
This command will make varnishlog run in the background and store human-readable logs in /var/log/varnish/custom.log
and append data. The PID of the daemon is stored in /var/run/custom_varnishlog.pid
.
Multiple instances
When you run multiple instances of Varnish on a server, the varnishd
process will use the -n
parameter to name the instance. Varnishlog can use that same option to identify the Varnish instances who’s shared memory log it should examine.
varnishlog -n myVarnishInstance
What about varnishtop?
All I’ve been talking about is varnishlog. It’s a great tool, but it just spits out logs. Varnishtop on the other hand, uses the Varnish Shared Memory Logs to create a continuously updated list of the most commonly occurring log entries.
It’s basically like the
top
command on Linux, but for Varnish.
It makes no sense what so ever to run varnishtop
without filters. Here’s a command to see the most popular URLs that are processed by Varnish:
varnishtop -i ReqUrl
This is the output:
You can use all the filters and queries that varnishlog and create meaningful output.
Possible scenario: show the most popular browser languages for users that hit the homepage. We’d use the following command:
varnishtop -I reqheader:Accept-Language -q "ReqUrl eq '/'"
And this would be the output:
list length 4 0.86 ReqHeader Accept-Language: en-US 0.80 ReqHeader Accept-Language: nl-NL,nl;q=0.8,en-US;q=0.6,en;q=0.4 0.54 ReqHeader Accept-Language: nl,en-US;q=0.8,en;q=0.6 0.39 ReqHeader Accept-Language: nl-BE
Why didn’t Varnish serve this request from cache?
90% of the times I use varnishlog
, it’s to figure out why a page is not cached. The decision whether not to cache happens at 2 stages:
- When receiving the request (in the
vcl_recv
method) - When receiving the backend response (in the
vcl_backend_response
method)
Each of these stages have their own set of criteria. The shared memory logs will help you figure out why Varnish didn’t cache your page.
This is the default behavior of the VCL in Varnish. If you extend the behavior with custom VCL scripting, you can change this rules and still decide to cache requests that were otherwise not cacheable.
The request is not cacheable
When Varnish receives an HTTP request from the client, it will check whether or not it is stateless. By default only stateless request are deemed cacheable.
Varnish will not cache if:
- The request is not a
GET
orHEAD
request - If there’s an
Authorization
header present - If there are cookies
The following command can help us out in our quest to determine why the page is not cached:
varnishlog -i "Req*,VCL*" -x ReqAcct,ReqStart -q "VCL_call eq 'PASS'"
Cookies
Here’s some output for a non-cacheable request:
* << Request >> 12 - ReqMethod GET - ReqURL / - ReqProtocol HTTP/1.1 - ReqHeader Host: localhost - ReqHeader User-Agent: curl/7.48.0 - ReqHeader Accept: */* - ReqHeader Cookie: myCookie=bla - ReqHeader X-Forwarded-For: 127.0.0.1 - VCL_call RECV - VCL_return pass - VCL_call HASH - VCL_return lookup - VCL_call PASS - VCL_return fetch - VCL_call DELIVER - VCL_return deliver
We’r specifically looking for requests that were passed to the backend. In the output we notice that a cookie was set. That’s why the page cannot be cached.
POST calls
Here’s anther scenario, where there are not cookies, but the request is still passed:
<< Request >> 32779 - ReqMethod POST - ReqURL / - ReqProtocol HTTP/1.1 - ReqHeader Host: localhost - ReqHeader User-Agent: curl/7.48.0 - ReqHeader Accept: */* - ReqHeader X-Forwarded-For: 127.0.0.1 - VCL_call RECV - VCL_return pass - VCL_call HASH - VCL_return lookup - VCL_call PASS - VCL_return fetch - VCL_call DELIVER - VCL_return deliver
Although there’s no cookie, the request is still passed to the backend. That’s because the request was sent using the POST
method. Post calls aren’t cacheable, because they imply a state change.
Authorization headers
Here’s another example of a passed request without cookies and with a GET
request:
* << Request >> 15 - ReqMethod GET - ReqURL / - ReqProtocol HTTP/1.1 - ReqHeader Host: localhost - ReqHeader Authorization: Basic dGhpanM6ZmVyeW4= - ReqHeader User-Agent: curl/7.48.0 - ReqHeader Accept: */* - ReqHeader X-Forwarded-For: 127.0.0.1 - VCL_call RECV - VCL_return pass - VCL_call HASH - VCL_return lookup - VCL_call PASS - VCL_return fetch - VCL_call DELIVER - VCL_return deliver
Again, the page is not cached. That’s because there’s an Authorization
header. Authorization implies state, and is not cached by Varnish.
The response is not cacheable
When a request is considered cacheable, but the object is not found in cache, a backend fetch happens because of the miss. When the vcl_backend_response
method receives the response from the backend, it examines the headers for non-cacheable values.
If it turns out that the response isn’t cacheable, Varnish will mark it as Hit-For-Pass for the next 2 minutes.
The following command lists URLs, VCL behavior, and backend response information for transactions that contain a HitPass
log line.
varnishlog -i ReqUrl,VCL_*,Beresp*,TTL -q "HitPass" -g request
Zero TTL
Imagine that this is the output:
* << Request >> 19010384 - ReqURL /my-url - VCL_call RECV - VCL_return hash - VCL_call HASH - VCL_return lookup - VCL_call PASS - VCL_return fetch - VCL_call DELIVER - VCL_return deliver ** << BeReq >> 19010385 -- VCL_call BACKEND_FETCH -- VCL_return fetch -- BerespProtocol HTTP/1.1 -- BerespStatus 200 -- BerespReason OK -- BerespHeader Date: Thu, 03 Aug 2017 08:15:22 GMT -- BerespHeader Server: Apache/2.4.10 (Debian) -- BerespHeader Last-Modified: Tue, 01 Aug 2017 07:21:00 GMT -- BerespHeader ETag: "5c0d-555abfd3f422f-gzip" -- BerespHeader Vary: Accept-Encoding -- BerespHeader Content-Encoding: gzip -- BerespHeader Cache-Control: max-age=0 -- BerespHeader Expires: Thu, 03 Aug 2017 08:15:22 GMT -- BerespHeader Content-Length: 7686 -- BerespHeader Content-Type: application/json -- TTL RFC 0 10 -1 1501748123 1501748123 1501748122 1501748122 0 -- VCL_call BACKEND_RESPONSE -- TTL VCL 120 10 0 1501748123 -- VCL_return deliver
What does this mean? The HTTP response for the /my-url
resource was not stored in cache, because it did not comply to the criteria that are validated in the vcl_backend_response
method.
In our case the Cache-Control: max-age=0
header that was returned by the webserver causes Varnish not to cache the object. It is marked as Hit-For-Pass, and subsequent requests are immediately sent to the backend.
We can retrieve this information from the BerespHeader
, but the last field of the first TTL
occurrence also show that the max-age
or s-maxage
value is zero.
Another reason could be that a cookie was set in the backend. Varnish doesn’t cache responses that set cookies, as they imply state changes.
Here’s some output:
* << Request >> 65551 - ReqURL /set-cookie - VCL_call RECV - VCL_return hash - VCL_call HASH - VCL_return lookup - VCL_call PASS - VCL_return fetch - VCL_call DELIVER - VCL_return deliver ** << BeReq >> 65552 -- VCL_call BACKEND_FETCH -- VCL_return fetch -- BerespProtocol HTTP/1.1 -- BerespStatus 200 -- BerespReason OK -- BerespHeader Cache-control: s-maxage=10 -- BerespHeader Set-Cookie: myCookie=bla -- BerespHeader Content-type: text/html; charset=UTF-8 -- BerespHeader Date: Thu, 03 Aug 2017 08:39:04 GMT -- TTL RFC 10 10 -1 1501749545 1501749545 1501749544 0 10 -- VCL_call BACKEND_RESPONSE -- TTL VCL 120 10 0 1501749545 -- VCL_return deliver
The first occurrence of the TTL
tag shows that there is a valid max-age
or s-maxage
value. The value is 10, and this is also reflected by BerespHeader Cache-control: s-maxage=10
. However, the Hit-For-Pass still occurs, and that’s because of the cookie that was set. This is reflected in the logs by BerespHeader Set-Cookie: myCookie=bla
.
If you set the Time-To-Live to zero in your
Cache-Control
header, or you addprivate
,no-cache
, orno-store
expressions to yourCache-Control
header, the object will not be cached. This also applies to anExpires
header that was set to a time in the past. The same thing happens when you set a cookie. These are the most common reasons why Varnish doesn’t cache objects where the request look like it was cacheable.
Summary
By now, you should realize that the Varnish Shared Memory Logs (VSL) are tremendously useful. You can hook into what’s currently happening on the server. Both the varnishlog
and varnishtop
binaries are at your service to provide the necessary insight.
Both utilities have filter options and a query language to display the exact information you need.
The Varnish documentation site has a couple of pages that are dedicated to this topic:
- The VSL-query reference page
- The VSL reference page
- The varnishlog reference page
- The varnishtop reference page
If you really want to leverage these tools, you should have a decent knowledge of the internal flow within Varnish, the default VCL behavior, and the VCL methods. Any custom VCL code that your Varnish instances uses, will also influence this behavior.
Read my book
If you want to learn more about Varnish, I suggest you read my book. It’s published by O’Reilly.
In this book I explain:
- How Varnish works,
- How you can configure the server
- How Varnish respects HTTP best practices
- What VCL is and how you can write your own VCL file
- How to increase your hit rate
- How to invalidate your cache
- What Varnish can mean for your business