Process AWS ELB logs with Logstash!
I recently ran into a problem with Logstash, where it would not correctly and consistently process AWS (Amazon Web Services) ELB (Elastic Load Balancer) logs from an S3 bucket.
ELB logs are the standard method of logging ELB usage and activity in AWS, and are very useful for identifying a wide variety of trends and activity.
I was determined to get ELB logs into Elasticsearch using Logstash, so I needed to figure out a few things…
NOTE: This is all correct at the time of this writing. If this changes, I will try to come back and update.
First, I gathered a couple of example log entries to process.
NOTE: I did this myself originally with my own log samples, but AWS has recently put up a few (4) useful log examples:
2015-05-13T23:39:43.945958Z my-loadbalancer 192.168.131.39:2817 10.0.0.1:80 0.000073 0.001048 0.000057 200 200 0 29 "GET http://www.example.com:80/ HTTP/1.1" "curl/7.38.0" - -
2015-05-13T23:39:43.945958Z my-loadbalancer 192.168.131.39:2817 10.0.0.1:80 0.000086 0.001048 0.001337 200 200 0 57 "GET https://www.example.com:443/ HTTP/1.1" "curl/7.38.0" DHE-RSA-AES128-SHA TLSv1.2
2015-05-13T23:39:43.945958Z my-loadbalancer 192.168.131.39:2817 10.0.0.1:80 0.001069 0.000028 0.000041 - - 82 305 "- - - " "-" - -
2015-05-13T23:39:43.945958Z my-loadbalancer 192.168.131.39:2817 10.0.0.1:80 0.001065 0.000015 0.000023 - - 57 502 "- - - " "-" ECDHE-ECDSA-AES128-GCM-SHA256 TLSv1.2
Taking the second entry (as it looks like an ideal example), I’ll break it down to a separate line per value; just remember that in the actual log this is all a single line:
2015-05-13T23:39:43.945958Z
my-loadbalancer
192.168.131.39:2817
10.0.0.1:80
0.000086
0.001048
0.001337
200
200
0
57
"GET https://www.example.com:443/ HTTP/1.1"
"curl/7.38.0"
DHE-RSA-AES128-SHA
TLSv1.2
I looked at the logs and did a little research to understand the ELB log format. According to the docs and samples, it looks like this (note the parts in quotes!):
timestamp name client:port backend:port request_processing_time backend_processing_time response_processing_time elb_status_code backend_status_code received_bytes sent_bytes "request" "user_agent" ssl_cipher ssl_protocol
Using these, I created a Logstash grok filter to process my logs:
%{TIMESTAMP_ISO8601:timestamp} %{NOTSPACE:elb_name} %{IP:elb_client_ip}:%{NUMBER:elb_client_port} %{IP:elb_backend_ip}:%{NUMBER:elb_backend_port} %{NUMBER:request_processing_time} %{NUMBER:backend_processing_time} %{NUMBER:response_processing_time} %{NUMBER:elb_status_code} %{NUMBER:backend_status_code} %{NUMBER:elb_received_bytes} %{NUMBER:elb_sent_bytes} %{QS:elb_request} %{QS:userAgent} %{NOTSPACE:elb_ssl_cipher} %{NOTSPACE:elb_ssl_protocol}
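For context, in the Logstash config this pattern just sits inside a grok filter block. Here’s a minimal sketch of how that might look (the date filter is optional; it maps the ELB timestamp onto the event’s @timestamp instead of the time Logstash read the line):

filter {
  grok {
    # the ELB pattern from above, kept on a single line
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{NOTSPACE:elb_name} %{IP:elb_client_ip}:%{NUMBER:elb_client_port} %{IP:elb_backend_ip}:%{NUMBER:elb_backend_port} %{NUMBER:request_processing_time} %{NUMBER:backend_processing_time} %{NUMBER:response_processing_time} %{NUMBER:elb_status_code} %{NUMBER:backend_status_code} %{NUMBER:elb_received_bytes} %{NUMBER:elb_sent_bytes} %{QS:elb_request} %{QS:userAgent} %{NOTSPACE:elb_ssl_cipher} %{NOTSPACE:elb_ssl_protocol}" }
  }
  date {
    # use the ELB timestamp as the event time
    match => [ "timestamp", "ISO8601" ]
  }
}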
Grok Filters
Grok filters are (at first) an odd-looking way of processing semi-structured data.
Grok uses ‘patterns’ to recognise data and filter it appropriately, and it always works from the start of the line.
There are plenty of guides and examples of how Grok works, so I won’t go into it here, but I’ll tell you how I worked it out. You can do this yourself, if you wish, to understand what’s going on.
I went to this awesome Grok Debugger site, put my sample log into the ‘input’ box and my grok filter in the ‘pattern’ box below it. Something important to note is that grok works from the beginning onward, so when your grok filter doesn’t work, simply strip patterns off the end until something matches. Alternatively, start with nothing, put the first pattern in (in my example that is %{TIMESTAMP_ISO8601:timestamp}), and go from there. When you get stuck, you can fiddle with the part you’re up to, knowing that the previous parts already work.
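For example, with the second sample log entry in the ‘input’ box, the pattern builds up roughly like this, checking the debugger output after each step:

%{TIMESTAMP_ISO8601:timestamp}
%{TIMESTAMP_ISO8601:timestamp} %{NOTSPACE:elb_name}
%{TIMESTAMP_ISO8601:timestamp} %{NOTSPACE:elb_name} %{IP:elb_client_ip}:%{NUMBER:elb_client_port}

…and so on, one piece at a time, until the whole line is matched.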
There is another useful site for this same purpose called Grok Constructor, which helps more with predicting and working out which pattern you are looking for.
Did this work?
Well, yes (sort of), but I started to see that there were a lot of logs that were not processed.
The logs that were not processed correctly, and were put into Elasticsearch with a tag of _grokparsefailure, were those with quotes (") and also the entries that switch between real values and null-value dashes (-). You can see examples of these in log entries 3 and 4 up above, where the "request" and "user_agent" fields are null values, represented by dashes.
Grok Filter V2!
So I did a new version of my grok filter, adding handling for the fields that can switch between a value and a null-value dash:
%{TIMESTAMP_ISO8601:timestamp}
%{NOTSPACE:elb_name}
%{IP:elb_client_ip}:%{NUMBER:elb_client_port}
%{IP:elb_backend_ip}:%{NUMBER:elb_backend_port}
%{NUMBER:request_processing_time}
%{NUMBER:backend_processing_time}
%{NUMBER:response_processing_time}
(?:%{NUMBER:elb_status_code}|-)
(?:%{NUMBER:backend_status_code}|-)
%{NUMBER:elb_received_bytes}
%{NUMBER:elb_sent_bytes}
"(?:%{QS:elb_request}|-)"
"(?:%{QS:userAgent}|-)"
%{NOTSPACE:elb_sslcipher}
%{NOTSPACE:elb_sslprotocol}
Same filter on 1 line:
%{TIMESTAMP_ISO8601:timestamp} %{NOTSPACE:elb_name} %{IP:elb_client_ip}:%{NUMBER:elb_client_port} %{IP:elb_backend_ip}:%{NUMBER:elb_backend_port} %{NUMBER:request_processing_time} %{NUMBER:backend_processing_time} %{NUMBER:response_processing_time} (?:%{NUMBER:elb_status_code}|-) (?:%{NUMBER:backend_status_code}|-) %{NUMBER:elb_received_bytes} %{NUMBER:elb_sent_bytes} "(?:%{QS:elb_request}|-)" "(?:%{QS:userAgent}|-)" %{NOTSPACE:elb_sslcipher} %{NOTSPACE:elb_sslprotocol}
The key difference here is the part that looks like this (first seen in field #8, elb_status_code):
(?:%{NUMBER:elb_status_code}|-)
This matches either the value or the null-value dash, and then continues.
This V2 filter has been working fine, and so far I’ve processed nearly 2 million ELB log entries into Elasticsearch. I’m displaying them in Kibana for analysis.
FINAL NOTE: I did notice that there are entries for special events that don’t match the log pattern.
In particular, when an ELB comes online, it spits out a log just like this:
Enable AccessLog for ELB: my-loadbalancer at 2016-01-01T15:13:39.298Z
These will be put into Elasticsearch with a tag of _grokparsefailure, as I indicated before, so you can still find them (or handle them with a Logstash conditional on that tag). These are still important events that herald a new ELB coming online, and they should be processed separately.
I’m investigating how to do this without disrupting the regular pattern, as it’s a rare occurrence.
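One idea I’m considering (just a sketch at this stage, not something I’ve settled on) is a second grok attempt that only runs on the failed events, so the main pattern stays untouched:

filter {
  if "_grokparsefailure" in [tags] {
    grok {
      # try to match the "new ELB coming online" notice separately
      match => { "message" => "Enable AccessLog for ELB: %{NOTSPACE:elb_name} at %{TIMESTAMP_ISO8601:timestamp}" }
      # tag it for easy searching, and clear the failure tag when it matches
      add_tag => [ "elb_accesslog_enabled" ]
      remove_tag => [ "_grokparsefailure" ]
    }
  }
}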
Apr 4, 2016 at 10:37 am
Thank you for your post. It’s just what I needed to kick-start my ELB log processing endeavors.
I think the quotes added to elb_request and userAgent in the V2 filter shouldn’t be there; they break matching for me. Either that, or another pattern should be used instead of QS.
"(?:%{QS:elb_request}|-)"
"(?:%{QS:userAgent}|-)"
should be
(?:%{QS:elb_request}|-)
(?:%{QS:userAgent}|-)
Apr 4, 2016 at 1:21 pm
I did make an updated version, but I don’t think I uploaded it. I should really pick up on this, and put up the updated version. I’m glad it helped you! ☺
Apr 5, 2016 at 9:19 am
One more thing I found useful: coercing the data type for the numeric fields. By default Grok seems to interpret all fields as strings, so even the numbers end up as type string in Elasticsearch, meaning you can’t apply functions like max or average to them.
Found the information here: https://www.elastic.co/blog/little-logstash-lessons-part-using-grok-mutate-type-data
Here is the updated filter:
%{TIMESTAMP_ISO8601:timestamp}
%{NOTSPACE:elb_name}
%{IP:elb_client_ip}:%{NUMBER:elb_client_port:int}
%{IP:elb_backend_ip}:%{NUMBER:elb_backend_port:int}
%{NUMBER:request_processing_time:float}
%{NUMBER:backend_processing_time:float}
%{NUMBER:response_processing_time:float}
(?:%{NUMBER:elb_status_code:int}|-)
(?:%{NUMBER:backend_status_code:int}|-)
%{NUMBER:elb_received_bytes:int}
%{NUMBER:elb_sent_bytes:int}
(?:%{QS:elb_request}|-)
(?:%{QS:userAgent}|-)
%{NOTSPACE:elb_sslcipher}
%{NOTSPACE:elb_sslprotocol}
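(The same conversion can also be done after the grok match with a mutate filter, if you’d rather keep the grok pattern untyped; something along these lines:)

mutate {
  # convert numeric fields after the grok match (alternative to the :int / :float suffixes)
  convert => {
    "elb_received_bytes" => "integer"
    "elb_sent_bytes" => "integer"
    "request_processing_time" => "float"
    "backend_processing_time" => "float"
    "response_processing_time" => "float"
  }
}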
Apr 7, 2016 at 10:01 am
I’ve put an update up with some more details of what I’m doing here:
https://followkman.com/2016/04/07/process-aws-elb-logs-with-logstash-update/
You might find it useful. 😉
Jun 29, 2016 at 7:58 pm
Great post. Saved me a bunch of time. Thanks! As others pointed out, you have to remove the quotes from the request and user agent lines. I also added null "-" handling for the SSL matches as well. So my final result looks like:
%{TIMESTAMP_ISO8601:timestamp}
%{NOTSPACE:elb_name}
%{IP:client_ip}:%{NUMBER:client_port}
%{IP:backend_ip}:%{NUMBER:backend_port}
%{NUMBER:request_processing_time:float}
%{NUMBER:backend_processing_time:float}
%{NUMBER:response_processing_time:float}
(?:%{NUMBER:elb_status_code:int}|-)
(?:%{NUMBER:backend_status_code:int}|-)
%{NUMBER:elb_received_bytes:int}
%{NUMBER:elb_sent_bytes:int}
(?:%{QS:elb_request}|-)
(?:%{QS:userAgent}|-)
(?:%{NOTSPACE:elb_sslcipher}|-)
(?:%{NOTSPACE:elb_sslprotocol}|-)
Jun 29, 2016 at 8:15 pm
Hey! Thanks for commenting!
I think I may have fixed this issue in an updated grok filter that I also uploaded here, but I’m not sure if that change is included. I’ll add a link to this article to point there as well.
I’m interested to know if it helps you, as it works verbatim for me.
Still, happy to be of assistance to someone. 🙂
Mar 21, 2017 at 10:37 am
I am just curious: how did you get the ELB logs from the S3 bucket into the AWS Elasticsearch service?
Jun 19, 2017 at 5:18 pm
One option is to use the s3 input plugin to ingest the data into Logstash.
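A minimal sketch of that approach might look like this (the bucket name, region and endpoint are just placeholders):

input {
  s3 {
    bucket => "my-elb-logs-bucket"   # placeholder bucket name
    region => "us-east-1"            # placeholder region
    prefix => "AWSLogs/"             # read only the ELB log prefix
  }
}
output {
  elasticsearch {
    hosts => ["https://my-es-domain.us-east-1.es.amazonaws.com:443"]  # placeholder AWS Elasticsearch endpoint
    index => "elb-logs-%{+YYYY.MM.dd}"
  }
}

Depending on the domain’s access policy, a signed output plugin (such as amazon_es) may be needed instead of the plain elasticsearch output.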