If you want to analyse, for example, Apache log files and split each line on spaces with the usual “split” method, you will find that split doesn’t respect quoted strings. Take a line like the one below:
192.168.2.1 - - [06/Mar/2012:10:02:22 +0100] "GET /2011/10/19/jncip-sec-exam/ HTTP/1.1" 200 3331 "-" "mm"
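To make the problem concrete, here is a minimal sketch of what plain str.split does to that sample line. The line is written with plain ASCII quotes and hyphens, as it would appear in a real log file; the variable names are only illustrative.

```python
# Sample access-log line, with ASCII quotes as found in a real log file.
line = ('192.168.2.1 - - [06/Mar/2012:10:02:22 +0100] '
        '"GET /2011/10/19/jncip-sec-exam/ HTTP/1.1" 200 3331 "-" "mm"')

# str.split() with no arguments splits on any run of whitespace
# and knows nothing about quoting.
fields = line.split()

# The quoted request ends up scattered over three separate fields.
print(fields[5:8])
# → ['"GET', '/2011/10/19/jncip-sec-exam/', 'HTTP/1.1"']
```

The request would have to be stitched back together from those three pieces, and the number of pieces changes whenever a quoted field (such as the user agent) contains a different number of spaces.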
You can’t easily get the HTTP request with split, because the quoted request string is broken into several pieces. There is a very nice module named shlex which splits strings on whitespace and treats quoted strings as single fields. Below is an example of my code which shows how you can fetch the HTTP request from an Apache log.
for line in log_fh.readlines():