
Lab Notes


Downloading and Parsing IIS Logs from Windows Azure

Jan 22, 2010 In Development By Karsten Januszewski

In this post, I’ll tell you how to get IIS logs out of Windows Azure and onto your local machine so you can do analysis. I wrote a program called AzureLogFetcher to enable this process.

I have already written about how to get logging and diagnostics working with Windows Azure so that you can access both your IIS logs and diagnostic information.  In this post, I’ll show you how to get this information out of Windows Azure and onto your local machine so you can analyze the logs. I’ll also show some of the queries I run against the IIS logs using the most excellent Log Parser tool, a free program from Microsoft.
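For reference, the heart of that earlier setup is a few lines in the web role's OnStart. Here's a sketch against the v1.0 SDK (the "DiagnosticsConnectionString" setting name is whatever your service configuration uses, and the hourly transfer period is just an example):

using System;
using Microsoft.WindowsAzure.Diagnostics;
using Microsoft.WindowsAzure.ServiceRuntime;

public class WebRole : RoleEntryPoint
{
    public override bool OnStart()
    {
        // Ship IIS logs (a default directory-based data source for web roles)
        // to blob storage once an hour.
        var config = DiagnosticMonitor.GetDefaultInitialConfiguration();
        config.Directories.ScheduledTransferPeriod = TimeSpan.FromHours(1);
        DiagnosticMonitor.Start("DiagnosticsConnectionString", config);
        return base.OnStart();
    }
}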

So, here’s the crux of the problem: you’ve got your website running in Azure. You’ve got Azure transferring the IIS logs across all instances of your site to your Azure storage account. But now, you need to download those logs and do something with them.  There are tools out there that allow you to browse and download these files, but this gets laborious and tedious—after all, there’s a new log file written every hour.  Using a GUI tool to do the job means you have to manually select the log files and then copy them to your local machine. Wouldn’t it be nice if there were a tool that automatically did this for you?

I took a look and couldn’t find any tools like this. So, I chatted with Ryan Dunn, Windows Azure evangelist extraordinaire, and we cooked up AzureLogFetcher, a command-line application that downloads all your IIS logs from Azure and deletes them from the cloud. You can wrap it in a .bat file and schedule it with Task Scheduler to run automatically.
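For instance (the file names, paths, and schedule below are invented for illustration):

rem fetchlogs.bat: pull new IIS logs down from blob storage
c:\tools\AzureLogFetch.exe c:\logs MyAzure EK247tO8q4aNLA+A==

rem One-time registration with Task Scheduler, running hourly:
schtasks /create /tn "AzureLogFetch" /tr "c:\tools\fetchlogs.bat" /sc hourly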

Here’s what the program looks like:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;
using System.Net.Mail;

namespace AzureLogFetch
{
    class Program
    {
        static string smtp = null;
        static string email = null;

        static void Main(string[] args)
        {
            if (args.Count() < 3)
            {
                Console.WriteLine("Usage: AzureLogFetch [path to save files]
			[Azure instance name] [Azure key] [smtp port]
			[email]");
                Console.WriteLine(@"Note: smtpport and email are optional");
                Console.WriteLine(@"Example w/o email:");
                Console.WriteLine(@"Example: AzureLogFetch g:logs MyAzure
			EK247tO8q4aNLA+A==");
                Console.WriteLine(@"Example w/ email: AzureLogFetch g:logs
			MyAzure EK247tO8q4aNLA+A== smtp.mymail.com me@email.com");
                return;
            }
            string directory = args[0];
            string instance = args[1];
            string key = args[2];

            // smtp server and email are optional, but must both be present
            if (args.Count() > 4)
            {
                smtp = args[3];
                email = args[4];
            }
            var account = new CloudStorageAccount(
                new StorageCredentialsAccountAndKey(instance, key),
                false); // false = use http rather than https for the endpoints

            var client = account.CreateCloudBlobClient();
            var container = client.GetContainerReference("wad-iis-logfiles");

            // Note that I pass the BlobRequestOptions of UseFlatBlobListing,
            // which returns all files regardless of nesting, so I don't have
            // to walk the directory structure
            foreach (var blob in container.ListBlobs(new BlobRequestOptions() { UseFlatBlobListing = true }))
            {
                CloudBlob b = blob as CloudBlob;
                try
                {
                    b.FetchAttributes();
                    BlobAttributes blobAttributes = b.Attributes;
                    TimeSpan span = DateTime.Now.Subtract(blobAttributes.Properties.LastModifiedUtc.ToLocalTime());
                    int compare = TimeSpan.Compare(span, TimeSpan.FromHours(1));
                    // We don't want to download and delete the latest log file,
                    // because it is incomplete and still being written to,
                    // thus this compare logic
                    if (compare == 1)
                    {
                        // The blob's PathAndQuery keeps the per-instance folder structure intact
                        b.DownloadToFile(directory + b.Uri.PathAndQuery);
                        b.Delete();
                    }
                }
                catch (Exception e)
                {
                    SendMail(instance + " download of logs failed!!!!",
			e.Message);
                    return;
                }

            }
            SendMail(instance + " download of logs complete
		at " + DateTime.Now.ToLongDateString() + " " +
		DateTime.Now.ToShortTimeString() , "");

        }
        static void SendMail(string subject, string body)
        {
            if (smtp == null || email == null)
            {
                return;
            }
            MailMessage message = new MailMessage();
            SmtpClient smtpClient = new SmtpClient(smtp, 25);
            message.From = new MailAddress(email);
            message.To.Add(new MailAddress(email));
            message.Subject = subject;
            message.Body = body;
            smtpClient.UseDefaultCredentials = true;
            smtpClient.Send(message);

        }
    }
}

There are a few things worth commenting on about this code. First, you’ll notice it requires Microsoft.WindowsAzure and Microsoft.WindowsAzure.StorageClient.  These are assemblies that ship with the Windows Azure Tools for Microsoft Visual Studio (November 2009) as part of the Windows Azure SDK. You can find them at %Program Files%\Windows Azure SDK\v1.0\ref.
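If you compile outside Visual Studio, the reference might look like this on the command line (a sketch assuming the default SDK install path; in the v1.0 SDK both namespaces live in the StorageClient assembly):

csc Program.cs /r:"%ProgramFiles%\Windows Azure SDK\v1.0\ref\Microsoft.WindowsAzure.StorageClient.dll"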

The crux line (thanks Ryan!) is where I iterate over the log files:

foreach (var blob in container.ListBlobs(new BlobRequestOptions() { UseFlatBlobListing = true })) 

The key here is that I pass the UseFlatBlobListing option, which means I don’t have to walk the whole directory structure to get all the logs across all instances. Rather, it returns everything in one flat list: every log file in the container. This turns out to be handy when I go to download the file here:

b.DownloadToFile(directory + b.Uri.PathAndQuery); 

Notice I’m able to preserve the path. This is clutch because, if you are running multiple instances in Azure, you’ll have log files with the same name from different instances (e.g., u_ex10011916.log), so the files need to be saved in paths unique to their instance.
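To make that concrete, a flat listing typically yields blob paths like these (the deployment and instance names below are invented for illustration):

/wad-iis-logfiles/deployment(123)/WebRole/WebRole_IN_0/W3SVC1/u_ex10011916.log
/wad-iis-logfiles/deployment(123)/WebRole/WebRole_IN_1/W3SVC1/u_ex10011916.log

Both instances produce a u_ex10011916.log, so preserving the folder part of the path is what keeps one from overwriting the other on disk.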

The other thing I did was make sure I don’t download the log file currently being written to. After all, it is a living document; if I downloaded and deleted it, I would lose whatever entries got written afterward. Hence the comparison logic in the code. Once the older logs are downloaded, I go ahead and blow them away in blob storage.
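That comparison boils down to a simple age check; an equivalent way to write it (a sketch, using the same CloudBlob properties the program already fetches) would be:

// Sketch: equivalent to the TimeSpan.Compare logic. Only blobs whose
// last write was more than an hour ago are downloaded and deleted.
if (DateTime.UtcNow - b.Properties.LastModifiedUtc > TimeSpan.FromHours(1))
{
    // safe to download and delete
}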

I also added some code that enables the program to send an email after either a successful transfer of the logs or a failure.

You can cut/paste/compile this yourself or download the source or the binary here.

Once you have all this IIS log data, what do you do with it?  You analyze it, of course. The best tool I’ve found for analyzing IIS logs is the Microsoft Log Parser tool.  This is a command-line tool that lets you write SQL queries against your log files.  The query I find myself returning to again and again is as follows:

logparser -i:IISW3C "SELECT count(*) as hits, cs(referer) INTO referrer.txt FROM *.log WHERE cs-uri-stem not like '%.png' and cs-uri-stem not like '%.gif' group by cs(Referer) order by hits desc" -recurse -o:CSV

This query groups referrers by hits, allowing me to see where my traffic is coming from. This is key for the Incarnate service, since we’re trying to determine who has installed the Incarnate plug-in. Notice that I exclude any .png or .gif files from the query, so I’m only getting hits on the service itself. By passing the -o:CSV switch, I can easily pull the output into Excel and then sum the hits column to get my total hits. If you want, you can also have Log Parser generate nifty charts for you, like this:

logparser -i:IISW3C "SELECT count(*) as hits, cs(referer) INTO chart.gif FROM *.log WHERE cs-uri-stem not like '%.png' and cs-uri-stem not like '%.gif' group by cs(Referer) order by hits desc" -recurse -o:Chart –chartType:Pie3D

I should note that I got the following error when I tried to generate charts:

Error creating output format "chart": This output format requires a licensed Microsoft Office Chart Web Component to be installed on the local machine.

Turns out I needed to install some Office components which may not be on your system. See this blog post for more.
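While I’m at it, the same approach works for other summaries. For instance, here’s a sketch of a hits-per-day query (date is a standard IISW3C field, and daily.csv is just an arbitrary output name):

logparser -i:IISW3C "SELECT date, COUNT(*) AS hits INTO daily.csv FROM *.log GROUP BY date ORDER BY date" -recurse -o:CSV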

There are a bunch of great samples that ship with Log Parser. The SDK and their forums are also helpful.

Follow the Conversation

8 comments so far. You should leave one, too.

Offbeatmammal said on Jan 25, 2010

Love that tools like this are being created for Azure, but ... have to ask when basics like this and a "performance dashboard" that shows me the load on my current instances will be part of the basic solution?

At the moment knowing when to add new instances is guesswork and there's no visibility into (e.g.) the http request queue or the response time or the backlog of writes to a SQL instance (basic WMI stuff you take for granted on a physical machine) without writing a bunch of health code yourself...

Karsten Januszewski said on Jan 25, 2010

@Offbeatmammal -- A little bird told me something like this is coming out soon...stay tuned...

Bikram said on Mar 22, 2010

Very well written and will help those new to Azure. A dashboard concept for Azure IIS logs would be a great help, so I have started working on one; it will be available in a few months. If you have any comments on requirements for the dashboard, please let me know.

Bikram Ray said on Mar 23, 2010

Instead of deleting the file from blob storage after copying it to the local directory, can we use metadata, say "Parsing" with value "Done"? In some scenarios the blobs can be used by multiple applications. Do you see any issues?

Bikram Ray said on Apr 11, 2010

I was trying to keep all my processing in the Azure fabric. That means I need to run logparser (a third-party console application) in the Azure fabric. How do I go about it?

Karsten Januszewski said on Apr 28, 2010

@Bikram -- Well, the whole idea of the tool is to get the IIS logs out of Azure to do processing with other tools, etc. But I hear you -- it would be cool to have an Azure service that did that. That's just a matter of putting some sort of log analyzer up in the cloud and pointing it at the IIS logs in blob storage...

Ramona Eid said on Oct 4, 2010

I am trying to port this to Visual Studio 2010 with the June 2010 Windows Azure SDK and keep getting the error: The type or namespace name 'WindowsAzure' does not exist in the namespace 'Microsoft' (are you missing an assembly reference?)

I added the reference by navigating to where the folder is on disk for v1.2.

Please help!

Ramona Eid said on Oct 4, 2010

Solved my own problem, but thanks! I was accidentally targeting .NET Framework 4 Client Profile. Corrected it and now it builds beautifully.

Love this article. Love all your work, including Archivist!