Notes on Event 5

Into the home stretch and the entries just keep getting better! The only advice I'd like to offer this time is to be careful to read the instructions carefully. They included the specific folder where the files were located and I noticed several misinterpretations in the scripts. Some included a mandatory Path parameter and others had a default Path that was not the specified folder. Including an optional Path with the correct default would certainly be acceptable, but not those variations.

The instructions also included some ambiguity about what the log file actually contains. Was the client IP address in the first column (as specified in the instructions) or in a different column (as presented in the example logs)? There were a number of entries that just searched the logs for IP addresses and returned all of them. This approach would not be able to distinguished between the client and server addresses, which would give a wrong answer. Another approach searched for the "c-ip" column, but this would only work if the log files were as in the samples. Another method, select the second IP address in a line would also only work on the sample log style. There weren't many entries that supported both file types, but one of them did it in a very concise manner, checking the first and ninth columns for an IP address and selecting the correct one.

Most of the entries used Sort-Object -Unique or Select-Object -Unique to eliminate duplicates, which was the first approach that I thought of. There were several entries, however, that used alternate methods that I thought were quite clever applications of PowerShell technology: hash tables with the IP address as the key, and Group-Object on the IP address. Both options provided a fairly simple way to also report the instance count for each address.

Returning an instance count sounds like an interesting option, but after thinking about it some more, I'm not so sure. Counts of the number of sessions and the hits per session would be much more interesting than the raw hits count. But that's way, way beyond the scope of this event.

Anyway, just one more event to go. I'm expecting a spectacular finish!

About the Author

Art Beane

Howdy from Houston, TX. I’ve been consulting on how to use computers for almost 40 years, including everything from designing global networks to data centers to email, storage, virtualization, directories, automation and more. I’ve become a passionate user of PowerShell since it came out and use it whereever I can.

One Comment

  1. I used a hash table in mine (and provided the option to return instance counts), but the instance counts were something of an afterthought, and easy to get with the hash table.
    The primary reason for using the hash table was speed..

    250MB of IIS logs produces about a half million address instances. On my system, it takes sort-object -unique about 8 minutes to de-dupe that. Select-object -unique is much better, at a little over 2 minutes. A hash table in a filter script did it in 2.7 seconds.