Help a Python newbie out!
March 6, 2012 3:15 AM Subscribe
Programming noobfilter: I have a Python script that does some things to a proxy log file: it can count and rank the top 20 urls accessed, users and source IPs. Three further problems and code inside:
1. I want it to count and rank the top source IP segments (255.255.0.0) too, preferably in the same script, and output it in the same format. Is there a way to go about that? (split the IP addresses by period...?)
2. I tried for a few days to print the output into a file, but "print >>file, '%d %s' % (count, url)" gives me syntax errors. f.write() similarly doesn't work. How do you write the output into a text file or better yet, send it as an email? (The server this will run on has Windows, and I was thinking of using Task Manager to run this at a regular time each day.)
3. The proxy log file is uploaded automatically at 5:00am everyday and has the date (e.g. 20120229) in its name. How can I set this script to get the current date and scan the correct file?
Any help is greatly appreciated!
--------- (disclaimer: code from stackoverflow)
from collections import defaultdict
from operator import itemgetter
import heapq
access = defaultdict(int)
user = defaultdict(int)
sourceip = defaultdict(int)
with open("C:/log.log") as f:
for line in f:
parts = line.split() #split at whitespace
if len(parts) >= 6:
access[parts[11] + parts[13]] += 1 # grabs host url and path and combines them
user[parts[15]] += 1 # grabs usernames
sourceip[parts[4]] += 1 # grabs computer's IP
# top k entries
k = 20
for url, count in heapq.nlargest(k, access.iteritems(), key=itemgetter(1)):
print "%d %s" % (count, url)
print "\n"
# top k users
for user, count in heapq.nlargest(k, user.iteritems(), key=itemgetter(1)):
print "%d %s" % (count, user)
print "\n"
# top k sourceips
for ip, count in heapq.nlargest(k, sourceip.iteritems(), key=itemgetter(1)):
print "%d %s" % (count, ip)
print "\n"
posted by monocot to computers & internet (10 answers total)
posted by katrielalex at 3:33 AM on March 6, 2012