I'm looking for some help with a math equation / algorithm I'm trying to create. I'm a programmer by trade and not very good at math so I'd like some help with this or perhaps a site where I could pay someone to come up with the equation I'm looking for. I considered rentacoder or elance however I'd be concerned if the companies there had the proper math knowledge. A lot of claims get made on those sites that turn out to be false.
I'm trying to do is calculate the value or profitability of keywords based on some data that I have. I have a list of about 35 categories and in each category I have 500 keywords that belong to that category. I've provided a small sampling of the data I have (at the end of this post). The sampling has 5 categories and in each category it has the top 5 and bottom 5 from the lists of 500.
The category records have a '~' starting the row. The first field is the category name and the second field is the total dollar amount of products sold in that category in the last 2 weeks.
Under each category are the keyword records that belong to that category. Normally this list has 500 records however I’m only displaying the top and bottom five (records 1-5 and 495-500). These records are always placed in order of popularity. Meaning record one is searched for more often than record two. The first field if the record is the keyword itself. The second field is the number of products sold in the past two weeks that contain that keyword (raw number not dollar amount however I might be able to get that data). Field three is the number of products currently being offered that contain that keyword. In other words how many products are competing for that keyword at any given moment. The last field is a number representing the popularity of the keyword in the regular search engines (Google, Yahoo, etc.). It's a weight figure for how many people search for that term in a given month. I'm not sure how they come up with this number however here is a sampling for reference (sex: 1415948, harry potter: 242471, will smith: 91542, java books 378). I don't have this data but I was also thinking of including how many web sites compete for that keyword in Google to do a supply/demand comparison with field four. I can easily get that data and would like to include that in the computation.
Some thoughts I have about the data. The keywords are ordered by popularity like I said. This alone is only mildly helpful. I was thinking that the amount of product sold in a category could bring some value to that ranking. Meaning position 1 in a category that sells 1 million every 2 weeks is more valuable than position 1 in a category that sells $200,000 every 2 weeks. I also thought the search engine popularity weight number might also bring some value to that order. If the product ranks high in both areas then perhaps it is of greater overall value.
This is the data I have at the moment however I'm open to thoughts or suggestions. I might be able to get additional data if it's important however I'd like to stay with this if I can. I'm certain I have access to this data.
What I would like to do is take this data and compute a value for each keyword that tells me its profitability or value. I'd like to find out what keywords in this massive list are the best. I would place this 'profitability value' in the final field of each keyword record for someone to review. The category data (dollar amount sold in that category), keyword order (record 1 is more popular than record 2), and the data in record field 2 and 3 are the most important. The search engine information is less important (lets say 3 times less important).
Any help would be greatly appreciated. Thanks so much for your time. Have a great day.
~Antiques,6155459.31
victorian,50984,23124,313
sterling,506010,153977,327
furniture,54095,9618,3605
art deco,18234,13062,424
lamps,182309,56646,580
...
bedroom,11594,3241,709
chest of drawers,194,242,67
cottage,13730,6258,214
photo,1131344,159891,1765
pre columbian,673,246,10
~Art,33651005.64
aceo,4554,3924,40
obey,3365,1072,205
highwaymen,1875,233,136
thomas kinkade,1566,1379,320
florida,51435,11457,4226
...
boston,76336,20618,3296
mother,120329,19159,1857
vermont,7714,1723,593
poland,10018,3604,1210
cherry blossom,3410,1428,497
~Baby,1887275.00
clothes,78160,40211,2226
baby,405388,139070,4956
shoes,465883,245167,6373
winnie the pooh,17721,8714,2899
pottery barn,7237,6054,4391
...
nuk,276,126,38
cover,1884691,262594,1030
cradles,861446,30529,41
butterflies,100523,37077,5900
graco car seat,402,224,51
~Books,99187955.26
easton press,1826,907,63
stephen king,6545,3419,1282
harry potter,22159,7573,7007
nancy drew,4963,1447,757
bible,201307,11721,3081
...
electrical,14574,2904,235
blackjack,334947,9839,967
diary,66830,3756,347
judo,3198,307,356
jane eyre,2981,256,290
~Business & Industrial,203912902.60
john deere,22686,11873,4646
backhoe,979,572,180
tractor,23708,15907,669
bobcat,3999,980,1521
loader,5620,2906,33
...
buffer,2415,1075,64
boring bar,656,262,10
tig welder,763,422,96
rake,2649,625,62
flow meter,856,485,63
Secondly, you can't determine the total dollar amount for a keyword; for all you know, 90% of the money spent in Antiques might be for a couple dozen very expensive pre-Colombian items.
So I guess I don't really know what you're expecting to compute. You say that total sales for the category is most important (higher number = better), followed by search rank within the category (lower number = better), so you might try:
keyword "value" = sum of (category sales / search rank) for all categories keyword appears in
or
keyword "value" = sum (category sales * (500 - search rank)) likewise
...either will give you a value that increases as category sales increase and as search rank improves. But I don't know how useful that's gonna be...
posted by equalpants at 8:30 PM on July 3