How can I generate numbers that fall into power-law distribution?
April 7, 2008 1:34 PM Subscribe
I need to generate numbers that fall into a power-law / zipf / long-tail distribution. Does anyone know of a python library that can do that?
Start off with SciPy. I'm not at a computer with Python right now, but I'm pretty sure that what you need to do might involve (for a Zipf dstn) something like stats.zipf.rvs() with appropriate parameters.
posted by thisjax at 2:18 PM on April 7, 2008
posted by thisjax at 2:18 PM on April 7, 2008
if SciPy doesn't do what you want, there is a pseudocode outline here, starting on page 550. it looks very easy to write.
posted by sergeant sandwich at 3:29 PM on April 7, 2008
posted by sergeant sandwich at 3:29 PM on April 7, 2008
Yeah, it took less than 10 minutes to write up the algorithm that sergeant sandwich linked to, which is less time than it would have taken to compile/install SciPy.
posted by thisjax at 7:04 PM on April 7, 2008
posted by thisjax at 7:04 PM on April 7, 2008
Why the heck not (excuse my bad code):
posted by thisjax at 7:27 PM on April 7, 2008
def list_zipf_values(exponent, num_of_values): list_of_values = [] b = 2 ** (exponent - 1) while len(list_of_values) <> value = genvalue(b, exponent) if value != None: list_of_values.append(value) return list_of_values def genvalue(b, exponent): U = random.uniform(0,1) V = random.uniform(0,1) X = math.floor(U ** (-(1/(exponent - 1)))) T = (1 + (1/X)) ** (exponent - 1) upper_bound = T/b value = V*X*((T-1)/(b-1)) if value <> return value >>
posted by thisjax at 7:27 PM on April 7, 2008
it took less than 10 minutes to write up the algorithm that sergeant sandwich linked to, which is less time than it would have taken to compile/install SciPy.
How much time did you spend testing and verifying it? I'm not sure if the original version you wrote would work, but as pasted here, that algorithm isn't even going to compile.
posted by grouse at 7:34 PM on April 7, 2008
How much time did you spend testing and verifying it? I'm not sure if the original version you wrote would work, but as pasted here, that algorithm isn't even going to compile.
posted by grouse at 7:34 PM on April 7, 2008
Yeah, sorry about that, I forgot to change character entities when I pasted the code...it compiles just fine as well:
posted by thisjax at 8:13 PM on April 7, 2008
import math, random class zipfdstn: def listvalues(self, exponent, num_of_values): list_of_values = [] b = 2 ** (exponent - 1) while len(list_of_values) < num_of_values: value = self.genvalue(b, exponent) if value != None: list_of_values.append(value) return list_of_values def genvalue(self, b, exponent): U = random.uniform(0,1) V = random.uniform(0,1) X = math.floor(U ** (-(1/(exponent - 1)))) T = (1 + (1/X)) ** (exponent - 1) upper_bound = T/b value = V*X*((T-1)/(b-1)) if value <= upper_bound: return value
posted by thisjax at 8:13 PM on April 7, 2008
This thread is closed to new comments.
It's loading very slowly for me, so here's the google cache version.
posted by toomuchpete at 2:04 PM on April 7, 2008