How can I generate numbers that fall into power-law distribution?
April 7, 2008 1:34 PM   Subscribe

I need to generate numbers that fall into a power-law / zipf / long-tail distribution. Does anyone know of a python library that can do that?
posted by k7lim to Computers & Internet (8 answers total) 1 user marked this as a favorite
 
Here's a page that shows how to generate random numbers in a Gaussian distribution, you'd have to be able to do a transform from gaussian to your preferred distribution, but it might be easier than crafting it from scratch.

It's loading very slowly for me, so here's the google cache version.
posted by toomuchpete at 2:04 PM on April 7, 2008


SciPy can do Zipf and power law distributions.
posted by grouse at 2:14 PM on April 7, 2008


Start off with SciPy. I'm not at a computer with Python right now, but I'm pretty sure that what you need to do might involve (for a Zipf dstn) something like stats.zipf.rvs() with appropriate parameters.
posted by thisjax at 2:18 PM on April 7, 2008


if SciPy doesn't do what you want, there is a pseudocode outline here, starting on page 550. it looks very easy to write.
posted by sergeant sandwich at 3:29 PM on April 7, 2008


Yeah, it took less than 10 minutes to write up the algorithm that sergeant sandwich linked to, which is less time than it would have taken to compile/install SciPy.
posted by thisjax at 7:04 PM on April 7, 2008


Why the heck not (excuse my bad code):
def list_zipf_values(exponent, num_of_values):
	list_of_values = []
	b = 2 ** (exponent - 1)
	while len(list_of_values) <>
		value = genvalue(b, exponent)
		if value != None:
			list_of_values.append(value)
		return list_of_values

def genvalue(b, exponent):
	U = random.uniform(0,1)
	V = random.uniform(0,1)
	X = math.floor(U ** (-(1/(exponent - 1))))
	T = (1 + (1/X)) ** (exponent - 1)
	upper_bound = T/b
	value = V*X*((T-1)/(b-1))
	if value <>
		return value

posted by thisjax at 7:27 PM on April 7, 2008


it took less than 10 minutes to write up the algorithm that sergeant sandwich linked to, which is less time than it would have taken to compile/install SciPy.

How much time did you spend testing and verifying it? I'm not sure if the original version you wrote would work, but as pasted here, that algorithm isn't even going to compile.
posted by grouse at 7:34 PM on April 7, 2008


Yeah, sorry about that, I forgot to change character entities when I pasted the code...it compiles just fine as well:
import math, random

class zipfdstn:
	def listvalues(self, exponent, num_of_values):
		list_of_values = []
		b = 2 ** (exponent - 1)
		while len(list_of_values) < num_of_values:
			value = self.genvalue(b, exponent)
			if value != None:
				list_of_values.append(value)
		return list_of_values

	def genvalue(self, b, exponent):
		U = random.uniform(0,1)
		V = random.uniform(0,1)
		X = math.floor(U ** (-(1/(exponent - 1))))
		T = (1 + (1/X)) ** (exponent - 1)
		upper_bound = T/b
		value = V*X*((T-1)/(b-1))
		if value <= upper_bound:
			return value

posted by thisjax at 8:13 PM on April 7, 2008


« Older Who else makes jewelry like th...   |  I'll be participating in my fi... Newer »
This thread is closed to new comments.