Spoon dart window tree.
April 4, 2010 7:49 PM

What is the best way to have random words selected from a dictionary?

Hi, my wife and I are interested in picking four or five random words to use in sentences for the fun of it.

What is the best way to "choose" the words? Is there a website that picks random words from the dictionary to spit out to you?

We have a dictionary, but we'd like to avoid (if possible) flipping through pages with our eyes closed or something like that.

Thanks!
posted by elder18 to Grab Bag (16 answers total) 3 users marked this as a favorite
 
You could use a random number generator to pick the page, then again for the Nth word on that page.
posted by mikeand1 at 7:56 PM on April 4, 2010


There's also a random word generator.
posted by mikeand1 at 7:59 PM on April 4, 2010 [3 favorites]


If you have a Mac (maybe Linux, I'm not sure), there's a very extensive list of English words already on your computer: /usr/share/dict/words.
posted by k. at 8:05 PM on April 4, 2010


cat /usr/share/dict/words | perl -ne 'for $n(0..4){$w[$n]=$_ if rand()<1>
posted by nicwolff at 8:27 PM on April 4, 2010


Whoops:

cat /usr/share/dict/words | perl -ne 'for $n(0..4){$w[$n]=$_ if rand()<1/$.}}{print @w'
posted by nicwolff at 8:28 PM on April 4, 2010 [2 favorites]


(Which I know looks just as broken, but it works!)
posted by nicwolff at 8:29 PM on April 4, 2010


Wiktionary's random entry does this. The first three I got just now were voluminousness, polygon, and bort (alternative spelling boart) "a poorly crystallized diamond used in industrial cutting or abrasion." You can look them up in your dictionary for more information and to double check the definition and etymology. Sound good?
posted by nangar at 8:31 PM on April 4, 2010


Several great ideas so far. Thanks so much!
posted by elder18 at 8:36 PM on April 4, 2010


mikeand1's link also seems quite cool.
posted by nangar at 8:38 PM on April 4, 2010


If you want to use bash instead of perl, here's a more "pure" one-line solution:

numberOfWords=$(cat /usr/share/dict/words | \
   wc -l | \
   awk 'BEGIN {gsub(/ */,"",$1)} END {print $1}'); \
sampledWordIndex=$[($RANDOM % $numberOfWords)+1]; \
sed -n "$sampledWordIndex"p /usr/share/dict/words


This finds out how many words are in the dictionary, picks a random number between 1 and the number of words, and then prints that selected word from the dictionary.
posted by Blazecock Pileon at 9:28 PM on April 4, 2010


But that has to run through the word list twice! Or, six times to get five words. Mine only runs through the list once to get any number of words. (The trick is, each word you see replaces each of the selected words with probability 1/the number of words seen so far.)
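For anyone who finds the one-liner hard to read, here is a rough awk sketch of the same single-pass idea. The function name pick_words and its two arguments are made up for illustration. Each of the k slots is overwritten by the current line with probability 1/NR, so every slot ends up holding a uniformly chosen line. The slots are independent, so the same word can turn up twice.

```shell
# pick_words K FILE: print K lines chosen at random from FILE in one pass.
# Hypothetical helper for illustration, not part of any standard tool.
pick_words() {
  # Seed awk from the shell: $RANDOM in bash, falling back to the PID elsewhere.
  awk -v k="$1" -v seed="${RANDOM:-$$}" '
    BEGIN { srand(seed) }
    # Line number NR replaces each slot with probability 1/NR.
    { for (i = 0; i < k; i++) if (rand() < 1 / NR) w[i] = $0 }
    END { for (i = 0; i < k; i++) print w[i] }
  ' "$2"
}

# Only try the system word list if it is actually there.
if [ -r /usr/share/dict/words ]; then
  pick_words 5 /usr/share/dict/words
fi
```

Like the Perl version, this reads the list once no matter how many words you ask for.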

Also, I think ($RANDOM % $numberOfWords) isn't what you want — that's the remainder when a random number from 0 – 32767 is divided by the number of words in the list. You want something like $RANDOM * $numberOfWords / 32767.
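One caveat, sketched under the assumption that this is bash, where $RANDOM is a 15-bit value (0 to 32767): a single draw can take only 32,768 distinct values, so on a longer word list scaling it still leaves most lines unreachable. A rough workaround is to glue two draws into one 30-bit number before reducing it to the list length. The function name random_line is made up for illustration, and the modulo leaves a very slight bias toward low line numbers.

```shell
# random_line FILE: print one randomly chosen line of FILE (bash assumed).
# Hypothetical helper for illustration.
random_line() {
  # Number of lines; the arithmetic expansion strips any padding from wc.
  count=$(( $(wc -l < "$1") ))
  [ "$count" -gt 0 ] || return 1
  # Two 15-bit draws combined into one value from 0 to 2^30 - 1.
  big=$(( (RANDOM << 15) | RANDOM ))
  # Reduce to a line number between 1 and count.
  index=$(( big % count + 1 ))
  sed -n "${index}p" "$1"
}
```

Usage would be something like random_line /usr/share/dict/words, repeated once per word you want.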
posted by nicwolff at 9:49 PM on April 4, 2010


Heh, then again, that bash solution (corrected) takes ⅓ the time of my "elegant" Perl solution!

bash-3.2$ time (cat /usr/share/dict/words | perl -ne 'for $n(0..4){$w[$n]=$_ if rand()<1/$.}}{print @w')
discontentedness
cycad
sikhara
nonspeaker
unnicknamed

real 0m0.875s
user 0m0.857s
sys 0m0.021s

bash-3.2$ time (numberOfWords=$(cat /usr/share/dict/words | wc -l | awk 'BEGIN {gsub(/ */,"",$1)} END {print $1}'); sampledWordIndex=$[$RANDOM * $numberOfWords / 32767 + 1]; sed -n "$sampledWordIndex"p /usr/share/dict/words; sampledWordIndex=$[$RANDOM * $numberOfWords / 32767 + 1]; sed -n "$sampledWordIndex"p /usr/share/dict/words; sampledWordIndex=$[$RANDOM * $numberOfWords / 32767 + 1]; sed -n "$sampledWordIndex"p /usr/share/dict/words; sampledWordIndex=$[$RANDOM * $numberOfWords / 32767 + 1]; sed -n "$sampledWordIndex"p /usr/share/dict/words; sampledWordIndex=$[$RANDOM * $numberOfWords / 32767 + 1]; sed -n "$sampledWordIndex"p /usr/share/dict/words;)
synthetical
chiropodistry
goatstone
cephalomant
diphycercy

real 0m0.254s
user 0m0.216s
sys 0m0.042s

posted by nicwolff at 9:59 PM on April 4, 2010


Also, I think ($RANDOM % $numberOfWords) isn't what you want — that's the remainder when a random number from 0 – 32767 is divided by the number of words in the list. You want something like $RANDOM * $numberOfWords / 32767.

You're right. I forgot about the limit. The user time is greater for yours perhaps due to some internal array creation and initialization overhead in Perl, which seems expensive and a good starting place to look for optimization. I probably don't have that overhead because I'm not storing anything except a couple variables. My sys time is twice yours, probably due to filesystem access overhead. As you noted, I'm accessing the file twice as many times as your script.
posted by Blazecock Pileon at 10:33 PM on April 4, 2010


Why not just get a Word-a-Day calendar and peel off the next few pages whenever you need some random words? At least then the words are guaranteed to be interesting. They won't be "random" but they'll be random enough for your purposes.
posted by zanni at 12:11 AM on April 5, 2010


If you have access to the online OED (if either of you are associated with a university you probably do), they have a random word function which would probably be ideal for this purpose.
posted by threeants at 2:30 AM on April 5, 2010


Grab the first word of every post on the front page. Discard proper nouns if you wish.
posted by Meatbomb at 7:01 AM on April 5, 2010

