Spoon dart window tree.
April 4, 2010 7:49 PM
What is the best way to have random words selected from a dictionary?
Hi, my wife and I are interested in picking four or five random words to use in sentences for the fun of it.
What is the best way to "choose" the words? Is there a website that picks random words from the dictionary to spit out to you?
We have a dictionary, but we'd like to avoid (if possible) flipping through pages with our eyes closed or something like that.
Thanks!
If you have a Mac (maybe Linux, I'm not sure), there's a very extensive list of English words already on your computer: /usr/share/dict/words.
posted by k. at 8:05 PM on April 4, 2010
cat /usr/share/dict/words | perl -ne 'for $n(0..4){$w[$n]=$_ if rand()<1>
posted by nicwolff at 8:27 PM on April 4, 2010
Whoops:
cat /usr/share/dict/words | perl -ne 'for $n(0..4){$w[$n]=$_ if rand()<1/$.}}{print @w'
posted by nicwolff at 8:28 PM on April 4, 2010 [2 favorites]
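For anyone who finds the Perl hard to read, here is a rough sketch of the same single-pass idea in awk. The function name sample_words and its arguments are made up for illustration, and the word list path in the usage comment is assumed to exist on your machine.

```shell
#!/bin/sh
# sample_words FILE [COUNT]: print COUNT lines chosen in one pass over FILE.
# (Hypothetical helper name; COUNT defaults to 5.)
# Each slot is overwritten by the current line with probability 1/NR, where
# NR is the number of lines read so far, so every slot ends up holding a
# uniformly chosen line. The slots are independent, so a word can repeat.
sample_words() {
    file=$1
    count=${2:-5}
    awk -v k="$count" '
        BEGIN { srand() }    # seeds from the clock, one-second resolution
        {
            for (i = 1; i <= k; i++)
                if (rand() < 1 / NR) slot[i] = $0
        }
        END { for (i = 1; i <= k; i++) print slot[i] }
    ' "$file"
}

# Usage, assuming the word list lives at this path:
# sample_words /usr/share/dict/words 5
```

On the first line 1/NR is 1, so every slot starts out filled; after that, later lines replace earlier ones less and less often, which is what keeps the final choice even across the whole file.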
Wiktionary's random entry does this. The first three I got just now were voluminousness, polygon, and bort (alternative spelling boart) "a poorly crystallized diamond used in industrial cutting or abrasion." You can look them up in your dictionary for more information and to double check the definition and etymology. Sound good?
posted by nangar at 8:31 PM on April 4, 2010
Response by poster: Several great ideas so far. Thanks so much!
posted by elder18 at 8:36 PM on April 4, 2010
If you want to use bash instead of perl, here's a more "pure" one-line solution:
numberOfWords=$(cat /usr/share/dict/words | \
wc -l | \
awk 'BEGIN {gsub(/ */,"",$1)} END {print $1}'); \
sampledWordIndex=$[($RANDOM % $numberOfWords)+1]; \
sed -n "$sampledWordIndex"p /usr/share/dict/words
This finds out how many words are in the dictionary, picks a random number between 1 and the number of words, and then prints the word at that line of the dictionary.
posted by Blazecock Pileon at 9:28 PM on April 4, 2010
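The three steps described above (count, pick an index, print that line) can also be wrapped in a small bash function. This is only a sketch: the name random_line is hypothetical, and combining two $RANDOM draws is a swapped-in choice, made because a single draw only goes up to 32767 and the word list may be longer than that.

```shell
#!/usr/bin/env bash
# random_line FILE: print one randomly chosen line of FILE.
# (Hypothetical helper; reads FILE twice, once to count and once to print.)
random_line() {
    local file=$1 total index
    # Step 1: count the lines. Reading from stdin keeps the filename out of
    # wc's output; tr strips the padding some wc implementations add.
    total=$(wc -l < "$file" | tr -d ' ')
    [ "$total" -gt 0 ] || return 1
    # Step 2: pick an index from 1 to total. Two $RANDOM draws are combined
    # into one 30-bit value (assumption: the list has far fewer lines than
    # 2^30, so the modulo bias is negligible).
    index=$(( ((RANDOM << 15) | RANDOM) % total + 1 ))
    # Step 3: print only that line.
    sed -n "${index}p" "$file"
}

# Usage, assuming the word list lives at this path:
# for i in 1 2 3 4 5; do random_line /usr/share/dict/words; done
```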
But that has to run through the word list twice! Or, six times to get five words. Mine only runs through the list once to get any number of words. (The trick is, each word you see replaces each of the selected words with probability 1/the number of words seen so far.)
Also, I think ($RANDOM % $numberOfWords) isn't what you want: that's the remainder when a random number from 0 to 32767 is divided by the number of words in the list. You want something like $RANDOM * $numberOfWords / 32767.
posted by nicwolff at 9:49 PM on April 4, 2010
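To see the difference between the two index formulas, here is a small sketch that takes the random draw as an argument so the arithmetic can be checked by hand. The function names and the list length of 200000 are made up for illustration. One deliberate change from the formula in the comment: the divisor is 32768 rather than 32767, because with 32767 the largest possible draw would give an index one past the end of the list.

```shell
#!/usr/bin/env bash
# Both helpers (hypothetical names) take DRAW and N; in real use DRAW would
# be $RANDOM, which ranges from 0 to 32767.

# Index as in ($RANDOM % n) + 1. When n is larger than 32768 the remainder
# is just the draw itself, so only the first 32768 lines are reachable.
mod_index() { echo $(( $1 % $2 + 1 )); }

# Index by scaling the draw across the whole list. Dividing by 32768 (one
# more than the largest draw) keeps the result within 1..n.
scaled_index() { echo $(( $1 * $2 / 32768 + 1 )); }

mod_index 32767 200000      # prints 32768
scaled_index 32767 200000   # prints 199994
scaled_index 0 200000       # prints 1
```

Scaling spreads the picks over the whole file, but there are still only 32768 possible draws, so on a longer list most lines can never be chosen by either formula.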
Heh, then again, that bash solution (corrected) takes ⅓ the time of my "elegant" Perl solution!
bash-3.2$ time (cat /usr/share/dict/words | perl -ne 'for $n(0..4){$w[$n]=$_ if rand()<1/$.}}{print @w')
discontentedness
cycad
sikhara
nonspeaker
unnicknamed
real 0m0.875s
user 0m0.857s
sys 0m0.021s
bash-3.2$ time (numberOfWords=$(cat /usr/share/dict/words | wc -l | awk 'BEGIN {gsub(/ */,"",$1)} END {print $1}'); sampledWordIndex=$[$RANDOM * $numberOfWords / 32767 + 1]; sed -n "$sampledWordIndex"p /usr/share/dict/words; sampledWordIndex=$[$RANDOM * $numberOfWords / 32767 + 1]; sed -n "$sampledWordIndex"p /usr/share/dict/words; sampledWordIndex=$[$RANDOM * $numberOfWords / 32767 + 1]; sed -n "$sampledWordIndex"p /usr/share/dict/words; sampledWordIndex=$[$RANDOM * $numberOfWords / 32767 + 1]; sed -n "$sampledWordIndex"p /usr/share/dict/words; sampledWordIndex=$[$RANDOM * $numberOfWords / 32767 + 1]; sed -n "$sampledWordIndex"p /usr/share/dict/words;)
synthetical
chiropodistry
goatstone
cephalomant
diphycercy
real 0m0.254s
user 0m0.216s
sys 0m0.042s
posted by nicwolff at 9:59 PM on April 4, 2010
Also, I think ($RANDOM % $numberOfWords) isn't what you want — that's the remainder when a random number from 0 – 32767 is divided by the number of words in the list. You want something like $RANDOM * $numberOfWords / 32767.
You're right. I forgot about the limit. The user time is greater for yours perhaps due to some internal array creation and initialization overhead in Perl, which seems expensive and a good starting place to look for optimization. I probably don't have that overhead because I'm not storing anything except a couple variables. My sys time is twice yours, probably due to filesystem access overhead. As you noted, I'm accessing the file twice as many times as your script.
posted by Blazecock Pileon at 10:33 PM on April 4, 2010
Why not just get a Word-a-Day calendar and peel off the next few pages whenever you need some random words? At least then the words are guaranteed to be interesting. They won't be "random" but they'll be random enough for your purposes.
posted by zanni at 12:11 AM on April 5, 2010
If you have access to the online OED (if either of you is associated with a university you probably do), they have a random word function which would probably be ideal for this purpose.
posted by threeants at 2:30 AM on April 5, 2010
Grab the first word of every post on the front page. Discard proper nouns if you wish.
posted by Meatbomb at 7:01 AM on April 5, 2010
This thread is closed to new comments.
posted by mikeand1 at 7:56 PM on April 4, 2010