gra[rrrrrr]
May 3, 2011 11:15 AM
How to make a data subset based on a variable with multiple character values in [R]?
So I have this dataset with, among other things, street names (one variable for primary, one for secondary streets). I would like a subset of Street A that contains intersections with a selection of multiple possible cross streets. These are all text and R really does not like this. I found this solution on stackoverflow, but I still get errors. Halp me? I'm still new to this whole rrrrr business.
I'd probably convert the data from street names into numbers.
Transform the variable as such...
"Elm Street" -> 6
"Cabrillo Avenue" -> 8
posted by k8t at 11:23 AM on May 3, 2011
You don't want to convert the data into numbers. You probably want the data to be converted into factors, but this was probably done automatically if you imported it using read.table() or similar.
posted by grouse at 11:28 AM on May 3, 2011
Response by poster: Basically, I have a dataset of all motor vehicle collisions in Los Angeles. I want a subset that is a few miles of Venice Blvd. The only info I have to go on to do this are the primary_rd and secondary_rd variables. So I made a subset where Primary_rd=Venice Blvd. Now I want to chop that down to just the relevant few miles of Venice. I have a list of the cross streets along that section and want to select on that in some sort of where secondary_rd=("blah" or "blah1" ) fashion. Except, I can't use "or" in this case because all of my variables are factors, I guess.
posted by mandymanwasregistered at 11:29 AM on May 3, 2011
Response by poster: I still don't have the r lingo down, so hopefully that description is better.
posted by mandymanwasregistered at 11:31 AM on May 3, 2011
Response by poster: I get errors like:
Error: unexpected symbol
Error: unexpected string constant
posted by mandymanwasregistered at 11:33 AM on May 3, 2011
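For readers hitting the same messages: both are parse errors, which R raises before it evaluates anything, so they point at how the line was typed and not at the data. Here is a minimal sketch of two common triggers. The street names are invented, parse_message is a hypothetical helper, and whether these are the poster's actual mistakes is an assumption.

```r
# Hypothetical helper: returns the parser's error message for a snippet of
# code, or "ok" if the snippet parses.
parse_message <- function(code) {
  tryCatch({
    parse(text = code)
    "ok"
  }, error = function(e) conditionMessage(e))
}

# Missing comma between two quoted street names.
msg_string <- parse_message('c("LA BREA AV" "FAIRFAX AV")')

# Missing operator between two names.
msg_symbol <- parse_message('secondary_rd streets')

# The same vector with the comma restored parses cleanly.
msg_ok <- parse_message('c("LA BREA AV", "FAIRFAX AV")')

print(msg_string)
print(msg_symbol)
print(msg_ok)
```

When a long list of streets is pasted in, a single dropped comma or unclosed quote is enough to produce one of these.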
Best answer: You can't use or in the way the Stack Overflow questioner wanted to because or doesn't work that way. It has nothing to do with factors; the error message is a red herring. You need to say secondary_rd == "blah" | secondary_rd == "blah1", not secondary_rd == ("blah" | "blah1"). Better yet would be to do secondary_rd %in% c("blah", "blah1").
My previous solution should work fine. It sounds like there will be many secondary roads, so the cleanest way to do this would be:
secondary.rds.nearby <- c("blah", "blah1")
collisions.venice.nearby <- subset(collisions.venice, secondary_rd %in% secondary.rds.nearby)
posted by grouse at 11:37 AM on May 3, 2011
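The two approaches in the answer above can be compared on a toy data frame. This is only a sketch: the rows and street names are invented, and secondary_rd is deliberately stored as a factor to illustrate that factors are not the obstacle.

```r
# Toy stand-in for the collisions data; every value here is invented.
collisions.venice <- data.frame(
  primary_rd   = "VENICE BL",
  secondary_rd = factor(c("LA BREA AV", "FAIRFAX AV", "LINCOLN BL", "LA BREA AV")),
  stringsAsFactors = FALSE
)

# Applying | directly to two strings is an error, whatever the column type.
bad <- tryCatch("LA BREA AV" | "FAIRFAX AV",
                error = function(e) conditionMessage(e))

# One comparison per street, joined with |: correct but unwieldy for long lists.
by.or <- subset(collisions.venice,
                secondary_rd == "LA BREA AV" | secondary_rd == "FAIRFAX AV")

# %in% against a vector of cross streets selects the same rows.
secondary.rds.nearby <- c("LA BREA AV", "FAIRFAX AV")
by.in <- subset(collisions.venice, secondary_rd %in% secondary.rds.nearby)

print(bad)
print(identical(by.or, by.in))
print(by.in)
```

Keeping the cross streets in their own vector means only that one line changes when the list grows.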
Response by poster: I should add that the linked Stack Overflow solution seems to work when I have a few of the streets listed, but not when I dump the whole list in. This is the sort of thing I'm going to be doing more than once for other streets, so I was hoping to come up with template code. Ok I'll shut up now.
posted by mandymanwasregistered at 11:38 AM on May 3, 2011
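One possible shape for such template code is a small function that takes the main street and a vector of cross streets. This is a sketch under assumptions: subset_by_cross_streets is a hypothetical name, the column names primary_rd and secondary_rd come from the thread, and the data is invented.

```r
# Hypothetical helper; assumes the data frame has primary_rd and secondary_rd
# columns, as described in the thread.
subset_by_cross_streets <- function(data, main.rd, cross.rds) {
  keep <- data$primary_rd == main.rd & data$secondary_rd %in% cross.rds
  # which() drops rows where the comparison is NA instead of returning NA rows.
  data[which(keep), , drop = FALSE]
}

# Invented example data, including one row with a missing primary road.
collisions <- data.frame(
  primary_rd   = c("VENICE BL", "VENICE BL", "PICO BL", NA),
  secondary_rd = c("LA BREA AV", "LINCOLN BL", "LA BREA AV", "LA BREA AV"),
  stringsAsFactors = FALSE
)

venice.nearby <- subset_by_cross_streets(collisions, "VENICE BL",
                                         c("LA BREA AV", "FAIRFAX AV"))
pico.nearby <- subset_by_cross_streets(collisions, "PICO BL", "LA BREA AV")

print(venice.nearby)
print(pico.nearby)
```

Reusing it for another street then only requires a new main street and a new vector of cross streets.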
Response by poster: Thanks grouse, I'll give that a try.
posted by mandymanwasregistered at 11:39 AM on May 3, 2011
Is "Venice" somehow guaranteed to be street1? If not you will also have to check the symmetric condition.
posted by a robot made out of meat at 11:49 AM on May 3, 2011
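If the street of interest can appear in either column, the symmetric check might look like the sketch below. The rows are invented, and main.rd and cross.rds are illustrative names, not anything from the actual dataset.

```r
# Invented rows: the second one records Venice as the secondary road.
collisions <- data.frame(
  primary_rd   = c("VENICE BL", "LA BREA AV", "PICO BL", "VENICE BL"),
  secondary_rd = c("LA BREA AV", "VENICE BL", "LA BREA AV", "LINCOLN BL"),
  stringsAsFactors = FALSE
)

main.rd <- "VENICE BL"
cross.rds <- c("LA BREA AV", "FAIRFAX AV")

# Keep a row if the main street and a listed cross street appear in either order.
either.order <- subset(
  collisions,
  (primary_rd == main.rd & secondary_rd %in% cross.rds) |
    (secondary_rd == main.rd & primary_rd %in% cross.rds)
)

print(either.order)
```

Filtering on primary_rd alone would miss the second row here.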
This thread is closed to new comments.
subset(dataset, primary.street == "Street A" & secondary.street %in% c("Cross Street 1", "Cross Street 2", "Cross Street 3"))
posted by grouse at 11:19 AM on May 3, 2011