How can I automatically split GPS tracklogs?
March 17, 2009 8:04 PM
I have almost a year's worth of GPS track logs. I'd like to be able to automatically split them by trip type (walking, biking, car) and filter out some of the noise that gets generated when leaving the GPS on while stationary.
Over the past several months I've been using my GPS to track where I go. The raw track logs I get are generally pretty accurate, but often while I'm out I'll switch back and forth between walking, biking, and the car/bus/train. My GPS unit (Garmin 60CSx) doesn't generally split tracks when I change modes of transportation. Also, sometimes I forget to turn the GPS off when I'm inside or standing still; because the GPS isn't 100% accurate, this creates a noisy track that looks like random wandering around the same point.
Given a list of tracks in GPX or some other standard format that's supported by gpsbabel, I'd like to be able to take each track and break it at each point where my average speed changes by a certain configurable threshold. I'd also like to be able to detect when I've stopped somewhere and remove all points that were generated while I was stopped.
I've hacked together something in Python that works on a rudimentary level (the geolocator library has been especially helpful), but it's not perfect. I also haven't implemented anything that gets rid of the random wandering. I could probably figure out how to do this, but it occurred to me that this is probably a solved problem and that I could save time by not re-inventing the wheel.
Does anyone know of any software with intelligent track splitting and noise removal? Previous questions have mentioned GPS TrackMaker, but it doesn't seem to meet these requirements. Failing that, algorithms or pseudocode would be helpful as well.
Does anyone know of any software with intelligent track splitting and noise removal? [...] Failing that, algorithms or pseudocode would be helpful as well.
Some GPSes simply say "if speed is less than 0.2 m/s, assume stationary".
If your data includes satellite status information, you can detect going inside (or through tunnels/into enclosed parking) because you lose most of your satellites/signal quality drops.
You could classify tracks based on median speed over a short time window. That is, if you're going under 3 mph for more than 5 minutes, classify as walking; between 3 mph and 15 mph for 5 minutes, classify as cycling; above 15 mph for 5 minutes, classify as driving. If you're within 5 m of a given location for more than 5 minutes, classify as stationary.
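A minimal Python sketch of the median-speed idea, in case it helps. It assumes you have already turned the track into (seconds, speed in m/s) samples sorted by time. The function names, the fixed non-overlapping 5-minute windows, and the use of the 0.2 m/s cutoff for "stationary" (instead of the 5 m radius test) are simplifications for illustration, not anything a particular GPS or tool does.

```python
from statistics import median

MPH_PER_MPS = 2.23694  # miles per hour in one metre per second


def classify_window(speeds_mps, walk_max_mph=3.0, cycle_max_mph=15.0,
                    stationary_max_mps=0.2):
    """Label one time window from the median of its speed samples.

    The thresholds are the ones suggested in the thread; all are tunable.
    """
    m = median(speeds_mps)
    if m < stationary_max_mps:
        return "stationary"
    mph = m * MPH_PER_MPS
    if mph < walk_max_mph:
        return "walking"
    if mph < cycle_max_mph:
        return "cycling"
    return "driving"


def classify_track(samples, window_s=300):
    """Classify a track in fixed windows of window_s seconds.

    samples: list of (seconds, speed_mps) tuples sorted by time.
    Returns a list of (window_start_seconds, label); windows that
    contain no samples are skipped.
    """
    labels = []
    if not samples:
        return labels
    start = samples[0][0]
    bucket = []
    for t, v in samples:
        # Close every window that ends at or before this sample.
        while t >= start + window_s:
            if bucket:
                labels.append((start, classify_window(bucket)))
                bucket = []
            start += window_s
        bucket.append(v)
    if bucket:
        labels.append((start, classify_window(bucket)))
    return labels
```

Adjacent windows with the same label can then be merged into one segment, and a change of label marks a candidate split point. Using the median rather than the mean keeps a few seconds at a stop light from dragging a driving window down into the cycling range.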
Obviously the difficulty comes from the times when you drive a car slowly (e.g. in a car park) and accurately detecting transitions (e.g. in a car park: car slows down, car stops, car moves slowly, car stops, person gets out and walks). That's why I say 'for 5 minutes' - you probably aren't going to stop your car, walk for 5 minutes, then get back in the car; it's more likely you've just come to some stop lights/a traffic jam.
For your day-to-day routine you could also classify by area; that is, if you're tracking your drive to work every day, you expect to classify as 'walk' at home and at work, but if you classify as walking anywhere else, that's a possible misclassification.
posted by Mike1024 at 6:38 AM on March 18, 2009
Thanks for the ideas. I was hoping that there would be software to do this for me, but that doesn't seem to be the case. I'll have to try looking at distance traveled vs. total movement to get rid of the noise. Looking at satellite status is a good idea, but unfortunately the data I have is GPX only, so basically distance and location without satellite status or the receiver's idea of how accurate the location is.
posted by komilnefopa at 6:18 PM on March 19, 2009
To get rid of the random junk, I would try measuring distance travelled vs. total movement. If, in a set of 10 readings, you move a total of 100 m (1 to 2, 2 to 3, etc.), but the actual distance between reading 1 and reading 10 is only 3 m, it's probably random inaccuracy.
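A rough Python sketch of that comparison, assuming the track is a list of (lat, lon) pairs in decimal degrees. The function names, the non-overlapping windows of 10 points, and the 0.2 cutoff are illustrative guesses to tune against real data; the example above (3 m net over 100 m of movement) works out to a ratio of 0.03.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, p)
    lat2, lon2 = map(math.radians, q)
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def straightness(points):
    """Net displacement divided by total path length for a run of points.

    Close to 1 for steady travel in one direction, close to 0 for jitter
    around one spot. A run with no movement at all returns 0.0.
    """
    path = sum(haversine_m(a, b) for a, b in zip(points, points[1:]))
    if path == 0:
        return 0.0
    return haversine_m(points[0], points[-1]) / path


def noisy_windows(points, size=10, min_ratio=0.2):
    """Start indices of non-overlapping windows of `size` points that look like jitter."""
    return [i for i in range(0, len(points) - size + 1, size)
            if straightness(points[i:i + size]) < min_ratio]
```

Points in the flagged windows are candidates for removal. A genuine out-and-back trip also has a low ratio, so it is probably worth combining this with a speed or total-distance check before deleting anything.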
posted by pocams at 5:39 AM on March 18, 2009