What is the best back-end approach for my eBay polling web service?
February 28, 2015 3:34 PM Subscribe
Is running regular wget cron jobs every few minutes the best way to automate scanning the eBay API for batches of searches?
Howdy howdy,
I know a modest amount of PHP that I've picked up along the way as I've put together some t-shirt websites, but there are some big gaps in my basic knowledge of how to approach certain problems. One of the little side projects I've put together for myself is a little web app that lets me set up eBay searches and hit their API looking for "Buy It Now" items that match a search and are priced below a threshold I've set for it, and when it finds one it texts me a link to the item.
Right now I just have it set up for myself, with about 20 searches stored in a DB. Every 5 minutes a cron job on my VPS uses wget to load a PHP page, which takes all the searches from the DB, runs each through the eBay API, and sends off a text if it finds a new item matching my criteria. But I've got a few friends that want to get in on it, so I've decided to rewrite things to allow for additional users, and I'm worried this approach won't scale well.
I'm not talking thousands of users, but I can imagine even a relatively small number of users with a few dozen searches each could cause things to time out, and as it expands there's the risk of the script from one cycle still running when the cron job hits the page a second time.
Basically I'm assuming I don't know enough yet to know the right way to do this, and I just need to be pointed in the right direction so I know what I need to learn next!
Thanks much.
The biggest limitation here will be the eBay API call limits. For non-compatible applications, pretty much everything is limited to 5,000 calls a day. You alone (12 different searches, 12 times per hour, 24 hours a day) are already flirting with the limit at 3,456 calls. Add 20 of your closest friends and that is 69,120 calls a day.
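To make that arithmetic easy to re-check as you tweak the polling interval or per-user search count, here is a back-of-the-envelope sketch (the numbers are rockindata's; adjust them to your real setup):

```php
<?php
// calls/day for one user = searches × runs per hour × 24 hours
$searches     = 12;  // searches per user
$runsPerHour  = 12;  // cron fires every 5 minutes
$callsPerUser = $searches * $runsPerHour * 24;

echo $callsPerUser, "\n";        // 3456 calls/day for one user
echo $callsPerUser * 20, "\n";   // 69120 calls/day with 20 friends
```

Lengthening the cron interval or capping searches per user are the two easiest knobs if you need to stay under the 5,000/day cap.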
If you want to make this app a reality, you will have to go through the Compatible Application Check so that you can bump your call limit from 5,000 to 1.5 million.
posted by rockindata at 4:02 PM on February 28, 2015 [1 favorite]
That said, your basic approach is fine; there are a million ways to run through a list of calls to a remote API and display the results on a webpage. I would do it in Python, but that's because that's where I'm most comfortable.
posted by rockindata at 4:11 PM on February 28, 2015 [1 favorite]
The basic approach is "fine," and should work for your solution, but if you want to learn better ways, having to load a web page just to run a little script is really inefficient. As a first step, you can run a PHP script directly without involving the web server: 'php script.php' will run the code in the file. Beyond that, you might want to consider learning a scripting language that is actually designed for stuff like this: Python, Ruby, maybe even Perl or shell. PHP will work OK for it, but it isn't a great language and comes with a bunch of baggage you don't need for this kind of application.
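Concretely, the crontab change is a one-liner (paths and URL here are placeholders, not the asker's actual setup):

```shell
# before: fetch a web page every 5 minutes just to trigger the script
*/5 * * * * wget -q -O /dev/null http://example.com/poll.php

# after: run the same code through the PHP CLI, no web server involved
*/5 * * * * /usr/bin/php /home/you/poll.php >> /home/you/poll.log 2>&1
```

The CLI version skips the web server and its request timeout entirely, which also removes one of the ways a long-running batch could get cut off mid-cycle.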
All that said, if your primary interest is just having something that works rather than learning a better way, feel free to ignore me.
posted by primethyme at 5:32 PM on February 28, 2015 [1 favorite]
Ignore primethyme, PHP is fine. :) It might be 'uncool,' but that doesn't mean jack. As a language it does have some gotchas, but being based on PHP didn't stop Facebook and it won't stop you.
You'll want to add locking code so that if the cron job runs every 5 minutes but one run ends up taking 7 minutes, you don't trip over yourself. Something simple like the snippet at http://www.tuxradar.com/practicalphp/8/11/0, where the second call into the script just exits because the file is already locked, is fine.
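A minimal sketch of that locking pattern, in the spirit of the tuxradar snippet (the lock-file path is a placeholder):

```php
<?php
// Acquire an exclusive, non-blocking lock on a well-known file.
// If a previous cron run still holds it, exit instead of piling up.
$fp = fopen('/tmp/ebay-poller.lock', 'c');
if (!flock($fp, LOCK_EX | LOCK_NB)) {
    // another instance is still running; bail out quietly
    exit(0);
}

// ... run the searches here ...

flock($fp, LOCK_UN);  // release so the next cron run can proceed
fclose($fp);
```

LOCK_NB is the important flag: without it, overlapping runs would queue up behind each other instead of exiting, which is exactly the pile-up you're trying to avoid.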
In addition, you'll want to keep your database updates small, so that you can run multiple queries at the same time without things blowing up. User A's search for pocketwatches shouldn't have an impact on User B's search for shoes, so you can run those queries in parallel. Unless two users are searching for the same thing, it should be possible to run quite a few queries at once.
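One way to keep updates small is one short transaction per search rather than a single giant one for the whole batch. This sketch uses a hypothetical schema (table and column names are made up for illustration) and an in-memory SQLite DB:

```php
<?php
// Hypothetical schema: each row is one saved search.
$db = new PDO('sqlite::memory:');
$db->exec('CREATE TABLE searches (id INTEGER PRIMARY KEY, query TEXT, last_run INTEGER)');
$db->exec("INSERT INTO searches (query, last_run) VALUES ('pocketwatches', 0), ('shoes', 0)");

foreach ($db->query('SELECT id, query FROM searches')->fetchAll(PDO::FETCH_ASSOC) as $s) {
    // ... call the eBay API for $s['query'] here ...

    // One short write per search: nothing holds locks while the
    // (slow) API call for the next search is in flight.
    $db->beginTransaction();
    $stmt = $db->prepare('UPDATE searches SET last_run = ? WHERE id = ?');
    $stmt->execute([time(), $s['id']]);
    $db->commit();
}
```

With short transactions like this, several worker processes can chew through different users' searches in parallel without contending for the same rows.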
Apologies if this is too basic, but you'll want to read up on database normalization.
You'll also want to add a logfile so you can debug what happened when things inevitably go wrong.
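Even something this simple will do for a logfile (path and message format are placeholders):

```php
<?php
// Append a timestamped line to a logfile; LOCK_EX keeps concurrent
// runs from interleaving their writes mid-line.
function logMsg(string $msg): void {
    $line = date('Y-m-d H:i:s') . ' ' . $msg . "\n";
    file_put_contents('/tmp/ebay-poller.log', $line, FILE_APPEND | LOCK_EX);
}

logMsg('run started');
logMsg('search "pocketwatches" returned 0 new items');
```

When a run silently produces no texts, grepping this file tells you whether the cron job fired at all, which searches ran, and where it stopped.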
The next step, though, is to deal with eBay's API limits as rockindata's first comment suggests; according to their math, you'll hit the limit quite soon.
posted by fragmede at 1:59 PM on March 2, 2015 [1 favorite]
This thread is closed to new comments.
posted by bitdamaged at 3:37 PM on February 28, 2015 [1 favorite]