Running a Python Script every week on a remote server
March 24, 2021 6:30 AM
Hello, I have no programming experience. However, I run a private subreddit with a bot that must run every week. Before, I had a member run the bot on their server, but they just left. I would like to run it myself to avoid any future issues. Do you have any recommendations on the easiest way to accomplish this goal?
It's a small bot that calls on the reddit API. My member could probably walk me through these steps, but they deleted all traces of themselves, as is the norm.
The bot script is in Python.
So, basically, I know I need to pay for some server space, maybe install some kind of program on that space to run python, and then also install some kind of timer on that script?
That leads me to 100 questions that I just don't know the detail to ask - what server space is cheapest/best for this application? What program should I install to run the script on the server? (if that's even how servers/python works). What timer code should I use for this? I don't know python - but the bot seems pretty straightforward.
Or, if there is some resource out there that could walk me through this process/hold my hand, I would pay for assistance in the process. I'm just not sure where to find someone like that.
Thanks so much!
I would be thinking about AWS Lambda for this, rather than using a server at all - https://docs.aws.amazon.com/lambda/latest/dg/welcome.html
On the AWS free tier, you get 1 million Lambda executions a month, with up to 3.2 million seconds of compute time (at the smallest 128 MB memory setting). I imagine that's enough.
You can set up scheduling for it via a cron expression (if you were using a server, you'd use cron as well).
The trickiest thing with Lambda is bundling up the dependencies - see https://docs.aws.amazon.com/lambda/latest/dg/python-package.html for more details.
It's plausible that this will be the easiest to manage long-term, since you don't need to care about server maintenance, etc.
This assumes that the Python script doesn't need anything like a database - that's possible to do, but will make it a bit more complex.
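To give a feel for what "putting the script into Lambda" means, here is a minimal sketch, not the actual bot. Lambda calls a function that takes an event and a context; `lambda_handler` is the conventional default name, while `run_bot` is a made-up stand-in for whatever the existing script does.

```python
def run_bot():
    """Stand-in for the existing bot's main routine (hypothetical name)."""
    # The real script's Reddit logic would go here.
    return "bot run finished"


def lambda_handler(event, context):
    """Entry point that Lambda invokes; `event` and `context` are supplied by AWS."""
    result = run_bot()
    # Returning a small dict makes the outcome visible in the Lambda console logs.
    return {"statusCode": 200, "body": result}
```

The scheduled trigger then just calls `lambda_handler` on whatever schedule you give it; the handler itself doesn't need to know about timing.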
posted by siskin at 6:40 AM on March 24, 2021
Thank you both - I can definitely look into this.
If you don't mind indulging my ignorance and saving me a few hours of googling - any primers on what a cron expression is, how to run it, what a dependency is, etc?
No databases or anything. It's really simple code, basically logs into reddit, reads a specific thing, does a specific thing, and it's done.
I really don't need you to hold my hand, but clicking that link, where it tries to define these things, is a bit like reading a different language.
But, it sounds like what you are saying is - 1. make a lambda account thingy, 2. install some kind of pythony code into it, 3. make sure my file is the right type, 4. upload it into the lambda thingy, 5. install some kind of cron expressiony thingy, that is pointed at my zip file, and is set to run every week.
Are those about the right steps?
posted by bbqturtle at 6:50 AM on March 24, 2021
That's right.
A cron expression is something like:
0 4 * * SUN
where the fields are:
minute hour day-of-month month day-of-week
The above expression would run a script weekly at 4am every Sunday morning. Cron itself is the name of the Unix service that typically does this on a physical server, but cron expressions are used more generally whenever processes are scheduled. A helpful site to decode them is https://crontab.guru/#0_4_*_*_SUN.
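If it helps to see the five fields spelled out, here is a small Python sketch that labels each field of the expression above and works out when "Sunday at 4am" next falls. It is only an illustration of the idea, not a general cron parser, and the function names are made up for this example.

```python
from datetime import datetime, timedelta

FIELD_NAMES = ["minute", "hour", "day_of_month", "month", "day_of_week"]


def describe_cron(expression):
    """Split a five-field cron expression into named fields."""
    parts = expression.split()
    if len(parts) != 5:
        raise ValueError("expected 5 fields, got %d" % len(parts))
    return dict(zip(FIELD_NAMES, parts))


def next_sunday_4am(now):
    """Return the first Sunday 04:00 strictly after `now`."""
    # datetime.weekday(): Monday is 0, Sunday is 6.
    days_ahead = (6 - now.weekday()) % 7
    candidate = (now + timedelta(days=days_ahead)).replace(
        hour=4, minute=0, second=0, microsecond=0
    )
    if candidate <= now:
        # Already past 4am this Sunday, so move to next week's.
        candidate += timedelta(days=7)
    return candidate


fields = describe_cron("0 4 * * SUN")
print(fields["hour"], fields["day_of_week"])
print(next_sunday_4am(datetime(2021, 3, 24, 6, 30)))
```

One caveat: as far as I know, AWS's scheduled rules use a six-field variant with an extra year field and a `?` placeholder, so the same schedule there would look like `cron(0 4 ? * SUN *)` and is interpreted in UTC.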
A dependency would be something like a library used by the Python script. If the Python is simple enough, there may not be any, and so you may be able to just paste the Python code into the lambda code editor without worrying about it. However, from some googling, it looks like there are some Python libraries for the Reddit API that may be being used by your script (PRAW and Pushshift) - I don't know anything about them, but they might be things you need to include in a zip file that you upload.
posted by siskin at 7:04 AM on March 24, 2021
Does it even need to be a remote server? If you have reliable home internet access and a computer that's on all the time, you don't need to pay for a cloud server at all.
Macs have cron and python built-in. On Windows, you could install Python and run the Task Scheduler. Set up your computer to run the task on a schedule, and you're done.
If you don't have a desktop computer, you could buy a Raspberry Pi and have it do it—the Pi comes with Python and cron.
posted by vitout at 7:13 AM on March 24, 2021
While I like the idea of just running it by my PC, I'm not sure I want to pay to leave my PC on all the time. It's a bit older and pretty loud. And if I missed one, it would be a bit of an annoyance.
posted by bbqturtle at 7:16 AM on March 24, 2021
It looks like the bot does use PRAW. In the first line it says "import PRAW, random, time, pickle, sys, os, configparser, datetime, import prawcore"
posted by bbqturtle at 7:18 AM on March 24, 2021
Well, your PC wouldn't necessarily have to be on all the time. Do you shut it down completely when you're not using it? Or do you put it in standby / sleep?
Windows Task Scheduler can wake a computer to perform a task.
posted by vitout at 7:23 AM on March 24, 2021
You should be able to install PRAW on your server using “pip install praw”.
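A quick way to check whether the install worked (and whether anything else is missing) is to ask Python which of the modules from that import line it can find. This is a rough sketch; the module list is copied from the import line quoted above, with `praw` written in lowercase because that is the name pip installs.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that Python cannot locate on this machine."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Modules named in the bot's import line; all but praw and prawcore
# ship with Python itself.
BOT_IMPORTS = [
    "praw", "prawcore", "random", "time", "pickle",
    "sys", "os", "configparser", "datetime",
]

if __name__ == "__main__":
    missing = missing_modules(BOT_IMPORTS)
    if missing:
        print("Still need to install:", ", ".join(missing))
    else:
        print("All imports found.")
```

If it reports `praw` or `prawcore` as missing, the pip command above is the thing to run (prawcore normally comes along with praw).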
posted by en forme de poire at 7:24 AM on March 24, 2021
I might gently suggest finding a friend or another subreddit member to run this. Even if it's a simple script, when you get into running it on a server or AWS, you're signing up for some ongoing maintenance headache. At the very least occasional security patches, server migrations, dependencies getting deprecated (e.g. sometimes AWS deprecates python versions for Lambda), etc. If all of this stuff is truly foreign to you, and it's not something you're interested in learning about for its own sake, I worry that it's going to break at some point in the future and leave you in a jam. I'm absolutely not saying that you're not capable of learning or doing this; just that I feel like you're getting yourself into a long term commitment here that you might not want if this isn't a passion of yours....
posted by primethyme at 7:48 AM on March 24, 2021
AWS can get dizzyingly complex for someone that's not an engineer.
I'd opt for getting the script running locally on my PC. If it can run more than once a week without issue, I'd set up the cron / Task Scheduler to run it every day, during a time that your PC is likely to be on. As long as you don't go more than a week without your PC powered on, you'll be fine.
If you seldom use your PC and don't want to rely on it being turned on, I'd buy a cheap, low-specced Raspberry Pi, connect it to my network (via Ethernet if possible, Wi-Fi if not), and set up cron to run the script daily.
If this doesn't sound like anything you're interested in doing, I'd get another subreddit member to help, as primethyme suggested.
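If the bot should not actually act more than once a week, the run-it-daily idea above can still work with a small guard: remember when the last run happened and skip if it was less than seven days ago. This is a sketch under assumptions; the state file name and the `run_bot` placeholder in the usage comment are made up for illustration.

```python
import json
import os
from datetime import datetime, timedelta


def should_run(state_path, now, min_gap=timedelta(days=7)):
    """Return True if no run is recorded or the last one is at least min_gap old."""
    if not os.path.exists(state_path):
        return True
    with open(state_path) as f:
        last = datetime.fromisoformat(json.load(f)["last_run"])
    return now - last >= min_gap


def record_run(state_path, now):
    """Store the time of the run that just finished."""
    with open(state_path, "w") as f:
        json.dump({"last_run": now.isoformat()}, f)


# Intended use inside the daily scheduled script (names are placeholders):
#
#     now = datetime.now()
#     if should_run("last_run.json", now):
#         run_bot()
#         record_run("last_run.json", now)
```

Recording the time only after the bot finishes means a failed run gets retried the next day instead of waiting a full week.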
posted by vitout at 8:16 AM on March 24, 2021
Yeah, this is one of those times where the technically-correct answers (a Lambda or a cron) might not be the correct ones because you need to know how to handle things that break, whether it's on your end or with the Reddit API.
I run exactly those kinds of tasks -- regular API scrapes and parsing with Python -- on a little first-gen Raspberry Pi plugged into the router; a Pi Zero W would be more than enough if you were starting afresh. But I also wrote the scripts, so I know what's meant to happen each step of the way and get debug / error messages sent to an email address.
As others have said, regular weekly housekeeping via the API absolutely doesn't need something always on, so you could use Task Scheduler. But as you say, forgetting to switch on your PC one week would be a right old pain in the rear.
So if you don't fancy running a Raspberry Pi and learning what to do when things break, I'd suggest farming it out. Chances are one of your members has a Pi or a NAS or just a regular web hosting account with shell / cron capability. (I think even janky cPanel hosts have cron.)
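For what it's worth, the "error messages sent to an email address" part can be fairly small. The sketch below wraps a task so that any crash is turned into an email message; the addresses are placeholders, and actually sending it (for example with `smtplib` and your mail provider's settings) is left out because that part depends on your provider.

```python
import traceback
from email.message import EmailMessage


def build_error_email(exc, sender, recipient):
    """Turn an exception into an email message containing the traceback."""
    msg = EmailMessage()
    msg["Subject"] = "Weekly bot failed: %s" % type(exc).__name__
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(
        "".join(traceback.format_exception(type(exc), exc, exc.__traceback__))
    )
    return msg


def run_with_report(task, notify):
    """Run `task`; on any exception call `notify(exc)` and return False."""
    try:
        task()
        return True
    except Exception as exc:
        notify(exc)
        return False
```

Here `notify` would be a function that builds the message with `build_error_email` and hands it to whatever mail-sending setup you have, so a broken run shows up in your inbox rather than failing silently.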
posted by holgate at 8:48 AM on March 24, 2021
While I agree that farming it out is the best solution, I've done that twice before, and both times I was left nervous when things were missed / people disappeared on me randomly.
If you guys think running it locally is that much simpler than running it on a server, then maybe I should look into that!
...Does that change the steps a lot?
posted by bbqturtle at 10:08 AM on March 24, 2021
I would argue that GitHub Actions would be an excellent candidate for filling this niche. It would all happen in the cloud and you can manage its behavior via GitHub's tools. If you paid for a private account you could hide the code entirely, but if that doesn't matter you could do it in the open for free (secrets like API keys and such would be hidden, of course).
* GitHub Actions support running cron-style as well as on demand when you click a button
* Collaboration on both the code and the configuration to run that code happens in the same place
* There is lots of good documentation and examples to steal from
* The community has created many GitHub Actions that you can use directly, so many common tasks are already automated for you.
It would be straightforward to provide your own Docker configuration for this task and then have an action that builds that container and runs it on GitHub's servers once a week.
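On the secrets point: hosted runners like this typically hand secrets to the script as environment variables, so the code never contains the password itself. A minimal sketch of reading them in Python is below. The variable names are assumptions for illustration; the bot would use whatever names you configure.

```python
import os

# Hypothetical variable names; choose your own when configuring the secrets.
REQUIRED_KEYS = (
    "REDDIT_CLIENT_ID",
    "REDDIT_CLIENT_SECRET",
    "REDDIT_USERNAME",
    "REDDIT_PASSWORD",
)


def load_credentials(env=None):
    """Read the bot's credentials from environment variables, failing loudly if any are unset."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("missing settings: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}
```

The returned values would then be passed to the Reddit login in the bot. If the existing script reads its login details from a config file instead (the `configparser` import suggests it might), that file should stay out of any public repository.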
posted by hobu at 11:52 AM on March 24, 2021
Here's a potential simple-ish and cheap hosted option: NearlyFreeSpeech is a pay-as-you-go, work-it-all-out-yourself host that offers scheduled tasks, though it's limited to running at the top of the hour. It'd probably cost you a cent a day. I've just tested it on the account I've had hanging around for a while and it works for me.
posted by holgate at 2:25 PM on March 24, 2021
I think you have a few options in order of increasing complexity and billing:
1) Run it locally in a shell, and just set a calendar reminder to do it weekly-ish.
2) Run it as holgate suggests - on a NearlyFreeSpeech shell.
3) Buy a really cheap Raspberry Pi, run it there.
I think the middle option is probably your best bet, or find another community member you trust. I would not go down the path of a cloud service (this is what I do as a professional, and for a home user with this single use case it's just totally unnecessary).
I do think that if you have another use for it there is some utility in a droplet from DigitalOcean or the like, but the complexity ramps up real quick and the potential financial liability if you don't hold on to those credentials isn't great. Docker, Lambda, GitHub Actions, etc. are all orders of magnitude more complex than you need to deal with to do this.
posted by iamabot at 5:08 PM on March 24, 2021
"I'm not sure I want to pay to leave my PC on all the time. It's a bit older and pretty loud."
It's astonishing how many uses you'll find for a small but capable machine that runs cool and completely silent and consumes less than five watts even at full load (less than two when idle) so that leaving it running 24x7 causes no problems.
- Buy an Odroid N2+ including the optional 12V power supply.
- Buy a high endurance micro SD card. If you don't already have a way to access those cards from your daily driver computer, get a micro SD to USB adapter as well.
- Download an Armbian Buster image for the N2+ and follow the Armbian Quick Start Guide to get it onto the micro SD card.
- Fit the N2+ with a CR-2032 lithium coin cell to let it keep time when powered down.
- Fit the micro SD card to the N2+, connect an Ethernet patch cable between its Ethernet port and a spare LAN port on your home router, then connect the power plug to make it go. You should see a heartbeat light start flashing on the board, and some activity lights flickering on the Ethernet socket.
- On your daily driver, open a Terminal window (Mac OS or Linux) or a CMD window (Windows 10), type
ssh root@odroid-n2
and hit Enter. When prompted, allow ssh to add this new host to its list of trusted hosts. You'll then see a login prompt. The password is initially 1234 (you won't see that echoed, just type it blindly and hit Enter).
- Armbian will prompt you to enter and confirm a new root password. Again, you won't see the passwords echo. Use the new one next time you ssh in.
posted by flabdablet at 12:46 AM on March 25, 2021
Sorry, I see you need your script to run weekly. Put the symlink inside /etc/cron.weekly instead of /etc/cron.daily.
posted by flabdablet at 12:53 AM on March 25, 2021
Except /etc/cron.weekly runs at the time specified in the default crontab so if the script needs to execute at a specific time there'd still be some fiddling.
posted by holgate at 12:33 PM on March 26, 2021
It's a server. There's always fiddling.
Trying the simplest way first and seeing if it's workable is still good practice.
posted by flabdablet at 12:40 PM on March 26, 2021
This thread is closed to new comments.
posted by condour75 at 6:33 AM on March 24, 2021