How best to organize python scripts?
November 4, 2009 4:36 AM   Subscribe

What is the best way of organizing python scripts?

I have a quickly growing number of python scripts that I am using for some work. At the moment I don't have a problem remembering what the twenty or so scripts do, but I'm worried in a few months time that I will have completely forgotten.

I do document my work as much as possible, but when I stare in to a folder of a bunch of files it is a little hard to remember what exactly is going on in each file without opening it and starting to read through.

I don't have much of a formal background in programming so any suggestions about effective ways of organizing these would be appreciated.

I'm using Python 2.5 and Idle on Windows.
posted by a womble is an active kind of sloth to Computers & Internet (13 answers total) 5 users marked this as a favorite
 
Would a text file to act as a table of contents work?
posted by entropic at 4:44 AM on November 4, 2009


Are you familiar with how multi-modal command line programs work? That is, programs where the first argument to the command specifies one of a number of subroutines that will run. Version control systems (svn, git, darcs) are good examples of this ("svn checkout", "git pull"). It may help your organization if you combine scripts with related tasks into one command. So if you have five scripts that operate on widgets, you can write another script "munge-widget" that takes arguments like "copy", "reverse", "delete" or what have you that activate the functionality of the relevant individual program. I know in Ruby there are libraries designed to simplify the task of writing programs like this, assuredly they exist for Python. You may find it conceptually easier to create command-line options instead; for this, see the built-in optparse library.
posted by silby at 4:49 AM on November 4, 2009


The key phrase is "stare into a folder"; this implies that you are storing all of your files in a single folder. If you organize your files in a hierarchical fashion, using sub-folders, you might find yourself with only one or two scripts in a "bottom level" folder. If these are named correctly, it should be obvious what they do. For example, you might start by having "university work" and "outside work" folders at the top level. Outside work may then be partioned by project, while the University folder can be subdivided by course/unit/module.
posted by gene_machine at 4:52 AM on November 4, 2009


Best answer: Good naming, an overview document that you can easily access (notepad, google docs, whatever tool you're using for this), and usage texts in the scripts that are printed if you run the script with no arguments.

(If you're using module-level docstrings for the help text, you can use standard documentation tools to generate the index for you, or even write your own little indexing tool.)

I often write task-oriented notes as well, with transcripts of real sessions to show how the tools are used. Just cutting and pasting a session to a text file can be good enough -- seeing what you did three month earlier can be a great help when you're trying to do a similar thing again. Or mail the transcripts to yourself, so you have them sitting in a mail folder.

(This has nothing to do with formal CS education, of course. More a general ability of organizing things.)
posted by effbot at 5:17 AM on November 4, 2009


Best answer: Make them display a terse description of what they do when you execute them without arguments.
posted by flabdablet at 5:32 AM on November 4, 2009


Best answer: One thing I struggled with when I started with Python was being able to import my own work. Because I couldn't figure it out for a long time, I left everything I needed in one folder. The trick to importing stuff is to organize the scripts into sensible groups, make folders from the groups and then put an empty file called __init__.py in each folder so Python knows that's a module and lets you import from it. The easiest/ laziest thing is to just make a folder under your site-packages directory (which lives under Python25/lib on Windows) with your initials or something and then put your folders under that.

Then you can say

from tpc.widgets import myscriptname
posted by yerfatma at 6:24 AM on November 4, 2009 [1 favorite]


Categorize by usage, and make a hierarchy of directories. For twenty files, you should probably have about three or four groups if you divide them well.

Name the files descriptively.

Put """Docstrings at the top of the file""" so you have a quick description and then a detailed description.
posted by cmiller at 7:14 AM on November 4, 2009


Best answer: What cmiller said about filenaming and docstrings. Docstrings are your friend. But I'd start smaller and simpler than that: put all your scripts in a folder say c:\Scripts. Then set an environment variable (under System Properties in Windows: Win+Pause, Advanced System Settings, Environment Variables) called PYTHONPATH to be equal to c:\Scripts. That will let you open a python prompt anywhere and do "import myscript". More importantly, it will let you do "import myscript" from within other Python files. Learn this (ugly) trick: in any python file you can structure things like:
def myfunction(arg):
   ...

if __name__ == "__main__":
   ...
Anything that goes after the if is called only when you execute the script itself, but NOT when you import it into another script. You can then create another script that simply does this:
from myscript import myfunction

myfunction("foo")
That way you can start creating a library of functions that you use often. After that point you can start worrying about packages, __init__.py, etc. Start small. And welcome to the best programming language ever.
posted by costas at 7:28 AM on November 4, 2009


Response by poster: Thanks for all the suggestions - my productivity has really increased by using python and this will make it even better.

I like the idea of having the script print out what it does by default, and keeping a detailed list of what each file is with some sample output is a good idea too.

I have been creating functions and importing each script to run that specific function; it did not occur to me to structure this hierarchy using folders, or that this is something one can do in python. I will explore this further.

costas, I went to set my environmental variable, and PYTHONPATH was already set to a folder of a program (C:\Program Files\ArcGIS\bin). I'm using Python for controlling this program (ArcGIS), but my python scripts are all in C:\python_code. Can I just add another variable and add that to the python path?
posted by a womble is an active kind of sloth at 8:26 AM on November 4, 2009


Response by poster: and metafilter related questions gave me this..... didn't see it when I searched earlier.
posted by a womble is an active kind of sloth at 8:27 AM on November 4, 2009


Best answer: PYTHONPATH is a list of directories, separated by semicolons (on windows, at least), so you can set it to something like:

  C:\python_code;C:\Program Files\ArcGIS\bin

This page contains a bit more info.

You can verify that Python picks this up by starting Python from the command line and typing

  import sys
  sys.path

(this gives you a list of all directories that Python checks when looking for a given module)

Also note that you can run modules as scripts with the -m option; "python -m module" will search for module.py along the path, and the run it as if you'd spelled out the full path to the file.
posted by effbot at 8:51 AM on November 4, 2009


Best answer: Obviously naming the scripts intelligently and figuring out a worthwhile folder hierarchy will help, but there's plenty of documentation generation tools you can use and incorporate into your habits.

You've already seen mention of docstrings, but there's also PyDoc to generate HTML.

So my suggestion is to organize them like so:

/
|-> doc
|-> src
   |-> category1
   |-> category2
   |-> category3
|-> build (build scripts for documentation)
| README.txt
| ant.xml (or whatever build script you want for documentation generation)
| .svn (or whatever revision control you feel like; just use an .ignore for the autogenerated files)

If you need them installed on PATH, then add an install rule to copy or symlink in a system folder. Or a group folder and add it to PATH.
posted by pwnguin at 10:14 AM on November 4, 2009


Write comprehensive doctests and a nice suite of unittests for them. Then it'll be basically self-documenting.
posted by i_am_a_Jedi at 4:14 PM on November 4, 2009


« Older What are my best options if I only need a car for...   |   People in Europa / Ciao ciao bella, Monaco Newer »
This thread is closed to new comments.