I'm using Python in my research now to build some models and analyze some data. I've done this with Matlab and R in the past and became fairly used to running individual lines or blocks of code at a time and doing a lot of interactive printing and plotting from the console. This behavior seems to be a bit more difficult in Python and I'm trying to find the best way to be productive.
The behavior in Matlab is called 'Cell Mode' and it is common in a lot of scientific software. It lets you demarcate code regions with a special comment character, and with Ctrl+Enter you run the current cell where your cursor is located. Cells are kind of like methods but you don't call them -- you move the cursor to run individual cells, or you run all the cells in a file sequentially. Variables within a cell persist after a cell is run.
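To make that concrete, here is roughly what I mean, sketched in Python. I'm assuming the `# %%` comment as the cell marker, which is the convention Spyder recognizes; the data and variable names are made up for illustration.

```python
import statistics

# %% Cell 1: load some data (hypothetical values standing in for a real dataset)
readings = [12.1, 11.8, 12.6, 13.0, 12.3]

# %% Cell 2: summarize; 'readings' is still defined from the previous cell
mean_reading = statistics.mean(readings)
spread = statistics.stdev(readings)

# %% Cell 3: the cell I would tweak and rerun on its own while exploring
threshold = mean_reading + spread
above = [r for r in readings if r > threshold]
print(f"{len(above)} of {len(readings)} readings above {threshold:.2f}")
```

The point is that I can change `threshold` in the last cell and rerun just that cell without reloading or recomputing anything above it.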
For complex and large tasks it makes sense to write more 'proper' code, separating data processing and analysis steps into separate methods or files, but most of the time this is overkill, and it makes it harder to do interactive data exploration and plotting.
I currently use an IDE called Spyder which is recommended by other scientific Python users. It has a feature called 'Run selection or current block' which is supposed to replicate this, but it is buggy and breaks if you use loops. I've been looking for another IDE that might have this feature and come up empty and don't have time to try a bunch. Any suggestions?
Alternatively, I have been trying to get into a more traditional programming workflow, separating my code into functions and using breakpoints and debuggers when I need to do interactive plotting and printing. I get frustrated by scoping: variables within functions are local, of course, but I often want to print and plot them as I'm writing the code so I can tweak things. The only way I can get access to them is to stop execution within that method, and I find breakpoints and debuggers clunky. Maybe I just need to get better at this. As it stands, I often can't get things to work the way I want, I use a lot of global variables, and I find I want to interactively access local variables from other methods that I thought I wouldn't need. In general I am wasting time and headspace on this problem when I want to be focusing on my data. What are the best practices here?
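A minimal sketch of the scoping issue I keep running into (the function and variable names here are hypothetical, just for illustration):

```python
def fit_model(samples):
    # 'residuals' is local to fit_model and is gone once the function returns
    mean = sum(samples) / len(samples)
    residuals = [x - mean for x in samples]
    return mean

estimate = fit_model([2.0, 4.0, 9.0])

# In a cell-style script I could just plot 'residuals' here.
# After a function call, the name does not exist at this level.
try:
    residuals
    residuals_visible = True
except NameError:
    residuals_visible = False

print("estimate:", estimate, "| residuals visible:", residuals_visible)
```

If I want to look at `residuals` while tweaking, I either have to return it, make it global, or stop inside the function with a debugger, and none of those feels as quick as just rerunning a cell.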
posted by Blazecock Pileon at 2:10 PM on January 15