Programming Languages for not-quite Luddites?
January 23, 2012 8:04 AM
What online resources do you suggest where I can learn about various programming languages as opposed to learning how to code? Any sources that discuss the pros and cons of each or why it would be better to do X in Ruby rather than Python?
I'm an archivist who primarily has been working processing paper documents and I'm interested in migrating into digital archives. I have been doing a lot of readings recommended by friends/colleagues who are digital archivists, mostly having to do with workflows and metadata, published as reports sponsored by OCLC and NISO, as well as papers authored by some of my friends regarding digital initiatives at their organizations. From what I gather from my friends and what I've read and observed, I don't necessarily need to know how to program, but it's really helpful if I knew enough to be able to speak with programmers intelligently.
I'm no techie, but I'm not a total luddite. I know some basic HTML and CSS, although it's been a while since I've had cause to do anything with them. I have created finding aids in EAD (Encoded Archival Description) using a template in NoteTab Pro and later using the Archivists' Toolkit. As part of my MLIS (2006) I took a class in database design and we had to learn a bit of SQL and a teeny tiny bit of ColdFusion (which I've since forgotten). I'm used to working with metadata content standards. I could probably figure out a crosswalk (between EAD and MODS, for example), but I wouldn't know how the back-end programming works to actually transform from one to the other. I don't know if I'd need to, but I'd at least like to know enough to not be totally lost when talking to someone from the IT dept. Oh, and I know that this is baby stuff, but I am participating in Code Year.
Oh, and the kinds of sites/projects that digital archivists get to work on include the Polar Bear Expedition Digital Collection and the Digital Collections at Joyner Library, East Carolina University, among hundreds of other examples available online.
The decision of whether to do something in language A vs language B, to me, involves two separate pieces of research.
First, there's the question of what support tools or libraries are available in each language. For example, I know that Python has a wonderful set of numerical and scientific libraries for manipulating and plotting data. Ruby, to my knowledge, does not. It's good to be familiar with both the standard libraries for common languages (what comes more or less built-in) and the publicly available code libraries (what everyone else has built and shared).
Second, there's the question of how the language itself is built and what programming paradigms it encourages. For example, programming in an object-oriented paradigm is more difficult in a language like C, which wasn't built with support for that in mind, than in a language like Java, which is built to help you program that way.
Sounds like it would be helpful to gain some Category 1 knowledge because Category 2 stuff is more about how you'd actually write the code itself. Talking about what people are doing and how they're doing it will inevitably involve getting together a bunch of libraries that do what you want and writing code to make them talk to each other, establishing a chain of events that converts some input into whatever kind of output you need.
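To make the "libraries talking to each other" idea a bit more concrete, here is a minimal sketch in Python. It uses only modules from Python's standard library (csv, io, json); the function name csv_to_json and the sample data are made up for illustration and are not taken from any real archival system.

```python
import csv
import io
import json


def csv_to_json(csv_text):
    """Glue code: one library reads the input, another writes the output."""
    # The csv library turns each line of text into a dictionary keyed by column name.
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # The json library turns that list of dictionaries into JSON text.
    return json.dumps(rows, indent=2)


# Hypothetical sample input: a tiny spreadsheet export.
sample = "title,year\nExpedition diary,1919\nLetter home,1918\n"
print(csv_to_json(sample))
```

Almost none of this is original logic. The few lines written by hand only connect what the two libraries already do, which is roughly what a lot of real-world project code looks like.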
There's likely going to be discussion involving what parts of the system will be covered by existing libraries you can use and what parts will involve writing something from scratch. Here it will be important to understand why the decision is being made to do something from scratch (something doesn't exist in the library? it exists but works in a way you can't use...etc.) because this will be the portion of the software that takes longer and will cost you more money if you need to hire more people to do it.
I threw a bunch of random stuff in here because I wasn't sure what your involvement in the programming effort is going to be like. Let me know if you want more specifics. I'm no expert but maybe I can point you in the right direction.
And here's an interesting link that I've had for a while to an ebook for "Programming Historians". Seems a bit relevant to your work. Maybe it will have some good insights on actually using programming to do things similar to what you're looking to do.
posted by musicismath at 8:59 AM on January 23, 2012
michaelh, thanks for the links.
musicismath, maybe it would help to admit that I barely understand your first 3 paragraphs. I mean I get the basics, that each language has its own library of tools or functions and that will help determine which one is best for the task, but that's where it ends. I will look into all the links, particularly Programming Historians.
Maybe it would help to add that if I got a job as a digital archivist, it would most likely be in an academic setting. It would be highly unlikely that I would have access to funds to hire a programmer. I would be working with someone from the library's IT staff who undoubtedly will be stretched very thin. It's also very likely that my supervisors will know even less about this stuff than I do. The best I can hope for is some colleagues, both in the library and IT departments, who are equally jazzed about creating digital collections and have the skill sets to make it happen or are willing to develop them. My role, as I understand it from friends in this position, will be more of a coordinator, setting up policies and procedures, workflows, reviewing best practices, etc. However, the more I understand about what is possible/impossible and the more that I can speak the language, the more I think I'd be able to build up some goodwill and buy-in from the IT department.
Oh and any digital archivists who happen to see this, please feel free to chime in, particularly if you think that my understanding of what a digital archivist does is way off base.
posted by kaybdc at 9:30 AM on January 23, 2012
ESR's evaluation of various languages. The particular technologies there are out of date (in particular, Ruby doesn't show up, the JS situation has improved since then, and there is no mention of MS C#/.net stuff -- 5 years changes a lot!), but the ideas there are sound [1]
I *am* a pro-dev, and I still have to make these same evaluations of whether to do Project X in tech Y, while being aware of my own blind spots. My method: I tend to suggest a 'straw' implementation in tech Y, and ask for feedback. Occasionally, I am surprised that a solution already exists (for me, most often in Ruby, since I do mostly python/javascript). Where do I ask these questions? My local meetups (PyMNtos, and Ruby.mn), at work, irc, StackOverflow / Quora, etc.
A particular problem with archiving is that (for people I know) it's not a sexy/fun programming area. OCLC is a mess to deal with, EADs are unpleasant, and the whole area is filled with subject-specific XML/XSLT grubbiness, etc. Perhaps talking with people from Koha would help shed some light on particular questions as well?
[1] and as an (opinionated) update...
C is still C
C# is similar to Java, but for MS, with particular awesomesauce at having a great standard library. Java still exists, and Android makes it more relevant. Obj-C is only used on Apple.
Python / Ruby are similar, though Python has Numpy (as stated above). Ruby has Rails, Python has Django. Both have mostly displaced Perl as weapon of choice for admin/scripting/web services / glue with other technologies
R is there if you really want to do stats (and plotting)
Lua or JS would replace TCL most places where it would have been used
JS is growing a lot, in many unexpected ways (Node.js for server side stuff, for example), but is still weirdly limited to in-browser contexts in many cases. This might change by the time I hit submit!
For a particular project, start from "there is a good library in language X", and see how far that goes!
posted by gregglind at 9:59 AM on January 23, 2012
I would be working with someone from the library's IT staff who undoubtedly will be stretched very thin.
Then your answer, unfortunately, will have less to do with what might be the ideal language for the job than with what language the IT staff is already familiar with.
(That said, while I don't know much about the specific academic area you're working in, if it involves a lot of text munging and XML manipulation my go-to solution would be Perl, the duct tape of programming languages.)
posted by ook at 11:05 AM on January 23, 2012
It might also help to divide this project up into parts so that the IT person on the other end doesn't feel overwhelmed and so that it's easier to explore solutions together.
For example, if you were creating a digital archive site it might involve:
scan pages -> tag by category and date -> convert to xyz format -> input into database -> design front-end -> link front-end with database -> etc etc
Once you break the project up into pieces you can start finding tools to help you get them done and/or break the programming task down into bite-size pieces. Also, the more specifically you can understand what you're trying to do the easier it will be to ask the internet about it.
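As a rough illustration of breaking the chain above into bite-size pieces, here is a small Python sketch in which each step is its own function. Everything in it is hypothetical: the function names (tag, convert, store, run_pipeline), the file name, and the in-memory dictionary standing in for a real database are assumptions made for the example, and the scanning and front-end steps are left out.

```python
def tag(page, labels):
    """Step: tag by category and date, using a hand-made lookup table."""
    category, date = labels[page["file"]]
    return {**page, "category": category, "date": date}


def convert(record):
    """Step: stand-in for 'convert to xyz format'; here it just adds an id."""
    return {"id": record["file"].rsplit(".", 1)[0], **record}


def store(record, database):
    """Step: 'input into database'; a plain dict plays the database here."""
    database[record["id"]] = record


def run_pipeline(pages, labels):
    """Run every page through the steps in order and return the 'database'."""
    database = {}
    for page in pages:
        store(convert(tag(page, labels)), database)
    return database


# Hypothetical input: one scanned page and its manually assigned labels.
pages = [{"file": "diary_001.tif"}]
labels = {"diary_001.tif": ("diaries", "1919")}
print(run_pipeline(pages, labels))
```

Because each step is separate, any one of them can be swapped for an existing tool or library without touching the others, and each is small enough to ask a specific question about.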
And here's another language comparison source (pdf)
posted by musicismath at 11:15 AM on January 23, 2012
I don't necessarily need to know how to program, but it's really helpful if I knew enough to be able to speak with programmers intelligently.
Given your requirements, I recommend you start out by learning Python. You will learn enough programming concepts to be able to communicate with programmers, and you will be able to easily write small useful programs for doing the kinds of things you will likely need to do as a non-pro programmer, such as automating aspects of your work or converting the data from system X so it can be added to system Y.
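For a sense of what "converting the data from system X so it can be added to system Y" can look like, here is a very small sketch of the EAD-to-MODS crosswalk idea mentioned in the question, using Python's built-in xml.etree.ElementTree module. It rests on several assumptions: the input is a bare EAD did element with no XML namespace, only unittitle and unitdate are mapped, and the function name ead_did_to_mods is invented for this example. A real crosswalk covers far more fields and edge cases.

```python
import xml.etree.ElementTree as ET

# Namespace used by MODS version 3 documents.
MODS_NS = "http://www.loc.gov/mods/v3"


def ead_did_to_mods(ead_xml):
    """Map a (simplified, un-namespaced) EAD <did> to a minimal MODS record."""
    did = ET.fromstring(ead_xml)
    mods = ET.Element(f"{{{MODS_NS}}}mods")

    # EAD unittitle -> MODS titleInfo/title
    title = did.findtext("unittitle")
    if title:
        title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
        ET.SubElement(title_info, f"{{{MODS_NS}}}title").text = title.strip()

    # EAD unitdate -> MODS originInfo/dateCreated
    date = did.findtext("unitdate")
    if date:
        origin = ET.SubElement(mods, f"{{{MODS_NS}}}originInfo")
        ET.SubElement(origin, f"{{{MODS_NS}}}dateCreated").text = date.strip()

    return mods


# Hypothetical sample record.
sample = "<did><unittitle>Expedition diary</unittitle><unitdate>1919</unitdate></did>"
ET.register_namespace("mods", MODS_NS)
print(ET.tostring(ead_did_to_mods(sample), encoding="unicode"))
```

The mapping decisions (which EAD field goes to which MODS field) are the archivist's side of the work. The code only carries them out.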
In the future you might want or need to learn other languages, but... a) you can go a long way with Python alone and b) you are not right now in a position to even understand the pros and cons of choosing one language over another, but you will be much better placed to do that after you get to grips with Python.
The Archivists' Toolkit appears to be built in Java, so you might want to consider that as well. But Python is much easier to pick up and play with, and there is a version called Jython that should allow you to work smoothly with Java programs and libraries if you ever need to do that.
posted by philipy at 11:39 AM on January 23, 2012
(if you go the python route, there have been zillions of "how to get going with python" posts on both AskMe and MeFi, and I don't want to clutter this with another one :) Disclosure: I am one of the people behind PyStar.org )
posted by gregglind at 1:47 PM on January 23, 2012
Thanks for all the great responses. I think that more likely than not, ook is right on the money (that the IT staff background/skills will determine what we use). All of this is hypothetical until I (hopefully) find myself in a position that requires these skills. But I wanted to get a head start and better my chances for making that happen. So it sounds like the consensus is that the best way to learn about programming languages in general is to learn one; duh! Since I'm pretty much starting at ground zero, I'll definitely check out Python based on the recs here.
I have a ton of other issues to wrap my head around on the broader topic of digital curation and have also found some additional resources on my own since asking the question including the digital curation exchange and code4lib (actually my friend suggested the latter saying that a lot of it would probably be over my head but it's a good place to check out what people are doing and get familiar with the lingo). I just list them here in case they might help someone else in the library world without a tech background who's interested in these issues and doesn't know where to start.
posted by kaybdc at 5:11 PM on January 23, 2012
This thread is closed to new comments.
After you learn about a language, type "language vs" into Google and look into the suggestions and follow up on those as well.
posted by michaelh at 8:16 AM on January 23, 2012