Programming neophyte looking for a language to start with. Or, what language should I learn to complete these tasks?
August 11, 2008 8:24 AM

What programming language should a neophyte start with to organize his life? Yes, another one of those questions. But I have two specific tasks that I'd like to accomplish and this makes choosing a language really hard for me. Especially since I'm a complete n00bler.

I'm a complete virgin to programming other than the little bit of BASIC that was taught to me in High School ages ago. But I REALLY want to learn a programming language. One that I can use as a foundation but more importantly a tool for doing things on/to/with my internet connected PC.

I have two specific tasks that I'd like to start out with for my first programs that I believe will be simple enough for a beginner like me.

1. Organize my horribly disorganized library
---Read through a folder filled with books in .pdf/.chm/.djvu format
---rename the files with their isbn number (queried from somewhere like amazon)
---fill in the "title","subject","author","category","keywords", and "comments" sections under right click on file Properties->Summary (also queried from amazon or the like)
---compress the file
---create a subject/category folder and place the compressed file there

2. Organize ~20mb worth of bookmarks (in multiple .html files)
---Check for duplicate bookmarks in this collection of .html files
---Use the page's content to scrape tag words then input them in the right click on bookmark Properties->"keyword" section and if possible also fill in the "description" area. Delicious to me would be a last resort as it's become a mess with people using tags that often look like gibberish and probably only make sense to their user and/or their own piece of code.

It'd also be nice for these apps to be cross platform (Windows and Linux) but that's an afterthought really.

Now, I know this might sound like a tall order for a complete neophyte but I believe the first program that I've outlined above could be easy enough to complete for someone like me.

So, what should it be? Perl, Python, C, C++, something else? Maybe someone out there knows of a *free ware/beggar ware/Open Source* program that does what I desperately need to get done? Please impart your collective wisdom unto me, AskMeFi! I beg you!
posted by monkishies to Education (35 answers total) 18 users marked this as a favorite
So-called "scripting" languages are well-suited for the tasks you describe: Check out Python or Perl (Perl is arguably more powerful, but the syntax is messier).
posted by mpls2 at 8:28 AM on August 11, 2008

Well, I don't know where the Properties-->Summary stuff is stored, or how easy it is to manipulate, but all the rest of it is just asking for Python. (Perl is, I agree, arguably more powerful, but Python is - and this is an opinion, but it's one shared by others - a lot more newbie-friendly in its syntax than perl)
posted by Tomorrowful at 8:30 AM on August 11, 2008

I vote +1 for perl, but it probably doesn't matter that much between the two. If I were you I would look at some sample code and determine which language seems more intuitive to you.
posted by traco at 8:35 AM on August 11, 2008

IMHO, Python's less confusing and easier to learn. Granted you can write much more compact code in Perl, but for what you're trying to do I think Python would do fine (and it comes with its own IDE).

Avoid C/C++ entirely for what you're trying to do. While I love C, it's a pain to do any of that heavy string-processing stuff with it.
posted by Xany at 8:37 AM on August 11, 2008

Ruby is easier to learn than either Python or Perl, but it's not necessarily better suited to the tasks at hand. Any of the three should work, but I would choose Ruby simply because of its user-friendliness.
posted by Picklegnome at 8:41 AM on August 11, 2008

I wouldn't learn Perl unless you really have to. It's crazy. Python and Ruby are both better choices. C/C++ are awesome, but really aren't suited for what you want to do.
posted by chunking express at 8:50 AM on August 11, 2008

These are pretty straightforward programming tasks. In practice any modern programming language could accomplish them.

If you want to do some solid learning I would say go and make attempts to accomplish sub-tasks in several different languages - say, spend a week trying out each language - then pick the one you like the most.

If I were to toss out three languages to try for variety, I'd say:
  1. Python
  2. Unix shell scripting
  3. Visual Basic .NET (which, if you worked in Mono, would be cross-platform)
Also - I think the right-click properties you're talking about are proprietary Windows file metadata. But there's internal PDF metadata too (probably under something like the File->Properties menu when you're viewing a PDF) which you could set using a PDF handling library (or command line tool, in the case of Unix shell scripting).
posted by XMLicious at 8:50 AM on August 11, 2008

There's a dynamic tension between the "best language to learn programming with / to learn first" and "the best language for [task]". One may not be the other, and is unlikely to be. It's one of the failures of modern CS teaching that students get thrown at Java, which is not a great place to start learning.

Having said that, I concur with the above: Python is a good choice. Your second task is a classic scripting language one, and Python is kind to beginners. The first task, depending on how you go about it, would fit Python as well.
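
As a rough illustration of why the second task suits a scripting language, here is a minimal sketch of the duplicate check using only Python's standard library. It assumes the bookmark files are ordinary exported HTML where each bookmark is an `<A HREF="...">` tag; the names `BookmarkLinkParser` and `find_duplicates`, the sample data, and the trailing-slash normalisation are made up for the example.

```python
from collections import defaultdict
from html.parser import HTMLParser


class BookmarkLinkParser(HTMLParser):
    """Collects the href of every <a> tag in an exported bookmarks file."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def find_duplicates(html_by_name):
    """Map each URL seen more than once to the names of the files containing it.

    html_by_name: dict of {file name: HTML text}.
    """
    seen = defaultdict(list)
    for name, html in html_by_name.items():
        parser = BookmarkLinkParser()
        parser.feed(html)
        for url in parser.links:
            # Naive normalisation (an assumption): ignore whitespace and a trailing slash.
            seen[url.strip().rstrip("/")].append(name)
    return {url: names for url, names in seen.items() if len(names) > 1}


# Hypothetical sample data standing in for two exported bookmark files.
sample = {
    "a.html": '<DL><DT><A HREF="http://example.com/">Example</A></DL>',
    "b.html": '<DL><DT><A HREF="http://example.com">Example again</A>'
              '<DT><A HREF="http://example.org/x">Other</A></DL>',
}
print(find_duplicates(sample))
```

For real files you would read each .html file into the dictionary first; what counts as "the same" URL (http vs. https, query strings and so on) is a judgment call the sketch does not try to settle.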

As for other choices (and this is pure, unadorned opinion) Perl's syntax is insane, and difficult even for programmers with a lot of experience in other languages. The current great leap forward to the next version of Perl has been ongoing since 2000, with no clear end in sight. C++ is slowly declining in use, justifiably supplanted by Java in many cases, and is (speaking as someone who used to do a lot of C++ programming) harder and less fun than it should be. Java, see above. There is a Java ebook manager that may be worth looking at. Another scripting language that may be worth looking at is Ruby.
posted by outlier at 8:53 AM on August 11, 2008

Another vote for Python. Syntax-wise, I think it makes more sense to a native English speaker. Hence it's also easier to read examples and to figure out what they're trying to accomplish, and how things work. All things being equal, Perl is probably the "best" choice, but the newbie-ness definitely swings things into Python's favor in this example.
posted by cgg at 8:58 AM on August 11, 2008

posted by callmejay at 8:59 AM on August 11, 2008

If you just want the easiest language to pick up, choose Ruby. It has fewer inconsistencies than any other major scripting language, and therefore takes less effort to learn and use. Another important consideration would be: what do the people you know use? If there are several people nearby using a different language, it might be worthwhile to use that one instead, just for the support and community.
posted by yath at 8:59 AM on August 11, 2008

Avoid C/C++ entirely for what you're trying to do.

Yes, stay far, far away. C# would probably be okay, but not C/C++.

A functional programming language like OCaml would actually probably be the best in terms of fitting these particular uses, but they aren't very newbie-friendly and it sounds like you want to learn a more general purpose language anyway.

I agree with the overall recommendations of Python, Perl, and Ruby. They are all widely-used general purpose scripting languages, and I can't really knock any of them. Perl is the oldest of the three, and has a huge library, but isn't as well designed (in my opinion) as the other two. Really I would just suggest reading online tutorials for each language and picking whichever one seems to fit well for you.
posted by burnmp3s at 9:03 AM on August 11, 2008

posted by knave at 9:15 AM on August 11, 2008

Oh dear God.

Scripting requirements are not (to my mind) the best place to start when learning how to program. Especially if you want to do stuff like renaming files. (You show me a modern scripting language that does this well and I'm gonna call you a liar.)

But - despite this, I'm gonna recommend that you download and start using Hackety Hack. It's a learning environment for Ruby and it uses examples of reading from the web as part of the tutorial process. (This ties in with your "look up an ISBN number" requirement.)

As for renaming files ... Ruby has that covered too.
dir = "C:\\yourFolder"
files = Dir.entries(dir)
files.each do |f|
  # skip the "current" and "parent" directory entries
  next if f == "." or f == ".."
  oldFile = dir + "\\" + f
  newFile = dir + "\\_" + f
  # prefix each name with an underscore
  File.rename(oldFile, newFile)
end
posted by seanyboy at 9:23 AM on August 11, 2008 [2 favorites]


If you want to update the Summary info on a file, then you probably need to use VBScript with the Windows Script Host.

You'll also need a decent editor. I use textpad.
posted by seanyboy at 9:27 AM on August 11, 2008

It may be a good idea for app #1 to store metadata about these files in a database instead of renaming them. That way if the automated process chooses the wrong ISBN, it will be easy to go back and fix it in the tables rather than trying to interpret 978-0131103627.PDF. So my recommendation would be that you use Python and SQL.
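
A minimal sketch of that idea, assuming SQLite through Python's built-in `sqlite3` module. The table layout, the function names `open_catalogue` and `record_book`, the file name `kr.pdf`, and the titles are all invented for illustration; only the ISBN comes from the example above.

```python
import sqlite3


def open_catalogue(path=":memory:"):
    """Open (or create) the catalogue. ':memory:' keeps it in RAM for testing."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS books (
               filename TEXT PRIMARY KEY,  -- original file name, left untouched
               isbn     TEXT,
               title    TEXT,
               subject  TEXT
           )"""
    )
    return conn


def record_book(conn, filename, isbn, title, subject):
    """Insert a book, or overwrite the row if this file was catalogued before."""
    conn.execute(
        "INSERT OR REPLACE INTO books VALUES (?, ?, ?, ?)",
        (filename, isbn, title, subject),
    )
    conn.commit()


conn = open_catalogue()
# Pretend the automated lookup guessed wrong (placeholder values).
record_book(conn, "kr.pdf", "978-0000000000", "Wrong guess", "Unknown")
# Fixing the bad match is a single UPDATE, and the file itself never moves.
conn.execute(
    "UPDATE books SET isbn = ?, title = ?, subject = ? WHERE filename = ?",
    ("978-0131103627", "Corrected title", "Programming", "kr.pdf"),
)
conn.commit()
print(conn.execute("SELECT isbn, title FROM books WHERE filename = ?", ("kr.pdf",)).fetchone())
```

Passing a real path such as `library.db` (a hypothetical name) instead of `":memory:"` would keep the catalogue between runs.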
posted by nomad at 9:27 AM on August 11, 2008

Changing Summary Info.
posted by seanyboy at 9:31 AM on August 11, 2008

Python hands down. I guess Ruby gives it a run for the money (and I have zero experience with it). But Python is a very powerful scripting language with lots of support (and TONS of example code) for playing with system files and the like.

Python is also interpreted. Don't worry about what this really means... I'll tell you why it's good. You open an empty file and are trying to write a function to do something (return a list of files, or some such little chunk). You fire up the Python interpreter, which is a little application you can run as you write the code. It sits beside you and you just scribble into that window, and "see what happens":

> make an object list
> here's your list, empty
> now let's figure out how to add to the list... list.add()
> nope. error
> hrm. try list.append()
> okay, here's your list back
> now how do I reference the first item in the list? list[0]?
> here's the 0th item
etc., etc., etc.
posted by zpousman at 9:42 AM on August 11, 2008

Python has plenty of free resources online for complete beginners, and is very easy to get started with. PHP is slightly more fiddly but easy to get going with, and very popular across the web. Perl is opaque, unfriendly and distinctly unfashionable. Ruby is very trendy, but I've not tried it.
posted by hatmandu at 9:46 AM on August 11, 2008

Back up your files before playing with them, and set up a test directory to play with some sample files before you begin.

As many wise computer scientists have said before, "Shit happens".
posted by sleslie at 9:48 AM on August 11, 2008 [1 favorite]

I've a lot of experience with Perl, and only started using Python five months ago, but I'm nthing the recommendation for Python. A pleasure to work in and great for beginners.

I've only toyed with Ruby, so I can't comment on it. I ended up choosing Python over Ruby, but that was more driven by the frameworks than the languages themselves.
posted by bitterpants at 10:21 AM on August 11, 2008

Python, Python, Python! It's incredibly easy to use. Yes, the file renaming tools aren't the best, but you can live with that. Ruby is not nearly as easy to use, and Perl is just ugly no matter how you look at it. I'm having to use an ugly C clone at the moment and I really really *really* would like to go back to Python.
posted by katrielalex at 11:00 AM on August 11, 2008

One more for Python. Lots of resources for beginners, easy to read programs, good libraries ... it is great.

Not to start a flame war, but for those who say that Ruby is more consistent, I'd point out the "White space oddity" for Ruby. There are no perfect languages.
posted by aroberge at 12:32 PM on August 11, 2008

Python: Programming is fun again! I use Python every day for lots of different tasks, including some like the ones you're interested in. (Consider the os and shutil modules for obtaining directory lists, renaming and moving files.)
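
A minimal sketch of what those two modules look like in use, assuming you want each file moved into a subfolder named after its extension. The function name `sort_by_extension` and the `no_extension` folder name are invented for the example, and it does not handle a name clash between a file and a target folder.

```python
import os
import shutil


def sort_by_extension(folder):
    """Move every file in `folder` into a subfolder named after its extension.

    Returns a list of (old_path, new_path) pairs so the moves can be reviewed.
    """
    moved = []
    for name in sorted(os.listdir(folder)):
        old_path = os.path.join(folder, name)
        if not os.path.isfile(old_path):
            continue  # skip subfolders, including ones created by this function
        ext = os.path.splitext(name)[1].lstrip(".").lower() or "no_extension"
        target_dir = os.path.join(folder, ext)
        os.makedirs(target_dir, exist_ok=True)
        new_path = os.path.join(target_dir, name)
        shutil.move(old_path, new_path)
        moved.append((old_path, new_path))
    return moved
```

Calling `sort_by_extension` on a copy of the library folder would leave `pdf`, `chm` and `djvu` subfolders behind; try it on a throwaway folder first.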
posted by SPrintF at 12:52 PM on August 11, 2008

Before you learn a language, learn regular expressions. You will be able to use them in most of the tasks you specify, and they will be usable across languages.
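
To give a feel for what a regular expression buys you, here is a small sketch in Python that pulls something shaped like an ISBN-13 out of a file name. The pattern is an assumption about what the numbers look like (13 digits starting with 978 or 979, optionally separated by hyphens or spaces); it does not verify the check digit, and the name `find_isbn13` is made up.

```python
import re

# 13 digits starting with 978 or 979, with optional hyphens or spaces between digits.
ISBN13_PATTERN = re.compile(r"\b97[89](?:[- ]?\d){10}\b")


def find_isbn13(text):
    """Return the first ISBN-13-looking number in `text` as bare digits, or None."""
    match = ISBN13_PATTERN.search(text)
    if match is None:
        return None
    return re.sub(r"[- ]", "", match.group(0))


print(find_isbn13("978-0131103627.PDF"))      # 9780131103627
print(find_isbn13("ISBN 978 0 13 110362 7"))  # 9780131103627
print(find_isbn13("no number here.pdf"))      # None
```

The same pattern would need only small changes to work in Perl or Ruby, which is the point about regexes carrying across languages. Because `\b` treats an underscore as part of a word, a name like `book_978...` would need a looser pattern.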
posted by blue_beetle at 1:37 PM on August 11, 2008 [3 favorites]

If your goal is to become a great programmer, then learn C. It sounds like your goal is to get a few things done, in which case I'd recommend Ruby, which is much easier to use but doesn't teach you to master the fundamental logic behind programming. But don't take my word for it: try Ruby for yourself and see if it suits you.
posted by MaxK at 4:38 PM on August 11, 2008

Python's a good language, but I wouldn't use it as a starter language.

I should rephrase that.

Some people don't get on with python. Despite what they'll tell you above, it isn't for everyone. This is not a fault of the language. It's just been ... optimised ... to a certain mindset. The same could be said for the .net languages and ruby, but not to the same degree. In my experience you'll either get on really well or really badly with python. If the latter happens, don't worry about it. IT IS NOT YOUR FAULT.
posted by seanyboy at 5:17 PM on August 11, 2008

Any popular language with a rich set of libraries will do, as they'll all be similarly difficult given your absence of skill. Pick one that looks fun from the top of the TIOBE list.

Most of what you want to do should be fairly easy. You're just manipulating files, and most of that manipulation will be handled by the libraries that you choose. The non-trivial part will be the pattern matching. There's a lot of brain work that goes into matching patterns like that. You'll spend some time trying to come up with an algorithm that will work for 95% of the items, and be ready to manually change some of the fringe cases.

You're basically interested in taking a manual process and automating it. I recommend that you first try to change one or two files by hand and while doing it record the steps that you take in excruciating detail. Then do it again. I think a lot of beginner programmers are amazed by the number of steps that humans take without realizing it.

Once you've identified all the discrete steps, try to figure out what may go wrong and put in some error handling and logging. You don't want to try to rip through thousands of files just to find the one with the weird encoding breaks your program halfway through.
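
A minimal sketch of that kind of error handling, assuming Python and its standard `logging` module. The names `process_all` and `decode_name` are hypothetical, and decoding a byte string is only a stand-in for whatever the real per-file work turns out to be.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("library")


def process_all(items, process_one):
    """Run process_one on every item; log failures instead of stopping.

    Returns (succeeded, failed), where failed holds (item, error message) pairs.
    """
    succeeded, failed = [], []
    for item in items:
        try:
            process_one(item)
        except Exception as exc:  # deliberately broad: one bad file must not end the run
            log.error("skipped %s: %s", item, exc)
            failed.append((item, str(exc)))
        else:
            log.info("processed %s", item)
            succeeded.append(item)
    return succeeded, failed


def decode_name(raw):
    """Stand-in for real work: fails on bytes that are not valid UTF-8."""
    return raw.decode("utf-8")


ok, bad = process_all([b"book.pdf", b"\xff\xfe.pdf", b"other.chm"], decode_name)
print(len(ok), "processed,", len(bad), "skipped")
```

The `failed` list is the pile of fringe cases to fix by hand afterwards, rather than a reason for the whole run to die.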

Finally, make sure you test on sample data. Take a random sample of files, copy them to another folder, and start experimenting. And never assume your code is bug free.
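
Pulling that random sample is itself a few lines of code. A sketch, assuming Python's `random` and `shutil` modules; `copy_sample` and its parameters are names invented here.

```python
import os
import random
import shutil


def copy_sample(source_dir, test_dir, count, seed=None):
    """Copy up to `count` randomly chosen files from source_dir into test_dir.

    The originals are only read, never modified. Returns the chosen file names.
    """
    files = sorted(
        name for name in os.listdir(source_dir)
        if os.path.isfile(os.path.join(source_dir, name))
    )
    rng = random.Random(seed)  # pass a seed to get the same sample every time
    chosen = rng.sample(files, min(count, len(files)))
    os.makedirs(test_dir, exist_ok=True)
    for name in chosen:
        # copy2 also tries to preserve timestamps and other file metadata
        shutil.copy2(os.path.join(source_dir, name), os.path.join(test_dir, name))
    return chosen
```

Keep the test folder outside the source folder, so the experiments never touch the real library.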
posted by brandnew at 5:52 PM on August 11, 2008

Learning C or C++ in order to become a programmer is like learning Latin in order to become a writer. Older people who had to do it tell you it is good for you. Mostly because they are sadistic. I learned it. I used it once after finishing school. Every single other language is better unless you are working in unbelievably tight constraints or you want to patch a kernel.

I'd say learn Ruby because it has a lot of the newer programming constructs so you will be learning a lot of the new stuff that will show up in all the evolving languages and because you can then also use Rails and make webapps in no time at all. It also has a lot of the new testing frameworks available so you can do Test Driven Design or Behaviour Driven Design...or whatever the latest fad is.
posted by srboisvert at 2:43 AM on August 12, 2008

I'm actually a guy with experience in Perl and Ruby and leaning towards making the move to Python. Perl is super powerful and CPAN means you've got pre-built libs for pretty much any task but it's practically unreadable.

For you I'd lean towards Python; if you were doing web dev I'd lean towards Ruby because of Rails, but w/o that angle Python gets the edge.
posted by bitdamaged at 11:21 AM on August 12, 2008

For (1), you’ll want to use a language that has a library with bindings to Amazon’s ECS. Ruby has Ruby/AWS. No doubt Python and Perl both have something similar.

Don’t use C or C++. Either Ruby or Python will probably be easier to learn than perl, but it really depends on how you click with the language. IMO, perl is insane, but I know lots of people who prefer it to anything else.

I’m curious why so many people say that perl is more powerful than Python; there are plenty of features Python has that perl doesn’t, like coroutines and real reflection.
posted by suncoursing at 4:04 PM on August 12, 2008

Response by poster: Can't thank you all enough for the replies so far.

So, I've narrowed it down to Python and VB.NET (and MONO). But what I will begin with is Ruby just to get a "feel" for programming; heck, I might even stick to it. I do have a few follow-up questions that I hope the community will be kind enough to endure and answer.

1. Do these three (with MONO four) programming languages have sufficient libraries to handle/manipulate pdf, chm, and djvu metadata? Does one of these languages have better libs than the others?

2. Mono sounds very attractive to me due to its out-of-the-box cross-platform nature. Will I be able to develop and code as effectively, with all the features, bells, whistles, and libraries that a costly VB license and development tools package provides?

3. Which one of these languages will provide me the ability to code "anything" my heart desires? Meaning, which one of these suckers will let me code not only the type of software that I mentioned in the OP but also things like a web-scraper or a rudimentary firewall (think far off in my learning)? BTW, not actual goals of mine yet, just examples. Essentially a swiss army knife sort of language is what I'm after. Which, after learning it (which I gather happens with any language mastery), will make it easier for me to pick up other languages? It seems I answered my own question right there, but I'm pretty sure if I learned some wacky PL as my first I'd be lost trying to pick up something refined and professional.

4. blue_beetle mentioned learning regular expressions first. I looked at the Wikipedia entry that was linked to and was left slightly confused as to why it was suggested. Doesn't each language have its own regular expressions? If not, is there a place on the internet where I can do what blue_beetle suggested?

last one!

5. Could anyone out there suggest a n00bler friendly community where I can pester more knowledgeable people (this in no way implies that you all aren't) than myself on these subjects? You know, a place where the members that visit these websites are there specifically for talk about code, coding help, and helping fledgling coders. Top notch places with top notch members willing to help feed a hungry mind and look down at anyone disrespecting a fellow member of the site.

Thanks again for all of your help so far. It really is priceless.
posted by monkishies at 5:10 PM on August 12, 2008

One little nitpick - the Mono Project isn't a programming language, it's a runtime environment, a set of compilers (for VB.NET, C#, and other languages), a set of libraries, and a suite of programming tools. Kind of equivalent to what GNU is, though far less mature, and eeevil because it's connected to Microsoft. (What I actually linked you to above is the development environment the community has made, MonoDevelop.)

My take on answering those:
  1. Definitely yes for pdf libraries, in all languages these days. If by chm you mean Microsoft help files, I'm unsure of the status of support in Python and Ruby, and my guess would actually be that the Mono project itself would not have anything that would support handling them in Linux because it wouldn't be part of the ECMA .NET standard. Though, if you were compiling something with Mono, on Windows you'd be able to access the Microsoft libraries for manipulating .chm files. For DjVu, the only really developed freely available library I know of is DjVuLibre, which I think is in C++. But there are ways you might wrap it to be used in other languages - or if you become skilled and ambitious, you might write your own library (I'd leave that for later on, though.)
  2. MonoDevelop as a tool definitely has fewer bells and whistles than commercial tools like Microsoft's, even the free versions. But the bells and whistles don't make you a more effective programmer; I personally think that stuff is usually more distracting than anything, and becomes a crutch that prevents people from becoming effective programmers. Really all I use among the fancy stuff is "code completion", which MonoDevelop has. Python and Ruby development environments have that too, but I'll let others make recommendations. (Some of the fancy tools that are available are useful for specific tasks, but you'd really do better to become an effective programmer first.)
  3. All of them. There are some gotchas for every language, and some languages are better for specific tasks than others, but for something like web-scraping or building a firewall application Python, Ruby, and VB.NET will all be entirely suitable. Web-scraping actually really isn't that hard, BTW - I'd say go ahead and try playing around with that while you're still learning programming.
  4. Regexes can have slightly different syntax between languages and slightly different capabilities but they are a powerful and standardized tool for doing text processing. I would recommend the web site regexlib (both as a reference and for the really sharp people on its forums) and I'd recommend learning via playing around with its online regex tester. (Regexes would play a big part in doing web-scraping, btw.)
  5. For VB.NET there's a whole ecosystem of high-quality Microsoft-funded sites with their root at MSDN, the Microsoft Developer Network. They want everyone to easily learn programming in their languages, for their own evil and commercial purposes, of course. I'll let other people recommend for Ruby and Python... I'm more familiar with PHP than either of them.
And one last, very helpful tool I'll mention: Google Code Search. Much of the learning done by programmers of every skill level is done by finding snippets of code that other people have written and either tweaking it to do what you want or reverse-engineering it to figure out how it works.
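
On the web-scraping point, here is a minimal sketch of the "pull keywords out of a page" half of the bookmark task, using only Python's standard `html.parser`. It works on HTML text you already have; fetching the page over the network is left out. It assumes the page carries `<meta name="keywords">` and `<meta name="description">` tags, which many pages do not, and the names `PageSummaryParser` and `summarise` plus the sample page are invented for illustration.

```python
from html.parser import HTMLParser


class PageSummaryParser(HTMLParser):
    """Pulls the <title> text and the keywords/description <meta> tags from a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attrs.get("name") or "").lower() in ("keywords", "description"):
            self.meta[attrs["name"].lower()] = attrs.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def summarise(html):
    """Return the title, keyword list and description found in an HTML string."""
    parser = PageSummaryParser()
    parser.feed(html)
    keywords = [k.strip() for k in parser.meta.get("keywords", "").split(",") if k.strip()]
    return {
        "title": parser.title.strip(),
        "keywords": keywords,
        "description": parser.meta.get("description", ""),
    }


# Hypothetical page content standing in for a downloaded bookmark target.
page = (
    '<html><head><title> Ruby tips </title>'
    '<meta name="Keywords" content="ruby, scripting , files">'
    '<meta name="description" content="Notes on Ruby."></head>'
    '<body>...</body></html>'
)
print(summarise(page))
```

Pages without those meta tags would need a fallback, such as counting frequent words in the body text, which is where the regexes come in.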
posted by XMLicious at 6:38 PM on August 12, 2008

Python and Ruby are just as cross-platform as .NET, if not more.
posted by knave at 6:48 PM on August 12, 2008

He might be referring to the fact that, like with Java, the compiled binaries output by Mono are cross-platform anywhere the runtime is installed. As long as you don't link in any Microsoft libraries you can compile your program on a Windows machine and run it there, then FTP that same .exe file to your Unix server, chmod it as executable, and run it there.

With Python and Ruby, since they're primarily interpreted languages this isn't even a question. (So yeah, unquestionably Python and Ruby are just as cross-platform.)
posted by XMLicious at 7:08 PM on August 12, 2008

This thread is closed to new comments.