To underscore or not to underscore?
December 21, 2006 4:05 PM

Is there an advantage to having an underscore in a file name?

I'm devising a naming standard for some time tracking worksheets at work. For some reason, many years ago I started to put underscores in file names containing dates, like this:

20061223_pdb weeklytimesheet.xls
20061223_tjd weeklytimesheet.xls

instead of just naming them like this:

20061223pdb weeklytimesheet.xls
20061223tjd weeklytimesheet.xls

the belief being that the underscore will allow for more accurate sorting (and easier finding) of the files in Windows Explorer. I don't know where I got the underscore-is-better notion, but I've done it forever; a coworker challenged me on it today, and I had no concrete reason why it would be better.

Is there an actual reason why adding an underscore is better, or is it just a stylistic habit I acquired?
posted by pdb to Computers & Internet (23 answers total) 3 users marked this as a favorite
 
The only thing the underscore does is make the file easier to use at the command line. In Unix-style operating systems, where I spend a lot of time managing my files at the command line, I use underscores.

Note that batch files will need extra quoting of file arguments if there are spaces in them.
posted by cschneid at 4:09 PM on December 21, 2006


Some file / operating systems don't support spaces in filenames, but do support underscores.

If you are working on the command line it is easier to type
emacs file_with_underscores.txt
than
emacs file\ with\ underscores.txt
but tab completion makes that a bit of a non-issue, just ugly.

It won't have much effect on sorting unless you are doing something cunning that relies on _. Which doesn't sound likely.

Basically, what you are doing is just as random and arbitrary as using a space.
posted by public at 4:11 PM on December 21, 2006
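
To make the escaping point above concrete, here is a minimal sketch in Python, assuming POSIX-style shell rules as modelled by the standard shlex module. The helper name count_args is made up for illustration, and the Windows command prompt splits arguments by different rules.

```python
import shlex

def count_args(command_line: str) -> int:
    """Count the arguments a POSIX-style shell would pass after the command name."""
    return len(shlex.split(command_line)) - 1

# Unescaped spaces turn one filename into three separate arguments.
print(count_args("emacs file with spaces.txt"))
# Backslash-escaping or quoting keeps the name together as one argument.
print(count_args(r"emacs file\ with\ spaces.txt"))
print(count_args('emacs "file with spaces.txt"'))
# Underscores need no special treatment.
print(count_args("emacs file_with_underscores.txt"))
# shlex.quote adds quotes only when a name actually needs them.
print(shlex.quote("file with spaces.txt"))
print(shlex.quote("file_with_underscores.txt"))
```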


If the dates are always a fixed length (is January "01" or "1"?) there doesn't appear to be much value in the underscore. If the dates can vary in length, then the underscore could serve a purpose.

It may be slightly easier to read the filenames with the underscore. The spaces are the hassle, as you end up having to use quote marks in a command line to refer to them.
posted by i love cheese at 4:11 PM on December 21, 2006
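
The fixed-length versus variable-length point above can be sketched like this. It is only an illustration: the two helper functions are hypothetical, and the unpadded date "2006123" is an invented example of a date written without leading zeros.

```python
def split_fixed_width(filename: str, width: int = 8):
    """Split after a fixed number of characters; only safe if every date is `width` long."""
    return filename[:width], filename[width:]

def split_with_delimiter(filename: str):
    """Split at the first underscore; works whatever the length of the date part."""
    date_part, rest = filename.split("_", 1)
    return date_part, rest

# With eight-digit dates, both approaches recover the same pieces.
print(split_fixed_width("20061223pdb weeklytimesheet.xls"))
print(split_with_delimiter("20061223_pdb weeklytimesheet.xls"))

# With an unpadded date, the fixed-width split cuts into the initials,
# while the underscore still marks where the date ends.
print(split_fixed_width("2006123pdb weeklytimesheet.xls"))
print(split_with_delimiter("2006123_pdb weeklytimesheet.xls"))
```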


It's a carryover from DOS, which did not allow spaces in file names.
posted by caddis at 4:15 PM on December 21, 2006


As public says, underscores reduce the chances of trouble, since files might in future be copied or transferred to different systems, and some systems won't handle files with a space in the name.

If you're working on files that might end up passing through a whole lot of different machines and operating systems (such as a production pipeline), it's still a useful thing to do.
posted by -harlequin- at 4:23 PM on December 21, 2006


On a web site, it's better for search results to use hyphens instead of underscores, because after decades of underscores being commonly used in filenames, Google et al. can't see the words separated by underscores as separate words.
posted by kirkaracha at 4:35 PM on December 21, 2006


Some file / operating systems don't support spaces in filenames, but do support underscores.

Except his files have spaces and underscores. The question is whether he should use an underscore rather than an empty string between a date and someone's initials.

It seems easier to scan to me, but why not just use a space and be consistent, like this: "20061223 pdb weeklytimesheet.xls"

Actually what you should do is use an Access database for this stuff.
posted by delmoi at 4:38 PM on December 21, 2006


The underscore is irrelevant; it's making sure your filenames have NO SPACES that will make them most broadly/easily usable (especially important if they, or anything they'll be converted into in the future, might be used on the web).
posted by allterrainbrain at 5:07 PM on December 21, 2006


Underscores are okay. Or use dashes. Don't use spaces.
posted by chillmost at 5:31 PM on December 21, 2006


If you ever need to put a file online or on an intranet, etc., spaces are pretty ugly, hard to type, and hard to read:

a%20sample%20web%20page%20with%20spaces.html

That's one reason people tend to use hyphens or underscores instead. CamelCase is another option--better for readability but not good for searching, because the words can't be parsed automatically.

Personally I would go for:

20061223-pdb-weekly-time-sheet.xls
posted by flug at 5:33 PM on December 21, 2006
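
For anyone who wants to see the %20 effect for themselves, here is a small sketch using Python's standard urllib.parse.quote. The helper names url_form and hyphenate are made up for this example.

```python
from urllib.parse import quote

def url_form(filename: str) -> str:
    """Percent-encode a filename the way it would appear in a URL path."""
    return quote(filename)

def hyphenate(filename: str) -> str:
    """Replace spaces with hyphens so that nothing needs encoding."""
    return filename.replace(" ", "-")

# Each space becomes %20; underscores and hyphens pass through untouched.
print(url_form("a sample web page with spaces.html"))
print(url_form("20061223_pdb weeklytimesheet.xls"))
print(url_form(hyphenate("20061223 pdb weekly time sheet.xls")))
```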


flug's example of how to do it is precisely how I do it myself.

I prefer hyphens to underscores, because if the filename is ever converted into a hyperlink, sometimes you can't see the underscores anymore.
posted by Hildago at 5:40 PM on December 21, 2006


Not sure where everyone is coming up with comments about spaces. The question was whether to use _ or nothing. The only advantage I can see to the _ is that it can be used to delimit the date part of a file name. The date would need to be at the beginning of the file name to be of any real help sorting files, unless you have multiple copies of files with the same name created on different dates.
posted by phil at 5:46 PM on December 21, 2006


Just a note -- We don't use initials as they quickly conflict (at least in our 150-person company). I would suggest adopting a different standard, such as network username, or (what we do) staff ID.

Makes it easier to work with later as well if you can key everything against a unique ID. (Re: basic database normalization)
posted by SirStan at 5:59 PM on December 21, 2006


Another thing: when using the Windows command line, filenames with spaces just don't work very well (at all). You have to use the shortened version, which you won't know without using the DIR command with whatever forgotten switch exposes them. Some crap like FILE~1.XLS. Long filename support at the command line is crap.
posted by IronLizard at 7:06 PM on December 21, 2006


Though it isn't exactly intuitive, you can figure out the DOS version of a filename through some simple logic. If it's longer than eight characters, it's going to get truncated and end with a ~1 after the first six characters, because ~1 counts as two characters out of the eight. If there is another file in the same directory that would also get shortened to the same short name, it would be ~2, and so on, according to the directory sorting, which is probably based on the ASCII character code, but for normal use, just consider it alphabetic.
posted by odinsdream at 7:43 PM on December 21, 2006
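
The rule of thumb above can be written out as a rough sketch. This is a simplification, not the real Windows algorithm: the function short_name is hypothetical, the index is passed in by hand rather than worked out from the directory contents, and it assumes spaces are dropped and letters are upper-cased. Windows itself handles more cases than this.

```python
def short_name(long_name: str, index: int = 1) -> str:
    """Simplified 8.3 alias: first six characters of the base name, then ~N, then a 3-character extension."""
    base, dot, ext = long_name.rpartition(".")
    if not dot:
        # No extension: rpartition leaves the whole name in `ext`.
        base, ext = ext, ""
    base = base.replace(" ", "").upper()
    ext = ext.replace(" ", "").upper()[:3]
    if len(base) > 8:
        # "~N" takes up the last two of the eight available characters.
        base = base[:6] + f"~{index}"
    return f"{base}.{ext}" if ext else base

# Both timesheets share the same first six characters, so only the
# number after the tilde tells them apart.
print(short_name("20061223_pdb weeklytimesheet.xls", 1))
print(short_name("20061223_tjd weeklytimesheet.xls", 2))
print(short_name("notes.txt"))
```

One side effect worth noticing: with date-prefixed names, every file from the same month collapses to the same six characters, so the short names say nothing about whose timesheet it is.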


DIR /X
posted by IronLizard at 8:55 PM on December 21, 2006


If we go exactly by the examples that the poster used to illustrate the two basic options (20061223_pdb weeklytimesheet.xls vs. 20061223pdb weeklytimesheet.xls), then the issue of spaces doesn't factor, as phil correctly stated. Both options contain a space.

The files are going to sort in exactly the same way whether you use an underscore or not, as long as the use is consistent. An underscore is a character just like any other, and has a value for computing sort order, but if the same character is always used in that position, it is essentially skipped and Windows Explorer compares the values of the next character to the right. The underscore would help visual scannability of the date information, though, since it separates the date from the rest of the filename.

Use of spaces IS a valid subject when discussing filename habits, however. Mac users and most Windows users may never run into situations where it matters, but sometimes it can. Unix shells treat a space as an argument separator, and URLs can't contain a literal space at all, which is why files posted on the internet get "%20" where the spaces were; that's the percent-encoded form of the space character (hex 20).

Other situations where spaces can break things: some WebDAV implementations don't like spaces (the Mac won't read file or directory names that contain spaces on WebDAV volumes), pasted URLs or other network locations as links in email frequently get mangled if they contain spaces and are long enough to wrap, since the mail program helpfully inserts a line break where the space is. And so on.

I've just gotten myself out of the habit of using spaces in file and directory names, using underscores or CamelCase instead, since there's always something that doesn't like it. But then I use Linux and network storage a lot, YMMV.
posted by dammitjim at 9:17 PM on December 21, 2006
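
The sorting claim above is easy to check with a quick sketch. It assumes a plain character-by-character sort, which may not be exactly what Windows Explorer does, and the helper same_order is made up for the purpose.

```python
def same_order(names_with_underscore):
    """True if dropping the first underscore from every name leaves the sort order unchanged."""
    sorted_with = sorted(names_with_underscore)
    sorted_without = sorted(n.replace("_", "", 1) for n in names_with_underscore)
    return [n.replace("_", "", 1) for n in sorted_with] == sorted_without

names = [
    "20061223_tjd weeklytimesheet.xls",
    "20061216_pdb weeklytimesheet.xls",
    "20061223_pdb weeklytimesheet.xls",
    "20061216_tjd weeklytimesheet.xls",
]

# The underscore sits in the same position in every name, so it never
# decides a comparison; date and initials do all the work.
print(sorted(names))
print(same_order(names))
```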


Actually what you should do is use an Access database for this stuff.

I wholeheartedly agree, but my company has banned Access databases for small applications, because of the number of them that have sprung up over the years. They prefer Oracle-based apps, which are fine but for a team of six people tracking weekly time, that seems a bit overkillish.

Just a note -- We don't use initials as they quickly conflict (at least in our 150-person company).

I also agree with this, because I have problems with initials from past jobs, but this is for a team of six people that won't change until it's disbanded 7 months from now, so I'm not so worried about conflicts.

Thanks to everyone for the input, I appreciate all the answers...
posted by pdb at 9:26 PM on December 21, 2006


To put a face on what phil has said:

dir *_pdbwhatever.xls

Gets you all pdbwhatever files regardless of date

and

dir 20061223_*

Gets you all files of that date regardless of name

Somewhat useful.
posted by ill3 at 10:08 PM on December 21, 2006


That's some cojones, man, claiming your answer to be the best.

To expand on ill3's answer, technically you *could* just do

dir *pdbwhatever.xls
and
dir 20061223*

BUT if naming (or timestamping) changed, you would also be matching

20061223_apdbwhatever.xls
and
2006122308_otherfile.xls

With the underscore, there is no danger of this happening.
posted by Deathalicious at 10:52 PM on December 21, 2006 [1 favorite]
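
The wildcard examples above can be tried out with Python's standard fnmatch module. Treat this as an approximation: fnmatch follows Unix-style globbing, which may not behave identically to the Windows dir command, and the file list and the helper matching are invented for the demonstration.

```python
from fnmatch import fnmatchcase

FILES = [
    "20061223_pdbwhatever.xls",
    "20061223_apdbwhatever.xls",
    "2006122308_otherfile.xls",
]

def matching(pattern, names=FILES):
    """Return the names that match a shell-style wildcard pattern."""
    return [n for n in names if fnmatchcase(n, pattern)]

# Without the underscore in the pattern, the longer timestamp and the
# longer name are swept up too.
print(matching("20061223*"))
print(matching("*pdbwhatever.xls"))
# With the underscore as a delimiter, only the intended files match.
print(matching("20061223_*"))
print(matching("*_pdbwhatever.xls"))
```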


There really is no benefit (sorting-wise, anyway) to using the underscore. It makes it a ton easier to read though, compared to not having it at all.
posted by antifuse at 1:45 AM on December 22, 2006


Please be aware that many people who use GUI-based file apps find underscores to be one of the most annoying things ever. I'd rather have 3-character filenames than underscores.
posted by blue_beetle at 9:56 PM on December 26, 2006


From a web developer point of view:
when a filename has a space in it, it is read by the browser as this: %20

I created a small example for you here: http://jammo.net/filename.html

I just got into the habit of always using underscores for my file naming, but it's entirely up to you.
posted by jamjammo at 10:48 PM on December 26, 2006

