Automatically determining class scheduling needs
July 3, 2011 10:39 AM   Subscribe

I teach at a language training school, at which there are a few hundred students. My boss recently distributed a questionnaire to all of those students to determine the best times to schedule classes for each level of language proficiency. Unfortunately, it turns out that his plan for efficient scheduling ended at the point where we got those questionnaires back. Also unfortunately, I've developed a reputation as the local computer guy (because I can replace a broken mouse and make a chart in Excel), and so the job of making sense of the data has fallen to me.

Each questionnaire has spaces for the student's name, ID number, and current language proficiency level, followed by a grid with days of the week across the top and times along the left. The idea is that we'll be able to see, for example, that intermediate students tend to want classes on Monday evenings, whereas beginners prefer Saturday morning classes. (Let's set aside, for now, the question of whether such correlations are actually likely to exist—I doubt it, but that's not my problem right now.) We should also be able to find outliers, like an individual intermediate student who can only attend classes at times that aren't convenient for any other intermediate students, so that we can talk to that student about rearranging his schedule or transferring to another center.

Two issues:

First, someone (not me) will need to enter all the data from the questionnaires into some sort of spreadsheet or database. Ideally, this should happen in Microsoft Excel or Access, since we already have Office installed on all the computers. But this is fairly complex data: for each student, there's a 35-cell grid of available times, any conceivable combination of which might contain checkmarks. What's the best way to represent and store this data? I thought of making one huge Excel table with an artfully designed UserForm for data entry, but apparently UserForms can't have that many fields. Without a UserForm, the person entering the data would have thirty-five identical-looking fields to tab through horizontally for each record, which seems like it's inviting input errors. Should I instead do it in Access, maybe with student information in one table and each 2-hour block of time during which a particular student is available stored as a separate record in another table, connected to the first table via foreign key?

The second problem is how to turn the data, once entered, into something useful. Here I have a little more flexibility in terms of tools: I can install software on my own computer and export the data from Excel or Access into whatever format that software requires. Again, the first goal is to find the optimal days and times for classes at each level, and the second is to identify students whose scheduling needs are significantly different from those of other students at the same level. We are not, at this stage, concerned with teacher or classroom scheduling—we just want to know when students want to have classes at each level, and which students fall significantly outside of the bell curve in this respect.

So what's the best tool for this? (Extra special brownie points if it makes colorful charts. My boss loves colorful charts.)
posted by Acetylene to Computers & Internet (9 answers total) 3 users marked this as a favorite
Best answer: Here's what I would do:


The columns are:
NAME - ID - LEVEL - MONDAY TIME PERIOD 1 - MONDAY TIME 2 - MONDAY TIME 3 - MONDAY TIME 4 (and so on with the days of the week and times)

The rows are filled in with:

NAME - ID - LEVEL - and then a 1 if they're available at that time

Sort the students by the column that is language proficiency and make a new column under each proficiency level called "total"

and then you do an autosum of the day/time period column so that you get a total. So for example, all the 1s for MONDAY TIME PERIOD 1 add up and you'll know that 32 of the 48 students at proficiency level A are available for MONDAY TIME PERIOD 1.

(And if you want to report percentages to your boss, do a formula to make a percent that would be like =B2/49 and then format that cell to "percent" and then drag the formula over to all the columns (if B2 was the total number of students that want Tuesdays 1-3 and 49 was the total number of students in that proficiency level category.))

And do that for all the proficiency levels.

Then you tell your boss this (in text or verbally):

For Proficiency Level A, the most popular time requested for classes are Tuesdays from 10-12 (with 64% of them preferring that day), and so on...

And if you wanted this graphically, you'd hide all the rows and make a chart with the time periods and percentage interested.
posted by k8t at 10:56 AM on July 3, 2011 [1 favorite]

Best answer: As a next step, below your summations for each proficiency level, you can mark each time period with a 1 (popular) or a 0 (unpopular). For each student, add a column where you multiply their preference for each time (1/0) by popularity of the time (1/0) and sum it across all times. You will quickly see students who did not pick any of the popular times (they will have score of 0 where everyone else will >=1.
posted by metahawk at 12:20 PM on July 3, 2011

Google docs allows you to create lengthy online forms that feed into a spreadsheet. this means multiple people at multiple locations can do tie data entry. (it also means you could have used it to begin with)

Oh, it makes colorful charts, too!
posted by jander03 at 12:51 PM on July 3, 2011

There are a couple of answers, depending on who's answering.

A computer scientist would point you to the vast existing literature on the (currently unsolved) scheduling problem.

A programmer would tell you to re-do the survey using a Google docs spreadsheet as jander03 suggests or else using a doodle poll.

An old-school administrator who's actually had to do this before would probably tell you to photocopy all the surveys onto transparencies at reduced opacity and stack them up. (Okay, for a few hundred students, he probably wouldn't actually suggest that, but I think it's such cool trick I couldn't help tossing it in there.)

There's not really much you personally can do with several hundred paper survey forms short of hiring a data entry monkey to digitize them.
posted by d. z. wang at 1:17 PM on July 3, 2011 [1 favorite]

Can someone do a detailed walkthrough a la K8T's, using Google Docs? I understand Excel, but not this 21st century Cloud business.
posted by Net Prophet at 2:19 PM on July 3, 2011

Response by poster: Thanks for the input, everyone.

I'm still a little hesitant to do this as a huge Excel spreadsheet. With Name, ID, level, and thirty-five different time options, it just seems like it's too much to enter in a single spreadsheet row without errors. UserForms apparently can't handle that many cells, either, at least without getting VBA involved—which is a possibility, but because my VBA skills aren't great, it would likely take more development time than we want to devote to it right now.

Google Docs are a good idea, but there's one detail I forgot to mention: our school is located in Beijing, and Google Docs is blocked here. I can get around that using a VPN, but the clerical people who will be entering the data are locals, and it's not really good form, professionally speaking, to ask them to circumvent the law for their job.

The data entry step will be ugly no matter what I do, but I think it's going to be easiest in Access. From there, I can analyze the data in either Access or Excel; k8t and metahawk both gave me some good pointers for that step. Also, d. z. wang, while the transparency idea won't work for the quantity of responses we're dealing with, it did bring to mind the idea of displaying the results as a time grid and using conditional formatting to make a very simple heat map, so thank you for that.

I'm still open to further suggestions, especially with the analysis step, but I'm closer than I was when I asked this question. Thanks again to all who've commented so far.
posted by Acetylene at 6:24 PM on July 3, 2011

Can you enter the data like k8t said, but instead have each proficiency level be on a separate worksheet in Excel? I'm no expert, but that was my first thought if I had to do this. That way you can cut down on the amount of raw data on each sheet, and be able to only focus on one proficiency level at a time, which might keep you from going cross-eyed.
posted by MultiFaceted at 9:07 PM on July 3, 2011

35 time options (columns) is NOTHING in Excel. I deal with spreadsheets with thousands of columns daily.
posted by k8t at 12:36 PM on July 4, 2011

Response by poster: It's not that Excel can't handle that many columns; I was more concerned with the fact that the people entering the data might make errors in that situation—especially since they have other responsibilities as well, and could be interrupted by students in the middle of entering a record. Here's what I ended up doing, in case anyone's still following this:
  1. I created an Access database to hold the information. It was just a single table, with fields for each student's name, ID, level, and then thirty-five time options (as boolean values).
  2. I created a form for entering data into that database where each time option was a checkbox, and those checkboxes were laid out in exactly the same way as they were on the questionnaire—the same grid, with days of the week across the top and times of day down the left. This made it easy for staff to enter and then verify the data from the paper questionnaires.
  3. I put the Access database file on a network drive, split it using the Tools->Database Tools->Split Database option, and gave each member of our clerical staff a copy of the front end file so they could all work on it simultaneously. We split the questionnaires evenly between them, and they had everything entered within a few hours, despite having to deal with student check-ins at the same time. They're awesome.
  4. I exported the data from Access to an Excel spreadsheet, put it on a thumb drive, and took it home to make pretty charts.
I still haven't used metahawk's tip for finding the outliers yet; that's next on the list. At this point, I'm far enough along to see that, indeed, there's no significant correlation between student level and hours of availability, and at least half the purpose of the survey is invalid.
posted by Acetylene at 1:36 AM on July 7, 2011

« Older Desperately Seeking A Monstrous Psychedelic Bubble...   |   That old thing? Oh, it's just always been here... Newer »
This thread is closed to new comments.