Excel-fi Question: Data, and not like the Android
September 2, 2014 1:13 PM   Subscribe

Trying to consolidate excel data with three variables (workbook date, name &widgets) and could use help. What is the most efficient way to combine this across weeks to make it summary with detail by week? What is the best tool to do this? Excel? Access? The fog of confusion follows clicking this link.

I'm trying to make sense of widget numbers that have collected and stored by a different individual. They do not think the same way that I do, and I'm having trouble translating their data into a matrix that I can use for basic metrics.

I have 26 workbooks (one for each week in the first half of this year). Each is dated for the week it was created.

It contains a list of names of team members (between 20-40) and a list of widgets, but doesn't contain the date created.

The widgets currently appear as row headings, and the columns contain the names of the individuals. The date isn't really captured in the body of the spreadsheet of the workbook. To make it a little harder to consolidate, there are random spaces in between to "make it pretty" which makes it very hard to consolidate. I know that I'll have to do a lot of manual work to consolidate it, but think it will be worth it if someone can help me understand how to work with data that has three variables.

What is the most efficient way to combine this across weeks to make it summary with detail by week? Essentially I want to be able to see "Bobby" "Week Number" "Amount of Widgets" but am having trouble getting my brain to convert this data from two axis to three.

Please let me know if I'm not including enough data to answer this question. I appear to have tons of it, and no way to use it.
posted by Draccy to Technology (15 answers total) 2 users marked this as a favorite
I know a couple of tricks for getting rid of blank rows in Excel:

click Find & Select -->Go to Special--> and go to blanks. You can then delete the blanks. Populated rows most be entirely populated if you use this trick, otherwise your data may end up misaligned.

Use the countblank (e.g., =countblank(a1:h10)) function to identify rows that are entirely blank for a given range, then filter out those rows.

Given what you've described, I would have the database set up as:
Name, week, widgetvalue
Bobby, 1, 25
Bobby, 2, 34

etc, creating a new week variable if none exists on the spreadsheet. You could use either Access or Excel to make pivot tables from there. Personally I don't like Access.
posted by _cave at 2:14 PM on September 2, 2014

I agree with _cave's assessment - whether by doing it manually or by using a tool, you need to remove the blanks and get all the data into a single sheet.

Just remember (or inform the different individual if they're still involved): the data goes in one sheet. The pretty presentation of that data goes in another.

Making a pivot table in Excel is pretty easy as long as you have a row that is dedicated to column labels (again, as in _cave's example). You can make a pivot table that gives you a drop-down selection to choose the week you wish to display, or you can nest widgets under users under weeks, or you can have it display by widget first, etc. They're pretty versatile, but to feed them you need a nice clean block of data. One nice thing about them: the data in them doesn't have to be sorted in any particular fashion - that's what a pivot table does.
posted by komara at 3:09 PM on September 2, 2014 [1 favorite]

Does each worksheet contain one sheet? Is it called the same in each workbook?

Does the sheet look like

Name, Widget
Alice, 12
Bob, 13
Claire, 10

Or is it different to that?
posted by 92_elements at 3:10 PM on September 2, 2014

Also I've been experimenting using csv's to combine loads of data sets together. Let me know if that would be useful to you.
posted by 92_elements at 3:14 PM on September 2, 2014

Thanks everyone! I'll be zapping the blanks, and even thought of opening all the workbooks at once and selecting all sheet and correcting the data like that. I'm just not sure how to capture the date of the transactions since column headings and row headings are already preset. How can I have a third access on that data set? Or should it be a new row heading? I'd ideally like to be able to track by name or by date.

posted by Draccy at 3:32 PM on September 2, 2014

It's like this

Widgets, Name 1, Name 2, Name 3, Name 4
Whirly Gig, 1, 3, 0, 1
ZappyDoo, 0, 0 , 0, 1

Where commas = column breaks
posted by Draccy at 3:33 PM on September 2, 2014

From what I can tell, the issue is that you have widgets as rows rather than people's names as rows?

Then it seems like you'd have to transpose all of the data in order to get it into the right format (In Excel: Ctrl C/ copy-->right click-->paste special-->transpose). That will give you names as rows and widgets as columns. You can add a date row to this spreadsheet to get it into the format that I mentioned, assuming that each of the tabs in the workbook or whatever are identified by date.

Problems may arise if you have different widgets from tab to tab when you try to bring all of the data together, as your column headers would not then match up after you transpose everything.

You're probably going to need to correct that by hand if so by identifying all possible widgets for the total data set. If this is going to be a problem, I would recommend going through and copying all of the widget names from each tab, copying them into a single column, then using the remove duplicates function to determine unique widget values. Then you sort that list alphabetically, and also go back and sort widgets alphabetically for each tab of data. That will make it easier to match up columns, but I still think it will end up being a kind of tedious process. If you have more widgets than people then I can see why you might choose to use persons as columns.

Are you saying that there's something that prevents you from changing the table row/ column shema?
posted by _cave at 4:27 PM on September 2, 2014

also this might be more work than necessary given that pivot tables are pretty flexible in how they let you specify row/ column values.
posted by _cave at 4:30 PM on September 2, 2014

There's nothing stopping me from doing it, I'm just dreading having to do it. Part of me thinks it's not even worth the effort, and it might be better to just start fresh. The problem I find is that the people who create this, create it for people to use on paper. I find that useless. We should be able to query, and crunch numbers!
posted by Draccy at 4:33 PM on September 2, 2014

I feel your pain. It's amazing how disordered spreadsheets can become.

What do you feel about vba/macros?

How much time do you want/have to spend on this?
posted by 92_elements at 10:22 PM on September 2, 2014

Also I suspect this data has been manually typed in. This leads to a high probability that there will be typos and mismatches between the sheets. Both in the names of people and widget names. So this will need to be checked for and (probably) elimated in your process.
posted by 92_elements at 10:26 PM on September 2, 2014

How can I have a third access on that data set?

Use 4 columns: name, week number, widget type and widget quantity

Bill, Week2, ZappyDoo,5
Bill, Week3, ZippyDoo,6
Bill, Week3, DooDah,4
Jill, Week4, ZappyDoo,7

Pulling this data into one sheet from 26 different sheets might be a bit of work to start with, but once you get a single file per week, you can do it quickly. Removing the blank lines and running a pivot table would probably be the fastest. You don't need to work with the original file when you summarize them. Make a copy of the file and remove all of the prettiness so you can work with it.
posted by soelo at 8:26 AM on September 3, 2014

I decided to mock up some examples to show how easy it is to use a pivot table once you have the data in place.

soelo has the correct idea: four columns for data, no blank lines. It would look something like this.

Once you have that data in place make sure there's a header (title, whatever you want to call it) on every column. These are the keys that Excel will use to identify data when you're modifying the pivot table (everything in Row 01 in this example). Click anywhere within the data you wish to use, and then go to Insert -> Pivot Table.

You'll get a blank table on the left with a floating box called 'Pivot Table Field List' on the right. From that field list you can choose which columns go into what areas.


Here's my first example. In A1,B1 you'll see a selection for Date. It's currently set to show (All) date ranges but you can drop that down to select any single date or any range of dates. Below that in the table is the list of worker names and underneath each name is the list of widgets they made in that date range selected. To the right is their total widget count, as well as count for individual widgets.

If you look to the 'Pivot Table Field List' box on the right you'll see that 'Date' is in the Report Filter category, 'Worker Name' is in the Row Labels with 'Widget Name' nested underneath, and then the Values field is the 'Sum of Quantity'. All I did was drag those from the list of field at the top of the box down into those areas.


Here's a second example. I haven't changed any of the raw data, I've just moved the fields around in the 'Pivot Table Field List' box. Now the selector at the top is for the widgets themselves, and the rows show each worker name and the dates upon which they made the widgets in question, and how many. I could just as easily have nested the worker names underneath the dates if that were how we wanted to look at the data.


A final example but by no means anywhere near the limit of what a pivot table can do. In this example I've put both widget and date in the Report Filter area. This means you can select both a widget and a date and the pivot table will show who made them on that date and how many they made. I've selected a specific widget and date for this screenshot to make that more clear.


Hopefully those screenshots will show how easy it is to take one nice clean spreadsheet full of data and use it to create a human-readable output in any form you desire. I did change the visual formatting of the pivot table a bit to make it easier to view in the screenshots, and if you start digging around in the settings you'll find there are many options for how data is presented.
posted by komara at 2:01 PM on September 3, 2014

I realize I should have looked back up at your original question. You said: "Essentially I want to be able to see "Bobby" "Week Number" "Amount of Widgets"

If you wanted that presentation you could do something like this where there is no field being used as the Report Filter, and the rows have widgets nested under date nested under worker name. That will still give you total widgets per worker name as well as weekly totals.
posted by komara at 2:10 PM on September 3, 2014

(final note: once you add more data to your data tab all you have to do is right-click on the pivot table and choose 'Refresh' and it will update - no need to re-do all your work every time.)
posted by komara at 2:15 PM on September 3, 2014

« Older Baby's first insurance penalty   |   Who said: "Our tradition is: We shoot cannibals"? Newer »
This thread is closed to new comments.