Please help me create a pdf document for over 10 years of payslips.
December 4, 2019 3:57 AM

I only have access to my payslips for a very short time. I have downloaded most of them but there's a discrepancy of about 11 that I've missed. Unfortunately, when I downloaded them (as .pdf files) I named them 'payslip payadvice dd mm yy' so of course they've not been sorted into date order. What I need to know is how to put the payslips that I have into date order so I can compare them to the ones that are on the server that I downloaded them from and pick up the ones I missed.

This is really time critical and I have little to no experience with spreadsheets or making .pdf documents (apart from saving documents as .pdf files). I have access to Microsoft Office.

Please step me through this process, keeping in mind that I have absolutely no idea how to get these files into date order, given the way they've been named, by using formulas or algorithms in spreadsheets. I also have no idea how to publish them all as separate pages in a single .pdf document in a way that isn't going to take me a month or so.

If you want to perform a Christmas miracle and do this for me because you know how to do it really easily then I would be eternally grateful and would feature you in anecdotes for the rest of my days.

Realistically, giving me somewhere to start that's efficient would be almost as good.
posted by h00py to Technology (14 answers total)
 
Is this a Mac or a PC? The command line might be the best way of renaming many files at once, but the syntax will depend on what kind of computer you’re using.
posted by vitout at 4:10 AM on December 4, 2019 [2 favorites]


So there's no chance you downloaded them in order by clicking your way down the list, such that you could sort by the creation date?
posted by teremala at 4:37 AM on December 4, 2019 [2 favorites]


Do you have excel (or numbers or openoffice....)?

If they're all in one directory, something like:

dir /b pay* >importme.txt

(for windows)
or

ls pay* >importme.txt


(on something nixish)

...will give you a text file listing that you can import to a spreadsheet.

Then do split-by-delimiter ("Text to Columns" is, I think, what Excel calls it). Make your delimiter a space, and now you can sort by year, month, day as dog intended.
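If the spreadsheet route feels like too much, the same split-and-sort idea can be sketched in Python. This is only a rough sketch: it assumes every name looks exactly like 'payslip payadvice dd mm yy.pdf' and that all the years fall in the same century. The names date_key and sort_by_date are made up for illustration.

```python
def date_key(name):
    """Return (yy, mm, dd) from a name like 'payslip payadvice 31 01 19.pdf'."""
    stem = name.rsplit(".", 1)[0]        # drop the '.pdf' extension
    dd, mm, yy = stem.split(" ")[-3:]    # the last three space-separated parts
    return (int(yy), int(mm), int(dd))   # year first, so the sort is chronological


def sort_by_date(names):
    """Sort file names oldest first (assumes two-digit years in one century)."""
    return sorted(names, key=date_key)


if __name__ == "__main__":
    # Made-up sample names; in practice these would come from importme.txt.
    sample = [
        "payslip payadvice 31 01 19.pdf",
        "payslip payadvice 15 12 18.pdf",
        "payslip payadvice 15 01 19.pdf",
    ]
    for name in sort_by_date(sample):
        print(name)
```

Splitting on spaces is the same trick as the spreadsheet's space delimiter; putting the year first in the key is what makes the order come out by date.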
posted by pompomtom at 4:37 AM on December 4, 2019 [2 favorites]


Best answer: Bit of a low-tech solution, but assuming you have approximately 120 to 240 payslips, it would be time-consuming but take no more than two hours, I would think, to print them, manually arrange them in chronological order, and then scan them in as one big PDF?
posted by hepta at 4:56 AM on December 4, 2019 [2 favorites]


Best answer: Are the contents of the PDF searchable using Windows Explorer? That is, if you have the folder open, can you use the "search documents" tool to find documents with specific text in the document? If so, I would:
1. put all the payslips in one folder.
2. in that folder "search documents" for a year e.g. 2018. That should turn up only documents from the year 2018. You may have to fool with it a bit to find the right search term. (maybe ", 2018" - that is, comma space 2018)
3. Move all the documents from 2018 into their own folder.
4. Repeat 2-3 for other years.
5. Now you have them at least sorted into folders by year, and can see which years you don't have enough (24?) payslips for. If you have weekly payslips, you may want to repeat the "search documents" routine to search for the name of the month, and sort into monthly folders.
Would that work? If so, let me know if you need more specific instructions for these steps.
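For the "which years are short" check, here is a rough Python sketch of the counting step. Note that it swaps in a different technique: it reads the year from the end of each file name rather than searching inside the PDFs, so it assumes names of the form 'payslip payadvice dd mm yy.pdf'. The function name count_per_year is hypothetical.

```python
from collections import Counter


def count_per_year(names):
    """Count payslips per two-digit year, taken from the end of each name."""
    years = []
    for name in names:
        stem = name.rsplit(".", 1)[0]      # drop the '.pdf' extension
        years.append(stem.split(" ")[-1])  # the last part is 'yy'
    return Counter(years)


if __name__ == "__main__":
    # Made-up sample names for illustration only.
    sample = [
        "payslip payadvice 31 01 19.pdf",
        "payslip payadvice 15 01 19.pdf",
        "payslip payadvice 15 12 18.pdf",
    ]
    for year, total in sorted(count_per_year(sample).items()):
        print(year, total)
```

Any year whose total is lower than the number of pay periods you expect is where to start looking on the server.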
posted by evilmomlady at 5:00 AM on December 4, 2019 [1 favorite]


Response by poster: Ah, hepta. You may have confirmed what I already thought - I'm going to have to work at this. But evilmomlady, thank you for the shortcut; I think that will be very useful.
posted by h00py at 5:09 AM on December 4, 2019


Response by poster: pompomtom, I feel I should know what you're saying but alas, I have vast deficits of knowledge when it comes to Excel and spreadsheets and stuff. I even need to be told something as basic as where to type 'dir etc.', unfortunately.

I use Windows.
posted by h00py at 5:14 AM on December 4, 2019


Best answer: A higher-tech solution, that may not take too much time:

1. Open the folder where you have all these files saved.

2. Select all the files you're interested in. (If they're all in the folder consecutively, click the top one, hold your Shift key down, and click the bottom one. If they're not consecutive, then click one, hold your Control key down, and single-click on each of the others.)

3. Up in the menu of the File Explorer window, go to the "Home" tab (which is probably already visible).

4. Click the icon "Copy path"

5. Open Excel, start a new workbook

6. Right-click on a cell, and paste

This will paste the full name (and complete folder info) of every highlighted file in the folder.


If every file is actually named using a standard naming convention, you can also do this in Excel to create a list without the folder info, leaving just the names:

1. Click into the blank cell to the right of the first item in the list.

2. Type out the exact name of the file that is in the cell on the left, but without the folder info.

3. After pressing Enter, click once on that cell again. (You want the text to be saved in the cell, and the cell to be the active cell - it will have a border around it)

4. Up in the Excel ribbon, on the Home tab, click the Fill icon, and then Flash Fill.

This should automatically generate just the file names all the way down your list. Now you can delete the first column.


You can repeat the flash fill process with just the years, for example, and then sort that new column to see which ones you're missing.

Here's more info on Flash Fill if these instructions don't make sense: https://www.excel-easy.com/examples/flash-fill.html
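If Flash Fill misbehaves, stripping the folder info can also be sketched in a few lines of Python. This assumes each pasted line is a full Windows path, possibly wrapped in double quotes (which, as far as I know, is how "Copy path" pastes them). The function name file_name_only is made up for illustration.

```python
from pathlib import PureWindowsPath


def file_name_only(pasted):
    """Return just the file name from a pasted Windows path."""
    cleaned = pasted.strip().strip('"')   # remove surrounding spaces and quotes
    return PureWindowsPath(cleaned).name  # keep only the part after the last backslash


if __name__ == "__main__":
    # Made-up path following the folder layout used as an example in this thread.
    pasted = '"C:\\Users\\yourusername\\Documents\\payslips\\payslip payadvice 31 01 19.pdf"'
    print(file_name_only(pasted))
```

PureWindowsPath only manipulates the text of the path, so this does not touch the files themselves.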
posted by SuperSquirrel at 6:00 AM on December 4, 2019 [2 favorites]


Best answer: If you're using Windows, try this bit of command-line magic. It's a little technical but it will get your files renamed into a format that will allow Windows to sort them by date:
  1. Copy the path to wherever your payslips are stored; if you have the folder open in Windows Explorer, you can get this by clicking the path bar that shows where the folder is located. It will look something like "C:\Users\yourusername\Documents\payslips".
  2. Open up Windows PowerShell by pressing Start and typing "powershell".
  3. In the window that appears, change directories to wherever your payslips are stored by typing:
     cd "C:\Users\yourusername\Documents\payslips"
    (paste in the path that you copied in step 1, and put it in double-quotes).
  4. Paste the following (using control-V) and press enter:
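    Get-ChildItem | ForEach-Object {
        # Only touch names like "payslip payadvice 31 01 19.pdf"
        if ($_.Name -match '^(payslip payadvice) (\d{2} \d{2} \d{2})\.pdf$') {
            # Read the "dd MM yy" part of the name as a date
            $fileDate = [DateTime]::ParseExact($Matches[2], 'dd MM yy', $null)
            # Rename to "yyyy-MM-dd payslip payadvice.pdf" so names sort by date
            Rename-Item -LiteralPath $_.FullName -NewName ($fileDate.ToString('yyyy-MM-dd') + ' ' + $Matches[1] + '.pdf')
        }
    }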
    
Assuming your PDFs are all named exactly like you said (for example, "payslip payadvice 31 01 19.pdf", for January 31st 2019), this will rename them all to be like this: "2019-01-31 payslip payadvice.pdf".

It will work across multiple years too. If you name your files using YYYY-MM-DD format, and sort them by name, they will also be sorted by date in Windows Explorer.
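For comparison, here is roughly the same rename sketched in Python, in case PowerShell isn't an option. It is only a sketch under the same assumption about the file names; sortable_name and rename_all are hypothetical names, and it would be wise to try it on a copy of the folder first.

```python
from datetime import datetime
from pathlib import Path


def sortable_name(old_name):
    """Turn 'payslip payadvice 31 01 19.pdf' into '2019-01-31 payslip payadvice.pdf'."""
    stem = old_name[:-len(".pdf")]                 # drop the extension
    label, dd, mm, yy = stem.rsplit(" ", 3)        # split off the three date parts
    when = datetime.strptime(f"{dd} {mm} {yy}", "%d %m %y")
    return f"{when:%Y-%m-%d} {label}.pdf"


def rename_all(folder):
    """Rename every matching PDF in `folder`; other files are left alone."""
    # list(...) takes a snapshot so renaming does not disturb the loop
    for path in list(Path(folder).glob("payslip payadvice ?? ?? ??.pdf")):
        path.rename(path.with_name(sortable_name(path.name)))


if __name__ == "__main__":
    print(sortable_name("payslip payadvice 31 01 19.pdf"))
```

Calling rename_all with the folder path from step 1 would do the actual renaming; the example above only prints what one new name would look like.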
posted by vitout at 6:00 AM on December 4, 2019 [7 favorites]


I hope others have sussed this, (and I've NFI about combining pdfs, sorry), but given your time issue I'll blather on:

On Windows, if you've just gone "download" in the browser, you'll likely have all these files in C:\Users\[YOUR ACCOUNT NAME]\Downloads


So for actual buttons:

[windows-button] (to get that all-the-things menu)
type "command" then enter (to get a command prompt)
Then the pre-excel things I said above.

Then open excel, CTRL-O (for open file) and look for "importme.txt" in your downloads directory.

Then the other things I said above.
posted by pompomtom at 6:06 AM on December 4, 2019


Soz, terrible advice. Once you have a command prompt you first need to do:

cd downloads

...to get to the correct directory.
posted by pompomtom at 6:16 AM on December 4, 2019


On anything-but-windows I'd parse the CreationDate metadata field that every PDF should have and sort on that, but I can't even find a Windows executable for pdfinfo to begin with that.

Given that time is so short, would faffing around with Excel be any quicker than just downloading them all again? Even if it is 240 files and they all try to get saved as something as unhelpful as PayAdvice.pdf, as long as you clean out your downloads folder beforehand you should end up with PayAdvice.pdf, PayAdvice (1).pdf, PayAdvice (2).pdf, …, PayAdvice (239).pdf. That's in some semblance of an order, and at least you'd know you'd got them all.
posted by scruss at 8:18 AM on December 4, 2019


Bulk Rename Utility is a Windows program that will rename files like this quickly. In this case, you would probably want to use the regex part to rename every file to 'payslip payadvice yy mm dd'. The help forums have plenty of regex (regular expressions) help. Then you could sort them by name and see missing dates.
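To show what such a regex does, here is a sketch using Python's re module. Bulk Rename Utility's own regex syntax may differ, so treat the pattern as an illustration of the idea (capture the three date parts, then write them back in reverse order) rather than something to paste into that program. The name reorder_date is made up.

```python
import re

# Groups: 1 = label, 2 = dd, 3 = mm, 4 = yy, 5 = extension
PATTERN = re.compile(r"^(payslip payadvice) (\d{2}) (\d{2}) (\d{2})(\.pdf)$")


def reorder_date(name):
    """Swap 'dd mm yy' to 'yy mm dd'; names that don't match come back unchanged."""
    return PATTERN.sub(r"\1 \4 \3 \2\5", name)


if __name__ == "__main__":
    print(reorder_date("payslip payadvice 31 01 19.pdf"))
```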
posted by soelo at 9:00 AM on December 4, 2019


Best answer: ...and in the future, just name them, e.g., '2019.12.04 paystub' when you download them

Then they will sort "chronologically" when you sort them alphabetically. Note: you must always fill in all the digits of yyyy.mm.dd (04, not 4).
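A quick Python sketch of why the padding matters, using made-up file names: plain alphabetical sorting compares names character by character, so '12' lands before '2' unless the month is written as '02'.

```python
padded = ["2019.12.04 paystub", "2019.02.11 paystub", "2018.11.30 paystub"]
unpadded = ["2019.12.4 paystub", "2019.2.11 paystub", "2018.11.30 paystub"]

# With every digit filled in, alphabetical order is also date order.
print(sorted(padded))
# ['2018.11.30 paystub', '2019.02.11 paystub', '2019.12.04 paystub']

# Without padding, December 2019 sorts ahead of February 2019.
print(sorted(unpadded))
# ['2018.11.30 paystub', '2019.12.4 paystub', '2019.2.11 paystub']
```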
posted by praemunire at 9:23 AM on December 4, 2019

