Using awk to export many files from one text file
February 12, 2010 10:50 AM   Subscribe

How can I use awk (or something better) to cut a big file into groups of files, based on the occurrence of a pattern?

I have a structure of folders on a server, that works out like this:
Client A
--Job 1
--Job 4
--Job 5
Client B
--Job 2
--Job 3

And so on, for many different clients and jobs. I wish to automate pulling a directory listing (already done via find /start/here/ -type d -maxdepth 2), then output a separate document for each client, containing the list of jobs stored within.

I've gotten as far as redefining my awk delimiter from space to / (this is on OS X), and determining that the client name appears in $5, but I don't yet know enough awk to craft "gather all lines where $5 is the same, and put them into their own text file".

Or... am I using the wrong tool for the job? Is there a better way to do this?
posted by Steve3 to Computers & Internet (10 answers total) 1 user marked this as a favorite
 
Perl.
posted by phrontist at 10:59 AM on February 12, 2010


~/scratch/fgf $ ls | while read d ; do seq 10 | while read n ; do mkdir -p "$d/Job$n" ; done ; done
~/scratch/fgf $ ls *
ClientA:
Job1 Job10 Job2 Job3 Job4 Job5 Job6 Job7 Job8 Job9

ClientB:
Job1 Job10 Job2 Job3 Job4 Job5 Job6 Job7 Job8 Job9
~/scratch/fgf $ ls | while read d; do ls $d > "joblist$d" ; done
~/scratch/fgf $ ls
ClientA ClientB joblistClientA joblistClientB
~/scratch/fgf $ cat joblistClientA
Job1
Job10
Job2
Job3
Job4
Job5
Job6
Job7
Job8
Job9
~/scratch/fgf $ cat joblistClientB
Job1
Job10
Job2
Job3
Job4
Job5
Job6
Job7
Job8
Job9
posted by doteatop at 11:03 AM on February 12, 2010


try something like:
print "whatever data" >> ("/output/" $5 ".txt")
You'll get text files named by client.
My syntax here might be shaky. It's been a few years.
posted by DarkForest at 11:04 AM on February 12, 2010
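Spelled out end-to-end, that print-to-a-computed-filename approach might look like the sketch below. The demo directory layout and the out/ directory are invented for illustration; note that with relative paths the client name lands in $3, where the asker's absolute paths put it in $5.

```shell
# Build a throwaway copy of the Client/Job layout, then split the
# directory listing into one job-list file per client with awk.
demo=$(mktemp -d)
cd "$demo"
mkdir -p start/here/ClientA/Job1 start/here/ClientA/Job4 \
         start/here/ClientB/Job2 out

# Splitting on "/" makes the client name a field; with these relative
# paths it is $3 (the asker's absolute paths put it in $5).
find start/here -mindepth 2 -maxdepth 2 -type d |
  awk -F/ '{ print $NF >> ("out/" $3 ".txt") }'
```

Each out/Client*.txt then contains that client's job directory names, one per line.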


I think this will do what you need. If not, we may need more details:

for i in Client*;do find $i -type d -maxdepth 2 >`basename $i`.txt;done
posted by chrisamiller at 11:05 AM on February 12, 2010


Something like:
DIR=/start/here
while read client; do
  log="$client/$(basename "$client").log"
  basename "$client" > "$log"
  find "$client" -mindepth 1 -maxdepth 1 -type d -exec echo "--{}" \; >> "$log"
done < <(find "$DIR" -mindepth 1 -maxdepth 1 -type d)

posted by rhizome at 11:11 AM on February 12, 2010


You could do something like this. If there is some pattern that distinguishes the Client Lines from the Job Lines, you will want to match for those.


BEGIN {
    # do setup here, if needed
}

/pattern that matches Client lines/ {
    outputfile = $5 ".txt"
}

/pattern that matches Job lines/ {
    print $0 >> outputfile
}

END {
    # do something at the end, if needed...
}

posted by jefbla at 11:27 AM on February 12, 2010


Response by poster: Woah, cool. Lots of approaches for me to interpret; this is great, thanks.

Chrisamiller: I'm dangerously close to getting yours to work, but there's one hangup. The client name often contains spaces. I'm getting exactly the output I need if my client is "blackstapler", but "Coffee mug" yields:
find: /root/directory/Coffee: No such file or directory
find: mug: No such file or directory

I must be one pair of properly spaced quotes away from making it work, but I haven't determined where they go...
posted by Steve3 at 11:45 AM on February 12, 2010


Best answer: Try this:

for i in Client*;do find "$i" -type d -maxdepth 2 >`basename "$i"`.txt;done

It's good practice to quote bash variables anyway (though I tend to be lazy and only do it when I have to).
posted by chrisamiller at 12:06 PM on February 12, 2010 [1 favorite]
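A quick way to see why the quotes matter is to reproduce the space-in-name setup in a scratch directory (the client names below are invented for the demo):

```shell
# Recreate the space-in-name problem setup, then run the quoted loop.
demo=$(mktemp -d)
cd "$demo"
mkdir -p "Client blackstapler/Job1" "Client Coffee mug/Job2" "Client Coffee mug/Job3"

# Quoting "$i" passes names with spaces as a single argument to both
# find and basename, so "Client Coffee mug" is no longer split in two.
for i in Client*; do
  find "$i" -maxdepth 2 -type d > "$(basename "$i").txt"
done
```

Unquoted, find would receive "Client" and "Coffee" and "mug" as three separate arguments, which is exactly the pair of errors the poster saw.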


Response by poster: That works perfectly- thanks chrisamiller.
posted by Steve3 at 12:28 PM on February 12, 2010


"ls | while read d" is analogous to Useless use of cat. There is no need to run a whole subprocess and pipe here; the shell is perfectly capable of expanding globs on its own. "for d in *" is the right way. Actually in this case you should use "for d in */" so that the glob only matches directories and not files.
posted by Rhomboid at 4:41 PM on February 12, 2010
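Rewriting doteatop's loop along those lines might look like this (same invented scratch layout as the earlier transcript):

```shell
# Same joblist output as the ls|while version, but driven by a glob:
# "*/" matches only directories, and glob results are never word-split.
demo=$(mktemp -d)
cd "$demo"
mkdir -p ClientA/Job1 ClientA/Job2 ClientB/Job3

for d in */ ; do
  ls "$d" > "joblist${d%/}"   # ${d%/} trims the trailing slash the glob adds
done
```
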

