csv data formatting
February 13, 2009 3:07 PM   Subscribe

I have a large amount of csv data that I need formatted in a certain way. How do I do this?

I have a csv file with five columns, A-E. I wish to see them as a tree structure, with column A being the parent node of B, column B being parent node of C, and so on.

I need the structured in a certain way:

id:"xxxx",name:"Column A1",children:[{
id:"yyyy",name:"Column B1",data:[{key:"ColumnA1",value:"pppp"}],children:[{
id:"zzzz",name:"Column C1",data:[{key:"Column B1",value:"pppp"}],,children:[{

... etc

I need a script that will read in the csv and spit out the structure. Help?
posted by sidr to Computers & Internet (11 answers total)
 
Graphviz will draw it. Your description of the format of the csv is unclear, so I can't say how to get from that to graphviz's output.
posted by orthogonality at 3:36 PM on February 13, 2009


Response by poster: I don't want a graphical output, I want the text structure that I asked for!

The csv has five columns; each column has one or more values occurring multiple times.
posted by sidr at 3:53 PM on February 13, 2009


Yeah, I'm with orthogonality... I don't understand your output structure. Some platform info would help too.
posted by pompomtom at 4:03 PM on February 13, 2009


(I'd either use sed or VBA, depending on the platform, but that's just me...)
posted by pompomtom at 4:03 PM on February 13, 2009


This is really pretty easy if you can program a little, and if not, you're going to have to find someone who can. I find it very unlikely that there's any existing tool that you could just sort of plug-n-chug with.

(This shouldn't take a good pro more than a few hours to do, probably)
posted by RustyBrooks at 4:14 PM on February 13, 2009


Build a small example, space out the desired output with periods (to make sure we can follow the field structure better), and post both input & output. Like:

INPUT:
1,2,3
1,2,4
5,6,7

OUTPUT:
1
.2
..3
..4
5
.6
..7

(but that's just one way I am trying to interpret your question.)

Then we'll have a better chance of answering.

BTW, there are EXCELLENT Excel help forums out there that can answer this question as well, and are free & easy to use. FYI.
posted by IAmBroom at 5:09 PM on February 13, 2009


Is the output supposed to be JSON?
posted by nicwolff at 6:01 PM on February 13, 2009


Yeah. A better textual description would help a great deal. What does "[{" mean? Are those filler character, or indicative of something in the structure? What comes after "children"?
posted by ellF at 7:00 PM on February 13, 2009


sounds like what you want is a relational database. kinda hard to do with a script, unless it's pretty customized. if it's an ad hoc project, then you're going to have to do it manually in whatever platform you're familiar with.
posted by lester's sock puppet at 7:13 PM on February 13, 2009


id:"xxxx",name:"Column A1",children:[{
id:"yyyy",name:"Column B1",data:[{key:"ColumnA1",value:"pppp"}],children:[{
id:"zzzz",name:"Column C1",data:[{key:"Column B1",value:"pppp"}],,children:[{


Explain to me what parts of the above are taken from your file, which are text you have to add, and where the xxxx comes from, is it just a running number?

Am I correct in saying that if you were doing this manually you would "manually" write the ids (xxxx, yyyy, zzzz) and the names/keys (column a1, b1, etc)? Am I also correct in assuming that once the entire row is processed you would have up to and including E1?

I always recommend Editplus for this sort of stuff. It has keystroke recordning that is really easy to use if you aren't that into coding etc, and the ability to do regular expressions if you are.

If what I wrote above is the correct interpretation, then I would carefully record the process to create 1 record. Then I would run the keystroke recording over and over again.
Important here would be that instead of changing zzzz and columnb1 and so forth, just let it be a placeholder, same for all of them xxxx and columnxx. From what you are showing in the example your various keys, ids, etc are always in the same columns, so you can afterwards just make a long list in excel using the "fill repeating data", same as you do to make a list of consecutive numbers, and then "column paste" that over the columns with the ids, and keys. Does that make any sense at all? Might make a little more sense if you grab Editplus and check out the keystroke recorder.
posted by Iteki at 2:32 AM on February 14, 2009


You could install MySQL on your machine and have that easily transform the data.
posted by JJ86 at 8:58 AM on February 14, 2009


« Older Help me get my plant laid.   |   Why do some social web platforms catch-on (e.g.... Newer »
This thread is closed to new comments.