How can I do... anything? with this JSON file?
June 27, 2013 7:47 AM

So, I've found the source of all the data I need in my database, but it is in a JSON file. MSSQL cannot read JSON, and the entire internet seems devoted to telling me how brilliant and easy JSON is and then giving me 20 pages of code to deal with it. I don't have time to implement a JSON parser in C or anything like that. I just need to put this perfectly simple bit of data into a database. Can you help?

I've tried online converters, but the file is just a bit too big for them, it seems. A downloadable one might work, but I can't find one.

I do have a table-valued function in MSSQL which takes a JSON parameter, but I'm quite new to MSSQL and I don't really know how to feed the parameter into it.

Any ideas?
posted by Just this guy, y'know to Computers & Internet (11 answers total) 3 users marked this as a favorite
 
How is the file structured? Can you paste a snippet?
posted by SemiSophos at 7:51 AM on June 27, 2013


And how big is the file?
posted by Gomez_in_the_South at 7:54 AM on June 27, 2013


Response by poster: I think this is one element.

{"STANOX":"06309","UIC":" ","3ALPHA":" ","NLCDESC16":" ","TIPLOC":"WEST534","NLC":"999819","NLCDESC":"WESTERTON SIG YH534"}

It is British Railway data, which is (I think) public,
http://nrodwiki.rockshore.net/index.php/Reference_Data

It is 6 MB
posted by Just this guy, y'know at 7:55 AM on June 27, 2013


Got access to a scripting language? With Perl, I'd use JSON::XS to read it, and Text::CSV::Slurp to write it back out.

It may be a bit of a stretch to have arbitrary data map to a flat CSV file, so some intelligence is required in your converter.

Alternatively, use xmlsh's json2xml to convert it to XML and use MSSQL's XML import. I'm hand-wavey on the details of that, though.
posted by scruss at 7:58 AM on June 27, 2013


Best answer: Your link doesn't let me download the file (it asks for a username and password). But the format looks simple and I bet I could convert that into a CSV file in about 5 minutes. Can you email me the file?
posted by ryanrs at 8:03 AM on June 27, 2013 [2 favorites]


Best answer: I've manually converted data from a large array of JSON objects (in my case, a table of all counties in the US, plus info and IDs on each) to a CSV before, when I needed it as a flat file. Open the file in Notepad++ or similar and replace all instances of ':' and ',' with '\t', then open it in Excel, strip the quotes, add headers, delete the columns holding the key names, and import.

It's crude and assumes the same keys are in all the JSON objects (and watch out for Excel auto-formatting certain types of values), but it works just fine for simple data.
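
Roughly the same trick in code, if you'd rather script it (a crude sketch with the same caveats; the file name is assumed from the answers below):

import re

with open('CORPUSExtract.json') as f:
    text = f.read()

# Grab each innermost {...} record and turn the field separators into tabs;
# keys and values end up in alternating columns, quotes left for Excel to strip.
for record in re.findall(r'\{([^{}]*)\}', text):
    print(record.replace('":"', '\t').replace('","', '\t'))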
posted by postcommunism at 8:08 AM on June 27, 2013 [1 favorite]


A little bit of googling and I came across this: Consuming JSON Strings in SQL Server

Though this is probably something you should do outside the database, with the scripting language of your choice.
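
For example, something like this would do the whole job in one pass (just a rough sketch — the dbo.Tiploc table, its columns and the connection string are all made up here, so adjust to your own schema):

import json
import pyodbc  # or whatever MSSQL driver you prefer

# File name and the top-level 'TIPLOCDATA' key are as seen elsewhere in this thread.
with open('CORPUSExtract.json') as f:
    rows = json.load(f)['TIPLOCDATA']

# Hypothetical target table dbo.Tiploc with one varchar column per JSON key.
conn = pyodbc.connect('DRIVER={SQL Server};SERVER=localhost;'
                      'DATABASE=Rail;Trusted_Connection=yes')
cols = ['STANOX', 'UIC', '3ALPHA', 'NLCDESC16', 'TIPLOC', 'NLC', 'NLCDESC']
sql = ('INSERT INTO dbo.Tiploc ([STANOX],[UIC],[3ALPHA],[NLCDESC16],'
       '[TIPLOC],[NLC],[NLCDESC]) VALUES (?,?,?,?,?,?,?)')
cur = conn.cursor()
cur.executemany(sql, [[r.get(c) for c in cols] for r in rows])
conn.commit()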
posted by RonButNotStupid at 8:26 AM on June 27, 2013


Best answer: #!/usr/bin/python
import json

f = open('CORPUSExtract.json')
j = json.load(f)

keys = ['TIPLOC', 'UIC', 'NLCDESC16', 'STANOX', 'NLC', '3ALPHA', 'NLCDESC']
print ','.join(keys)

for row in j['TIPLOCDATA']:
  values = []
  for key in keys:
    values.append(row[key])
  print ','.join(values)


520 ~/json$ ./read-rail-table.py >out.cvs
521 ~/json$ head out.cvs
TIPLOC,UIC,NLCDESC16,STANOX,NLC,3ALPHA,NLCDESC
, ,MPTE HQ INPUT, ,000800, ,MERSEYRAIL ELECTRICS-HQ INPUT
, ,CAPRI HQ INPUT, ,000900, ,AUTHORISED INPUT LOCATION 0009
, ,TFLC INTERFACE, ,001700, ,TFLC INTERFACE
, ,ATOS C INTERFACE, ,001800, ,ATOS C INTERFACE
, ,HQ INPUT TTL OS, ,001900, ,HQ INPUT TTL OPEN SCHEME
, ,LUL THRU TICKETS, ,002000, ,LUL THROUGH TICKET ISSUES
, ,TFLA INTERFACE, ,002100, ,TFLA INTERFACE
, ,TFLB INTERFACE, ,002200, ,TFLB INTERFACE
, ,ATOC INTERFACE, ,002300, ,ATOC INTERFACE

Looks ok...


524 ~/json$ cat out.cvs | sed -Ee 's/[^,]+//g' | uniq
,,,,,,

No commas in the data fields, which is good because I didn't handle escaping.


526 ~/json$ fgrep -n '"' out.cvs
10784: , , , ,132144, ,DUTY SHIFT MANAGER "E
10785: , , , ,132145, ,DUTY SHIFT MANAGER "F
10900: , , , ,132530, ,DUTY SHIFT MANAGER "B
10901: , , , ,132531, ,DUTY SHIFT MANAGER "C
10902: , , , ,132532, ,DUTY SHIFT MANAGER "D
10940: , , , ,132588, ,DUTY SHIFT MANAGER "A
10941: , , , ,132589, ,DUTY SHIFT MANAGER "B" TYPE 2
49984: , , , ,880629, ,PERTH COMBINED "A" STORE
53143: , , , ,980677, ,SALKELD ST COMBINED "A" STORE
53555: , , , ,983751, ,SLATEFORD COMBINED "A" STORE

The unmatched double quotes look bogus, but they are like that in the json file. I blame the original upstream conversion to json for messing them up.

Problem solved! Caveats:
1) I spent about 15 minutes on this and did very little sanity checking.
2) I am drunk.
posted by ryanrs at 8:29 AM on June 27, 2013 [4 favorites]


Best answer: Oh wait. CSV files need to have double quotes escaped. Whoops.

change
values.append(row[key])
to
values.append('"%s"' % row[key].replace('"', '""'))

Let me know if MSSQL barfs when importing the file. It may need more tweaks.
posted by ryanrs at 8:41 AM on June 27, 2013 [1 favorite]


> No commas in the data fields, which is good because I didn't handle escaping.

  ಠ_ಠ

This is why we don't roll our own CSV parsers or writers. Python's csv module, with handy classes like csv.DictWriter, makes your code shorter and more robust.

Don't forget: "Code drunk, debug hung-over."
posted by scruss at 11:42 AM on June 27, 2013 [5 favorites]


Respectfully submitted:
import fileinput
import json
import csv
import sys

# Slurp the whole file (or stdin) and parse it as JSON.
l = []
for line in fileinput.input():
    l.append(line)
myjson = json.loads(''.join(l))

# The records live in a top-level 'TIPLOCDATA' array (see the snippet above),
# so pull that out rather than iterating over the wrapper object itself.
rows = myjson['TIPLOCDATA']

# Collect every key that appears in any record, so no column gets dropped.
keys = {}
for i in rows:
    for k in i.keys():
        keys[k] = 1

# DictWriter handles the quoting and escaping that tripped up the script above.
mycsv = csv.DictWriter(sys.stdout, fieldnames=keys.keys(),
                       quoting=csv.QUOTE_MINIMAL)
mycsv.writeheader()
for row in rows:
    mycsv.writerow(row)
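
(Hypothetical usage, assuming you save it as json2csv.py: python json2csv.py CORPUSExtract.json > out.csv, then import that into MSSQL however you like.)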

posted by scruss at 7:44 PM on June 27, 2013 [2 favorites]


This thread is closed to new comments.