Getting stuff via HTTP POST for the semi-n3wb
August 12, 2015 12:38 PM

The good news: I've been granted access to a vast wonderland of streaming data that will make all my dreams come true. The bad news: I don't know how to access it. HTTP POST and JSON details inside...

This really boils down to asking the basics of how to pull data via HTTP POST.

I'm a journalist with ancient programming knowledge. (When I program, I do it in C, if that gives you a sense.) The data I'm hoping to access is fairly simple stuff: 12 data fields, nothing terribly complex. The one I'm going to be filtering by, to begin with, is just a 6-hex-digit number. (Call this field "numhex." Call the others "field2", "field3", etc.)

What I'd like to do is pull down all data where numhex is a member of a small set, say, {aaaaaa, abaaa1, caaaa3}

What I've been told is that the data is gettable via HTTP POST, and the output format is JSON. I've also been told that a query to the endpoint will have this packaged in the HTTP POST: "user=user_API&password=password&format=json". I got as far as downloading cURL before my head exploded.

Can anyone give me a quick step-by-step primer for how to get access to the sweet, sweet data?

Thanks much!
posted by cgs06 to Computers & Internet (10 answers total)
 
This depends on the language you want to work with; in PHP it would look something like the following (returns an array containing all data):

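<?php

// Send the credentials in the POST body, not in the URL query string (which would be a GET).
$context = stream_context_create( array( "http" => array(
    "method"  => "POST",
    "header"  => "Content-Type: application/x-www-form-urlencoded",
    "content" => "user=user_API&password=password&format=json",
) ) );
// Passing true as the second argument makes json_decode() return arrays instead of objects.
$sweet_sweet_data = json_decode( file_get_contents( "http://domain.com/api", false, $context ), true );
?>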

Then just loop through the data and filter to your heart's content.
posted by axismundi at 1:01 PM on August 12, 2015


Maybe using the Postman Chrome extension will be a little easier for you than using cURL? It lets you just paste in the body you want to send via your POST request, e.g. "user=user_API&password=password&format=json", assuming you have a valid user_API and password.

The thing is, "user=user_API&password=password&format=json" looks like the kind of thing that normally goes in a URL (an https URL) rather than a POST body.

Can you give us the actual URL you are trying to post to?

(Also, I don't get how streaming comes into play here. What you're doing is making a basic HTTP request: you'll open a socket, make the request, get the response in its entirety, then close the socket. When you stream, you usually leave the socket open and chunks come down a bit at a time.)
posted by ignignokt at 1:01 PM on August 12, 2015 [2 favorites]


The thing is, "user=user_API&password=password&format=json" looks like the kind of thing that normally goes in a URL (an https URL) rather than a POST body.

There is nothing surprising about this. Payloads in a POST request are URL-encoded all the time.
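
To make that concrete, here is a minimal Python sketch using only the standard library. It assumes nothing about the real service; the field names and values are just the placeholders from the question. It shows that the "key=value&key=value" string is simply form encoding, which is the same whether it travels in a URL or in a POST body:

```python
from urllib.parse import urlencode, parse_qs

# Placeholder credentials taken from the question, not real values.
params = {"user": "user_API", "password": "password", "format": "json"}

# urlencode() builds the same key=value&key=value string whether it ends up
# in a URL query string (GET) or in the request body (POST).
body = urlencode(params)
print(body)

# A server decodes the body back into named fields in the same way.
decoded = parse_qs(body)
print(decoded)
```

The only difference between the two uses is where the string is placed in the request, not how it is written.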
posted by invitapriore at 1:03 PM on August 12, 2015 [1 favorite]


Is there a way to filter the data on the server, or do you have to do it on the client? On the client side you can use cURL and jq in combination to do this. The command lines below may need to be tweaked to work on Windows; I'm testing on a Mac.

To fetch the data and format it in a human-readable way:
$ curl --data "user=user_API&password=password&format=json" http://link.to.endpoint | jq .

To do the actual filtering, something like:
$ curl --data "user=user_API&password=password&format=json" http://link.to.endpoint | jq 'map(select(.numhex == "aaaaaa" or .numhex == "abaaa1" or .numhex == "caaaa3"))'

Ideally, you could do the filtering server-side, but the provider would have to tell you how to request that in a way specific to their service. It would probably involve changing either the URL or the data you put in the HTTP POST to specify the filter.
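
If you end up scripting it instead, the same client-side filter might look roughly like this in Python. This is a sketch that assumes, as the jq command above does, that the response is a JSON array of objects each carrying a "numhex" field; the sample records are made up for illustration:

```python
import json

# The set of identifiers from the question.
WANTED = {"aaaaaa", "abaaa1", "caaaa3"}

def filter_by_numhex(records, wanted=WANTED):
    """Keep only the records whose numhex value is in the wanted set."""
    return [r for r in records if r.get("numhex") in wanted]

# Made-up sample standing in for the real response body.
sample = json.loads(
    '[{"numhex": "aaaaaa", "field2": 1},'
    ' {"numhex": "ffffff", "field2": 2},'
    ' {"numhex": "caaaa3", "field2": 3}]'
)
matches = filter_by_numhex(sample)
print(matches)
```

Using a set for the wanted values keeps the membership test cheap even if the list of identifiers grows.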
posted by pocams at 1:04 PM on August 12, 2015 [4 favorites]


What I'd like to do is pull down all data where numhex is a member of a small set, say, {aaaaaa, abaaa1, caaaa3}

What I've been told is that the data is gettable via HTTP POST, and the output format is JSON.


A POST request is a pretty general thing. I doubt the service you will be using returns all of the data all of the time, so you probably need to either request a dynamically generated URL or put something in the POST body to tell the server what data to give you.

This is not something that can be answered in general; you'll have to ask the people providing the service for more details.
posted by Dr Dracator at 1:09 PM on August 12, 2015


Response by poster: Thanks for the answers so far...

I'd rather not post the actual URL, but it's of the form
https://api.company.com/webservice.php
if that helps...

I'll get my hands on jq and Postman this evening and play around... thanks for those suggestions.

I *assume* the filtering would be on the client side, as it's a fairly large amount of real-time data that I'm looking for, but I'm not 100% sure.

As for streams... I know that there exists a stream of data... each object -- numhex is a unique identifier -- is having its attributes constantly updated. Whether I'm tapping the data directly, or a periodically-refreshed database at a discrete time period, or what, I don't rightly know.
posted by cgs06 at 1:22 PM on August 12, 2015


I feel like you'd probably benefit from a more general understanding of HTTP (the Wikipedia articles on HTTP and on the POST method in particular are decent enough primers), as you may not have the correct mental model for this problem, and cargo cult programming is a terrible way to live.

Speaking a bit more specifically to your question, HTTP is a pretty simple client/server communication protocol. When you're POST-ing to the URL (in your example, https://api.company.com/webservice.php), you're sending a request where the body of that request is that URL-encoded string you mentioned. The specifics of what should go in the body of that request are determined by whoever is providing the webservice, so you would need to get the specific API details from them.
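
As an illustration of those pieces, here is a small Python sketch that builds such a request with the standard library and inspects it without sending anything. The URL is the placeholder form given in the thread and the body is the example string from the question; the real values would come from the provider:

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder URL of the form mentioned in the thread, not a real endpoint.
URL = "https://api.company.com/webservice.php"

# The request body: the URL-encoded string from the question, as bytes.
body = urlencode(
    {"user": "user_API", "password": "password", "format": "json"}
).encode("ascii")

# Supplying data= already implies POST; the method is stated explicitly for clarity.
req = Request(
    URL,
    data=body,
    method="POST",
    headers={"Content-Type": "application/x-www-form-urlencoded"},
)

print(req.get_method())  # the HTTP method
print(req.full_url)      # where the request would be sent
print(req.data)          # what would travel in the body

# Actually sending it would be urllib.request.urlopen(req); that is not done here.
```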

Assuming you're able to successfully craft the request (and yeah, the curl commands or the Postman extension mentioned are probably going to be the easiest way to go about doing that), you'll get back a response from the server with the data. In your case, it sounds like this data is going to be in the JSON format, but again the exact details of the format of the response are up to whoever implemented the API, so we'd either need documentation or at least a sample of the response to give you specific advice on parsing this data into something usable.

As far as filtering goes, unless this is something that the webservice API supports as one of the POST parameters, you'll need to read in the entire response and then do the filtering yourself, client-side.

Also, actual data streams which constantly update over the same request/connection are relatively rare in HTTP (which was not built as a streaming protocol); typically it's one request, one response, and then the session terminates. If you need to get up-to-date values, you'll need to pull the data again by performing another request.
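
Repeating the request on a timer could be sketched like this in Python. The fetch function here is a stand-in that returns made-up data; in practice it would perform the POST and parse the JSON. The interval and poll count are arbitrary illustration values, and the provider may have rate limits worth asking about:

```python
import time

def poll(fetch, handle, interval_seconds, max_polls):
    """Call fetch() repeatedly, passing each result to handle(), pausing between calls."""
    for i in range(max_polls):
        handle(fetch())
        if i < max_polls - 1:
            time.sleep(interval_seconds)

# Stand-in for a function that performs the POST and returns parsed JSON.
def fake_fetch():
    return [{"numhex": "aaaaaa", "field2": time.time()}]

snapshots = []
poll(fake_fetch, snapshots.append, interval_seconds=0.01, max_polls=3)
print(len(snapshots))
```

Each call is an independent request and response, so every snapshot is a fresh copy of whatever the server had at that moment.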
posted by Aleyn at 12:24 AM on August 13, 2015 [1 favorite]


You're gonna have to script it. I suggest Python -- if you're on a Mac it's built-in, on Windows it's easily installed.

The "requests" library makes it pretty straightforward (you'll have to install package 'requests' via pip). This is just a toy example but should give you an idea:
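import requests

wanted_ids = ['4a1111', '5b2222', '6c3333']

for wanted_id in wanted_ids:
    # Dictionary keys must be quoted strings.
    payload = {
        'user': 'myself',
        'password': 'swordfish',
        'format': 'json',
        # Assumption: the API accepts a server-side filter parameter with this
        # name. Ask the provider; if it doesn't, drop this line and filter locally.
        'numhex': wanted_id,
    }
    # data= sends the payload form-encoded in the POST body
    response = requests.post('http://some.company.com/api.php', data=payload)
    response.raise_for_status()  # stop here on an HTTP error status
    # dump to a file, if you want
    with open(wanted_id + '.json', 'w') as f:
        f.write(response.text)
    my_data = response.json()
    # the received data is now in my_data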

posted by neckro23 at 11:05 AM on August 13, 2015


Seconding Postman for ad hoc requests, and Python if you want your computer to do it regularly.
posted by doiheartwentyone at 2:59 PM on August 13, 2015 [1 favorite]


Response by poster: Thanks, everyone.

Got it working... I can't filter on the server side, but for the moment, I can reliably pull the data down and have it append to a file in CSV format, which I can then query and filter locally.
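
For anyone finding this later, the append-to-CSV step can be sketched roughly like this in Python. This is not my exact script; the column names are the stand-ins from the question (the real feed has 12 fields), the records are made up, and a temporary file is used so the example is safe to run:

```python
import csv
import os
import tempfile

# Stand-in column names from the question; the real feed has 12 fields.
FIELDS = ["numhex", "field2", "field3"]

def append_records(path, records, fields=FIELDS):
    """Append records to a CSV file, writing the header only if the file is new or empty."""
    is_new = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="") as f:
        # extrasaction="ignore" skips any keys that are not in the column list.
        writer = csv.DictWriter(f, fieldnames=fields, extrasaction="ignore")
        if is_new:
            writer.writeheader()
        writer.writerows(records)

# Demonstration with a temporary file and made-up records.
csv_path = os.path.join(tempfile.mkdtemp(), "feed.csv")
append_records(csv_path, [{"numhex": "aaaaaa", "field2": 1, "field3": "x"}])
append_records(csv_path, [{"numhex": "caaaa3", "field2": 2, "field3": "y"}])

with open(csv_path, newline="") as f:
    rows = list(csv.DictReader(f))
print(rows)
```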

I appreciate all the help!
posted by cgs06 at 8:31 PM on August 14, 2015

