I want a special snowflake server for christmas!
December 25, 2010 1:12 AM

How do I implement my special snowflake server?

When a client connects to the server, the server should send some pre-defined data. I'm thinking a pickled Python list would be best here. The client would take this data, turn it back into a list, and store it.

Occasionally, the client will send some instructions to the server. The server will take these instructions and apply them to a list. Once that is done, the server should push the new data to all clients listening.

An example in case this is unclear:
Alice's and Bob's clients both connect to the server and both receive "list = ([0,0,0])". Bob sends "add 1 to list[0]". The server has a subroutine for handling this; the server's internal list now looks like "list = ([1,0,0])". The server then pushes "list[0] = 1" to both Alice and Bob. Eve connects now and receives "list = ([1,0,0])", and so on.
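The Alice/Bob exchange above can be sketched as two small pure functions (the function names and the delta format are made up for illustration; the question doesn't specify a wire format):

```python
def apply_instruction(state, index, amount):
    """Apply "add <amount> to list[<index>]" to the server's master list."""
    new_state = list(state)       # don't mutate the shared list in place
    new_state[index] += amount
    return new_state

def make_delta(old, new):
    """Compute the minimal "list[i] = v" updates to push to every client."""
    return [(i, v) for i, (o, v) in enumerate(zip(old, new)) if o != v]

state = [0, 0, 0]                            # what Alice and Bob received on connect
new_state = apply_instruction(state, 0, 1)   # Bob's "add 1 to list[0]"
delta = make_delta(state, new_state)         # pushed to Alice and Bob
# new_state == [1, 0, 0]; delta == [(0, 1)], i.e. "list[0] = 1"
```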

What do I use to do this? I've tried using the Python socket module, but I honestly have no idea how I could make the server interact with multiple clients while maintaining a "master" list that any client can change. Threading seems promising, but eventually I would like a lot of clients and I've heard threading doesn't scale well. Maybe I could do this somehow over HTTP with a CGI file and a database, but I don't know how I could push the changes. I've looked into WebSocket, but I can't find enough information about it and I still can't figure out how to handle multiple clients. I'm also having trouble getting the client to listen and be ready to send things at the same time.

How would you do this? Are there any projects already out there that are made for this? Any links to tutorials or easy examples would be greatly appreciated, too.

I would like to write both the client and the server in Python, but that's not set in stone. This would need to run on a unix box. Thanks for the help!
posted by wayland to Computers & Internet (15 answers total) 1 user marked this as a favorite
 
How would you do this?

So your project needs a persistent store for list-type data, with multiple users connecting over the network? Why are you not using your favorite free database server for this, with clients polling for new data at regular intervals?

A full database might be overkill for your solution, but the software required is free and the time it will take you to implement it properly is definitely less than what you would need to roll your own.

Maybe I could do this somehow over HTTP with a CGI file and a database, but I don't know how I could push the changes.

An HTTP server is not strictly required - your clients can connect to the db server directly using an appropriate Python wrapper (subject to network architecture/security concerns).

If there is a lot of data, and you don't want to send the entire list over the network every time something changes, a simple way would be to add a timestamp field to each entry and only send data that has changed since the last update.
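The timestamp trick could look something like this with SQLite (the table and column names are invented for the example): each entry carries a last-modified time, and a client asks only for rows newer than its last successful poll.

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (id INTEGER PRIMARY KEY, value INTEGER, mtime REAL)")

def upsert(entry_id, value):
    """Insert or update an entry, stamping it with the current time."""
    conn.execute(
        "INSERT OR REPLACE INTO entries (id, value, mtime) VALUES (?, ?, ?)",
        (entry_id, value, time.time()),
    )

def changes_since(last_poll):
    """Return only the rows modified after the client's last poll."""
    rows = conn.execute(
        "SELECT id, value FROM entries WHERE mtime > ?", (last_poll,)
    )
    return rows.fetchall()

upsert(0, 0)
time.sleep(0.01)
t = time.time()          # the client's last poll happened here
time.sleep(0.01)
upsert(1, 7)             # only this change happened after the poll
# changes_since(t) == [(1, 7)]
```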
posted by Dr Dracator at 1:34 AM on December 25, 2010


Also, can you give us an idea of the scope of your project? Is it for work or a hobby project? How many users are you expecting? How often will the data change? How much data will be passed around? Is it mission critical? Are there security concerns? How much of a problem is it if some clients are left with stale data?

Your description as given could apply equally well to a toy vending-machine appliance for a university dorm and a full-blown corporate network handling financial transactions; the answers you get will be more useful if you tell us which one you are going for.
posted by Dr Dracator at 1:40 AM on December 25, 2010


Ah, forgot to mention: I don't want the clients to be able to change the list directly. Instead, I want the requests by the clients to be handled by the server so I can have a set of rules. For example, say I don't want anyone to be able to add more than 3 to a number.

Also, I would like to eventually be able to serve different data based on who the client is (either by a username prompt or IP address).

The list will get large and I'm expecting rapid successions of changes followed by slower ones, so it feels like having the client look for changes every x seconds or whatever would be a waste. Correct me if I'm misunderstanding something, though.
posted by wayland at 1:48 AM on December 25, 2010


I'd do it in Twisted, which has the right abstractions and good documentation/examples.
posted by themel at 2:20 AM on December 25, 2010


If you're absolutely determined to do real push from server to client, you're going to want your clients to create connections that stay alive for as long as their sessions do (having your end open connections back to them when you want to push something is a configuration nightmare that you don't want to tangle with). Personally, I'd be inclined to offload the encryption and authorization you're going to need to ssh, using port forwarding; that would also mean that your server only needs to keep open one persistent TCP connection per client. Have a look at the format of an OpenSSH authorized_keys file - you can lock stuff down pretty well.
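A sketch of what that might look like (hostnames, ports, and the key are all placeholders, and the exact set of authorized_keys options worth using depends on your OpenSSH version):

```shell
# On each client: keep one ssh session open and forward a local port to
# the server's listener; the app then talks to localhost:9000 and gets
# encryption and authentication for free.
ssh -N -L 9000:localhost:9000 appuser@server.example.com

# On the server, a locked-down authorized_keys entry (all one line) that
# permits only that forwarding and no interactive use:
# no-pty,no-agent-forwarding,no-X11-forwarding,permitopen="localhost:9000" ssh-rsa AAAA... client-key
```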
posted by flabdablet at 4:47 AM on December 25, 2010


Setting up a system wherein the clients do the active connecting every x seconds has two disadvantages: there is some lag time for updates to propagate (which implies the potential for conflicting updates), and it is not the most efficient plan in terms of computing resources.

It is far and away the most efficient system in terms of how much time it will take you to set up.

A client/server based system of this kind will have fairly high load capabilities; hundreds or thousands of clients connecting every 30 seconds is still a fairly light load for an Apache server if it's only updating with changes. Doing it this way allows you to build it as a straightforward HTTP request, with auth happening in any of a variety of ways. What's answering the request could be as simple as a PHP script (select changes from db since last update, send changes).

If you really want to build it as a socket-based application, it sounds like you're going about it in the right way. It also sounds like you're combining two problems and getting confused about what solution solves which problem. You can think about the list update operations totally independently of the transport mechanism, and probably should. Regardless of what transport mechanism you're using, you need to build in something to make sure that updates from two different threads (or http requests, or thrown pigeons) don't step on each other.
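That "don't step on each other" requirement, sketched with a plain threading lock: every update to the master list goes through one function that holds the lock while it reads and writes (the names here are invented for illustration).

```python
import threading

master = [0, 0, 0]
lock = threading.Lock()

def update(index, amount):
    """Apply one client's change; the lock serializes concurrent updates."""
    with lock:                   # one update at a time, whichever thread
        master[index] += amount  # (or HTTP request) it arrived on

# 100 concurrent "add 1 to list[0]" requests, none lost:
threads = [threading.Thread(target=update, args=(0, 1)) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# master[0] == 100
```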
posted by contrarian at 5:06 AM on December 25, 2010


How fast does this need to happen?

I mean if speed isn't too big of a concern, the easiest way to do this is just with HTTP polling. Clients send an HTTP GET every second, and in some cases they do an HTTP POST. You wouldn't even need to write the server, just use Apache.

so client code would just be something like this (rough Python):

b = None
while True:
    a = get("https://wherever/filename", username, password)
    if a != b:              # only react when something actually changed
        handle_change(a)
        b = a
    time.sleep(1)           # poll roughly once per second


where get() takes a URL and a username/password and the server uses HTTP basic auth.

Now, if you need it to be more realtime, your server could still be pretty simple. Open a server socket and listen for new connections. When a new connection comes in, send a copy of the state to it.

In Java you have separate objects for reading from and writing to a socket (an input stream and an output stream). For this application you would store the output streams in one collection and listen to each input stream on its own thread. When new data comes in on an input stream, you update your internal model (which is what you send to new connections) and then pass the information you received from that client on to all the other clients. (The data you receive and send back out can be the same, and the code to apply the changes can be the same on both ends as well.)
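The same fan-out structure translates to Python. The sketch below uses in-memory file-like objects so the broadcast logic stands alone; real code would keep socket handles (e.g. from socket.makefile()) in the list instead. All names here are invented for the example.

```python
import io
import json

clients = []                     # one output stream per connected client

def on_connect(stream, state):
    """Send the current state to a newly connected client, then track it."""
    stream.write((json.dumps(state) + "\n").encode())
    clients.append(stream)

def broadcast(message):
    """Push an update to every connected client."""
    data = (json.dumps(message) + "\n").encode()
    for stream in clients:
        stream.write(data)

alice, bob = io.BytesIO(), io.BytesIO()
on_connect(alice, [0, 0, 0])
on_connect(bob, [0, 0, 0])
broadcast({"index": 0, "value": 1})   # after applying Bob's "add 1 to list[0]"
# both alice and bob have now received the state and then the update
```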
posted by delmoi at 6:17 AM on December 25, 2010


This seems ideal for an actual database. You could write a few scripts to represent the client-side portion, and just send queries directly to the database over TCP. Usually this gets overlooked because databases so frequently get used with a web front-end, but connecting directly to them to execute queries is exactly what you're asking about.
posted by odinsdream at 6:28 AM on December 25, 2010


People are offering all sorts of complex multi-tier solutions.

Honestly, this problem sounds trivial to me. Use Twisted. Let clients connect. Periodically send out the serialized list to all connected sockets (don't use pickle, btw, it's not compatible between versions). If you receive data back, run it through your rules engine. Store the list both in a file and in memory... write through to disk on every update, and read from disk only on startup, otherwise work from memory.
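The rules-engine step could be as small as this, with JSON instead of pickle. The "no adds greater than 3" rule comes from the asker's follow-up; the request format and function name are invented for illustration.

```python
import json

MAX_ADD = 3

def handle_request(state, request_json):
    """Validate one client request; return (new_state, accepted)."""
    req = json.loads(request_json)
    if req.get("op") != "add" or abs(req.get("amount", 0)) > MAX_ADD:
        return state, False       # rule violated: ignore the request
    new_state = list(state)
    new_state[req["index"]] += req["amount"]
    return new_state, True

state = [0, 0, 0]
state, ok = handle_request(state, '{"op": "add", "index": 0, "amount": 1}')
# accepted: state == [1, 0, 0], ok is True
state, ok = handle_request(state, '{"op": "add", "index": 0, "amount": 9}')
# rejected (9 > 3): state is still [1, 0, 0], ok is False
```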
posted by Netzapper at 7:20 AM on December 25, 2010


Ewww. Think open, better-accepted standards. You don't want to serialize to Python; use JSON so you at least have the hope of future web-app-ification. True push is almost impossible: it implies a service running on the client machine and the server being able to open a connection to the client on demand, which isn't going to happen with all of the NAT in place on the network without a lot of firewall rules setup, crossing fingers, sacrificing chickens. If you really want to get close to that ideal, start with a chat-server-like solution (IRC/XMPP).

Your server runs a chat server, with a room that's protected via auth. There's plenty of IRC/XMPP client code out there that handles automatic reconnection. Then your app is another client that joins the same 'room' and listens for commands and 'says' data.

Client: connect server 'the.app.com:666' auth 'user@password', join 'app_channel'.
listen for 'data: *data here*' messages. do something with them.
send 'app: add 1 1000' commands. receive 'data: *data here*' messages.
repeat.

Server: connect ...
on_new_client: send 'data: *data here*' private message.
listen for 'app: add *' messages. verify/check permissions/etc. do work. send 'data: *data here*' update.
repeat.

Presto. Stable well tested client-client message passing router. Server is just another client that responds to certain type of messages. Clients just listen for data messages and send modification commands. Any 'chat bot' code could easily be modified to do this.
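The server-bot side of that convention boils down to a tiny dispatcher. The command grammar below is a guess at what "app: add 1 1000" means (add 1000 to entry 1); the state shape and function name are invented for the example.

```python
import json

state = {}                        # the server-bot's master data

def on_message(text):
    """Handle one chat-room message; return the 'data:' reply, or None."""
    if not text.startswith("app: "):
        return None               # not a command; ignore ordinary chatter
    parts = text[len("app: "):].split()
    if parts[0] == "add" and len(parts) == 3:
        key, amount = parts[1], int(parts[2])
        state[key] = state.get(key, 0) + amount
        return "data: " + json.dumps(state)
    return None                   # unknown command: say nothing

reply = on_message("app: add 1 1000")
# reply == 'data: {"1": 1000}'
```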

I would totally use XMPP vs. IRC because of the relative ease of encapsulating data and extracting it from the XML message, and because the ejabberd XMPP server is so solid you could throw a brick at it and it wouldn't even flinch.

In fact, the not-the-MetaFilter XMPP chat server mu.jklmnop.net has a per-user chatbot that tracks conversations with something as simple as:
   message => sub {
      my ($cl, $acc, $msg) = @_;    # connection, account, incoming message
      my $repl = $msg->make_reply;  # reply addressed back to the sender
      $repl->add_body($RS->reply($msg->from, $msg->any_body));  # bot generates the reply text
      $repl->send;
   },

All you would have to do is check $msg->any_body against your commands, $msg->from against your authorization code, put your data in $repl.

Looks complicated, but like Netzapper says, it really isn't. I'm just breaking out the session/clients handling and letting a robust pre-existing chat server manage that. Then the clients and server become just a simple listen for messages and respond loop and all of the connect/reconnect stuff gets taken care of automagically.
posted by zengargoyle at 8:49 AM on December 25, 2010


Take a look at mongrel2, which is in raw alpha stage, but is built on the ZeroMQ network library that handles this exact scenario (i.e., arbitrary broadcast to subscribed clients). If you're not talking about a public production server, meaning you can allow for some flakiness and code around it, then it might be exactly what you need. You can write your server-side handlers and clients in Python.

Agreed with using JSON as a protocol rather than serialized Python.
posted by fatbird at 9:26 AM on December 25, 2010


REST + JSON is what I would recommend, but also look into PubSubHubbub/Dnode, which are both node.js notification protocols.
posted by rhizome at 12:22 PM on December 25, 2010


The list will get large...

If the list really is going to get very large, and the scope of each of the changes will be small relative to the size of the entire list, it seems like you'll want to implement some rsync-type algorithm that just sends deltas / diffs as the update to the client.

In fact, if that's what it would be like, maybe you want to use rsync itself for updating the clients: maintain the list as a flat XML or binary file in a read-only filesystem on the server and have the clients poll that and then read from their incrementally-updated local copy.
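A hand-rolled stand-in for that rsync-style idea, just to show the shape: the server computes which positions changed, sends only those, and the client patches its local copy. Function names are invented; this assumes a fixed-length list.

```python
def diff(old, new):
    """Changed positions only; assumes the list length stays fixed."""
    return {i: v for i, (o, v) in enumerate(zip(old, new)) if o != v}

def patch(local, delta):
    """Apply a delta from the server to the client's local copy."""
    for i, v in delta.items():
        local[i] = v
    return local

server_old = [0] * 1000
server_new = list(server_old)
server_new[7] = 42                # one small change in a big list
d = diff(server_old, server_new)  # {7: 42} - tiny compared to the full list
client = patch([0] * 1000, d)     # client catches up without a full resend
# client == server_new
```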

Beyond that, the update transmissions from the clients seem like an ideal application for a standard message queuing server like what fatbird mentions above (though there are several others available besides what he mentions). That'll already have transactions implemented and other things you might find yourself needing.
posted by XMLicious at 3:17 PM on December 25, 2010


True push is almost impossible: it implies a service running on the client machine and the server being able to open a connection to the client on demand, which isn't going to happen with all of the NAT in place on the network without a lot of firewall rules setup, crossing fingers, sacrificing chickens.
Uh... have you ever done any socket programming? It's not hard at all, you just keep the connection open.
posted by delmoi at 10:06 PM on December 25, 2010


It's not hard at all, you just keep the connection open.
That's the hard part. The client has to initiate the session, therefore it's not true push. True push is a daemon living on the client machine that can accept a new connection at any point in time. Data changes on the server, the server has to connect to the client if there isn't an existing connection to the client to push the data. What happens when your client or server has to restart? Even if you get the same reserved or ephemeral port as you had before, your TCP sequence numbers are invalid. A client rebuilding a connection to the server is usually easy (the web works); a server rebuilding a connection to the client is hard when you have NAT/dynamic addressing in place.

I do a metric-buttload of network programming. Mostly RPC internally, XMLRPC externally. It is easy if your client and server are both up at static endpoints without any NAT/firewall in the way that needs to be configured. Then you just implement an 'update' command on the client that the server can call whenever it needs to.
posted by zengargoyle at 1:17 PM on December 26, 2010


This thread is closed to new comments.