Tutorials / Algorithms / Software for Federated Services
April 5, 2017 10:26 AM

The FPP about the federated microblogging service Mastodon has sparked my interest in how a very large federated service might function.

I imagine (but don't know) that similar concepts and algorithms power things like Mastodon, Tor, and BitTorrent; perhaps even DNS.

I would go through the Mastodon codebase, but it's in Ruby and I don't know Ruby. (I'm mainly familiar with C#, PHP, JS, tiny bit of Python.)

I'm particularly curious about scaling to massive numbers of federated servers. If there are 100,000 nodes/servers, how do they coordinate? How would client software pick which node to talk to? How much do you duplicate information to ensure that when nodes go dark the information is likely to remain?

What if data duplication / availability is not a design goal, and each node can act like an independent shard, so that if a node goes dark, its particular share of data is no longer accessible? (Thinking in terms of the internet, if foo.com goes down, that's "fine" for everyone but the owners of foo.com. If foo.com *wants* to go down, then that's fine for everyone period.)

It seems that each node would still have to hold a share of routing information for the network (unless there is a separate system like DNS for holding routing information), but otherwise the network's "work" would be greatly reduced.

Very high-level overviews would be helpful; I wouldn't mind seeing references to computer science whitepapers, but I suspect I'd struggle with them. Practical overviews using readily available software ("HOWTO federate teh things with Postgres and Node") would be great. Classes at an online training resource like edX or similar would be amazing.
posted by Number Used Once to Computers & Internet (3 answers total) 4 users marked this as a favorite
 
Email is probably the best example of a well known federated protocol/data specification. Anyone can run a server and send or receive messages from any other server. All routing information is contained in the email address - user@host.com.
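
To make the "routing is in the address" point concrete, here is a minimal Python sketch. It assumes a made-up lookup table (MX_TABLE) standing in for the DNS MX query a real mail server would perform; the function names and host names are illustrative only, not part of any mail library.

```python
# Hypothetical stand-in for DNS MX lookups; a real mail server queries DNS.
MX_TABLE = {
    "host.com": "mail.host.com",
    "example.org": "mx1.example.org",
}


def split_address(address: str) -> tuple[str, str]:
    """Split 'user@host.com' into (local part, domain)."""
    # Split on the LAST '@' so only the final segment is treated as the domain.
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not a valid address: {address!r}")
    return local, domain.lower()


def next_hop(address: str) -> str:
    """Return the server that should receive mail for this address."""
    _, domain = split_address(address)
    # With no MX entry known, fall back to the domain itself.
    return MX_TABLE.get(domain, domain)


print(next_hop("user@host.com"))
```

The sending server needs no global directory of users: the domain tells it which server to contact, and that server alone knows its local mailboxes.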

It looks like Mastodon uses the OStatus protocol to achieve federation for status updates. You can take a look at the spec here.
posted by Fidel Cashflow at 10:54 AM on April 5, 2017 [2 favorites]


It's a dated and little-used technology now, but the Mastodon federation resembles in many ways the Usenet/NNTP model. You can find more in-depth descriptions, but the basic idea was that any Usenet server could host newsgroups that originated on that server, or that originated on another server and were cached locally.

So, you could connect to your ISP news server at news.isp.com and it would have portions of the Usenet feed as well as local newsgroups. Moderation could be done server side (in that you could filter groups and moderate messages or some such) or client side (via killfiles), where you could block receipt of messages based on criteria.

It was a robust model in that server admins had discretion in what they hosted and what they shared, and a user account on a full feed NNTP server was often worth the money vs. a free account on an ISP server. Since you could choose which groups you subscribed to and which servers you used, it was possible to access content on ServerB that the admin of ServerA did not want to host. It was also possible to host private, non-published groups, so that only those with credentials on that server could access it.
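
As a rough illustration of that model, here is a toy Python sketch. It is not NNTP: the NewsServer class, its fields, and the sample group names are all hypothetical. It only shows the three ideas described above: each admin decides which groups to carry, articles spread between peers (deduplicated by message ID), and readers can filter on their own side with a killfile.

```python
from dataclasses import dataclass, field


@dataclass
class NewsServer:
    name: str
    carried: set                                   # groups this admin chooses to host
    peers: list = field(default_factory=list)      # other NewsServer objects
    articles: dict = field(default_factory=dict)   # message_id -> (group, author, body)

    def receive(self, message_id, group, author, body):
        # Skip groups the admin does not carry and articles already seen.
        if group not in self.carried or message_id in self.articles:
            return
        self.articles[message_id] = (group, author, body)
        # Offer the article to every peer; each peer applies its own policy.
        for peer in self.peers:
            peer.receive(message_id, group, author, body)

    def read(self, group, killfile=()):
        # Client-side filtering: hide articles from authors in the killfile.
        return [body for g, author, body in self.articles.values()
                if g == group and author not in killfile]


server_a = NewsServer("ServerA", {"comp.lang.python"})
server_b = NewsServer("ServerB", {"comp.lang.python", "alt.test"})
server_a.peers.append(server_b)
server_b.peers.append(server_a)

server_b.receive("<1@ServerB>", "alt.test", "alice", "hello")
server_b.receive("<2@ServerB>", "comp.lang.python", "bob", "hi")
```

Here ServerA ends up with the comp.lang.python article but never stores the alt.test one, while a reader on ServerB can still see both, or hide bob's posts by passing killfile={"bob"} to read().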

The fact that client software was required to access these servers is what killed Usenet in favor of web-based forums. There were also harassment issues, but then as now, client-side killfiles and moderation are the solution to that. I've always felt that NNTP never got the love it deserved - it was a tidy and robust messaging protocol.
posted by Pogo_Fuzzybutt at 12:25 PM on April 5, 2017 [1 favorite]


This is a bit tangential, but if you're interested in the actual algorithms involved, you might enjoy reading about consensus algorithms. There are plenty of decent lectures on YouTube about Raft and Paxos in particular. A bit dry, but surprisingly accessible.
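
One idea those lectures lean on is the majority quorum: a decision counts once more than half the nodes acknowledge it, because any two majorities must share at least one node. The Python sketch below covers only that arithmetic, not Raft or Paxos themselves, and the function names are made up for illustration.

```python
def quorum(cluster_size: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return cluster_size // 2 + 1


def is_committed(acks: int, cluster_size: int) -> bool:
    """A value is safe once a majority of nodes has acknowledged it."""
    return acks >= quorum(cluster_size)


def tolerated_failures(cluster_size: int) -> int:
    """How many nodes can be down while a majority is still reachable."""
    return cluster_size - quorum(cluster_size)


for n in (3, 4, 5):
    print(n, quorum(n), tolerated_failures(n))
```

Note that going from 3 to 4 nodes does not buy any extra fault tolerance, which is why such clusters are usually run with an odd number of members.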

In a similarly tangential vein, the original Bitcoin paper is, minus a few equations, rather easy to read. It helps to approach with the understanding that Bitcoin is a social innovation (incentivizing recording 'truth') backed with some technology/protocol, not the other way around.
posted by so fucking future at 4:12 PM on April 5, 2017 [2 favorites]

