Servers and databases and UIs, oh my
November 9, 2010 1:04 PM

How do tech limitations affect design choices on large scale Web sites?

Hi HiveMinders - and specifically Web techie MeFis,

I would like to know a bit more about how technology limitations and scalability constrain design choices on Web sites. I know enough about the front end (i.e. what shows up in your browser - the HTML/CSS/JavaScript/DOM stuff), but I'm trying to understand the back end - specifically, how large sites like LinkedIn, Meetup and Facebook make their user experience choices. I'm a user experience designer, so I understand user interactions and how to build a UI. I know the basics of back-end Web tech: that most sites use some version of LAMP; that the more feature complexity you add, the more problems you may have with data-loading speed, browser compliance, and whether the servers can handle the number of unique hits; and that you're calling up data from a database that lives on a server. What I'm looking to understand is how current Web technologies limit your design choices, especially on large-scale sites with millions of users and server transactions - and whether you have any good books or resources that help a non-techie understand the topic. Apologies if this seems a little chatfilter-y - honestly, I'm just looking to understand the issues involved, so any good case studies or examples are appreciated.

I have a concrete example: Meetup and LinkedIn and their user account management, or lack thereof. Not to attack these two sites specifically, because I do love using both of them - but they're sites that have some of these design issues. Both have a huge number of users and user groups, and lots of functionality within those groups. If you go into your account settings, you'd think the ability to control how much email you get from the site would be there. The designer part of me says, 'have a global email communications setting - a checkbox that lets the user turn off ALL emails from Meetup'; instead, with Meetup, you have to go to each individual group's settings page and change it there. Is there a reason on the tech end of things why global settings aren't an option? I'm seeing the front-end solution (i.e. add checkboxes), but is there a back-end requirement with the databases and server loading that prevents these kinds of simple solutions from being implemented? I know enough about the reasoning that goes into design choices from the non-tech requirements angle - the usual suspects, like usability not always being a priority when developing the site, or sites choosing to have weird account settings on purpose (Facebook being one obvious example). What are the *technical* requirements that prevent the most humane design choices in UI?

posted by rmm to Computers & Internet (11 answers total) 5 users marked this as a favorite
You may be missing the forest for the trees in your example. The more meetup pages you have to visit, the more ads you see. Myspace was (historically) notorious for doing this, making any change take you through 2-3 pages covered in ads.
posted by Oktober at 1:09 PM on November 9, 2010

I've got to say, oftentimes this is specific to the history of the codebase and/or database design. In a large system, unless it's been redeveloped completely (as I think Amazon or eBay have done a good number of times), there are "vestiges" of the original architecture in there. This can be anything from the database developer deciding it would be a good idea to put the address information in the same table as the user data (vs. putting them in separate tables so any one user can have multiple addresses), to structuring the codebase in such a way that every time you update the formatting of the addresses on the site you have to do it in ten different places, because no thought was put into structuring the templates well. These aren't the greatest examples - they're pretty basic mistakes that most any developer would probably avoid these days (especially with all the great MVC systems out there now that do the heavy lifting for you) - but my point is that there may not be any consistent reason why different web application architectures impose constraints on the user interface design.
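As a rough sketch of that address example (using Python's bundled sqlite3 purely for illustration - the table and column names are made up, not from any real site):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# The "vestige" design: address columns live directly on the user row,
# so a user can never have a second address without a schema change.
cur.execute("""
    CREATE TABLE users_denormalized (
        id INTEGER PRIMARY KEY,
        name TEXT,
        street TEXT,
        city TEXT
    )
""")

# The normalized alternative: addresses get their own table, keyed to the
# user, so one user can have any number of addresses.
cur.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute("""
    CREATE TABLE addresses (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id),
        street TEXT,
        city TEXT
    )
""")

cur.execute("INSERT INTO users (id, name) VALUES (1, 'alice')")
cur.executemany(
    "INSERT INTO addresses (user_id, street, city) VALUES (?, ?, ?)",
    [(1, '1 Main St', 'Springfield'), (1, '9 Elm Ave', 'Shelbyville')],
)

# One user, two addresses -- impossible in the denormalized table
# without duplicating the user row or widening the schema.
rows = cur.execute(
    "SELECT street, city FROM addresses WHERE user_id = ?", (1,)
).fetchall()
print(rows)
```

Once real data and real code accumulate on top of the first design, migrating to the second is exactly the kind of work that never gets prioritized.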

That said, another way to get at what you want to understand could be to go through a few well-architected MVC frameworks (say, Rails and Django) and try to understand what they encapsulate, what they don't, and why. For most larger systems an off-the-shelf MVC framework is either heavily customized and optimized, or something is built from the ground up; but nonetheless HTTP is HTTP, there are certain things that all systems need to do reliably, and there are things that anyone who has been building web apps for long enough knows are a waste of time to build oneself. Understanding a modern MVC framework is probably the best place to start to really get this, in my opinion.
posted by dubitable at 1:51 PM on November 9, 2010

UX designer here too.

Echoing what Oktober says, the Meetup example you gave really sounds like a business decision to me. There may be technical reasoning behind the decision as well.

Also of note is that technically, what you suggest is probably *possible*. But there's a fine balance between accomplishing business, product, UX, and development goals.
posted by hijinx at 1:51 PM on November 9, 2010

I recall an interesting essay (I think linked by Kottke?) a few weeks/months ago by a developer who was explaining how "Web 2.0" design/development decisions impacted the performance (and thus the design/features) of big sites - i.e., how scaling up broke features, etc. A really interesting read.

but I can't find it. I have been googling for 20 minutes. Maybe it rings a bell for somebody else?
posted by misterbrandt at 2:00 PM on November 9, 2010

In reality, bad UI is usually some combination of:

1) Not enough UI experts, or lack of knowledge in positions that influence design. Also lack of communication between visual designers and programmers. For example, I've seen too many projects where a database is designed far before a visual designer is ever involved.

2) Rapid implementation with little to no user testing. People who develop software are often "too close" to the project and can miss UI problems.

3) Functionality subject to triage where ROI is concerned. Beware of middle management.

4) Some core element (like the database) was not optimized properly at the start of the project, and it'd require more work than is deemed profitable to make certain changes. Database design has all sorts of potential problems -- what type of fields to use, when to create a relational table vs. just another field, proper use of unique IDs, etc. A bad database will have bad queries, and if you're then trying to make asynchronous calls... it can all snowball and go to hell very quickly.

5) Sticking too closely to default settings (field types in your database designer, tag attributes in your HTML program, etc.), an old design, or a pre-made framework.

A lot of practical web work is the application of bandages upon bandages until the entire thing collapses and you start again from scratch. (Well, seldom from scratch... you still cannibalize things, and quite often, the very things that caused the problems in the first place...)
posted by Wossname at 2:05 PM on November 9, 2010 [2 favorites]

Maybe it's just because I spend my days working on the scaling and reliability side of things, but I don't really get the sense that we're ever the driving force behind UX decisions. For the most part, either business needs drive a decision or a decision is made because it's easiest to program. Servers and software are usually plenty fast, so any potential problems don't manifest themselves while you are developing and testing. You only see the strain when there's a world of data in there or a ton of users. So poor guys like me are brought in when there are too many fail whales or things start falling apart due to popularity, and then you're fighting fires as they pop up. Strangely, it sounds almost exciting when I put it that way...
posted by advicepig at 2:35 PM on November 9, 2010 [1 favorite]

There are no good technical reasons (beyond implementation-specific ones) for not doing the things you suggest. There are plenty of business-prioritization reasons, time reasons, developer-convenience reasons, etc. Sometimes it's developers being lazy; sometimes it's businesses being 'evil' (see: Facebook and privacy settings).
posted by beerbajay at 2:57 PM on November 9, 2010

@misterbrandt: I remember that article, but I can't find it either.

There are certain classes of features that don't scale well on large websites. They're cheap to implement for a few entries, but the cost per entry grows geometrically, which means the computation cost rapidly exceeds your available resources as the number of entries increases.

Those kinds of features include ones that make use of data involving more than one level of graph searching. Remember when Facebook had a feature that told you how many steps removed you were from a given user? That kind of algorithm is extremely expensive once your user pool grows above a certain size, and can't be cached effectively because every new friend a user gets alters the social graph to an unpredictable extent.
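To make that cost concrete: "how many steps removed" is a breadth-first search over the friend graph, and the number of profiles the search has to touch grows explosively with depth. A toy sketch (Python; the dict is a hypothetical stand-in for the real graph store):

```python
from collections import deque

def degrees_of_separation(graph, start, target):
    """Breadth-first search over a friend graph (name -> set of friends).
    The work done grows with the number of nodes reachable within the
    search depth, which explodes on a large social graph -- and any new
    friendship can change the answer, so caching is ineffective."""
    if start == target:
        return 0
    seen = {start}
    frontier = deque([(start, 0)])
    while frontier:
        node, dist = frontier.popleft()
        for friend in graph.get(node, set()):
            if friend == target:
                return dist + 1
            if friend not in seen:
                seen.add(friend)
                frontier.append((friend, dist + 1))
    return None  # not connected at all

# A four-person chain: ann - bob - carol - dave
graph = {
    "ann": {"bob"},
    "bob": {"ann", "carol"},
    "carol": {"bob", "dave"},
    "dave": {"carol"},
}
print(degrees_of_separation(graph, "ann", "dave"))  # 3
```

With four users this is instant; with hundreds of millions of users where each has hundreds of friends, the frontier at even three hops can be a meaningful fraction of the whole graph.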

There's a whole class of mathematics and computer science devoted to this kind of thing.

In most cases, user experience problems in web applications are the result of bad design rather than technical limitations. Most often it's not even anyone's fault; it's just momentum - it would take too long, be too expensive, or require user-alienating upheavals to change the way things are in favor of the way things should be.

I'm generally sympathetic to developers stuck working with creaky codebases, having maintained many myself. While there may be no mathematical or technical-theoretical reason why a particular application couldn't be some other way, there are often practical architectural reasons why it can't. There's no theoretical reason why you couldn't rebuild a school bus into a helicopter, but in no sane universe do school bus designers plan for future helicopter modifications. In cases like that, it's better to tear down and start over.
posted by TeslaNick at 2:59 PM on November 9, 2010

Fantastic answers, everyone - I would normally just mark them all as 'best', because they each give me a different perspective on the issue, which is often exactly what I'm looking for (the good ol' blind men and the elephant metaphor). However, I won't mark them all as 'best', because it looks funny when all the answers have grey backgrounds (design nerd, I know).

dubitable, your mention of frameworks (and in particular the idea of MVC frameworks in general) is part of what I want to explore - although I may have to buy a few dozen developers coffee to understand all of it :) I may just print this out and ask them.

TeslaNick, do you have any examples of features that don't scale well? For me, a lot of this is trying to understand how a feature works - not from the front end, but the back end. I can see how something like a data model explains the data interactions, but I'm still trying to wrap my head around the idea that a feature has its own data set, how that data set interacts with 'global' features like account management, and why they all can't play well together. Part of the problem (IMHO) is that people seem to think of 'add a feature' as an independent entity that lives on its own, as opposed to how a feature lives in a whole ecosystem (which is a better metaphor for a Web site).

Anyway, fascinating stuff, so thanks everyone!
posted by rmm at 10:36 AM on November 10, 2010

I'd say that the intersection of user interface design and scalability is cacheability: the ability to serve an older version of information more efficiently. For example, the view stats on YouTube (or the download stats on SourceForge) are not real-time, because that would require an update+query of the number-of-hits database for every video viewed. Instead, you can have N machines that receive "video X was just played" messages; at some regular interval they all dump their counts into a central database of view counts, and at regular intervals those counts are mirrored off to M caches that are valid for 12 hours or whatever. When viewing a video, you register your view with one of the N machines and get the hit count from one of the M caches, leaving the actual database relatively unburdened. But it means you don't see the view count update after watching a video, that counts are delayed by some number of hours, and that really hot videos might show only a hundred views for the first day or so of their life even if they've been viewed 50k times. Maybe that's an acceptable user experience if it lets you efficiently serve a billion videos per second.

You might have similar situations with static images, where updating an image isn't instantaneous because images are served by a third-party CDN and it takes a bit of time for a newly uploaded photo to propagate through the network.
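The write-batching half of that scheme can be sketched in miniature (Python; `ViewCounter` and the dict-as-central-database are hypothetical stand-ins for the real messaging and storage layers):

```python
import time

class ViewCounter:
    """Per-machine counter that absorbs "video X was just played" events
    and only writes to the central store every flush_interval seconds,
    so the database sees one write per video per interval instead of
    one write per view."""

    def __init__(self, central_store, flush_interval=60.0):
        self.central_store = central_store  # dict standing in for the DB
        self.flush_interval = flush_interval
        self.pending = {}
        self.last_flush = time.monotonic()

    def record_view(self, video_id):
        self.pending[video_id] = self.pending.get(video_id, 0) + 1
        if time.monotonic() - self.last_flush >= self.flush_interval:
            self.flush()

    def flush(self):
        # One aggregated write per video, however many views accumulated.
        for video_id, count in self.pending.items():
            self.central_store[video_id] = (
                self.central_store.get(video_id, 0) + count
            )
        self.pending.clear()
        self.last_flush = time.monotonic()

db = {}
counter = ViewCounter(db, flush_interval=60)
for _ in range(1000):
    counter.record_view("video-x")  # 1000 events, zero database writes
counter.flush()                     # in production a timer would do this
print(db["video-x"])                # 1000
```

The cost of the batching is exactly the UX quirk described above: until the next flush, the central count (and anything reading from it) is stale.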

Another common one is not allowing (or allowing but discouraging) user customization of the front page (or whatever the highest-trafficked URL is). Allowing customization is kind of the antithesis of cacheability, because it means every page is different depending on who's viewing it. But you can cheat a little if you make substantial portions of the hotspots the same for all visitors and then bolt the customized parts (like the "You're logged on as (user)" strip) on top of that. This can lead to conflicting interests: from a performance standpoint, anonymous/not-logged-in users are your best friend, but from a business standpoint you want to encourage people to sign up and stay logged in, because that enables all the interaction - commenting, tagging, rating - that generates content for you.
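The "mostly-shared page plus bolted-on personalization" trick might look like this in miniature (Python; every name here is a hypothetical stand-in - real sites do this with reverse proxies and fragment caches rather than an in-process dict):

```python
_page_cache = {}

def render_front_page(build_shared_html, user=None):
    """Serve the expensive, identical-for-everyone fragment from cache
    and attach only the small personalized strip per request."""
    if "front" not in _page_cache:
        _page_cache["front"] = build_shared_html()  # expensive; runs once
    shared = _page_cache["front"]
    banner = f"You're logged on as {user}" if user else "Log in or sign up"
    return banner + "\n" + shared

calls = []
def build_shared_html():
    calls.append(1)  # track how often we actually pay the build cost
    return "<div>today's hot stories ...</div>"

print(render_front_page(build_shared_html))              # anonymous visitor
print(render_front_page(build_shared_html, user="rmm"))  # logged-in visitor
print(len(calls))  # 1 -- the shared fragment was built only once
```

Let users rearrange that shared fragment per-account and the cache hit rate collapses, which is why heavy customization of the hottest page is often quietly discouraged.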

Sometimes database load directly determines whether a feature can be offered. For example, letting a user display their comments sorted by score can be a very expensive operation, because you probably didn't include score in the table index (it changes so often that indexing it would cause a lot of index thrash), but you almost certainly did include the timestamp, because it's static. So retrieving the 20 most recent comments is efficient and can be done entirely through the index, which is already sorted. Retrieving the 20 highest-scoring comments, on the other hand, requires reading every comment, sorting them all, and discarding all but the top 20. This can be even more expensive if the data is normalized and scores are stored separately as individual votes/favorites/whatever. So as a site grows, you might see it remove features it used to offer, such as the ability to view personal/profile data in different ways.
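That index trade-off can be demonstrated with SQLite's query planner (a self-contained sketch in Python's sqlite3; the schema is made up, but the plan shapes illustrate the point):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("""
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        created_at INTEGER,
        score INTEGER,
        body TEXT
    )
""")
# Index the immutable timestamp, but deliberately NOT the
# frequently-changing score -- the trade-off described above.
cur.execute("CREATE INDEX idx_created ON comments (created_at)")

cur.executemany(
    "INSERT INTO comments (created_at, score, body) VALUES (?, ?, ?)",
    [(i, (i * 37) % 100, f"comment {i}") for i in range(1000)],
)

# Cheap: the index is already in timestamp order, so the planner
# just walks idx_created and stops after 20 rows.
plan_by_time = cur.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT id FROM comments ORDER BY created_at DESC LIMIT 20"
).fetchall()
print(plan_by_time)   # plan detail should mention idx_created

# Expensive: no index on score, so the planner scans every row and
# sorts them in a temporary b-tree before it can take the top 20.
plan_by_score = cur.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT id FROM comments ORDER BY score DESC LIMIT 20"
).fetchall()
print(plan_by_score)  # plan detail should mention a temp b-tree sort
```

At a thousand rows both queries feel instant; at a few hundred million, the second one is the kind of query that gets a feature quietly removed.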
posted by Rhomboid at 10:48 AM on November 10, 2010

This thread is closed to new comments.