Generating new comment counts
March 28, 2005 2:22 PM   Subscribe

What's the most efficient way to provide a MeFi-style x new comments count for a MeFi-style chat board?

Working it out for every generated page seems hugely expensive, and almost certainly won't scale. I've no need for it to be spot-on accurate -- just the near-enough of MeFi would be fine.

High-level pseudocode is fine, thanks, though this specific instance is using Perl.
posted by bonaldi to Computers & Internet (7 answers total)
I asked a question related to this in MetaTalk... Matt's answer might be informative:

It's a session variable and it's supposed to work like so:

You hit mefi for the first time since yesterday. A script checks the timestamp on yesterday and stores that as a session variable. Then it grabs the current time (now) and puts that into the database, replacing your timestamp of yesterday. All queries work off the session variable.

Now, in order for that session variable to die, and for your next visit to reference your previous visit today, you need to stay away from loading any page for 20 minutes. If you check the site more often than that, it will never be updated, and continue to use yesterday's date.

These should all be keyed off the database, and not a cookie, so I don't know why you are getting different counts on different computers, other than it is taking more than 20 minutes to move from one computer to the next.

posted by odinsdream at 2:27 PM on March 28, 2005

If your generated pages are dynamic to begin with, then you're talking about a strict count query against the comment table, comparing a constant. In fact, if you're doing an index page, you could query the Db in such a fashion as to group your results and order them appropriately:

select count(id), threadID from comments
where dts > '$mylastvisit'
group by threadID
order by threadID desc

(I name my date/timestamp fields 'dts')

This assumes that threadID is incremented naturally as new threads are added to the DB, so that displaying them in descending order isn't difficult.

This gives you a result set you can then page through as you're displaying the index page. To do it in all one query, join against your threads table for the content, author, and that sort of thing.
posted by thanotopsis at 2:50 PM on March 28, 2005

I can tell you how I do it on Viewropa if you're interested but it's too late to go into it now. If I forget to come back and post then just e-mail me.
posted by dodgygeezer at 2:51 PM on March 28, 2005

Doing it per thread is not actually hard or especially expensive (and much more flexible for the user than the mefi system), so don't discount it off hand. The query is something like:

SELECT currentthreads.threadid, count(messages.msgid) FROM currentthreads, messages LEFT JOIN lastviewed ON currentthreads.threadid= lastviewed.threadid AND messages.msgid>lastviewed.threadid AND lastviewed.userid=$currentuser WHERE currentthreads.threadid=messages.msgid GROUP BY currentthreads.threadid ORDER BY currentthreads.threadid DESC

The key is to have a short list of threads actually on the page cached (my currentthreads table) before you start, otherwise it gets hard to persuade the database not to calculate read count for every thread. For this, you can use a disk table regenerated once a minute, or a temporary memory table generated with each load.

I've used this on a very busy message board with counts re-caulculated on each load, and it worked fine, since the query is very easily optimised by indexes.

(Oh, obviously you have to update lastviewed with each page view)
posted by cillit bang at 3:10 PM on March 28, 2005

I'm not a relational database programmer, so this may not be 100% relavant. (I use an isam db).

My solution would be to have an extra table and add some fields onto the post record.

The "posts" table would be given the following extra field
Number of Comments.
I'd update the number of comments each time a comment is posted (i.e. add one to the value).

The new table would contain
Post ID, Number of Comments Read, UserID
I'd update this to the "number of comments for a post" whenever a user reads a thread.

When displaying the number of comments for a user, I'd link the two records, calculate the difference and show it.
With the merged file, that'd give only one extra record read per post on the front page.

It would mean however that non-users wouldn't see how many comments have been added since they last viewed the site.
A couple of other things...
Because the "number of comments on a thread" field is essentially calculated information, I'd write a quick utility to check the integrity of the value.
posted by seanyboy at 4:29 PM on March 28, 2005

On preview, there are a couple of issues.
Firstly, you don't want to be doing constant updates to the CommentsFor User table. Only update this value if new comments have been added to the Post. (i.e. CommentCount is not the same as UserCommentCount)
Secondly, only update the UserCommentCount when they read the comments for a thread.

A final option may also be this.
You'll have an audit of what pages have been visited by which users.
Every twenty minutes or so, you could use this information to update the UserCommentCount. (cron job or something) That should save you running multiple updates when people are refreshing there pages.
posted by seanyboy at 4:39 PM on March 28, 2005

If your database server supports subqueries (versions of MySQL from 4.1.x up do), you can do it that way given the date of the last visit.
    LinkURL, LinkDescription, LinkText, 
        SELECT COUNT(*) FROM Comments 
            Parent = Review.ReviewID 
            AND CommentDate >= '$last_vist'
    ) AS RecentCommentCount 
    FROM Review
    additional where clauses here

posted by tomorama at 9:11 PM on March 28, 2005

« Older good or evil?   |   How to get Word to open the next drop-down... Newer »
This thread is closed to new comments.