Generating new comment counts
March 28, 2005 2:22 PM   Subscribe

What's the most efficient way to provide a MeFi-style x new comments count for a MeFi-style chat board?

Working it out for every generated page seems hugely expensive, and almost certainly won't scale. I've no need for it to be spot-on accurate -- just the near-enough of MeFi would be fine.

High-level pseudocode is fine, thanks, though this specific instance is using Perl.
posted by bonaldi to Computers & Internet (6 answers total)
 
If your generated pages are dynamic to begin with, then you're talking about a strict count query against the comment table, comparing a constant. In fact, if you're doing an index page, you could query the Db in such a fashion as to group your results and order them appropriately:

select count(id), threadID from comments
where dts > '$mylastvisit'
group by threadID
order by threadID desc


(I name my date/timestamp fields 'dts')

This assumes that threadID is incremented naturally as new threads are added to the DB, so that displaying them in descending order isn't difficult.

This gives you a result set you can then page through as you're displaying the index page. To do it in all one query, join against your threads table for the content, author, and that sort of thing.
posted by thanotopsis at 2:50 PM on March 28, 2005


I can tell you how I do it on Viewropa if you're interested but it's too late to go into it now. If I forget to come back and post then just e-mail me.
posted by dodgygeezer at 2:51 PM on March 28, 2005


Doing it per thread is not actually hard or especially expensive (and much more flexible for the user than the mefi system), so don't discount it off hand. The query is something like:

SELECT currentthreads.threadid, count(messages.msgid) FROM currentthreads, messages LEFT JOIN lastviewed ON currentthreads.threadid= lastviewed.threadid AND messages.msgid>lastviewed.threadid AND lastviewed.userid=$currentuser WHERE currentthreads.threadid=messages.msgid GROUP BY currentthreads.threadid ORDER BY currentthreads.threadid DESC

The key is to have a short list of threads actually on the page cached (my currentthreads table) before you start, otherwise it gets hard to persuade the database not to calculate read count for every thread. For this, you can use a disk table regenerated once a minute, or a temporary memory table generated with each load.

I've used this on a very busy message board with counts re-caulculated on each load, and it worked fine, since the query is very easily optimised by indexes.

(Oh, obviously you have to update lastviewed with each page view)
posted by cillit bang at 3:10 PM on March 28, 2005


I'm not a relational database programmer, so this may not be 100% relavant. (I use an isam db).

My solution would be to have an extra table and add some fields onto the post record.

The "posts" table would be given the following extra field
Number of Comments.
I'd update the number of comments each time a comment is posted (i.e. add one to the value).

The new table would contain
Post ID, Number of Comments Read, UserID
I'd update this to the "number of comments for a post" whenever a user reads a thread.

When displaying the number of comments for a user, I'd link the two records, calculate the difference and show it.
With the merged file, that'd give only one extra record read per post on the front page.

It would mean however that non-users wouldn't see how many comments have been added since they last viewed the site.
A couple of other things...
Because the "number of comments on a thread" field is essentially calculated information, I'd write a quick utility to check the integrity of the value.
posted by seanyboy at 4:29 PM on March 28, 2005


On preview, there are a couple of issues.
Firstly, you don't want to be doing constant updates to the CommentsFor User table. Only update this value if new comments have been added to the Post. (i.e. CommentCount is not the same as UserCommentCount)
Secondly, only update the UserCommentCount when they read the comments for a thread.

A final option may also be this.
You'll have an audit of what pages have been visited by which users.
Every twenty minutes or so, you could use this information to update the UserCommentCount. (cron job or something) That should save you running multiple updates when people are refreshing there pages.
posted by seanyboy at 4:39 PM on March 28, 2005


If your database server supports subqueries (versions of MySQL from 4.1.x up do), you can do it that way given the date of the last visit.
SELECT 
    LinkURL, LinkDescription, LinkText, 
    (
        SELECT COUNT(*) FROM Comments 
        WHERE 
            Parent = Review.ReviewID 
            AND CommentDate >= '$last_vist'
    ) AS RecentCommentCount 
    FROM Review
    additional where clauses here

posted by tomorama at 9:11 PM on March 28, 2005


« Older bebo.com: good or evil?   |   How to get Word to open the next drop-down... Newer »
This thread is closed to new comments.