A query connundrum.
September 3, 2006 5:51 AM   Subscribe

Attention SQL gurus. This has me completely stumped. How can I find the index (position) of a row in a table without querying the entire set?

Here is the situation:

I have a web application that has some ajaxified tables on many of the pages. These tables show various slices of data. Most of the tables are about 10 rows per page, so for example, if the table was a list of all Customers, and you've got 10,000 customers, that would be 1,000 pages of customers.

To initially show the table, I have to do two queries: one is the initial 10 rows that are to be displayed, the other is a count(*) to get the total amount of rows.

Here is the problem:

I have a particular customer ID in my hand. I'd like the webtable to take this ID and jump to the correct page that will show that customer. Naturally, it has to take into account sorting.

Hypothetical simplified example:
ID NAME
1102 C_D's Tavern
1294 Bob's Discount Dairy
1299 Chico's House of Wicker
...
9309 Larry's Llama Emporium


If I sort by ID (natural order), the index of Bob's Discount Dairy would be 2, Chico's Hosue of Wicker would be 3, etc. If I'm displaying "page 1", that means the first 10 indexes, so I know those two would be on the first page. "OK, so what page is ID# 5129 on?" That's the problem.

What I want to know is if there's a way to figure out the index of a particular ID without having to query the entire dataset. Or if I do have to query the entire dataset, I'd like it to be as painless as possible. Another important note: some of these queries are being run off of VIEWs, which could complicate things.

This is on Microsoft SQL Server 2000.
posted by Civil_Disobedient to Computers & Internet (12 answers total) 2 users marked this as a favorite
 
Best answer: You could do count(*) where ID <= 5219. Divide that by ten, and now you know what page it's on.
posted by jaded at 5:58 AM on September 3, 2006


Response by poster: Unfortunately, that assumes a natural sort. If I were to order by NAME, all of a sudden the IDs would get jumbled and the where clause wouldn't be accurate. Really great idea, but I don't think I can use it, sadly.

Just to clarify, I'm not (entirely) adverse to retrieving the whole ID column from the table, since it will likely be indexed. It would be a taxing operation, but it would only be done on initial generation of the table, and I think our users can put up with a few extra hundred miliseconds for the feature.
posted by Civil_Disobedient at 6:50 AM on September 3, 2006


Response by poster: Holy cow. Looks like the same logic is used for the less-than/greater-than operator in the where clause as is used by the ORDER BY routine.

In other words, this works:
select count(*) from customer where name < 'chico\'s house of wicker'br>
AWESOME! Thanks, jaded.
posted by Civil_Disobedient at 6:56 AM on September 3, 2006


Best answer: jaded's answer works even if you sort by name.

If you do what jaded suggests, "select count(*) from table where id <= @desired_id", you get the number of rows (plus one) preceding the desired row. Since you know the number of rows you display per page,, again as jaded points out, you can do this division to find the page desired_row is on.

To get it by name, you do exactly the same thing, but with the name rather than the ID: "select count(*) where name < @desired_name".

Now, to get the desired_name given the id, you just do a subquery:
"select count(*) from table where name < ( select name from table where id = @desired_id )"

Note that jaded's answer needs to be slightly modified for names; an id is presumably unique, but a name isn't. For non-unique things, you want to use less-than ("<"), not less-than-or-equal (<=). If you use less-than-or-equal and you have more than one matching thing, you'll find the page of the last repeated thing, not the first one. So in the general case, use less-than, and figure the thing you want ids one row after that (or it doesn't exist at all).

So the page offset (that is, the zero-based page number) you want can be found with

select @pageoffset = ( count(*) + 1 ) / @rowsperpage from table where name < ( select name from table where id = @desired_id )

and the one-based page number with:

select @page = ( count(*) + 1 ) / @rowsperpage + 1 from table where name < ( select name from table where id = @desired_id )

(And jaded deserves best answer.)


Civil_Disobedient writes "Holy cow. Looks like the same logic is used for the less-than/greater-than operator in the where clause as is used by the ORDER BY routine."

Then the database is broken. But please post the SQL select that you think shows this.
posted by orthogonality at 7:15 AM on September 3, 2006


Then the database is broken. But please post the SQL select that you think shows this.

Huh? Are you saying the comparison operator shouldn't use the same comparison function as ORDER BY's sort algorithm?
posted by cillit bang at 8:29 AM on September 3, 2006


cillit bang writes "Huh? Are you saying the comparison operator shouldn't use the same comparison function as ORDER BY's sort algorithm?"


No. (Although it is possible to explicitly tell the order by to use a different collation.) I misunderstood Civil_Disobedient to be saying the order by on the query changed how the less-than operator worked.
posted by orthogonality at 8:57 AM on September 3, 2006


Response by poster: I misunderstood Civil_Disobedient to be saying the order by on the query changed how the less-than operator worked.

What I meant was that the same internal methods the database uses to determine ORDER BY are also used in the less than / greater than operator in the WHERE clause. In other words, doing an ORDER BY on a string will sort it in alphabetical order, just as doing a less than on a string will use the alphabetical value in determining what's "before" and "after."

For non-unique things, you want to use less-than ("< ), not less-than-or-equal (=).

Excellent point! I wonder if there's a way to return the offset of collisions with another subselect? For example, let's say there are multiple records where the name is identical, but I need a particular record. I could do a separate count off the column where the property is equal, then if that's greater than 1 perform a subselect, using the count method above where ID = whatever the ID is.

Data Example:
ID NAME
12 Company 1
22 Company 2
41 Company 2
52 Company 2
79 Company 3
91 Company 4


OK, given that example above, let's say there's one row per page. That makes it easier. I want to know what page ID=41 is on. The table is sorted by NAME. So first thing, I need to figure out if the NAME is unique or not.

select count(*) from table where @sort_column = (select @sort_column from table where id=@id)

I do that and I get three records. Next thing is to determine what page "Company 2" starts on.

select count(*) from table where @sort_column < "Company 2"

That tells me that "Company 2" starts on the count+1, which is page two. Now I need to get the sub-offset of record ID=41 when the company name = "Company 2", and add it to the value I just got. I'm assuming, and please correct me if I'm wrong, that when you specify an ORDER BY on just one column, it will order by that column, and then order by the ID (the natural order). So...

select count(*) from table where id < @ID and @sort_column = "Company 2"

This result+1 = 2 (the offset of ID=41, when the company names are the same). Add that to the previous offset (page 2) and I get 4, which is the correct page+1.

There's got to be a way to wrap this up in a single query!

Many thanks in advance, by the way.
posted by Civil_Disobedient at 10:53 AM on September 3, 2006


There's got to be a way to wrap this up in a single query!


If you "ORDER BY name, id", do this:

SELECT count(*) FROM blah WHERE name< 'company 2' or (name='Company 2' and id'41') br>
I'm assuming, and please correct me if I'm wrong, that when you specify an ORDER BY on just one column, it will order by that column, and then order by the ID (the natural order)

SQL has no concept of a natural order. If you don't explicitly specify one, expect the order to be random. So you need to add id to the ORDER BY clause if you want things predictable.
posted by cillit bang at 3:50 PM on September 3, 2006


Best answer: MeFi messed that up:

SELECT count(*) FROM blah WHERE name<'company 2' or (name='Company 2' and id<41)
posted by cillit bang at 3:51 PM on September 3, 2006


Response by poster: That is deceptively straight-forward, and my initial reaction was "It can't be that simple!" yet all my test cases so far show this works, and works without requiring any ORDER BY clause at all (I'm guessing since the "name < blah is effectively determining the order).br>
Thank you all for helping me out... I am supremely grateful!
posted by Civil_Disobedient at 4:59 PM on September 3, 2006


The ORDER BY has to be in the query that actually loads the data. The name<'company 2' or (name='Company 2' and id<41) in the counting query is the equivalent of ORDER BY name, id in the loading query, and obviously they both need to match.
posted by cillit bang at 6:57 PM on September 3, 2006


Civil_Disobedient writes "That is deceptively straight-forward, and my initial reaction was 'It can't be that simple!'"

Been there! I remember when I first learned this about SQL. I'd taught myself to do basic queries, and I understood it as set theory. But it took a colleague to show me that I could use arbitrary expressions in SQL, eg., not just "select x from it where x < 10 but select ( x * 4 + 1 ) / 3.3 from t where x / 2 x - 30.br>
I recall asking him, "hoe can that be legal"? and he explained, it's just an expression. (In retrospect I should have read a SQL parser BNF, of course.)

As soon as I realized that, I had this great feeling of happiness, as I realized I could pretty much make SQL do anything I wanted. And hat a waste it is to pull values out of the database just to manipulate / transform then in front-end code. Eventually I'd write a whole automatic multiple-sequential-timed-invoice-per-mortgage calculator and generator for Freddie Mac, all in T-SQL, mostly using views that transformed and calculated data in the underlying tables.
posted by orthogonality at 9:17 PM on September 3, 2006 [1 favorite]


« Older Forward Head Posture = Pain   |   Let me prove it to you. Occupational exams Newer »
This thread is closed to new comments.