Converting Postrges-specific query to Mysql
July 14, 2006 9:19 AM   Subscribe

I need help converting a Postgres-specific SQL query to MySQL

I'm in the middle of converting an application to use a Mysql back-end instead of Postgres. The database schema I got converted fairly easily, but one of the SQL queries that the app executes is failing, and I don't know how to write it to make MySQL happy. I'm assuming that the EXTRACT(epoch from task_id::abstime) is where it's failing.

Anyone with good Mysql-fu that can help?

SELECT EXTRACT (epoch from task_id::abstime),
task_name
FROM tasks WHERE task_id < now()
AND task_state = '%d';
posted by mcstayinskool to Computers & Internet (14 answers total) 1 user marked this as a favorite
 
Can you execute the query in something like phpMyAdmin and see what the error is?
posted by Skorgu at 9:27 AM on July 14, 2006


Best answer: In general, to get the epoch from an internal date representation you'll want something like this:


SELECT UNIX_TIMESTAMP(`column`) FROM `table` etc...


The task_id::abstime is new syntax to me so I'm not sure what that's referring to, it may be a postgres-only thing. What format is the date column stored in?
posted by Skorgu at 9:30 AM on July 14, 2006


Best answer: If you're just trying to select the task_id column as a UNIX timestamp instead of a date (which is what it looks like), you probably want the following:

SELECT UNIX_TIMESTAMP(task_id)
FROM tasks
WHERE task_id < now()
AND task_state = '%d';
posted by uncleozzy at 9:31 AM on July 14, 2006


Best answer: To echo what everyone else said faster use UNIX_TIMESTAMP().
posted by togdon at 9:34 AM on July 14, 2006


Epoch is just the number of seconds from the unix epoch, which is Jan 1 1970 00:00:00 UTC (or, if the operand's an interval, the number of seconds in the interval). Extract just converts/truncates a timestamp or date value. The :: operator is a typecast in postgresql.

So the whole thing is, treating task_id as an abstime, find the difference between that and the unix epoch.

Abstime is now deprecated in postgres; it covers the same range as unix time and has the same one second granularity (both are signed 32-bit values).

Since in unix time the epoch (1 Jan 1970) is by definition zero, what extract is doing here is likely a subtraction: task_id - 0. So if my guess is correct and abstime in this case is unix time, all you need to do is return task_id. You'll need to double check this of course, with the added complication that different versions of postgesql time make abstime UTC time or local time.


Some notes: it appears that the system you're converting generated task_id by taking the current time. If task_id really is supposed to be an id, it really should be unique. Since abstime has a one-second granularity, if the system is fast enough to add two tasks in the same second (and with modern hardware, it might well be), you'd get two different tasks with the same id.

To avoid this potential problem. you probably want to make task_id an auto-increment synthetic key, and make the timestamp a part of a task's attribute data.
posted by orthogonality at 9:57 AM on July 14, 2006


(Oh, and having task_state a string (varchar) rather than an enumnerated type? Error prone and not robust.)
posted by orthogonality at 10:03 AM on July 14, 2006


Response by poster: Thanks all, esp. to orthogonality for the extra time you spent on this. I'm taking a look at UNIX_TIMESTAMP() now...hopefully that will do it.

I agree with what you're saying about making the column task_id unique, but this is an application that I'm mainly trying to kickstart rather than redesign completely. In other words, move it to MySQL and get it working like it was before, nothing more. Good advice nevertheless.
posted by mcstayinskool at 10:40 AM on July 14, 2006


Now-that-you-have-an-answer-derail: I'm always interested in why people prefer My to PG; can you spare a paragraph to expand on that?
posted by baylink at 11:02 AM on July 14, 2006


mcstayinskool writes "this is an application that I'm mainly trying to kickstart rather than redesign completely. In other words, move it to MySQL and get it working like it was before, nothing more."

Right, but mysql is widely reputed (*cough*) to be faster than postgres -- I'm assuming that's why your clients/employers wanted the retro-fit in the first place. Regardless, in a year or two they're going to move to faster hardware. And every year or two thereafter.

If either of these things result in more than one task being added in a second, the system's going to produce all sorts of erroneous results.* And the last person to touch it -- that is, you -- will be blamed. Not the guy who made the mistake of using timestamps for ids. They'll reason (erroneously) that they had no problems until mcstayinskool upgraded the database, so it must be mcstayinskool's error.

If you haven't been given the resources to fix this problem, then at least perform due diligence and CYA by documenting it and getting that signed-off on by whomever you report to. Then, the problem's still not fixed, but it's not your problem. (Or you'll be given the resources, which means you'll make more money on the project, or if your contract is fixed-cost all-inclusive, you may be forced to fix it, but that's still better than losing your reputation two years hence.)


*Prediction: mysql didn't until recently have correlated subqueries ("sub-selects"), so you're probably not using them. A correlated sub-query that produced two results (from two tasks with the same id) would cause the database engine to immediately throw an error. Which would be good. But likely your system won't have correlated subqueries and won't be that lucky; instead, both tasks will be updated/modified when only one should be. This silent data corruption will go unnoticed for some time; when it is noticed, a significant portion of the data will corrupt and will have to be laboriously re-created by hand, at great expense.
posted by orthogonality at 11:17 AM on July 14, 2006


baylink writes "I'm always interested in why people prefer My to PG; can you spare a paragraph to expand on that?"

MySQL is reputedly faster for simple queries. Posgtresql is far more versatile, but is complicated enough to have a steeper learning curve.

(E.g., pg allows you, out of the box, to use three different languages for internal stored procedures -- or to write your own stored procedure or stored procedure language in C. But all three standard languages (until recently?) had quirks that made them a pain to use. (pg required the whole sp to be delimited by quotes. This made using quotes inside the sp a pain in the ass, a pain that made no sense unless you were a C coder and realized, "oh, this is a string that's gonna be parsed by some C program something". Exposing the implementation details of the parser like that was just clumsy.)

In part, MySQL is easier because it just doesn't (or didn't, until recently) support much of ANSI SQL. But this ironically made it more attracting to authors of various "glue" libraries like PHP: a set of PHP classes could much more easily encapsulate all of MySQL than all of pg; and as a library author, you look a lot better by saying, "my library lets you use all MySQL operations" than saying "my library gives you access to a limited subset of pg operations".

So you end up with a lot more libraries and frameworks that use MySQL, so if you were database agnostic but wanted to use one of those frameworks, you ended up using MySQL as a backend. And so you got used to running and admining MySQL, and you eventually had an investment in MySQL, in terms of your time and learning, and why then start from scratch with pg?

Because MySQL supports a lot less of ANSI SQL (in particularly, until MySQL 5.0, it didn't support views or sub-queries), programmers who were only SQL beginners felt more comfortable with it, because they didn't have to deal with SQL gurus telling them there were better (and less obvious) ways to do things. Like the library being able to cover all of MySQL because there was less to cover, a programmer could claim to know everything about MySQl -- to be a MySQL guru -- with a lot less work than mastering ANSI SQL. And so these MySQL gurus have a vested interest getting people to use MySQL.

But because MySQL doesn't support views or sub-queries, in many cases you have to do in a front-end (or middle-tier) client what you could do in the database in ANSI SQL. In particular, you have to pull lots of data out of the database in order to do calculations that just go back into the database. So while MySQL may be faster for these simple queries, those savings are offset by the need to pull out much more data because you can't do complicated queries within the database.

For dead simple stuff (non-personalized CMS like blogs), MySQL probably is faster, but it's less extensible. And once you're doing more complicated stuff (work-flow, accounting, book-keeping, scheduling) you end up paying a real price for not having views and sub-queries and stored procedures.

So using MySQL is like saying "why should I bother buying expensive copper-bottomed pans when all I'm going to be doing is frying eggs" -- if all you plan to do is fry eggs, by all means get a cheap aluminium pan. If you're hoping to open a five star restaurant, you probably know more about cooking and understand the value of copper-bottomed pans.
posted by orthogonality at 11:50 AM on July 14, 2006


Response by poster: I can speak a little to the "why Mysql and not Postgres" question, though not nearly as elegantly or well-informed as orthogonality just did.

1. for this particular project, the reason is simple. We (re: "I") maintain a dozen or more MySQL databases for misc functions internally. This application is literally the only thing that runs on Postgres in our company, there's nothing it's doing that MySQL can't do, so there's no good reason for keeping it there (esp. when the developer isn't even with the company any more).

2. I often do not have the luxury of having a graphical interface to the database, logging into a command-line remotely through one or more firewalls. For that reason, I end up using the command-line clients mysql and psql. And psql SUCKS. In mysql, I type 'show tables' to see tables in a database. In psql, it's \dt. To quit in mysql, I type 'quit', in postgres, it's '\q'. This may seem trivial, but for someone that is jumping around between about 3 dozen technologies and languages on a given day, the psql interface is infuriating. I realize this has *nothing* to do with the relative merits of the DB engine itself, but my database requirements are rarely if ever in need of the extra SQL that Postgres provides, so MySQL is my de facto choice.
posted by mcstayinskool at 12:28 PM on July 14, 2006


Ok; both nice explanations, from different axes. Appreciate it. McStay; you do have phpPgAdmin installed, right?

Lots of people (I was one of them) don't realize that phpMyAdmin has a counterpart...
posted by baylink at 6:11 PM on July 14, 2006


I always get a little annoyed when people port from PostgreSQL to MySQL. It's like making water flow uphill; it's not natural, dammit.

For that reason, I end up using the command-line clients mysql and psql. And psql SUCKS. In mysql, I type 'show tables' to see tables in a database. In psql, it's \dt. To quit in mysql, I type 'quit', in postgres, it's '\q'. This may seem trivial, but for someone that is jumping around between about 3 dozen technologies and languages on a given day, the psql interface is infuriating. I realize this has *nothing* to do with the relative merits of the DB engine itself, but my database requirements are rarely if ever in need of the extra SQL that Postgres provides, so MySQL is my de facto choice.

Those are some crazy criteria for evaluating psql versus MySQL's shell. First, the MySQL shell has \-prefix commands itself, including "\q" to quit, and "\h" to list the syntax for an SQL command. And PostgreSQL's "\dt" is arguably a lot fewer keystrokes than "show tables"! I type these commands a lot, and I want them to stay short.

PostgreSQL, on the other hand, has excellent and invaluable Readline-based context tab completion (type "sel" and hit tab and you'll get "select"; type "select foo from b" and hit tab, and it will suggest the first table starting with "b", or on double-tab list every table starting with "b"), which I have not seen in MySQL (though it could admittedly be some Debian package I have not installed).

Personally I find the \-prefix system to be the more intuitively designed mechanism, a logical separation between SQL and the shell's higher-level metacommands: SQL doesn't have a standard way of accessing database metadata, and the \-prefixed commands are merely psql-level metacommands that access pg_tables and other internal tables for you. MySQL's proprietary SQL extensions just add more database-specific stuff to think about, when you should be moving away from database-specific stuff.

As for quitting psql, it's also available as ctrl+D, as in just about every Unix shell/interpreter/Readline-based program out there -- including MySQL.
posted by gentle at 9:19 PM on July 14, 2006


@gentle: thanks; that's my reflex as well, but you have *good arguments*. :-)

(bookmarks thread)
posted by baylink at 7:18 PM on July 15, 2006


« Older What makes web ads work?   |   From a t-shirt idea to a geek badge of pride Newer »
This thread is closed to new comments.