SQL help
November 19, 2007 1:20 PM Subscribe
Please help me with a giant SQL-ish mess. I need a better solution than MS Access for dealing with millions of records and building queries.
The data that I'm looking to mine comes in 30-odd tables (XX.dat extenstion, | separated text and number values, 300-odd MBs of data, millions of records) that theoretically link together via a unique identifier. I have no SQL experience and minimal MS access experience. Right now, I'm converting the tables I need to XX.txt, and importing them into Access. I'm cobbling together some of the end data I need but it's a terrible pain and not easy to update (have to re-import those tables every time there's an update). I'm not building my queries properly, and I have no clue how to build proper reports that spit out what I want. I have little confidence that I'm not creating duplicates or screwing this all up somehow. The SQL discussions already have on AskMe go over my head.
So what I'm looking for is hopefully a better application than Access which can handle all of this data better and natively as SQL (and is very easy to use - no coding!) OR if it's unreasonable to think that I can amateurishly manage all of this in an effective way, please tell me so so that I can post to MefiJobs.
The data that I'm looking to mine comes in 30-odd tables (XX.dat extenstion, | separated text and number values, 300-odd MBs of data, millions of records) that theoretically link together via a unique identifier. I have no SQL experience and minimal MS access experience. Right now, I'm converting the tables I need to XX.txt, and importing them into Access. I'm cobbling together some of the end data I need but it's a terrible pain and not easy to update (have to re-import those tables every time there's an update). I'm not building my queries properly, and I have no clue how to build proper reports that spit out what I want. I have little confidence that I'm not creating duplicates or screwing this all up somehow. The SQL discussions already have on AskMe go over my head.
So what I'm looking for is hopefully a better application than Access which can handle all of this data better and natively as SQL (and is very easy to use - no coding!) OR if it's unreasonable to think that I can amateurishly manage all of this in an effective way, please tell me so so that I can post to MefiJobs.
Access has a default back-end that sucks, but you can connect it to a real database pretty easily. My immediate advice, is to try something that costs nothing:
Install MySQL server.
In your Control Panel, ODBC, add a ODBC->MySQL driver.
Then, import your data in Access using the ODBC data store. Viola, a much better database, using the interface you already know.
posted by cmiller at 1:27 PM on November 19, 2007
Install MySQL server.
In your Control Panel, ODBC, add a ODBC->MySQL driver.
Then, import your data in Access using the ODBC data store. Viola, a much better database, using the interface you already know.
posted by cmiller at 1:27 PM on November 19, 2007
Response by poster: You don't say what format the .dat data is (it could be anything) or how you are converting it to text
The files are named *.dat. What I'm currently doing (which seems to work) is opening them in notepad, saving them as *.txt files, which then allows me to open them in Access (as they are too big for excel and Access doesn't know what to do with .dat files, apparantly). The data is just | separated text. I realize I might be confusing in my explanations, I don't really know what I'm doing.
I'll try what you suggested now, cmiller...thanks...
posted by loquax at 1:33 PM on November 19, 2007
The files are named *.dat. What I'm currently doing (which seems to work) is opening them in notepad, saving them as *.txt files, which then allows me to open them in Access (as they are too big for excel and Access doesn't know what to do with .dat files, apparantly). The data is just | separated text. I realize I might be confusing in my explanations, I don't really know what I'm doing.
I'll try what you suggested now, cmiller...thanks...
posted by loquax at 1:33 PM on November 19, 2007
As much as I hate to recommend Microsoft stuff, SQL Server Express Edition is free and can be used for databases up to 4gb, I think. You can use SQL Server Management Studio just like you'd use Access to build a query.
posted by SpecialK at 1:38 PM on November 19, 2007
posted by SpecialK at 1:38 PM on November 19, 2007
I would suggest that your problem isn't the back end database, but rather your inexperience with the tools at hand and the underlying concepts.
Jet is a dog, but I your problem isn't that the database file grows when performing updates. The problem is you are attempting a job you don't know how to do.
Call a professional. This is probably a trivial problem for someone who knows what they are doing.
posted by mrbugsentry at 1:40 PM on November 19, 2007
Jet is a dog, but I your problem isn't that the database file grows when performing updates. The problem is you are attempting a job you don't know how to do.
Call a professional. This is probably a trivial problem for someone who knows what they are doing.
posted by mrbugsentry at 1:40 PM on November 19, 2007
Learn the tools.
Install Access 2007 (which has a nice automation engine that lets you store import tasks)
For what you are doing you don't need SQL Server Express, or MySQL, or a 'real' backend. For simple single instance reporting Access works fine.
Most Importantly:
Learn SQL -- Not the Access GUI. Everything you learn will therefor be usable with SQL Server Express, Postgres, MySQL, Oracle, DB2, etc.
posted by SirStan at 1:45 PM on November 19, 2007
Install Access 2007 (which has a nice automation engine that lets you store import tasks)
For what you are doing you don't need SQL Server Express, or MySQL, or a 'real' backend. For simple single instance reporting Access works fine.
Most Importantly:
Learn SQL -- Not the Access GUI. Everything you learn will therefor be usable with SQL Server Express, Postgres, MySQL, Oracle, DB2, etc.
posted by SirStan at 1:45 PM on November 19, 2007
Response by poster: I would suggest that your problem isn't the back end database, but rather your inexperience with the tools at hand and the underlying concepts.
Yes this is true. I have no real interest in or time to learn anything. If these tables can't automatically load into something that can spit out a query relatively simply (and be updateable weekly, etc), I'm not willing to spend any more time on this and will pass it on. I feel like I'm 90% of the way there with Access, but actually getting the information I need is difficult, and updating the data is terrible. I'll play with the suggestions mentioned so far, but I think you're right. To that end, anyone have any idea how much it would cost to build a database (with several queries and reports) that is easy to update (append) with new data in access for a dummy like me? Anyone want to take a stab at this before I post it to jobs?
posted by loquax at 1:58 PM on November 19, 2007
Yes this is true. I have no real interest in or time to learn anything. If these tables can't automatically load into something that can spit out a query relatively simply (and be updateable weekly, etc), I'm not willing to spend any more time on this and will pass it on. I feel like I'm 90% of the way there with Access, but actually getting the information I need is difficult, and updating the data is terrible. I'll play with the suggestions mentioned so far, but I think you're right. To that end, anyone have any idea how much it would cost to build a database (with several queries and reports) that is easy to update (append) with new data in access for a dummy like me? Anyone want to take a stab at this before I post it to jobs?
posted by loquax at 1:58 PM on November 19, 2007
Spend a few tens of hourse learning enough SQL to get you by. The kind of work you're doing likely won't take much SQL, and by the time you've learned enough of it to do what you want, you will have way more confidence in the integrity of your data.
posted by flabdablet at 1:58 PM on November 19, 2007
posted by flabdablet at 1:58 PM on November 19, 2007
If your primary need is for a quick script that imports a bunch of text files into a .mdb, I could probably knock one up for you next Thursday (busy until then). I recently had to do this for a different project, and the details are still relatively fresh in my mind.
You're welcome to look at the script I'd be using as a starting point for this.
posted by flabdablet at 2:09 PM on November 19, 2007
You're welcome to look at the script I'd be using as a starting point for this.
posted by flabdablet at 2:09 PM on November 19, 2007
I would install Oracle XE , it's free, and use an External Table to query your flat file and the DML Merge command to sync up with the new files when you get them.
People on Oracle's OTN forums would probably walk you through it if you had any issues.
posted by Slenny at 3:23 PM on November 19, 2007
People on Oracle's OTN forums would probably walk you through it if you had any issues.
posted by Slenny at 3:23 PM on November 19, 2007
*by sync up I mean merge the contents of the flat file to a (probably) heap table.
posted by Slenny at 3:25 PM on November 19, 2007
posted by Slenny at 3:25 PM on November 19, 2007
In terms of just importing the .dat files, if all you're doing is opening them in Notepad and re-saving them as .txt files, you can probably just rename the files from foo.dat to foo.txt - Notepad probably isn't actually changing any of the data inside the files. Besides that, nthing the suggestion to get professional help - from your description this doesn't sound like an extremely hard technical problem.
posted by whir at 5:34 PM on November 19, 2007
posted by whir at 5:34 PM on November 19, 2007
« Older Any online sources of full resolution videos of... | This new sliced bread is the greatest thing since... Newer »
This thread is closed to new comments.
posted by rbs at 1:26 PM on November 19, 2007