Find it and add it once more in mysql field
January 21, 2009 2:28 AM   Subscribe

I've made a large mess and need to fix it. It's a "find this" and "add it once more, but above it, in that same field" dealio in one single tables field in mysql. "This" is a different string every time. Not sure this can be done with mysql, so I'm open to all other ideas.

Ok, so I have a badly created embed and object tag code all saved in one mysql field lets call it 'body'. I'm doing the embed like the example at the bottom of this apple page.

Body has other data/info before and after the embed and object tags so 'body' can vary greatly in length.
The Object tags calls the quicktime codebase, and underneath that we have the param name= stuff but we lack param name=src linking to a file! Below the param name comes the embed src=link to file! tag. So, what we want to do is get that link from embed, in each individual 'body' field, and create a param name src with the same link and add this to this same 'body' field.

Each link is different, obviously, and we're talking 45,000 entries which is why I am scratching my head how to do this. Is there a way to, say, add (wildcard) where like% solution I could do, I'm stumped?
posted by dabitch to Computers & Internet (19 answers total)
 
Wow.

Any chance you can simplify your question?

I understood correctly, you've got some html in a database and you want to add where BLAH is taken from . Is this right?

If so, you can do this with an sql query using substr() and position functions. Much better, and easier to read would be to fetch each row, manipulate and update it using Perl or something.
posted by devnull at 2:40 AM on January 21, 2009


Response by poster: OK... Let me give this another shot. :)

Yes, I have some HTML in my database, in a single field. I want to add BLAH where BLAH is in that same field, and BLAH is different in each field. So, GET BLAH, add BLAH once more in that same field, and do it again on each field - get BLAH and repeat BLAH. Since every field is different how do I make it get BLAH and add BLAH 45,000 times?

Update it using Perl or something, ah, sounds like a great idea actually (like I said I'm open to ideas) however, I do not know exactly how to execute great ideas like whip up a perl script like that. my brain is fried.
posted by dabitch at 2:58 AM on January 21, 2009


Jesus, describing seems not to be your forte today. Please give three examples.
posted by orthogonality at 4:20 AM on January 21, 2009


Best answer: I think what he's saying is that he is doing the embed exactly like the bottom of that apple link, BUT he doesn't have the line < PARAM NAME="src" VALUE="MyMovie.mov" >. Since that value is the same as the one farther down in the < EMBED>, he wants to parse that filename from the < EMBED> section (where SRC="MyMovie.mov") and create the line that displays < PARAM NAME="src" VALUE="MyMovie.mov" > from that value. He's got 45,000 versions of it with different values for "MyMovie.mov".
posted by ghostmanonsecond at 4:34 AM on January 21, 2009


Response by poster: I have a single table, with three fields, lets call them number, date, HTML

I have data in the field HTML:
stuff RANDOM stuff

I want to grab random, and put it back in that same HTML field, like so:
RANDOM stuff RANDOM stuff

In each HTML field entry the random is a different charlength as it is a local hyperlink like /folder/name-of-it.mov - I can only guess where RANDOM might be if I search on, for example ".mov" and where the link begins, that is "src" - so I want what is between these two constants. And I want to insert it again, back into the same exact field I found it in.

I'm looking for something that allows me to say: grab everything between "src" and ".mov" in filed HTML, copy it it, and put it back in again repeating it in the same field.

On preview, I think I must hire ghostmanonsecond to be my translator!
posted by dabitch at 4:41 AM on January 21, 2009


What are you using on the front end to display this? PHP? and is your goal to get this correct in the database, or do you really just want the web pages to display correctly? If you just want it to display right, you could parse and fix it on the fly each time you display the page (using PHP, etc). That would save you from having to write the entries back to the db and you might be more comfortable handling this in that front end language over Perl or SQL.
posted by ghostmanonsecond at 5:06 AM on January 21, 2009


Response by poster: Yeah I'm displaying this in php, it's drupal setup, but I am even more lost in fixing that on the fly than I am correcting data.
posted by dabitch at 5:13 AM on January 21, 2009


There was a reason I wrote "Please give three examples." If you can't do that, I can't help you. You gave another description; while it's clearer, it lacks essential information that an example would give. You are asking us to give, gratis, of our time to help you, and then you are wasting our time.
posted by orthogonality at 5:15 AM on January 21, 2009


does RANDOM always equal .mov, or is it any file type? If it's any file type, you need another table of filetype extensions. If it's always .mov, you can use instr() or locate()

Alternatively, you can use string functions to find the file name. Without writing all your code for you, I think you could find the filename by first finding the extension, then testing each character before that for "\" and each character after that for a space.
posted by Pants! at 5:34 AM on January 21, 2009


Best answer: I don't know how to fix this problem, but I think I understand what dabitch is saying, so maybe I can help the process along by re-explaining the problem.

In her table she has field called body. Body contains html code. In her html code she has code to embed a quicktime movie. She does this by using an <OBJECT> element, with an <EMBED> element nested within it. She created the <EMBED> element properly, with a src attribute that correctly references the movie filename. Unfortunately she made a mistake and left out the necessary <PARAM name="src" VALUE=embed src attribute value goes here> child element for the <OBJECT> tag.

For each record, she wants to go in, grab the value of the <EMBED> object's src tag, and create a <PARAM> child element of the corresponding <OBJECT> element with the same value as the <EMBED> object's source attribute, and update the table accordingly.

So, ortho, for three examples (dabitch can, of course, comment if they're right or wrong):

1.

<OBJECT ...> <PARAM name="autoplay" value="true"> ... conspicuous lack of <PARAM name="src" value="..."><EMBED ... src="foo.mov" /></OBJECT>

changes to:

<OBJECT ...> <PARAM name="autoplay" value="true"> ... <PARAM name="src" value="foo.mov"><EMBED ... src="foo.mov" /></OBJECT>

2.

<OBJECT ...> <PARAM name="autoplay" value="true"> ... conspicuous lack of <PARAM name="src" value="..."><EMBED ... src="bar.mov" /></OBJECT>

changes to:

<OBJECT ...> <PARAM name="autoplay" value="true"> ... <PARAM name="src" value="bar.mov"><EMBED ... src="bar.mov" /></OBJECT>

3.

<OBJECT ...> <PARAM name="autoplay" value="true"> ... conspicuous lack of <PARAM name="src" value="..."><EMBED ... src="baz.mov" /></OBJECT>

changes to:

<OBJECT ...> <PARAM name="autoplay" value="true"> ... <PARAM name="src" value="baz.mov"><EMBED ... src="baz.mov" /></OBJECT>

Which is all to say that this is a longer winded explanation with examples of what ghostmanonsecond said.
posted by Reverend John at 6:04 AM on January 21, 2009


It sounds like you want to grab a filename from a link. Is that it? Can't this be done before the original INSERT using a regular expression, or am I missing something?
posted by Civil_Disobedient at 7:28 AM on January 21, 2009


Or are you dealing with legacy data? And why aren't you using identity columns?
posted by Civil_Disobedient at 7:29 AM on January 21, 2009


Here's my quick, TOTALLY UNTESTED perl script. You'll need the following modules on your system: DBI, DBD::mysql, and HTML::TreeBuilder. It also assumes that there's a column in the table that unique identifies each row, called `id`.
#!/usr/bin/perl -wuse strict;use DBI;use HTML::Parser;my $dbh = DBI->connect( 'DBI:mysql:database=mydb;host=myhost.example.org', 'myuser', 'mypass', { RaiseError => 1 } )	|| die "Cannot connect to database: $!";my $sth_update = $dbh->prepare( 'UPDATE `mytable` SET `body` = ? WHERE `id` = ?' );my $sth_select = $dbh->prepare( 'SELECT `id`, `body` FROM `mytable`' );my ($id, $body);$sth_select->execute;$sth_select->bind_columns( \$id, \$body );while ($sth_select->fetch) {	my $tree = HTML::TreeBuilder->new;	$tree->parse( $body );	$tree->eof;	$tree = $tree->elementify;	# skip if this already has a src PARAM	next if $tree->look_down( '_tag' => 'param', 'name' => 'src' );	my $embed = $tree->look_down( '_tag' => 'embed' );	next unless $embed;	my $param = HTML::Element->new( 'param', 'name' => 'src', 'value' => $embed->attr( 'src' ) );	$tree->unshift_content( $param );	$sth_update->execute( $tree->as_HTML, $id );}

posted by sbutler at 7:43 AM on January 21, 2009


Uggg... okay, that failed to copy right. Let's try this.
posted by sbutler at 7:46 AM on January 21, 2009


Best answer: Damn it! Small bug. One more time.
posted by sbutler at 7:47 AM on January 21, 2009


Best answer: Assuming Reverend John is correct about what you want to do, you can't do it inside mysql. You'll either need to write a throwaway script to pull each record from the database, run a regexp on it, and store it again; or as ghostmanonsecond suggests you could leave the data alone and just run a regexp inside the display php page before you display the html.

A perl script would look something like this (I'm glossing over the read/write from the database with pseudocode here:)

while <(list of all records)> {
    $foo = (get the next record from the database)
    # This is the actual search-and-replace; adjust as needed to fit your html:
    $foo =~ s#<EMBED src="(.*?).mov" />#<PARAM name="src" value="$1.mov" /><EMBED src="$1.mov" />#gi;
    # (store the updated record back into the database. Better yet into a copy of the database, so you can test it before it's too late)
}

If you go the other route (which would be easier and safer today but a maintenance headache later on), you would just include that same regexp into the php page before displaying the html. I'm very rusty in PHP so this may not be perfectly accurate, but should give you a start. Assuming the html you want to tweak is already pulled from the database into variable $foo,

<?php
$foo = preg_replace('/<EMBED src="(.*?).mov" \/>/', '<PARAM name="src" value="$1.mov" \/><EMBED src="$1.mov" \/>', $foo);
echo $foo;
?>
posted by ook at 8:00 AM on January 21, 2009


Should've previewed; sbutler's perl script is both more complete and more intelligent (using tree parsing instead of just whacking a regexp at the problem.)
posted by ook at 8:02 AM on January 21, 2009


Response by poster: Thank you all. I've been offline (kid-pickuptime at preschool) and with the info gathered from previous hints found that indeed, yanking it out and throwing a script at it was the best way to go - so I did a short php idea - and saved the results into a file. And here I see I could have done it in perl too (actually I might just do that, I'm still testing stuff here and sbutler's perl script looks like a party.) What I did was this; (start with the usual php+mysqlconnect blabla)

$result = mysql_query("SELECT id, body FROM tablename");

while ($row = mysql_fetch_array($result, MYSQL_ASSOC)) {
preg_match('/]*src="([^"]+)"/m', $row["body"], $matches);
$param = sprintf('', $matches[1]);
$newbody = substr_replace($row["body"], $param,
strpos($row["body"], "<embed"), 0);
$query = sprintf("UPDATE node_revisions SET body='%s' WHERE
vid=%d;", mysql_real_escape_string($newbody), $row["vid"]);
printf("%s\n", $query);
}

mysql_free_result($result);

orthogonality, I did not mean to waste anyones time and I'm sorry that my terrible explaining made it seem so. My three examples would have looked pretty much exactly like Reverend John's as ghostmanonsecond nailed the thinsg needing replacing. I misunderstood.
posted by dabitch at 10:56 AM on January 21, 2009


Response by poster: ack, mean to change my silly tables name to "tablename" in both instances there and not just at the start, now I just look confused, but then again, that's true.
posted by dabitch at 11:00 AM on January 21, 2009


« Older What are the best programs that can convert almost...   |   Is CMS even what I need? Newer »
This thread is closed to new comments.