<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:admin="http://webns.net/mvcb/"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
	<channel> 

      <title>Comments on: Bookmarklets and PHP parsing of websites</title>
      <link>http://ask.metafilter.com/33973/Bookmarklets-and-PHP-parsing-of-websites/</link>
      <description>Comments on Ask MetaFilter post Bookmarklets and PHP parsing of websites</description>
	  	  <pubDate>Wed, 08 Mar 2006 09:29:58 -0800</pubDate>
      <lastBuildDate>Wed, 08 Mar 2006 09:29:58 -0800</lastBuildDate>
      <language>en-us</language>
	  <docs>http://blogs.law.harvard.edu/tech/rss</docs>
	  <ttl>60</ttl>

<item>
  	<title>Question: Bookmarklets and PHP parsing of websites</title>
  	<link>http://ask.metafilter.com/33973/Bookmarklets-and-PHP-parsing-of-websites</link>	
  	<description>I&apos;ve been trying to find good resources on creating a bookmarklet for a web site I&apos;m working on, and then parsing the passed URL with PHP. &lt;br /&gt;&lt;br /&gt; Basically I want to write a bookmarklet that allows a user to &quot;post this page,&quot; the page being a YouTube or Google Video page, to a website I&apos;m working on. I haven&apos;t really been able to find good resources on integrating that with a PHP script that would then load the page and parse it, and in the process extract the &quot;embed this&quot; object tag. Any info on where I could find help on parsing web pages with PHP, and how one goes about setting up a web page to take URLs passed by bookmarklets, would be greatly appreciated.&lt;br&gt;
&lt;br&gt;
I have root access to the server, and we&apos;re running Apache 1.3.</description>
  	<guid isPermaLink="false">post:ask.metafilter.com,2008:site.33973</guid>
  	<pubDate>Wed, 08 Mar 2006 09:24:41 -0800</pubDate>
  	<dc:creator>sourbrew</dc:creator>
	
	<category>php</category>
	
	<category>bookmarklet</category>
	
	<category>parse</category>
	
	<category>html</category>
	
</item>
<item>
  	<title>By: evariste</title>
  	<link>http://ask.metafilter.com/33973/Bookmarklets-and-PHP-parsing-of-websites#529491</link>	
  	<description>This is easy. You pass various elements of the page using javascript variables into GET variables in the URL, which your php script then picks up from $_GET[&amp;quot;variablename&amp;quot;]. I&apos;ll make you a contrived example in a minute to get you started.</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.33973-529491</guid>
  	<pubDate>Wed, 08 Mar 2006 09:29:58 -0800</pubDate>
  	<dc:creator>evariste</dc:creator>
</item>
<item>
  	<title>By: evariste</title>
  	<link>http://ask.metafilter.com/33973/Bookmarklets-and-PHP-parsing-of-websites#529517</link>	
  	<description>Here&apos;s your basic bookmarklet. You want to compress it all into one logical line of code (no newlines), although you can have multiple javascript statements separated by ;&apos;s. &lt;br&gt;
&lt;br&gt;
&lt;code&gt;javascript:d=document;w=window;t=&apos;&apos;;if(d.selection){t=d.selection.createRange().text}else%20if(d.getSelection){t=d.getSelection()}else%20if(w.getSelection){t=w.getSelection();}void(w.open(&apos;http://yourdomain.com/path/to/yourscript.php?pagetitle=&apos;+escape(d.title)+&apos;&amp;amp;url=&apos;+escape(d.location.href)+&apos;&amp;amp;quote=&apos;+escape(t),&apos;_blank&apos;,&apos;status=yes,resizable=yes,scrollbars=yes&apos;))&lt;/code&gt;&lt;br&gt;
&lt;br&gt;
This creates a new window that loads a URL that looks like http://yourdomain.com/path/to/yourscript.php?pagetitle=something&amp;amp;url=somethingelse&amp;amp;quote=theselectedtextonthepage&lt;br&gt;
&lt;br&gt;
Continued in the next comment...</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.33973-529517</guid>
  	<pubDate>Wed, 08 Mar 2006 09:57:17 -0800</pubDate>
  	<dc:creator>evariste</dc:creator>
</item>
<item>
  	<title>By: sourbrew</title>
  	<link>http://ask.metafilter.com/33973/Bookmarklets-and-PHP-parsing-of-websites#529527</link>	
  	<description>What if I don&apos;t want the user to actually have to select the text, though, and just want to parse the site using PHP? I&apos;m familiar with streams in C++, Java, and some other languages, but I&apos;ve been having a hard time finding GOOD help sites on setting them up in PHP. Should I just cave and buy the O&apos;Reilly book?</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.33973-529527</guid>
  	<pubDate>Wed, 08 Mar 2006 10:05:18 -0800</pubDate>
  	<dc:creator>sourbrew</dc:creator>
</item>
<item>
  	<title>By: evariste</title>
  	<link>http://ask.metafilter.com/33973/Bookmarklets-and-PHP-parsing-of-websites#529530</link>	
  	<description>Here&apos;s the form part of yourscript.php, which handles all the GET variables passed in the URL:&lt;br&gt;
&lt;br&gt;
&lt;code&gt;&amp;lt;form&amp;nbsp;method=&amp;quot;post&amp;quot;&amp;nbsp;class=&amp;quot;myformclass&amp;quot;&amp;nbsp;action=&amp;quot;backendscript.php&amp;quot;&amp;nbsp;name=&amp;quot;bookmarklet_handler&amp;quot;&amp;gt;&lt;br&gt;
&amp;lt;table&amp;gt;&lt;br&gt;
&amp;lt;tr&amp;gt;&lt;br&gt;
&amp;lt;td&amp;nbsp;align=&amp;quot;right&amp;quot;&amp;gt;&amp;lt;label&amp;nbsp;for=&amp;quot;title_field&amp;quot;&amp;nbsp;class=&amp;quot;mylabelclass&amp;quot;&amp;gt;Headline:&amp;lt;/label&amp;gt;&amp;lt;/td&amp;gt;&lt;br&gt;
&amp;lt;td&amp;gt;&amp;lt;input&amp;nbsp;class=&amp;quot;mytextboxclass&amp;quot;&amp;nbsp;id=&amp;quot;title_field&amp;quot;&amp;nbsp;name=&amp;quot;pagetitle&amp;quot;&amp;nbsp;size=&amp;quot;48&amp;quot;&amp;nbsp;value=&amp;quot;&amp;lt;?=htmlspecialchars($_GET[&amp;quot;pagetitle&amp;quot;])?&amp;gt;&amp;quot;&amp;nbsp;/&amp;gt;&amp;lt;/td&amp;gt;&lt;br&gt;
&amp;lt;/tr&amp;gt;&lt;br&gt;
&amp;lt;tr&amp;gt;&amp;lt;td&amp;nbsp;colspan=&amp;quot;2&amp;quot;&amp;gt;&amp;lt;textarea&amp;nbsp;class=&amp;quot;mytextareaclass&amp;quot;&amp;nbsp;id=&amp;quot;new_post_textarea&amp;quot;&amp;nbsp;name=&amp;quot;mainbody&amp;quot;&amp;nbsp;style=&apos;width:800px;height:500px;&apos;&amp;gt;&amp;lt;?=htmlspecialchars(&amp;quot;&amp;lt;a&amp;nbsp;href=\&amp;quot;&amp;quot;&amp;nbsp;.&amp;nbsp;$_GET[&amp;quot;url&amp;quot;]&amp;nbsp;.&amp;nbsp;&amp;quot;\&amp;quot;&amp;gt;&amp;quot;&amp;nbsp;.&amp;nbsp;$_GET[&amp;quot;pagetitle&amp;quot;]&amp;nbsp;.&amp;nbsp;&amp;quot;&amp;lt;/a&amp;gt;\n\n&amp;lt;blockquote&amp;gt;&amp;quot;&amp;nbsp;.&amp;nbsp;$_GET[&amp;quot;quote&amp;quot;]&amp;nbsp;.&amp;nbsp;&amp;quot;&amp;lt;/blockquote&amp;gt;&amp;quot;)?&amp;gt;&amp;lt;/textarea&amp;gt;&amp;lt;/td&amp;gt;&amp;lt;/tr&amp;gt;&lt;br&gt;
&amp;lt;/table&amp;gt;&lt;br&gt;
&amp;lt;/form&amp;gt;&lt;/code&gt;&lt;br&gt;
&lt;br&gt;
Then all you have left to do is write &amp;quot;backendscript.php&amp;quot;, which will handle the POSTed form input after your user has  edited the new post to their liking and add it to the database. I assume you know how to handle a POST form in php.</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.33973-529530</guid>
  	<pubDate>Wed, 08 Mar 2006 10:07:10 -0800</pubDate>
  	<dc:creator>evariste</dc:creator>
</item>
<item>
  	<title>By: sourbrew</title>
  	<link>http://ask.metafilter.com/33973/Bookmarklets-and-PHP-parsing-of-websites#529531</link>	
  	<description>I suppose the functionality I envision goes like this:&lt;br&gt;
&lt;br&gt;
User sees &amp;quot;phatty&amp;quot; video&lt;br&gt;
&lt;br&gt;
User clicks bookmarklet&lt;br&gt;
&lt;br&gt;
My php file receives url&lt;br&gt;
&lt;br&gt;
Loads source for passed url&lt;br&gt;
&lt;br&gt;
Determines if it&apos;s YouTube or Google Video&lt;br&gt;
&lt;br&gt;
Parses page to find a specific set of tags&lt;br&gt;
&lt;br&gt;
Inputs tags into submission box.&lt;br&gt;
&lt;br&gt;
I sort of feel like I&apos;m asking for too much with all of that; a pointer to a good reference manual covering stream parsing would probably be adequate for now.</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.33973-529531</guid>
  	<pubDate>Wed, 08 Mar 2006 10:07:30 -0800</pubDate>
  	<dc:creator>sourbrew</dc:creator>
</item>
<item>
  	<title>By: sourbrew</title>
  	<link>http://ask.metafilter.com/33973/Bookmarklets-and-PHP-parsing-of-websites#529533</link>	
  	<description>Also, that base code for the bookmarklet should help a lot in at least getting things started, thanks.</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.33973-529533</guid>
  	<pubDate>Wed, 08 Mar 2006 10:11:06 -0800</pubDate>
  	<dc:creator>sourbrew</dc:creator>
</item>
<item>
  	<title>By: evariste</title>
  	<link>http://ask.metafilter.com/33973/Bookmarklets-and-PHP-parsing-of-websites#529535</link>	
  	<description>sourbrew-streams? You&apos;re barking up the wrong tree! You want to use the DOM (document object model) in Javascript to pick out the parts of the document that you want to pass to php, in GET variables. Read up on the DOM to figure out how to select particular tags that you&apos;re looking for; there are tons of excellent DOM tutorials on the web if you google. javascript in the bookmarklet + DOM -&amp;gt; escape()&apos;ed GET vars in the URL -&amp;gt; PHP form -&amp;gt; form handling script -&amp;gt; database -&amp;gt; website.&lt;br&gt;
&lt;br&gt;
If you google around, you can also find a website that will take your multiline javascript and compress it into a single line with no spaces, suitable for use in a bookmarklet. That way you can write it in a comfortable way in your favorite code editor, and then turn it into a single-line, properly-escaped bookmarklet.</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.33973-529535</guid>
  	<pubDate>Wed, 08 Mar 2006 10:11:45 -0800</pubDate>
  	<dc:creator>evariste</dc:creator>
</item>
<item>
  	<title>By: evariste</title>
  	<link>http://ask.metafilter.com/33973/Bookmarklets-and-PHP-parsing-of-websites#529541</link>	
  	<description>&lt;blockquote&gt;Loads source for passed url&lt;br&gt;
&lt;br&gt;
Determines if it&apos;s YouTube or Google Video&lt;br&gt;
&lt;br&gt;
Parses page to find a specific set of tags&lt;/blockquote&gt;You can do all this in javascript with its regular expressions. Just figure out if the URL (d.location.href in the bookmarklet above) contains youtube, and if it does, use the DOM to locate the embed or object tag, if any, and set a variable to contain the (properly escaped) innerHTML property of the tag&apos;s parent element (innerHTML holds an element&apos;s contents, not the element&apos;s own tag). Then you handle it in PHP as above.</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.33973-529541</guid>
  	<pubDate>Wed, 08 Mar 2006 10:17:51 -0800</pubDate>
  	<dc:creator>evariste</dc:creator>
</item>
<item>
  	<title>By: evariste</title>
  	<link>http://ask.metafilter.com/33973/Bookmarklets-and-PHP-parsing-of-websites#529545</link>	
  	<description>More specifically, if you know that YouTube always gives the element that wraps the embed tag you&apos;re looking for a specific id, you can do this in javascript:&lt;br&gt;
&lt;br&gt;
embtag=escape(document.getElementById(&amp;quot;the_youtube_id&amp;quot;).innerHTML);&lt;br&gt;
&lt;br&gt;
and then pass it in the URL to your php script. Look at YouTube&apos;s source code on a couple of different pages and see if the pickings are as easy as that. I suspect they are.</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.33973-529545</guid>
  	<pubDate>Wed, 08 Mar 2006 10:22:20 -0800</pubDate>
  	<dc:creator>evariste</dc:creator>
</item>
<item>
  	<title>By: evariste</title>
  	<link>http://ask.metafilter.com/33973/Bookmarklets-and-PHP-parsing-of-websites#529549</link>	
  	<description>Anyway, good luck. This should be enough pointers for you to be able to figure out how to do this.</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.33973-529549</guid>
  	<pubDate>Wed, 08 Mar 2006 10:23:56 -0800</pubDate>
  	<dc:creator>evariste</dc:creator>
</item>
<item>
  	<title>By: scottreynen</title>
  	<link>http://ask.metafilter.com/33973/Bookmarklets-and-PHP-parsing-of-websites#529632</link>	
  	<description>My php file receives url&lt;br&gt;
&lt;br&gt;
Loads source for passed url&lt;br&gt;
&lt;br&gt;
&lt;br&gt;
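// fetch only http(s) URLs, so a crafted url can&apos;t make the script read local files&lt;br&gt;
$source = preg_match( &apos;#^https?://#i&apos;, $_GET[&apos;url&apos;] ) ? file_get_contents( $_GET[&apos;url&apos;] ) : false;&lt;br&gt;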
&lt;br&gt;
You should do the parsing server-side. You could do it client-side and that would be faster, but it&apos;s likely not worth it. At some point after you release the bookmarklet, Google Video or YouTube will change its HTML format. Then you&apos;ll need to change your parser. If the parser is on your server, you can change it for everyone all at once. If it&apos;s in the bookmarklet in each individual user&apos;s browser, each user will need to first realize the bookmarklet is broken and then return to your site to get the updated parser.</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.33973-529632</guid>
  	<pubDate>Wed, 08 Mar 2006 11:35:34 -0800</pubDate>
  	<dc:creator>scottreynen</dc:creator>
</item>
<item>
  	<title>By: sourbrew</title>
  	<link>http://ask.metafilter.com/33973/Bookmarklets-and-PHP-parsing-of-websites#529635</link>	
  	<description>Thank you sir, I won&apos;t be tackling this until tomorrow; I wanted to get my ducks in a row. I&apos;ll let you know how it works out.</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.33973-529635</guid>
  	<pubDate>Wed, 08 Mar 2006 11:36:29 -0800</pubDate>
  	<dc:creator>sourbrew</dc:creator>
</item>
<item>
  	<title>By: sourbrew</title>
  	<link>http://ask.metafilter.com/33973/Bookmarklets-and-PHP-parsing-of-websites#529638</link>	
  	<description>scottreynen, yeah, I was already planning for that eventuality.</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.33973-529638</guid>
  	<pubDate>Wed, 08 Mar 2006 11:37:25 -0800</pubDate>
  	<dc:creator>sourbrew</dc:creator>
</item>
<item>
  	<title>By: evariste</title>
  	<link>http://ask.metafilter.com/33973/Bookmarklets-and-PHP-parsing-of-websites#529649</link>	
  	<description>scottreynen-ah, good point about the format changing.</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.33973-529649</guid>
  	<pubDate>Wed, 08 Mar 2006 11:44:07 -0800</pubDate>
  	<dc:creator>evariste</dc:creator>
</item>
<item>
  	<title>By: scottreynen</title>
  	<link>http://ask.metafilter.com/33973/Bookmarklets-and-PHP-parsing-of-websites#529820</link>	
  	<description>You might want to look at &lt;a href=&quot;http://www.metafilter.com/mefi/49857&quot;&gt;this FPP&lt;/a&gt; before you spend too much time on this.</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.33973-529820</guid>
  	<pubDate>Wed, 08 Mar 2006 13:28:01 -0800</pubDate>
  	<dc:creator>scottreynen</dc:creator>
</item>
<item>
  	<title>By: sourbrew</title>
  	<link>http://ask.metafilter.com/33973/Bookmarklets-and-PHP-parsing-of-websites#529862</link>	
  	<description>scottreynen, yeah, I&apos;ve seen that program before. It doesn&apos;t really apply to our site though; we made a conscious decision not to make downloads very easy. With the frequent takedown notices to YouTube it seemed like it had the potential to call forth nasty-grams from the sky.</description>
  	<guid isPermaLink="false">comment:ask.metafilter.com,2008:site.33973-529862</guid>
  	<pubDate>Wed, 08 Mar 2006 13:58:33 -0800</pubDate>
  	<dc:creator>sourbrew</dc:creator>
</item>

    </channel>
</rss>
