<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
     xmlns:admin="http://webns.net/mvcb/"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
	<channel>
	  <title>Ask MetaFilter questions tagged with url</title>
      <link>http://ask.metafilter.com/tags/url</link>
      <description>Questions tagged with 'url' at Ask MetaFilter.</description>
	  <pubDate>Tue, 15 Dec 2009 13:07:51 -0800</pubDate>
	  <lastBuildDate>Tue, 15 Dec 2009 13:07:51 -0800</lastBuildDate>

      <language>en-us</language>
	  <docs>http://blogs.law.harvard.edu/tech/rss</docs>
	  <ttl>60</ttl>	  
	<item>
	<title>Print IE URLs</title>
	<link>http://ask.metafilter.com/140750/Print%2DIE%2DURLs</link>	
	<description>Printing the URLs of IE Favorites I want to print the names and URLs of a bunch of Internet Explorer Favorites.  If I export them and load the HTML file into IE, I get just the names as links.  If I load the HTML file into Word, the URLs are there, but surrounded by garbage.</description>
	<guid isPermaLink="false">tag:ask.metafilter.com,2009:site.140750</guid>
	<pubDate>Tue, 15 Dec 2009 13:07:51 -0800</pubDate>
	<category>InternetExplorer</category>
	<category>print</category>
	<category>URL</category>
	<dc:creator>KRS</dc:creator>
	</item>
	<item>
	<title>Bamboozled by Google</title>
	<link>http://ask.metafilter.com/137562/Bamboozled%2Dby%2DGoogle</link>	
	<description>Why does Google Alerts add stuff like this to the end of the URLs it sends me every day by email:
&amp;amp;ct=ga&amp;amp;cd=eOGlPN9EEgs&amp;amp;usg=AFNjCNGPR1lLrp7KJVpHt-2HktLYhtBMQg I guess Google has its reasons. So what I really mean is, why does Google do this when most of the time it breaks the link? The result is that, whenever I click on such a link, 90% of the time I get a &quot;page not found&quot; result and I then have to reload the page with the gibberish stripped away. I know that&apos;s not a huge chore, but it still seems daft. Can I put a stop to this? No illumination on wikipedia, from Google, or anywhere else that I could find.</description>
	<guid isPermaLink="false">tag:ask.metafilter.com,2009:site.137562</guid>
	<pubDate>Sun, 08 Nov 2009 11:49:49 -0800</pubDate>
	<category>alerts</category>
	<category>google</category>
	<category>googlealerts</category>
	<category>url</category>
	<dc:creator>londongeezer</dc:creator>
	</item>
	<item>
	<title>Jeez Google pt. 2 - We still want clean URLs</title>
	<link>http://ask.metafilter.com/137350/Jeez%2DGoogle%2Dpt%2D2%2DWe%2Dstill%2Dwant%2Dclean%2DURLs</link>	
	<description>Remove these crappy ultra-long links from my Google Search Results when I right-click! What&apos;s the best way to get rid of these ultra-long Google urls that pop up when you right-click a link in Google search results?  It&apos;s basically a rehash of &lt;a href=&quot;http://ask.metafilter.com/84805/What-happened-to-clean-URLs-Google-Jeez&quot;&gt;this AskMe from 2008&lt;/a&gt;.&lt;br&gt;
&lt;br&gt;
I don&apos;t use greasemonkey.  I had customizegoogle installed but that was abandoned and no longer works - does optimizegoogle do this?  I just want to get my right-click links back.  I tried removing web history, logging in or out doesn&apos;t matter, and deleting cookies, etc - nothing seems to stop it, and I hate it.&lt;br&gt;
&lt;br&gt;
Lil help?!</description>
	<guid isPermaLink="false">tag:ask.metafilter.com,2009:site.137350</guid>
	<pubDate>Thu, 05 Nov 2009 14:11:33 -0800</pubDate>
	<category>ClickTracking</category>
	<category>Google</category>
	<category>Links</category>
	<category>Redirection</category>
	<category>Results</category>
	<category>RightClick</category>
	<category>Search</category>
	<category>URL</category>
	<dc:creator>cashman</dc:creator>
	</item>
	<item>
	<title>How to make my PHP-generated URLs prettier?</title>
	<link>http://ask.metafilter.com/135672/How%2Dto%2Dmake%2Dmy%2DPHPgenerated%2DURLs%2Dprettier</link>	
	<description>I&apos;d like to make more easily understood URL paths for my PHP-generated pages. Instead of using the conventional query string method such as &lt;br&gt;
&lt;br&gt;
mysite.com/page.php?section=2&amp;amp;product_id=1 &lt;br&gt;
&lt;br&gt;
I&apos;d like to be able to set up my URL to read &lt;br&gt;
&lt;br&gt;
mysite.com/tools/widget/ &lt;br&gt;
&lt;br&gt;
It&apos;s more human-readable, and I think better for SEO, but I don&apos;t know how to do it!  Unfortunately, all my Google and PHP.net searches around URL parsing, etc. don&apos;t seem to address how to use alternatives to the ?&amp;amp; method of using GET variables.  Does anyone have any advice or links to articles on this subject?</description>
	<guid isPermaLink="false">tag:ask.metafilter.com,2009:site.135672</guid>
	<pubDate>Fri, 16 Oct 2009 12:55:53 -0800</pubDate>
	<category>php</category>
	<category>url</category>
	<dc:creator>fellorwaspushed</dc:creator>
	</item>
	<item>
	<title>Name Recognition</title>
	<link>http://ask.metafilter.com/134271/Name%2DRecognition</link>	
	<description>One business + two Flickr accounts + lots of Google recognition = one frustrated business owner. Help me decide which Flickr account to use, or if I should delete both and start over. Some time back I opened a Flickr account and chose a personal URL. I also opened a second Flickr account with the name of my business. I&apos;ve been using the personal account with greater frequency than the business account, so the personal account is the one that shows up in Google searches if you search for the business name.&lt;br&gt;
&lt;br&gt;
I&apos;d prefer the Flickr account with the business URL to be the one that shows up in Google searches, but Flickr does not allow changes or transfers of custom URLs. So I&apos;m tempted to delete the personal account and start over, but doing so means losing all the Google recognition. So I&apos;m faced with three options:&lt;br&gt;
&lt;br&gt;
1. Delete the personal account (con: lose the Google recognition)&lt;br&gt;
2. Use the personal account as the official business account (con: personal URL)&lt;br&gt;
3. Delete both accounts and start over&lt;br&gt;
&lt;br&gt;
All advice is welcome.</description>
	<guid isPermaLink="false">tag:ask.metafilter.com,2009:site.134271</guid>
	<pubDate>Wed, 30 Sep 2009 17:45:22 -0800</pubDate>
	<category>flickr</category>
	<category>google</category>
	<category>naming</category>
	<category>SEO</category>
	<category>URL</category>
	<dc:creator>Ann Onymous</dc:creator>
	</item>
	<item>
	<title>Add Me To Your Favourites</title>
	<link>http://ask.metafilter.com/132565/Add%2DMe%2DTo%2DYour%2DFavourites</link>	
	<description>I need to send a link in an email which on the other end will create a favourite in the recipient&apos;s IE.  I&apos;ve googled around this and found lots of resources on adding a favourite from a website but they all seem to assume the recipient is on the right page already.  I&apos;m using Outlook 2007 if it&apos;s relevant.

(Bit dull as a question I know but I&apos;ve been wrestling with this all morning)</description>
	<guid isPermaLink="false">tag:ask.metafilter.com,2009:site.132565</guid>
	<pubDate>Fri, 11 Sep 2009 04:31:27 -0800</pubDate>
	<category>2007</category>
	<category>favourites</category>
	<category>html</category>
	<category>Hyperlink</category>
	<category>outlook</category>
	<category>URL</category>
	<dc:creator>eb98jdb</dc:creator>
	</item>
	<item>
	<title>I Want a Particular Domain</title>
	<link>http://ask.metafilter.com/130596/I%2DWant%2Da%2DParticular%2DDomain</link>	
	<description>What is the best way to go about getting an expired domain? I set a calendar event to notify me of the date a domain whois said would expire, and when I checked back the owner had (not yet) decided to renew. I tried to grab it, but was told it wasn&apos;t available, so I did the research and found out he has &quot;about a month&quot; to change his mind, and then for &quot;about a month&quot; he can still redeem the domain for a higher fee. I did find out that in 75 days it is free to grab. There was some info about &quot;drop&quot; registrations, whereby newly available domains are purchasable for a few hours a day, etc. This is just a long intro to explain I have looked into the process. What I want to know is what way to go about this where I will pay the least amount of money and stand the highest chance of getting it? Any pitfalls I should be aware of?&lt;br&gt;
&lt;br&gt;
If he&apos;s not going to use it, I would like to take control.&lt;br&gt;
&lt;br&gt;
Should I contact him and ask him to renew and transfer it to me? I&apos;d be willing to pay a nominal fee here. Or should I wait and risk someone beating me to the punch during one of the drop periods or when it is totally up for grabs?&lt;br&gt;
&lt;br&gt;
My fear is that if I express interest he&apos;ll ask more than I am willing to pay, but if I don&apos;t I&apos;ll be beat out by someone faster. Also, if I could get it transferred I wouldn&apos;t have to wait the 2.5 months.&lt;br&gt;
&lt;br&gt;
So what&apos;s the best way of doing this? It&apos;s registered through godaddy (if this matters), and he wasn&apos;t using it for any significant web facing services (one short hosted video and lots of google ads). The content never changed for the 6 months I checked in on the site. It&apos;s been expired for four days now.&lt;br&gt;
&lt;br&gt;
What&apos;s your best advice?</description>
	<guid isPermaLink="false">tag:ask.metafilter.com,2009:site.130596</guid>
	<pubDate>Wed, 19 Aug 2009 09:28:13 -0800</pubDate>
	<category>domain</category>
	<category>resolved</category>
	<category>URL</category>
	<dc:creator>cjorgensen</dc:creator>
	</item>
	<item>
	<title>URL WTF</title>
	<link>http://ask.metafilter.com/129885/URL%2DWTF</link>	
	<description>My site is getting a lot of weirdly-formed referring URLs--some with two or three URLs linked with commas--or URLs that do not appear to contain links to my site. What&apos;s going on? Here are two examples of URLs in my hit log--the first is one referrer that shows up in my log as three URLs (including mine) strung together:&lt;br&gt;
&lt;br&gt;
http://www.lilidaviesnolink.co.uk/457/magic-betty-sing-along-6th-june, http://www.phonogramnolink.us/blogs2/dpdc/2009/05/trek_30.html, http://www.mattdidthatnolink.com/weblog&lt;br&gt;
&lt;br&gt;
Another referring URL without a link to my site:&lt;br&gt;
&lt;br&gt;
http://www.mp4dunyasinolink.com/index.php&lt;br&gt;
&lt;br&gt;
(To make the URLs work, you&apos;ll need to remove the &quot;nolink&quot;)&lt;br&gt;
&lt;br&gt;
What&apos;s going on? What are these referrers, and why am I getting them? If this is spam, could someone please explain what they&apos;re doing and how I disable/remove it?</description>
	<guid isPermaLink="false">tag:ask.metafilter.com,2009:site.129885</guid>
	<pubDate>Tue, 11 Aug 2009 16:04:35 -0800</pubDate>
	<category>link</category>
	<category>referrer</category>
	<category>spam</category>
	<category>URL</category>
	<dc:creator>mattdidthat</dc:creator>
	</item>
	<item>
	<title>Thanks for the pageviews, no thanks for the spam.</title>
	<link>http://ask.metafilter.com/129317/Thanks%2Dfor%2Dthe%2Dpageviews%2Dno%2Dthanks%2Dfor%2Dthe%2Dspam</link>	
	<description>I have been getting weird Wordpress referrer spam the last 4 days, but there are no injections or anything of the like on my site. I have a site running wordpress.  No comments or users, I&apos;m using it as a simple CMS.&lt;br&gt;
&lt;br&gt;
I&apos;m using the StatPress plugin to check out who is coming to the site.  This morning, I noticed an abnormally large number of visitors the last few days.&lt;br&gt;
&lt;br&gt;
People seem to be visiting pages like mysite.com/?myfjkfosljfsfjd (NB : not a string I&apos;ve seen, just an example).  When clicked, it will go to my homepage. Checking the source, there is nothing out of the ordinary (no spam links, etc).  If you google that end string by itself, you get one result, to my site, with a summary that lists a whole bunch of viagra type words.&lt;br&gt;
&lt;br&gt;
Any idea what is going on, and how I can stop this?  &lt;br&gt;
&lt;br&gt;
I was running 2.8.2, upgraded to 2.8.3 this morning.&lt;br&gt;
&lt;br&gt;
You can get my details from my userpage if you want specifics.</description>
	<guid isPermaLink="false">tag:ask.metafilter.com,2009:site.129317</guid>
	<pubDate>Wed, 05 Aug 2009 06:26:21 -0800</pubDate>
	<category>blog</category>
	<category>google</category>
	<category>links</category>
	<category>referrer</category>
	<category>resolved</category>
	<category>spam</category>
	<category>url</category>
	<category>wordpress</category>
	<dc:creator>tip120</dc:creator>
	</item>
	<item>
	<title>Only New Entries In Delicious RSS Feeds For Tags</title>
	<link>http://ask.metafilter.com/119785/Only%2DNew%2DEntries%2DIn%2DDelicious%2DRSS%2DFeeds%2DFor%2DTags</link>	
	<description>Can I filter out previously posted URIs from a delicious RSS feed for a specific tag? I&apos;m subscribed to a handful of delicious RSS feeds for certain tags. While the signal to noise ratio isn&apos;t great, I do see about a dozen things on each feed that I would have missed otherwise. However, most of the stuff that I see on a daily basis (over 200 posts) is URIs I know have been posted before. I estimate I could easily cut in half the stuff I don&apos;t want to see if there were a feed that only included URIs that were completely new to delicious. Is there any way to do this? Is there a yahoo pipes for this? Something else I should look into (aside from roll my own aggregator and filter manually)? Thanks!</description>
	<guid isPermaLink="false">tag:ask.metafilter.com,2009:site.119785</guid>
	<pubDate>Fri, 17 Apr 2009 10:38:14 -0800</pubDate>
	<category>aggregate</category>
	<category>delicious</category>
	<category>filter</category>
	<category>internet</category>
	<category>rss</category>
	<category>signaltonoise</category>
	<category>uri</category>
	<category>url</category>
	<dc:creator>Brian Puccio</dc:creator>
	</item>
	<item>
	<title>How do I make Firefox go to a page that I choose if I type a bad URL?</title>
	<link>http://ask.metafilter.com/114662/How%2Ddo%2DI%2Dmake%2DFirefox%2Dgo%2Dto%2Da%2Dpage%2Dthat%2DI%2Dchoose%2Dif%2DI%2Dtype%2Da%2Dbad%2DURL</link>	
	<description>How do I make Firefox go to a page that I choose if I type a bad URL? &lt;a href=&quot;http://ask.metafilter.com/89509/ISP-hijacks-invalid-URLssometimes&quot;&gt;This&lt;/a&gt; askmefi post has some potentially useful info but it does not apply in my case.&lt;br&gt;
&lt;br&gt;
In Firefox 3.06, if I type a bad URL into my address bar, such as &lt;a href=&quot;http://blah/&quot;&gt;http://blah/&lt;/a&gt;, then instead of doing a google search, showing an error page, or showing me a url that I might have been attempting to reach, like blah.com, it takes me to this spam search page: &quot;http://hwerror.hwpub.com/?ck=esb02oob60&amp;amp;et=1&quot;.&lt;br&gt;
&lt;br&gt;
I have modified &quot;keyword.url&quot; in my about:config to do a google search on keywords (different than the default I&apos;m feeling Lucky Search), but it has no effect when I type only a single word into my address bar and hit enter.&lt;br&gt;
&lt;br&gt;
I have tried to disable &quot;keyword.enabled&quot; and that has no effect on this problem either.&lt;br&gt;
&lt;br&gt;
I have run firefox in safe mode and it has the same result.&lt;br&gt;
&lt;br&gt;
I have already switched my DNS server to OpenDNS and confirmed that it is working, so I do not believe that my ISP is re-routing my request.&lt;br&gt;
&lt;br&gt;
It is interesting to me, that if I view source on the page, it shows a frameset, and only within that frameset does it pull in the url for the search page referenced above.&lt;br&gt;
&lt;br&gt;
I believe that somewhere inside the code or configuration files, or possibly an extension, Firefox has been tweaked to do this to me.  Is it possible for me to reconfigure it so that it does not happen?&lt;br&gt;
&lt;br&gt;
I am already aware that I am causing this by typing a malformed URL, and that I can do keyword searches if I type two words into the address bar, and also that I can hit CTRL+ENTER to magically append &quot;.com&quot; to a single word in the address bar.&lt;br&gt;
&lt;br&gt;
I just want to know if there is a way to intercept this redirect and put in my own, to google or whatever I desire.&lt;br&gt;
&lt;br&gt;
Thank you all.</description>
	<guid isPermaLink="false">tag:ask.metafilter.com,2009:site.114662</guid>
	<pubDate>Thu, 19 Feb 2009 15:20:02 -0800</pubDate>
	<category>address</category>
	<category>bar</category>
	<category>bug</category>
	<category>dns</category>
	<category>firefox</category>
	<category>invalid</category>
	<category>redirect</category>
	<category>search</category>
	<category>spam</category>
	<category>url</category>
	<dc:creator>farmersckn</dc:creator>
	</item>
	<item>
	<title>Word to PDF conversion - where&apos;d my links go?</title>
	<link>http://ask.metafilter.com/114177/Word%2Dto%2DPDF%2Dconversion%2Dwhered%2Dmy%2Dlinks%2Dgo</link>	
	<description>How can I keep the embedded links in my Word doc when I convert to PDF? (MacOSX solutions preferred, though not required.) I have a bundle of Word docs with embedded hyperlinks that I need to convert to PDF, and I need to have those links active. In converting to PDF, (save as PDF or print to PDF in Word) the links go away. I have a copy of Adobe Acrobat 5.0 - I know how to physically add all the links, and how to create web links from URLs in text. Unfortunately it&apos;s time consuming to add all the links to all the documents in Acrobat, and the &quot;create web links&quot; batch function in Acrobat 5.0 only works for addresses in http format. &lt;br&gt;
Are there any workarounds? Can I do this in a scalable, batch-friendly way? Software I can use - bonus points for free? I am running MacOSX Leopard, with Word 2008 though I have Word 2004 at home. Windows solutions appreciated too, we have PCs at work. Thanks in advance!</description>
	<guid isPermaLink="false">tag:ask.metafilter.com,2009:site.114177</guid>
	<pubDate>Fri, 13 Feb 2009 16:55:51 -0800</pubDate>
	<category>conversion</category>
	<category>hyperlink</category>
	<category>link</category>
	<category>pdf</category>
	<category>resolved</category>
	<category>url</category>
	<category>word</category>
	<dc:creator>operating thetan</dc:creator>
	</item>
	<item>
	<title>URL rewrite in IIS7: make foo.com/bah/ redirect to example.com/foo/bah/</title>
	<link>http://ask.metafilter.com/113872/URL%2Drewrite%2Din%2DIIS7%2Dmake%2Dfoocombah%2Dredirect%2Dto%2Dexamplecomfoobah</link>	
	<description>Help me with a URL rewrite in IIS7 I want to implement a rewrite in IIS that would allow the following without having to create a new rule for each one.&lt;br&gt;
&lt;br&gt;
www.foo.com/books/ redirects to www.example.com/foo/books/&lt;br&gt;
www.foo.com/films/ redirects to www.example.com/foo/films/&lt;br&gt;
www.foo.com/etc/ redirects to  www.example.com/foo/etc/&lt;br&gt;
www.foo.com/etc/etc/ redirects to  www.example.com/foo/etc/etc/&lt;br&gt;
&lt;br&gt;
I am familiar with the steps in &lt;a href=&quot;http://learn.iis.net/page.aspx/460/using-url-rewrite-module/&quot;&gt;using url rewrite module in IIS7&lt;/a&gt; if that makes passing on instructions a bit easier.</description>
	<guid isPermaLink="false">tag:ask.metafilter.com,2009:site.113872</guid>
	<pubDate>Tue, 10 Feb 2009 12:10:29 -0800</pubDate>
	<category>IIS</category>
	<category>IIS7</category>
	<category>module</category>
	<category>redirect</category>
	<category>regexp</category>
	<category>rewrite</category>
	<category>URL</category>
	<dc:creator>furtive</dc:creator>
	</item>
	<item>
	<title>Pimp my URI</title>
	<link>http://ask.metafilter.com/113273/Pimp%2Dmy%2DURI</link>	
	<description>Flat- or tree- URL scheme for a site redesign? I am in the process of migrating a site (from Typo3 to Django, yay), and need to make a decision about URL schemes.&lt;br&gt;
&lt;br&gt;
The site previously had URLs of the kind: &quot;/category/name_of_entry&quot;. However, the categories have moved around, with old ones being deleted, new ones created, and entries moved as necessary. This will likely continue to be the case. Tim Berners-Lee &lt;a href=&quot;http://www.w3.org/Provider/Style/URI&quot;&gt;says&lt;/a&gt; that you should never, ever change the URL of a page, and I see his point.&lt;br&gt;
&lt;br&gt;
So I&apos;m planning to use a flat URL scheme, sans categories, something like &quot;entries/name_of_entry&quot;. This implies changing the URLs at least this one time. I also seem to remember that the googleplex likes sites with a treelike structure; would throwing everything in a single non-category affect their opinion of us?&lt;br&gt;
&lt;br&gt;
I might use date-based URLs, but I don&apos;t have this information for the few hundred existing entries. We have &apos;editions&apos;, so I could also set up the URLs around them, but it still seems kind of messy, as some articles don&apos;t belong to any edition.&lt;br&gt;
&lt;br&gt;
I&apos;m thinking of setting up a redirect so the old URLs still resolve, so we don&apos;t lose all inbound links.&lt;br&gt;
&lt;br&gt;
What are your thoughts, best practices and advice on this?</description>
	<guid isPermaLink="false">tag:ask.metafilter.com,2009:site.113273</guid>
	<pubDate>Tue, 03 Feb 2009 12:00:06 -0800</pubDate>
	<category>timbernerslee</category>
	<category>uri</category>
	<category>url</category>
	<category>web</category>
	<dc:creator>signal</dc:creator>
	</item>
	<item>
	<title>How can I launch URLs from the OS X dock (with proper icons)?</title>
	<link>http://ask.metafilter.com/113127/How%2Dcan%2DI%2Dlaunch%2DURLs%2Dfrom%2Dthe%2DOS%2DX%2Ddock%2Dwith%2Dproper%2Dicons</link>	
	<description>How can I create an app/shortcut that lives in the OS X dock that has an icon of my choosing and launches in the browser of my choice?  Difficulty: no @ icons, no SSB. It seemed like a simple task when I decided I wanted to launch Google Docs from my OS X dock.  However, after much searching and testing, I have found the following options, none of which work for me:&lt;br&gt;
&lt;br&gt;
1) Drag the URL/favicon onto the dock.  This works great to launch the web page I want from the dock in my default browser (what I wanted).  However, it also has that lame @ icon (which I can&apos;t seem to change to anything but a file type preview icon) AND it is forced to live outside of the app section.&lt;br&gt;
&lt;br&gt;
2) Use a tool like Fluid (fluidapp.com) to create an application to launch the URL.  This works great by using the icon of my choice and being able to be placed in the app section (solves above problems).  But it also forces these links to open in an SSB (site specific browser) that loses all of my cookies, etc.  I&apos;d rather it just open in Firefox.&lt;br&gt;
&lt;br&gt;
So, does anyone know of a way that I can combine these and have a URL launch from the dock:&lt;br&gt;
&lt;br&gt;
- with the icon of my choice&lt;br&gt;
- in Firefox&lt;br&gt;
- from the app section of the dock (optional)&lt;br&gt;
&lt;br&gt;
???&lt;br&gt;
&lt;br&gt;
Thanks,&lt;br&gt;
Seth</description>
	<guid isPermaLink="false">tag:ask.metafilter.com,2009:site.113127</guid>
	<pubDate>Sun, 01 Feb 2009 14:34:09 -0800</pubDate>
	<category>bookmark</category>
	<category>dock</category>
	<category>icon</category>
	<category>launch</category>
	<category>link</category>
	<category>osx</category>
	<category>ssb</category>
	<category>url</category>
	<dc:creator>SethLeonard</dc:creator>
	</item>
	<item>
	<title>Tips as to UNIX Shell Script to Programmatically Save a Webpage as a Text File</title>
	<link>http://ask.metafilter.com/113010/Tips%2Das%2Dto%2DUNIX%2DShell%2DScript%2Dto%2DProgrammatically%2DSave%2Da%2DWebpage%2Das%2Da%2DText%2DFile</link>	
	<description>I&apos;m writing a shell script to take a webpage and convert it into a text file.  I&apos;d appreciate tips as to how to store URLs, save the file with the URL&apos;s title as its name, and also just general tips as to how to improve the script and/or achieve the process better.  Specific questions inside. I&apos;d like to write a shell script which converts a webpage into a text file.  After a lot of tinkering with various note-taking applications, Firefox extensions, and so on, I&apos;ve found that the best tool for me is just plain good old-fashioned text files.  However, I&apos;d love it if I could automate the process a little bit more, and so I&apos;m writing a shell script to get a webpage into a text-form equivalent.&lt;br&gt;
&lt;br&gt;
Right now, I&apos;ve got:&lt;br&gt;
&lt;br&gt;
&lt;tt&gt;links -dump -width 512 &quot;$1&quot; | cut -c 4- &amp;gt; /tmp/temp.file&lt;br&gt;
lynx -listonly -dump &quot;$1&quot; | sed &apos;1,3d&apos; | cut -c 7- &amp;gt;&amp;gt; /tmp/temp.file&lt;br&gt;
edit -b /tmp/temp.file&lt;/tt&gt;&lt;br&gt;
&lt;br&gt;
In this example, &lt;tt&gt;$1&lt;/tt&gt; is a Web address.&lt;br&gt;
&lt;br&gt;
What this does is:&lt;ol&gt;&lt;li&gt;Uses &lt;tt&gt;links&lt;/tt&gt; to save the text of the page.  I use this instead of &lt;tt&gt;lynx&lt;/tt&gt; because the &quot;&lt;tt&gt;-width 512&lt;/tt&gt;&quot; lets it handle it without inappropriate line breaks, and &lt;tt&gt;links&lt;/tt&gt; seems to handle punctuation spacing better than &lt;tt&gt;lynx&lt;/tt&gt;.  (The &quot;&lt;tt&gt;cut&lt;/tt&gt;&quot; removes the extra lefthand margin.)&lt;/li&gt;&lt;li&gt;Uses &lt;tt&gt;lynx&lt;/tt&gt; to generate the list of links that are on that page, removing the &quot;References&quot; header, margin, and numbering.  Links doesn&apos;t seem to have any way of recording the URLs when generating a text copy.  It appends that to the work in progress.&lt;/li&gt;&lt;li&gt;Sends this to TextWrangler to open up in the background.&lt;/li&gt;&lt;/ol&gt;I&apos;m seeking the community&apos;s advice on this on four points:&lt;ol&gt;&lt;li&gt;The way I&apos;ve got it working now is okay, but, ideally, I&apos;d like to handle URLs in the way that Mefi&apos;s print stylesheet handles it &amp;mdash; the URL appearing right after the link text.  So in a webpage converted into a text file, instead of it being &quot;&lt;tt&gt;Google&lt;/tt&gt;&quot;, it&apos;d be &quot;&lt;tt&gt;Google [http://www.google.com]&lt;/tt&gt;&quot;.  I&apos;m aware that &lt;tt&gt;lynx&lt;/tt&gt; lets you do footnotes (&quot;&lt;tt&gt;[1]Google&lt;/tt&gt;&quot; and later &quot;&lt;tt&gt;1. 
http://www.google.com&lt;/tt&gt;&quot;), but &lt;tt&gt;lynx&lt;/tt&gt;&apos;s handling of line breaks and spacing isn&apos;t great.&lt;/li&gt;&lt;li&gt;I&apos;d then ideally like to have this script save the results automatically to a text file on my Desktop with the URL&apos;s &lt;tt&gt;TITLE&lt;/tt&gt; attribute as the name of the file.&lt;/li&gt;&lt;li&gt;I&apos;m wondering if, given the format, any odd punctuation in the URL could screw up the process.&lt;/li&gt;&lt;li&gt;Also, I imagine this might be an enjoyable script for others &amp;mdash; and if so, any other modifications to the script that would improve the overall process and/or end goal &amp;mdash; and/or any utilities that do this process better than what I&apos;m hacking up &amp;mdash; would be appreciated.&lt;/li&gt;&lt;/ol&gt;</description>
	<guid isPermaLink="false">tag:ask.metafilter.com,2009:site.113010</guid>
	<pubDate>Fri, 30 Jan 2009 16:53:03 -0800</pubDate>
	<category>ascii</category>
	<category>html</category>
	<category>links</category>
	<category>lynx</category>
	<category>resolved</category>
	<category>sed</category>
	<category>store</category>
	<category>text</category>
	<category>title</category>
	<category>url</category>
	<dc:creator>WCityMike</dc:creator>
	</item>
	<item>
	<title>How to see all top ranked movies on IMDb?</title>
	<link>http://ask.metafilter.com/108718/How%2Dto%2Dsee%2Dall%2Dtop%2Dranked%2Dmovies%2Don%2DIMDb</link>	
	<description>How can I see the top-ranked IMDb movies beyond the top 250? I&apos;d like to see movies ranked in order beyond the top 250 here:&lt;br&gt;
&lt;br&gt;
&lt;a href=&quot;http://www.imdb.com/chart/top&quot;&gt;http://www.imdb.com/chart/top&lt;/a&gt;&lt;br&gt;
&lt;br&gt;
In other words, how do I see the continuation of this list, i.e. #251 and lower?</description>
	<guid isPermaLink="false">tag:ask.metafilter.com,2008:site.108718</guid>
	<pubDate>Mon, 08 Dec 2008 13:23:40 -0800</pubDate>
	<category>film</category>
	<category>hacking</category>
	<category>imdb</category>
	<category>movies</category>
	<category>ranked</category>
	<category>ranking</category>
	<category>top250</category>
	<category>url</category>
	<dc:creator>wastelands</dc:creator>
	</item>
	<item>
	<title>URL repeating, repeating URL</title>
	<link>http://ask.metafilter.com/108373/URL%2Drepeating%2Drepeating%2DURL</link>	
	<description>Why is the url on my blog showing up twice in the address bar? I have a wordpress blog hosted on my own domain as &quot;www.domain.com/blog.&quot; &lt;br&gt;
When you enter that url it will load and then the url in the address bar will change to &quot;http://domain.com/blog#http://www.domain.com/blog&quot;&lt;br&gt;
&lt;br&gt;
It does this for sub pages of the blog too, for example,&lt;br&gt;
&lt;br&gt;
http://www.domain.com/blog/?p=123#http://www.domain.com/blog/?p=123&lt;br&gt;
&lt;br&gt;
So my question is why is the url repeating itself in the address bar? I checked the wordpress settings and I don&apos;t see any way to set it this way. The other pages not on my blog don&apos;t do this; domain.com/contact.html just stays the same.&lt;br&gt;
&lt;br&gt;
Thanks and I hope this isn&apos;t too confusing.</description>
	<guid isPermaLink="false">tag:ask.metafilter.com,2008:site.108373</guid>
	<pubDate>Wed, 03 Dec 2008 22:48:20 -0800</pubDate>
	<category>repeatingURL</category>
	<category>resolved</category>
	<category>URL</category>
	<category>Wordpress</category>
	<dc:creator>lilkeith07</dc:creator>
	</item>
	<item>
	<title>What are best practices for changing the directory of a Wordpress blog?</title>
	<link>http://ask.metafilter.com/107977/What%2Dare%2Dbest%2Dpractices%2Dfor%2Dchanging%2Dthe%2Ddirectory%2Dof%2Da%2DWordpress%2Dblog</link>	
	<description>I want to move my Wordpress blog from a subdirectory on my domain to the root directory -- i.e., I want the URL of my posts to be changed from http://www.domain.com/blog/post-permalink to http://www.domain.com/permalink . What&apos;s the best way to ensure that external links don&apos;t die? Is there a way to make sure that individual posts will retain their Google PageRank? I know that some variation on a RewriteRule in the site&apos;s .htaccess file is what I&apos;m looking for, but I&apos;m unclear on the specifics. What are best practices for this sort of transition?</description>
	<guid isPermaLink="false">tag:ask.metafilter.com,2008:site.107977</guid>
	<pubDate>Sat, 29 Nov 2008 13:06:13 -0800</pubDate>
	<category>blog</category>
	<category>change</category>
	<category>htaccess</category>
	<category>php</category>
	<category>seo</category>
	<category>url</category>
	<category>wordpress</category>
	<dc:creator>tweebiscuit</dc:creator>
	</item>
	<item>
	<title>Make my urls purty!</title>
	<link>http://ask.metafilter.com/107726/Make%2Dmy%2Durls%2Dpurty</link>	
	<description>What&apos;s the best way for me to create human-readable urls for a Wordpress site? So I&apos;ve taken a job converting a very static site to one that will use Wordpress as a CMS. A couple of key points up front:&lt;br&gt;
&lt;br&gt;
1) We are 100% committed to using Wordpress. Thanks for respecting this.&lt;br&gt;
2) I will NOT be making use of any wordpress theme or allowing users to go to any Wordpress pages at all. We are using it as a CMS on the backend, and on the frontend I will write custom php pages and queries to display what I want to display.&lt;br&gt;
&lt;br&gt;
The site as it stands now has 1000s of articles in totally flat html pages. We want to get away from that for obvious reasons. My inclination is to create one page to use over and over again, so where the old url would&apos;ve been like:&lt;br&gt;
&lt;br&gt;
www.thesite.net/article_name.html&lt;br&gt;
&lt;br&gt;
the new url would look like:&lt;br&gt;
&lt;br&gt;
www.thesite.net/article.php?id=124&lt;br&gt;
&lt;br&gt;
two issues with this:&lt;br&gt;
1) the site is quite popular, so a lot of people have probably bookmarked old articles and I don&apos;t want those links to die when we remove all the flat html pages.&lt;br&gt;
&lt;br&gt;
2) I would prefer people to see human-readable urls rather than ugly querystrings. &lt;br&gt;
&lt;br&gt;
My first instinct would be to create a big redirect file (301), sending every old article url to the new format. This addresses point one but not point two, since after the redirect the user would still see the ugly url, right? (if not, please enlighten me!)&lt;br&gt;
&lt;br&gt;
My other thought is to use the &quot;ugly&quot; article.php as an include inside a wrapper, so create &quot;article_name.php&quot; which contains nothing but a variable for the article id and a copy of the include which takes in that variable. Problems with this are a) we still clutter up our server with 1000s of files, and b) I have non-tech people messing with code and ftp, which I&apos;d rather not.&lt;br&gt;
&lt;br&gt;
So, what is the solution? How does mefi get those awesome clean urls like &quot;http://ask.metafilter.com/107722/Batch-Adding-Text-File-Name-to-JPEG-Images&quot;?&lt;br&gt;
&lt;br&gt;
I&apos;m guessing &quot;107722&quot; and &quot;Batch-Adding-Text-File-Name-to-JPEG-Images&quot; are some sort of aliases and not literal folders on the server. Is this something automated that is beyond the power of Wordpress? I hope not.&lt;br&gt;
&lt;br&gt;
thanks!</description>
	<guid isPermaLink="false">tag:ask.metafilter.com,2008:site.107726</guid>
	<pubDate>Tue, 25 Nov 2008 16:01:08 -0800</pubDate>
	<category>cms</category>
	<category>development</category>
	<category>human</category>
	<category>php</category>
	<category>press</category>
	<category>readable</category>
	<category>redirect</category>
	<category>resolved</category>
	<category>url</category>
	<category>urls</category>
	<category>web</category>
	<category>word</category>
	<category>wordpress</category>
	<dc:creator>drjimmy11</dc:creator>
	</item>
	<item>
	<title>Help needed with URL/PHP issues on Apache/MySQL server</title>
	<link>http://ask.metafilter.com/102315/Help%2Dneeded%2Dwith%2DURLPHP%2Dissues%2Don%2DApacheMySQL%2Dserver</link>	
	<description>I am having trouble with URLs 404ing on an Apache/MySQL server for a website using Wordpress, with some custom add-ons to use Wordpress as a CMS. I have a website which I have built for a client using Wordpress, on an Apache server with MySQL as the database. Their blog is available on http://www.theirdomain.com/blog/ and works fine. &lt;br&gt;
&lt;br&gt;
I&apos;ve created several index.php files on the main site, and in subfolders, using the Wordpress header and some custom PHP code to pull their page content from the WP database (I&apos;m using Wordpress pages as data for a CMS). The URLs for the pages are in the format http://www.theirdomain.com/services and http://www.theirdomain.com/aboutus/&lt;br&gt;
&lt;br&gt;
Unfortunately, I am seeing several 404 errors a day where the URLs are not resolving properly (i.e. you type in http://www.theirdomain.com/aboutus/ and it 404s). I can get round it by typing the filename index.php at the end, but I want to keep the URLs &quot;clean&quot;.&lt;br&gt;
&lt;br&gt;
The problem mainly occurs in IE7 (no surprise), but has occurred to several users - my clients and clients of theirs, and they are understandably miffed.&lt;br&gt;
&lt;br&gt;
Can someone help point me towards any possible solutions?</description>
	<guid isPermaLink="false">tag:ask.metafilter.com,2008:site.102315</guid>
	<pubDate>Mon, 22 Sep 2008 04:07:40 -0800</pubDate>
	<category>apache</category>
	<category>issues</category>
	<category>mysql</category>
	<category>url</category>
	<dc:creator>Scramblejam</dc:creator>
	</item>
	<item>
	<title>What&apos;s up with my website?</title>
	<link>http://ask.metafilter.com/101467/Whats%2Dup%2Dwith%2Dmy%2Dwebsite</link>	
	<description>Why does my web page work with a www prefix and not so much without? I have a domain and hosting account with GoDaddy. My &quot;site&quot; currently consists of only one page with my Google calendar displayed. If I direct a browser to www.myaddress.info, it works fine; if I go to myaddress.info, sometimes it works, but sometimes I get a partial - just the frame from around the calendar and my name. It doesn&apos;t seem to be consistent, browser-specific, or have to do with changes to my GCal.&lt;br&gt;
&lt;br&gt;
Any thoughts? Anything I can do, or is it a fluke on the hosting side - slow/bad DNS propagation or some such? Should I fiddle with DNS redirection via my GoDaddy account to simply redirect the sometimes-working address to the always-working one?</description>
	<guid isPermaLink="false">tag:ask.metafilter.com,2008:site.101467</guid>
	<pubDate>Thu, 11 Sep 2008 11:09:06 -0800</pubDate>
	<category>calendar</category>
	<category>domain</category>
	<category>gcal</category>
	<category>godaddy</category>
	<category>google</category>
	<category>hosting</category>
	<category>url</category>
	<category>web</category>
	<category>website</category>
	<dc:creator>attercoppe</dc:creator>
	</item>
	<item>
	<title>Why is South Africa abbreviated as ZA?</title>
	<link>http://ask.metafilter.com/101331/Why%2Dis%2DSouth%2DAfrica%2Dabbreviated%2Das%2DZA</link>	
	<description>Why is South Africa abbreviated as ZA? I just got back from South Africa, where most of the URLs end with .co.za. I&apos;ve also seen the country abbreviated as ZA. Is there a reason why ZA is used instead of SA? None of the official names for South Africa involve a Z and the domain .sa doesn&apos;t seem to be used by any other country, as far as I can tell. &lt;br&gt;
&lt;br&gt;
The only answer I got so far was something along the lines of &quot;they ran out of letters by the time they got to the bottom of the world.&quot;</description>
	<guid isPermaLink="false">tag:ask.metafilter.com,2008:site.101331</guid>
	<pubDate>Tue, 09 Sep 2008 20:30:31 -0800</pubDate>
	<category>abbreviation</category>
	<category>africa</category>
	<category>south</category>
	<category>URL</category>
	<dc:creator>theseampsgoto11</dc:creator>
	</item>
	<item>
	<title>How do I extract the URLs from a web page?</title>
	<link>http://ask.metafilter.com/97935/How%2Ddo%2DI%2Dextract%2Dthe%2DURLs%2Dfrom%2Da%2Dweb%2Dpage</link>	
	<description>What&apos;s the fastest and simplest way of extracting the URLs from an html file? Input:  Any html page.&lt;br&gt;
Output:  A .txt file with the list of all the URLs in the page.&lt;br&gt;
&lt;br&gt;
Is there freeware that does this?&lt;br&gt;
How about a macro of some kind?&lt;br&gt;
&lt;br&gt;
A script would work but I don&apos;t know any of the script languages that run on PCs.  Wouldn&apos;t mind learning but only as a last resort.</description>
	<guid isPermaLink="false">tag:ask.metafilter.com,2008:site.97935</guid>
	<pubDate>Wed, 30 Jul 2008 17:47:13 -0800</pubDate>
	<category>extraction</category>
	<category>script</category>
	<category>text</category>
	<category>url</category>
	<dc:creator>storybored</dc:creator>
	</item>
	<item>
	<title>Regex woes.</title>
	<link>http://ask.metafilter.com/93745/Regex%2Dwoes</link>	
	<description>Not quite getting how mod_rewrite regex works when flattening urls with multiple variables. Hello hello,&lt;br&gt;
&lt;br&gt;
Ok, so I&apos;m working on another website but I have run into what I&apos;m sure is a pretty basic problem that I can&apos;t seem to wrap my head around. I use mod_rewrite pretty frequently to flatten the most basic types of dynamic urls, those with only one variable. But now I need to figure out how to configure my htaccess to handle urls that always have one variable, but sometimes also have 2-3.&lt;br&gt;
&lt;br&gt;
Now if I knew the same number of variables would be present all the time I think I could handle it, but when there is a variable number of variables I just can&apos;t seem to figure out what I&apos;m doing.&lt;br&gt;
&lt;br&gt;
Here is an example of what I do know how to do. Let&apos;s say I want to change the url: &lt;br&gt;
&lt;br&gt;
   http://mywebsite.com/profile/jeremy/&lt;br&gt;
&lt;br&gt;
into:&lt;br&gt;
&lt;br&gt;
   http://mywebsite.com/profile.php?name=jeremy&lt;br&gt;
&lt;br&gt;
I&apos;d use:&lt;br&gt;
&lt;br&gt;
   ReWriteRule ^profile/([A-Za-z]+)/$ /profile.php?name=$1&lt;br&gt;
&lt;br&gt;
But sometimes I also want to add extra variables, changing a url like so:&lt;br&gt;
&lt;br&gt;
   http://mywebsite.com/profile/jeremy/action/sort/order/desc/&lt;br&gt;
&lt;br&gt;
into:&lt;br&gt;
&lt;br&gt;
   http://mywebsite.com/profile.php?name=jeremy&amp;amp;action=sort&amp;amp;order=desc&lt;br&gt;
&lt;br&gt;
Then I just can&apos;t seem to wrap my head around it. Especially if, depending on circumstances, I might have urls like so, where the 2nd variable in the previous example is missing but the third is still around:&lt;br&gt;
&lt;br&gt;
   http://mywebsite.com/profile/jeremy/order/desc/&lt;br&gt;
  &lt;br&gt;
I&apos;ve Googled around but it seems most websites touching on the subject are either too simple (and just give examples with single variables) or are too complicated and assume I already have a base level of regex knowledge which I sadly lack. &lt;br&gt;
&lt;br&gt;
So would any kindly Mefite want to give me a walk through on what exactly I should be trying to do (and most importantly why, so that I can avoid just rote copy/pasting and instead be able to solve these kinds of problems myself in the future =)&lt;br&gt;
&lt;br&gt;
Thanks much!&lt;br&gt;
Jeremy</description>
	<guid isPermaLink="false">tag:ask.metafilter.com,2008:site.93745</guid>
	<pubDate>Tue, 10 Jun 2008 17:27:29 -0800</pubDate>
	<category>htaccess</category>
	<category>regex</category>
	<category>url</category>
	<dc:creator>Jezztek</dc:creator>
	</item>
	
	</channel>
</rss>

