I want a text tool that can search documents for phrases within X words of a target.
November 18, 2004 7:23 AM
Text search tool: I'm looking for a word processor or text tool that will allow me to search documents for phrases centred about particular words. I want to be able to specify a 'window,' say of 15 words, and the tool will then return *all* the text that occurs within 15 words either side of the search target. Example: if I searched this AskMe page for "fanboy" as a target, and set my window as '6,' I would expect to get - "aren't trying to show off their fanboy Pitchfork-esque indie creds. The less politics" - from alidarbac's question below, as one of the returns. Free/share/commercial-ware all acceptable. Many thanks!
Response by poster: Yes! That's what I'm looking for - but I didn't know it was called "KWIC," which presumably is why it was hard to find things out about it - so thanks, jessamyn!
posted by carter at 8:03 AM on November 18, 2004
You should also try looking for the word "concordance," which should also turn up a lot of scripts and tools used by language geeks (like myself.)
posted by Mo Nickels at 8:19 AM on November 18, 2004
using grep, unix style:
cat $files_to_search | tr '\n' ' ' | egrep -o '([^ ]*[ \t]*){0,6}generation([^ ]*[ \t]*){0,6}'
posted by sfenders at 11:51 AM on November 18, 2004
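For anyone without grep to hand, here is a minimal Python sketch of the same windowed search. It is not what any tool named in this thread does internally; the function name `kwic`, its parameters, and the padding words in the sample string are all made up for illustration. It assumes that "word" simply means a run of non-whitespace characters and that a case-insensitive substring match on the target is good enough.

```python
def kwic(text, target, window=6):
    """Return each hit on `target` with up to `window` words either side.

    Hypothetical helper for illustration only: words are whitespace-separated
    tokens, and matching is a case-insensitive substring test.
    """
    words = text.split()  # newlines count as ordinary whitespace here
    needle = target.lower()
    hits = []
    for i, word in enumerate(words):
        if needle in word.lower():
            start = max(0, i - window)           # clamp at the start of the text
            hits.append(" ".join(words[start:i + window + 1]))
    return hits


# The quoted phrase is from the question above; "who" and "the better"
# are invented padding so that the window has something to cut off.
sample = ("who aren't trying to show off their fanboy Pitchfork-esque "
          "indie creds. The less politics the better")
for line in kwic(sample, "fanboy", window=6):
    print(line)
```

Because every occurrence is reported separately, two hits that sit close together will produce overlapping context lines.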
The grep suggestion is awesome. I am in a sooper hurry, but go to http://slinkages.blogspot.com and scroll down for a link called something like... oh screw it
linux cookbook: analyzing text
Lots of cool stuff, and you can probably install Windows versions of these commands. They work in the OS X terminal.
posted by mecran01 at 3:26 PM on November 18, 2004
I don't know what sort of geek quotient you possess, but this is the sort of thing Perl and regular expressions were made for.
posted by icey at 4:16 PM on November 18, 2004
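To give a rough idea of what the regular-expression route looks like, here is a sketch in Python rather than Perl (the pattern syntax used is common to both). The function name `kwic_regex` and its parameters are hypothetical, and it assumes words are separated by whitespace. Note one deliberate simplification: matches do not overlap, so a second occurrence of the target that falls inside the trailing window of the first is returned as part of that first match, not on its own line.

```python
import re


def kwic_regex(text, target, window=6):
    """Regex-based keyword-in-context search (illustrative sketch only)."""
    # Collapse newlines and repeated whitespace to single spaces first.
    flat = " ".join(text.split())
    # Up to `window` words, the word containing the target, up to `window` words.
    pattern = re.compile(
        r"(?:\S+ ){0,%d}\S*%s\S*(?: \S+){0,%d}"
        % (window, re.escape(target), window),
        re.IGNORECASE,
    )
    return [match.group(0) for match in pattern.finditer(flat)]
```

Calling `kwic_regex(open(path).read(), "fanboy", 6)` on a saved copy of a page (where `path` is whatever file you have) would return a list of context strings, one per non-overlapping match.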
Response by poster: My geek quotient is low; my colleague's however is not. I'm initially hoping to experiment with a few off-the-shelf tools, shifting inputs and outputs between them, in order to work towards a basic proof-of-concept for a particular way of analyzing and parsing spoken and written communication (here, the KWIC and concordance tools will be a very good start). I have a big corpus I can play with.
I want to use the proof-of-concept to start spec'ing out a specific tool for a colleague to develop - who, fortunately, is *very* geeky ;) My relationship with him will go better I think if I can point to existing tools and say, "I like what this does here," or, "I wish this would do this rather than this here." Rather than just saying, "Build me something neat."
Anyway, thanks, y'all! This has been very helpful.
posted by carter at 4:34 PM on November 18, 2004
This thread is closed to new comments.
posted by jessamyn at 7:56 AM on November 18, 2004