down by the river - regex and poker cards
June 29, 2009 1:40 PM
looking for regex expert to help with replacing playing card references
Hello mefite regex experts!
I am trying to process a string to find and replace playing card references. When Texas hold ’em poker players express playing cards they use a number of formats, for example ‘7-7’ to indicate a pair of sevens or ‘7-8-9’ to show the three-card flop. They also show cards with their suits, for example ‘kc-kd-kh’ to show king of clubs, king of diamonds and king of hearts. However, the ‘-’ separators are arbitrary: they could use spaces (‘kc kd kh’), commas, etc. AND the references are variable length, so I need to find ‘A-A’, ‘7-8-9-10-j’ and ‘7c-8c-9c-10c’.
I want to find these strings and replace them with HTML (using ColdFusion), and I am looking for a regexp which will allow me to process things like:
‘xCx’ or ‘xCxCx’ or ‘xCxCxCx’ or ‘xCxCxCxCx’ or ‘xCxCxCxCxCx’ where ‘x’ is any one of a number of special characters (space, comma, !, period, -, etc.) and ‘C’ is one of 2,3,4,5,6,7,8,9,10,j,q,k,a. (casing to be ignored)
I also need to find and replace ‘xCSx’, ‘xCSxCSx’, etc, where ‘S’ is the suit – c, h, d or s.
One further complication is that when I replace the suit (‘S’) string, I need to replace it with the HTML symbol - ♠ etc. and wrap a class around it so ‘ KC ’ would become ‘ [span class=’card’] K [span class=’blacksuit’]♣[/span] [/span] ’.
(I am using [ ] instead of angle brackets to preserve the formatting!).
Is this possible? Is it too much for a single regexp??? Thanks for all help!
You need to write a parser, which is more than can be done with just a regular expression.
posted by mkb at 1:51 PM on June 29, 2009
I have a poker bot (sue me!) that runs against several servers. Like others have said, I think a better approach is just a simple class structure that parses hands, i.e., I have an interface "HandParser", then concrete classes "FullTiltParser", "PartyPokerParser", etc. This also lets you actually build a Hand class, with nice methods etc., instead of just mucking with Strings.
posted by H. Roark at 1:53 PM on June 29, 2009
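As a rough illustration of the class structure described above, here is a minimal Ruby sketch. The names `Card`, `LooseTextParser` and `parse` are hypothetical and are not taken from H. Roark's bot; the parser only understands the loose "rank plus optional suit" tokens from the question, and a site-specific parser could expose the same `parse` method.

```ruby
# Hypothetical value object: a rank ("A", "K", "10", ...) and an optional suit.
Card = Struct.new(:rank, :suit) do
  def suited?
    !suit.nil?
  end

  def to_s
    "#{rank}#{suit}"
  end
end

# Hypothetical parser for free-form text. Other parsers could implement
# the same `parse` method and return the same Card objects.
class LooseTextParser
  TOKEN = /\A(10|[2-9jqka])([cdhs])?\z/i

  # Splits on whitespace, commas and dashes. Returns an array of Cards,
  # or nil if the text is empty or any token is not a card.
  def parse(text)
    tokens = text.strip.split(/[\s,\-]+/)
    return nil if tokens.empty?

    tokens.map do |tok|
      m = TOKEN.match(tok)
      return nil unless m # one bad token rejects the whole hand
      Card.new(m[1].upcase, m[2] && m[2].downcase)
    end
  end
end
```

With real objects instead of strings, later steps (HTML output, hand evaluation) can work from `rank` and `suit` rather than re-parsing text.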
thanks all.
mkb - I have already written a parser in ColdFusion but it takes an age given the sheer number of variables (too long for real time processing).
Let me ask another question: can I do a two-level process, firstly reformat and wrap the found strings in a special wrapper, e.g. replace ' ah ' with ' [^AH^] ', and then use a send regex (or parser) to do the HTML bits?
Essentially can I sterilise the strings first using regexp and then use a parser to add the HTML?
posted by the_very_hungry_caterpillar at 1:56 PM on June 29, 2009
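The two-pass idea in this comment can be sketched in Ruby as follows. This is only an illustration under stated assumptions: the `[^...^]` marker follows the example in the comment, the method names are hypothetical, and `redsuit` is an assumed class name (the question only shows `blacksuit`). Pass 1 is deliberately naive, so it will also mark ordinary words such as "a" or "as"; filtering those out is not handled here.

```ruby
# Pass 1: wrap anything that looks like a card in [^...^] markers.
# A card is a rank (2-10, J, Q, K, A) plus an optional suit letter, and it
# must not touch other letters or digits on either side.
CARD = /(?<![A-Za-z0-9])(10|[2-9jqka])([cdhs])?(?![A-Za-z0-9])/i

def mark_cards(text)
  text.gsub(CARD) { "[^#{$1.upcase}#{$2.to_s.upcase}^]" }
end

# Pass 2: turn each marker into HTML.
SUITS = {
  "C" => ["blacksuit", "&clubs;"],
  "S" => ["blacksuit", "&spades;"],
  "H" => ["redsuit", "&hearts;"], # "redsuit" is an assumed class name
  "D" => ["redsuit", "&diams;"]
}.freeze

MARKER = /\[\^(10|[2-9JQKA])([CDHS])?\^\]/

def markers_to_html(text)
  text.gsub(MARKER) do
    rank, suit = $1, $2
    if suit
      css, entity = SUITS.fetch(suit)
      %(<span class="card">#{rank}<span class="#{css}">#{entity}</span></span>)
    else
      %(<span class="card">#{rank}</span>)
    end
  end
end
```

Calling `markers_to_html(mark_cards(text))` chains the two passes; because the markers are normalised to upper case in pass 1, the second pattern can stay simple.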
should be ' and then use a second regex (or '
posted by the_very_hungry_caterpillar at 1:58 PM on June 29, 2009
H. Roark - thx for the reply. I am not processing online poker games but text messages and twitter feeds so there is no fixed format for the way hands are described, hence the regex.
posted by the_very_hungry_caterpillar at 2:02 PM on June 29, 2009
I'm not sure what CF's regex capabilities are, but in Ruby this wouldn't be too hard. I'd use something like this:
input_string.scan(/((?:10|[2-9jqka])[cdhs]?)[- ,.!]?/i) {|x| puts x.inspect }
That would iterate over all the regex matches in the input string and call the block with the match group. The /i flag ignores case, and 10 is matched explicitly as a two-character rank.
posted by Cogito at 3:22 PM on June 29, 2009
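For readers who want to try the `scan` approach, here is a small self-contained Ruby sketch. It is a variation rather than Cogito's exact pattern: the word boundaries (`\b`), the separate suit group and the method name `find_cards` are assumptions added for illustration.

```ruby
# Rank is 10 or one of 2-9, J, Q, K, A; the suit letter is optional.
# The /i flag makes the match case-insensitive. The word boundaries (\b)
# are an added assumption so that letters inside ordinary words are skipped.
CARD_TOKEN = /\b(10|[2-9jqka])([cdhs])?\b/i

def find_cards(input_string)
  # With two capture groups, scan returns [rank, suit] pairs; suit is nil
  # when the card was written without one.
  input_string.scan(CARD_TOKEN).map { |rank, suit| [rank.upcase, suit&.downcase] }
end
```

Called without a block, `scan` returns an array, which is convenient when the matches need to be validated or counted before anything is replaced.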
The fact that they can be separated by spaces or dashes or commas makes it much harder. I'm guessing this is for a forum or online chat; if that's the case, you could do just dashes:
1) look for all character sequences of the form ([^-]{1,3})(-[^-]{1,3})+ and separate them into their components
2) for each <card> of the <card>-<card>[-<card>] string:
a) If it's a valid card, replace it by the appropriate HTML.
b) If it's not, abort the whole process.
(That way you don't replace "it's a-ok with me" with "it's <some html>-ok with me".)
This simplifies the problem quite a bit. You could then advertise it to your users. That way, if they want the fancy html, they'll use dashes.
posted by Monday, stony Monday at 6:11 PM on June 29, 2009
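The dash-only, all-or-nothing procedure above could look roughly like this in Ruby. It is a sketch with some assumptions: chunks are restricted to letters and digits (tighter than `[^-]`, so trailing punctuation does not spoil a run), `render_card` is a hypothetical placeholder that ignores suit markup, and the method names are illustrative.

```ruby
# Candidate run: two or more dash-separated chunks of 1-3 letters/digits,
# not glued to a longer word or to another dash on either side.
RUN = /(?<![A-Za-z0-9\-])[A-Za-z0-9]{1,3}(?:-[A-Za-z0-9]{1,3})+(?![A-Za-z0-9\-])/
VALID_CARD = /\A(?:10|[2-9jqka])[cdhs]?\z/i

# Placeholder renderer (hypothetical); real output would add the suit markup.
def render_card(card)
  %(<span class="card">#{card.upcase}</span>)
end

def replace_dashed_runs(text)
  text.gsub(RUN) do |run|
    parts = run.split("-")
    if parts.all? { |part| VALID_CARD.match?(part) }
      parts.map { |part| render_card(part) }.join("-")
    else
      run # at least one chunk is not a card: leave the whole run alone
    end
  end
end
```

Because validation happens on the whole run before anything is replaced, "a-ok" is left untouched even though "a" on its own would pass as an ace.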
Oh, you could do.. (ASSuming the largest number of cards is no more than five)
sep ::= space | comma | dash
rawcard ::= cardchar [cardchar] [cardchar]
cardchar ::= 0-9 | j | q | k | a | h | d | c | s (+ uppercase)
cardseq ::= rawcard sep rawcard [sep rawcard ...]
Using something like "([0-9jqkahdcs]{1,3})[ ,-]([0-9jqkahdcs]{1,3})([ ,-]([0-9jqkahdcs]{1,3}))?([ ,-]([0-9jqkahdcs]{1,3}))?([ ,-]([0-9jqkahdcs]{1,3}))?"
(that is, rawcard sep rawcard [sep rawcard] [sep rawcard] [sep rawcard]). Using the function REFind, you now have an array with your cards in positions 1, 2, 4, 6 and 8.
As above, check that all the components represent cards; if so, replace them with the proper html. Lather, rinse, repeat with the rest of the string.
This is an abomination, but CF's Regex functions are very limited.
posted by Monday, stony Monday at 7:14 PM on June 29, 2009
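To make the group numbering in the fixed-length pattern concrete, here is a Ruby sketch that builds the same expression from its parts and pulls out the raw cards. The constant and method names are hypothetical, and this uses Ruby's `match` rather than ColdFusion's REFind, whose return structure is not reproduced here.

```ruby
RAW = "([0-9jqkahdcs]{1,3})"
SEP = "[ ,-]"

# rawcard sep rawcard, then up to three optional "sep rawcard" groups.
CARDSEQ = Regexp.new("#{RAW}#{SEP}#{RAW}" + "(#{SEP}#{RAW})?" * 3, Regexp::IGNORECASE)

def raw_cards(text)
  m = CARDSEQ.match(text)
  return [] unless m

  # The optional outer groups are 3, 5 and 7, so the cards themselves
  # sit in capture groups 1, 2, 4, 6 and 8. Unused slots are nil.
  [1, 2, 4, 6, 8].map { |i| m[i] }.compact
end
```

The chunks returned are only "raw" cards: something like "has" fits the character class, so each one still has to be checked against the list of real cards before any HTML is substituted.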
Wow, I thought I'd found a better way to do this, but it turns out that CF's string functions are seriously crippled (no split()? WTF?!). If it were a personal project of mine, I'd consider switching languages.
posted by Monday, stony Monday at 8:41 PM on June 29, 2009
posted by Dr Dracator at 1:49 PM on June 29, 2009