Anomalous test -v results with bash associative array
September 6, 2023 9:24 PM Subscribe
Is this a bash bug? I've already posted this same question to Stack Overflow to find out where the quality answers turn up faster.
Ran across this apparent bash anomaly. Test code:
    (
    set -x
    declare -A a=()
    for i in x "'" '"'
    do
        a[$i]=
        test -v a[$i] && true
    done
    : "${!a[@]}"
    )
Output:
    + a=()
    + declare -A a
    + for i in x "'" '"'
    + a[$i]=
    + test -v 'a[x]'
    + true
    + for i in x "'" '"'
    + a[$i]=
    + test -v 'a['\'']'
    + for i in x "'" '"'
    + a[$i]=
    + test -v 'a["]'
    + : \' '"' x
I would have expected true to be executed three times, not just the once. It seems that test -v fails for associative array elements where the index expands into something containing a quote character.
Bash bug, or just me needing better-informed expectations? If the latter, which part of what manual should I have read more carefully?
Response by poster: Your guess is correct. The nub of the thing is why those last two a-index expressions are evaluating to false, which is something I expect can only be explained by somebody who does use bash for scripting and does know it well.
I do use it for scripting and thought I knew it well, which is why I found this particular dark corner of it so surprising when I relied on this construction and it bit me on the arse.
posted by flabdablet at 10:10 PM on September 6
Response by poster: Bash language explanations for those following along at home:
The outer parentheses make bash run everything inside them in a subshell, which means that variable declarations, settings and whatnot between the parens won't have persistent effects inside any shell into which this code gets pasted. Putting the code in parens also means that none of it will run until the closing parenthesis has been entered, which is nice for testing via pasting.
set -x turns on command tracing, which is why this code produces any output at all; every line in the output is just bash showing you the statement it just ran. Tracing stops when the subshell exits at the closing paren.
declare -A a=() creates an associative array named a and empties it.
for i in x "'" '"' do ... done runs everything between do and done three times, first with variable i set to the single-character string x, next with it set to the string ' and last with it set to the string ".
a[$i]= creates an entry in the associative array a indexed by the expansion of the variable i, and sets its value to an empty string. So at the end of the for loop there should be three entries in a, indexed by the strings x ' and ".
test -v a[$i] should succeed whenever the associative array a contains an entry indexed by the expansion of i. Since that test immediately follows the creation of just such an entry, I would have expected it to succeed every time.
true is a command that does nothing at all apart from exiting successfully. I'm using it here, after the short-circuiting logical-AND operator &&, just so it will show up in the command trace if and only if the test before that operator succeeds.
: is a shell builtin command that can be invoked with arbitrary arguments and does nothing with them. I'm using it here purely to make the shell evaluate the construction "${!a[@]}", which expands to a list of all the indexes for which the associative array a has entries.
As is evident from the command trace output, all three assignments did actually create the expected entries inside a. The mystery is why test -v refuses to confirm this when the index in question contains a single-quote or double-quote character.
posted by flabdablet at 10:47 PM on September 6 [1 favorite]
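For anyone who would rather not read a command trace, the walk-through above can be condensed into a sketch that records each verdict in a variable. This is only an illustration, assuming bash 4.3 or later; the names results and entry_count are made up for the example. The test is double-quoted here to rule out filename expansion, which still hands test the already-expanded text, just as the unquoted original does when no file happens to match.

```bash
#!/usr/bin/env bash
# Sketch only: assumes bash 4.3 or later (associative arrays, test -v on array elements).
declare -A a=()
results=()                      # hypothetical name: one verdict per loop pass

for i in x "'" '"'
do
    a[$i]=                      # create the entry with an empty value
    # The shell expands $i first, so test receives text such as a[x] or a['].
    if test -v "a[$i]"
    then
        results+=("$i: found")
    else
        results+=("$i: not found")
    fi
done

entry_count=${#a[@]}            # number of entries actually stored
printf 'entries stored: %d\n' "$entry_count"
printf '%s\n' "${results[@]}"
```

All three entries get stored regardless. On the bash versions discussed in this thread, the two quote-character keys are the ones reported as not found.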
Best answer: I don't know, but this seems to work:
    (
    set -x
    declare -A a=()
    for i in x "'" '"'
    do
        test -v a[\${i}] && echo "this should not happen"
        a[$i]=
        test -v a[\${i}] && echo "all is well"
        test -v a["\'"] && echo "single quote is set"
        test -v 'a[\"]' && echo "double quote is set"
    done
    : "${!a[@]}"
    )
posted by Wobbuffet at 11:50 PM on September 6
I have had nothing but heartbreak with bash associative arrays and test -v, and in general the moment I start to consider using them is the moment I start to rewrite my scripts in python or whatnot. That said, I suspect you've hit a bug or edge case or something having to do with using quote characters as indices specifically. (Can't see anything obvious in the impenetrable prose of the GNU Bash manual though.)
If it was me, I'd maybe try something like
    [[ -v a[$i] ]] && echo "value at $i: ${a[$i]}"
to print your debug output rather than relying on set -x to do it specifically (and set the values to a non-empty string).
posted by whir at 11:51 PM on September 6
Best answer: I don't know if this would be considered a bug in bash itself, but it's definitely bad documentation and confusing behavior. But the good news is that I think you can work around it. Here's my explanation of what's going on:
When you run test -v a[$i], the string a[$i] gets parsed and evaluated like any other argument, and the result is passed to the test builtin. But test -v has weird behavior when used with an array reference: it parses and evaluates the subscript again.
The bash documentation sort of hints at what's going on here:
When using a variable name with a subscript as an argument to a command, such as with unset, without using the word expansion syntax described above, the argument is subject to the shell’s filename expansion. If filename expansion is not desired, the argument should be quoted.
but it's not really complete or accurate. As far as I can tell from reading the code, the subscript, and only the subscript, undergoes a second round of first quote expansion and then variable expansion -- regardless of whether the original string was quoted.
The reason your code isn't working is that the array reference actually seen by test is a['], which isn't syntactically valid because it has an unmatched quote. So one option is to escape the quote properly:
    test -v a[$(printf %q $i)]
And another option is to quote the entire subscripted variable reference, so that the correct string value is used after it gets expanded:
    test -v a['$i']
Both of these seem to work for me on bash-5.2, but I haven't tested them thoroughly.
My goodness, bash is a mess.
posted by teraflop at 12:00 AM on September 7 [3 favorites]
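The second round of evaluation described above can also be seen without any quote characters. The sketch below is an illustration only, assuming bash 4.3 or later with default options; the variable names j and second_pass are invented for the example. The array holds a single entry keyed by x, yet a lookup for the literal key $j reports a hit, because test -v expands the subscript text it is handed.

```bash
#!/usr/bin/env bash
# Sketch only: assumes bash 4.3+ with default options (assoc_expand_once unset).
declare -A a=()
j=x
a[x]=                 # the only entry is the one keyed by x
i='$j'                # the key being asked about is the literal two characters $j

# The shell expands "a[$i]" to the text a[$j]; test -v then expands the subscript again.
if test -v "a[$i]"
then
    second_pass="found"
else
    second_pass="not found"
fi
printf 'lookup of literal key %s: %s\n' "$i" "$second_pass"
printf 'keys actually stored: %s\n' "${!a[*]}"
```

So the double evaluation is not limited to failed lookups: it can also report an entry that was never stored under the name being asked about.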
Response by poster: OK, sussed it.
This works:
    (
    set -x
    declare -A a=()
    a[x]=
    a["'"]=
    a['"']=
    for i in x "'" '"' y
    do
        test -v 'a[$i]' && true
    done
    : "${!a[@]}"
    )
Output:
    + a=()
    + declare -A a
    + a[x]=
    + a["'"]=
    + a['"']=
    + for i in x "'" '"' y
    + test -v 'a[$i]'
    + true
    + for i in x "'" '"' y
    + test -v 'a[$i]'
    + true
    + for i in x "'" '"' y
    + test -v 'a[$i]'
    + true
    + for i in x "'" '"' y
    + test -v 'a[$i]'
    + : \' '"' x
Now, the only test -v that fails is the one for the array element a[y] which was indeed never set.
The unwritten rule appears to be that the name argument supplied to test -v name gets dealt with in exactly the same way as whatever can go on the left hand side of the = in a shell variable assignment as the shell parses command lines. Specifically, any expansion syntax inside any such name will be evaluated before testing whether that name has a referent. Furthermore, to make quote removal operate correctly, such expansions have to be deferred in that way; they can't be allowed to happen before the name gets passed as an argument to test.
Same appears to apply to names supplied as arguments to unset.
Closest I can find to this in the bash manual is this paragraph in the section on arrays:
When using a variable name with a subscript as an argument to a command, such as with unset, without using the word expansion syntax described above, the argument is subject to the shell’s filename expansion. If filename expansion is not desired, the argument should be quoted.
which strikes me as kind of tangential to the issue at hand.
Summarizing: if you're going to use test -v to test for the existence of an entry in an associative array, wrap the name in single quotes, not double quotes, even if it includes expansions of other variables. You want test -v 'a[$i]', not anything that expands the same way as test -v "a[$i]" would. Same applies to unset.
On preview: what Wobbuffet said and teraflop explained. Both of you beat Stack Overflow, btw, on both speed and quality.
posted by flabdablet at 12:16 AM on September 7 [2 favorites]
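The single-quote rule summarized above could be wrapped in a small helper so the quoting only has to be got right once. This is a sketch under assumptions, not a tested library routine: has_key and report are hypothetical names, the array a is a global, and bash 4.3 or later with default options is assumed.

```bash
#!/usr/bin/env bash
# Sketch only: assumes bash 4.3+ with default options; has_key is a hypothetical helper.
declare -A a=()
a[x]=
a["'"]=
a['"']=

has_key() {
    local key=$1
    # Single quotes: the text a[$key] reaches test unexpanded, and test -v
    # performs the one and only expansion of the subscript.
    test -v 'a[$key]'
}

report=""
for i in x "'" '"' y
do
    if has_key "$i"
    then
        report+="set "
    else
        report+="unset "
    fi
done
printf '%s\n' "$report"
```

The helper relies on bash's dynamic scoping: the local variable key is still visible when test -v expands the subscript inside the function.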
Response by poster: As far as I can tell from reading the code, the subscript, and only the subscript, undergoes a second round of first quote expansion and then variable expansion
I couldn't be arsed reading the code, which is why I asked here instead, but I don't think it's just variable expansion; I think it goes the full Monty. Try unset 'a[x$(ls -al >&2)]' if you feel like having a bit of a shudder.
Might be a fun little class of script injection vulnerabilities lurking in there.
posted by flabdablet at 12:26 AM on September 7
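A less alarming way to poke at the injection worry above is to have the payload write to a scratch file rather than list a directory. The sketch below is illustrative only, assuming bash 4.3 or later with default options and an available mktemp; marker, untrusted, quoted_form and expanded_form are invented names. It uses test -v rather than unset, on the assumption that both evaluate the subscript in the same way.

```bash
#!/usr/bin/env bash
# Sketch only: assumes bash 4.3+ and mktemp; all names here are hypothetical.
declare -A a=()
marker=$(mktemp)                               # stays empty unless the payload runs
untrusted='x$(echo injected >"$marker")'       # hostile text arriving as a "key"

# Single-quoted form: test -v expands $untrusted once and treats the result as data.
test -v 'a[$untrusted]'
if [ -s "$marker" ]; then quoted_form="payload ran"; else quoted_form="payload did not run"; fi

# Double-quoted form: the payload text is already in the subscript when test -v evaluates it.
test -v "a[$untrusted]"
if [ -s "$marker" ]; then expanded_form="payload ran"; else expanded_form="payload did not run"; fi

rm -f "$marker"
printf 'single-quoted: %s\n' "$quoted_form"
printf 'double-quoted: %s\n' "$expanded_form"
```

If the subscript really does get the full set of expansions, the double-quoted line is the one to watch, which is one more reason to prefer the single-quoted form whenever the key comes from outside the script.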
So did AskMe beat those punks over at the other place?
This feels like a win to me but I can't be arsed to go dig around there :)
posted by SaltySalticid at 4:48 AM on September 7 [1 favorite]
Response by poster: Yes. Wobbuffet's answer here turned up an hour and a half before the first answer over there, and as of writing this comment there are still none there as good as teraflop's.
posted by flabdablet at 5:13 AM on September 7 [2 favorites]
Might be worth linking to teraflop's answer from the SO thread then :-)
posted by trig at 5:26 AM on September 7 [1 favorite]
Response by poster: Spoke too soon - Stack Overflow has taken the quality lead by providing a link to a BashPitfalls entry that covers this exact thing. Now I have a lovely new site full of dark corners to rootle happily about in!
My goodness, bash is a mess.
You say that like you think it's a bad thing :-)
posted by flabdablet at 7:24 AM on September 7 [3 favorites]
I'm also too lazy, but I wonder if something like "shellcheck" would have flagged this. I think the testing nature of the code might throw it off a bit, though. Linters be weird like that.
My goodness, bash is a mess.
And people fuss about Perl :-)
posted by zengargoyle at 11:46 AM on September 7
a BashPitfalls entry
Hah, I was going to mention looking into [[, but figured you were already further along.
posted by rhizome at 3:55 PM on September 8 [1 favorite]
posted by qxntpqbbbqxl at 9:50 PM on September 6 [1 favorite]