I am doing a security test on a system that has an embedded TCL interpreter. The system receives input from the Internet (HTTP), parses it and passes it to customisable TCL scripts. During a fuzzing test (sending binary garbage in HTTP headers) I noticed the following errors in the log:
TCL error: list element in quotes followed by "{}x" instead of space while executing "foreach header [ XXXXX ] { }"
or
TCL error: unmatched open quote in list while executing "foreach header [ XXXXX ] {}"
Here XXXXX is a command returning an array of HTTP headers, as parsed by the system. Sorry for obfuscating the real command, I hope you understand I don't want to make too many details public before the vendor is informed about the issue (if it turns out to be an issue).
TCL code producing the error is very simple:
foreach header [ XXXXX ] { }
As far as I can tell, HTTP parsing is done outside of TCL and parsed values made accessible to TCL via custom commands (possibly implemented as TCL extension).
So my questions are:
Are these errors tell-tale signs of security problems with the system, such as insufficient user input validation?
If yes, can this condition be exploited to execute arbitrary TCL statements by sending the system a specially crafted request, a kind of code injection attack?
Is there any "Secure TCL coding practices" document? I could not find one.
You asked this on comp.lang.tcl where I replied with:
1) Are these errors tell-tale signs of security problems with the
system, such as insufficient user input validation?
They're an indication of problems in the parsing code. I'd guess that
the code is assuming that a header is a well-formed
Tcl list, which you've found to be wholly unsafe. One way to sanitize is to
use something like this:
set listOfWords [regexp -all -inline {\S+} $someString]
The resulting collection of words is guaranteed to be a well-formed
list, for an arbitrary input string.
2) If yes, can this condition be exploited to execute arbitrary TCL
statements by sending the system specially crafted request, a kind of
http://en.wikipedia.org/wiki/Code_injection attack?
Probably not, not unless you then treat that list as code.
3) Are there any "Secure TCL coding practices" document? Any other
source of information on how to safely handle untrusted data?
The simplest method is to do the parsing in a Safe Interpreter:
interp create -safe parsingInterp
parsingInterp eval { make the procedures }
parsingInterp eval [list doTheParse $stringToParse]
Note that we also guarantee that constructed lists (e.g., those out of
list, and many other commands besides) are eval-safe. That is:
eval [list $a $b $c]
is exactly the same as:
$a $b $c
This is true whatever is in those variables.
General comment: any new answer which gives a new and useful insight into this question will be rewarded with a bonus.
The Bash reference manual mentions that Bash supports the
following for-loop constructs:
for name [ [in [words ...] ] ; ] do commands; done
for (( expr1 ; expr2 ; expr3 )) ; do commands ; done
Surprisingly, the following for-loop constructs are also valid:
for i in 1 2 3; { echo $i; }
for ((i=1;i<=3;++i)); { echo $i; }
These unusual constructs are not documented at all. Neither the Bash
manual, the Bash man-pages nor The Linux Documentation
Project make any mention of these constructs.
When investigating the language grammar one can see that using
open-and-close braces ({ commands; }) as an alternative to do commands; done is a valid construct that is implemented for both
for-loops and select statements and dates back to Bash-1.14.7
[1].
The other two loop-constructs:
until test-commands; do consequent-commands; done
while test-commands; do consequent-commands; done
do not have this alternative form.
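Both observations can be checked by running the forms under bash explicitly. The following is a minimal sketch, assuming a bash binary is available on the PATH; the variable names are only illustrative.

```shell
# Documented form of the for loop.
documented=$(bash -c 'for i in 1 2 3; do echo "$i"; done')

# Undocumented brace form; it should print the same three lines.
braced=$(bash -c 'for i in 1 2 3; { echo "$i"; }')

# The same substitution on a while loop leaves bash waiting for "do",
# so the command is rejected with a syntax error.
if bash -c 'while true; { break; }' 2>/dev/null; then
    while_braces=accepted
else
    while_braces=rejected
fi

echo "for loops agree: $([ "$documented" = "$braced" ] && echo yes || echo no)"
echo "braced while: $while_braces"
```

The inner commands are passed to bash -c so that the check does not depend on which shell runs the outer script.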
Since a lot of shell-languages are related, one can find that these
constructs are also defined there and mildly documented. The KSH manual mentions:
For historical reasons, open and close braces may be used instead of do and done e.g.
for i; { echo $i; }
while ZSH implements and documents similar alternatives for the other loop-constructs, but with limitations. It states:
For the if, while and until commands, in both these cases the
test part of the loop must also be suitably delimited, such as by
[[ ... ]] or (( ... )), else the end of the test will not be recognized.
Question: What is the origin of this construct and why is
this not propagated to the other loop-constructs?
Update 1: There are some very useful and educational comments below
this post pointing out that this is an undocumented Bourne Shell feature which seems to be the result of a C-vs-sh language battle in the early days.
Update 2: When I asked the GNU Bash mailing list "Why is this language feature not documented?", I received the following answer from Chet Ramey (current lead developer of GNU Bash):
It's never been documented. The reason bash supports it (undocumented) is
because it was an undocumented Bourne shell feature that we implemented
for compatibility. At the time, 30+ years ago, there were scripts that used
it. I hope those scripts have gone into the dustbin of history, but who
knows how many are using this construct now.
I'm going to leave it undocumented; people should not be using it anyway.
Related questions/answers:
A bash loop with braces?
Hidden features of Bash (this answer)
[U&L] What is the purpose of the “do” keyword in Bash for loops?
Footnotes: [1] I did not find earlier versions, but I do believe the construct predates this release.
[W]hy is this not propagated to the other loop-constructs?
Braced forms of while and until commands would be syntactically ambiguous because you can't separate test-commands from consequent-commands without having a distinctive delimiter between them as they are both defined by POSIX to be compound lists.
For example, a shell that supports such constructs can choose either one of the brace groups in the command below as consequent-commands and either way it would be a reasonable choice.
while true; { false; }; { break; }
Because of its ambiguous form, this command can be translated to either of the below; neither is a more accurate translation than the other, and they do completely different things.
while true; do
false
done
break
while true; { false; }; do
break
done
The for command is immune to this ambiguity because its first part—a variable name optionally followed by in and a list of words, or a special form of the (( compound command—can easily be distinguished from the brace group that forms its second part.
Given that we already have a consistent syntax for while and until commands, I don't really see any point in propagating this alternate form to them.
Wrt its origin, see:
Characteristical common properties of the traditional Bourne shells,
Stephen Bourne's talk at BSDCon,
Unix v7 source code, sh/cmd.c.
In PowerShell
$a = 4;
$a
will write 4 to the output stream.
Write-Output $a
also writes 4 to the output stream.
Can you explain which operation is better performance-wise if all I want to do is write variables to the output stream?
If you just want to see what's in your variable, you should use a dollar sign with the name of your variable. This is most likely the fastest method to display the value of the variable.
UPDATE
Short answer: You should use just $a in order to write data in the output stream. It's the fastest way to do that.
Long answer with explanation:
If you write something like $a, then it goes directly to the output stream of the current PowerShell process, in other words, it goes to the output pipeline. This is because in PowerShell you have in general only three types of possible statements:
1. Definition statement: They are everything that you can define in your code for further use like functions or classes.
2. Assignment statement: You assign a value to a variable.
3. Putting-in-the-pipeline statement: everything else except 1 and 2.
Example for definition statements:
Function foo($bar)
{
#do stuff here
}
Example for assignment statements:
$foo = 'bar'
Example for putting-in-the-pipeline statement:
It is exactly what you asked! If you write $a, then it will be written to the output pipeline. That’s it.
Now about Write-Output. According to this code from the official PowerShell repository, it's just a wrapper around WriteObject, a method of the underlying Cmdlet class. Therefore, invoking Write-Output costs more time than invoking WriteObject.
As Lee_Dailey mentioned in the first comment to your question, there is a wonderful article, Let’s Kill Write-Output, written by Mark Kraus on his blog. At first sight it may seem that it doesn't have much to do with your question, but the article explains what happens to objects in your script when they are not wrapped in "printing functions" like Write-Output. There you can also find examples of how to avoid Write-Output in situations where doing so improves performance.
I also recommend reading these two articles from Microsoft's DevBlogs:
Understanding Streams, Redirection, and Write-Host in PowerShell
Weekend Scripter: Welcome to the PowerShell Information Stream
They will definitely help to understand how streams are implemented and how they work in PowerShell.
If I have this command:
do_something
It requires two arguments, argument_1 and argument_2:
argument_1 can be anything
argument_2 can only be true or false
An optional third argument, argument_3, can be added.
What is the standard to describe a script calling template?
I remember reading somewhere about the different meanings of <>, [] and {}. For example:
<> mean required
[] mean non-required
{} contains a list of possible values
So for my example it will be
do_something <argument_1> {true,false} [argument_3]
But I can't find any place that confirms this syntax.
I would write the following:
do_something argument_1 {true|false} [argument_3]
I think mandatory arguments don't need any kind of brackets; it looks clearer that way. Furthermore, the pipe feels more standard for representing an "or".
I am not aware of such a standard, but you can have a look at Google recommendations or docopt for example. Also, here is a POSIX document which may interest you.
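To make the synopsis concrete, here is a minimal sketch of how a script could enforce it. The body of do_something and the usage helper are hypothetical and only illustrate how each part of the template maps onto an argument check.

```shell
# Hypothetical helper: prints the synopsis on standard error.
usage() {
    echo "Usage: do_something argument_1 {true|false} [argument_3]" >&2
}

# Hypothetical implementation that only validates its arguments.
do_something() {
    # Two mandatory arguments plus at most one optional argument.
    if [ "$#" -lt 2 ] || [ "$#" -gt 3 ]; then
        usage
        return 2
    fi
    # argument_2 must be one of the listed alternatives.
    case "$2" in
        true|false) ;;
        *) usage; return 2 ;;
    esac
    echo "argument_1=$1 argument_2=$2 argument_3=${3:-<unset>}"
}
```

Calling do_something foo maybe prints the usage line and returns a non-zero status, while do_something foo true prints the parsed arguments.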
Is there some reason why bash 'variables' are different from variables in other 'normal' programming languages?
Is it due to the fact that they are set by the output of previous programs or have to be set by some kind of literal text, ie they have to be set by the output of some program or something outputting text through standard input/output or the console or such like?
I am at a loss for the right vocabulary, but can anyone who understands what I am trying to say perhaps supply the right words, or point me to some docs where I can understand bash variable concepts better?
In most languages, variables can contain different kinds of values. For example, in Python a variable can be a number that you can do arithmetic on (a-1), an array or string you can split (a[3:]), or a custom, nested object (person.name.first_name).
In bash, you can't do any of this directly. If I understood you right, you asked why this is.
There are two reasons why you can't really do the same in bash.
One: environment variables are (conventionally) simple key=value strings, and the original sh was a pretty thin wrapper on top of the Unix process model. Bash works the same, for technical and compatibility reasons. Since all variables are (based on) strings, you can't really have rich, nested types.
This also means that a variable set in a subshell or in a script you call won't be set in the parent script, because that's not how environment variables work: a child process only ever changes its own copy.
Two: Original sh didn't separate code and data, since this makes it easier to work with interactively. Sh treated all non-special characters as literal. I.e. find / -name foo was considered four literal strings: a command and three arguments.
Bash can't just decide that find / -name now means "the value of the variable find divided by the negated value of variable name", since that would mean everyone's find commands would start breaking. This is why you can't have the simple dereferencing syntax other languages do.
Even $name-1 can't be used to subtract, because it could just as easily be intended as part of $name-1-12-2012.tar.gz, a filename with a timestamp.
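The points above can be seen in a few lines of shell. This is a minimal sketch with made-up variable names, not an exhaustive tour of the rules.

```shell
# Every variable holds a string; "$count-1" is plain concatenation.
count=5
label="$count-1"           # the string "5-1"

# Arithmetic has to be requested explicitly with $(( ... )).
result=$((count - 1))      # 4

# A subshell (like any child process) works on its own copy of the
# variable, so the assignment below does not reach the parent shell.
( count=99 )
after_subshell=$count      # still 5

echo "$label $result $after_subshell"
```

The explicit $(( ... )) syntax is what lets the shell keep treating unmarked words as literal text.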
I would say it has to do with Bash functions. Bash functions cannot return a value, only a status code.
So with Bash you can have a function
foo ()
{
grep bar baz
}
But if you try to "save" the return value of the function
quux=$?
It is merely saving the exit status, not any value. Contrast this with a language such as JavaScript, where functions can actually return values:
function foo ()
{
return document.getElementById("dog").getAttribute("cat");
}
and save like this
quux = foo();
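Back in Bash, a common workaround (not part of the answer above) is to have the function write its result to standard output and capture it with command substitution, while the exit status stays available in $?. A minimal sketch, with a hypothetical function name:

```shell
# Hypothetical function: the "return value" is written to stdout,
# while return only sets the exit status.
get_greeting() {
    printf 'hello %s\n' "$1"
    return 0
}

greeting=$(get_greeting world)   # captures the printed text
status=$?                        # exit status of the call above

echo "value: $greeting, status: $status"
```

Note that the function body runs in a subshell when captured this way, so any variables it sets are not visible to the caller.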
I would like to check for the value of a node attribute. This case statement is what I have so far, and it works:
case node[:languages][:ruby][:host_cpu]
when "x86_64"
...
when "i686"
...
end
What I would like to do is use an if statement instead. This is what I tried:
if node[:languages][:ruby][:host_cpu]?("X86_64")
...
end
This is based on the following, which worked:
if platform?("ubuntu")
...
end
However, my attempt didn't work. It gave a syntax error on the if line, saying that there was an unexpected \n and $end was expected.
I found that there are two ways of performing an if. The first is the one I demonstrated above, which (apparently) only works with resources; the second is if_only, which works with nodes, like so:
if_only {node[:languages]}
which seems to work only for checking the presence of nodes, and within a do context.
How do I check the value of a node using an if statement? One method does check values, but only of resources, the other checks nodes, but only for their presence, and not their values.
You are mixing up way too many different variants of conditionals, most of which are part of Chef, not Ruby. Let me try to describe the different options one by one.
Generally, a case is roughly comparable to a series of if and elsif statements. Your case above
case node[:languages][:ruby][:host_cpu]
when "x86_64"
...
when "i686"
...
end
is thus roughly equivalent to
if node[:languages][:ruby][:host_cpu] == "x86_64"
...
elsif node[:languages][:ruby][:host_cpu] == "i686"
...
end
As a side remark, case actually uses the === operator which is often not commutative but more powerful. For simple comparisons it works the same as == though. Both these variants are part of the Ruby language, in which you write your cookbooks.
The other options you mentioned are actually part of the API which Chef defined on top of Ruby. This is often called the Chef DSL (which stands for Domain Specific Language, i.e. an extension or adaptation of a language, in this case Ruby, for a specific usage domain, in this case configuration management).
The platform? method is a method defined by Chef that checks whether the current platform is one of the passed values. You can read more about that (and similar methods, e.g. the now recommended platform_family? method) in the Chef docs for recipes in general and for some often used Ruby idioms.
As a side-remark: you might be surprised by the fact that Ruby allows the ? and ! characters to appear at the end of method names, which makes Ruby rather unique among similar languages in this regard. These characters are simply part of the method name and have no special meaning to the language. They are only used by convention, to help programmers better identify the purpose of a method. If a method has a ? at the end, it is to be used to check some condition and is expected to return either a truthy or falsy value. Methods with a ! at the end often perform some potentially dangerous operation, e.g. changing an object in place or deleting data. Again, this is only a convention and is not interpreted by the language.
The last option you mentioned, the only_if and by extension not_if are used to define conditionals on Chef resources to make sure they are only executed when a certain condition is true (or when using not_if, if it is false). As these attributes are used on Chef resources only, they are naturally also defined by Chef.
To understand why they are useful it is necessary to understand how a Chef run works. The details can be found in the description of the Anatomy of a Chef Run. What is important there is that you basically have two execution phases: Resource Compilation and Convergence. In the first phase, the actual code to define the resources is executed. The code in your case statement would also be run here. After all the recipes have been loaded and all the resources have been defined, Chef enters the second phase, the Convergence phase. There, the actual implementation of the resources which performs the changes (create files and directories, install packages, ...) is run. Only in this phase are the only_if and not_if conditions checked.
In fact, you can observe the difference between
file "/tmp/helloworld" do
  action :create
  content "hello world"
end
if File.exist?("/tmp/helloworld")
  file "/tmp/foobar" do
    action :create
    content "foobar"
  end
end
and
file "/tmp/helloworld" do
  action :create
  content "hello world"
end
file "/tmp/foobar" do
  action :create
  content "foobar"
  only_if { File.exist?("/tmp/helloworld") }
end
In the first variant, the condition whether /tmp/helloworld exists is checked during resource compilation. At this time, the code that actually creates the /tmp/helloworld file has not yet run, as that only happens in the Convergence phase. Thus, during your first run, the /tmp/foobar file would not be created.
In the second variant, however, the check is done with only_if, which is evaluated during convergence. Here you will notice that both files get created in the first run.
If you want to read a bit more on how the definition of the conditionals works in terms of Ruby (and you definitely should), you can read about Ruby Blocks which are more or less pieces of code that can be passed around for later execution.