Concatenating strings fails when read from certain files - bash

I have a web application that is deployed to a server. I am trying to create a script that, among other things, reads the current version of the web application from a properties file that is deployed along with the application.
The file looks like this:
//other content
version=[version number]
build=[buildnumber]
//other content
I want to create a variable that looks like this: version-buildnumber
Here is my script for it:
VERSION_FILE=myfile
VERSION_LINE="$(grep "version=" $VERSION_FILE)"
VERSION=${VERSION_LINE#$"version="}
BUILDNUMBER_LINE=$(grep "build=" $VERSION_FILE)
BUILDNUMBER=${BUILDNUMBER_LINE#$"build="}
THEVERSION=${VERSION}-${BUILDNUMBER}
The strange thing is that this works in some cases but not in others.
The problem I get is when I am trying to concatenate the strings (i.e. the last line above). In some cases it works perfectly, but in others characters from one string replace the characters from the other instead of being placed afterwards.
It does not work in these cases:
When I read from the deployed file
If I copy the deployed file to another location and read from there
It does work in these cases:
If I write a file from scratch and read from that one.
If I create my own file and then copy the content from the deployed file into my created file.
I find this very strange. Does anyone out there recognize this?

It is likely that your files have carriage returns in them. You can fix that by running dos2unix on the file.
You may also be able to do it on the fly on the strings you're retrieving.
Here are a couple of ways:
Do it with sed instead of grep:
VERSION_LINE="$(sed -n "/version=/{s///;s/\r//g;p}" $VERSION_FILE)"
and you won't need the Bash parameter expansion to strip the "version=".
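Applied to both fields, the whole script might look something like this (a sketch only, assuming GNU sed, since \r in the pattern is a GNU extension):
VERSION_FILE=myfile
VERSION="$(sed -n "/version=/{s///;s/\r//g;p}" "$VERSION_FILE")"
BUILDNUMBER="$(sed -n "/build=/{s///;s/\r//g;p}" "$VERSION_FILE")"
THEVERSION="${VERSION}-${BUILDNUMBER}"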
OR
Do the grep as you have it now and do a second parameter expansion to strip the carriage return.
VERSION=${VERSION_LINE#$"version="}
VERSION=${VERSION//$'\r'}
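To see why the symptom looks like characters replacing each other rather than an error: a trailing carriage return moves the cursor back to the start of the line when the value is printed, so whatever follows it is drawn over the beginning of the string. A quick demonstration (hypothetical values, just to reproduce the effect):
VERSION=$'1.2.3\r'                 # what grep returns when the file has CRLF line endings
BUILDNUMBER=42
THEVERSION="${VERSION}-${BUILDNUMBER}"
echo "$THEVERSION"                 # displays "-42.3": the "-42" overprints "1.2"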
By the way, I recommend habitually using lowercase or mixed case variable names in order to reduce the chance of name collisions.

Given this foo.txt:
//other content
version=[version number]
build=[buildnumber]
//other content
you can extract a version-build string more easily with awk:
awk -F'=' '$1 == "version" { version = $2}; $1 == "build" { build = $2}; END { print version"-"build}' foo.txt
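If you want the result in a shell variable, a command substitution around the same awk does it (a sketch; foo.txt stands in for your properties file, and the carriage-return caveat from the other answer still applies):
THEVERSION="$(awk -F'=' '$1 == "version" { version = $2 }; $1 == "build" { build = $2 }; END { print version "-" build }' foo.txt)"
echo "$THEVERSION"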
I don't know why your script doesn't work. Can you provide an example of erroneous output?
From this sentence:
In some cases it works perfectly, but in others characters from one string replace the characters from the other instead of being placed afterwards.
I can't understand what's actually going on (I'm not a native English speaker so it's probably my fault).
Cheers,
Giacomo

Related

Sed replace unusual file extension arising from gmv

As a result of using gmv on a large nested directory to flatten it, I have a number of duplicate files separated out and with the extensions "._1_" "._2_" etc ( .... ._n_ )
eg "a.pdf.\_1\_"
i.e. it's
a(dot)pdf(dot)(back slash)1(back slash)
as opposed to
a(dot)pdf(dot)1
which I want to reduce back to "a.pdf"
I tried something like
sed -i .bak "s|.\_1\_||" *
which is usually reliable and doesn't require escape characters. However it's giving me
"error: illegal byte sequence"
Grateful for help to fix. This is on Mac OSX terminal. Ideally I'd like a generic solution to fix ._*_ forms where the * varies 1 to 9
There are two challenges here.
How to deal with the duplicate basenames (the suffixes '1', '2', ... were most likely added to designate different sections of a single file - maybe different pages of a PDF, etc.). Performing a rename that strips the suffixes may cause some important files to disappear.
How to deal with the "error: illegal byte sequence", which indicates that some special (non-ASCII) characters are part of the file name - usually bytes with value >= 0xC0, which cannot be decoded according to the current locale. The fact that the file names appear escaped (as per the OP's "a.pdf.\_1\_") may hint at additional characters that are not displayed (assuming this was not added by the OP).
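A rough way to check whether any file names really do contain such bytes (a sketch, not part of the original answer; it runs grep in the C locale so the test works byte-wise):
ls | LC_ALL=C grep -n '[^ -~]'     # prints any name containing a byte outside printable ASCII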
The proposed solution is to rename the files and place the 'sequence' part that makes each file unique BEFORE the extension, allowing the extension to be used to determine the file type.
a.pdf.1 => a.1.pdf
The rename command to perform this task is:
rename 's/(.*).pdf.(_.*_)/$1$2.pdf/' *.pdf._*_
Adjust the file name list as needed, and use -n to verify before running.
rename -n s/.\_1\_// *.*_1_
works (remove the -n once tested).

Sed keep original indentation and camel-casing a variable

I have a simple sed script and I am replacing a bunch of lines in my application dynamically with a variable; the variable is a list of strings. My function works but does not keep the original indentation. The function deletes the line if it contains a certain string and replaces it with a completely new line; I could not do a plain replace due to certain syntax restrictions.
How do I keep my original indentation when the line is replaced?
Can I capitalize my variable and remove the underscore on the fly? I.e. the title is a capitalized, underscore-removed version of the variableName. The list of items in the variable array is really long, so I am trying to do this in one shot.
Ex: I want report_type -> Report Type done mid-process.
Is there a better way to solve this with sed? Thanks, any input is much appreciated.
The sed function is as follows:
variableName=$1
sed -i "/name\=\"${variableName}\.name\" value\=model\.${variableName}\.name options\=\#lists\./c\\{\{\> \_dropdown title\=\"${variableName}\" required\=true name\=\"${variableName}\"\}\}" test
SAMPLE INPUT
{{> _select title="Report Type" required=true name="report_type.name" value=model.report_type.name options=#lists.report_type}}
SAMPLE EXPECTED OUTPUT
{{> _dropdown title="Report Type" required=true name="report_type" value=model.report_type.name}}
sample input variable
report_type
Try this:
sed -E "s/^(\s+).*name\=\"(report_type)\.name\" value\=model\.report_type\.name options\=\#lists\..*$/\1\{\{\> \_dropdown title\=\"\2\" required\=true name\=\"\2\"\}\}/;T;s/\"(\w+)_(\w+)\"/\"\u\1 \u\2\"/g" input.txt > output.txt
I used "report_type" instead of ${variableName} for testing as a sed one-liner.
Please change it back to ${variableName}.
Then go back to using -i (in addition to -E, which is for extended regex); a reassembled sketch follows the explanation below.
I am not sure whether I can do it without extended regex, let me know if that is necessary.
use s/// to replace the fine-tuned line
the first capture group keeps the whitespace that makes up the indentation
the second capture group holds the variable name
stop if that did not replace anything (T;)
another s///
look for something consisting only of word characters between double quotes,
with a "_" between the two parts,
which seems safe enough because this step is only done on the already replaced line
replace with the two parts, without the "_"
\u uppercases the next character, giving the capitalization
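Reassembled with the variable and run in place, it might look something like this (a sketch only, assuming GNU sed and that ${variableName} contains no regex-special characters; test is the file name from the question):
sed -E -i "s/^(\s+).*name=\"(${variableName})\.name\" value=model\.${variableName}\.name options=#lists\..*$/\1{{> _dropdown title=\"\2\" required=true name=\"\2\"}}/;T;s/\"(\w+)_(\w+)\"/\"\u\1 \u\2\"/g" test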
Note:
Doing this on your sample input creates two very similar lines.
I assume that is intentional. Otherwise please provide desired output.
Using GNU sed version 4.2.1.
Interesting line of output:
{{> _dropdown title="Report Type" required=true name="Report Type"}}

Using sed to find and replace recursively

I am using a Chef recipe to update a configuration file on my node. The contents of the file look something like this:
server server1.domain.com
server server2.domain.com
I have a ruby array defined in my attribute file as follows:
default['servers'] = %w(xyz.domain.com abc.domain.com)
I want to use sed recursively to replace the server values in the file, such that my file is updated as such:
server xyz.domain.com
server abc.domain.com
I tried the following ruby loop in my recipe:
(node['servers']).each_with_index do |ntserver, index|
  bash "server set" do
    code <<-EOH
      sed -i 's|server .*|server #{node['servers'].at(index)}|' /etc/ntp.conf
    EOH
  end
end
But after chef-client runs and the changes are applied, the contents of the configuration file are as follows:
server abc.domain.com
server abc.domain.com
I am new to the sed command, so I can't figure out where I'm going wrong.
Any help will be appreciated.
By design you should not modify files with Chef. Instead you overwrite the whole file with the cookbook_file resource or, if you need to insert some dynamic values into the file, with the template resource.
The sed command (the way you use it) is quite simple; it only performs (in place in the given file, due to the -i option) a substitution of each string matching the pattern server .* with the string server #{node['servers'].at(index)}. It does this throughout the whole file, so each loop iteration changes all occurrences in the whole file.
What bothers me is that you write that in the original version you've got server1.domain.com, but in the pattern you've got server .* (meaning server, followed by a space, and any amount of other characters, .*). Because of the space, this should not match anything, so nothing should be changed at all. But maybe you just put that space in there by mistake when posting your question. I'll assume there was no such space in your actual code, because that way it would fit the observed behaviour.
So, to change only one line at a time, you should have a counter in your loop and have the number of the iteration in the search pattern, so that it is server1.* for the first iteration, server2.* for the second and so on. Then each iteration will change only exactly one line and you should get your required result.
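In plain shell terms the counter idea looks like this (a sketch only; the array here is a hypothetical stand-in for node['servers'], and in the recipe this would sit inside the Chef loop):
servers=(xyz.domain.com abc.domain.com)
for i in "${!servers[@]}"; do
  # server1.* on the first pass, server2.* on the second, so each sed touches exactly one line
  sed -i "s|^server server$((i + 1))\..*|server ${servers[$i]}|" /etc/ntp.conf
done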

BASH: Replacing special character groups

I have a rather tricky request...
We use a special application which is connected to an Oracle database. For control reasons the application uses special characters, which are defined by the application and saved in a LONG field of the database.
My task is to query the LONG field periodically and check for changes. To do that, I write the content to a file using a bash script and compare the old and the new file with md5sum.
When there's a difference, I want to send the old file via mail. The problem is that the old file contains these special characters and I don't know how to replace them with, for example, a string that describes them.
I tried to replace them on the basis of their ASCII code, but this didn't work. I've also tried to replace them by their appearance in the file (they look like this: ^P). This didn't work either.
When viewing the file in a text editor like nano, the characters are visible as described above. But when using cat on the file, the content is only displayed up to the first appearance of such a control character.
As far as I know there is no way to replace them while querying the database, because the content is in a LONG field.
I hope you can help me.
Thank you in advance.
Marco
^P is the Control-P character, which is decimal 16 or hexadecimal 0x10, also known as the Data Link Escape (DLE) character in ASCII.
To replace all occurrences of 0x10 in a file with another string we can use our friend gsed:
gsed "s/\x10/Data Link Escape/g" yourfile.txt
This should replace all occurrences of characters containing the hex value 0x10 with the text string "Data Link Escape". You'll probably want to use a different string - this is just an example.
Depending on the system you're using you may be able to use the standard sed command if your version of sed recognizes the \xNN single-character escape codes. If there are multiple hex characters you need to replace you may want to create a file containing your sed commands, one for each hexadecimal character you need to replace, and tell sed or gsed to use the commands in the file - consult the sed or gsed man pages for how to do this.
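For several control characters, that script-file approach could look something like this (a sketch, assuming GNU sed because of the \xNN escapes; the file name and the replacement strings are made up). Put the substitutions in a file, say controlchars.sed:
s/\x10/[DLE]/g
s/\x02/[STX]/g
s/\x01/[SOH]/g
and run it with:
sed -f controlchars.sed yourfile.txt > cleaned.txt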
Share and enjoy.
You can use xxd to change the string to its hex representation, then use xxd -r to convert back.
Or, you can use uuencode and uudecode.
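A minimal xxd round trip might look like this (a sketch; inspect or edit the dump in between):
xxd yourfile.txt > yourfile.hex      # hex dump, control bytes show up as e.g. 10
xxd -r yourfile.hex > restored.txt   # reverse the dump back into the original bytes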
One option is to run the file through cat -v. This replaces nonprinting characters with visible representations (using the ^ notation for control characters):
$ echo $'\x10\x12\x13\x14\x16' | cat -v
^P^R^S^T^V

Testing "framework" for scripts with nonstandard filenames

There are many comments on some questions (especially shell questions) that basically say one or more of the following:
This will fail on file names that contain spaces, newlines, etc,
This will fail if the file is a symbolic link (or not),
This will fail if $filename is a directory and not a regular file,
and so on.
While I understand that every script needs its own testing environment,
these are some common things that a script should be immune to.
So, my intention is to write a script that will create a directory hierarchy
with "specially crafted" file names for testing purposes.
The question is: what "special" file names are good for this test?
Currently the script creates files and directories with:
space in the file name
newline in the file name
file name that starts with one of:
- (like command argument)
# (comment char)
! (command history)
file name that contains one of:
| char (pipe)
() chars
* and ? (wildcards)
file name with unicode characters
all above for the directories
symbolic link to the directory
symbolic link to the file
Any other ideas about what I shouldn't miss?
What comes to my mind:
quotes in the filename, single and double
the $ character at the start
several redirection characters like > < << <<<
the ~ char ($HOME)
the ';' (as a command delimiter)
backslash in the filename \
basically, go through the ASCII table and test all the characters, if you think you need this :) (a small creation sketch follows this list)
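A few of these can be created safely with quoting and a -- guard (a minimal sketch, with made-up names):
mkdir -p testdir && cd testdir || exit 1
touch -- '-starts-with-dash' '#hash' '!bang' 'space name' 'pipe|name' 'paren(name)' 'glob*?'
touch -- 'single'\''quote' 'double"quote' '$dollar' 'semi;colon' 'back\slash' '>redirect' '~tilde'
touch -- $'line\nbreak' 'unicode-ümlaut'
ln -s -- 'space name' link-to-file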
Some other comments:
If you want to test scripts for Stack Overflow questions, you should create one file with the OP's content (call it the "basic file").
All the above "special files" should then be symlinks to that basic file. With this method you can easily modify the content of all the files (you only need to change one - the basic file).
Or, if symlinks are not a solution for you, use hard links.
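That idea might be sketched like this (hypothetical names again; -sf so it can be rerun):
echo 'content taken from the original question' > basic.txt
for name in 'space name' $'line\nbreak' '-starts-with-dash'; do
  ln -sf -- basic.txt "$name"
done
# editing basic.txt now changes what every test file "contains"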
Not directly about special characters in the filenames, but it is also good to take care of:
filenames differing only in case, especially for images like image.jpg and image.JPG - same filename, only the extension case differs
EDIT: Ideas from the comments:
Very long filenames, lots and lots of files, and very deep directory hierarchies (tripleee)
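Those can be generated like this (a sketch with arbitrary sizes; note that most filesystems cap a single name at 255 bytes):
touch "$(printf 'a%.0s' {1..200})"              # a 200-character file name
mkdir -p "deep/$(printf 'd/%.0s' {1..50})"      # a 50-level directory chain
for i in {1..1000}; do touch "file_$i"; done    # lots and lots of files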
