I am experiencing some trouble while reading a file in a bash script.
Here is the file I am trying to read:
--------------- (not in the file)
123.234 231.423
1223.23 132.134
--------------- (not in the file)
In this file, the 4 numbers are on two different lines and there is a line left blank at the end of the file. There is no "space" character at the end of each line.
When I try to read this file using this script:
for val in $(cat $myFile)
do
echo "$val ";
done
I get the following result:
123.234
231.423
1223.23
32.134
When I add a space character after the variable, it erases the beginning of the last number:
for val in $(cat ~/Windows/Trash/bashReadingBehavior/trashFile.out)
do
echo "$val ";
done
Output:
123.234
231.423
1223.23
2.134
In fact, characters added after the last number are written over the beginning of that number. I assume this is caused by an invisible character such as a carriage return, but I can't figure out how to solve the issue.
Your input file has DOS line endings. When you execute
echo "$val "
the value of $val ends with a carriage return, which when printed moves the cursor to the beginning of the line before the final two spaces are printed, which can overwrite whatever is already on the line.
You should use the following code to read from this file (don't iterate over the output of cat):
while read -r; do
    REPLY=${REPLY%$'\r'} # Remove the carriage return from the line
    for val in $REPLY; do
        echo "$val"
    done
done < "$myFile" # Read from the file instead of from standard input
Iterating over a line with a for loop like I show isn't really recommended either, but it is OK in this case if the line read from the file and stored in REPLY is known to be a space-separated list of numbers.
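A minimal sketch of the same idea, assuming a made-up sample file written to a temporary location (the file name and its contents are illustrative, not the asker's actual file). It strips one trailing carriage return per line and prints each value in brackets so that any stray character would be visible:

```shell
#!/bin/bash
# Build a small sample file with DOS (CRLF) line endings in a temp location.
sample=$(mktemp)
printf '123.234 231.423\r\n1223.23 132.134\r\n' > "$sample"

values=()
while IFS= read -r line || [ -n "$line" ]; do
    line=${line%$'\r'}          # drop one trailing carriage return, if present
    for val in $line; do        # unquoted on purpose: split on whitespace
        values+=("$val")
    done
done < "$sample"

printf '[%s]\n' "${values[@]}"  # brackets make stray characters visible
rm -f "$sample"
```

With the carriage returns removed, each closing bracket lands right after its number instead of overwriting the start of the line.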
The content of the script is:
#!/bin/bash
tempconf="/tmp/test.file"
while read line
do
echo $line
done < test.conf > $tempconf
The content of the test.conf is:
[PORT]
tcp_ports=7000-7200
udp_ports=7000-8000, 40000-49999
[H323H]
maxSendThreads=10
maxRecvThreads=10
[SDK]
appPwd=1111111
amsAddress=192.168.222.208:8888
The content of the output file "/tmp/test.file" is:
[PORT]
tcp_ports=7000-7200
udp_ports=7000-8000, 40000-49999
2
maxSendThreads=10
maxRecvThreads=10
[SDK]
appPwd=1111111
amsAddress=192.168.222.208:8888
The question is: why does [H323H] turn into 2? I'd appreciate it if anyone could explain this to me.
[] has a special meaning for the shell: it matches a single character taken from the characters between the brackets. So when you run
echo [H323H]
the shell looks for a file named H, 3, or 2. If at least one such file exists, [H323H] is replaced with all the matching file names in the output; otherwise it's reproduced as is.
source: https://unix.stackexchange.com/a/259385
Putting quotes around $line solves your problem without the need to check for files matching those characters (which would make the script not very robust):
#!/bin/bash
tempconf="/tmp/test.file"
while read -r line
do
echo "$line"
done < test.conf > "$tempconf"
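To see the effect in isolation, here is a small sketch that assumes an empty temporary directory containing a single file named 2 (both are created only for the demonstration). It compares the unquoted and quoted expansions of the same string:

```shell
#!/bin/bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch 2                       # a file whose name matches the pattern [H323H]

line='[H323H]'
unquoted=$(echo $line)        # pathname expansion applies: matches the file "2"
quoted=$(echo "$line")        # quotes suppress expansion: text is kept as is

printf 'unquoted: %s\nquoted:   %s\n' "$unquoted" "$quoted"
cd / && rm -rf "$workdir"
```

The unquoted form prints 2 only because that file exists in the current directory; in a directory with no matching file it would print [H323H] unchanged, which is why the bug can seem to come and go.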
I am coding a shell script that reads text files and creates JSON key-value pairs based on them. The key is the filename and the value is a random line of the file.
The trouble is when I concatenate the key with the value in the global variable data.
When I run the code below:
data='{'
for file in $(ls _params)
do
key=${file%.txt}
f_line=$(($$%100))
value=$(sed "${f_line}q;d" "./_params/$file")
# assembles key-value pairs
data=$data\"$key\":\""value"\",
done
data=${data%?} # removes last comma
data="$data}"
echo $data
My output is: {"firstName":"value","lastName":"value"}
But changing the string "value" to the variable $value, as follows:
data='{'
for file in $(ls _params)
do
key=${file%.txt}
f_line=$(($$%100))
value=$(sed "${f_line}q;d" "./_params/$file")
# assembles key-value pairs
data=$data\"$key\":\"$value\",
done
data=${data%?} # removes last comma
data="$data}"
echo $data
The output gets confused: "}"lastName":"Adailee.
I wish to store in the $data variable something like: {"firstName":"Bettye","lastName":"Allison"}
Note: My bash version is 4.3.48.
Note: Inside my _params directory I have two files, firstName.txt and lastName.txt, each with one random name per line.
As @ruakh suggests, the specific issue is your input files. Here are steps to reproduce your issue and verify this:
I created a firstName.txt file with A B C D repeated 100 times:
$ cat ABCD
A
B
C
D
$ for _ in $(seq 1 100); do cat ABCD >> _params/firstName.txt; done
And then similarly with W X Y Z for lastName.txt. Then I ran your script:
$ bash q.sh
{"firstName":"A","lastName":"W"
However, if I use unix2dos (from the dos2unix package) to convert these files to \r\n line endings, the output becomes garbled:
$ unix2dos _params/firstName.txt
unix2dos: converting file _params/firstName.txt to DOS format...
$ unix2dos _params/lastName.txt
unix2dos: converting file _params/lastName.txt to DOS format...
$ bash q.sh
"}"lastName":"W
So you could probably use dos2unix to fix your input files (or open each one in vim and do :set ff=unix and then :x).
But I wanted to let you know about three other things:
$$ is not a random number; it's the PID of your current process.
Best practice is not to parse ls, but to use globbing instead.
You can solve the fencepost problem without removing the comma you just placed by starting with an empty separator and setting it to a comma after the first iteration of the loop.
Here is my suggestion for improving your script (once you fix the newlines in the input):
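#!/bin/bash
data='{'
sep=""
for file in _params/*
do
    key=${file##*/}   # strip the directory part, e.g. _params/firstName.txt -> firstName.txt
    key=${key%.txt}   # strip the extension
    file_length=$(wc -l < "$file")
    f_line=$(( (RANDOM % file_length) + 1 ))   # random line number between 1 and file_length
    value=$(sed "${f_line}q;d" "$file")
    # assembles key-value pairs
    data="${data}${sep} \"$key\":\"$value\""
    sep=","
done
data="${data} }"
echo "$data"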
$value apparently ends with a carriage return character (\r, U+000D). As a result, when you print it, the cursor moves back to the beginning of the line, and subsequent characters are printed starting at the first column, overwriting what was there before. (This doesn't affect the actual order of characters, of course; it's just displayed confusingly when you print it.)
To fix this, you can write
value="${value%$'\r'}"
to remove the trailing carriage return.
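As a rough illustration, assuming a hard-coded value that stands in for a line read from a CRLF file, bash's printf %q makes the hidden character visible, and the stripped value then produces the expected string:

```shell
#!/bin/bash
value=$'Allison\r'                       # stand-in for a line read from a CRLF file
data="{\"lastName\":\"$value\"}"
printf '%q\n' "$data"                    # %q quotes the carriage return so it can be seen

value=${value%$'\r'}                     # remove the trailing carriage return
clean="{\"lastName\":\"$value\"}"
printf '%s\n' "$clean"
```

The first string is exactly one character longer than the second, which confirms that only the display was scrambled and not the order of the characters.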
I'm constructing a bash script file a bit at a time. I'm learning as I go. But I can't find anything online to help me at this point: I need to extract a substring from a large string, and the two methods I found using ${} (curly brackets) just won't work.
The first, ${x#y}, doesn't do what it should.
The second, ${x:p} or ${x:p:n}, keeps reporting bad substitution.
It only seems to work with constants.
The ${#x} returns a string length as text, not as a number, meaning it does not work with either ${x:p} or ${x:p:n}.
Fact is, it seems really hard to get bash to do much math at all, except in for statements. But that is just counting, and this isn't a task for a for loop.
I've consolidated my script file here as a means of helping you all understand what it is that I am doing. It's for working with PureBasic source files, but you only have to change the grep's "--include=" argument, and it can search other types of text files instead.
#!/bin/bash
home=$(echo ~) # Copy the user's path to a variable named home
len=${#home} # Showing how to find the length. Problem is, this is treated
# as a string, not a number. Can't find a way to make over into
# into a number.
echo $home "has length of" $len "characters."
read -p "Find what: " what # Intended to search PureBasic (*.pb?) source files for text matches
grep -rHn $what $home --include="*.pb*" --exclude-dir=".cache" --exclude-dir=".gvfs" > 1.tmp
while read line # this checks for and reads the next line
do # the closing 'done' has the file to be read appended with "<"
a0=$line # this is each line as read
a1=$(echo "$a0" | awk -F: '{print $1}') # this gets the full path before the first ':'
echo $a0 # Shows full line
echo $a1 # Shows just full path
q1=${line#a1}
echo $q1 # FAILED! No reported problem, but failed to extract $a1 from $line.
q1=${a0#a1}
echo $q1 # FAILED! No reported problem, but failed to extract $a1 from $a0.
break # Can't do a 'read -n 1', as it just reads 1 char from the next line.
# Can't do a pause, because it doesn't exist. So just run from the
# terminal so that after break we can see what's on the screen .
len=${#a1} # Can get the length of $a1, but only as a string
# q1=${line:len} # Right command, wrong variable
# q1=${line:$len} # Right command, right variable, but wrong variable type
# q1=${line:14} # Constants work, but all $home's aren't 14 characters long
done < 1.tmp
The following works:
x="/home/user/rest/of/path"
y="~${x#/home/user}"
echo $y
Will output
~/rest/of/path
If you want to keep "/home/user" in a variable, say prefix, you need a $ after the #, i.e., ${x#$prefix}, which I think is your issue.
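Here is a short sketch, using a made-up grep -Hn style line, of how variables can be used in both the prefix-removal and the substring forms. The offset in ${var:offset} is evaluated as an arithmetic expression in bash, so a length obtained from ${#var} can be used directly. One possible cause of a "bad substitution" error is that the script is being run by sh rather than bash, since the substring form is a bash extension; that is an assumption about the asker's setup, not something the question confirms.

```shell
#!/bin/bash
line='/home/user/src/demo.pb:12:Debug "hello"'   # made-up sample line

path=${line%%:*}            # everything before the first colon
rest=${line#"$path":}       # remove the path and the colon; note the $ on path
len=${#path}                # length of the path, usable as a number
tail_part=${line:len+1}     # the offset is an arithmetic expression
same=${line:$((len + 1))}   # equivalent, with explicit arithmetic expansion

printf '%s\n' "$path" "$rest" "$len" "$tail_part" "$same"
```

All three of rest, tail_part and same hold the text after the first colon, so the choice between them is mostly a matter of readability.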
The help I got was most appreciated. I got it done, and here it is:
#!/bin/bash
len=${#HOME} # Showing how to find the length. Problem is, this is treated
# as a string, not a number. Can't find a way to make over into
# into a number.
echo $HOME "has length of" $len "characters."
while :
do
echo
read -p "Find what: " what # Intended to search PureBasic (*.pb?) source files for text matches
a0=""; > 0.tmp; > 1.tmp
grep -rHn "$what" "$HOME" --include="*.pb*" --exclude-dir=".cache" --exclude-dir=".gvfs" >> 0.tmp
while read line # this checks for and reads the next line
do # the closing 'done' has the file to be read appended with "<"
a1=$(echo $line | awk -F: '{print $1}') # this gets the full path before the first ':'
a2=${line#$a1":"} # remove path and first colon from rest of line
if [[ $a0 != $a1 ]]
then
echo >> 1.tmp
echo $a1":" >> 1.tmp
a0=$a1
fi
echo " "$a2 >> 1.tmp
done < 0.tmp
cat 1.tmp | less
done
What I don't have yet is an answer as to whether variables can be used in place of constants inside the ${...} form where colons mark the substring to return. If it requires constants, then the only choice might be to generate a child script from the variables, which would appear as constants in the child, execute it there, and then return the results in an environment variable or temporary file. I did stuff like that with MS-DOS a lot. The limitation here is that you then have to make the generated file executable using "chmod +x filename", or call it using "/bin/bash filename".
Another bash limitation I found is that you cannot use "sudo" in the script without discontinuing execution of the present script. I guess a way around that is to use sudo to call /bin/bash to run a child script that you produced. I assume that when the child completes, you return to the parent script where you left off, unless you did "sudo -i", "sudo -su", or some other variation where you become superuser. Then you likely need to do an "exit" to drop the superuser overlay.
If you exit the child script while still superuser, would typing "exit" put you back to completing the parent script? I suspect so, which makes for some interesting scenarios.
Another question: If doing a "while read line", what can you do in bash to check for a keyboard key press? The "read" option is already taken while in this loop.
Here's an example of my problematic code:
#!/bin/bash
fileList='fileList.txt'
#IFS=$'\n'
while read filename
do
echo listing "$filename"
ls -ligG "$filename"
done < "$fileList"
echo "done."
#unset IFS
exit 0
The output is:
listing /some/long/path/README.TXT
ls: cannot access /some/long/pa
: No such file or directoryDME.TXT
Notice that ls cuts off the path. Also notice that the end of the path/filename is appended to the error message (after "No such file or directory").
I just tested it with a path exactly this long and it still gives the error:
/this/is/an/example/of/shorter/name.txt
Anyone know what's going on? I've been messing with this for hours already :-/
In response to torek's answer, here is more info:
First, here's the modified script based on torek's suggestions:
#!/bin/bash
fileList=/settings/Scripts/fileList.txt
while IFS=$'\n' read -r filename
do
printf 'listing %q\n' "$filename"
ls -ligG $filename
done < "$fileList"
echo "done."
exit 0
Here's the output of that:
# ./test.sh
listing $'/example/pathname/myfile.txt\r'
: No such file or directorypathname/myfile.txt
done.
Notice there is some craziness going on still.
Here's the file. It does exist.
ls -ligG /example/pathname/myfile.txt
106828 -rwxrwx--- 1 34 Mar 28 00:55 /example/pathname/myfile.txt
Based on the unusual behavior, I'm going to say the file has CRLF line terminators. Your file names actually have an invisible carriage return appended to the name. With echo, this doesn't show up, since the cursor just jumps to the first column and then a newline is printed. However, ls tries to access the file name including the hidden carriage return, and in its error message the carriage return causes the rest of the message to partially overwrite your path.
To trim these chars away, you can use tr:
tr -d '\r' < fileList.txt > fileListTrimmed.txt
and try using that file instead.
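A small sketch of that cleanup, assuming a throwaway list file created just for the demonstration (the paths in it are invented). It counts the lines containing a carriage return before and after running tr:

```shell
#!/bin/bash
list=$(mktemp)
trimmed=$(mktemp)
printf '/example/pathname/myfile.txt\r\n/example/other.txt\r\n' > "$list"

before=$(grep -c $'\r' "$list" || true)      # lines that contain a carriage return
tr -d '\r' < "$list" > "$trimmed"            # delete every carriage return
after=$(grep -c $'\r' "$trimmed" || true)    # grep exits non-zero when the count is 0

echo "lines with CR before: $before, after: $after"
rm -f "$list" "$trimmed"
```

Note that tr removes every carriage return in the file, not only those at line ends, which is usually what you want for a list of file names.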
That embedded newline is a clue: the error message should read ls: cannot access /some/long/path/README.TXT: No such file or directory (no newline after the "a" in "path"). Even if there were some mysterious truncation happening, the colon should happen right after the "a" in "path". It doesn't, so, the string is not what it seems to be.
Try:
printf 'listing %q\n' "$filename"
for printing the file name before invoking ls. Bash's built-in printf has a %q format that will quote funny characters.
I'm not sure what the intent of the commented-out IFS-setting is. Perhaps you want to prevent read from splitting at whitespace? You can put the IFS= in front of the read, and you might want to use read -r as well:
while IFS=$'\n' read -r filename; do ...; done < "$fileList"
I have a bunch of files that are incomplete: the last line is missing an EOL character.
What's the easiest way to add the newline, using any tool (awk maybe?)?
To add a newline at the end of a file:
echo >>file
To add a line at the end of every file in the current directory:
for x in *; do echo >>"$x"; done
If you don't know in advance whether each file ends in a newline, test the last character first. tail -c 1 prints the last character of a file. Since command substitution truncates any final newline, $(tail -c 1 <file) is empty if the file is empty or ends in a newline, and non-empty if the file ends in a non-newline character.
for x in *; do if [ -n "$(tail -c 1 <"$x")" ]; then echo >>"$x"; fi; done
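The same test can be wrapped in a small function. This is only a sketch: the name ensure_trailing_newline and the temporary demo file are made up for illustration. Calling it twice on the same file shows that the second call changes nothing:

```shell
#!/bin/bash
# Hypothetical helper: append a newline only if the file's last byte is not one.
ensure_trailing_newline() {
    local f=$1
    if [ -n "$(tail -c 1 < "$f")" ]; then
        echo >> "$f"
    fi
}

demo=$(mktemp)
printf 'no newline at end' > "$demo"     # 17 bytes, no final newline

ensure_trailing_newline "$demo"          # appends one newline
ensure_trailing_newline "$demo"          # no-op: the file already ends in a newline

size=$(( $(wc -c < "$demo") ))           # arithmetic expansion trims any padding from wc
echo "size in bytes: $size"
rm -f "$demo"
```

Because the check is idempotent, it is safe to run over a directory where only some files are missing their final newline.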
Vim is great for that because if you do not open a file in binary mode, it will automatically end the file with the detected line ending.
So:
vim file -c 'wq'
should work, regardless of whether your files have Unix, Windows or Mac end of line style.
echo >> filename
Try it before mass use :)