concatenate two variables in bash - bash

I wanted to concatenate two variables, but it seems that there is some overwriting.
#!/bin/bash
NUMBER1=$(seq 1 900 | sort -R | head -1)
FIRST=$(sed -n ''$NUMBER1'p' names.txt)
echo ${FIRST}
echo "${FIRST}${NUMBER1}"
Where names.txt is a list of names.
For example when I run this code, I get the output as,
Gregoria
159goria
Notice that $FIRST was partially overwritten by $NUMBER1.
The correct output should have been,
Gregoria
Gregoria159
Can someone please help me?
Thanks

Your names.txt file has Windows line-endings, CR-LF. The CR (carriage return) is not being recognized as part of the new-line sequence by sed, so it stays on the end of the line Gregoria<CR>; consequently, the next characters get overprinted at the beginning of the line.
Use dos2unix or some equivalent to fix the line endings.
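If dos2unix is not available, the carriage return can also be stripped inside the script. A minimal sketch, assuming the only problem is a single trailing CR on the line read from names.txt (the value assigned to first below is a hypothetical stand-in for that line):

```shell
#!/bin/bash
# Hypothetical stand-in for a line read from a CRLF file: note the trailing \r.
first=$'Gregoria\r'
number=159

# Remove one trailing carriage return, if present.
first=${first%$'\r'}

combined="${first}${number}"
echo "$combined"
```

To clean the whole file instead, tr -d '\r' < names.txt > names_unix.txt has the same effect (names_unix.txt is just an illustrative output name).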

Related

Looping and grep writes output for the last line only

I am looping through the lines in a text file and running grep for each line across directories, like below:
while IFS="" read -r p || [ -n "$p" ]
do
echo "This is the field: $p"
grep -ilr $p * >> Result.txt
done < fields.txt
But the above writes results only for the last line in the file, not for the other lines.
If I manually execute the command with the other lines, it works (which means the matches were found). Is there anything I am missing here? Thanks
The fields.txt looks like this
annual_of_measure__c
attached_lobs__c
apple
When the file fields.txt uses the DOS/Windows line-ending convention of two characters (carriage return and linefeed) and is processed by Unix tools that expect Unix line endings of a single character (linefeed), the line that read stores in the variable $p is, for the first line, annual_of_measure__c\r (note the additional \r for the carriage return). grep will then not find a match.
From your description in the question and the confirmation in the comments, it seems that the last line in fields.txt has no line ending at all, so the variable $p is the ordinary string apple and grep can find a match on the last line of the file.
There are tools for converting lineendings, e.g. see this answer or even more options in this answer.
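If converting the file is not an option, the loop itself can drop the carriage return before calling grep. A minimal sketch, assuming each line has at most one trailing CR; the temporary fields.txt below is hypothetical sample data that mimics the situation described (CRLF on all lines except the last):

```shell
#!/bin/bash
# Hypothetical sample: two CRLF-terminated lines, last line without any line ending.
tmpdir=$(mktemp -d)
printf 'annual_of_measure__c\r\nattached_lobs__c\r\napple' > "$tmpdir/fields.txt"

fields=()
while IFS= read -r p || [ -n "$p" ]; do
    p=${p%$'\r'}              # drop the CR left over from a CRLF line ending
    [ -n "$p" ] || continue   # skip lines that are empty after stripping
    fields+=("$p")
    # The original search would go here, with the pattern quoted:
    # grep -ilr -- "$p" . >> Result.txt
done < "$tmpdir/fields.txt"

printf '%s\n' "${fields[@]}"
rm -rf "$tmpdir"
```

Quoting "$p" in the grep call also keeps patterns containing spaces or glob characters intact.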

Delete lines where 3rd character equals a number

I have a consistent file with numbers like
0123456
0234566
.
.
.
etc
With bash tools, preferably on the command line, how can I remove each line if the third digit equals 2?
e.g., with cut -c3 I can get the correct digit, but I cannot combine it effectively with sed or something similar. I am not looking for a pattern, only the 3rd digit.
(I have done it in a Python script, but I was wondering how it's done with a one-line bash command.) Thank you!
EDIT: Additionally, what if I want to delete the lines where the third digit does NOT equal 2 (the opposite question)?
You can just do this with sed
sed -i '/^..2/d' file
If you want to do the opposite you can do:
sed -i '/^..[^2]/d' file
since you are dealing with a specific character.
I would use awk:
$ awk -F "" '$3!=2' file
0234566
By setting the field separator to "" (empty; this is not specified by POSIX, but it works in GNU awk), every character is stored in a different field. Then $3 != 2 checks whether the 3rd character is not 2 and, if so, the line is printed.
Or with pure bash, using shell parameter expansion ${parameter:offset:length}:
while IFS= read -r line
do
[ "${line:2:1}" != "2" ] && echo "$line"
done < file
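For an awk variant that does not rely on the empty field separator, substr can pick out the third character directly. A small sketch on hypothetical sample data (the temporary file and its three lines are made up for illustration):

```shell
#!/bin/bash
# Hypothetical sample input.
tmpfile=$(mktemp)
printf '0123456\n0234566\n0325555\n' > "$tmpfile"

# Keep only lines whose 3rd character is NOT "2" (i.e. delete the "2" lines).
kept=$(awk 'substr($0, 3, 1) != "2"' "$tmpfile")

# The opposite: keep only lines whose 3rd character IS "2".
only2=$(awk 'substr($0, 3, 1) == "2"' "$tmpfile")

echo "$kept"
echo "$only2"
rm -f "$tmpfile"
```

Comparing against the string "2" rather than the number 2 keeps the test a plain character comparison.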

bash script: write string with double quotes and blanks to file

I try to use sed to read a line from an ASCII file, parse it and write it slightly changed to a defined line number in an output file.
The line format in the input file is as follows:
linenumber:designator,"variable text content"
e.g.
3:string1,"this is text of string 1"
So the outfile should look as follows in line 3:
string1,"this is text of string 1"
The line includes the double quotes and the blanks. All old lines are moved one line down.
The user is responsible to provide a proper input file regarding the order of lines and has to consider that lines in the output file are moved down with each new line in the input file. The script does not know about any order except for the line number given in the input file.
A script shall read all lines and put the content of those lines into an output file at the given line numbers, including double quotes and blanks, but without the line number part and the colon.
The command I use successfully with the shell is e.g.:
sed -i '3istring1,"this is text of string 1"' outfile
No trouble with quotes, double quotes and blanks there.
Using the bash script
while read line
do
linenum=$(echo $line | cut -f1 -d:)
linestr=$(echo $line | cut -f2 -d:)
sedcmd="sed -i '"
sedcmd=${sedcmd}${linenum}
sedcmd=${sedcmd}i
sedcmd=${sedcmd}${linestr}
sedcmd=${sedcmd}"' outfile"
echo "---> $sedcmd"
$sedcmd
done < script/new_records.txt
shows exactly the same sed command with echo but returns with:
sed: -e expression #1, char 1: unknown command: `''
Apparently executing the sed command from within a bash script is different from executing it directly in the bash shell.
I tried a variety of escape sequences ("\") before quotes, double quotes and blanks, but rather randomly, and none of them was successful.
What do I have to do in order to write the string including blanks and double quotes to a specified line in a text file?
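# Assumes OutFile exists and has enough lines
while IFS= read -r ThisLine
do
LineNum=$(echo "${ThisLine}" | cut -f1 -d ":" )
# Strip only up to the first colon, so colons inside the text survive
echo "${ThisLine#*:}" > /tmp/LineContent.txt
# Note: as written, this replaces line LineNum with the new content instead of inserting before it
sed -i -n "${LineNum} !{p;b;};r /tmp/LineContent.txt" OutFile
done < script/new_records.txt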
This is not the best approach, because it assumes a lot: that OutFile has enough lines, that the line is read without problems (what about escaped characters in the quoted string?), and so on.
Okay, I'll give it a shot. If I understand what you're trying to do correctly, and if you're certain the code input file is not malformed, then
sed -i -f <(sed 's/:/i/' insertions.txt) datafile.txt
is the most straightforward way. This works because with an input specification of
number:text
all one has to do is replace the : with an i to get a sed command that says: "When handling line number, insert text". The <() bit is bash process substitution; it expands to the name of a FIFO (or a /dev/fd entry) from which the output of the command can be read.
It might be prudent to guard against mistakes by saying something like
sed -i -f <(sed '/^[0-9]\+:/!d; s/:/i/' insertions.txt) datafile.txt
This removes all lines from insertions.txt that don't begin with a number followed by a colon because those are obviously broken.
Note that this all-in-one-go approach treats line numbers as they were in the input file. That is to say, given an insertions file with content
2:foo,"bar "
4:baz,"qux "
baz,"qux " will appear in line 5 of the output (before line 4 of the input). If this is not desired, sed will have to be called multiple times to handle each insertion individually, as in
while read insertion; do
sed -i "${insertion/:/i}" datafile.txt
done < insertions.txt
${insertion/:/i} is another bashism that replaces the first : in a shell variable with i and expands to the result, i.e., if insertion=1:2:3, then ${insertion/:/i} is 1i2:3.
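As for why the script in the question fails: when $sedcmd is expanded unquoted, the shell only splits it on whitespace. Quote characters stored inside the variable are not interpreted again, so sed receives '3istring1,"this as its script, with a literal ' as the first character, which matches the "unknown command" error. A hedged sketch of one way around this, building the command as a bash array; it assumes GNU sed (for -i without a suffix and the one-line i form), and the temporary outfile and its contents are hypothetical:

```shell
#!/bin/bash
# Hypothetical output file with three existing lines.
outfile=$(mktemp)
printf 'line1\nline2\nline3\n' > "$outfile"

line='3:string1,"this is text of string 1"'
linenum=${line%%:*}   # text before the first colon
linestr=${line#*:}    # text after the first colon

# Each array element stays one argument, so no quoting is lost on expansion.
sedcmd=(sed -i "${linenum}i${linestr}" "$outfile")
"${sedcmd[@]}"

result=$(sed -n '3p' "$outfile")
total=$(wc -l < "$outfile")
echo "$result"
rm -f "$outfile"
```

The array is only needed if the command must be stored first; calling sed -i "${linenum}i${linestr}" directly works the same way.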

Why does sed add a new line in OSX?

echo -n 'I hate cats' > cats.txt
sed -i '' 's/hate/love/' cats.txt
This changes the word in the file correctly, but also adds a newline to the end of the file. Why? This only happens in OSX, not Ubuntu etc. How can I stop it?
echo -n 'I hate cats' > cats.txt
This command will populate the contents of 'cats.txt' with the 11 characters between the single quotes. If you check the size of cats.txt at this stage it should be 11 bytes.
sed -i '' 's/hate/love/' cats.txt
This command will read the cats.txt file line by line, and replace it with a file where each line has had the first instance of 'hate' replaced by 'love' (if such an instance exists). The important part is understanding what a line is. From the sed man page:
Normally, sed cyclically copies a line of input, not including its
terminating newline character, into a pattern space, (unless there is
something left after a ``D'' function), applies all of the commands
with addresses that select that pattern space, copies the pattern
space to the standard output, appending a newline, and deletes the
pattern space.
Note the appending a newline part. On your system, sed is still interpreting your file as containing a single line, even though there is no terminating newline. So the output will be the 11 characters, plus the appended newline. On other platforms this would not necessarily be the case. Sometimes sed will completely skip the last line in a file (effectively deleting it) because it is not really a line! But in your case, sed is basically fixing the file for you (as a file with no lines in it, it is broken input to sed).
See more details here: Why should text files end with a newline?
See this question for an alternate approach to your problem: SED adds new line at the end
If you need a solution which will not add the newline, you can use gsed (brew install gnu-sed)
A good way to avoid this problem is to use perl instead of sed. Perl will respect the EOF newline, or lack thereof, that is in the original file.
echo -n 'I hate cats' > cats.txt
perl -pi -e 's/hate/love/' cats.txt
Note that GNU sed does not add the newline on Mac OS.
Another thing you can do is this:
echo -n 'I hate cats' > cats.txt
SAFE=$(cat cats.txt; echo x)
SAFE=$(printf '%s' "$SAFE" | sed -e 's/hate/love/')
SAFE=${SAFE%x}
That way if cats.txt ends in a newline it gets preserved. If it doesn't, it doesn't get one added on.
This worked for me. I didn't have to use an intermediate file.
OUTPUT=$( echo 'I hate cats' | sed 's/hate/love/' )
echo -n "$OUTPUT"

Trying to delete lines from file with sed -- what am I doing wrong?

I have a .csv file where I'd like to delete the lines between line 355686 and line 1048576.
I used the following command in Terminal (on MacOSx):
sed -i.bak -e '355686,1048576d' trips3.csv
This produces a file called trips3.csv.bak -- but it still has a total of 1,048,576 lines when I reopen it in Excel.
Any thoughts or suggestions you have are welcome and appreciated!
I suspect the problem is that Excel is using carriage return (\r, octal 015) to separate records, while sed assumes lines are separated by linefeed (\n, octal 012); this means that sed will treat the entire file as one really long line. I don't think there's an easy way to get sed to recognize CR as a line delimiter, but it's easy with perl:
perl -n -015 -i.bak -e 'print if $. < 355686 || $. > 1048576' trips3.csv
(Note: if 1048576 is the number of "lines" in the file, you can leave off the || $. > 1048576 part.)
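Before reaching for perl, it may be worth confirming the suspicion by counting the CR and LF bytes in the file. A small sketch on hypothetical CR-only data (the four-record temporary file stands in for trips3.csv), followed by converting CR to LF so that ordinary line-based tools work:

```shell
#!/bin/bash
# Hypothetical CR-separated file with four records and no linefeeds.
tmpfile=$(mktemp)
printf 'a\rb\rc\rd\r' > "$tmpfile"

# Count each kind of line terminator.
lf_count=$(tr -cd '\n' < "$tmpfile" | wc -c)
cr_count=$(tr -cd '\r' < "$tmpfile" | wc -c)
echo "LF: $lf_count  CR: $cr_count"

# Convert CR to LF, then delete records 2 through 3 with a normal sed range.
kept=$(tr '\r' '\n' < "$tmpfile" | sed '2,3d')
echo "$kept"
rm -f "$tmpfile"
```

If the LF count is zero and the CR count is large, the file is CR-separated and sed was indeed seeing a single line.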
I'm not sure about the OS X sed implementation, but GNU sed, when passed the -i flag with a backup extension, first copies the original file to the specified backup and then modifies the original file in place. So trips3.csv.bak is the untouched copy; you should expect to see the reduced number of lines in the original file, trips3.csv.
Some incantation that should do the job (if you have Ruby installed, obviously)
ruby -pe 'exit if $. >= 355686' < trips3.csv > output.csv
If you prefer Perl/Python, just follow the documentation to do something similar and you should be fine. :)
Also, I'm using one of the Ruby one-liners, by Dave.
EDIT: Sorry, forgot to say that you need '> output.csv' to redirect stdout to a file.
awk '!(NR>=355686 && NR<=1048576)' your_file
