I am looping through the lines in a text file and running grep on each line across directories, like below:
while IFS="" read -r p || [ -n "$p" ]
do
echo "This is the field: $p"
grep -ilr "$p" * >> Result.txt
done < fields.txt
But the above only writes results for the last line in the file, and not for the other lines.
If I manually execute the command with the other lines, it works (which means the matches were found). Is there anything I am missing here? Thanks!
The fields.txt looks like this
annual_of_measure__c
attached_lobs__c
apple
When the file fields.txt has the DOS/Windows line-ending convention, consisting of two characters (carriage return AND line feed), and that file is processed by Unix tools expecting Unix line endings, consisting of only one character (line feed), then the line read by the read command and stored in the variable $p is, for the first line, annual_of_measure__c\r (note the additional \r for the carriage return). grep will then not find a match.
From your description in the question and the confirmation in the comments, it seems that the last line in fields.txt has no line ending at all, so the variable $p holds the ordinary string apple, and grep can find a match for the last line of the file.
There are tools for converting line endings, e.g. see this answer or even more options in this answer.
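If you cannot convert the file, a minimal in-loop workaround (a sketch, assuming the only stray character is a trailing carriage return) is to strip the \r before calling grep:

while IFS="" read -r p || [ -n "$p" ]
do
    p=${p%$'\r'}                      # remove a trailing carriage return, if present
    echo "This is the field: $p"
    grep -ilr "$p" * >> Result.txt
done < fields.txt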
Without using tools such as sed, grep, or awk, only standard shell, I need to retrieve the line numbers of lines containing a pattern; then, for each line number retrieved, output lines:
[(line_retrieved) + 1 - (line_retrieved % 6)] to [line_retrieved + 6]
while skipping duplicate blocks of lines.
I was able to output lines of a file, specified by line number, with sed -n "${START_LINE},${END_LINE}p" file. I haven't found how to retrieve the line numbers of lines containing a pattern yet.
Here you go.
Assuming your pattern is "giga.*" (for example, if you want to catch "gigaaks") and the file is /tmp/1.txt, then you'd do something like:
pattern="giga.*"; linenum=1; while read line; do for word in `echo $line`; do if [[ "$word" =~ "$string_to_search" ]]; then echo $linenum:$line; fi; done; ((linenum++)); done < /tmp/1.txt | sort -u
Overview:
1) The pattern is set in the pattern variable. The value can be an exact string or an actual regex pattern.
2) The outer loop reads the file line by line into the variable $line.
3) The inner loop iterates over the words of that line, storing each one in the variable $word.
4) An if statement uses the =~ operator for a shell-level direct regex match of $word against $pattern.
5) If the match in step 4 succeeds, echo $linenum (containing the line number) and the whole $line.
6) $linenum is incremented on every outer loop run (i.e. for each line).
7) sort -u prints a line only ONCE, even if the pattern/string occurs in more than one of the line's words.
It will print every matching line in full, prefixed with the LINE# of each line from the file.
NOTE: if you want grep -i behavior (ignore case), you can set shopt -s nocasematch before the loop so that the =~ match ignores case.
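To cover the second half of the question, printing lines [(line_retrieved) + 1 - (line_retrieved % 6)] to [line_retrieved + 6] for each hit while skipping duplicate blocks, here is a sketch that reuses the sed range-printing from the question (assuming sed is acceptable for the output step, since it already appears there):

pattern="giga.*"
linenum=1
while read -r line; do
    for word in $line; do
        if [[ "$word" =~ $pattern ]]; then
            start=$(( linenum + 1 - linenum % 6 ))   # block start
            end=$(( linenum + 6 ))                   # block end
            echo "${start},${end}"
        fi
    done
    ((linenum++))
done < /tmp/1.txt |
sort -t, -k1,1n -k2,2n -u |                          # numeric order; -u skips duplicate blocks
while IFS=, read -r start end; do
    sed -n "${start},${end}p" /tmp/1.txt             # print the block
done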
I need to see the last characters of a bunch of text files (or, alternatively, test whether they are "}" and get a list of the files that test negative). Is there an easy way to do this from the command line?
(Ideally the solution works without reading the whole file from the start, because in addition to there being many files, they can also be quite large.)
P.S.: Any answer would be great, but I would really appreciate it if the function and syntax of everything in the answer could be fully explained.
It can be done fairly easily with tail and then string indexing in bash. For example, you obtain the last line of a file with tail -n1 file. You will need to store the line in a variable using command substitution, e.g.
lastln=$(tail -n1 file)
Then it is simply a matter of indexing the last characters, e.g.
echo ${lastln:(-1)}
(Note: when indexing from the end of the string, you must either put the offset in parentheses, e.g. (-1), or leave a space before the -1; echo ${lastln: -1} is also valid.)
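Putting it together (a sketch, assuming your files match *.txt; adjust the glob as needed), this prints the names of the files whose last character is not }:

for f in *.txt; do
    lastln=$(tail -n1 "$f")                # tail seeks from the end, so large files stay cheap
    [ "${lastln:(-1)}" = "}" ] || echo "$f"
done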
You can try this:
for file in file1 file2; do tail -n 1 "$file" | grep -q '}$' || echo "$file"; done
where you should replace file1 file2 with the list of files you want to analyze, e.g. * or the like. Now what happens here? The outer part
for file in file1 file2; do ...; done
is a simple loop over the files, where inside the loop, you can refer to the current file as $file. Then,
tail -n 1 "$file"
prints the last line of the given file and
| grep -q '}$'
pipes the output to grep (turned into silent mode with -q), which looks for '}' immediately followed by the end of the line ($). The return value of this command can be used to chain another action: when grep returns non-zero (indicating failure, i.e., the pattern is not matched), the last part
|| echo "$file"
is executed, resulting in the list of files you need.
I have a file of consistently formatted numbers, like
0123456
0234566
.
.
.
etc
With bash tools, command line preferably, how can I remove each line if the third digit equals 2?
E.g., with cut -c3 I can get the correct digit, but I cannot combine it effectively with sed or something similar. I am not looking for a pattern, only the 3rd digit.
(I have done it in a Python script, but I was wondering how it's done with a one-line bash command.) Thank you!
EDIT: Additionally, how can I delete the lines where the third digit does NOT equal 2 (the opposite question)?
You can just do this with sed:
sed -i '/^..2/d' file
If you want to do the opposite you can do:
sed -i '/^..[^2]/d' file
since you are dealing with a specific character.
I would use awk:
$ awk -F "" '$3!=2' file
0234566
By setting the field separator to "" (empty, only valid in GNU awk), every character is stored in a separate field. Then, $3 != 2 checks whether the 3rd character is not 2 and, if so, the line is printed.
Or with pure bash, using shell parameter expansion ${parameter:offset:length}:
while IFS= read -r line
do
[ "${line:2:1}" != "2" ] && echo "$line"
done < file
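For the EDIT's opposite requirement, both variants just need the comparison flipped (a sketch):

awk -F "" '$3==2' file

while IFS= read -r line
do
    [ "${line:2:1}" = "2" ] && echo "$line"
done < file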
I am trying to use sed to read a line from an ASCII file, parse it, and write it, slightly changed, to a defined line number in an output file.
The line format in the input file is as follows:
linenumber:designator,"variable text content"
e.g.
3:string1,"this is text of string 1"
So the outfile should look as follows in line 3:
string1,"this is text of string 1"
The line includes the double quotes and the blanks. All old lines are moved one line down.
The user is responsible to provide a proper input file regarding the order of lines and has to consider that lines in the output file are moved down with each new line in the input file. The script does not know about any order except for the line number given in the input file.
A script shall read all lines and put the content of those lines into an output file at the given line numbers,
including double quotes and blanks
without the line number part and the colon
The command I use successfully with the shell is e.g.:
sed -i '3istring1,"this is text of string 1"' outfile
No trouble with quotes, double quotes and blanks there.
Using the bash script
while read line
do
linenum=$(echo $line | cut -f1 -d:)
linestr=$(echo $line | cut -f2 -d:)
sedcmd="sed -i '"
sedcmd=${sedcmd}${linenum}
sedcmd=${sedcmd}i
sedcmd=${sedcmd}${linestr}
sedcmd=${sedcmd}"' outfile"
echo "---> $sedcmd"
$sedcmd
done < script/new_records.txt
shows exactly the same sed command via echo, but execution fails with:
sed: -e expression #1, char 1: unknown command: `''
Apparently executing the sed command from within a bash script is different from executing it directly in the bash shell.
I tried a variety of escape sequences ("\") before quotes, double quotes, and blanks... but rather randomly, and none of them was successful.
What do I have to do in order to write the string including blanks and double quotes to a specified line in a text file?
# Assuming OutFile exists and has enough lines
while read -r ThisLine
do
LineNum=$(echo "${ThisLine}" | cut -f1 -d ":" )
echo "${ThisLine#*:}" > /tmp/LineContent.txt   # strip only up to the first colon, in case the text contains colons
sed -i -n "${LineNum} !{p;b;};r /tmp/LineContent.txt" OutFile
done < script/new_records.txt
Not the best approach, because it relies on a lot of assumptions: that OutFile has enough lines, that every input line parses cleanly (what about escaped characters in the quoted string?), and so on. Note also that it replaces line LineNum instead of inserting before it.
Okay, I'll give it a shot. If I understand correctly what you're trying to do, and if you're certain the input file is not malformed, then
sed -i -f <(sed 's/:/i/' insertions.txt) datafile.txt
is the most straightforward way. This works because with an input specification of
number:text
all one has to do is to replace the : with an i to get a sed command that says: "When handling line number, insert text". The <() bit is bash-style process substitution that expands to the name of a FIFO from which the output of the inner command can be read.
It might be prudent to guard against mistakes by saying something like
sed -i -f <(sed '/^[0-9]\+:/!d; s/:/i/' insertions.txt) datafile.txt
This removes all lines from insertions.txt that don't begin with a number followed by a colon because those are obviously broken.
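For example, with an insertions.txt line of 3:string1,"this is text of string 1", the inner sed produces 3istring1,"this is text of string 1", which is exactly the command that worked when run by hand.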
Note that this all-in-one-go approach treats line numbers as they were in the input file. That is to say, given an insertions file with content
2:foo,"bar "
4:baz,"qux "
baz,"qux " will appear in line 5 of the output (before line 4 of the input). If this is not desired, sed will have to be called multiple times to handle each insertion individually, as in
while read insertion; do
sed -i "${insertion/:/i}" datafile.txt
done < insertions.txt
${insertion/:/i} is another bashism that replaces the first : in a shell variable with i and expands to the result, i.e., if insertion=1:2:3, then ${insertion/:/i} is 1i2:3.
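As an aside, this also explains the original failure: when $sedcmd expands, the single quotes inside the variable are data, not syntax, so sed receives a script that literally begins with a quote character. A small demonstration (a sketch, using the file names from the question):

sedcmd="sed -i '3istring1' outfile"
$sedcmd           # sed gets the literal argument '3istring1' and fails: unknown command: `''
eval "$sedcmd"    # eval re-parses the string, so the quotes work as intended (use with care)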
I wanted to concatenate two variables, but it seems that there is some overwriting.
#!/bin/bash
NUMBER1=$(seq 1 900 | sort -R | head -1)
FIRST=$(sed -n "${NUMBER1}p" names.txt)
echo ${FIRST}
echo "${FIRST}${NUMBER1}"
Where names.txt is a list of names.
For example, when I run this code, I get the output:
Gregoria
159goria
Notice that $FIRST was partially overwritten by $NUMBER1.
The correct output should have been:
Gregoria
Gregoria159
Can someone please help me? Thanks!
Your names.txt file has Windows line endings, CR-LF. The CR (carriage return) is not recognized as part of the newline sequence by sed, so it stays at the end of the line: Gregoria<CR>. When that is printed to a terminal, the CR moves the cursor back to the start of the line, so the next characters overprint the beginning of the line.
Use dos2unix or some equivalent to fix the line endings.
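If you would rather not modify the file, a minimal in-script workaround (a sketch, assuming the only problem is the trailing CR) is to strip it after the sed call:

NUMBER1=$(seq 1 900 | sort -R | head -1)
FIRST=$(sed -n "${NUMBER1}p" names.txt)
FIRST=${FIRST%$'\r'}     # remove a trailing carriage return, if any
echo "${FIRST}${NUMBER1}"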