Reading files gets stuck in bash

I am running this command on my log files:
grep "." file | tr '|' '\n' | sed -r "s/(.{3}).*?\.cpp/\1TRY/g" | tr '\n' '|'
It runs as expected, i.e. for words with a .cpp extension it keeps the first three letters and replaces the rest with TRY.
So if the input is: abcdef.cpp
the output is: abcTRY
(words without the extension are kept as they are)
But it stops running (gets stuck) after some time. Any suggestions on what might be the problem?

Remove the non-greedy quantifier. sed's regular expressions (both BRE and ERE) do not support .*?, so use a negated bracket expression instead:
sed -r "s/^(.{3})[^.]*\.cpp/\1TRY/"
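Put together, the fixed pipeline behaves like this (a minimal sketch; the printf stands in for grep "." file on real log data):

```shell
# Sample record standing in for one line of the log file.
printf 'abcdef.cpp|ghijkl.txt|mnopqr.cpp\n' \
  | tr '|' '\n' \
  | sed -r 's/^(.{3})[^.]*\.cpp/\1TRY/' \
  | tr '\n' '|'
# abcTRY|ghijkl.txt|mnoTRY|
```

The [^.]* still matches greedily, but it can never run past the dot, which is what the non-greedy .*? was trying (and failing) to do.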


I want to pipe grep output to sed for input

I'm trying to pipe the output of grep to sed so it will only edit specific files. I don't want sed to rewrite files it doesn't actually change (that would update their modification date).
I'm searching with grep and writing with sed. That's it.
The thing I am trying to change is a dash: not the normal kind, a special one. "-" is normal; "–" (an en dash) isn't.
The code I currently have:
sed -i 's/– foobar/- foobar/g' * ; perl-rename 's/– foobar/- foobar/' *'– foobar'*
Sorry about the trouble, I'm inexperienced.
Are you sure about what you want to achieve? Let me explain:
grep "string_in_file" <filelist> | sed <sed_script>
This first shows the matches for "string_in_file", each preceded by its filename.
If you run a sed on this, it will just show you the result of that sed script on screen, but it will not change the files themselves. In order to do that, you need the following:
grep -l "string_in_file" <filelist> | sed <sed_script_on_file>
The grep -l shows you some filenames, and the new sed_script_on_file needs to be a script that reads each file and alters it.
Thank you all for helping; I'm sorry for not responding faster.
After a bit of fiddling with the command, I got it:
grep -l 'old' * | xargs -d '\n' sed -i 's/old/new/'
This should only touch files that contain old and leave all other files.
This might be what you're trying to do if your file names don't contain newlines:
grep -l -- 'old' * | xargs sed -i 's/old/new/'
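If you have GNU tools, a sketch of a fully newline-safe variant uses NUL-terminated file names instead (grep -Z writes them, xargs -0 reads them):

```shell
# -Z: NUL-terminate the file names grep -l prints; -0: make xargs
# split on NUL; -r: run nothing at all when no file matched.
grep -lZ -- 'old' * | xargs -0 -r sed -i 's/old/new/'
```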

For Loop Issues with CAT and tr

I have about 700 text files consisting of config output that uses various special characters. I am using this script to remove the special characters, so I can then run a different script that references a sed file to remove the commands that should be there, leaving behind what should not be in the config.
I got the below from "Remove all special characters and case from string in bash" but am hitting a wall.
When I run the script it keeps looping and writes the script itself into the output file. Ideally, it would just take out the special characters and create a new file with the updated information. I have not gotten to the point of removing the previous text file, since it probably won't be needed. Any insight is greatly appreciated.
for file in *.txt
do
cat * | tr -cd '[:alnum:]\n\r' | tr '[:upper:]' '[:lower:]' >> "$file" >> "$file".new_file.txt
done
A less-broken version of this might look like:
#!/usr/bin/env bash
for file in *.txt; do
[[ $file = *.new_file.txt ]] && continue ## skip files created by this same script
tr -cd '[:alnum:]\n\r' <"$file" \
| tr '[:upper:]' '[:lower:]' \
>> "$file".new_file.txt
done
Note:
We're referring to the "$file" variable being set by for.
We aren't using cat. It slows your script down with no compensating benefits whatsoever. Instead, using <"$file" redirects from the specific input file being iterated over at present.
We're skipping files that already have .new_file.txt extensions.
We only have one output redirection (to the new_file.txt version of the file; you can't safely write to the file you're using as input in the same pipeline).
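A quick sanity check of the loop above, run in a throwaway directory (the sample input line is hypothetical):

```shell
# Punctuation should be stripped and upper case lowered.
tmp=$(mktemp -d) && cd "$tmp"
printf 'Hello, World! #42\n' > a.txt
for file in *.txt; do
  [[ $file = *.new_file.txt ]] && continue
  tr -cd '[:alnum:]\n\r' <"$file" | tr '[:upper:]' '[:lower:]' >> "$file".new_file.txt
done
cat a.txt.new_file.txt
# helloworld42
```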
Using GNU sed (note that this edits the files in place instead of writing .new_file.txt copies):
sed -i 's/[^[:alnum:]\n\r]//g;s/./\l&/g' *.txt

Grep (fgrep) bash exact match end of line

I have the below example file
d41d8cd98f00b204e9800998ecf8427e /home/abid/Testing/FileNamesTest/apersand $ file
d41d8cd98f00b204e9800998ecf8427e /home/abid/Testing/FileNamesTest/file[with square brackets]
d41d8cd98f00b204e9800998ecf8427e /home/abid/Testing/FileNamesTest/~$tempfile
017a3635ccb76250b2036d6aea330c80 /home/abid/Testing/FileNamesTest/FileThree
217a3635ccb76250b2036d6aea330c80 /home/abid/Testing/FileNamesTest/FileThreeDays
d41d8cd98f00b204e9800998ecf8427e /home/abid/Testing/FileNamesTest/single quote's
I want to grep for the last part of each line (the file name), but I need an exact match on it.
grep FileThree$ files.md5
017a3635ccb76250b2036d6aea330c80 /home/abid/Testing/FileNamesTest/FileThree
gives back an exact match and doesn't find "FileThreeDays", which is what I'm after. But because some of the file names contain square brackets, I'm having to use grep -F or fgrep. However, using fgrep like the above doesn't work; it returns nothing.
How can I exact-match the last part of the line using fgrep while still honoring the special characters above (~, $, ', [ ], etc.), or by any other method, maybe awk?
Further:
Using fgrep without the $ returns both of these files. I only want an exact match (as with the $ in plain grep above), but $ with fgrep doesn't return anything.
grep -F FileThree files.md5
017a3635ccb76250b2036d6aea330c80 /home/abid/Testing/FileNamesTest/FileThree
217a3635ccb76250b2036d6aea330c80 /home/abid/Testing/FileNamesTest/FileThreeDays
I can't tell all the details from your question, but it sounds like you can use grep and just escape the special characters: grep 'File\[Three\]Days$'
If you want to use fgrep, though, you can use some tr tricks to help you. If all you want is the filename (without the directory name), you can do something like
cat files.md5 | tr '/' '\n' | fgrep FileThreeDays
That tr command replaces slashes with newlines, so it will put each filename on its own line. That means that fgrep will only find the filename when it searches for FileThreeDays.
If you want the full filename with directory, it's a little trickier, but a similar approach will work. Assuming that there's always a double space between the SHA and the filename, and that there aren't any filenames with double spaces or tab characters in them, you can try something like this:
sed 's/  /\t/' files.md5 | tr '\t' '\n' | fgrep FileThreeDays
That sed command converts the double spaces to tabs. The tr command turns those tabs into newlines (the same trick as above).
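One more wrinkle: even after the split, fgrep FileThree would still substring-match the FileThreeDays line. Combining the split with grep -x (whole-line match) avoids that; a sketch on the sample data from the question:

```shell
# tr puts each path component on its own line; -F takes the pattern
# literally and -x requires the whole line to match, so only the bare
# name "FileThree" is a hit.
printf '%s\n' \
  '017a3635ccb76250b2036d6aea330c80  /home/abid/Testing/FileNamesTest/FileThree' \
  '217a3635ccb76250b2036d6aea330c80  /home/abid/Testing/FileNamesTest/FileThreeDays' \
  | tr '/' '\n' | grep -Fx FileThree
# FileThree
```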
I would use awk:
awk '{$1="";print}' file
$1="" cuts the first column to an empty string, and print prints the modified line - which only contains the filename now.
However, this leaves a blank space at the start of each line. If you care about it and want to remove it, set the output field separator to an empty string:
awk '{$1="";print}' OFS="" file
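For the exact-match requirement in the question, a variant of this awk approach compares the last path component literally, so no character in the file name is special. This is my own sketch, not from the original answer:

```shell
# -F/ splits on slashes, $NF is the last component, and == is plain
# string comparison; the whole matching line (hash and path) is printed.
printf '%s\n' \
  '017a3635ccb76250b2036d6aea330c80  /home/abid/Testing/FileNamesTest/FileThree' \
  '217a3635ccb76250b2036d6aea330c80  /home/abid/Testing/FileNamesTest/FileThreeDays' \
  | awk -F/ '$NF == "FileThree"'
# 017a3635ccb76250b2036d6aea330c80  /home/abid/Testing/FileNamesTest/FileThree
```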

Unable to get sed to replace commas with a word in my CSV

Hello, I am using bash to create a CSV file by extracting data from an HTML file using grep. The problem is that after getting the data, when I use sed to take each , out of it and put in a word like My_com, it goes crazy on me. Here is my code.
time=$(grep -oP 'data-context-item-time=.*.data-context-item-views' index.html \
| cut -d'"' -f2)
title=$(grep -oP 'data-context-item-title=.*.data-context-item-id' index.html |\
cut -d'"' -f2)
sed "s/,/\My_commoms/g" $title
echo "$user,$views,$time,$title" >> test
I keep getting this error
sed: can't read Flipping: No such file or directory
sed: can't read the: No such file or directory
and so on
Any advice on what's wrong with my code?
You can't use sed on text directly on the command line like that; sed expects file names as arguments, so it is reading your text as file names. Try this for your second-to-last line:
echo "$title" | sed 's/,/My_com/g'
That way sed sees the text as a file (stdin in this case). Also note that I've used single quotes in the argument to sed; in this case I don't think it will make any difference, but in general it is good practice to make sure bash doesn't mess with the command at all. (Quoting "$title" in the echo matters too, so the shell doesn't split or glob its contents.)
If you don't want to use the echo | sed chain, you might also be able to rewrite it like this:
sed 's/,/My_com/g' <<< "$title"
I think that only works in bash, not dash etc. This is called a 'here-string', and bash passes the stuff on the right of the <<< to the command on its stdin, so you get the same effect.
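Applied to the question's script, the failing line would become something like this (the sample value is hypothetical, standing in for whatever grep extracted):

```shell
title='Flipping the table, twice'      # hypothetical stand-in value
# Run the substitution on the variable's contents and capture the result.
title=$(printf '%s\n' "$title" | sed 's/,/My_com/g')
echo "$title"
# Flipping the tableMy_com twice
```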

Removing non-displaying characters from a file

$ cat weirdo
Lunch now?
$ cat weirdo | grep Lunch
$ vi weirdo
^#L^#u^#n^#c^#h^# ^#n^#o^#w^#?^#
I have some files that contain text with some non-printing characters like ^# which cause my greps to fail (as above).
How can I get my grep work? Is there some way that does not require altering the files?
It looks like your file is encoded in UTF-16 rather than an 8-bit character set. The '^#' is a notation for ASCII NUL '\0', which usually spoils string matching.
One technique for loss-less handling of this would be to use a filter to convert UTF-16 to UTF-8, and then using grep on the output - hypothetically, if the command was 'utf16-utf8', you'd write:
utf16-utf8 weirdo | grep Lunch
As an appallingly crude approximation to 'utf16-utf8', you could consider:
tr -d '\0' < weirdo | grep Lunch
This deletes ASCII NUL characters from the input file and lets grep operate on the 'cleaned up' output. In theory, it might give you false positives; in practice, it probably won't.
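In practice, iconv is the standard tool that plays the role of the hypothetical 'utf16-utf8' filter; a sketch that recreates a UTF-16 sample first so it is self-contained:

```shell
# Build a UTF-16 file like 'weirdo', then convert it back to UTF-8
# on the fly so grep sees plain text. -f names the source encoding,
# -t the target.
f=$(mktemp)
printf 'Lunch now?\n' | iconv -f UTF-8 -t UTF-16 > "$f"
iconv -f UTF-16 -t UTF-8 "$f" | grep Lunch
# Lunch now?
```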
The tr command is made for that:
cat weirdo | tr -cd '[:print:]\r\n\t' | grep Lunch
You may have some success with the strings(1) tool like in:
strings file | grep Lunch
See man strings for more details.
You can try:
awk '{gsub(/[^[:print:]]/,"") }1' file
