Programmatically delete all text between 2 characters in osx terminal - macos

I have a thousand txt files:
1.txt
2.txt
3.txt
In each file, tags appear several times within my text:
{somethinghere...blablabla} then the text I want to keep, then again {somethinghere...blablabla}
I'm not very experienced with the Mac OS X command line. Can someone help me write a command that opens each file, parses it, and deletes all text enclosed between "{" and "}"?
To be clear:
First of all I need to open each file, then parse the text. When the loop finds a "{" it starts deleting until it finds a "}". When done parsing, it saves and closes the file. That's what I need to do.

$ sed -i.bak -e 's#{[^}]*}##g' *.txt
-i.bak makes a backup copy of each modified file. If you don't want backups, on OS X use -i '' (an empty argument, separated from -i by a space); on Linux with GNU sed, plain -i is enough.
In substitutions, the delimiter can be a character other than /; here I chose #, so: s#<REGEX>#<REPLACEMENT># (the basic form for substitutions is s///).
In the regex, we search for a literal {, then any character except } with [^}]; * means 0 or more occurrences. Last, we search for the closing } and replace the matched part with nothing, which deletes it.
The g modifier at the end means replace every match on a line, not only the first one.
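Before running the command on a thousand real files, it can be tried on a throwaway copy. The sketch below assumes a hypothetical sample file created in a temporary directory; it uses the -i.bak form because that spelling is accepted by both the BSD sed on OS X and GNU sed.

```shell
# Work in a scratch directory so no real files are modified.
tmpdir=$(mktemp -d)
printf '%s\n' '{tag one}keep this{tag two} and this{x}' > "$tmpdir/1.txt"

# Delete every {...} group in place, keeping a .bak copy of the original.
sed -i.bak -e 's#{[^}]*}##g' "$tmpdir"/*.txt

result=$(cat "$tmpdir/1.txt")
backup=$(cat "$tmpdir/1.txt.bak")
echo "$result"
rm -rf "$tmpdir"
```

Note that sed works line by line, so a tag that opens on one line and closes on a later line is not matched by this expression.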

Related

Weird txt behavior

I have a CentOS server. I cloned a GitHub repository, and it contains a .txt file with a single line. For some reason it does this:
[root@0-0-0-0 Some]# cat some.txt
some text[root@0-0-0-0 Some]#
Also, while read i; do echo "$i"; done < some.txt doesn't see that line. What could cause that, and how can I avoid it? If I edit the file with vim, adding a new line and then deleting it (so the file still contains only one line), it starts to work properly.
The text file has no newline character at the end of it. Some programs will treat it as a valid text file whose last line doesn't happen to end in a newline. Others (apparently including bash's built-in read command, at least by default) will treat it as invalid, and perhaps ignore the last line (which isn't considered a "line" because it's not marked as one).
vim's default behavior is to quietly add a newline to the end of a file if you modify and save it.
You can add a newline to a file that lacks one by editing it with vim (or another editor that behaves similarly), or by adding it from the shell:
echo '' >> some.txt
In general, it's a good idea to ensure that text files end in a newline character in the first place, at least if they're intended to be used on UNIX-like systems.
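If fixing the file is not an option, a common shell idiom lets the loop process an unterminated last line as well: read returns a non-zero status at end of file, but it has still stored the partial line in the variable, so an extra test can catch it. A minimal sketch, using a hypothetical file created just for the demonstration:

```shell
tmpfile=$(mktemp)
printf 'some text' > "$tmpfile"    # no trailing newline, like the file in the question

count=0
# read fails on the unterminated last line but still stores it in $line;
# the [ -n "$line" ] test lets the loop body run for it anyway.
while IFS= read -r line || [ -n "$line" ]; do
    count=$((count + 1))
    last=$line
done < "$tmpfile"

echo "$count line(s), last: $last"
rm -f "$tmpfile"
```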

How do I grep for all lines without a "@" character in the line

I have a text file open in BBEdit/InDesign with email addresses on some lines (about a third of the lines) and name and date stuff on the other lines. I just want to keep the lines that have emails and remove all the others.
A simple pattern I can see to eliminate all the lines apart from those with email addresses on them is to have a negative match for the @ character.
I can't use grep -v pattern because the grep implementation in the Find and Replace dialogue box only has fields for the Find pattern and the Replace pattern. grep -something options don't exist in this context.
Note, I am not trying to construct a valid email address test at all, just using the presence of one (or more) @ characters to allow a line to stay; all other lines must be deleted from the list.
The closest I got was a pattern which hits only the email address lines (opposite outcome of my goal):
^((\w+|[ \.])\w+)[?@].*$
I tried various combinations of ^.*[^@].*$ and more sophisticated /w and [/w|\.] in parentheses, escaping the @ with [^\@], and negative look forwards like (!?).
I want to find these non-email address lines and delete them using any of these apps on OS X BBEdit/InDesign. I will use the command line if I have to. There must be a way using in-app Find and Replace with grep though, I'd expect.
As stated in the comments, grep -v @ filename lists all lines without an @ symbol. You could also use grep @ filename > new_filename
The file new_filename will consist only of lines with @. You can use this new file, or delete all lines in the old file and paste the contents of the new file into it.
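On the command line, the two filters can be tried side by side. This is a small sketch on hypothetical sample data (the names and addresses are made up), using @ as the character that marks an email line:

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'Jane Doe' 'jane@example.com' '2014-01-02' 'bob@example.org' > "$tmpdir/list.txt"

# Quoting the pattern keeps the shell from interpreting special characters in it.
emails=$(grep '@' "$tmpdir/list.txt")       # lines that contain the character
others=$(grep -v '@' "$tmpdir/list.txt")    # lines that do not
echo "$emails"
rm -rf "$tmpdir"
```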

Unable to remove a value from a text file using -sed

I'm trying to remove an ID number from a text file using a series of commands (using terminal), but they don't seem to be working. I need to remove the number and the associated "ID" text
Text in File:
{"id":"098765432"}
Commands I've been using (but don't seem to be working):
sed -i.bak 's/"id":[0-9]\{1,\},//g' ./Filename.txt
sed -i.bak 's/"id":"[0-9]\{1,\}",//g' ./Filename.txt
sed -i.bak 's/"id":"[0-9]\{9,\}",//g' ./Filename.txt
sed -i.bak 's/"id":[0-9]\{9,\},//g' ./Filename.txt
sed -i.bak 's/"[0-9]\{1,\}",//g' ./Filename.txt
Thanks for the help :)
As @Wintermute already noted in the comments, the problem is the comma before //. However, I am going to explain the whole line, in case something is not clear to those who come across this question later.
So, the proper command that will satisfy your requirement is:
sed -i.bak 's/"id":"[0-9]\{1,\}"//g' ./Filename.txt
sed is the command that invokes the stream editor.
-i is the flag for editing files in place (it makes a backup if an extension is supplied). In this case the extension is .bak, so the backup file (containing the initial content of our file) is created with the original name plus that extension.
's/"id":"[0-9]\{1,\}"//g' is the argument given to the sed command.
Since this argument (regular expression in it) was the cause of the problem, I am going to explain it in detail.
The first thing to notice is that its structure is s/Regex/Replacement/g where
Regex = "id":"[0-9]\{1,\}"
Replacement = nothing (literally nothing, not even a blank space)
So basically, as described by Bruce Barnett, s stands for substitution. Regex is the part we will replace with the Replacement. The letter g at the end means that we will change every occurrence of this regex on a line (without g, only the first occurrence on each line would be replaced, no matter how many there are).
And at the end we have ./Filename.txt, which is the source file we are applying this command on (./ means that the file is in the same directory from where we are running this command).
About the regex used ("id":"[0-9]\{1,\}"):
It starts with the literal text "id":" which matches any part of the file that is exactly the same. Next we have [0-9]\{1,\}, which means that, after the first part, we look for at least one digit (there can be more, as the example from the question shows). The expression ends with a literal closing ".
Now you may understand why the comma caused the problem. There is no comma in the original text in the file, so none of the commands tried (all of which contain a comma) worked. Of course, some of them fail for other reasons as well.
EDIT: As @ghoti pointed out, the replacement is not a regex. It is the string we put in the place(s) matched by our regex. So in this case, our replacement is the empty string (since we want to delete the matched part).
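The corrected command can be checked on a scratch copy of the data from the question. In this sketch the temporary directory is an assumption made for the demonstration; the file content and the sed expression are the ones shown above.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' '{"id":"098765432"}' > "$tmpdir/Filename.txt"

# No comma in the pattern: the text in the file has none after the closing quote.
sed -i.bak 's/"id":"[0-9]\{1,\}"//g' "$tmpdir/Filename.txt"

after=$(cat "$tmpdir/Filename.txt")
before=$(cat "$tmpdir/Filename.txt.bak")
echo "before: $before"
echo "after:  $after"
rm -rf "$tmpdir"
```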

Reading data from file to execute Shell Script

I have a file, 'testfiles', that contains a list of files.
Example:
Tc1
Tc2
I read the file above in a script:
test=`cat testfiles`
for ts in $test
do
feed.sh $ts >>results
done
This script runs fine when there is only 1 test file in 'testfiles', but when there are multiple files, it fails with 'file not found'.
Let me know if this is the correct approach.
You'll have to read the files one by one. Since you are taking testfiles='Tc1 Tc2', cat is searching for a file named 'Tc1 Tc2', which does not exist. Use the cut command with " " as the delimiter and read the files one by one in a loop, or use the sed command to separate the file names.
Your approach should work if the filenames have no spaces or other tricky characters. An approach that handles spaces in file names successfully is:
while IFS= read -r ts
do
feed.sh "$ts" >>results
done <testfiles
If your file names have newline characters in them, then the above won't work and you would need to create testfiles with the names separated by a null character in place of a newline.
Let's consider the original code. When bash substitutes for $test in the for statement, all the file names appear on the same line and bash will perform word splitting which will make a mess of any file names containing white space. The same happens on the line feed.sh $ts. Since $ts is not quoted, it will also undergo word splitting.
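The difference between the two loops can be seen by counting iterations. This sketch assumes a hypothetical testfiles list in which one name contains a space, and it only counts names instead of calling feed.sh:

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'Tc1' 'Tc 2' > "$tmpdir/testfiles"   # second name contains a space

# Unquoted expansion: the shell splits on whitespace, so "Tc 2" becomes two words.
split_count=0
for ts in $(cat "$tmpdir/testfiles"); do
    split_count=$((split_count + 1))
done

# Line-by-line read: each name stays intact in $ts.
read_count=0
while IFS= read -r ts; do
    read_count=$((read_count + 1))
done < "$tmpdir/testfiles"

echo "for loop saw $split_count words, while loop saw $read_count names"
rm -rf "$tmpdir"
```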

search a pattern in each line and append it at the end of that line

I have a file with the following entries:
folder1/a_b.csv folder1/generated/
folder2/folder3/a_b1.csv folder12/generated/
folder4/b_c.csv folder123/generated/
folder5/d.csv folder1/new_folder/generated/
folder6/12.csv folder/anotherfolder/morefolder/evenmorefolder/generated/
I want to copy the csv file name from each line, paste them at the end of that line and append it with ".org". Hence, the changed file would look like
folder1/a_b.csv folder1/generated/a_b.csv.org
folder2/folder3/a_b1.csv folder12/generated/a_b1.csv.org
folder4/b_c.csv folder123/generated/b_c.csv.org
folder5/d.csv folder1/new_folder/generated/d.csv.org
folder6/12.csv folder/anotherfolder/morefolder/evenmorefolder/generated/12.csv.org
Basically, I am looking for a command in vim or sed using which I can search a pattern in each line and append it at the end of that line. Is it possible?
Thanks in advance.
Vim
Here's how to do this in Vim:
:%s/\([^/]*\.csv\)\( .*\)/&\1.org/
This global (:%) substitution matches the filename (characters that don't contain /, ending in .csv), and captures \(...\) it. It then matches the rest of the line, and captures that, too.
As a replacement, first keep the original match & (or \0), then append the first capture (\1) with the additional suffix.
sed
Though the regular expression syntax is somewhat different than in Vim, the identical expression can be used with sed:
sed -e 's/\([^/]*\.csv\)\( .*\)/&\1.org/' input
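To see the substitution at work, the sed command can be run on two of the sample lines from the question. The temporary directory and the file name input are assumptions made for this sketch:

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/input" <<'EOF'
folder1/a_b.csv folder1/generated/
folder2/folder3/a_b1.csv folder12/generated/
EOF

# & re-emits the whole match; \1 is the captured file name.
output=$(sed -e 's/\([^/]*\.csv\)\( .*\)/&\1.org/' "$tmpdir/input")
echo "$output"
rm -rf "$tmpdir"
```

Without -i, sed only prints the result, so the input file is left unchanged until the output has been checked.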
Alternatives
It looks like you want to do file renaming in batches. On Linux, the mmv command-line tool is well suited for that; you'll probably find many similar tools on the web, too.
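This might work for you (GNU sed); the capture group excludes / so that only the file name, without any leading subfolder, is appended:
sed -r 's|([^/ ]*) .*|&\1.org|' file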
