Remove block-comments from a file with a bash script - bash

Is there a way to remove from a file all lines wrapped between /* and */ using a bash script?
I use Percona to generate an SQL script that synchronizes two databases, a development one with a production one. Percona generates a well-formatted SQL script, but it is full of comments that increase the file size. So, just to make the upload easier, I'd prefer to remove everything unnecessary.
EDIT ON January 10th
I solved it with this code:
sed -r ':a; s%(.*)/\*.*\*/%\1%; ta; /\/\*/ !b; N; ba' <FILE_TO_CLEAN>
thanks all

Using sed:
sed '/\/\*.*\*\// d; /\/\*/,/\*\// d' file
The d command tells sed to delete every line selected by the preceding address. The first address, /\/\*.*\*\//, matches lines that contain a complete one-line comment; the second, /\/\*/,/\*\//, matches comments that span multiple lines (the , makes it a range from the first pattern to the second).
I don't know if this works 100%, but as far as I tried, it did the job.
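To see what the command above does, here is a minimal sketch run against a made-up sample. The temporary file and its SQL contents are assumptions for illustration only, not Percona output.

```shell
#!/bin/bash
# Hypothetical sample input; the contents are made up for illustration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
/* single-line comment */
SELECT 1;
/* multi-line
   comment */
SELECT 2;
EOF

# Same command as above: delete lines holding a complete comment,
# then delete every line inside a /* ... */ range.
cleaned=$(sed '/\/\*.*\*\// d; /\/\*/,/\*\// d' "$sample")
printf '%s\n' "$cleaned"
rm -f "$sample"
```

Because d removes whole lines, a statement that shares its line with a comment (for example `SELECT 1; /* note */`) would be dropped as well, so this only suits files where comments sit on their own lines.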

Try this script; it should help remove the comments, since they use the same syntax as C++ block comments. Here you can see another sed example that removes HTML comments.

Related

Adding quotes to variating characters in bash

I am trying to use sed to add double quotes around anything between a matched pattern and the comma that ends it. At the moment I am extracting the following data from Cloudflare and trying to convert it to line protocol:
count=24043,clientIP=x.x.x.x,clientRequestPath=/abc/abs/abc.php
count=3935,clientIP=y.y.y.y,clientRequestPath=/abc/abc/abc/abc.html
count=3698,clientIP=z.z.z.z,clientRequestPath=/abc/abc/abc/abc.html
I have already converted the JSON output to this format with a bunch of sed commands, but I cannot work out how to put the values of clientIP and clientRequestPath in double quotes.
My expected output is:
count=24043,clientIP="x.x.x.x",clientRequestPath="/abc/abs/abc.php"
count=3935,clientIP="y.y.y.y",clientRequestPath="/abc/abc/abc/abc.html"
count=3698,clientIP="z.z.z.z",clientRequestPath="/abc/abc/abc/abc.html"
This data will be imported into InfluxDB. count will be a float, whilst clientIP and clientRequestPath will be strings, which is why they need to be in double quotes; at the moment I am getting errors because they are not.
Can anyone provide an adequate sed command to do this?
This might work for you (GNU sed):
sed -E 's/=([^0-9][^,]*)/="\1"/g' file
This encloses in double quotes, globally, any string that follows a = and does not begin with a digit, up to the next comma.
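One caveat: real client IPs begin with a digit, so the "does not begin with a digit" test would leave them unquoted. A possible variant, sketched here under the assumption that the only string fields are clientIP and clientRequestPath and that their values contain no commas, names the fields explicitly. The address 192.0.2.1 is a placeholder.

```shell
#!/bin/bash
# Hypothetical input line; 192.0.2.1 is a documentation address used as a stand-in.
line='count=24043,clientIP=192.0.2.1,clientRequestPath=/abc/abs/abc.php'

# Quote the value of each named string field, up to the next comma or end of line.
quoted=$(printf '%s\n' "$line" | sed -E 's/(clientIP|clientRequestPath)=([^,]*)/\1="\2"/g')
printf '%s\n' "$quoted"
```

The numeric count field is left alone because it is not listed in the alternation.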
Here is a solution that uses a sed script file, which allows multiple operations on a source file.
Assuming your source data is in a file "from.dat", create a sed script containing the commands:
cat script.sed
s/clientIP=/clientIP="/
s/,clientRequestPath=/",clientRequestPath="/
s/$/"/
Run the multiple-command sed script on the data file, redirecting the output to "to.dat":
sed -f script.sed from.dat > to.dat
cat to.dat (only showing one line)
count=24043,clientIP="x.x.x.x",clientRequestPath="/abc/abs/abc.php"

Substitution of substring doesn't work in bash (tried sed, ${a/b/c/})

Before writing, I of course read many other similar cases. For example, I used #!/bin/bash instead of #!/bin/sh.
I have a very simple script that reads lines from a template file and should replace some keywords with real data. For example, the string <NAME> should be replaced with a real name. In this example I want to replace it with the word Giuseppe. I tried two solutions, but neither works.
#!/bin/bash
#read the template and change variable information
while read LINE
do
sed 'LINE/<NAME>/Giuseppe' #error: sed: -e expression #1, char 2: extra characters after command
${LINE/<NAME>/Giuseppe} #error: WORD(*) command not found
done < template_mail.txt
(*) WORD is the first word found in the line
I am sorry if the question is too basic, but I cannot see the error and the error message is not helping.
EDIT1:
The input file should not be changed; I want to reuse it for every mail. Every time I read it, I will substitute a different name according to the recipient.
EDIT2:
Thanks to your answers I am closer to the solution. My example was a simplified case, but I also want to change other data. I want to make multiple substitutions in the same string, but bash seems to allow me only one. In every programming language I have used I was able to substitute within a string, but bash makes this very difficult for me. The following lines don't work:
CUSTOM_MAIL=$(sed 's/<NAME>/Giuseppe/' template_mail.txt) # from file it's ok
CUSTOM_MAIL=$(sed 's/<VALUE>/30/' CUSTOM_MAIL) # from variable doesn't work
I want to modify CUSTOM_MAIL several times in order to fill in several pieces of real information.
CUSTOM_MAIL=$(sed 's/<VALUE1>/value1/' template_mail.txt)
${CUSTOM_MAIL/'<VALUE2>'/'value2'}
${CUSTOM_MAIL/'<VALUE3>'/'value3'}
${CUSTOM_MAIL/'<VALUE4>'/'value4'}
What's the way?
There is no need to loop manually. The sed command itself runs the expression on each line of the given file:
sed 's/<NAME>/Giuseppe/' template_mail.txt > output_file.txt
You might need the g modifier if the <NAME> string appears more than once on a line: s/<NAME>/Giuseppe/g
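For the multiple substitutions asked about in EDIT2, here is a minimal sketch of two options. The template text is a made-up stand-in for the contents of template_mail.txt. sed reads files or standard input, not variable names, so a variable has to be piped in; and a bare ${var/a/b} expands to text that bash then tries to run as a command, so the result has to be assigned back.

```shell
#!/bin/bash
# Hypothetical template text; in the question it would come from template_mail.txt.
template='Dear <NAME>, your balance is <VALUE> euro.'

# Option 1: several sed expressions in one call, fed from the variable through a pipe.
custom_mail=$(printf '%s\n' "$template" | sed -e 's/<NAME>/Giuseppe/g' -e 's/<VALUE>/30/g')

# Option 2: bash parameter expansion, assigning the result back each time.
custom_mail2=${template//<NAME>/Giuseppe}
custom_mail2=${custom_mail2//<VALUE>/30}

printf '%s\n' "$custom_mail" "$custom_mail2"
```

Both variables end up with the same text; the // form replaces every occurrence, much like the g modifier in sed.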

use sed with for loop to delete lines from text file

I am essentially trying to use sed to remove a few lines within a text document. To clean it up. But I'm not getting it right at all. Missing something and I have no idea what...
#!/bin/bash
items[0]='X-Received:'
items[1]='Path:'
items[2]='NNTP-Posting-Date:'
items[3]='Organization:'
items[4]='MIME-Version:'
items[5]='References:'
items[6]='In-Reply-To:'
items[7]='Message-ID:'
items[8]='Lines:'
items[9]='X-Trace:'
items[10]='X-Complaints-To:'
items[11]='X-DMCA-Complaints-To:'
items[12]='X-Abuse-and-DMCA-Info:'
items[13]='X-Postfilter:'
items[14]='Bytes:'
items[15]='X-Original-Bytes:'
items[16]='Content-Type:'
items[17]='Content-Transfer-Encoding:'
items[18]='Xref:'
for f in "${items[@]}"; do
sed '/${f}/d' "$1"
done
What I am thinking, incorrectly it seems, is that I can setup a for loop to check each item in the array that I want removed from the text file. But it's simply not working. Any idea. Sure this is basic and simple and yet I can't figure it out.
Thanks,
Marek
Much better to create a single sed script, rather than generate 19 small scripts in sequence.
Fortunately, generating a script by joining the array elements is moderately easy in Bash:
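regex=$(printf '\|%s' "${items[@]}")
regex=${regex#'\|'}
sed "/^\($regex\)/d" "$1"
(Notice also the addition of ^ to the final regex; I assume you only want to match at beginning of line. The alternatives are grouped with \( \) so that the ^ applies to every one of them, not just the first.)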
Properly, you should not delete any lines from the message body, so the script should leave anything after the first empty line alone:
sed "1,/^\$/!b;/$regex/d" "$1"
Add -i if you want in-place editing of the target file.
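Putting the pieces together, a minimal sketch assuming GNU sed (for \| in a basic regex), with a shortened header list and a made-up message. It groups the alternatives so that ^ applies to each of them and limits deletion to the header block.

```shell
#!/bin/bash
items=('X-Received:' 'Path:' 'Lines:')   # shortened list for illustration

# Join the array elements with \| and strip the leading separator.
regex=$(printf '\|%s' "${items[@]}")
regex=${regex#'\|'}

# Hypothetical message: headers, a blank line, then a body line that looks like a header.
msg=$(mktemp)
printf '%s\n' 'Path: news.example' 'Subject: hi' 'Lines: 2' '' 'Path: stays in the body' > "$msg"

# Lines after the first empty line skip the rest of the script (b); header matches are deleted.
stripped=$(sed "1,/^\$/!b;/^\($regex\)/d" "$msg")
printf '%s\n' "$stripped"
rm -f "$msg"
```

The body line beginning with Path: survives because it comes after the first empty line.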

Replace last line of XML file

Looking for help creating a script that will replace the last line of an XML file with a tag. I have a few hundred files so I'm looking for something that will process them in a loop. I've managed to rename the files sequentially like this:
posts1.xml
posts2.xml
posts3.xml
etc...
to make it easier to loop through. But I have no idea how to write a script to do this. I'm open to using either Linux or Windows (but i would guess that Linux is better for this kind of task).
So if you want to append a line to every file:
sed -i '$a<YOUR_SHINY_NEW_TAG>' *xml
To replace the last line:
sed -i '$s/.*/<YOUR_SHINY_NEW_TAG>/' *xml
But do note, sed is not the ideal tool to modify xml.
XMLStarlet is a command-line toolkit for performing XML parsing and manipulations. Note that as an XML-aware toolkit, it'll respect XML structure, character encoding and entity substitution.
Check out XMLStarlet's ed subcommand to see how to modify documents. You can wrap it in a standard bash loop.
e.g. in a doc consisting of a chain of <elem>s, you can add a following <added>5</added>:
mkdir new
for x in *.xml; do
xmlstarlet ed -a "//elem[count(//elem)]" -t elem -n added -v 5 "$x" > "new/$x"
done
Linux way using sed:
To edit the last line of the file in place, you can use sed:
sed -i '$s_pattern_replacement_' filename
To change the whole line to "replacement" use $s_.*_replacement_. Be sure to escape any _'s in replacement with a \.
To loop over files, just use for:
for f in /path/posts*.xml; do sed -i '$s_.*_replacement_' "$f"; done
This, however, is a dirty way as it's not aware of the XML structure, whereas the XML structure is not affected by newlines. You have to be sure the last line of the files contains exactly what you expect it to.
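A closing tag contains a /, so it helps to pick a delimiter that does not occur in the replacement. The following is a minimal sketch assuming GNU sed (whose -i takes no argument); the directory, file contents and tag name are made up for illustration.

```shell
#!/bin/bash
# Hypothetical working directory and file; names and contents are made up.
workdir=$(mktemp -d)
printf '%s\n' '<posts>' '<post>1</post>' '<broken' > "$workdir/posts1.xml"

new_tag='</posts>'   # hypothetical replacement tag

# Use | as the delimiter so the / in the tag needs no escaping.
for f in "$workdir"/posts*.xml; do
    sed -i "\$s|.*|$new_tag|" "$f"
done

result=$(cat "$workdir/posts1.xml")
printf '%s\n' "$result"
rm -rf "$workdir"
```

If the tag could contain |, & or a backslash, those characters would need escaping first.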
It makes little to no difference whether you're on Linux, Windows or MacOS
The question is what language do you want to use?
The following is an example in C# (not optimized, so read it as pseudocode):
string rootDirectory = @"c:\myfiles";
var files = Directory.GetFiles(rootDirectory, "*.xml");
foreach (var file in files)
{
var lines = File.ReadAllLines(file);
lines[lines.Length - 1] = "whatever you want here";
File.WriteAllLines(file, lines);
}
You can compile this and run it on Windows, Linux, etc..
Or you could do the same in Python.
Of course this method does not actually parse the XML,
but you just wanted to replace the last line right?

bash templating

I have a template with a variable LINK, and a data file, links.txt, with one URL per line.
How can I substitute LINK with the content of links.txt in bash?
If I do
#!/bin/bash
LINKS=$(cat links.txt)
sed "s/LINKS/$LINK/g" template.xml
Two problems:
$LINKS has the content of links.txt without newlines
sed: 1: "s/LINKS/http://test ...": bad flag in substitute command: '/'
sed is not escaping the // in the links.txt file
thanks
Use some better language instead. I'd write a solution for bash + awk... but that's simply too much effort to go into. (See http://www.gnu.org/manual/gawk/gawk.html#Getline_002fVariable_002fFile if you really want to do that)
Just use any language where you don't have to mix control and content text. For example in python:
#!/usr/bin/env python
links = open('links.txt').read()
template = open('template.xml').read()
print(template.replace('LINKS', links))
Watch out if you're trying to force sed solution with some other separator - you'll get into the same problems unless you find something disallowed in urls (but are you verifying that?) If you don't, you already have another problem - links can contain < and > and break your xml.
You can do this using ed:
ed template.xml <<EOF
/LINKS/d
.r links.txt
w output.txt
EOF
The first command will go to the line containing LINKS and delete it.
The second line will insert the contents of links.txt on the current line.
The third command will write the file to output.txt (if you omit output.txt the edits will be saved to template.xml).
Try running sed twice. On the first run, replace / with \/. The second run will be the same as what you currently have.
The character following the 's' in the sed command becomes the separator, so you'll want to use a character that is not present in the value of $LINK. For example, you could try a comma:
sed "s,LINKS,${LINK}\n,g" template.xml
Note that I also added a \n to add an additional newline.
Another option is to escape the forward slashes in $LINK, possibly using sed. If you don't have guarantees about the characters in $LINK, this may be safer.
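The escaping option could look like the sketch below. It assumes a single URL with no embedded newline; the URL and the sample template line are made up for illustration. The characters that are special in a sed replacement with / as separator are the backslash, the slash and the ampersand.

```shell
#!/bin/bash
link='http://test.example/a?x=1&y=2'   # hypothetical URL containing / and &

# Put a backslash in front of every \, / and & so sed takes them literally.
escaped=$(printf '%s' "$link" | sed 's|[\\/&]|\\&|g')

# The substitution from the question, now safe for this value.
result=$(printf '%s\n' 'link: LINKS' | sed "s/LINKS/$escaped/g")
printf '%s\n' "$result"
```

This only makes the text safe for sed; it does nothing about characters such as < or & that would still need escaping to keep the XML valid.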

Resources