Unable to remove a value from a text file using -sed - bash

I'm trying to remove an ID number from a text file using a series of commands (using terminal), but they don't seem to be working. I need to remove the number and the associated "ID" text
Text in File:
{"id":"098765432"}
Commands I've been using (but don't seem to be working):
sed -i.bak 's/"id":[0-9]\{1,\},//g' ./Filename.txt
sed -i.bak 's/"id":"[0-9]\{1,\}",//g' ./Filename.txt
sed -i.bak 's/"id":"[0-9]\{9,\}",//g' ./Filename.txt
sed -i.bak 's/"id":[0-9]\{9,\},//g' ./Filename.txt
sed -i.bak 's/"[0-9]\{1,\}",//g' ./Filename.txt
Thanks for the help :)

As #Wintermute already noted in the comment, the problem is in the comma before //. However, I am going to explain the whole line, just so the others may understand it completely, in case something is not clear to those who come across this question later.
So, the proper command that will satisfy your requirement is:
sed -i.bak 's/"id":"[0-9]\{1,\}"//g' ./Filename.txt
sed is the command that calls stream editor.
Flag -i is the flag used to represent editing files in place (it makes backup if extension is supplied). In this case, extension written is .bak and indeed the backup file (containing initial context of our file) is created with the original name + the extension provided.
Argument 's/"id":"[0-9]{1,}"//g' is the argument given to the sed command.
Since this argument (regular expression in it) was the cause of the problem, I am going to explain it in detail.
First part we should notice is that its structure is s/Regex/Replacement/g where
Regex = "id":"[0-9]{1,}"
Replacement = nothing (literally nothing, not even blank space)
So basically, as described by Bruce Barnett, s stands for substitution. Regex is the part we will replace with the Replacement. At the end, letter g means that we will change more than just one occurrence of this regex per line (without g, it would replace just the first occurrence in every line, no matter how many are there).
And at the end we have ./Filename.txt, which is the source file we are applying this command on (./ means that the file is in the same directory from where we are running this command).
About the regex used ("id":"[0-9]{1,}"):
It starts with the literals ("id":") and this part will match literally any part in the file which is exactly the same as this one. Next, we have ([0-9]{1,}), which means that we want to, in addition to the first part, look for the at least one occurrence of a number (but it can be more of them, as the matched example from the question shows).
Now you may understand why comma caused this problem. There is no comma in the original text in the file. Thus, none of the commands tried (since all of them contain comma) worked. Of course, some of them have even more reasons.
EDIT: As #ghoti pointed out, replacement is not a regex. It is the string we will put at the place(s) that are found by our regex expression. So in this case, our replacement is blank string (since we want to delete the specified part).

Related

sed command for inserting text inside single quote

Suppose there's a text file with the following line:
export MYSQL_ADMIN=''
I want to insert text inside that single quote using the sed command, so that it changes to something like this for example:
export MYSQL_ADMIN='abc1'
What is the appropriate sed command for that in Linux?
I tried
sed -i -e ''/MYSQL_ADMIN/s/''/'abc1'/g"
but it didn't work.
Something like sed -i "s;export MYSQL_ADMIN=.*;export MYSQL_ADMIN='abc1';" /path/to/file.ext
-i modify file in place
s means substitute,
First block is what you are matching as an regular expression - the .* matches everything to the end of the line, this ensures you don't keep any text on that line after the substitue - and second block is what you are replacing with that match.
Always check the file after each run of sed if there is no error and check what changed.
To get the single quotes to print you may have to do ""'"" like ""'""abc1""'""
It is important to understand that although
I want to insert text inside that single quote using the sed command
is a perfectly good characterization of the effect you want to achieve, it does not map directly onto operations from sed's repertoire. With sed, the appropriate tool for most line modifications is the s command, which substitutes specified text for one or more matches to a specified regular expression. That would be the most natural thing to use for your case.
Additionally, it is important with sed to understand how and when to bind commands to specific lines. If you don't do that for a given command then it is applied to all lines. Sometimes that's fine, but other times it will produce unwanted results.
I tried
sed -i -e ''/MYSQL_ADMIN/s/''/'abc1'/g"
but it didn't work.
The two leading single quotes in that sed expression match each other, leaving the trailing double quote unmatched. Also, you do not specify the name of the file to modify. This variation would at least be valid shell syntax, and it would have the desired effect on the specified line appearing in file my_script:
sed -i -e "/MYSQL_ADMIN/s/''/'abc1'/g" my_script
That might also make other, unwanted changes, however.
You need to make some assumptions about the content of the file in order to do such a thing at all. The above depends on the text MYSQL_ADMIN and '' to appear on the same line only in the line(s) you want to modify. That may turn out to hold, but it seems unnecessarily risky. An assumption more likely to hold in general would be that there will be only one assignment to variable MYSQL_ADMIN, or that it is acceptable to modify all such assignments that assign a single-quote-delimited empty value.
Going with the latter, one might end up with this:
sed -i -e "s/\<MYSQL_ADMIN=''\(\s\|$\)/MYSQL_ADMIN='abc1'\1/g" my_script
The pattern \<MYSQL_ADMIN=''\(\s\|$\) improves on your plain MYSQL_ADMIN in these significant ways:
the \< causes it to match only immediately after a word boundary -- start of line, whitesepace, or punctuation. This prevents substitutions for other variables whose names happen to end with MYSQL_ADMIN. If you prefer, it would be even stronger to instead anchor the match to the beginning of the line with ^.
including the ='' in the pattern distinguishes between MYSQL_ADMIN and variables whose names contain that as an initial substring. It also ensures that the '' that gets replaced, if any, goes with the variable and does not merely appear somewhere else on the line.
the \(\s\|$\) both matches and captures either a whitespace character or the empty string at the end of a line. This distinguishes between assignments of an empty value and assignments of values that are merely prefixed by '' (which is valid if the file is a shell script). Having included it in the match, the capture allows the matched text, if any, to be preserved in the output (via the \1 in the replacement).
Because that matches the whole assignment, a complete assignment must appear in the replacement, too. On the other hand, this means that (probably) you can apply the command to every line, as shown, with no particular loss of efficiency relative to the previous command.
Even that might produce changes you didn't want, however, such as in comment lines or quoted text.

Sed keep original indentation and camel-casing a variable

I have a simple sed script and I am replacing a bunch of lines in my application dynamically with a variable, the variable is a list of strings.My function works but does not keep the original indentation.the function deletes the line if it contains the certain string and replaces the line with a completely new line, I could not do a replace due to certain syntax restrictions.
How do I keep my original indentation when the line is replaced
Can I capitalize my variable and remove the underscore on the fly, i.e. the title is a capitalize and underscore removed version of the variableName, the list of items in the variable array is really long so I am trying to do this in one shot.
Ex: I want report_type -> Report Type done mid process
Is there a better way to solve this with sed? Thanks for any inputs much appreciated.
sed function is as follows
variableName=$1
sed -i "/name\=\"${variableName}\.name\" value\=model\.${variableName}\.name options\=\#lists\./c\\{\{\> \_dropdown title\=\"${variableName}\" required\=true name\=\"${variableName}\"\}\}" test
SAMPLE INPUT
{{> _select title="Report Type" required=true name="report_type.name" value=model.report_type.name options=#lists.report_type}}
SAMPLE EXPECTED OUPUT
{{> _dropdown title="Report Type" required=true name="report_type" value=model.report_type.name}}
sample input variable
report_type
Try this:
sed -E "s/^(\s+).*name\=\"(report_type)\.name\" value\=model\.report_type\.name options\=\#lists\..*$/\1\{\{\> \_dropdown title\=\"\2\" required\=true name\=\"\2\"\}\}/;T;s/\"(\w+)_(\w+)\"/\"\u\1 \u\2\"/g" input.txt > output.txt
I used "report_type" instead of ${variableName} for testing as an sed one-liner.
Please change back to ${variableName}.
Then go back to using -i (in addition to -E, which is for extended regex).
I am not sure whether I can do it without extended regex, let me know if that is necessary.
use s/// to replace fine tuned line
first capture group for the white space making the indentation
second capture group for the variable name
stop if that did not replace anything, T;
another s///
look for something consisting of only letters between "",
with a "_" between two parts,
seems safe enough because this step is only done on the already replaced line
replace by two parts, without "_"
\u for making camel case
Note:
Doing this on your sample input creates two very similar lines.
I assume that is intentional. Otherwise please provide desired output.
Using GNU sed version 4.2.1.
Interesting line of output:
{{> _dropdown title="Report Type" required=true name="Report Type"}}

Using sed to replace text within a java properties file

I have a java properties file that looks like the following:
SiteUrlEndpoint=google.com/mySite
I want to use sed -i to inline replace the url but keep the context path that comes out of it. So for example if I wanted to change the properties file above to use amazon.com then the result would look like:
SiteUrlEndpoint=amazon.com/mySite
I am having trouble with sed to only replace the url and keeping the context path when replacing it inline.
My attempt:
sed -i 's:^[ \t]*siteUrlEndpoint[ \t]*=\([ \t]*.*\)[/]*$:siteUrlEndpoint = 'amazon.com':' file
You can do it with two backreferences, e.g.
sed -i.bak 's|^\(SiteUrlEndpoint=\).*/\(.*\)|\1amazon.com/\2|' file
note: the match of text up to / is greedy. If you have multiple parts of the path following the domain, you probably want to preserve all path components. To make it non-greedy, you could use the following instead
sed -i.bak 's|^\(SiteUrlEndpoint=\)[^/]*/\(.*\)|\1amazon.com/\2|' file
(you can add i.bak to create a backup of the original in file.bak)
To accomplish the same thing, you can match SiteUrlEndpoint= at the beginning of the line first, and then use a single backreference for the change, e.g.
sed -i.bak '/^SiteUrlEndpoint=/s|=[^/]*\(/.*\)|=amazon.com\1|' file
For example, given a file sites containing:
$ cat sites
SiteUrlEndpoint=google.com/path/to/mySite
SiteUrlSomeOther=google.com/mySite
You can change google.com to amazon.com with (using non-greedy form of first example):
$ sed -i 's|^\(SiteUrlEndpoint=\)[^/]*/\(.*\)|\1amazon.com/\2|' sites
Confirming:
$ cat sites
SiteUrlEndpoint=amazon.com/path/to/mySite
SiteUrlSomeOther=google.com/mySite
and
$ cat sites.bak
SiteUrlEndpoint=google.com/path/to/mySite
SiteUrlSomeOther=google.com/mySite
Explanation (first form)
sed -i.bak 's|^\(SiteUrlEndpoint=\) - locate & save
SiteUrlEndpoint=
[^/]*/ - match any folowing characters up to first / (non-greedy -
adjust as needed)
\(.*\) - match and save anything following /
|\1amazon.com/\2|' - full replacement (explanation below)
\1 - first back-reference containing SiteUrlEndpoint=
amazon.com - self-explanatory
/\2 - the '/' second back-reference of everything that followed.
Look over all the solutions and let me know if you have questions.
Regular expressions are hard, especially with complex regular expressions and/or large input files where unexpected changes are to be avoided.
Therefore I strongly recommend using sed -i.bak to keep a backup of the original file to then run a diff on both of them to see what changed.
Assuming that
You only want to change things after the tag siteUrlEndpoint (case insensitive)
You want to change the URL to amazon.com while leaving the path intact
I came up with this solution:
sed -i.bak 's;^\([ \t]*siteurlendpoint[ \t]*=[ \t]*\)[^/]*\(.*\);\1amazon.com\2;Ig' infile
I used a semicolon instead of your colon, that's just my preference when I don't want to use / ;)
Then I wrapped both the leading white spaces and siteurlendpoint as well as everything from the first / onwards into brackets \( \) so that I can take them again in the replacement with \1 and \2. That way I keep the indentation and the capitalisation of SiteUrlEndpoint intact.
For the search options I added an I to the g to make the search case insensitive. I am not sure how standard this option is, you might have to see whether your sed understands it.
The actual part that I want to replace I have just any character not including the next /: [^/]*
As for your line:
Your search term only searches for siteUrlEndpoint with lower case s. Since in your examples you wrote it with capital S, it wouldn't have triggered.
The final [/]*$ doesn't make any sense at all. "This line can end in zero or more of any of these caracters: /."
You precede this [/]*$ with .* which means: zero or more of any character at all.
The single quotes around 'amazon.com' might interfere with the single quotes around the whole search/replace term. It seems to work, but it is sloppy, and will fail if there are ever any spaces in there. It doesn't seem to serve any purpose anyway (except if you want to replace amazon.com with some environment variable like $NEWSITE) so I don't know why you're doing that.
Keep a backreference to the part just before the domain - then match and replace the domain - you can add the -i option after verifying the output of the sed command
url=amazon.com
sed -r 's/\b(SiteUrlEndpoint\s*=\s*)[^/]+/\1'$url'/'
Keep it simple:
$ sed -E 's/(SiteUrlEndpoint=)[^.]+/\1amazon/' file
SiteUrlEndpoint=amazon.com/mySite

Understanding 'sed' command

I am currently trying to install GCC-4.1.2 on my machine: Fedora 20.
In the instruction, the first three commands involve using 'sed' commands, for Makefile modification. However, I am having difficulty in using those commands properly for my case. The website link for GCC-4.1.2.
The commands are:
sed -i 's/install_to_$(INSTALL_DEST) //' libiberty/Makefile.in &&
sed -i 's#\./fixinc\.sh#-c true#' gcc/Makefile.in &&
sed -i 's/#have_mktemp_command#/yes/' gcc/gccbug.in &&
I am trying to understand them by reading the 'sed' man page, but it is not so easy to do so. Any help/tip would be appreciated!
First, the shell part: &&. That just chains the commands together, so each subsequent line will only be run if the prior one is run successfully.
sed -i means "run these commands inline on the file", that is, modify the file directly instead of printing the changed contents to STDOUT. Each sed command here (the string) is a substitute command, which we can tell because the command starts with s.
Substitute looks for a piece of text in the file, and then replaces it. So the order is always s/needle/replacement/. See how the first and last lines have those same forward-slashes? That's the traditional delimiter between the command (substitute), the needle to find in the haystack (install_to_$(INSTALL_DEST), and the text to replace it with ().
So, the first one looks for the string and deletes it (the empty replacement). The last one looks for #have_mktemp_command# and replaces it with yes.
The middle one is a bit weird. See how it starts with s# instead of s/? Well, sed will let you use any delimiter you like to separate the needle from the replacement. Since this needle had a / in it (\./fixinc\.sh), it made sense to use a different delimiter than /. It will replace the text ./fixinc.sh with -c true.
Last note: Why does the second needle have \. instead of .? Well, in a Regular Expression like the needle is (but not used in your example), some characters are magical and do magical fairy dust operations. One of those magic characters is .. To avoid the magic, we put a \ in front of it, escaping away from the magic. (The magic is "match any character", and we want a literal period. That's why.)

search a pattern in each line and append it at the end of that line

I have a file with the following entries:
folder1/a_b.csv folder1/generated/
folder2/folder3/a_b1.csv folder12/generated/
folder4/b_c.csv folder123/generated/
folder5/d.csv folder1/new_folder/generated/
folder6/12.csv folder/anotherfolder/morefolder/evenmorefolder/generated/
I want to copy the csv file name from each line, paste them at the end of that line and append it with ".org". Hence, the changed file would look like
folder1/a_b.csv folder1/generated/a_b.csv.org
folder2/folder3/a_b1.csv folder12/generated/a_b1.csv.org
folder4/b_c.csv folder123/generated/b_c.csv.org
folder5/d.csv folder1/new_folder/generated/d.csv.org
folder6/12.csv folder/anotherfolder/morefolder/evenmorefolder/generated/12.csv.org
Basically, I am looking for a command in vim or sed using which I can search a pattern in each line and append it at the end of that line. Is it possible?
Thanks in advance.
Vim
Here's how to do this in Vim:
:%s/\([^/]*\.csv\)\( .*\)/&\1.org/
This global (:%) substitution matches the filename (characters that don't contain /, ending in .csv), and captures \(...\) it. It then matches the rest of the line, and captures that, too.
As a replacement, first keep the original match & (or \0), then append the first capture (\1) with the additional suffix.
sed
Though the regular expression syntax is somewhat different than in Vim, the identical expression can be used with sed:
sed -e 's/\([^/]*\.csv\)\( .*\)/&\1.org/' input
Alternatives
It looks like you want to do file renaming in batches. On Linux, the mmv command-line tool is well suited for that; you'll probably find many similar tools on the web, too.
This might work for you (GNU sed):
sed -r 's|/([^ ]*) .*|&\1.org|' file

Resources