How to remove this special character at the end of the file - bash

This is the output of the cat command. I don't know what the special character at the end of the file is called, so I can't even search for it. How can I remove this special character in bash?
EDIT:
Here is the actual XML file (copied and pasted):
<?xml version="1.0" encoding="UTF-8"?>
<Package xmlns="http://soap.sforce.com/2006/04/metadata">
<types>
<name>ApexClass</name>
<members>CreditNotesManager</members>
<members>CreditNotesManagerTest</members>
</types>
<version>47.0</version>
</Package>%

It's unclear how the % (percent sign) is ending up in your file; it's easy to remove with sed:
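sed -i '' 's|\(</.*>\)%.*|\1|' file.xml
This removes the percent sign and saves the file in place. The pattern itself contains a /, so | is used as the delimiter of the s command. For a dry run, omit the -i '' portion, as this is what tells sed to edit the file in place (-i '' is the BSD/macOS form; GNU sed takes -i without a separate argument).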
As mentioned in the comments, there are many ways to do it. Just be sure you aren't removing something that you want to keep.

If it is just at the last line, this should work. Using ed(1)
printf '%s\n' '$s/%//' w | ed -s file.xml

If you don't need to save changes, you could use grep:
grep -v "%" <file.xml
This uses grep along with its inverse-matching flag -v. Note that this drops every line that contains the character %, not just the character itself, and prints the result to STDOUT. The < redirection feeds the file to grep on standard input.
EDIT: actually you don't even need the redirection, so:
grep -v "%" file.xml
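A small sketch of the difference, using a made-up two-line sample rather than the real file: grep -v drops the whole line that contains %, while a sed substitution restricted to the last line removes only the character. The variable names are illustrative.

```shell
# Hypothetical sample: the last line ends with a literal percent sign.
sample='<version>47.0</version>
</Package>%'

# grep -v removes every LINE that contains %, so </Package> is lost too.
grep_out=$(printf '%s\n' "$sample" | grep -v '%')

# sed removes only a trailing % on the last line ($ address) and keeps the tag.
sed_out=$(printf '%s\n' "$sample" | sed '$s/%$//')

printf 'grep -v:\n%s\n\nsed:\n%s\n' "$grep_out" "$sed_out"
```

If the closing tag matters, as it does in an XML file, the sed form is the safer of the two.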

This is actually a feature of zsh, not bash.
To disable it, unsetopt prompt_cr prompt_sp
The reverse-video % showing up means that the output reached end-of-file without a final ASCII linefeed (newline) character.
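One way to check for a missing final newline from the shell, and to add it, is to look at the last byte of the file. This is only a sketch on a temporary demo file (not the asker's file); the variable names are made up for illustration.

```shell
# Demo file whose last line has no trailing newline (hypothetical content).
tmp=$(mktemp)
printf '<Package>\n</Package>' > "$tmp"

# Command substitution strips a trailing newline, so this is non-empty
# only when the last byte of the file is something other than a newline.
if [ -n "$(tail -c 1 "$tmp")" ]; then
    had_newline=no
    printf '\n' >> "$tmp"   # append the missing newline
else
    had_newline=yes
fi

# Show the last byte after the fix; od -c prints a newline as \n.
last_byte=$(tail -c 1 "$tmp" | od -An -c | tr -d ' ')
echo "had newline: $had_newline, last byte now: $last_byte"
rm -f "$tmp"
```

Once the file ends with a newline, zsh no longer prints the marker after cat.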

Related

Replacing/removing square brackets in a string

I have the following text in a file:
Names of students
[Name:Anna]
[Name:Bob]
[Name:Carla]
[Name:Daniel]
[ThisShouldNotBeBeRemoved]
End of all names
Blablabla
I want to remove all lines of the text file where there is an occurrence of the string in the format of [Name:xxx], xxx being a name as a string of any length and consisting of any characters.
I have tried the following, but it wasn't successful:
$ sed '/\[Name:*\]/d' file > new-file
Is there any other way I could approach this?
I would use grep with -v
-v, --invert-match
Invert the sense of matching, to select non-matching lines. (-v is specified by POSIX.)
grep -v "\[Name:"
You need to use .* not just * ...
sed '/\[Name:.*\]/d' file > new-file
* on its own does not do what you want here: it applies to the preceding character, so :* means "zero or more colons". Adding . before it gives .*, which means "match any character zero or more times", and I think that is what you want.
If you wanted to do an in-place edit to the original file without re-directing to a new one:
Linux:
sed -i '/\[Name:.*\]/d' file
macOS:
sed -i '' '/\[Name:.*\]/d' file
* note - this overwrites the original file.
You missed something:
sed '/\[Name:.*\]/d' file > new-file
This would remove your lines that match.
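.* matches any character zero or more times.
To match only alphabetic names, enable extended regular expressions with -E so that + means "one or more":
sed -E '/\[Name:[[:alpha:]]+\]/d' file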
Names of students
[ThisShouldNotBeBeRemoved]
End of all names
Blablabla
Or, if you don't want to create a new file, try this (the brackets must stay escaped, otherwise they form a bracket expression):
sed -i '/\[Name:.*\]/d' file
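As a quick check of the corrected pattern, here is a sketch that runs it on a shortened copy of the sample text from the question; only the variable names are new.

```shell
# Shortened copy of the sample input from the question.
input='Names of students
[Name:Anna]
[Name:Bob]
[ThisShouldNotBeBeRemoved]
End of all names'

# Delete lines containing [Name:...]; \[ and \] match literal brackets,
# and .* matches the name itself.
result=$(printf '%s\n' "$input" | sed '/\[Name:.*\]/d')

printf '%s\n' "$result"
```

The bracketed line that does not start with Name: is kept, because the pattern requires the literal text [Name: before the closing bracket.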

How to remove all lines in a file containing a variable, only when located on a line somewhere between braces in BASH?

I am trying to remove all of the matches of $word from a file, but only on lines where $word is placed somewhere within { and } which also appear on the same line, e.g.:
{The cat liked} the fish.
The mouse {did not like} the cat.
The {cat did not} like the spider.
If $word is set to "cat", then lines 1 and 3 are deleted, because "cat" appears between the { and }. If $word is set to "like", then lines 1 and 2 are deleted, because this search term appears on those lines between the { and }. Line 3 is not deleted, because like appears outside of the braces.
The braces are never nested.
The braces never appear split across lines.
I have tried various things, but these all returned errors:
sed -i "/\{*$word*\}/d" ./file.txt
sed -i "/\{.*$word.*\}/d" ./file.txt
sed -i "/\{(.*)$word(.*)\}/d" ./file.txt
How can I remove all of the lines in a file containing a variable, but only when the found variable was on a line and found between two braces?
sed -i "/{.*$word.*}/d" ./file.txt
\{ has a special meaning in sed (it opens an interval expression such as \{2,3\}), so it is not the literal {. Just write { to represent the literal character. (This can be confusing if you are used to Perl regexes.)
Edit:
Be careful with -i: if this is in a script and $word is accidentally undefined or set to the empty string, this command will delete every line that contains a { followed by a }, no matter what is between them.
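I would take the answer that @cybeliak gave a little further. If you really want to match cat and not, say, scat, then you need to delimit your expression with word boundaries:
sed '/{.*[[:<:]]'"$word"'[[:>:]].*}/d'
Note - I prefer to use ' ' style quotes around the sed script to prevent any unintended side-effects, with only "$word" in double quotes. The [[:<:]] and [[:>:]] boundaries are BSD/macOS syntax; GNU sed uses \< and \> instead.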
As an aside, I am a big fan of not using the -i flag. Pipe the result into a different file and confirm for yourself that it's good, before deleting the original.
Much easier to do with awk:
awk -v s="cat" -F '[{}]' '!($2 ~ s)' file
The mouse {did not like} the cat.
awk -v s="like" -F '[{}]' '!($2 ~ s)' file
The {cat did not} like the spider.
This might work for you (GNU sed):
sed -i '/{[^}]*'"$word"'[^}]*}/d' file
N.B. $word should not contain } or /.
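Putting the points above together, here is a sketch of a small wrapper that refuses an empty word and uses the [^}]* form so the match cannot run past the closing brace. The function name delete_braced_lines is hypothetical, and the word is still assumed not to contain }, / or regex metacharacters.

```shell
# Hypothetical helper: reads text on stdin, deletes lines where the word
# given as $1 appears between { and } on the same line.
delete_braced_lines() {
    word=$1
    # An empty word would delete every line containing {...}, so refuse it.
    if [ -z "$word" ]; then
        echo "delete_braced_lines: word must not be empty" >&2
        return 1
    fi
    sed "/{[^}]*${word}[^}]*}/d"
}

# Sample lines from the question.
sample='{The cat liked} the fish.
The mouse {did not like} the cat.
The {cat did not} like the spider.'

out_cat=$(printf '%s\n' "$sample" | delete_braced_lines cat)
out_like=$(printf '%s\n' "$sample" | delete_braced_lines like)

printf 'word=cat:\n%s\n\nword=like:\n%s\n' "$out_cat" "$out_like"
```

Because the function writes to standard output instead of editing in place, the result can be inspected before it replaces the original file.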

Using sed to pull a value from an XML file, and I get whitespace at the beginning. How can I avoid that?

I'm using sed to put a value contained in an XML file into a variable.
The file looks something like this:
<?xml version="1.0" encoding="UTF-8"?>
<DataBase>
<DataBaseName>fry</DataBaseName>
</DataBase>
I'm doing this:
dbName=$(sed -n 's|<DataBaseName>\(.*\)</DataBaseName>|\1|p' path/to/DataBase.xml)
And it grabs fry correctly, however it has a tab at the beginning. What am I doing wrong in my sed command?
Try:
dbName=$(sed -n 's|\s*<DataBaseName>\(.*\)</DataBaseName>|\1|p' path/to/DataBase.xml)
(You need to match the possible whitespace characters, too.)
You are not matching the complete line.
Try this:
dbName=$(sed -n 's|[ \t]*<DataBaseName>\(.*\)</DataBaseName>|\1|p' path/to/DataBase.xml)
Because the leading tabs and spaces are now part of the match, they are removed along with <DataBaseName> and </DataBaseName>. All other characters on the line remain. Check this tutorial for more.
For example, if you modify your file to:
<DataBaseName>fry</DataBaseName>something
This command:
sed -n 's|[ \t]*<DataBaseName>\(.*\)</DataBaseName>|\1|p' sed_file
will output:
frysomething
simply because your regex didn't match something.
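A sketch that matches the complete line, so that neither leading whitespace nor trailing text survives. It uses the POSIX class [[:space:]] instead of \s or \t, which are not understood by every sed; the sample XML is built inline here as an assumption, with a tab before the element.

```shell
# Inline stand-in for DataBase.xml, with a tab before <DataBaseName>.
xml=$(printf '<?xml version="1.0" encoding="UTF-8"?>\n<DataBase>\n\t<DataBaseName>fry</DataBaseName>\n</DataBase>\n')

# Original command from the question: the leading tab is kept.
raw=$(printf '%s\n' "$xml" | sed -n 's|<DataBaseName>\(.*\)</DataBaseName>|\1|p')

# Anchored version: ^[[:space:]]* consumes leading whitespace and the
# trailing .*$ consumes anything after the closing tag.
dbName=$(printf '%s\n' "$xml" | sed -n 's|^[[:space:]]*<DataBaseName>\(.*\)</DataBaseName>.*$|\1|p')

printf 'raw=[%s] dbName=[%s]\n' "$raw" "$dbName"
```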

Trying to delete lines from file with sed -- what am I doing wrong?

I have a .csv file where I'd like to delete the lines between line 355686 and line 1048576.
I used the following command in Terminal (on Mac OS X):
sed -i.bak -e '355686,1048576d' trips3.csv
This produces a file called trips3.csv.bak -- but it still has a total of 1,048,576 lines when I reopen it in Excel.
Any thoughts or suggestions you have are welcome and appreciated!
I suspect the problem is that Excel is using carriage return (\r, octal 015) to separate records, while sed assumes lines are separated by linefeed (\n, octal 012); this means that sed will treat the entire file as one really long line. I don't think there's an easy way to get sed to recognize CR as a line delimiter, but it's easy with perl:
perl -n -015 -i.bak -e 'print if $. < 355686 || $. > 1048576' trips3.csv
(Note: if 1048576 is the number of "lines" in the file, you can leave off the || $. > 1048576 part.)
I'm not sure about the OS X sed implementation, but GNU sed, when passed the -i flag with a backup extension, first copies the original file to the specified backup and then modifies the original file in place. You should expect to see a reduced number of lines in the original file, trips3.csv, not in trips3.csv.bak.
Some incantation that should do the job (if you have Ruby installed, obviously)
ruby -pe 'exit if $. >= 355686' < trips3.csv > output.csv
If you prefer Perl/Python, just follow the documentation to do something similar and you should be fine. :)
Also, I'm using one of the Ruby one-liners, by Dave.
EDIT: Sorry, forgot to say that you need '> output.csv' to redirect stdout to a file.
awk 'NR < 355686 || NR > 1048576' your_file
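To convince yourself that a range delete behaves as expected before running it on a million-line file, it can help to try it on a tiny stand-in. The sketch below uses ten numbered lines in place of the CSV, and also shows one possible way (not taken from the answers above) to handle bare carriage returns by converting them with tr first.

```shell
# Stand-in for the CSV: ten numbered lines instead of a million.
kept=$(seq 1 10 | sed '4,8d')
kept_count=$(printf '%s\n' "$kept" | wc -l | tr -d ' ')

# If records are separated by bare carriage returns, convert them to
# newlines first so that sed's line addresses mean what you expect.
cr_kept=$(printf 'a\rb\rc\rd\re\r' | tr '\r' '\n' | sed '2,4d')

printf 'kept %s lines:\n%s\n\nafter CR conversion:\n%s\n' "$kept_count" "$kept" "$cr_kept"
```

Both ends of the range are inclusive, so '4,8d' removes five lines and leaves lines 1 to 3, 9 and 10.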

How to append to specific lines in a flat file using shell script

I have a flat file that contains something like this:
11|30646|654387|020751520
11|23861|876521|018277154
11|30645|765418|016658304
Using shell script, I would like to append a string to certain lines in this file, if those lines contain a specific string.
For example, in the above file, for lines containing 23861, I would like to append a string "Processed" at the end, so that the file becomes:
11|30646|654387|020751520
11|23861|876521|018277154|Processed
11|30645|765418|016658304
I could use sed to append the string to all lines in the file, but how do I do it for specific lines ?
I'd do it this way:
sed '/|23861|/{s/$/|Something/;}' file
This is similar to Marcelo's answer but doesn't require extended expressions and is, I think, a little cleaner. Note that | is a literal character in a basic regular expression and must not be escaped; in GNU sed, \| means alternation and would make the address match every line.
First, match lines having 23861 between pipes:
/|23861|/
Then, on those lines, replace the end-of-line with the string |Something:
{s/$/|Something/;}
If you want to do more than one of these, you could simply list them:
sed '/|23861|/{s/$/|Something/;};/|30645|/{s/$/|SomethingElse/;}' file
Use the following awk-script:
$ awk '/23861/ { $0=$0 "|Processed" } {print}' input
11|30646|654387|020751520
11|23861|876521|018277154|Processed
11|30645|765418|016658304
or, using sed:
$ sed 's/\(.*23861.*$\)/\1|Processed/' input
11|30646|654387|020751520
11|23861|876521|018277154|Processed
11|30645|765418|016658304
Use the substitution command:
sed -i~ -E 's/(\|23861\|.*)/\1|Processed/' flat.file
(Note: the -i~ performs the substitution in-place. Just leave it out if you don't want to modify the original file.)
You can use the shell
while IFS= read -r line
do
    case "$line" in
        *23861*) line="$line|Processed";;
    esac
    printf '%s\n' "$line"
done < file > tempo && mv tempo file
sed is just a stream version of ed, which has a similar command set but was designed to edit files in place (allegedly interactively, but you wouldn't want to use it that way unless all you had was one of these). Something like
field_2_value=23861
appended_text='|processed'
line_match_regex="^[^|]*|$field_2_value|"
ed "$file" <<EOF
g/$line_match_regex/s/$/$appended_text/
wq
EOF
should get you there.
Note that the $ in .../s/$/... is not expanded by the shell, unlike $line_match_regex and $appended_text, because there's no such variable as $/. It is passed through as-is to ed, which interprets it as the place to substitute ($ being regex-speak for "end of line").
The syntax to do the same job in sed, should you ever want to do this to a stream rather than a file in place, is very similar except that you don't need the leading g before the regex address:
sed -e "/$line_match_regex/s/$/$appended_text/" "$input_file" >"$output_file"
You need to be sure that the values you put in field_2_value and appended_text never contain slashes, because ed's g and s commands use those for delimiters.
If they might do, and you're using bash or some other shell that allows ${name//search/replace} parameter expansion syntax, you could fix them up on the fly by substituting \/ for every / during expansion of those variables. Because bash also uses / as a substitution delimiter and also uses \ as a character escape, this ends up looking horrible:
appended_text='|n/a'
ed "$file" <<EOF
g/${line_match_regex//\//\\/}/s/$/${appended_text//\//\\/}/
wq
EOF
but it does work. Note that both ed and sed require a trailing / after the replacement text in s/search/replace/ while bash's ${name//search/replace} syntax doesn't.
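Since the data is pipe-delimited, another option is to compare the second field exactly instead of searching the whole line, which avoids both accidental matches in other columns and any escaping of | or /. This is a sketch with awk; the variable names id and suffix are made up for illustration.

```shell
# Sample records from the question.
data='11|30646|654387|020751520
11|23861|876521|018277154
11|30645|765418|016658304'

# -F'|' splits on the pipe; append the suffix only when field 2 equals id.
result=$(printf '%s\n' "$data" |
    awk -F'|' -v id=23861 -v suffix='|Processed' \
        '$2 == id { $0 = $0 suffix } { print }')

printf '%s\n' "$result"
```

Passing the values with -v keeps them out of the awk program text, so they never need to be escaped for a regex or a delimiter.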
