Unix: Remove date from filename using sed without modifying existing one

I have legacy code that transmits a file only if the command contains a date. However, the client transmission must not have the date appended to the filename. The legacy code cannot be modified, since many other transmissions depend on it. So my requirement is to keep the date parameter in the command, but then remove it again within the same single command.
Condition in legacy code:
grep '\`date' $COMMAND
Note: COMMAND will contain the complete command defined below and not the filename (not CMD output).
So my command has to contain `date. I added a command like the one below:
CMD=`echo prefix_filename.txt | sed 's/^prefix_//'`_`date +%m%d%Y`
The above command removes prefix_ and sends the filename. Here I get filename.txt_09232016 as output. I added `date only because the legacy code checks whether the command contains it. Is there a way to remove the date again in the same command, so that my output is filename.txt?
Current output:
filename.txt_09232016
Expected output:
filename.txt

Get the file name before date part:
echo 'filename.txt_09232016' | grep -o '^.*\.txt'
Or remove date from the end of the file:
echo 'filename.txt_09232016' | sed 's/_[0-9]\+$//'

There are a number of things you can do to improve and simplify your code. The main one is that Bash has very nice built-in string manipulation. Another is that you should probably use $(...) instead of the `...` notation:
CMD=`echo prefix_filename.txt | sed 's/^prefix_//'`_`date +%m%d%Y`
Can be replaced with
ORIG=prefix_filename.txt
CMD=${ORIG#prefix_}_$(date +%m%d%Y)
Continuing,
echo $CMD
NODATE=${CMD%_*}
echo $NODATE
This prints
filename.txt_09232016
filename.txt
The construct ${var#pattern} removes the shortest match of pattern from the start of your variable: in this case, prefix_. Similarly, the construct ${var%pattern} removes the shortest match of pattern from the end of your string: in this case, _*.
In the first case, you could just as well have used ${var##pattern}, since prefix_ is a fixed string and its shortest and longest matches are the same. In the second case, however, you could not use ${var%%pattern}: _* is a wildcard pattern, and you want to truncate starting at the last underscore, not the first one.
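To make the shortest/longest distinction concrete, here is a small sketch. The file name with two extra underscores is made up for illustration; it is not from the question.

```shell
#!/bin/bash
# Hypothetical name with several underscores, so the four operators differ.
name="prefix_file_name.txt_09232016"

shortest_front=${name#*_}    # removes "prefix_"
longest_front=${name##*_}    # removes everything up to the last "_"
shortest_back=${name%_*}     # removes "_09232016"
longest_back=${name%%_*}     # removes everything from the first "_"

printf '%s\n' "$shortest_front" "$longest_front" "$shortest_back" "$longest_back"
```

This prints file_name.txt_09232016, 09232016, prefix_file_name.txt and prefix, one per line.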
Just as an FYI, the links point to www.tldp.org, which has the best Bash manual I have come across by far. It gets dense sometimes, but the explanations are generally worth it in the end.

Just do that:
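echo filename.txt_09232016 | sed 's/_[^_]*$//'
Here, you replace (with nothing) the last '_' and all characters after it up to the end of the string ($). [^_] matches any character other than '_', so the match cannot start at an earlier underscore. The expression is quoted so that the shell does not try to expand it.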
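For the single-command requirement in the question, one possible sketch is to keep the backquoted date call but discard its output, so that nothing is appended in the first place. This assumes the legacy check only needs the text `date to appear somewhere in the command, which is how the question describes it.

```shell
#!/bin/bash
# The second backquoted substitution still runs date, but its output goes to
# /dev/null, so it expands to an empty string and adds nothing to the name.
CMD=`echo prefix_filename.txt | sed 's/^prefix_//'``date +%m%d%Y >/dev/null`
echo "$CMD"
```

This prints filename.txt while the command text still contains `date.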

Related

How to remove duplicate with bash script command xargs when the string has some quotes ""?

I am a newbie in bash script.
Here is my environment:
Mac OS X Catalina
/bin/bash
I found here a mix of several commands to remove duplicate strings within a string.
I need it for my program, which updates the .zshrc profile file.
Here is my code:
#!/bin/bash
a='export PATH="/Library/Frameworks/Python.framework/Versions/3.8/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/opt/local/bin:"'
myvariable=$(echo "$a" | tr ':' '\n' | sort | uniq | xargs)
echo "myvariable : $myvariable"
Here is the output:
xargs: unterminated quote
myvariable :
After some tests, I know that the issue is caused by the double quotes "" inside my variable $a.
Why am I so sure?
Because when I execute this code for example:
#!/bin/bash
a="/Library/Java/JavaVirtualMachines/jdk1.8.0_271.jdk/Contents/Home:/Library/Java/JavaVirtualMachines/jdk1.8.0_271.jdk/Contents/Home"
myvariable=$(echo "$a" | tr ':' '\n' | sort | uniq | xargs)
echo "myvariable : $myvariable"
where $a doesn't contain any quotes, I get the correct output:
myvariable : /Library/Java/JavaVirtualMachines/jdk1.8.0_271.jdk/Contents/Home
I tried to search for a solution for "xargs: unterminated quote" but each answer found on the web is for a particular case which doesn't correspond to my problem.
As I am a newbie and this line command is using several complex commands, I was wondering if anyone know the magic trick to make it work.

Basically, you want to remove duplicates from a colon-separated list.
I don't know if this is considered cheating, but I would do this in another language and invoke it from bash. First I would write a script for this purpose in zsh. It accepts a colon-separated string as its parameter and outputs a colon-separated list with the duplicates removed:
#!/bin/zsh
original=${1?Parameter missing} # Original string
# Auxiliary array, which is set up to act like a Set, i.e. without
# duplicates
typeset -aU nodups_array
# Split the original strings on the colons and store the pieces
# into the array, thereby removing duplicates. The core idea for
# this is stolen from:
# https://stackoverflow.com/questions/2930238/split-string-with-zsh-as-in-python
nodups_array=("${(@s/:/)original}")
# Join the array back with colons and write the resulting string
# to stdout.
echo ${(j':')nodups_array}
If we call this script nodups_string, you can invoke it in your bash-setting as:
#!/bin/bash
a_path="/Library/Frameworks/Python.framework/Versions/3.8/bin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin:/opt/local/bin:"
nodups_a_path=$(nodups_string "$a_path")
my_variable="export PATH=$nodups_a_path"
echo "myvariable : $my_variable"
The overall effect would be literally what you asked for. However, there is still an open problem I should point out: if one of the PATH components happens to contain a space, the resulting export statement cannot be executed validly. This problem is also inherent in your original approach; you just didn't mention it. You could do something like
my_variable="export PATH=\"$nodups_a_path\""
to avoid this. Of course, I wonder why you go to such lengths to generate a syntactically valid export command instead of simply setting PATH directly where it is needed.
Side note: If you used zsh as your shell instead of bash and only wanted to keep your PATH free of duplicates, a simple
typeset -U path
would suffice, and zsh takes care of the rest.

With awk:
awk -v RS=[:\"] 'NR > 1 { pth[$0]="" } END { for (i in pth) { if (i !~ /[[:space:]]+/ && i != "" ) { printf "%s:",i } } }' <<< "$a"
Set the record separator to : and double quotes. Then, when the record number is greater than one, set up an array called pth with the path as the index. At the end, loop through the array, printing the paths separated by :. Note that the order of a for (i in pth) loop is unspecified, so the original order of the entries is not preserved.
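If the order of the entries matters (it usually does for PATH), a variation is to print each entry the first time it is seen. The sketch below is an illustration, not part of the answers above: dedupe_path is a made-up helper name, and it is given only the path value rather than the whole export line, so no stray double quote ever reaches a tool such as xargs that interprets quotes.

```shell
#!/bin/bash
# dedupe_path is a hypothetical helper: it takes a colon-separated list and
# prints it with empty entries and later duplicates removed, order preserved.
dedupe_path() {
    # RS=: makes every entry one record; ORS=: joins the survivors again.
    # The final sed drops the trailing ':' left behind by ORS.
    printf '%s' "$1" |
        awk -v RS=: -v ORS=: 'length($0) && !seen[$0]++' |
        sed 's/:$//'
}

a_path="/usr/local/bin:/usr/bin:/bin:/usr/bin:/usr/local/bin:"
clean=$(dedupe_path "$a_path")
echo "export PATH=\"$clean\""
```

With the sample value this prints export PATH="/usr/local/bin:/usr/bin:/bin".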

Create variable by combining text + another variable

Long story short, I'm trying to grep a value contained in the first column of a text file by using a variable.
Here's a sample of the script, with the grep command that doesn't work:
for ii in `cat list.txt`
do
grep '^$ii' >outfile.txt
done
Contents of list.txt :
123,"first product",description,20.456789
456,"second product",description,30.123456
789,"third product",description,40.123456
If I perform grep '^123' list.txt, it produces the correct output... Just the first line of list.txt.
If I try to use the variable (ie grep '^ii' list.txt) I get a "^ii command not found" error. I tried to combine text with the variable to get it to work:
VAR1= "'^"$ii"'"
but the VAR1 variable contained a carriage return after the $ii variable:
'^123
'
I've tried a laundry list of things to remove the CR/LF (e.g. sed and awk), but to no avail. There has to be an easier way to perform the grep command using the variable. I would prefer to stay with the grep command because it works perfectly when performing it manually.

You have mixed things up in the command grep '^ii' list.txt. The character ^ anchors the match to the beginning of the line, and $ expands the value of a variable.
When you want to grep for 123 in the variable ii at the beginning of the line, use
ii="123"
grep "^$ii" list.txt
(You should use double quotes here, because single quotes prevent the shell from expanding $ii.)
This is a good moment for learning good habits: keep using lowercase variable names (well done) and use curly braces (they do no harm and are needed in other cases):
ii="123"
grep "^${ii}" list.txt
Now we both are forgetting something: Our grep will also match
1234,"4-digit product",description,11.1111. Include a , in the grep:
ii="123"
grep "^${ii}," list.txt
And how did you get the "^ii command not found" error? I think you used backquotes (the old syntax for command substitution; $(...) is better, as in echo "example: $(date)") and you wrote
grep `^ii` list.txt # wrong !
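The effect of the quoting and of the trailing comma can be checked with a small self-contained sketch. The temporary file and its three sample lines are made up for the test; they mirror the list.txt format from the question.

```shell
#!/bin/bash
# Build a throwaway sample file in the same format as list.txt.
tmp=$(mktemp)
cat >"$tmp" <<'EOF'
123,"first product",description,20.456789
1234,"4-digit product",description,11.1111
456,"second product",description,30.123456
EOF

ii="123"
# Single quotes: grep receives the literal text ^$ii, which matches nothing.
# grep -c exits non-zero on zero matches, hence the "|| true".
single=$(grep -c '^$ii' "$tmp" || true)
# Double quotes: the shell expands the variable, so the pattern is ^123.
double=$(grep -c "^${ii}" "$tmp" || true)
# With the comma the pattern is ^123, and 1234 no longer matches.
exact=$(grep -c "^${ii}," "$tmp" || true)

echo "single=$single double=$double exact=$exact"
rm -f "$tmp"
```

This prints single=0 double=2 exact=1.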

#!/bin/sh
# Read every character before the first comma into the variable ii.
while IFS=, read -r ii rest; do
    # Echo the value of ii. If these values are what you want, you're done; no
    # need for grep.
    echo "ii = $ii"
    # If you want to find something associated with these values in another
    # file, however, you can grep the file for the values. Use double quotes so
    # that the value of $ii is substituted in the argument to grep. Append with
    # >> so that each iteration does not overwrite the previous matches.
    grep "^$ii" some_other_file.txt >>outfile.txt
done <list.txt

how to remove comments from a bash script

I'm trying to write a script that takes a script file as a parameter. It should remove the comments from the file and pipe the result to another script (with no temp file, if possible).
At the beginning I was thinking of doing this:
cut -d"#" -f1 $1 | ./script_name
but it also cuts off parts of lines that aren't comments, because a few commands use # in them (counting string characters, for example).
Is there a way of doing it without a temp file?

You can use inline sed with better regex:
sed -i.bak '/^[[:blank:]]*#/d' "$1"
^[[:blank:]]*# matches a # only if nothing but optional whitespace precedes it on the line, so only full-line comments are deleted.
The -i.bak option edits the input file in place and keeps a backup with the extension .bak in case something goes wrong.
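Since the question asks to avoid a temporary file, the same sed expression can also be used as a plain filter, without -i, and piped straight into the next command. A minimal sketch follows; strip_full_line_comments and the sample script text are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: filter stdin (or the named files) and drop every line
# whose first non-blank character is '#'. A '#' inside a command, such as
# ${#name}, is left alone, and so are trailing comments.
strip_full_line_comments() {
    sed '/^[[:blank:]]*#/d' "$@"
}

sample='#!/bin/bash
# a full-line comment
name=abc
  # an indented comment
echo "${#name}"  # trailing comments are kept'

# Filter the sample, then feed the result to another interpreter via a pipe.
stripped=$(printf '%s\n' "$sample" | strip_full_line_comments)
printf '%s\n' "$stripped"
result=$(printf '%s\n' "$stripped" | bash)
```

Only the name=abc and echo lines survive, and running the filtered text prints 3. Note that the shebang line is removed as well, because it also starts with #.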

Here's one very bash-specific way of stripping comments from a script file. It also strips the shebang line, if there was one (after all, it's a comment), and does some reformatting:
tmp_="() {
$(<script_name)
}" bash -c 'declare -f tmp_' | tail -n+2
This converts the script into a function, and uses the bash built-in declare to pretty-print the resulting function (the tail removes the function name, but not the surrounding braces; a more complicated post-process could remove them, too, if that were judged necessary).
The pretty-printing is done in a child bash process both to avoid polluting the execution environment with the temporary function and because the subprocess will effectively recognize the string value of the variable as a function.
Update:
Sadly, post shellshock the above no longer works. However, for patched bashes, the following probably does:
env "BASH_FUNC_tmp_%%=() {
$(<script_name)
}" bash -c 'declare -f tmp_' | tail -n+2
Also, note that this method does not strip comments which are internal to command or process substitution.

Yes, in general this type of problem can be solved without a temporary file.
This however will also depend on the complexity of the parsing required to determine when the comment delimiter character doesn't in fact introduce a comment.

Using Python 3 with pygments installed:
from pygments.lexers.shell import BashLexer
from pygments.token import Token, is_token_subtype
def delete_comments(fname):
    # Read the whole script first, because the same file is rewritten below.
    with open(fname, "r") as f:
        src = f.read()
    with open(fname, "w") as dst:
        for token in BashLexer().get_tokens(src):
            # Copy every token that is not a comment.
            if not is_token_subtype(token[0], Token.Comment):
                dst.write(token[1])
            # The shebang line is lexed as a comment subtype; keep it.
            if token[0] == Token.Comment.Hashbang:
                dst.write(token[1])
if __name__ == "__main__":
    delete_comments("/path/to/your/shellScript.sh")

Match exact word in bash script, extract number from string

I'm trying to create a very simple bash script that will open a new link based on the input command.
Use case #1
$ ./myscript longname55445
It should take the number 55445 and assign it to a variable, which will later be used to open a new link based on the given number.
Use case #2
$ ./myscript l55445
It should do the exact same thing as above by taking the number and then open the same link.
Use case #3
$ ./myscript 55445
If no prefix given then we just simply open that same link as a fallback.
So far this is what I have
#!/bin/sh
BASE_URL=http://api.domain.com
input=$1
command=${input:0:1}
if [ "$command" == "longname" ]; then
    number=${input:1:${#input}}
    url="$BASE_URL?id="$number
    open $url
elseif [ "$command" == "l" ]; then
    number=${input:1:${#input}}
    url="$BASE_URL?id="$number
    open $url
else
    number=${input:1:${#input}}
    url="$BASE_URL?id="$number
    open $url
fi
But this will always fallback to the elseif there.
I'm using zsh at the moment.

input=$1
command=${input:0:1}
sets command to the first character of the first argument. It's not possible for a one character string to be equal to an eight-character string ("longname"), so the if condition must always fail.
Furthermore, both your elseif and your else clauses set
number=${input:1:${#input}}
Which you could have written more simply as
number=${input:1}
But in both cases, you're dropping the first character of input. Presumably in the else case, you wanted the entire first argument.
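A short sketch of what these expansions produce, plus one way to handle all three use cases by stripping a known prefix instead of comparing characters. The function name extract_number is made up for illustration.

```shell
#!/bin/bash
input="longname55445"
first_char=${input:0:1}          # "l": one character, never "longname"
rest=${input:1}                  # "ongname55445": only the first character is dropped
same_rest=${input:1:${#input}}   # identical to ${input:1}

# Hypothetical helper: strip an optional "longname" or "l" prefix.
extract_number() {
    local value=$1
    value=${value#longname}      # drop the long prefix if present
    value=${value#l}             # drop the short prefix if present
    printf '%s\n' "$value"
}

extract_number "longname55445"
extract_number "l55445"
extract_number "55445"
```

All three calls print 55445.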

see whether this construct is helpful for your purpose:
#!/bin/bash
name="longname55445"
echo "${name##*[A-Za-z]}"
this assumes a letter adjacent to number.
The following is NOT another way to write the same, because it is wrong.
Please see comments below by mklement0, who noticed this. Mea culpa.
echo "${name##*[:letter:]}"

You have command=${input:0:1}
It takes only the first character, and you compare it to "longname". Of course that will fail and go on to the elseif.
The key problem is to check whether the input begins with l, longname, or nothing. In any of these three cases, take the trailing number.
One grep line could do it, you can just grep on input and get the returned text:
kent$ grep -Po '(?<=longname|l|^)\d+' <<<"l234"
234
kent$ grep -Po '(?<=longname|l|^)\d+' <<<"longname234"
234
kent$ grep -Po '(?<=longname|l|^)\d+' <<<"234"
234
kent$ grep -Po '(?<=longname|l|^)\d+' <<<"foobar234"
<we got nothing>

You can use regex matching in bash.
[[ $1 =~ [0-9]+ ]] && number=$BASH_REMATCH
You can also use regex matching in zsh.
[[ $1 =~ [0-9]+ ]] && number=$MATCH
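The unanchored pattern above accepts the first run of digits anywhere in the argument. If only the three forms from the question should be accepted, the regex can be anchored and the digits taken from a capture group. This is a bash-only sketch, and parse_id is a made-up name.

```shell
#!/bin/bash
# Hypothetical helper: print the number if the argument is "longname<digits>",
# "l<digits>" or "<digits>"; return non-zero for anything else.
parse_id() {
    # Keeping the regex in a variable avoids quoting surprises inside [[ ]].
    local re='^(longname|l)?([0-9]+)$'
    if [[ $1 =~ $re ]]; then
        # Group 1 is the optional prefix, group 2 the digits.
        printf '%s\n' "${BASH_REMATCH[2]}"
    else
        return 1
    fi
}

parse_id "longname55445"
parse_id "l55445"
parse_id "55445"
parse_id "foobar234" || echo "rejected"
```

The first three calls print 55445, and the last one prints rejected.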

Based on the OP's following clarification in a comment,
I'm only looking for the numbers [...] given in the input.
the solution can be simplified as follows:
#!/bin/bash
BASE_URL='http://api.domain.com'
# Strip all non-digits from the 1st argument to get the desired number.
number=$(tr -dC '[:digit:]' <<<"$1")
open "$BASE_URL?id=$number"
Note the use of a bash shebang, given the use of 'bashism' <<< (which could easily be restated in a POSIX-compliant manner).
Similarly, the OP's original code should use a bash shebang, too, due to use of non-POSIX substring extraction syntax.
However, judging by the use of open to open a URL, the OP appears to be on OSX, where sh is essentially bash (though invocation as sh does change behavior), so it'll still work there. Generally, though, it's safer to be explicit about the required shell.
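As a sketch of the POSIX-compliant restatement mentioned above, the here-string can be replaced by printf and a pipe. The function name digits_only is made up for illustration, and the open call is left out so that the snippet runs anywhere.

```shell
#!/bin/sh
# Hypothetical helper: print only the digits contained in the first argument.
# printf | tr replaces the bash-only <<< here-string.
digits_only() {
    printf '%s' "$1" | tr -cd '[:digit:]'
}

BASE_URL='http://api.domain.com'
number=$(digits_only "longname55445")
echo "$BASE_URL?id=$number"
```

This prints http://api.domain.com?id=55445.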

Unix Shell Script to take multiple files from standard input (csh)

Using either the for loop or the pipe (both work with one filename), I need to figure out how to accept unlimited specified files from standard input. I have tried regular expressions, and various wildcard forms. The two main issues I'm running into: only the first file is put through the script or every single file in the directory is put through. This is an assignment for a basic Unix Course and my problem thus far is over-complication. Based on the rest of the semester, there's a simple fix for what I'm wanting to do and here I've spent two hours perusing hundreds of websites and posts making my head spin.
EDIT: The command line prompt would be something like this ~/dir/script currentWord newWord fileName1 fileName2 fileName3
#!/bin/csh
set currentWord=$1
set newWord=$2
set fileName=$3
if { grep -q $1 *$3 } then
sed -i.bak -e "s/$1/$2/g" $3
else
echo "The string is not found."
endif
#grep -q $1 $3 | sed -i.bak -e "s/$1/$2/g" $3

You can access the command line arguments using $argv[]. To loop over them but skip the first two, you can use this construct:
foreach file ($argv[3-])
# do stuff here, eg
echo $file
end
You shouldn't use csh, though. If your professor has instructed you to do so, I would question this.
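For comparison, here is a rough bash equivalent of the same idea: shift away the two words and loop over the remaining arguments. This is only a sketch. The function name replace_in_files is made up, and, like the original script, it inserts the words into the sed expression unescaped, so they must not contain / or regex metacharacters.

```shell
#!/bin/bash
# Hypothetical helper: replace_in_files CURRENT NEW FILE...
replace_in_files() {
    local current_word=$1 new_word=$2
    shift 2                          # "$@" now holds only the file names
    local file
    for file in "$@"; do
        if grep -q -- "$current_word" "$file"; then
            # Edit in place, keeping a .bak copy of the original file.
            sed -i.bak -e "s/$current_word/$new_word/g" "$file"
        else
            echo "The string is not found in $file."
        fi
    done
}
```

It would be called like the command line in the question, for example replace_in_files currentWord newWord fileName1 fileName2 fileName3.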
