Reverse DNS-style string - bash

In a script running in a Debian environment, what is a good way to reverse a DNS-style string?
For example, if my script has:
example.org
What would be a good way to reverse it, so that the string would read:
org.example
A longer example:
www.example.org
should reverse to:
org.example.www

You could use an iterative approach to build a reversed address:
Initialize the reversed result to empty string
While the string contains .
Extract the last segment chopping off everything from the start until a dot using parameter expansion ${var##*.}
Chop off the last segment with another parameter expansion ${var%.*}
Append to the reversed result the previously saved last segment
Here's one way to implement using pure Bash features:
rdns() {
local s=$1
local reversed last
while [[ "$s" == *.* ]]; do
last=${s##*.}
s=${s%.*}
reversed=$reversed$last.
done
reversed=$reversed$s
echo "$reversed"
}
rdns example
rdns example.org
rdns www.example.org
Outputs:
example
org.example
org.example.www

this might do
s='www.example.org'
echo $s | tr '.' '\n' | tac | paste -sd.
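The pipeline above splits on dots, reverses the pieces, and joins them again. The same idea can be sketched without external commands, assuming bash (for arrays and here-strings); the function name rdns_array is made up for illustration:

```bash
rdns_array() {
    local IFS=.                      # split and join on dots, inside this function only
    local -a parts=() reversed=()
    local i
    read -ra parts <<< "$1"          # www.example.org -> (www example org)
    for (( i = ${#parts[@]} - 1; i >= 0; i-- )); do
        reversed+=("${parts[i]}")
    done
    echo "${reversed[*]}"            # [*] joins with the first character of IFS
}
rdns_array www.example.org           # org.example.www
```

Because IFS is declared local, the caller's word splitting is not affected.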

You can try
sed -E 's/\./\n/g;s/$/\n/;:A;s/([^\n]*)\n(.*)(\n)(.*)/\2\3.\1\4/;tA;s/\n//'


Linux bash parsing URL

How to parse the url, for example: https://download.virtualbox.org/virtualbox/6.1.36/VirtualBox-6.1.36-152435-Win.exe
So that only virtualbox.org/virtualbox/6.1.36 remains?
TEST_URLS=(
https://download.virtualbox.org/virtualbox/6.1.36/VirtualBox-6.1.36-152435-Win.exe
https://github.com/notepad-plus-plus/notepad-plus-plus/releases/download/v8.4.4/npp.8.4.4.Installer.x64.exe
https://downloads.sourceforge.net/project/libtirpc/libtirpc/1.3.1/libtirpc-1.3.1.tar.bz2
)
for url in "${TEST_URLS[@]}"; do
without_proto="${url#*:\/\/}"
without_auth="${without_proto##*@}"
[[ $without_auth =~ ^([^:\/]+)(:[[:digit:]]+\/|:|\/)?(.*) ]]
PROJECT_HOST="${BASH_REMATCH[1]}"
PROJECT_PATH="${BASH_REMATCH[3]}"
echo "given: $url"
echo " -> host: $PROJECT_HOST path: $PROJECT_PATH"
done
Using sed to match whether a sub domain is present (no matter how deep) or not.
$ sed -E 's~[^/]*//(([^.]*\.)+)?([^.]*\.[a-z]+/[^0-9]*[0-9.]+).*~\3~' <<< "${TEST_URLS[0]}"
virtualbox.org/virtualbox/6.1.36
Or in a loop
for url in "${TEST_URLS[@]}"; do
sed -E 's~[^/]*//(([^.]*\.)+)?([^.]*\.[a-z]+/[^0-9]*[0-9.]+).*~\3~' <<< "$url"
done
virtualbox.org/virtualbox/6.1.36
github.com/notepad-plus-plus/notepad-plus-plus/releases/download/v8.4.4
sourceforge.net/project/libtirpc/libtirpc/1.3.1
With your shown samples here is an awk solution. Written and tested in GNU awk.
awk '
match($0,/https?:\/\/([^/]*)(\/.*)\//,arr){
  # arr[1] is the host name, arr[2] the path up to the last slash
  num=split(arr[1],arr1,".")
  if(num>2){
    # more than two labels: keep only the last two (domain and TLD)
    firstVal=arr1[num-1] "." arr1[num]
  }
  else{
    firstVal=arr[1]
  }
  print firstVal arr[2]
}
' Input_file
Explanation: This uses GNU awk's match function, whose third argument stores the capture groups in an array (arr). The regex https?:\/\/([^/]*)(\/.*)\/ (which could also be anchored as ^https?:\/\/([^/]*)(\/.*)\/) creates two capture groups: the host name and the path up to the last slash. The host is then split on dots; if it has more than two labels, only the last two are kept, otherwise the whole host is kept. Finally the host and the path are printed together.
I thought about regex, but cut makes this easy.
url=https://download.virtualbox.org/virtualbox/6.1.36/VirtualBox-6.1.36-152435-Win.exe
echo $url | grep -Po '([^\/]*)(?=[0-9\.]*)(.*)\/' | cut -d '/' -f 3-
Result
virtualbox.org/virtualbox/6.1.36
So, if I am correct in assuming that you need to extract a string of the form...
hostname.tld/dirname
...where tld is the top-level domain and dirname is the path to the file.
So filtering out any url scheme and subdomains at the beginning, then also filtering out any file basename at the end?
All solutions have assumptions. This one assumes one of the original three-letter top-level domains, i.e. .com, .org, .net, .int, .edu, .gov, .mil.
This possible solution uses sed with the -r option for the regular expressions extension.
It creates two filters and uses them to chop off the ends that you don't want (hopefully).
It also uses a capture group in filter_end, so as to keep the / in the filter.
test_urls=(
'https://download.virtualbox.org/virtualbox/6.1.36/VirtualBox-6.1.36-152435-Win.exe'
'https://github.com/notepad-plus-plus/notepad-plus-plus/releases/download/v8.4.4/npp.8.4.4.Installer.x64.exe'
'https://downloads.sourceforge.net/project/libtirpc/libtirpc/1.3.1/libtirpc-1.3.1.tar.bz2'
)
for url in "${test_urls[@]}"
do
filter_start=$(
echo "$url" | \
sed -r 's/([^.\/][a-z]+\.[a-z]{2,})\/.*//' )
filter_end=$(
echo "$url" | \
sed 's/.*\(\/\)/\1/g' )
out_string="${url#$filter_start}"
out_string="${out_string%$filter_end}"
echo "$out_string"
done
Output:
virtualbox.org/virtualbox/6.1.36
github.com/notepad-plus-plus/notepad-plus-plus/releases/download/v8.4.4
sourceforge.net/project/libtirpc/libtirpc/1.3.1
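For comparison, the same trimming can be sketched with parameter expansion alone. This is only a sketch under two assumptions that are not from the question: the host contains at least one dot, and the part to keep is always the last two labels of the host (which is wrong for suffixes such as .co.uk). The function name trim_url is hypothetical:

```bash
trim_url() {
    local url=$1 rest host path tld domain
    rest=${url#*://}         # drop the scheme
    rest=${rest%/*}          # drop the file basename
    host=${rest%%/*}         # everything before the first slash
    path=${rest#"$host"}     # remaining path with its leading slash (may be empty)
    tld=${host##*.}          # last label of the host
    domain=${host%.*}        # host without the last label
    domain=${domain##*.}     # label just before the TLD
    echo "$domain.$tld$path"
}
trim_url 'https://download.virtualbox.org/virtualbox/6.1.36/VirtualBox-6.1.36-152435-Win.exe'
# virtualbox.org/virtualbox/6.1.36
```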

What ##*/ does in bash? [duplicate]

I have a string like this:
/var/cpanel/users/joebloggs:DNS9=domain.example
I need to extract the username (joebloggs) from this string and store it in a variable.
The format of the string will always be the same with exception of joebloggs and domain.example so I am thinking the string can be split twice using cut?
The first split would split by : and we would store the first part in a variable to pass to the second split function.
The second split would split by / and store the last word (joebloggs) into a variable
I know how to do this in PHP using arrays and splits but I am a bit lost in bash.
To extract joebloggs from this string in bash using parameter expansion without any extra processes...
MYVAR="/var/cpanel/users/joebloggs:DNS9=domain.example"
NAME=${MYVAR%:*} # retain the part before the colon
NAME=${NAME##*/} # retain the part after the last slash
echo $NAME
Doesn't depend on joebloggs being at a particular depth in the path.
Summary
An overview of a few parameter expansion modes, for reference...
${MYVAR#pattern} # delete shortest match of pattern from the beginning
${MYVAR##pattern} # delete longest match of pattern from the beginning
${MYVAR%pattern} # delete shortest match of pattern from the end
${MYVAR%%pattern} # delete longest match of pattern from the end
So # means match from the beginning (think of a comment line) and % means from the end. One instance means shortest and two instances means longest.
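To make the four modes concrete, here is a small sketch on a made-up path (the variable name and its value are illustrative only):

```bash
path="/var/log/app/error.log.gz"
echo "${path#*/}"     # var/log/app/error.log.gz  (shortest match of */ from the start)
echo "${path##*/}"    # error.log.gz              (longest match of */ from the start)
echo "${path%.*}"     # /var/log/app/error.log    (shortest match of .* from the end)
echo "${path%%.*}"    # /var/log/app/error        (longest match of .* from the end)
```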
You can get substrings based on position using numbers:
${MYVAR:3} # Remove the first three chars (leaving 4..end)
${MYVAR::3} # Return the first three characters
${MYVAR:3:5} # The next five characters after removing the first 3 (chars 4-8)
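A quick sketch of the positional forms on a ten-letter sample string (the variable word is illustrative; these forms need bash, not plain sh):

```bash
word="abcdefghij"
echo "${word:3}"      # defghij  (everything after the first three characters)
echo "${word::3}"     # abc      (the first three characters)
echo "${word:3:5}"    # defgh    (five characters starting at offset 3)
echo "${word: -2}"    # ij       (last two; the space before - is required)
```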
You can also replace particular strings or patterns using:
${MYVAR/search/replace}
The pattern is in the same format as file-name matching, so * (any characters) is common, often followed by a particular symbol like / or .
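As a brief sketch of the replacement form and its variants in bash (the variable host is made up for illustration; note that . is a literal dot here because the pattern is a glob, not a regex):

```bash
host="www.example.org"
echo "${host/./-}"        # www-example.org  (first match only)
echo "${host//./-}"       # www-example-org  (all matches)
echo "${host/#www/ftp}"   # ftp.example.org  (match anchored at the start)
echo "${host/%org/net}"   # www.example.net  (match anchored at the end)
```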
Examples:
Given a variable like
MYVAR="users/joebloggs/domain.example"
Remove the path leaving file name (all characters up to a slash):
echo ${MYVAR##*/}
domain.example
Remove the file name, leaving the path (delete shortest match after last /):
echo ${MYVAR%/*}
users/joebloggs
Get just the file extension (remove all before last period):
echo ${MYVAR##*.}
example
NOTE: To do two operations, you can't combine them, but have to assign to an intermediate variable. So to get the file name without path or extension:
NAME=${MYVAR##*/} # remove part before last slash
echo ${NAME%.*} # from the new var remove the part after the last period
domain
Define a function like this:
getUserName() {
echo $1 | cut -d : -f 1 | xargs basename
}
And pass the string as a parameter:
userName=$(getUserName "/var/cpanel/users/joebloggs:DNS9=domain.example")
echo $userName
What about sed? That will work in a single command:
sed 's#.*/\([^:]*\).*#\1#' <<<$string
The # are being used for regex dividers instead of / since the string has / in it.
.*/ grabs the string up to the last forward slash.
\( .. \) marks a capture group. This is \([^:]*\).
The [^:] says any character except a colon, and the * means zero or more.
.* means the rest of the line.
\1 means substitute what was found in the first (and only) capture group. This is the name.
Here's the breakdown matching the string with the regular expression:
       /var/cpanel/users/  joebloggs  :DNS9=domain.example   joebloggs
sed 's#.*/                 \([^:]*\)  .*                    #\1       #'
Using a single Awk:
... | awk -F '[/:]' '{print $5}'
That is, using as field separator either / or :, the username is always in field 5.
To store it in a variable:
username=$(... | awk -F '[/:]' '{print $5}')
A more flexible implementation with sed that doesn't require username to be field 5:
... | sed -e s/:.*// -e s?.*/??
That is, delete everything from the : onward, and then delete everything up to the last /. sed is probably also faster than awk, so this alternative is likely the better choice.
Using a single sed
echo "/var/cpanel/users/joebloggs:DNS9=domain.example" | sed 's/.*\/\(.*\):.*/\1/'
I like to chain together awk calls using different delimiters set with the -F argument. First, split the string on /users/ and then on :
txt="/var/cpanel/users/joebloggs:DNS9=domain.com"
echo $txt | awk -F"/users/" '{print$2}' | awk -F: '{print $1}'
$2 gives the text after the delim, $1 the text before it.
I know I'm a little late to the party and there's already good answers, but here's my method of doing something like this.
DIR="/var/cpanel/users/joebloggs:DNS9=domain.example"
echo ${DIR} | rev | cut -d'/' -f 1 | rev | cut -d':' -f1
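Another option, sketched here under the assumption that bash's =~ operator is available, is to capture the name with a regular expression and read it from BASH_REMATCH. The function name get_user is hypothetical:

```bash
get_user() {
    # a slash, then one or more characters that are neither / nor :, then a colon
    local re='/([^/:]+):'
    if [[ $1 =~ $re ]]; then
        echo "${BASH_REMATCH[1]}"
    else
        return 1    # no "/name:" component found
    fi
}
get_user "/var/cpanel/users/joebloggs:DNS9=domain.example"   # joebloggs
```

Like the parameter expansion answer, this does not depend on the name being at a particular depth in the path.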

Bash matching part of string

Say I have a string like
s1="sxfn://xfn.oxbr.ac.uk:8843/xfn/mech2?XFN=/castor/xf.oxbr.ac.uk/prod/oxbr.ac.uk/disk/xf20.m.ac.uk/prod/v1.8/pienug_ib-2/reco_c21_dr3809_r35057.dst"
or
s2="sxfn://xfn.gla.ac.uk:8841/xfn/mech2?XFN=/castor/xf.gla.ac.uk/space/disk1/prod/v1.8/pienug_ib-2/reco_c21_dr3809_r35057.dst"
and I want in my script to extract the last part starting from prod/ i.e. "prod/v1.8/pienug_ib-2/reco_c21_dr3809_r35057.dst". Note that $s1 contains two occurrences of "prod/".
What is the most elegant way to do this in bash?
Using BASH string manipulations you can do:
echo "prod/${s1##*prod/}"
prod/v1.8/pienug_ib-2/reco_c21_dr3809_r35057.dst
echo "prod/${s2##*prod/}"
prod/v1.8/pienug_ib-2/reco_c21_dr3809_r35057.dst
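The double ## matters when the string contains "prod/" more than once. A small sketch on a shortened, made-up string shows the difference between the shortest and longest match:

```bash
s="a/prod/b/prod/c.dst"
echo "prod/${s#*prod/}"     # prod/b/prod/c.dst  (# stops at the first prod/)
echo "prod/${s##*prod/}"    # prod/c.dst         (## stops at the last prod/)
```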
With awk (which is a little overpowered for this, but it may be helpful if you have a file full of these strings you need to parse):
echo "sxfn://xfn.gla.ac.uk:8841/xfn/mech2?XFN=/castor/xf.gla.ac.uk/space/disk1/prod/v1.8/pienug_ib-2/reco_c21_dr3809_r35057.dst" | awk -F"\/prod" '{print "/prod"$NF}'
That's splitting the string by '/prod' then printing out the '/prod' delimiter and the last token in the string ($NF)
sed can do it nicely:
s1="sxfn://xfn.oxbr.ac.uk:8843/xfn/mech2?XFN=/castor/xf.oxbr.ac.uk/prod/oxbr.ac.uk/disk/xf20.m.ac.uk/prod/v1.8/pienug_ib-2/reco_c21_dr3809_r35057.dst"
echo "$s1" | sed 's/.*\/prod/\/prod/'
This relies on the greedy matching of the leading .* part.

How to split string into component parts in Linux Bash/Shell

I'm writing the second version of my post-receive git hook.
I have a GL_REPO variable which conforms to:
/project.name/vhost-type/versioncodename
It may or may not have a trailing and/or preceding slash.
I misunderstood what the following code does, and as a result it clearly puts $versioncodename into each variable:
# regex out project codename
PROJECT_NAME=${GL_REPO##*/}
echo "project codename is: $PROJECT_NAME"
# extract server target vhost-type -fix required
VHOST_TYPE=${GL_REPO##*/}
echo "server target is: $VHOST_TYPE"
# get server project - fix required
PROJECT_CODENAME=${GL_REPO##*/}
echo "server project is: $PROJECT_CODENAME"
What is the correct method for taking these elements one at a time from the back of the string, or guaranteeing that a three part string allocates these variables?
I guess it might be better to split into an array?
#!/bin/bash
GL_REPO=/project.name/vhost-type/versioncodename
GL_REPO=${GL_REPO#/} # remove preceding slash, if any
IFS=/ read -a arr <<< "$GL_REPO"
PROJECT_NAME="${arr[0]}"
VHOST_TYPE="${arr[1]}"
PROJECT_CODENAME="${arr[2]}"
UPDATE: an alternative solution by anishsane:
IFS=/ read PROJECT_NAME VHOST_TYPE PROJECT_CODENAME <<< "$GL_REPO"
You can use cut with a field separator to pull out items by order:
NAME=$(echo $GL_REPO | cut -d / -f 1)
You can repeat the same for other fields. The trailing/leading slash you can ignore (you'll get a NAME field being empty, for example) or you can strip off a leading slash with ${GL_REPO##/} (similarly, you can strip off a trailing slash with ${GL_REPO%%/}).
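The optional leading and trailing slashes can also be stripped before splitting. A minimal sketch, assuming bash and at most one slash at each end; the function name split_repo is hypothetical, and it sets the three variables globally:

```bash
split_repo() {
    local repo=$1
    repo=${repo#/}    # strip one leading slash, if present
    repo=${repo%/}    # strip one trailing slash, if present
    # IFS is changed only for the duration of this read
    IFS=/ read -r PROJECT_NAME VHOST_TYPE PROJECT_CODENAME <<< "$repo"
}
split_repo "/project.name/vhost-type/versioncodename/"
echo "$PROJECT_NAME | $VHOST_TYPE | $PROJECT_CODENAME"
# project.name | vhost-type | versioncodename
```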
This is another way:
GL_REPO="/project.name/vhost-type/versioncodename"
GL_REPO="${GL_REPO/#\//}"
#^replace preceding slash (if any) with empty string.
IFS="/" arr=($GL_REPO)
echo "PN: ${arr[0]} VHT: ${arr[1]} VC: ${arr[2]}"
Using Bash Pattern Matching:
GL_REPO="/project.name/vhost-type/versioncodename"
patt="([^/]+)/([^/]+)/([^/]+)"
[[ $GL_REPO =~ $patt ]]
echo "PN: ${BASH_REMATCH[1]} VHT: ${BASH_REMATCH[2]} VC: ${BASH_REMATCH[3]}"

extract ip address from variable string

I'm trying to create a bash script that changes the "allow from" IP address in the phpmyadmin command file (which I'm still not sure is possible) and restarts Apache.
I'm currently trying to extract an ip address from a variable and after searching the web I still have no clue, here is what I have so far...
#bash shell script
#!/bin/bash
clear
echo "Get client IP address"
ip=$(last -i)
echo $ip
exit
echo "restart apache"
/etc/init.d/apache2 reload
I've tried adding the following line with no luck
ip=$(head -n 1 $ip)
If anyone can tell me how I can extract the first instance of an IP address from the variables $ip I would appreciate it very much.
ip=$(last -i | head -n 1 | awk '{print $3}')
Update:
ip=$(last -i | grep -Pom 1 '[0-9.]{7,15}')
You can use grep with read:
read ip < <(last -i | grep -o '[0-9]\+[.][0-9]\+[.][0-9]\+[.][0-9]\+')
read ip < <(last -i | grep -Eo '[0-9]+[.][0-9]+[.][0-9]+[.][0-9]+')
\b may also be helpful there. Just not sure about its compatibility.
And yet another:
ip=$(last -i | gawk 'BEGIN { RS = "[ \t\n]"; FS = "." } /^([0-9]+[.]){3}[0-9]+$/ && ! rshift(or(or($1, $2), or($3, $4)), 8) { print ; exit; }')
To get the first instance you can just do:
ip=$(last -i -1 | awk '{print $3}')
I'd just do
ip=$(last -i -1 | grep -Po '(\d+\.){3}\d+')
The above uses grep with Perl Compatible Regular Expressions, which lets us use \d for digits. The regular expression looks for three repetitions of one or more digits followed by a dot (so, for example, 123.45.123.), then another stretch of digits. The -o flag causes grep to print only the matching part of the line.
This approach has the advantage of working even when the number of fields per line changes (as is often the case, for example with system boot as the 2nd field). However, it needs GNU grep so if you need a more portable solution, use @konsolebox's answer instead.
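If the text is already in a shell variable, a similar loose match can be sketched without grep, assuming bash's =~ operator. The function name first_ip and the sample line are made up for illustration, and no range check is done on the numbers:

```bash
first_ip() {
    # three groups of 1-3 digits plus a dot, then a final group of 1-3 digits
    local re='([0-9]{1,3}\.){3}[0-9]{1,3}'
    [[ $1 =~ $re ]] && echo "${BASH_REMATCH[0]}"
}
first_ip "user pts/0 192.168.1.10 Mon Jan 1 10:00"   # 192.168.1.10
```

The function returns a non-zero status when nothing matches, so the caller can test for that.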
Using bash only :
read -ra array < <(last -i)
ip="${array[2]}"
Or :
read -ra array < <(last -1 -i)
ip="${array[2]}"
Or if you're a nitpicker (and have a grep with -P), you can test the next:
while read -r testline
do
echo "input :=$testline="
read ip < <(grep -oP '\b(?:(?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2})[.](?:25[0-5]|2[0-4][0-9]|[0-1]?[0-9]{1,2}))\b' <<< "$testline")
echo "result:=${ip:=NOTFOUND}="
echo
done <<EOF
some bla bla 127.0.0.1 some
10.10.10.10
bad one 300.200.300.400
some other bla 127.0.0.1 some another 10.1.1.0
10.10.10.10 10.1.1.0
bad one 300.200.300.400 and good 192.168.1.1
above is empty and no ip here too
EOF
It skips invalid IP addresses such as 800.1.1.1, so for the above test it prints:
input :=some bla bla 127.0.0.1 some=
result:=127.0.0.1=
input :=10.10.10.10=
result:=10.10.10.10=
input :=bad one 300.200.300.400=
result:=NOTFOUND=
input :=some other bla 127.0.0.1 some another 10.1.1.0=
result:=127.0.0.1=
input :=10.10.10.10 10.1.1.0=
result:=10.10.10.10=
input :=bad one 300.200.300.400 and good 192.168.1.1=
result:=192.168.1.1=
input :==
result:=NOTFOUND=
input :=above is empty and no ip here too=
result:=NOTFOUND=
The \b is needed to avoid matching inside something like 610.10.10.10, which contains a valid IP (10.10.10.10).
The regex is taken from: https://metacpan.org/pod/Regexp::Common::net
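The range check can also be done arithmetically instead of inside the regex. A minimal sketch, assuming bash; the function name is_valid_ipv4 is hypothetical, and unlike the regex above it validates a whole string rather than searching within a line:

```bash
is_valid_ipv4() {
    local ip=$1 octet
    local re='^[0-9]{1,3}(\.[0-9]{1,3}){3}$'
    [[ $ip =~ $re ]] || return 1          # shape check: four groups of 1-3 digits
    local IFS=.
    for octet in $ip; do                  # unquoted on purpose: split on dots
        # 10# forces base 10 so that values such as 08 are not read as octal
        (( 10#$octet <= 255 )) || return 1
    done
}
is_valid_ipv4 192.168.1.1     && echo "valid"
is_valid_ipv4 300.200.300.400 || echo "invalid"
```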
Since I happened to need something in the same ballpark, here is a basic regular expression and an extended regular expression to loosely match an IP (v4) address, making sure that there are 4 sequences of 1-3 digits delimited by 3 '.' characters.
# Basic Regular Expression to loosely match an IP address:
bre_match_ip="[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}"
# Extended Regular Expression to loosely match an IP address:
ere_match_ip="[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}"
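As a usage sketch, the two patterns could be passed to grep as follows. The sample line is made up, and -o is assumed to be available (it is in GNU and BSD grep, but it is not required by POSIX):

```bash
bre_match_ip="[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}"
ere_match_ip="[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}"
sample="login from 10.0.0.5 at noon"
printf '%s\n' "$sample" | grep -o  "$bre_match_ip"    # basic syntax:    10.0.0.5
printf '%s\n' "$sample" | grep -Eo "$ere_match_ip"    # extended syntax: 10.0.0.5
```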
Of course when matching IP (v4) addresses from a file (say HTML) it's quite easy to inadvertently match a version string or an url which contains versioning as part of its file path. The following is some Awk code I wrote a while ago for use in a Bash script to extract valid unique (no duplicates) IP addresses from a file. It avoids version numbers whether in the text or as part of an url and makes sure the IP numbers are in range.
I appreciate that this is overkill for the original poster and that it is not tailored for his needs but someone doing a search may come across this answer and find the fairly comprehensive nature of the code useful. The Awk code is thankfully well commented as it uses some slightly obscure aspects of Awk that the casual Awk user would probably not be familiar with.
awkExtractIPAddresses='
BEGIN {
# Regex to match an IP address like sequence (even if too long to be an IP).
# This is deliberately a loose match, the END section will check for IP
# address validity.
ipLikeSequence = "[0-9]+[.][0-9]+[.][0-9]+[.][0-9]+[0-9.]*";
# Regex to match a number sequence longer than 3 digits.
digitSequenceTooLongNotIP = "[0-9][0-9][0-9][0-9]+";
# Regex to match an IP address like sequence which is a version number.
# Equivalent to "(version|ver|v)[ .:]*" if "tolower($0)" was used.
versioningNotIP = "[Vv]([Ee][Rr]([Ss][Ii][Oo][Nn])?)?[ .:]*" ipLikeSequence;
# Regexes to match IP address like sequences next to forward slashes, to
# avoid version numbers in urls: e.g. http://web.com/libs/1.6.1.0/file.js
beginsWithFwdSlashNotIP = "[/]" ipLikeSequence;
endsWithFwdSlashNotIP = ipLikeSequence "[/]";
}
{
# Set line to the current line (more efficient than using $0 below).
line = $0;
# Replace sequences on line which will interfere with extracting genuine
# IPs. Use a replacement char and not the empty string to avoid accidentally
# creating a valid IP address from digits on either side of the removed
# sections. Use "/" as the replacement char for the 2 "FwdSlash" regexes so
# that multiple number dot slash sequences all get removed, as using "x"
# could result in inadvertently leaving such a sequence in place.
# e.g. "/lib1.6.1.0/1.2.3.4/5.6.7.8/file.js" leaves "/lib1.6.1.0xx/file.js"
gsub(digitSequenceTooLongNotIP, "x", line);
gsub(versioningNotIP, "x", line);
gsub(beginsWithFwdSlashNotIP, "/", line);
gsub(endsWithFwdSlashNotIP, "/", line);
# Loop through the current line matching IP address like sequences and
# storing them in the index of the array ipUniqueMatches. By using ipMatch
# as the array index duplicates are avoided and the values can be easily
# retrieved by the for loop in the END section. match() automatically sets
# the built in variables RSTART and RLENGTH.
while (match(line, ipLikeSequence))
{
ipMatch = substr(line, RSTART, RLENGTH);
ipUniqueMatches[ipMatch];
line = substr(line, RSTART + RLENGTH + 1);
}
}
END {
# Define some IP address related constants.
ipRangeMin = 0;
ipRangeMax = 255;
ipNumSegments = 4;
ipDelimiter = ".";
# Loop through the ipUniqueMatches array and print any valid IP addresses.
# The awk "for each" type of loop is different from the norm. It provides
# the indexes of the array and NOT the values of the array elements which
# is more usual in this type of loop.
for (ipMatch in ipUniqueMatches)
{
numSegments = split(ipMatch, ipSegments, ipDelimiter);
if (numSegments == ipNumSegments &&
ipSegments[1] >= ipRangeMin && ipSegments[1] <= ipRangeMax &&
ipSegments[2] >= ipRangeMin && ipSegments[2] <= ipRangeMax &&
ipSegments[3] >= ipRangeMin && ipSegments[3] <= ipRangeMax &&
ipSegments[4] >= ipRangeMin && ipSegments[4] <= ipRangeMax)
{
print ipMatch;
}
}
}'
# Extract valid IP addresses from $fileName, they will each be separated
# by a new line.
awkValidIpAddresses=$(awk "$awkExtractIPAddresses" < "$fileName")
I hope this is of interest.
You could use Awk.
ip=$(awk '{if(NR == 1) {print $3; exit;}}' < <(last -i))
