vigenere decryption - encryption in bash [duplicate] - bash

I'm trying to make encryption and decryption with a Vigenère cipher.
It's part of a larger task in which the Vigenère cipher plays a small role. I got this bash encryption script to work. The problem is how to use the same code in reverse to decrypt the ciphertext.
#!/usr/local/bin/bash
# vigenere.sh
# http://en.wikipedia.org/wiki/Vigen%C3%A8re_cipher
a="ABCDEFGHIJKLMNOPQRSTUVWXYZ"
[[ "${*/-d/}" != "" ]] &&
echo "Usage: $0 [-d]" && exit 1
m=${1:+-}
printf "string: ";read t
printf "keyphrase: ";read -s k
printf "\n"
for ((i=0;i<${#t};i++)); do
p1=${a%%${t:$i:1}*}
p2=${a%%${k:$((i%${#k})):1}*}
d="${d}${a:$(((${#p1}${m:-+}${#p2})%${#a})):1}"
done
echo "$d"

To see what it does, launch it with bash's -x option set. For example, if the script is saved in vig.sh:
bash -x vig.sh
Basically, a stores the uppercase alphabet.
-d is an optional parameter for decryption; when it is given, m is set to -.
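The option handling can be tried in isolation. The following is only a sketch: pick_operator is a hypothetical helper name that does not appear in the script, and it simply wraps the same two expansions the script uses.

```bash
# pick_operator is a hypothetical name; it mirrors the script's option handling.
pick_operator() {
  # "${*/-d/}" deletes "-d" from every argument and joins what is left.
  # Anything left over means an unexpected argument was passed.
  [[ "${*/-d/}" != "" ]] && { printf 'usage\n'; return 1; }
  local m=${1:+-}          # "-" when an argument was given, empty otherwise
  printf '%s\n' "${m:-+}"  # fall back to "+" when m is empty
}

pick_operator       # prints +
pick_operator -d    # prints -
pick_operator -x    # prints usage
```

With no argument the script therefore adds the two offsets (encryption), and with -d it subtracts them (decryption).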
The following reads the source string into t and the key into k:
printf "string: ";read t
printf "keyphrase: ";read -s k
The following loops over the character indices of t:
for ((i=0;i<${#t};i++)); do
p1 contains the alphabet with the suffix beginning at the current character of t removed, so its length is that character's position in the alphabet:
p1=${a%%${t:$i:1}*}
p2 does the same with the current character of the key (the modulo makes the index wrap around instead of running past the end of the key).
Then the sum, or the difference when the -d option is set, of the lengths of p1 and p2 is used to pick a character from the alphabet, which is appended to d.
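The arithmetic described above can be sketched as a non-interactive function. This is an illustration only: vig, its enc/dec mode argument, and the explicit + 26 are assumptions of this sketch. The original script instead relies on bash counting a negative substring offset from the end of the string.

```bash
# vig MODE TEXT KEY  (hypothetical helper; MODE is "enc" or "dec").
# TEXT and KEY are assumed to contain only uppercase A-Z.
vig() {
  local mode=$1 text=$2 key=$3
  local alphabet="ABCDEFGHIJKLMNOPQRSTUVWXYZ"
  local out="" i p1 p2 sign=1
  [[ $mode == dec ]] && sign=-1
  for ((i = 0; i < ${#text}; i++)); do
    # Length of the prefix before a letter is that letter's index (A=0 ... Z=25).
    p1=${alphabet%%"${text:$i:1}"*}
    p2=${alphabet%%"${key:$((i % ${#key})):1}"*}
    # Add 26 before taking the remainder so the index is never negative.
    out+=${alphabet:$(( (${#p1} + sign * ${#p2} + 26) % 26 )):1}
  done
  printf '%s\n' "$out"
}

vig enc HELLOWORLD FOO   # prints MSZQCKTFZI
vig dec MSZQCKTFZI FOO   # prints HELLOWORLD
```

For the first letter, H is at index 7 and F at index 5, so encryption gives index 12 (M), and decryption gives 12 - 5 = 7 (H).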
Examples
vig.sh
string: HELLOWORLD
keyphrase: FOO
-> MSZQCKTFZI
vig.sh -d
string: MSZQCKTFZI
keyphrase: FOO
-> HELLOWORLD

To run the decryption, type ./vig.sh -d on the command line. That should do it, since the script already handles -d: the check [[ "${*/-d/}" != "" ]] accepts it and m=${1:+-} switches the arithmetic to subtraction.

Related

How to iterate over multiple variables and echo them using Shell Script?

Consider the below variables which are dynamic and might change each time. Sometimes there might even be 5 variables, But the length of all the variables will be the same every time.
var1='a b c d e... upto z'
var2='1 2 3 4 5... upto 26'
var3='I II III IV V... upto XXVI'
I am looking for a generalized approach to iterate the variables in a for loop & My desired output should be like below.
a,1,I
b,2,II
c,3,III
d,4,IV
e,5,V
.
.
goes on upto
z,26,XXVI
If I use nested loops, then I get all possible combinations which is not the expected outcome.
Also, I know how to make this work for 2 variables using for loop and shift using below link
https://unix.stackexchange.com/questions/390283/how-to-iterate-two-variables-in-a-sh-script
With paste
paste -d , <(tr ' ' '\n' <<<"$var1") <(tr ' ' '\n' <<<"$var2") <(tr ' ' '\n' <<<"$var3")
a,1,I
b,2,II
c,3,III
d,4,IV
e...z,5...26,V...XXVI
But clearly having to add other parameter substitutions for more varN's is not scalable.
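One way to make the paste approach scale, sketched here under the assumption that temporary files are acceptable, is to write each variable to its own file and hand all the files to paste at once. zip_paste is a hypothetical name, and it takes the names of the variables, not their values.

```bash
# zip_paste NAME...  (hypothetical helper; arguments are variable names).
# Assumes no caller variable is called "name" or "tmp".
zip_paste() {
  local name tmp
  local -a files=()
  for name in "$@"; do
    tmp=$(mktemp) || return 1
    tr ' ' '\n' <<<"${!name}" >"$tmp"   # one token per line
    files+=("$tmp")
  done
  paste -d , "${files[@]}"              # join line N of every file with commas
  rm -f "${files[@]}"
}

var1='a b c'
var2='1 2 3'
var3='I II III'
zip_paste var1 var2 var3
```

For the three sample variables this prints a,1,I then b,2,II then c,3,III, one per line.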
You need to "zip" two variables at a time.
var1='a b c d e...z'
var2='1 2 3 4 5...26'
var3='I II III IV V...XXVI'
zip_var1_var2 () {
set $var1
for v2 in $var2; do
echo "$1,$v2"
shift
done
}
zip_var12_var3 () {
set $(zip_var1_var2)
for v3 in $var3; do
echo "$1,$v3"
shift
done
}
for x in $(zip_var12_var3); do
echo "$x"
done
If you are willing to use eval and are sure it is safe to do so, you can write a single function like
zip () {
if [ $# -eq 1 ]; then
eval echo \$$1
return
fi
a1=$1
shift
x=$*
set $(eval echo \$$a1)
for v in $(zip $x); do
printf '=== %s\n' "$1,$v" >&2
echo "$1,$v"
shift
done
}
zip var1 var2 var3 # Note the arguments are the *names* of the variables to zip
If you can use arrays, then (for example, in bash)
var1=(a b c d e)
var2=(1 2 3 4 5)
var3=(I II III IV V)
for i in "${!var1[@]}"; do
printf '%s,%s,%s\n' "${var1[i]}" "${var2[i]}" "${var3[i]}"
done
Use this Perl one-liner:
perl -le '@in = map { [split] } @ARGV; for $i ( 0..$#{ $in[0] } ) { print join ",", map { $in[$_][$i] } 0..$#in; }' "$var1" "$var2" "$var3"
Prints:
a,1,I
b,2,II
c,3,III
d,4,IV
e,5,V
z,26,XXVI
The Perl one-liner uses these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-l : Strip the input line separator ("\n" on *NIX by default) before executing the code in-line, and append it when printing.
The input variables must be quoted with double quotes "like so", to keep the blank-separated words from being treated as separate arguments.
@ARGV is an array of the command line arguments, here $var1, $var2, $var3.
@in is an array of 3 elements, each element being a reference to an array obtained by splitting the corresponding element of @ARGV on whitespace. Note that split splits the string on whitespace by default, but you can specify a different delimiter; it accepts regexes.
The subsequent for loop prints the @in elements separated by commas.
SEE ALSO:
perldoc perlrun: how to execute the Perl interpreter: command line switches
perldoc perlvar: Perl predefined variables
The following is (almost) a copy of this answer with a few tweaks that make it fit this question.
The Original Question
First let’s assign a few variables to play with, 26 tokens in each of them:
var1="$(echo {a..z})"
var2="$(echo {1..26})"
var3="$(echo I II III IV \
V{,I,II,III} IX \
X{,I,II,III} XIV \
XV{,I,II,III} XIX \
XX{,I,II,III} XXIV \
XXV XXVI)"
var4="$(echo {A..Z})"
var5="$(echo {010101..262626..10101})"
Now we want a “magic” function that zips an arbitrary number of variables, ideally in pure Bash:
zip_vars var1 # a trivial test
zip_vars var{1..2} # a slightly less trivial test
zip_vars var{1..3} # the original question
zip_vars var{1..4} # more vars, because we can
zip_vars var{1..5} # more vars, because why not
What could zip_vars look like? Here’s one in pure Bash, without any external commands:
zip_vars() {
local var
for var in "$@"; do
local -a "array_${var}"
local -n array_ref="array_${var}"
array_ref=(${!var})
local -ar "array_${var}"
done
local -n array_ref="array_${1}"
local -ir size="${#array_ref[@]}"
local -i i
local output
for ((i = 0; i < size; ++i)); do
output=
for var in "$@"; do
local -n array_ref="array_${var}"
output+=",${array_ref[i]}"
done
printf '%s\n' "${output:1}"
done
}
How it works:
It splits all variables (passed by reference (by variable name)) into arrays. For each variable varX it creates a local array array_varX.
It would be actually way easier if the input variables were already Bash arrays to start with (see below), but … we stick with the original question initially.
It determines the size of the first array and then blindly expects all arrays to be of that size.
For each index i from 0 to size - 1 it concatenates the ith elements of all arrays, separated by ,.
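Because the function blindly expects every array to be the same size, a length check before zipping can catch mismatched input. The following is a rough sketch: same_length is a hypothetical helper, it uses indirect expansion instead of namerefs so that it also runs on older bash versions, and it assumes the caller's arrays are not named name, ref, expected or items.

```bash
# same_length NAME...  (hypothetical helper; arguments are array names).
# Succeeds only if every named array has the same number of elements.
same_length() {
  local name ref expected="" items
  for name in "$@"; do
    ref="${name}[@]"
    items=("${!ref}")                   # copy the named array indirectly
    : "${expected:=${#items[@]}}"       # remember the first array's length
    (( ${#items[@]} == expected )) || return 1
  done
}

arr1=(a b c)
arr2=(1 2 3)
arr3=(I II)
same_length arr1 arr2 && echo "arr1 and arr2 match"
same_length arr1 arr3 || echo "arr3 has a different length"
```

Calling such a check first turns silently truncated or padded rows into an explicit error.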
Arrays Make Things Easier
If you use Bash arrays from the very start, the script will be shorter and look simpler and there won’t be any string-to-array conversions.
zip_arrays() {
local -n array_ref="$1"
local -ir size="${#array_ref[@]}"
local -i i
local output
for ((i = 0; i < size; ++i)); do
output=
for arr in "$@"; do
local -n array_ref="$arr"
output+=",${array_ref[i]}"
done
printf '%s\n' "${output:1}"
done
}
arr1=({a..z})
arr2=({1..26})
arr3=( I II III IV
V{,I,II,III} IX
X{,I,II,III} XIV
XV{,I,II,III} XIX
XX{,I,II,III} XXIV
XXV
XXVI)
arr4=({A..Z})
arr5=({010101..262626..10101})
zip_arrays arr1 # a trivial test
zip_arrays arr{1..2} # a slightly less trivial test
zip_arrays arr{1..3} # (almost) the original question
zip_arrays arr{1..4} # more arrays, because we can
zip_arrays arr{1..5} # more arrays, because why not

Is it possible to save perl hash into bash array?

I have done some processing in perl, and got the result in perl's hash data structure. Usually in bash, when I try to retrieve result from other script like
output=$(perl -E '...')
I got the output in string. Is it possible to save the result in bash array?
Assuming a perl variable hash is an associative array, please try:
declare -A "output=($(perl -e '
$hash{"foo"} = "xx"; # just an example
$hash{"bar"} = "yy"; # ditto
for (keys %hash) {print "[\"$_\"]=\"$hash{$_}\"\n"}'))"
for i in "${!output[@]}"; do
echo "$i => ${output[$i]}" # see the result
done
The outermost double quotes around output=.. are required to tell declare
to evaluate the argument.
[Update]
Considering tripleee's comment, here is a robust version against special characters:
mapfile -d "" -t a < <(perl -e '
$hash{"baz"} = "boo"; # example
$hash{"foo"} = "x\"x"; # example with a double quote
$hash{"bar"} = "y\ny"; # example with a newline
print join("\0", %hash), "\0"') # use a nul byte as a delimiter
declare -A output # bash associative array
for ((i = 0; i < ${#a[@]}; i+=2 )); do
output[${a[i]}]=${a[i+1]} # key and value pair
done
for i in "${!output[@]}"; do
echo "$i => ${output[$i]}" # see the result
done
The conversion from perl variables to bash variables works only if they are free of null bytes (\0), as perl can store null bytes in strings, but bash cannot.
At least, we can use that limitation to print the hash in perl with null delimiters and safely parse it in bash again:
declare -A "array=($(
perl -e 'print join("\0", %hash), "\0"' |
xargs -0 printf '[%q]=%q '
))"
Please note that neither %q nor -0 are specified by posix. For a more portable solution see tshiono's answer.
If the hash is very big such that ARG_MAX might be exceeded you should ensure that xargs does not split a key value pair across two calls to printf. To do so, add the option -n2 (or any other number 2n where you are sure that n key value pairs never exceed ARG_MAX).
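The NUL-delimited stream can also be consumed with a plain read loop, which avoids both xargs and the intermediate array. This is a sketch only: the printf command stands in for the Perl producer (an assumption made so the example runs without Perl), and the keys and values are sample data.

```bash
declare -A output   # bash associative array (bash 4 or newer)

# printf '%s\0' is a stand-in for: perl -e 'print join("\0", %hash), "\0"'
while IFS= read -r -d '' key && IFS= read -r -d '' value; do
  output[$key]=$value                   # consume the stream two fields at a time
done < <(printf '%s\0' foo xx bar $'y\ny' baz 'x"x')

for key in "${!output[@]}"; do
  printf '%s => %q\n' "$key" "${output[$key]}"   # %q makes the newline visible
done
```

Each pass of the loop reads one key and one value, so quotes and newlines inside them arrive unchanged.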

How to loop through a range of characters in a bash script using ASCII values?

I am trying to write a bash script which will read two letter variables (startletter/stopletter) and after that I need to print from the start letter to the stop letter with a for or something else. How can I do that?
I tried to do
#! /bin/bash
echo "give start letter"
read start
echo "give stop letter"
read stop
But none of the for constructs work
#for value in {a..z}
#for value in {$start..$stop}
#for (( i=$start; i<=$stop; i++)) do echo "Letter: $c" done
This question is very well explained in BashFAQ/071 How do I convert an ASCII character to its decimal (or hexadecimal) value and back?
# POSIX
# chr() - converts decimal value to its ASCII character representation
# ord() - converts ASCII character to its decimal value
chr () {
local val
[ "$1" -lt 256 ] || return 1
printf -v val %o "$1"; printf "\\$val "
# That one requires bash 3.1 or above.
}
ord() {
# POSIX
LC_CTYPE=C printf %d "'$1"
}
Re-using them for your requirement, a proper script would be written as
read -p "Input two variables: " startLetter stopLetter
[[ -z "$startLetter" || -z "$stopLetter" ]] && { printf 'one of the inputs is empty\n' >&2 ; exit 1; }
asciiStart=$(ord "$startLetter")
asciiStop=$(ord "$stopLetter")
for ((i=asciiStart; i<=asciiStop; i++)); do
chr "$i"
done
Would print the letters as expected.
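For testing without typing input, the same ord/chr round trip can be folded into one function. This is a sketch: letter_range is a hypothetical name, and it assumes both arguments are single ASCII characters with the first not after the second.

```bash
# letter_range START STOP  (hypothetical helper; single ASCII characters assumed).
letter_range() {
  local start stop i octal
  start=$(LC_CTYPE=C printf '%d' "'$1")   # character -> decimal code
  stop=$(LC_CTYPE=C printf '%d' "'$2")
  for ((i = start; i <= stop; i++)); do
    printf -v octal '%o' "$i"             # decimal code -> octal digits
    printf "\\$octal\n"                   # octal escape -> character
  done
}

letter_range a e   # prints a, b, c, d, e on separate lines
```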
Adding it to community-wiki since this is also a cross-site duplicate from Unix.SE - Bash script to get ASCII values for alphabet
In case you feel adventurous and want to use zsh instead of bash, you can use the following:
For zsh versions below 5.0.7 you can use the BRACE_CCL option:
(snip man zshall) If a brace expression matches none of the above forms, it is left
unchanged, unless the option BRACE_CCL (an abbreviation for 'brace character class') is set. In that case, it is expanded to a list of the individual characters between the braces sorted into the order of the characters in the ASCII character set (multibyte characters are not currently handled). The syntax is similar to a [...] expression in filename generation: - is treated specially to denote a range of characters, but ^ or ! as the first character is treated normally. For example, {abcdef0-9}
expands to 16 words 0 1 2 3 4 5 6 7 8 9 a b c d e f.
#!/usr/bin/env zsh
setopt brace_ccl
echo "give start letter"
read cstart
echo "give stop letter"
read cstop
for char in {${cstart}-${cstop}}; do echo $char; done
For zsh versions from 5.0.7 onwards you can use the default brace expansion :
An expression of the form {c1..c2}, where c1 and c2 are single characters (which may be multibyte characters), is expanded to every character in the range from c1 to c2 in whatever character sequence is used internally. For characters with code points below 128 this is US ASCII (this is the only case most users will need). If any intervening character is not printable, appropriate quotation is used to render it printable. If the character sequence is reversed, the output is in reverse order, e.g. {d..a} is substituted as d c b a.
#!/usr/bin/env zsh
echo "give start letter"
read cstart
echo "give stop letter"
read cstop
for char in {${cstart}..${cstop}}; do echo $char; done
More information on zsh can be found here and the quick reference

Shell script to find possible string sequences

I have a text file in the following format:
A Apple
A Ant
B Bat
B Ball
The number of definitions of each character can be any number.
I am writing a shell script which will receive inputs like "A B". The output of the shell script I am expecting is the possible string sequences which can be created.
For input "A B", the outputs will be:
Apple Bat
Apple Ball
Ant Bat
Ant Ball
I tried arrays, It is not working as expected. Can anyone help with some ideas on how to solve this issue?
Use associative arrays to accomplish this:
#!/usr/bin/env bash
first_letter=$1
second_letter=$2
declare -A words # declare associative array
while read -r alphabet word; do # read ignores blank lines in input file
words+=(["$word"]="$alphabet") # key = word, value = alphabet
done < words.txt
for word1 in "${!words[@]}"; do
alphabet1="${words[$word1]}"
[[ $alphabet1 != $first_letter ]] && continue
for word2 in "${!words[@]}"; do
alphabet2="${words[$word2]}"
[[ $alphabet2 != $second_letter ]] && continue
printf '%s %s\n' "$word1" "$word2" # print matching word pairs
done
done
Output with A B passed in as arguments (with the content in your question):
Apple Ball
Apple Bat
Ant Ball
Ant Bat
You may want to refer to this post for more info on associative arrays:
Appending to a hash table in Bash
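The script above is fixed at two letters. A possible generalisation to any number of letters is sketched below. combos is a hypothetical function name, the file is assumed to hold one "letter word" pair per line as in the question, and the file is re-read for every prefix, which is fine for small inputs only.

```bash
# combos FILE LETTER...  (hypothetical helper).
# Prints every sequence that picks one word per requested letter, in file order.
combos() {
  local file=$1; shift
  local -a results=("") next=()
  local letter prefix key word
  for letter in "$@"; do
    next=()
    for prefix in "${results[@]}"; do
      while read -r key word; do
        # Extend each partial sequence with every word defined for this letter.
        [[ $key == "$letter" ]] && next+=("${prefix:+$prefix }$word")
      done <"$file"
    done
    results=("${next[@]}")
  done
  printf '%s\n' "${results[@]}"
}
```

Called as combos words.txt A B with the file from the question, this lists Apple Bat, Apple Ball, Ant Bat and Ant Ball, and combos words.txt A B A would extend each of those with a third word.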

How to urlencode data for curl command?

I am trying to write a bash script for testing that takes a parameter and sends it through curl to web site. I need to url encode the value to make sure that special characters are processed properly. What is the best way to do this?
Here is my basic script so far:
#!/bin/bash
host=${1:?'bad host'}
value=$2
shift
shift
curl -v -d "param=${value}" http://${host}/somepath $@
Use curl --data-urlencode; from man curl:
This posts data, similar to the other --data options with the exception that this performs URL-encoding. To be CGI-compliant, the <data> part should begin with a name followed by a separator and a content specification.
Example usage:
curl \
--data-urlencode "paramName=value" \
--data-urlencode "secondParam=value" \
http://example.com
See the man page for more info.
This requires curl 7.18.0 or newer (released January 2008). Use curl -V to check which version you have.
You can as well encode the query string:
curl --get \
--data-urlencode "p1=value 1" \
--data-urlencode "p2=value 2" \
http://example.com
# http://example.com?p1=value%201&p2=value%202
Another option is to use jq:
$ printf %s 'input text'|jq -sRr @uri
input%20text
$ jq -rn --arg x 'input text' '$x|@uri'
input%20text
-r (--raw-output) outputs the raw contents of strings instead of JSON string literals. -n (--null-input) doesn't read input from STDIN.
-R (--raw-input) treats input lines as strings instead of parsing them as JSON, and -sR (--slurp --raw-input) reads the input into a single string. You can replace -sRr with -Rr if your input only contains a single line or if you don't want to replace linefeeds with %0A:
$ printf %s\\n multiple\ lines of\ text|jq -Rr @uri
multiple%20lines
of%20text
$ printf %s\\n multiple\ lines of\ text|jq -sRr @uri
multiple%20lines%0Aof%20text%0A
Or this percent-encodes all bytes:
xxd -p|tr -d \\n|sed 's/../%&/g'
Here is the pure BASH answer.
Update: Since many changes have been discussed, I have placed this on https://github.com/sfinktah/bash/blob/master/rawurlencode.inc.sh for anybody to issue a PR against.
Note: This solution is not intended to encode unicode or multi-byte characters - which are quite outside BASH's humble native capabilities. It's only intended to encode symbols that would otherwise ruin argument passing in POST or GET requests, e.g. '&', '=' and so forth.
Very Important Note: DO NOT ATTEMPT TO WRITE YOUR OWN UNICODE CONVERSION FUNCTION, IN ANY LANGUAGE, EVER. See end of answer.
rawurlencode() {
local string="${1}"
local strlen=${#string}
local encoded=""
local pos c o
for (( pos=0 ; pos<strlen ; pos++ )); do
c=${string:$pos:1}
case "$c" in
[-_.~a-zA-Z0-9] ) o="${c}" ;;
* ) printf -v o '%%%02x' "'$c"
esac
encoded+="${o}"
done
echo "${encoded}" # You can either set a return variable (FASTER)
REPLY="${encoded}" #+or echo the result (EASIER)... or both... :p
}
You can use it in two ways:
easier: echo http://url/q?=$( rawurlencode "$args" )
faster: rawurlencode "$args"; echo http://url/q?${REPLY}
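As noted, the function above is not meant for multi-byte characters. A workaround that is sometimes used is to switch to the C locale inside the function so that bash indexes the string byte by byte. The sketch below assumes that behaviour: urlencode_bytes is a hypothetical name, and the result for non-ASCII input should be checked on the target system before relying on it.

```bash
# urlencode_bytes STRING  (hypothetical helper).
# Encodes everything except RFC 3986 unreserved characters, one byte at a time.
urlencode_bytes() {
  local LC_ALL=C            # assumption: makes ${#s} and ${s:i:1} count bytes
  local s=$1 out="" c i
  for ((i = 0; i < ${#s}; i++)); do
    c=${s:$i:1}
    case $c in
      [-_.~a-zA-Z0-9]) out+=$c ;;
      *) printf -v c '%%%02X' "'$c"; out+=$c ;;
    esac
  done
  printf '%s\n' "$out"
}

urlencode_bytes 'a b&c=d'   # prints a%20b%26c%3Dd
```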
[edited]
Here's the matching rawurldecode() function, which - with all modesty - is awesome.
# Returns a string in which the sequences with percent (%) signs followed by
# two hex digits have been replaced with literal characters.
rawurldecode() {
# This is perhaps a risky gambit, but since all escape characters must be
# encoded, we can replace %NN with \xNN and pass the lot to printf '%b', which
# will decode hex for us
printf -v REPLY '%b' "${1//%/\\x}" # You can either set a return variable (FASTER)
echo "${REPLY}" #+or echo the result (EASIER)... or both... :p
}
With the matching set, we can now perform some simple tests:
$ diff rawurlencode.inc.sh \
<( rawurldecode "$( rawurlencode "$( cat rawurlencode.inc.sh )" )" ) \
&& echo Matched
Output: Matched
And if you really really feel that you need an external tool (well, it will go a lot faster, and might do binary files and such...) I found this on my OpenWRT router...
replace_value=$(echo $replace_value | sed -f /usr/lib/ddns/url_escape.sed)
Where url_escape.sed was a file that contained these rules:
# sed url escaping
s:%:%25:g
s: :%20:g
s:<:%3C:g
s:>:%3E:g
s:#:%23:g
s:{:%7B:g
s:}:%7D:g
s:|:%7C:g
s:\\:%5C:g
s:\^:%5E:g
s:~:%7E:g
s:\[:%5B:g
s:\]:%5D:g
s:`:%60:g
s:;:%3B:g
s:/:%2F:g
s:?:%3F:g
s^:^%3A^g
s:@:%40:g
s:=:%3D:g
s:&:%26:g
s:\$:%24:g
s:\!:%21:g
s:\*:%2A:g
While it is not impossible to write such a script in BASH (probably using xxd and a very lengthy ruleset) capable of handing UTF-8 input, there are faster and more reliable ways. Attempting to decode UTF-8 into UTF-32 is a non-trivial task to do with accuracy, though very easy to do inaccurately such that you think it works until the day it doesn't.
Even the Unicode Consortium removed their sample code after discovering it was no longer 100% compatible with the actual standard.
The Unicode standard is constantly evolving, and has become extremely nuanced. Any implementation you can whip together will not be properly compliant, and if by some extreme effort you managed it, it wouldn't stay compliant.
Use Perl's URI::Escape module and uri_escape function in the second line of your bash script:
...
value="$(perl -MURI::Escape -e 'print uri_escape($ARGV[0]);' "$2")"
...
Edit: Fix quoting problems, as suggested by Chris Johnsen in the comments. Thanks!
One of variants, may be ugly, but simple:
urlencode() {
local data
if [[ $# != 1 ]]; then
echo "Usage: $0 string-to-urlencode"
return 1
fi
data="$(curl -s -o /dev/null -w %{url_effective} --get --data-urlencode "$1" "")"
if [[ $? != 3 ]]; then
echo "Unexpected error" 1>&2
return 2
fi
echo "${data##/?}"
return 0
}
Here is the one-liner version for example (as suggested by Bruno):
date | curl -Gso /dev/null -w %{url_effective} --data-urlencode @- "" | cut -c 3-
# If you experience the trailing %0A, use
date | curl -Gso /dev/null -w %{url_effective} --data-urlencode @- "" | sed -E 's/..(.*).../\1/'
For the sake of completeness: many solutions using sed or awk only translate a fixed set of special characters. They are therefore quite large and still miss other special characters that should be encoded.
A safe way to urlencode is to encode every single byte, even those that would have been allowed.
echo -ne 'some random\nbytes' | xxd -plain | tr -d '\n' | sed 's/\(..\)/%\1/g'
xxd is taking care here that the input is handled as bytes and not characters.
edit:
xxd comes with the vim-common package in Debian, and I was just on a system where it was not installed and I didn't want to install it. The alternative is to use hexdump from the bsdmainutils package in Debian. According to the following graph, bsdmainutils and vim-common should be about equally likely to be installed:
http://qa.debian.org/popcon-png.php?packages=vim-common%2Cbsdmainutils&show_installed=1&want_legend=1&want_ticks=1
Nevertheless, here is a version that uses hexdump instead of xxd and avoids the tr call:
echo -ne 'some random\nbytes' | hexdump -v -e '/1 "%02x"' | sed 's/\(..\)/%\1/g'
I find it more readable in python:
encoded_value=$(python3 -c "import urllib.parse; print(urllib.parse.quote('''$value'''))")
The triple ' ensures that single quotes in value won't hurt. urllib is in the standard library. It works, for example, for this crazy (real-world) URL:
"http://www.rai.it/dl/audio/" "1264165523944Ho servito il re d'Inghilterra - Puntata 7
I've found the following snippet useful to stick it into a chain of program calls, where URI::Escape might not be installed:
perl -p -e 's/([^A-Za-z0-9])/sprintf("%%%02X", ord($1))/seg'
(source)
If you wish to run a GET request and use pure curl, just add --get to @Jacob's solution.
Here is an example:
curl -v --get --data-urlencode "access_token=$(cat .fb_access_token)" https://graph.facebook.com/me/feed
This may be the best one:
after=$(echo -e "$before" | od -An -tx1 | tr ' ' % | xargs printf "%s")
Direct link to awk version : http://www.shelldorado.com/scripts/cmds/urlencode
I used it for years and it works like a charm
:
##########################################################################
# Title : urlencode - encode URL data
# Author : Heiner Steven (heiner.steven@odn.de)
# Date : 2000-03-15
# Requires : awk
# Categories : File Conversion, WWW, CGI
# SCCS-Id. : @(#) urlencode 1.4 06/10/29
##########################################################################
# Description
# Encode data according to
# RFC 1738: "Uniform Resource Locators (URL)" and
# RFC 1866: "Hypertext Markup Language - 2.0" (HTML)
#
# This encoding is used i.e. for the MIME type
# "application/x-www-form-urlencoded"
#
# Notes
# o The default behaviour is not to encode the line endings. This
# may not be what was intended, because the result will be
# multiple lines of output (which cannot be used in an URL or a
# HTTP "POST" request). If the desired output should be one
# line, use the "-l" option.
#
# o The "-l" option assumes, that the end-of-line is denoted by
# the character LF (ASCII 10). This is not true for Windows or
# Mac systems, where the end of a line is denoted by the two
# characters CR LF (ASCII 13 10).
# We use this for symmetry; data processed in the following way:
# cat | urlencode -l | urldecode -l
# should (and will) result in the original data
#
# o Large lines (or binary files) will break many AWK
# implementations. If you get the message
# awk: record `...' too long
# record number xxx
# consider using GNU AWK (gawk).
#
# o urlencode will always terminate it's output with an EOL
# character
#
# Thanks to Stefan Brozinski for pointing out a bug related to non-standard
# locales.
#
# See also
# urldecode
##########################################################################
PN=`basename "$0"` # Program name
VER='1.4'
: ${AWK=awk}
Usage () {
echo >&2 "$PN - encode URL data, $VER
usage: $PN [-l] [file ...]
-l: encode line endings (result will be one line of output)
The default is to encode each input line on its own."
exit 1
}
Msg () {
for MsgLine
do echo "$PN: $MsgLine" >&2
done
}
Fatal () { Msg "$@"; exit 1; }
set -- `getopt hl "$@" 2>/dev/null` || Usage
[ $# -lt 1 ] && Usage # "getopt" detected an error
EncodeEOL=no
while [ $# -gt 0 ]
do
case "$1" in
-l) EncodeEOL=yes;;
--) shift; break;;
-h) Usage;;
-*) Usage;;
*) break;; # First file name
esac
shift
done
LANG=C export LANG
$AWK '
BEGIN {
# We assume an awk implementation that is just plain dumb.
# We will convert an character to its ASCII value with the
# table ord[], and produce two-digit hexadecimal output
# without the printf("%02X") feature.
EOL = "%0A" # "end of line" string (encoded)
split ("1 2 3 4 5 6 7 8 9 A B C D E F", hextab, " ")
hextab [0] = 0
for ( i=1; i<=255; ++i ) ord [ sprintf ("%c", i) "" ] = i + 0
if ("'"$EncodeEOL"'" == "yes") EncodeEOL = 1; else EncodeEOL = 0
}
{
encoded = ""
for ( i=1; i<=length ($0); ++i ) {
c = substr ($0, i, 1)
if ( c ~ /[a-zA-Z0-9.-]/ ) {
encoded = encoded c # safe character
} else if ( c == " " ) {
encoded = encoded "+" # special handling
} else {
# unsafe character, encode it as a two-digit hex-number
lo = ord [c] % 16
hi = int (ord [c] / 16);
encoded = encoded "%" hextab [hi] hextab [lo]
}
}
if ( EncodeEOL ) {
printf ("%s", encoded EOL)
} else {
print encoded
}
}
END {
#if ( EncodeEOL ) print ""
}
' "$@"
Here's a Bash solution which doesn't invoke any external programs:
uriencode() {
s="${1//'%'/%25}"
s="${s//' '/%20}"
s="${s//'"'/%22}"
s="${s//'#'/%23}"
s="${s//'$'/%24}"
s="${s//'&'/%26}"
s="${s//'+'/%2B}"
s="${s//','/%2C}"
s="${s//'/'/%2F}"
s="${s//':'/%3A}"
s="${s//';'/%3B}"
s="${s//'='/%3D}"
s="${s//'?'/%3F}"
s="${s//'@'/%40}"
s="${s//'['/%5B}"
s="${s//']'/%5D}"
printf %s "$s"
}
url=$(echo "$1" | sed -e 's/%/%25/g' -e 's/ /%20/g' -e 's/!/%21/g' -e 's/"/%22/g' -e 's/#/%23/g' -e 's/\$/%24/g' -e 's/\&/%26/g' -e 's/'\''/%27/g' -e 's/(/%28/g' -e 's/)/%29/g' -e 's/\*/%2a/g' -e 's/+/%2b/g' -e 's/,/%2c/g' -e 's/-/%2d/g' -e 's/\./%2e/g' -e 's/\//%2f/g' -e 's/:/%3a/g' -e 's/;/%3b/g' -e 's/</%3c/g' -e 's/=/%3d/g' -e 's/>/%3e/g' -e 's/?/%3f/g' -e 's/@/%40/g' -e 's/\[/%5b/g' -e 's/\\/%5c/g' -e 's/\]/%5d/g' -e 's/\^/%5e/g' -e 's/_/%5f/g' -e 's/`/%60/g' -e 's/{/%7b/g' -e 's/|/%7c/g' -e 's/}/%7d/g' -e 's/~/%7e/g')
this will encode the string inside of $1 and output it in $url. although you don't have to put it in a var if you want. BTW didn't include the sed for tab thought it would turn it into spaces
Using php from a shell script:
value="http://www.google.com"
encoded=$(php -r "echo rawurlencode('$value');")
# encoded = "http%3A%2F%2Fwww.google.com"
echo $(php -r "echo rawurldecode('$encoded');")
# returns: "http://www.google.com"
http://www.php.net/manual/en/function.rawurlencode.php
http://www.php.net/manual/en/function.rawurldecode.php
If you don't want to depend on Perl you can also use sed. It's a bit messy, as each character has to be escaped individually. Make a file with the following contents and call it urlencode.sed
s/%/%25/g
s/ /%20/g
s/ /%09/g
s/!/%21/g
s/"/%22/g
s/#/%23/g
s/\$/%24/g
s/\&/%26/g
s/'\''/%27/g
s/(/%28/g
s/)/%29/g
s/\*/%2a/g
s/+/%2b/g
s/,/%2c/g
s/-/%2d/g
s/\./%2e/g
s/\//%2f/g
s/:/%3a/g
s/;/%3b/g
s/</%3c/g
s/=/%3d/g
s/>/%3e/g
s/?/%3f/g
s/@/%40/g
s/\[/%5b/g
s/\\/%5c/g
s/\]/%5d/g
s/\^/%5e/g
s/_/%5f/g
s/`/%60/g
s/{/%7b/g
s/|/%7c/g
s/}/%7d/g
s/~/%7e/g
s/ /%09/g
To use it do the following.
STR1=$(echo "https://www.example.com/change&$ ^this to?%checkthe#-functionality" | cut -d\? -f1)
STR2=$(echo "https://www.example.com/change&$ ^this to?%checkthe#-functionality" | cut -d\? -f2)
OUT2=$(echo "$STR2" | sed -f urlencode.sed)
echo "$STR1?$OUT2"
This will split the string into a part that needs encoding, and the part that is fine, encode the part that needs it, then stitches back together.
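The split itself can also be done with parameter expansion instead of two cut calls. This is a sketch only: split_url and the variables url_base and url_query are hypothetical names. Unlike cut -f2, it keeps everything after the first ?, including any later ? characters.

```bash
# split_url URL  (hypothetical helper).
# Sets the global variables url_base and url_query.
split_url() {
  local url=$1
  url_base=${url%%\?*}          # everything before the first "?"
  url_query=""
  if [[ $url == *\?* ]]; then
    url_query=${url#*\?}        # everything after the first "?"
  fi
}

split_url 'https://www.example.com/path?a=1&b=2?c'
printf '%s\n' "$url_base" "$url_query"
# prints https://www.example.com/path
# then   a=1&b=2?c
```

The query part can then be passed through sed -f urlencode.sed as shown above and joined back to url_base with a ?.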
You can put that into a sh script for convenience, maybe have it take a parameter to encode, put it on your path and then you can just call:
urlencode https://www.exxample.com?isThisFun=HellNo
source
You can emulate javascript's encodeURIComponent in perl. Here's the command:
perl -pe 's/([^a-zA-Z0-9_.!~*()'\''-])/sprintf("%%%02X", ord($1))/ge'
You could set this as a bash alias in .bash_profile:
alias encodeURIComponent='perl -pe '\''s/([^a-zA-Z0-9_.!~*()'\''\'\'''\''-])/sprintf("%%%02X",ord($1))/ge'\'
Now you can pipe into encodeURIComponent:
$ echo -n 'hèllo wôrld!' | encodeURIComponent
h%C3%A8llo%20w%C3%B4rld!
Python 3 based on @sandro's good answer from 2010:
echo "Test & /me" | python3 -c "import urllib.parse;print (urllib.parse.quote(input()))"
Test%20%26%20/me
This nodejs-based answer will use encodeURIComponent on stdin:
uriencode_stdin() {
node -p 'encodeURIComponent(require("fs").readFileSync(0))'
}
echo -n $'hello\nwörld' | uriencode_stdin
hello%0Aw%C3%B6rld
For those of you looking for a solution that doesn't need perl, here is one that only needs hexdump and awk:
url_encode() {
[ $# -lt 1 ] && { return; }
encodedurl="$1";
# make sure hexdump exists, if not, just give back the url
[ ! -x "/usr/bin/hexdump" ] && { return; }
encodedurl=`
echo $encodedurl | hexdump -v -e '1/1 "%02x\t"' -e '1/1 "%_c\n"' |
LANG=C awk '
$1 == "20" { printf("%s", "+"); next } # space becomes plus
$1 ~ /0[adAD]/ { next } # strip newlines
$2 ~ /^[a-zA-Z0-9.*()\/-]$/ { printf("%s", $2); next } # pass through what we can
{ printf("%%%s", $1) } # take hex value of everything else
'`
}
Stitched together from a couple of places across the net and some local trial and error. It works great!
uni2ascii is very handy:
$ echo -ne '你好世界' | uni2ascii -aJ
%E4%BD%A0%E5%A5%BD%E4%B8%96%E7%95%8C
Simple PHP option:
echo 'part-that-needs-encoding' | php -R 'echo urlencode($argn);'
What would parse URLs better than javascript?
node -p "encodeURIComponent('$url')"
Here is a POSIX function to do that:
url_encode() {
awk 'BEGIN {
for (n = 0; n < 128; n++) {
m[sprintf("%c", n)] = n
}
n = 1
while (1) {
s = substr(ARGV[1], n, 1)
if (s == "") {
break
}
t = s ~ /[[:alnum:]_.!~*\47()-]/ ? t s : t sprintf("%%%02X", m[s])
n++
}
print t
}' "$1"
}
Example:
value=$(url_encode "$2")
The question is about doing this in bash and there's no need for python or perl as there is in fact a single command that does exactly what you want - "urlencode".
value=$(urlencode "${2}")
This is also much better, as the above perl answer, for example, doesn't encode all characters correctly. Try it with the long dash you get from Word and you get the wrong encoding.
Note, you need "gridsite-clients" installed to provide this command:
sudo apt install gridsite-clients
Here's the node version:
uriencode() {
node -p "encodeURIComponent('${1//\'/\\\'}')"
}
Another php approach:
echo "encode me" | php -r "echo urlencode(file_get_contents('php://stdin'));"
Here is my version for busybox ash shell for an embedded system, I originally adopted Orwellophile's variant:
urlencode()
{
local S="${1}"
local encoded=""
local ch
local o
for i in $(seq 0 $((${#S} - 1)) )
do
ch=${S:$i:1}
case "${ch}" in
[-_.~a-zA-Z0-9])
o="${ch}"
;;
*)
o=$(printf '%%%02x' "'$ch")
;;
esac
encoded="${encoded}${o}"
done
echo ${encoded}
}
urldecode()
{
# urldecode <string>
local url_encoded="${1//+/ }"
printf '%b' "${url_encoded//%/\\x}"
}
Ruby, for completeness
value="$(ruby -r cgi -e 'puts CGI.escape(ARGV[0])' "$2")"
Here's a one-line conversion using Lua, similar to blueyed's answer except with all the RFC 3986 Unreserved Characters left unencoded (like this answer):
url=$(echo 'print((arg[1]:gsub("([^%w%-%.%_%~])",function(c)return("%%%02X"):format(c:byte())end)))' | lua - "$1")
Additionally, you may need to ensure that newlines in your string are converted from LF to CRLF, in which case you can insert a gsub("\r?\n", "\r\n") in the chain before the percent-encoding.
Here's a variant that, in the non-standard style of application/x-www-form-urlencoded, does that newline normalization, as well as encoding spaces as '+' instead of '%20' (which could probably be added to the Perl snippet using a similar technique).
url=$(echo 'print((arg[1]:gsub("\r?\n", "\r\n"):gsub("([^%w%-%.%_%~ ])",function(c)return("%%%02X"):format(c:byte())end):gsub(" ","+")))' | lua - "$1")
In this case, I needed to URL encode the hostname. Don't ask why. Being a minimalist, and a Perl fan, here's what I came up with.
url_encode()
{
echo -n "$1" | perl -pe 's/[^a-zA-Z0-9\/_.~-]/sprintf "%%%02x", ord($&)/ge'
}
Works perfectly for me.
