I have two files: one with about 100 root domains, and a second with URLs only. I need to filter the URL list into a third file that contains only the URLs whose domains are in the first file.
Example of URL list:
| URL |
| ------------------------------|
| http://github.com/name |
| http://stackoverflow.com/name2|
| http://stackoverflow.com/name3|
| http://www.linkedin.com/name3 |
Example of word list:
github.com
youtube.com
facebook.com
Result:
| http://github.com/name |
My goal is to keep every whole row whose URL contains one of the listed domains. This is what I tried:
for i in $(cat domains.csv);
do grep "$i" urls.csv >> filtered.csv ;
done
The result is strange: I got some of the links, but not all of those that contain root domains from the first file. I then tried the same thing in Python and saw that the bash loop doesn't do what I wanted. I got a better result with the Python script, but writing a Python script takes more time than running bash commands.
How should I accomplish this with bash in the future?
Using grep:
grep -F -f domains.csv urls.csv
Test Results:
$ cat wordlist
github.com
youtube.com
facebook.com
$ cat urllist
| URL |
| ------------------------------|
| http://github.com/name |
| http://stackoverflow.com/name2|
| http://stackoverflow.com/name3|
| http://www.linkedin.com/name3 |
$ grep -F -f wordlist urllist
| http://github.com/name |
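One possible reason for missed matches (an assumption, since the question does not show the raw files) is that domains.csv was saved with Windows line endings, so every pattern carries an invisible trailing carriage return. A minimal sketch that strips those characters before handing the list to grep; the temporary directory and the sample contents are made up for illustration:

```shell
#!/bin/bash
# Sketch: filter URLs by a domain list that may have CRLF line endings.
workdir=$(mktemp -d)   # hypothetical scratch directory

# Sample data; the \r\n endings imitate a file exported on Windows.
printf 'github.com\r\nyoutube.com\r\nfacebook.com\r\n' > "$workdir/domains.csv"
printf '%s\n' 'http://github.com/name' \
              'http://stackoverflow.com/name2' \
              'http://www.linkedin.com/name3' > "$workdir/urls.csv"

# Remove carriage returns so each pattern is exactly the domain text.
tr -d '\r' < "$workdir/domains.csv" > "$workdir/domains.clean"

# -F: treat patterns as fixed strings, -f: read one pattern per line.
filtered=$(grep -F -f "$workdir/domains.clean" "$workdir/urls.csv")
echo "$filtered"

rm -rf "$workdir"
```

Note that an empty line in the pattern file matches every input line, so it is worth checking the domain list for blank lines as well.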
I've been trying to write a little script to sort image files on my Linux server.
I tried multiple solutions found all over StackExchange, but none of them meets my requirements.
Explanation:
Each photo_folder is filled with images (various extensions).
Mostly, the images are already directly in this folder.
But sometimes, as in the example below, images are hidden in one or more photo_subfolder directories, and the file names are often the same in each of them, such as 1.jpg, 2.jpg...
Basically, I would like to move all image files from each photo_subfolder into its photo_folder, renaming any duplicate filenames before they are merged.
Example:
|parent_folder
| |photo_folder
| | |photo_subfolder1
| | | 1.jpg
| | | 2.jpg
| | | 3.jpg
| | |photo_subfolder2
| | | 1.jpg
| | | 2.jpg
| | | 3.jpg
| | |photo_subfolder3
| | | 1.jpg
| | | 2.jpg
| | | 3.jpg
Expectation:
|parent_folder
| |photo_folder
| | 1_a.jpg
| | 2_a.jpg
| | 3_a.jpg
| | 1_b.jpg
| | 2_b.jpg
| | 3_b.jpg
| | 1_c.jpg
| | 2_c.jpg
| | 3_c.jpg
Note that the file names are just an example. They could be anything.
Thank you!
You can replace the / of the subdirectories with another character, e.g. _ , and then cp/mv the original file to the parent directory.
I try to recreate an example of your directory tree here - very simple, but I hope it can be adapted to your case. Note that I am using bash.
#!/bin/bash
bd=parent
mkdir ${bd}
for i in $(seq 3); do
mkdir -p "${bd}/photoset_${i}/subset_${i}"
for j in $(seq 5); do
touch "${bd}/photoset_${i}/${j}.jpg"
touch "${bd}/photoset_${i}/${j}.png"
touch "${bd}/photoset_${i}/subset_${i}/${j}.jpg"
touch "${bd}/photoset_${i}/subset_${i}/${j}.gif"
done
done
Here is the script that will cp the files from the subdirectories to the parent directory. Basically:
find all the files recursively in the subdirectories and loop over them
use sed to replace / with '_' and store the result in a variable new_filepath (the initial parent_ can also be removed, but this is optional)
copy (or move) the old filepath into parent with filename new_filepath
for xtension in jpg png gif; do
while IFS= read -r -d '' filepath; do
# Replace every / in the path with _ to build a unique flat filename
new_filepath=$(echo "${filepath}" | sed 's#/#_#g')
cp "${filepath}" "${bd}/${new_filepath}"
done < <(find "${bd}" -type f -name "*.${xtension}" -print0)
done
ls "${bd}"
If you also want to remove the leading parent_ from new_filepath, you can replace the new_filepath assignment above with:
new_filepath=$(echo "${filepath}" | sed 's#/#_#g' | sed "s/${bd}_//g")
I assumed that you define all the possible extensions in the script. Otherwise, to find all the extensions in the directory tree, you can use the following snippet from a previous answer:
find . -type f -name '*.*' | sed 's|.*\.||' | sort -u
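The question asks for files to be moved only out of the subfolders, into the photo folder they belong to. The following is a rough sketch of that variation under stated assumptions: the temporary tree and the names sub1/sub2 are hypothetical, the subfolder path is used as a prefix (sub1_1.jpg rather than 1_a.jpg), and -mindepth, -empty, -delete and mv -n are extensions available in GNU and BSD tools but not required by POSIX.

```shell
#!/bin/bash
# Sketch: flatten each photo folder, prefixing moved files with their subfolder path.
root=$(mktemp -d)   # hypothetical stand-in for parent_folder
mkdir -p "$root/photo_folder/sub1" "$root/photo_folder/sub2"
touch "$root/photo_folder/sub1/1.jpg" "$root/photo_folder/sub2/1.jpg"

for folder in "$root"/*/; do
  folder=${folder%/}   # drop the trailing slash left by the glob
  # -mindepth 2 skips files that already sit directly in the photo folder.
  find "$folder" -mindepth 2 -type f -exec sh -c '
    dest=$1; shift
    for filepath in "$@"; do
      rel=${filepath#"$dest"/}                 # e.g. sub1/1.jpg
      flatname=$(printf "%s" "$rel" | tr / _)  # e.g. sub1_1.jpg
      mv -n "$filepath" "$dest/$flatname"      # -n: never overwrite
    done' sh "$folder" {} +
  # Remove the subfolders that are now empty.
  find "$folder" -mindepth 1 -type d -empty -delete
done

flat=$(ls "$root/photo_folder")
echo "$flat"
rm -rf "$root"
```

Because the prefix comes from the subfolder path, two files can only collide if their full relative paths map to the same flattened name, and mv -n leaves such a file in place instead of overwriting.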
I have a list of ip addresses in a text file that I wish to use in a script.
Here is the code outputting the ip addresses in the text file
openstack server list | grep agent | awk '{print $9}' >> ${STACK}_list.txt
I would like to retrieve the IP addresses and use them in a loop, ssh'ing into each one, but I am not sure how to do that.
Please refer to this post:
script to read a file with IP addresses and login
It might be helpful for you.
Thanks.
Subhadeep
You can use a regex to filter all ip addresses from the server list-output:
openstack server list | grep -o '[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*'
You could redirect this output into a file if you need to. To use it within a bash script without writing to a file, you could do something like this:
#!/bin/bash
#
ADDRESSES=$(openstack server list | grep -o '[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*')
for ADDRESS in $ADDRESSES
do
echo "ip: $ADDRESS"
done
It reads all IP addresses from the server-list output, iterates over them in the for loop, and prints each IP on a separate line in the terminal. Instead of the echo, you could insert your ssh command.
Example servers on my deployment:
root@m1r1:~# openstack server list
+--------------------------------------+-----------------------+--------+--------------------------+----------------+--------+
| ID | Name | Status | Networks | Image | Flavor |
+--------------------------------------+-----------------------+--------+--------------------------+----------------+--------+
| 46d04a77-4d33-4bb3-8214-b1444eed33a3 | server1 | ACTIVE | l2-network=192.168.4.131 | cirros | XS |
| e9489aca-00c3-4fc9-afc5-515c08b17406 | server2 | ACTIVE | l2-network=192.168.4.61 | | XS |
| ea8cec6a-a8d5-4bbb-970e-aaf65d7374b2 | server3 | ACTIVE | l2-network=192.168.4.163 | cirros | S |
| 7d934ec4-1d53-467b-9220-d67b4b68a832 | server4 | ACTIVE | l2-network=192.168.4.184 | | XS |
| 74d3036e-372a-4566-8ba2-10a0760c5562 | server5 | ACTIVE | l2-network=192.168.4.232 | cirros | XS |
| e08e1637-f4df-478d-a478-6578d038cb22 | server6 | ACTIVE | l2-network=192.168.4.190 | | XS |
| 8307a481-679e-4df0-a64e-3a497b13ac81 | server7 | ACTIVE | l2-network=192.168.4.202 | | XS |
| 38d10b12-daa5-483e-b9a5-9a16ba14d841 | server8 | ACTIVE | l2-network=192.168.4.250 | cirros | XS |
+--------------------------------------+-----------------------+--------+--------------------------+----------------+--------+
Output of this example:
ip: 192.168.4.131
ip: 192.168.4.61
ip: 192.168.4.163
ip: 192.168.4.184
ip: 192.168.4.232
ip: 192.168.4.190
ip: 192.168.4.202
ip: 192.168.4.250
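Because `[0-9]*` also matches zero digits, the pattern above would accept a bare `...` as well. A slightly stricter sketch using an extended regular expression is shown below; the two table rows are copied from the example output, and the variable names are made up for illustration. It still does not check that each number is at most 255.

```shell
#!/bin/bash
# Sketch: extract dotted-quad addresses with a stricter pattern.
# Two rows taken from the example "openstack server list" output above.
sample='| 46d04a77-4d33-4bb3-8214-b1444eed33a3 | server1 | ACTIVE | l2-network=192.168.4.131 | cirros | XS |
| e9489aca-00c3-4fc9-afc5-515c08b17406 | server2 | ACTIVE | l2-network=192.168.4.61  |        | XS |'

# -o prints only the matching part, -E enables {m,n} repetition counts.
ips=$(printf '%s\n' "$sample" | grep -oE '([0-9]{1,3}\.){3}[0-9]{1,3}')
echo "$ips"
```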
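#!/bin/sh
# Save the IP addresses to a scratch file named "stack", one per line
openstack server list | grep -o '[0-9]*\.[0-9]*\.[0-9]*\.[0-9]*' > stack
# Process and remove the first line until the file is empty
while [ "$(wc -l < stack)" -gt 0 ]
do
ipnumber=$(sed -n '1p' stack)
echo "${ipnumber}"
sed -i '1d' stack
done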
The echo command there is just a placeholder. You can replace it with ssh, or whatever else you want to do, with the IP number in the variable.
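If the addresses are already stored one per line in a text file, as in the question, a `while read` loop can walk the file without modifying it. This is only a sketch: the temporary file stands in for ${STACK}_list.txt, and the ssh user and remote command in the comment are hypothetical.

```shell
#!/bin/bash
# Sketch: iterate over IP addresses stored one per line in a file.
list=$(mktemp)   # hypothetical stand-in for ${STACK}_list.txt
printf '192.168.4.131\n192.168.4.61\n' > "$list"

count=0
last=""
while IFS= read -r ip; do
  [ -n "$ip" ] || continue   # skip blank lines
  # Placeholder action. With ssh this could be, for example:
  #   ssh -n "user@$ip" uptime
  # (-n stops ssh from consuming the rest of the list on stdin.)
  echo "would connect to $ip"
  count=$((count + 1))
  last=$ip
done < "$list"

rm -f "$list"
```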
I have the following command that I use to rewrite some maxscale output to be able to use it in other software:
maxadmin list servers | sed -r 's/[^a-z 0-9]//gi;/^\s*$/d;1,3d;' | awk '$1=$1' | cut -d ' ' -f 1,5 | sed -e 's/ /":"/g' | sed -e 's/\(.*\)/"\1"/' | tr '\n' ',' | sed 's/.$/}\n/' | sed 's/^/{/'
I think this is way too complex for what I want to do, but I am not able to come up with a simpler version myself. What I want is to rewrite this (output of maxadmin list servers):
Servers.
-------------------+-----------------+-------+-------------+--------------------
Server | Address | Port | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
svr_node1 | 192.168.178.1 | 3306 | 0 | Master, Synced, Running
svr_node2 | 192.168.178.1 | 3306 | 0 | Slave, Synced, Running
svr_node3 | 192.168.178.1 | 3306 | 0 | Slave, Synced, Running
-------------------+-----------------+-------+-------------+--------------------
Into this:
{"svrnode1":"Master","svrnode2":"Slave","svrnode3":"Slave"}
My command does a good job but, as I said, there should be a simpler way, hopefully with fewer sed commands being run.
You can use awk, like this:
json.awk
BEGIN {
printf "{"
}
# Every line after line four (the table header), skipping the
# ------ separator lines and empty lines (if any).
NR>4&&!/^([-]|$)/{
sub(/,/,"",$9) # Remove trailing comma
printf "%s\"%s\":\"%s\"",s,$1,$9
s="," # Set comma separator after first iteration
}
END {
print "}"
}
Run it like this:
maxadmin list servers | awk -f json.awk
Output:
{"svr_node1":"Master","svr_node2":"Slave","svr_node3":"Slave"}
In the comments, the question came up of how to achieve this without an extra json.awk file:
maxadmin list servers | awk 'BEGIN{printf"{"}NR>4&&!/^([-]|$)/{sub(/,/,"",$9);printf"%s\"%s\":\"%s\"",s,$1,$9;s=","}END{print"}"}'
Ugly, but works. ;)
If you want to put this into a shell script, consider a multiline version like this:
maxadmin list servers | awk '
BEGIN{printf"{"}
NR>4&&!/^([-]|$)/{
sub(/,/,"",$9)
printf"%s\"%s\":\"%s\"",s,$1,$9
s=","
}
END{print"}"}'
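The scripts above rely on whitespace splitting, so the status has to be field 9. A variation that splits on the `|` column separator instead is sketched below; it runs against a shortened copy of the sample table rather than a live maxadmin, and the variable names are made up for illustration.

```shell
#!/bin/bash
# Sketch: build the JSON object by splitting on "|" instead of whitespace.
sample='Servers.
-------------------+-----------------+-------+-------------+--------------------
Server             | Address         | Port  | Connections | Status
-------------------+-----------------+-------+-------------+--------------------
svr_node1          | 192.168.178.1   |  3306 |           0 | Master, Synced, Running
svr_node2          | 192.168.178.1   |  3306 |           0 | Slave, Synced, Running
-------------------+-----------------+-------+-------------+--------------------'

json=$(printf '%s\n' "$sample" | awk -F'|' '
  BEGIN { printf "{" }
  # Data rows come after the four header lines and have five columns;
  # the ------ separator lines contain no "|" and so have one field.
  NR > 4 && NF == 5 {
    name = $1; gsub(/ /, "", name)           # trim column padding
    split($5, status, ",")                   # "Master, Synced, Running"
    state = status[1]; gsub(/ /, "", state)  # keep the first status word
    printf "%s\"%s\":\"%s\"", sep, name, state
    sep = ","                                # comma before later entries
  }
  END { print "}" }')
echo "$json"
```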
I am running a command like this:
mycmd1 | mycmd2 | mycmd3 | lp
Is there a way to redirect stderr to a file for the whole pipe instead of repeating it for each command?
That is to say, I'd rather avoid doing this:
mycmd1 2>/myfile | mycmd2 2>/myfile | mycmd3 2>/myfile | lp 2>/myfile
Either
{ mycmd1 | mycmd2 | mycmd3 | lp; } 2>> logfile
or
( mycmd1 | mycmd2 | mycmd3 | lp ) 2>> logfile
will work. (The first version might have a slightly faster (~1ms) startup time, depending on the shell.)
I tried the following, and it seems to work:
(mycmd1 | mycmd2 | mycmd3 | lp) 2>>/var/log/mylogfile.log
I use >> because I want to append to the logfile rather than overwriting it every time.
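To see the grouping at work without a printer, here is a small self-contained sketch. The functions stage1 and stage2 are hypothetical stand-ins for mycmd1 and mycmd2; each writes one line to stderr, and both lines end up in the same log file while stdout still flows through the pipe.

```shell
#!/bin/bash
# Sketch: one stderr redirection for a whole pipeline.
logfile=$(mktemp)   # hypothetical log file

# Hypothetical stand-ins for the real commands.
stage1() { echo "payload"; echo "stage1 warning" >&2; }
stage2() { cat; echo "stage2 warning" >&2; }

# The braces group the pipeline, so 2>> applies to every stage in it.
out=$( { stage1 | stage2; } 2>> "$logfile" )

# The stages run concurrently, so sort the log for a stable order.
logged=$(sort "$logfile")
echo "stdout: $out"
echo "$logged"
rm -f "$logfile"
```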
Is there a way to use a shell script to get only the name and the net address from a result like the one below?
Result
6cb7f14e-6466-4211-9a09-2b8e7ad92703 | name-erkoev4ja3rv | 2e3900ff36574cf9937d88223403da77 | ACTIVE | Running | net0=10.1.1.2; ing-net=10.1.1.3; net=10.1.1.4;
Expected Result
name-erkoev4ja3rv: 10.1.1.4
$ input="6cb7f14e-6466-4211-9a09-2b8e7ad92703 | name-erkoev4ja3rv | 2e3900ff36574cf9937d88223403da77 | ACTIVE | Running | net0=10.1.1.2; ing-net=10.1.1.3; net=10.1.1.4;"
$ echo "$input" | sed -E 's,^[^|]+ \| ([^ ]+).* net=([0-9.]+).*$,\1: \2,g'
name-erkoev4ja3rv: 10.1.1.4
echo "6cb7f14e-6466-4211-9a09-2b8e7ad92703 | name-erkoev4ja3rv | 2e3900ff36574cf9937d88223403da77 | ACTIVE | Running | net0=10.1.1.2; ing-net=10.1.1.3; net=10.1.1.4;" | awk -F ' ' '{print $3}{print $13}'
Does this satisfy your case?
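Another possible approach, sketched here with awk, is to split the row on the `|` separators and then pick the entry that starts with `net=` out of the last column. It assumes the name is always the second column and the networks the sixth, as in the sample row.

```shell
#!/bin/bash
# Sketch: print "name: address" for the network called exactly "net".
input='6cb7f14e-6466-4211-9a09-2b8e7ad92703 | name-erkoev4ja3rv | 2e3900ff36574cf9937d88223403da77 | ACTIVE | Running | net0=10.1.1.2; ing-net=10.1.1.3; net=10.1.1.4;'

# The field separator is a "|" with optional spaces on either side.
result=$(printf '%s\n' "$input" | awk -F' *[|] *' '{
  n = split($6, nets, /; */)       # one array entry per network
  for (i = 1; i <= n; i++)
    if (nets[i] ~ /^net=/) {       # skips net0= and ing-net=
      sub(/^net=/, "", nets[i])    # drop the key
      sub(/;$/, "", nets[i])       # drop a trailing semicolon, if any
      print $2 ": " nets[i]
    }
}')
echo "$result"
```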