Synchronizing between multiple pexpect processes - pexpect

I am writing an application that requires me to ssh and telnet to the same device at the same time.
The pseudo code goes something like this:
p1 = pexpect.spawn("ssh to the device")
p1.send("run some command")
p1.expect("..")
p2 = pexpect.spawn("telnet to same device")
p2.send("run a command that can be run only through telnet")
p2.expect("..")
p1.send("run some other command")
p1.expect("..")
p2.send("run another command that can be run only through telnet")
p2.expect("..")
As you can see, I need synchronization between the two pexpect children
in order to run them one after the other.
I searched a lot but could not find any information.
Please help.
Thanks.
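For what it's worth, here is a minimal runnable sketch (the host, login, and prompt pattern are placeholders): pexpect's expect() blocks until its pattern matches or a timeout fires, so two children driven from ordinary sequential code are already synchronized, because each send happens only after the previous expect has returned.

import pexpect

HOST = "192.0.2.1"     # placeholder device address
PROMPT = r"[#$] "      # placeholder prompt regex for this device

# expect() blocks until the pattern matches (or raises pexpect.TIMEOUT),
# so straight-line code runs the two sessions in strict alternation;
# no extra synchronization primitive is needed.
p1 = pexpect.spawn(f"ssh admin@{HOST}", timeout=30)
p1.expect(PROMPT)      # assumes key-based login; add password handling if needed

p2 = pexpect.spawn(f"telnet {HOST}", timeout=30)
p2.expect(PROMPT)

p1.sendline("some command")
p1.expect(PROMPT)      # returns only once p1's command has finished

p2.sendline("a command available only over telnet")
p2.expect(PROMPT)      # returns only once p2's command has finished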

Related

How to run 1 playbook for the same group by multiple plays aka threaded

Our current setup has ~2000 servers (in 1 group).
I would like to know if there is a way to run x.yml on the whole group (where all 2k servers are) but split across multiple plays (threaded, or something like):
ansible-playbook -i prod.ini -l my_group[50%] x.yml
ansible-playbook -i prod.ini -l my_group[other 50%] x.yml
Solutions with AWX or Ansible Tower are not relevant.
Using even 500-1000 forks didn't give any improvement.
Try combining forks with the free strategy.
The default behavior of Ansible is:
Ansible runs each task on all hosts affected by a play before starting the next task on any host, using 5 forks.
So even if you increase the number of forks, each task still waits for every host to finish before the play moves on. The free strategy instead allows each host to run until the end of the play as fast as it can:
- hosts: all
  strategy: free
  tasks:
    # ...
ansible-playbook -i prod.ini -f 500 -l my_group x.yml
As mentioned above, you should preferably increase forks and set the strategy to free. Increasing forks lets the playbook act on more servers at once, and the free strategy lets each server run its tasks independently without waiting for the others.
Please refer to the Ansible docs for more clarification.
Resolved by using the patterns my_group[:1000] and my_group[999:] (see the sketch after this post).
Forks didn't give any time decrease in my case.
Also, the free strategy actually multiplied the run time, which was pretty weird.
And debugging the free strategy's output is quite difficult when you have 2k servers and about 50 tasks in a playbook.
Thanks everyone for sharing, much appreciated.
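A sketch of that split, reusing the inventory and playbook names from the question. Note that the two subscript patterns may overlap by a host or so depending on the slice semantics of your Ansible version; for an idempotent playbook, a host running twice is usually harmless.

# run the two halves in parallel from one shell; the quotes keep the
# bracket patterns away from shell globbing
ansible-playbook -i prod.ini -l 'my_group[:1000]' x.yml &
ansible-playbook -i prod.ini -l 'my_group[999:]' x.yml &
wait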

pexpect timed out before script ends

I am using pexpect to connect to a remote server using ssh.
The following code works, but I have to use time.sleep to add delays, especially when I am sending a command that runs a script on the remote server.
The script takes up to a minute to run, and if I don't use a 60-second delay, my session moves on prematurely.
The same issue occurs when I am using sftp to download a file: if the file is large, it downloads only partially.
Is there a way to control this without using a delay?
#!/usr/bin/python3
import pexpect
import time
from subprocess import call

siteip = "131.235.111.111"
ssh_new_conn = 'Are you sure you want to continue connecting'
password = 'xxxxx'
child = pexpect.spawn('ssh admin@' + siteip)
time.sleep(1)
child.expect('admin@.* password:')
child.sendline('xxxxx')
time.sleep(2)
child.expect('admin@.*')
print('ssh to abcd - takes 60 seconds')
child.sendline('backuplog\r')
time.sleep(50)
child.sendline('pwd')
Many pexpect functions take an optional timeout= keyword, and the one you give in spawn() sets the default. E.g.
child.expect('admin@', timeout=70)
You can use the value None to never time out.
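A sketch of the question's session with the sleeps replaced by explicit timeouts (the host, password, and backuplog command come from the question; the prompt patterns are assumptions):

import pexpect

siteip = "131.235.111.111"

# the timeout passed to spawn() becomes the default for every expect()
child = pexpect.spawn('ssh admin@' + siteip, timeout=30)
child.expect('admin@.* password:')
child.sendline('xxxxx')
child.expect('admin@.*')             # wait for the shell prompt

# this command can take up to a minute, so give just this expect() longer
child.sendline('backuplog')
child.expect('admin@.*', timeout=90)

# timeout=None would wait forever instead of raising pexpect.TIMEOUT
child.sendline('pwd')
child.expect('admin@.*')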

How to run code in a debugging session from VS code on a remote using an interactive session?

I am using a cluster (similar to Slurm but using Condor) and I want to run my code using VS Code (its debugger especially) and its remote extension.
I tried running it using the VS Code debugger, but it didn't quite work as expected.
First I log in to the cluster using VS Code and the remote extension as usual, and that works just fine. Then I go ahead and get an interactive job with the command:
condor_submit -i request_cpus=4 request_gpus=1
That successfully gives me a node/GPU to use.
Once I have that I try to run the debugger, but somehow it logs me out from the remote session (and judging from the print statements, it looks like it runs on the head node). That's NOT what I want. I want to run my job in the interactive session on the node/GPU I was allocated. Why is VS Code running it in the wrong place? How can I run it in the right place?
Some of the output from the integrated terminal:
source /home/miranda9/miniconda3/envs/automl-meta-learning/bin/activate
/home/miranda9/miniconda3/envs/automl-meta-learning/bin/python /home/miranda9/.vscode-server/extensions/ms-python.python-2020.2.60897-dev/pythonFiles/lib/python/new_ptvsd/wheels/ptvsd/launcher /home/miranda9/automl-meta-learning/automl/automl/meta_optimizers/differentiable_SGD.py
conda activate base
(automl-meta-learning) miranda9~/automl-meta-learning $ source /home/miranda9/miniconda3/envs/automl-meta-learning/bin/activate
(automl-meta-learning) miranda9~/automl-meta-learning $ /home/miranda9/miniconda3/envs/automl-meta-learning/bin/python /home/miranda9/.vscode-server/extensions/ms-python.python-2020.2.60897-dev/pythonFiles/lib/python/new_ptvsd/wheels/ptvsd/launcher /home/miranda9/automl-meta-learning/automl/automl/meta_optimizers/differentiable_SGD.py
--> main in differentiable SGD
hello world torch_utils!
vision-sched.cs.illinois.edu
Files already downloaded and verified
Files already downloaded and verified
Files already downloaded and verified
-> initialization of DiMO done!
---> i = 0, iteration/it 1 about to start
lp_norms(mdl) = 18.43514633178711
lp_norms(meta_optimized mdl) = 18.43514633178711
[e=0,it=1], train_loss: 2.304989814758301, train error: -1, test loss: -1, test error: -1
---> i = 1, iteration/it 2 about to start
lp_norms(mdl) = 18.470401763916016
lp_norms(meta_optimized mdl) = 18.470401763916016
[e=0,it=2], train_loss: 2.3068909645080566, train error: -1, test loss: -1, test error: -1
---> i = 2, iteration/it 3 about to start
lp_norms(mdl) = 18.548133850097656
lp_norms(meta_optimized mdl) = 18.548133850097656
[e=0,it=3], train_loss: 2.3019633293151855, train error: -1, test loss: -1, test error: -1
---> i = 0, iteration/it 1 about to start
lp_norms(mdl) = 18.65604019165039
lp_norms(meta_optimized mdl) = 18.65604019165039
[e=1,it=1], train_loss: 2.308889150619507, train error: -1, test loss: -1, test error: -1
---> i = 1, iteration/it 2 about to start
lp_norms(mdl) = 18.441967010498047
lp_norms(meta_optimized mdl) = 18.441967010498047
[e=1,it=2], train_loss: 2.300947666168213, train error: -1, test loss: -1, test error: -1
---> i = 2, iteration/it 3 about to start
lp_norms(mdl) = 18.545459747314453
lp_norms(meta_optimized mdl) = 18.545459747314453
[e=1,it=3], train_loss: 2.30662202835083, train error: -1, test loss: -1, test error: -1
-> DiMO done training!
--> Done with Main
(automl-meta-learning) miranda9~/automl-meta-learning $ conda activate base
(automl-meta-learning) miranda9~/automl-meta-learning $ hostname
vision-sched.cs.illinois.edu
Doesn't even run without debugging mode
The problem is more serious than I thought. I can't run the debugger in the interactive session, but I can't even "Run Without Debugging" without it switching to the Python Debug Console on its own. So that means I have to run things manually with python main.py, but that won't allow me to use the variables pane... which is a big loss!
What I am doing is switching my terminal to the condor_ssh_to_job session and then clicking the button Run Without Debugging (or ^F5 or Control + fn + f5), and although I made sure to be on the interactive session at the bottom in my integrated terminal, it goes by itself to the Python Debug Console window/pane, which is not connected to the interactive session I requested from my cluster...
related:
gitissue: https://github.com/microsoft/vscode-remote-release/issues/1722
quora: https://qr.ae/TqCiu8
reddit: https://www.reddit.com/r/vscode/comments/f1giwi/how_to_run_code_in_a_debugging_session_from_vs/
You can try reversing the order of operations; first submitting the job, obtaining the name of the compute node allocated to you, then instructing VSCode to connect to the compute node rather than the login node.
So first would be
condor_submit -i request_cpus=4 request_gpus=1
and noting the name of the compute node. Assuming node001 in the following.
Then, open VSCode on your laptop, click on the Remote Development extension icon and choose "Remote SSH: Connect to Host...". Choose "+ Add new SSH host...". In the "Enter SSH command" box, add the following:
ssh -J vision-sched.cs.illinois.edu miranda9@node001
VSCode will ask you which SSH configuration file it should update. Make sure to review that configuration: specify the SSH keys if needed, the user name, etc. Also make sure you have vision-sched.cs.illinois.edu correctly configured in that file.
Then you can choose that host to connect to. VSCode will then execute on the compute node, and will be disconnected when the allocation finishes.
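For reference, the one-off -J flag above corresponds to an entry like the following in the SSH configuration file that VSCode offers to update (the user and node names are the ones assumed in this answer):

Host node001
    HostName node001
    User miranda9
    ProxyJump vision-sched.cs.illinois.edu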
I stumbled upon a related issue recently (I wanted to use VSCode's interactive Python capabilities on a compute node); the above wasn't working for me, but this solved it:
ssh to the remote cluster: ssh cluster
inside the remote cluster, add my public key to the authorized keys, i.e. append the content of ~/.ssh/id_rsa.pub (local machine) to .ssh/authorized_keys (remote cluster)
allocate some resources inside the cluster (this particular cluster uses Slurm and not Condor, so in this case I use something like srun --pty bash)
get the name of the compute node, typically visible in the command-line prompt as username@nodename. For argument's sake, let's imagine I get a generic name like node001
for simplicity, on my local machine, edit the ~/.ssh/config file as follows:
Host cluster
    # stuff written
Host node*
    HostName %h
    ProxyJump cluster
    User $USERNAME
Now I'm able to ssh to it from my local machine (as long as the compute node is running) with ssh node001.
In VsCode this boils down to
CTRL+P > Remote-SSH: Connect to Host...
type in the name of the node, here node001
you get connected to the node; now every interactive Python session you run (including jupyter and jupytext) will have access to your allocated resources
I don't know how generic this solution is, but I hope it helps at least somebody!
Here is a simpler workaround:
on the remote server, create a file named bash somewhere, for example /home/myuser/pathto/bash
make it executable using chmod +x bash
write salloc [your desired options for the interactive job] in that bash file
in VSCode settings, search for Automation Shell: Linux and click on "Edit in settings.json"
change the line to "terminal.integrated.automationShell.linux": "/home/myuser/pathto/bash" and save it (use the absolute path; for example ~/pathto/bash didn't work for me)
Done :)
Now every time you run the debugger it will first ask for the interactive job, and the debugger will run on it. But take into consideration that this also applies to tasks you run via tasks.json.
You can also use srun instead of salloc, for example srun --pty -t 2:00:00 --mem=8G bash (see the sketch below).
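A sketch of what that wrapper file might contain, using the srun variant (the options are the ones from this answer; "$@" forwards whatever arguments VSCode passes to its automation shell):

#!/bin/bash
# /home/myuser/pathto/bash: wrapper used as VSCode's automation shell.
# It allocates an interactive job and runs the real shell, with VSCode's
# arguments, inside that allocation.
exec srun --pty -t 2:00:00 --mem=8G bash "$@"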

Bash commands putting out extra information which results into issues with scripts

Okay, hopefully I can explain this correctly, as I have no idea what's causing it or how to resolve it.
For some reason, bash commands (on a CentOS 6.x server) are displaying more information than they "normally" do, and that breaks certain scripts. I have no clue if there is a name for this behaviour, but hopefully someone knows a solution.
First example.
Correct / good server:
[root@goodserver ~]# vzctl enter 3567
entered into CT 3567
[root@example /]#
(this is the correct behaviour)
Incorrect / bad server:
[root@badserver /]# vzctl enter 3127
Entering CT
entered into CT 3127
Open /dev/pts/0
[root@example /]#
With the "bad" server it will display more information as usual, like:
Entering CT
Open /dev/pts/0
It's like it parsing extra information on what it's doing.
Ofcourse the above is purely something cosmetic, however with several bash scripts we use, these issues are really issues.
A part of the script we use, uses the following command (there are more, but this is mainly a example of what's wrong):
DOMAIN=`vzctl exec $VEID 'hostname -d'`
The result of the above information is parsed in /etc/named.conf.
On the GOOD server it would be added in the named.conf like this:
zone "example.com" {
type master;
file "example.com";
allow-transfer {
200.190.100.10;
200.190.101.10;
common-allow-transfer;
};
};
The above is correct.
On the BAD server it would be added to named.conf like this:
zone "Executing command: hostname -d
example.com" {
    type master;
    file "Executing command: hostname -d
example.com";
    allow-transfer {
        200.190.100.10;
        200.190.101.10;
        common-allow-transfer;
    };
};
So it adds output describing the action it performs, in this example "Executing command: hostname -d".
Here is another example of running the command on a good server and on the bad server.
Bad server:
[root@bad-server /]# DOMAIN=`vzctl exec 3333 'hostname -d'`
[root@bad-server /]# echo $DOMAIN
Executing command: hostname -d example.com
Good server:
[root@good-server ~]# DOMAIN=`vzctl exec 4444 'hostname -d'`
[root@good-server ~]# echo $DOMAIN
example.com
My knowledge is limited, but I have tried several things, checking rsyslog and grub.conf, and nothing seems out of the ordinary.
I have no clue why it's displaying the extra information.
It's probably something simple/stupid, but I have been trying to solve this for hours now and I really have no clue...
So any help is really appreciated.
Added information:
Both servers use: kernel.printk = 7 4 1 7
(I don't know if that's useful)
Well (thanks to Aaron for pointing me in the right direction) I finally found the little culprit that was causing all the issues I experienced with this script (which worked on every other server, so no need to change it, obviously).
The issues were caused by the VERBOSE level set in vz.conf (located in the /etc/vz/ directory). There is an option in there called VERBOSE, and in my case it was set to 3.
According to OpenVZ's website it does the following:
Increments logging level up from the default. Can be used multiple times.
Default value is set to the value of VERBOSE parameter in the global
configuration file vz.conf(5), or to 0 if not set by VERBOSE parameter.
After I changed VERBOSE=3 to VERBOSE=0 my script worked fine once again (as it did for every other server). :-)
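For reference, the change boils down to a single line in /etc/vz/vz.conf:

# /etc/vz/vz.conf
VERBOSE=0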
So a big shoutout to Aaron for pointing me in the right direction. The answer is easy when you know where to look!
Sorry to say, but I am kinda disappointed by ndim's reaction. This is the 2nd time he was very unhelpful and rude in his response after that. He clearly didn't read the issue I posted correctly. Oh well.
I would make sure to properly parse the output of the command. In this case, we are only interested in lines of the form
entered into CT 12345
One way of doing this would be to pipe everything through sed and have sed print only the number when the line looks like the above (in sed's basic regex syntax the parentheses and braces need backslashes):
whateverthecommand | sed -n 's/^entered into CT \([0-9]\{1,\}\)$/\1/p'
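Applied to the DOMAIN assignment from the question, and assuming the verbose "Executing command: ..." line always precedes the real output, simply keeping the last line of the output would also work:

# keep only the last line of vzctl's output (the actual hostname -d result)
DOMAIN=$(vzctl exec "$VEID" 'hostname -d' | tail -n 1)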

Errors in UDP sending in a sub-script (bash)

Using a Raspi/Debian - I have a script that parses the results from an iwlist scan and sends them via UDP to a Pure Data patch. This runs fine in gui mode, but now I'm trying to automate the whole process in another script with the following:
pd-extended -nogui /home/pi/patch.pd & /home/pi/libOSC/scan.sh && fg
But when I run this new script, the UDP info appears to reach Pure Data only once; the scanning continues, but Pd does not receive any further packets. Any help with this would be appreciated.
What happens when you run /home/pi/libOSC/scan.sh? Does it send the results only once? Then maybe you need to do it differently, like calling that script from within Pd using the 'shell' or 'popen' objects, for instance. Or you could implement a polling command via UDP that returns the values.
What does your scan.sh script look like?
You probably want to make it something like:
pdhost=localhost
pdport=9999

do_scan() {
    ## some code here that does the scan and prints the result to stdout
}

# note that pdsend takes the port first, then the host and protocol
do_scan | while read line
do
    echo "${line};" | pdsend ${pdport} ${pdhost} udp
done
rather than the following:
do_scan | pdsend ${pdport} ${pdhost} udp
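As a quick smoke test of the receiving side, independent of the scan (this assumes the patch listens for UDP on port 9999 of localhost, matching the placeholders above):

# Pd should receive 'hello' if the UDP path works
echo "hello;" | pdsend 9999 localhost udp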
