What does stat() system call really do? - amazon-ec2

I am playing around with EFS in AWS. My setup includes 2 EC2 instances (a FreeBSD 9.3 & a CentOS 7) in the same AZ - both mounting an EFS volume (let's say, at /data/efs)
I have the below code running on both instances & all files are in /data/efs -
for x in xrange(100):
try:
os.link(socket.get_hostname(), 'file')
break
except OSError as e:
if e.errno == 17:
# Other host is yet to unlink
time.sleep(1)
else:
raise
<do_something>
os.unlink('file')
Invariably (irrespective of order of starting the script on the hosts), the linux host gets hold of control at some point & finishes "doing something" first a 100 times. Post that, FreeBSD instance takes a while (up to 1min) to start "doing something" again.
While FreeBSD host waits, running 'ls' on the host doesn't show the entry 'file' in /home/efs but still os.link() call above fails on it. But if I run os.stat() (ran it the first time to check link count), FreeBSD host starts "doing something" again immediately.
What does os.stat() really do in an NFS scenario? Does it force a recount?

Related

Initialize and terminate a bot on EC2 remotely

I am trying to run a script to start, stop, or restart a bot from my front end webpage.
I have a bot that runs almost 24/7 on a Linux EC2 instance, and a webpage front end that allows for parameter input and shows the current status of the bot. The front end sends a POST request to a lambda function, which writes the parameters to my S3 bucket. The script to start the bot on the EC2 instance pulls the latest parameters from S3 and initializes the bot. When the bot starts up and shuts down, it writes the status ("running", "stopped") to a file in the S3 bucket, which then shows on the front end.
I have looked into SSM Run Command with Lambda, but given that the bot runs for days at a time, I don't believe that's viable. Additionally, it uses an agent to connect, so trying to use the screen command would terminate when the agent terminates.
I have also tried adding the script to my EC2 instance’s User Data, but that does not seem to work. Similarly a cron job for reboot does not work.
I've considered using a trigger file in S3, i.e. having the EC2 instance check at a given time interval for some trigger file in S3 that would indicate a start or stop, but that seems very resource intensive.
What alternatives do I have?
The solution that worked for me was setting up a crontab job that runs on reboot, then starting, stopping, and restarting the EC2 instance with a lambda function.
Steps to resolve this for anyone in the same boat:
SSH into the EC2 instance
crontab -e
add the following line:
#reboot sleep 60 && cd /home/ec2-user/bot_folder/ && /usr/bin/screen -S bot -dm /usr/bin/python3 run_bot.py
(for vim, press i to enter insert mode, paste the line and make changes, then press esc :wq enter to save)
Ensure that the script has all of its paths specified absolutely. In my case, using Selenium, the chromedriver path needed to be specified.
Finally, setup a lambda function to start/stop/reboot your instance as the comment above referenced.

Bash - create multiple virtual guests in one loop

I'm working on a bash script (I just started learning bash) that involves creating virtual guests on a remote server. I do this by SSH'ing from server A to B and execute 2 different commands:
# create the images
$(ssh -n john#serverB.net "fallocate -l ${imgsize} /home/john/images/${imgname}")
and
# create the virtual machine
$(ssh -n john#serverB.net virt-install --bunch of options)
It is possible that these sets of commands have to be executed twice (if there need to be 2 virtual guests created) in a loop. When the second command is being run for the second time I sometimes get this error:
Domain installation still in progress.
This means I have to wait until the previous virtual guest is completed. How would I be able to do these operations in one loop? Can I run them asynchronously? Can I use threads? Or is there another way?
I have heard about the 'wait' command, but is that safe to use?
Check the man page for virt-install. You can use --wait=0 or --noautoconsole.
--wait=WAIT Amount of time to wait (in minutes) for a VM to complete its install. Without this option, virt-install will wait for the
console to close (not necessarily indicating the guest has shutdown),
or in the case of --noautoconsole, simply kick off the install and
exit. Any negative value will make virt-install wait indefinitely, a
value of 0 triggers the same results as noautoconsole. If the time
limit is exceeded, virt-install simply exits, leaving the virtual
machine in its current state.

How to wait in script until device is connected

I have a Sky wireless sensor node and a script which prints the output from the node.
sudo ./serialdump-linux -b115200 /dev/tmotesky1
If I start this script before my pc detects the node, I get the following error:
/dev/tmotesky1: No such file or directory
But if I wait for example 20 seconds, I miss the initial prints (which are important).
Is there a way to detect if the /dev/tmotesky1 exists?
Something like
while [ ! -f /dev/tmotesky1 ] ; do sleep 1; print 'Waiting...'; done
Thanks in advance!
Your code indicates that you are using Linux where you can use the hotplugging mechanism.
On generic systems, you can write an udev rule (--> see with udevadmin monitor -e what happens when you attach the device) which starts e.g. a program or writes something into a pipe. When systemd is used, you can start a service (see man systemd.device).
On small/embedded systems it is possible to write a custom /sbin/hotplug program (set in /proc/sys/kernel/hotplug) instead of using udev.

Bash: Script output to terminal session stops but script finishes normal

I'm opening an ssh-session to a remote server and execute a larger (around 1000 lines) bash-script on the remote machine. It involves several very CPU-intensive calls which run for up to three minutes each. To track the scripts progress it echoes messages placed at several points in the script.
In general the script runs smoothly. From time to time the script runs trough (the resulting file on the remote machine is correct) but the output to the terminal stops. Ctrl-C doesn't help, no prompt, just a frozen session. top in a separate session shows normal execution of the script.
My question: How keep the session alive?
local machine:
$ sw_vers
ProductName: Mac OS X
ProductVersion: 10.9
BuildVersion: 13A603
remote machine:
$ lsb_release -d
Description: Ubuntu 12.04.3 LTS
Personally, I would recommend using screen or tmux on the remote terminal for exactly this reason.
Those apps will allow the remote process to continue even if your local SSH session times out.
http://www.bangmoney.org/presentations/screen.html
http://tmux.sourceforge.net/
Start a screen on the remote machine and run your command from it:
screen -S largeScript
And then
./yourLargeScript.sh
Whenever your ssh session gets frozen, you can kill it with ~.
If you ssh again, you can grab back your screen by:
screen -dr largeScript
Make it log to a file instead (perhaps via syslog), and tail that file from wherever is convenient for you. This also helps detach the script so you can run it headless, from a cron job, etc. Also, if the log file has read access for others, they too can monitor it.

Can a standalone ruby script (windows and mac) reload and restart itself?

I have a master-workers architecture where the number of workers is growing on a weekly basis. I can no longer be expected to ssh or remote console into each machine to kill the worker, do a source control sync, and restart. I would like to be able to have the master place a message out on the network that tells each machine to sync and restart.
That's where I hit a roadblock. If I were using any sane platform, I could just do:
exec('ruby', __FILE__)
...and be done. However, I did the following test:
p Process.pid
sleep 1
exec('ruby', __FILE__)
...and on Windows, I get one ruby instance for each call to exec. None of them die until I hit ^C on the window in question. On every platform I tried this on, it is executing the new version of the file each time, which I have verified this by making simple edits to the test script while the test marched along.
The reason I'm printing the pid is to double-check the behavior I'm seeing. On windows, I am getting a different pid with each execution - which I would expect, considering that I am seeing a new process in the task manager for each run. The mac is behaving correctly: the pid is the same for every system call and I have verified with dtrace that each run is trigging a call to the execve syscall.
So, in short, is there a way to get a windows ruby script to restart its execution so it will be running any code - including itself - that has changed during its execution? Please note that this is not a rails application, though it does use activerecord.
After trying a number of solutions (including the one submitted by Byron Whitlock, which ultimately put me onto the path to a satisfactory end) I settled upon:
IO.popen("start cmd /C ruby.exe #{$0} #{ARGV.join(' ')}")
sleep 5
I found that if I didn't sleep at all after the popen, and just exited, the spawn would frequently (>50% of the time) fail. This is not cross-platform obviously, so in order to have the same behavior on the mac:
IO.popen("xterm -e \"ruby blah blah blah\"&")
The classic way to restart a program is to write another one that does it for you. so you spawn a process to restart.exe <args>, then die or exit; restart.exe waits until the calling script is no longer running, then starts the script again.

Resources