I have a script which hangs when it tries to get the instances. I am running it from a machine in a VPC; it worked a couple of weeks ago but no longer does. I'm assuming someone may have changed the Security Group or ACL settings, but I can't see anything obvious. What rule would I be looking for that might block this and stop it from working? It runs fine on a machine outside of the VPC.
from boto.ec2.connection import EC2Connection
from boto.regioninfo import RegionInfo

reg = RegionInfo(name='eu-west-1', endpoint='ec2.eu-west-1.amazonaws.com')
conn = EC2Connection(aws_access_key_id=awsAccessKey, aws_secret_access_key=awsSecret, region=reg)
reservations = conn.get_all_instances()
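A quick way to tell whether outbound HTTPS to the endpoint is being blocked (the usual cause of a hang like this, since the EC2 API is plain HTTPS on TCP 443) is a bare socket test from the same machine; a minimal sketch:

import socket

# If this times out, look for a security group egress rule or a network
# ACL outbound rule blocking TCP 443 to the EC2 endpoint; the boto call
# above hangs for the same reason.
try:
    socket.create_connection(('ec2.eu-west-1.amazonaws.com', 443), timeout=5)
    print('TCP 443 reachable')
except (socket.timeout, socket.error) as e:
    print('blocked or unreachable: %s' % e)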
Thanks
OK, I've started to use Vagrant because a client uses it. Getting all the pieces together and making them work was truly painful... and now I'm facing a different kind of issue.
Once I run vagrant up, there can be several different results:
VM doesn't boot properly
Timed out while waiting for the machine to boot. This means that
Vagrant was unable to communicate with the guest machine within
the configured ("config.vm.boot_timeout" value) time period.
If you look above, you should be able to see the error(s) that
Vagrant had when attempting to connect to the machine. These errors
are usually good hints as to what may be wrong.
If you're using a custom box, make sure that networking is properly
working and you're able to connect to the machine. It is a common
problem that networking isn't setup properly in these boxes.
Verify that authentication configurations are also setup properly,
as well.
If the box appears to be booting properly, you may want to increase
the timeout ("config.vm.boot_timeout") value.
MySQL dies
default: ERROR 2006 (HY000) at line 1118: MySQL server has gone away
or
default: ERROR 1034 (HY000) at line 3720: Index for table 'ci_sessions' is corrupt; try to repair it
Works properly
I have to run vagrant up and vagrant destroy back to back until my machine decides to boot and build properly. It's weird; I have no explanation for it. Running the environment is such a gamble, and it takes me a lot of time until it works.
If someone knows anything about Vagrant or this strange issue, any help would be greatly appreciated.
PS:
My Vagrantfile is configured with:
config.vm.provider "virtualbox" do |v|
v.memory = 4096
v.cpus = 3
end
The databases being loaded are huge (one is 5 GB), and I tried to include a sed command to raise max_allowed_packet, without success:
sudo sed -i "s#[client-server]#[client-server]\n\n[mysqld]\nmax_allowed_packet=256M#" /etc/my.cnf
I got this error:
sed: -e expression #1, char 33: unterminated `s' command
It runs fine when I execute it inside vagrant ssh, but I'm not sure where to include it in my Vagrantfile.
I would speculate that a lot of stuff is going on in the VM during startup that either eats up resources or just takes too long. You already point out that you give the VM only 4 GB of RAM while the database you try to load is 5 GB, so the VM is likely thrashing the swap partition. That MySQL dies also suggests that you are not giving the VM the resources it requires.
Even if vagrant can't establish an SSH connection and times out, can you ssh into the box after a while (check "vagrant ssh-config")?
Increase RAM, and investigate (perhaps you can ask whoever packaged the box) what kind of services this VM launches as part of the boot process. Monitor whatever facilities the guest OS of the VM provides, and you should see what's going on in the VirtualBox GUI as well. Press whatever keys the guest OS uses for that purpose so that you don't just see some opaque boot screen.
As for your sed command, you can use a shell provisioner for that: https://www.vagrantup.com/docs/provisioning/shell. Make sure you escape things properly when passing the command line via the inline attribute.
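A minimal sketch of what that could look like in the Vagrantfile; this swaps the sed call for a heredoc append, which sidesteps the quoting problem entirely (the file path and setting are taken from the question):

config.vm.provision "shell", inline: <<-SHELL
  # Append a [mysqld] section instead of editing the [client-server]
  # line in place; no sed escaping needed this way.
  cat >> /etc/my.cnf <<'EOF'

[mysqld]
max_allowed_packet=256M
EOF
SHELL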
I've spent 3 days beating my head against this before coming here in desperation.
So, long story short: I thought I'd fire up a simple PHP site to give moderators of a gaming group I'm in the ability to start GCP servers on demand. I'm no developer, so I'm looking at this from a systems perspective to find the simplest solution that does the job.
I fired up an Ubuntu 18.04 machine on GCP, set it up with the Google SDK, and authorised it for access to the project, and I was able to simply run gcloud commands, which worked fine. I had some issues with the PHP file calling the shell script to run the same commands, but with some testing I can see it's now calling the shell script no worries (it broadcasts wall "test" to the console every time I click the button on the PHP page).
However, what does not happen is the execution of the gcloud command. If I manually run this shell script it starts up the instance no worries and broadcasts wall; if I click the button it broadcasts, but that's it. I've set the files to have execution rights, and I've even given the user nginx runs as sudo rights; putting sudo sh in front of the command in the PHP file also made no difference. Please find the bash script below:
#!/bin/bash
/usr/lib/google-cloud-sdk/bin/gcloud compute instances start arma3s1-prod --zone=australia-southeast1-b
wall "test"
Any help would be greatly appreciated; this, coupled with an automated shutdown, would allow our gaming group to save money by only running the servers people want to play on.
If you want any more detail about the underlying system, please let me know.
So I asked a PHP dev at work about this, and in two seconds flat she pointed out the issue; now I feel stupid. In /etc/passwd the www-data user had /usr/sbin/nologin, and after I fixed that and ran the script, gcloud wanted permission to write a log file to /var/www. Fixed that too and it works fine. I'm not terribly worried about the page or even the server being hacked and destroyed; I can recreate them pretty easily.
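In case it helps anyone else, the shell change itself is a one-liner (assuming the web user is www-data, as it was here):

# Give the web server's user a real login shell so the commands it spawns run normally.
sudo usermod -s /bin/bash www-data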
Thanks for the help though! Sometimes I think I just need to take a step back and get a fresh set of eyes on the problem.
When you launch a command while logged in, you have your account's access rights to the Google Cloud API, but the PHP account doesn't have those.
Even if you give the www-data user root rights, that won't fix the problem; it may create some security issues, but nothing more.
If you really want to do this, you should create a service account that only has rights on the Compute Engine instances inside your project, and supply its JSON key via the GOOGLE_APPLICATION_CREDENTIALS environment variable. That way your PHP has just enough rights to do what you're asking of it.
Note that the issue with this method is that, if you are hacked, there is a chance the instance hosting your PHP could be deleted too.
You could also try calling a prepared Cloud Function which creates the instance; that way, even if your instance is deleted, the Cloud Function would still be there.
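A rough sketch of how that could look for the script in the question; the key file path is a placeholder, and note that GOOGLE_APPLICATION_CREDENTIALS is what the client libraries read, while the gcloud CLI wants the key activated explicitly:

#!/bin/bash
# Placeholder path to the service account's JSON key file.
export GOOGLE_APPLICATION_CREDENTIALS=/etc/gcloud/starter-sa.json
# gcloud keeps its own credential store, so activate the same key for CLI use.
gcloud auth activate-service-account --key-file="$GOOGLE_APPLICATION_CREDENTIALS"
gcloud compute instances start arma3s1-prod --zone=australia-southeast1-b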
I have a worker machine running on a spot instance with a dedicated role. I'd like to grant it permission to run ec2-terminate-instances when it finishes, but I want it to be able to terminate only itself.
I couldn't find any policy variable for the instance ID or anything similar. How do I define that kind of permission?
I've also tried using shutdown -h now, but the behaviour with spot instances was a bit weird: it killed the machine (terminated it) but kept the spot request marked as fulfilled (rather than terminated-by-user).
Thanks!
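For reference, IAM has no built-in policy variable for "the calling instance's own ID", so a common workaround is to scope ec2:TerminateInstances with an ec2:ResourceTag condition on a tag that only the workers carry, and have the worker discover its own ID from instance metadata. A minimal sketch of the terminate step, assuming the AWS CLI is installed and such a policy is attached to the role:

#!/bin/bash
# Ask the instance metadata service who we are, then terminate ourselves.
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
AZ=$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone)
# The availability zone minus its trailing letter is the region.
aws ec2 terminate-instances --instance-ids "$INSTANCE_ID" --region "${AZ%?}"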
I messed this up.
I installed ZoneMinder and now I cannot connect to my VPS via Remote Desktop; it has most likely blocked the connections. I didn't know it would start blocking right away rather than letting me configure it first.
How can I solve this?
Note: My answer assumes this is a Windows instance, given the use of 'Remote Desktop', even though ZoneMinder is primarily Linux-based.
The short answer is that you probably can't, and you will likely be forced to terminate the instance.
But at the very least you can take a snapshot of the hard drive (EBS volume) attached to the machine, so you don't lose any data or configuration settings.
Without network connectivity your server can't be accessed at all, and unless you've installed other services on the machine that are still accessible (e.g. ssh, telnet) that could be used to reverse the firewall settings, you can't make any changes.
I would attempt the following, in this order (although they're long shots):
Restart your instance using the AWS Console (maybe the firewall won't be enabled by default on reboot and you'll be able to connect).
If this doesn't work (which it probably won't), you're going to need to stop your crippled instance, detach the volume, spin up another EC2 instance running Windows, and attach the old volume to the new instance.
Here's the procedure with screenshots of the exact steps, except your specific steps to disable the new firewall will be different.
After this is done, you need to find instructions on manually uninstalling your new firewall:
Take a snapshot of the EBS volume attached to the instance to preserve your data (essentially the C: drive); this appears on the EC2 console page under the 'Volumes' menu item. This way you don't lose any data, at least.
Start another Windows EC2 instance and attach the EBS volume from the old one to it. RDP into the new instance and attempt to manually uninstall the firewall.
At a minimum, at this point you should be able to recover your files and service settings easily onto the new instance, which is the approach I would expect you to have more success with.
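A rough sketch of the snapshot-and-volume shuffle with the AWS CLI (all IDs and the device name are placeholders):

# Preserve the data before touching anything.
aws ec2 create-snapshot --volume-id vol-0aaaaaaaaaaaaaaaa --description "pre-rescue backup"
# Stop the locked-out instance and move its root volume to a rescue instance.
aws ec2 stop-instances --instance-ids i-0bbbbbbbbbbbbbbbb
aws ec2 detach-volume --volume-id vol-0aaaaaaaaaaaaaaaa
aws ec2 attach-volume --volume-id vol-0aaaaaaaaaaaaaaaa --instance-id i-0cccccccccccccccc --device xvdf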
I am a Windows user and wanted to use a spot instance based on my own EBS-backed Windows AMI. For this I followed these steps:
I had my own on-demand instance with specific settings.
Using the AWS console, I used the "Create Image EBS" option to create an EBS-based Windows AMI. It worked and the AMI was created successfully.
Then, using this new AMI, I launched a medium spot instance; it was created fine and is now running with status checks passed.
After waiting an hour or more, I am trying to connect to it using the Windows 7 RDC client, but it is not reachable, with the client tool's standard error that the computer is either not reachable or not powered on.
I have tried to achieve this goal and have created/deleted many volumes, instances, and snapshots, but I'm still unsuccessful. Does anybody have a solution to this problem?
Thanks
Basically what's happening is that the existing administrator password (and other user authentication information) for Windows is only valid in the original instance, and can't be used on the new "hardware" that you're launching the AMI on (even though it's all virtualized).
This is why RDP connections will fail to newly launched instances, as will any attempts to retrieve the administrator password. Unfortunately you have no choice but to shut down the new instances you've been trying to connect to because you won't be able to do anything with them.
For various reasons the Windows administrator password cannot be preserved when running a snapshot of the operating system on different hardware (even virtualized hardware) - this is a big part of the reason why technologies like Active Directory exist, so that user authentication information is portable between machines and networks.
It sounds like you did all the steps necessary to get this working except for one: you never took any steps to cause a new password to be assigned to your newly-launched instances based on the original AMI you created.
To fix this, BEFORE turning your instance into a custom AMI that can be used to launch new instances, you need to run the Ec2ConfigService Settings tool in the original instance (found in the Start menu when remoted in via RDP) and enable the option to generate a new password on next reboot. Save this setting change.
Then when you do create an AMI from the original instance, and use that AMI to launch new instances, they will each boot up to your custom Windows image but will choose their own random administrator password.
At this point you can go to your ec2 dashboard and retrieve the newly-generated password for the new instance based on the old AMI, and you'll also be able to download the RDP file used to connect to it.
One additional note: Amazon warns that it can take upwards of 30 minutes to retrieve the administrator password after launching a new instance; however, in my experience I've never had to wait more than a few minutes to get it.
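For what it's worth, the same retrieval works from the AWS CLI (the instance ID and key path are placeholders):

# Decrypts the auto-generated administrator password using the launch key pair.
aws ec2 get-password-data --instance-id i-0123456789abcdef0 --priv-launch-key ~/.ssh/my-key.pem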
Your problem is most likely that the security group you used to launch the AMI does not have RDP (TCP port 3389) enabled.
When you launch the Windows AMI for the first time, AWS populates the quick-launch security group with this port enabled. However, when you launch subsequent instances from your own AMI, you will have to ensure that this port is open in your security group.
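If you manage the group from the CLI, opening RDP looks roughly like this (the group ID and source CIDR are placeholders; scope the source to your own address rather than 0.0.0.0/0):

# Allow inbound RDP from a single admin IP.
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol tcp --port 3389 --cidr 203.0.113.10/32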