Very strange behaviour with Ruby, OpenSSL, unicorn, systemd (GCloud)

We started seeing some strange errors in our logs that normally appear when Ruby isn't compiled properly with OpenSSL. But they're inconsistent...
We're getting errors like:
RuntimeError: Unsupported digest algorithm (SHA256). (also with other digests, like sha1). example error trace
Faraday::SSLError (SSL_CTX_new: (null)) example error trace
We managed to reproduce it when starting unicorn using service unicorn start or systemctl start unicorn. But only with some requests... Not all of them. Some requests that use OpenSSL under the hood do work. Others don't.
However, when we start unicorn with /etc/init.d/unicorn start, everything works without a hitch. (to clarify, systemd starts the same /etc/init.d script)
We tried debugging ENV vars, user permissions and file/dir ownership, recompiling Ruby, and bootstrapping a new server from scratch... Nothing seems to help.
In case this helps:
unicorn init.d script
unicorn.rb
What are we missing? What can we try that we haven't thought of?
UPDATE 1
output of some debug commands, e.g. OpenSSL, ruby etc
PATH is being set inside the init.d script
unicorn is being executed via su into www-data user
The same problem happens when we use this unicorn.service file in /etc/systemd/system
We're running Ubuntu 16.04 on Gcloud
Ruby was not installed via apt (we explicitly removed it, in case the platform came with it pre-installed) but compiled from scratch. We're currently running 2.3.4 and also tried 2.3.6, compiled either manually or using ruby-build. No rbenv, no RVM.
We install libssl-dev via apt (we're running apt-get install -y autoconf bison build-essential libssl-dev libyaml-dev libreadline6-dev zlib1g-dev libncurses5-dev libffi-dev libgdbm3 libgdbm-dev before building ruby)
UPDATE 2
We're using a scripted/repeatable build process for the VM (using fabric), and this problem is consistent on multiple VMs we bootstrapped on GCloud. We then tried a VM on DigitalOcean with the same bootstrap scripts, and the problem doesn't seem to appear there.
In both cases we picked Ubuntu 16.04 64bit base image, but obviously there are some differences with kernel versions, base installed packages etc...
UPDATE 3
The problem simply vanished. See my answer below.

@gingerlime I had a similar situation with our Jenkins on GCP. We're using ChefDK 3.1.0 (embedded Ruby 2.5.1p57), and tried other versions as well, on a Jenkins running under systemd (Ubuntu 16.04) and under upstart (Ubuntu 14.04). We're currently on 16.04 with kernel version 4.15.0-1023-gcp, running a few jobs with kitchen-docker, and this problem always emerges in a few situations.
I dug into it and found that it only happens after Etc.getlogin gets called (for me, here). The call itself doesn't raise any error: it returns the correct value with the correct type (String). But once it has been called, the Unsupported digest algorithm error gets raised.
If I start the process manually as root or as the jenkins user, the problem doesn't happen. I tried replacing Etc.getlogin in several ways, such as using ENV['USER'], a fixed String, or other Etc methods like getpwuid, mimicking the return type and value of Etc.getlogin, and with those the error doesn't get raised.
I'm not sure if this is a bug related to the Ruby version and the custom kernel that GCP instances use, but it happens in a situation similar to yours, and for me Etc.getlogin was the problem. For now, I fixed it by using a custom configuration that doesn't call this function, and it's working normally.
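The observation above can be checked with a small probe that computes a digest, calls Etc.getlogin, and computes the digest again. This is only a sketch: digest_probe is a hypothetical helper name, and whether the second probe fails depends on the environment (in the report above it only failed for processes started by the service manager).

```ruby
require 'etc'
require 'openssl'

# Hypothetical helper: returns the hex SHA256 of a fixed string,
# or the exception object if OpenSSL refuses to compute it.
def digest_probe
  OpenSSL::Digest.new('SHA256').hexdigest('probe')
rescue StandardError => e
  e
end

before = digest_probe
login  = Etc.getlogin # the call identified above as the trigger
after  = digest_probe

puts "Etc.getlogin  => #{login.inspect}"
puts "digest before => #{before.inspect}"
puts "digest after  => #{after.inspect}"
```

Running this once from an interactive shell and once from the service (for example from an initializer or a job step) shows whether the two probes differ in that environment.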

One option is that this isn't an issue of sysVinit vs systemd at all, but you just haven't triggered the issue with your sysVinit script yet.
When you run your sysVinit script through the systemctl command, it goes through a compatibility layer, and there may be a problem there. Your problem would be simpler, both for yourself and for us, if you reproduced the issue directly with a systemd service file and shared that file.
You mentioned debugging ENV, but didn't mention exactly what you checked in the ENV. This is definitely one place where systemd could make a difference. As seen in man systemd.exec, systemd sets $PATH in the environment to a fixed value:
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
If this is not exactly the same as when run directly as a sysVinit script, that could be an issue.
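To make that comparison concrete, a sketch like the following could be dropped into the app (or run via the same init script) to report how the process's PATH differs from the fixed value quoted above. The helper name path_diff and the constant name are assumptions made for illustration.

```ruby
# Fixed PATH quoted above from man systemd.exec.
SYSTEMD_DEFAULT_PATH =
  '/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'

# Hypothetical helper: reports which PATH entries differ between
# the actual PATH and the expected one, and whether order matches.
def path_diff(actual, expected = SYSTEMD_DEFAULT_PATH)
  a = actual.to_s.split(':')
  e = expected.split(':')
  { only_in_actual: a - e, only_in_expected: e - a, same_order: a == e }
end

p path_diff(ENV['PATH'])
```

Logging this hash from both start methods (systemctl and the init.d script run directly) gives two results that can be compared side by side.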
I would also check for all your copies of SSL on the system. Do you have more than one? Where? Do you have more than one copy of the ruby openssl module loaded?
 locate -r lib/.*libssl.*so
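From the Ruby side, a rough way to see which OpenSSL the interpreter was built against and which library it actually loaded at runtime is to print the version constants exposed by the openssl extension. The helper name openssl_report is an assumption; a mismatch between the two version strings would suggest more than one libssl is in play.

```ruby
require 'openssl'
require 'rbconfig'

# Hypothetical helper: collects facts about the running Ruby
# and the OpenSSL library it is using.
def openssl_report
  {
    ruby:              RUBY_DESCRIPTION,
    ruby_bin:          RbConfig.ruby,
    compiled_against:  OpenSSL::OPENSSL_VERSION,
    loaded_at_runtime: OpenSSL::OPENSSL_LIBRARY_VERSION,
    # May be empty if the extension is statically linked into ruby.
    extension_files:   $LOADED_FEATURES.grep(/openssl\.(so|bundle|dll)\z/)
  }
end

openssl_report.each { |key, value| puts "#{key}: #{value.inspect}" }
```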
Also see the answer to the FAQ: Why do things behave differently under systemd?

(also posted on this github issue)
It looks like the problem just vanished. We were testing and reproducing it consistently, across several Compute Engine instances on Google Cloud. Under certain conditions (unicorn / puma started by systemd, etc), it was completely reproducible both with our own rails app, and with a plain vanilla rails app we've set up for testing purposes. It was reproducible across several ruby versions as well (we tested 2.3.4, 2.3.6 and 2.5.0).
Suddenly, all instances that were consistently failing started working without exhibiting these problems. Like it never existed. We didn't even reboot some of those instances, and we saw no evidence of any unattended upgrades taking place... We also had one snapshot of a system that had this problem, on which we could reliably reproduce it. An instance created from this snapshot also stopped exhibiting it, from that same point in time a few hours ago.
We're totally confused as to what might have caused it, and what might have made it disappear... However, without being able to reproduce it now, I guess there's no point leaving this issue open, so I will close it. Chalk it up to Deus ex machina I suppose. (perhaps the google support gods, but they haven't reported anything back to us yet)

Related

go run/vet/build/test commands hang after completing. Ignores interrupt signal

Similar to the issue #37033541, my commands do not stop. However, my system does not have unmounted drives; my GOPATH is set to /users/user_name/go:/users/user_name/goCode. Neither changing this path to the installation default, nor restarting the computer, nor even starting a shell without my bashrc changes the behaviour. While it is running, it does generate a functional executable.
I am running go 1.14.1 installed according to the instructions for macOS Mojave.
This behaviour replicates across other packages in my system. But transferring the code to the Go Playground or another Mac computer does not replicate the behaviour. When I run go build -x ..., the last action is: rm -r $WORK/b001/.
Running a stack trace on the process yields ongoing system calls that I cannot interpret (they do seem varied, and I would be happy to post some if someone thinks them useful).
This did not use to happen; it started a few hours ago. I would appreciate the help of someone in troubleshooting this issue.
The issue was resolved only by putting in a fresh installation of the OS and then reinstalling go 1.14.1.
More here: https://groups.google.com/forum/#!topic/golang-nuts/YxqX9o2YJ4k

Optimal way to install ruby on Docker where base-OS access is required?

I'm trying to install Ruby on Docker (no Rails), but I'm having some issues. I initially tried with RVM, but I had issues with it; after I'd installed it in the usual way, commands such as ruby or gem install aren't recognised, and I understand that RVM is not best practice for a docker environment. I tried building from binary, but that seemed to be missing so many essential things, it seemed to be an exercise in futility.
I've now tried using the official docker ruby:2.5.1 image, however when I attach to this, I get an irb prompt, and am unable to use operating system commands, such as apt-get due to this.
It's essential that I have operating system access - this script will be using a browser through headless Watir webdriver, attaching to Geckodriver, so there are a number of dependencies required that won't be included in the base ruby install.
What's the best way to handle this with Docker?
This will get you on the command-line of the Ruby box:
docker run -it ruby:2.5.1 bash
You'll now be able to run ruby tools as normal, e.g. ruby, irb, gem, as well as regular Debian commands including apt-get.
Suggestion:
If you want to roam around inside a separated environment, you should choose something like Vagrant.
If you intend to use Docker, give this approach a try.
You can put whatever code you like in your ruby file.
$ docker run -it -v $(pwd)/:/data ruby:2.5 ruby -- /data/hello.rb
hello world!

Vagrant stalling on boot

I am trying to get a virtual machine working with Vagrant. Everything seems to run fine and it begins to unpack/install all the needed files. But every single time it just stalls when I get to this point.
==> default: Setting up grub-pc (2.02~beta2-36ubuntu3.11) ...
Here is a screen shot of what is going on:
I shut down the virtual machine and booted it back up. I can ssh into it but nothing seems to work. By this I mean there is no psql, no SQLAlchemy. These, among other things, are supposed to be set up in the VM. It seems as if it halts before installing the necessary software.
I've tried vagrant destroy and reinstalling, downloading a new image in case that one was corrupt and I tried reinstalling Vagrant. I am running Vagrant 1.9.5
Looks like you're provisioning with shell commands. I'm guessing that there's some sort of install prompt that's coming up and demanding some sort of user interaction / response. Because vagrant's handling the provisioning behind the scenes, you can't respond to the prompt and the install is not continuing.
You should be able to fix the issue by editing your Vagrantfile. As a guess, it looks like grub-pc is causing the issue (there's actually a grub-pc command prompt in the image you shared). See if you can figure out which package is installing grub-pc. If you're lucky, the problem can be solved by piping in a yes along with the install command (which will automatically answer yes to all install questions). This looks something like yes | sudo apt-get install grub-pc. If grub-pc is being installed as part of another package, you'll need to do some educated guessing to figure out which package is installing it and add the yes | apt-get pipe to that install line (or just add the pipe before every install line).
This being said, I ran into an issue when I was installing the Java SDK on vagrant, where Oracle was demanding I accept their terms of use before the install would complete and a yes pipe wouldn't solve the issue. I was able to fix it by searching the web for "silent java sdk install via command line". If you can figure out which package is causing the issue, and a yes pipe isn't enough, searching for how to "silently" install that package via command line should help.
UPDATE
As you can see in a comment on this answer:
Unfortunately a yes pipe didn't do the trick this time, but a quick search on how to "silently" install grub led me to this:
DEBIAN_FRONTEND=noninteractive apt-get -y -o Dpkg::Options::="--force-confdef" -o Dpkg::Options::="--force-confold" upgrade
After editing my Vagrant file it worked perfectly.

Grunt-contrib-sass or the sass gem is not working when using git push

I'm pretty sure I've tracked the issue down to Node.js not seeing Sass, but I have no clue why...
If I push from my laptop using:
git push lamp somebranch:master, the server remotely checks it out fine, runs npm install without error, and starts processing the gruntfile, but then aborts with "remote: Warning: spawn ENOENT Use --force to continue."
However, (after I push from my laptop like above) I can ssh in, cd into my hooks directory and run ./post-receive and it finishes "Done, without errors." I also tried running grunt in the website's root and it also completed without error.
Any ideas as to what might be going on? I'm completely stumped. Should I set paths to the sass gem in the hook? I stripped down my gruntfile to use the same target locally as well as on the server to rule out the gruntfile. It compiles fine locally, compiles fine on the server, but fails only when using git push lamp somebranch:master.
Some may wonder why I just didn't compile locally and dump the css into the web root from the devel box... perhaps I should. This time though, I really wanted Push-to-deploy all the way through, compiles and all. For anyone attempting the same thing and running into the same problem, this should help.
First off, it probably wouldn't hurt to scrub the system of any versions of ruby and sass that were installed via the distro's package manager. Then I scrubbed any remnants of previous tinkering with rvm implode and removed traces from .bashrc, etc. Next I ran \curl -sSL https://get.rvm.io | bash -s stable --ruby --auto-dotfiles and pressed ctrl-c to fix any errors first. Once the install script was happy, I let it download and install as normal. I did not have to use rvm install n.n.n, rvm use n.n.n, or rvm use n.n.n --default, as 2.2.1 was pulled in like I wanted anyway and seemed fine. After rvm had set up ruby, I then ran gem install sass.
Now, the end-all-be-all... using PermitUserEnvironment, as mentioned here: How to use sshd-config permituserenvironment option, was the way to go. I saw that there were security concerns with that method, but it was the only thing that worked and I won't be trying to run limited shells. It is normal behavior for SSH to not allow the env vars when not using a login shell. I assumed, however, that the git hooks had full access to the user's normal vars (with ruby paths, etc.), and that assumption was incorrect. Add PermitUserEnvironment yes to the server's /etc/ssh/sshd_config or the like and restart the ssh daemon. As the user on the server, I ran env and copied that into .ssh/environment and cleaned up what wasn't needed. After that, I did my git push from the devel box and it found and ran the sass compiler just fine.
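A "spawn ENOENT" like the one in the question usually means the executable could not be found on the PATH the hook actually runs with. As a rough check, assuming Ruby is reachable from the hook, a small script like this can be called from post-receive to see whether sass is visible there. The helper name which is an assumption (it mimics the shell command of the same name).

```ruby
# Hypothetical helper: searches the given PATH string for an
# executable file named cmd and returns its full path, or nil.
def which(cmd, path = ENV['PATH'])
  path.to_s.split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, cmd)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

puts "PATH=#{ENV['PATH']}"
puts "sass => #{which('sass').inspect}"
```

If this prints nil when triggered by git push but a real path when post-receive is run by hand over ssh, the hook is running with a different environment than the interactive session.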

How do I use RVM w/ Hudson CI server on Debian?

I'm trying to set up an automated "build" server for my rails projects using Hudson CI. So far it's able to run specs and do metrics on the code, but I have 2 different projects dependent on 2 different versions of ruby. So I'm trying to use RVM to run multiple copies of ruby, then switch back and forth in a pre-build step.
I found a couple of posts like this one that try to explain how to make this work, but I'm not running a startup script for hudson; it starts on boot, which is how it worked out of the box when I installed it via the debian instructions.
The problem seems to be that even though hudson runs under the "hudson" account, and that account has rvm installed (and working), when it tries to run a shell-based pre-build step to call rvm switch 1.8.7, it fails with the error "rvm: command not found".
Not sure what I'm doing wrong. Hudson is using SH as its shell, but I also tried using bash. No luck.
Has anyone gotten this working before in this setup?
Edit "/etc/init.d/hudson" (!) and change the line:
SU=/bin/su
... change to:
SU="/bin/su -"
... and add rvm setup in the /home/hudson/.profile
I had the same symptoms as you.
After a couple of hours of headbanging, I found the cause: check your $HOME environment variable for Hudson (viewable at http://yourserver/hudson/systemInfo).
Under Ubuntu, the Tomcat 6 start script doesn't set $HOME. Somehow, Hudson inherited my $HOME instead!
I added HOME=$CATALINA_HOME to the /etc/init.d/tomcat6 script just under the rest of the ENV declarations, and now it all works. Very annoying issue, to be sure.
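Both answers above come down to the daemon running with an environment that doesn't match the account's login environment. A quick sketch for spotting the $HOME case from inside a build step, assuming Ruby is available there: compare ENV['HOME'] with the home directory recorded in the passwd database for the current uid. The helper name home_mismatch is an assumption.

```ruby
require 'etc'

# Hypothetical helper: compares $HOME with the passwd entry's home
# directory for the given uid (defaults to the current process).
def home_mismatch(env_home = ENV['HOME'], uid = Process.uid)
  passwd_home =
    begin
      Etc.getpwuid(uid).dir
    rescue ArgumentError
      nil # no passwd entry for this uid
    end
  { env_home: env_home, passwd_home: passwd_home,
    mismatch: env_home != passwd_home }
end

p home_mismatch
```

A mismatch of true here would match the symptom described above, where Hudson inherited someone else's $HOME.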
