Similar to issue #37033541, my commands do not terminate. However, my system has no unmounted drives; my GOPATH is set to /users/user_name/go:/users/user_name/goCode. Neither changing this path to the installation default, restarting the computer, nor starting a shell without my bashrc changes the behaviour. Even though the command never exits, it does generate a functional executable.
I am running go 1.14.1, installed according to the instructions for macOS Mojave.
The behaviour reproduces with other packages on my system, but transferring the code to the Go Playground or to another Mac does not reproduce it. When I run go build -x ..., the last action printed is: rm -r $WORK/b001/.
Running a stack trace on the process shows ongoing system calls that I cannot interpret (they do seem varied, and I would be happy to post some if anyone thinks they would be useful).
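For anyone who wants to look at such a trace themselves, the standard macOS tools can be used roughly like this (the PID is a placeholder; dtruss may require relaxing System Integrity Protection):

sample <go-build-pid> 10          # user-space call stacks, sampled for 10 seconds
sudo dtruss -p <go-build-pid>     # live system-call trace of the running process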
This did not happen before; it started a few hours ago. I would appreciate help troubleshooting this issue.
The issue was resolved only by a fresh installation of the OS followed by reinstalling go 1.14.1.
More here: https://groups.google.com/forum/#!topic/golang-nuts/YxqX9o2YJ4k
We started seeing some strange errors in our logs that normally appear when ruby isn't compiled properly with OpenSSL. But it's inconsistent...
We're getting errors like:
RuntimeError: Unsupported digest algorithm (SHA256). (also with other digests, like sha1). example error trace
Faraday::SSLError (SSL_CTX_new: (null)) example error trace
We managed to reproduce it when starting unicorn using service unicorn start or systemctl start unicorn. But only with some requests... Not all of them. Some requests that use OpenSSL under the hood do work. Others don't.
However, when we start unicorn with /etc/init.d/unicorn start, everything works without a hitch. (to clarify, systemd starts the same /etc/init.d script)
We tried debugging ENV vars, user permissions, file/dir ownership, recompiling ruby, bootstrapping a new server from scratch... Nothing seems to help.
In case this helps:
unicorn init.d script
unicorn.rb
What are we missing? What can we try that we haven't thought of?
UPDATE 1
output of some debug commands, e.g. OpenSSL, ruby etc
PATH is being set inside the init.d script
unicorn is being executed via su into www-data user
The same problem happens when we use this unicorn.service file in /etc/systemd/system
We're running Ubuntu 16.04 on Gcloud
Ruby was not installed via apt (it was explicitly removed, in case the platform came with it pre-installed) and was compiled from scratch. We're currently running 2.3.4 and have also tried 2.3.6, compiled either manually or using ruby-build. No rbenv, no RVM.
We install libssl-dev via apt (we're running apt-get install -y autoconf bison build-essential libssl-dev libyaml-dev libreadline6-dev zlib1g-dev libncurses5-dev libffi-dev libgdbm3 libgdbm-dev before building ruby)
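For what it's worth, the manual build follows the usual pattern (sketched here for illustration only; the exact flags in our build scripts may differ, and --with-openssl-dir just shows how one would point the openssl extension at the apt-provided libssl explicitly):

wget https://cache.ruby-lang.org/pub/ruby/2.3/ruby-2.3.4.tar.gz
tar xzf ruby-2.3.4.tar.gz && cd ruby-2.3.4
./configure --prefix=/usr/local --with-openssl-dir=/usr
make -j"$(nproc)" && sudo make install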
UPDATE 2
We're using a scripted/repeatable build process for the VM (using fabric), and this problem is consistent on multiple VMs we bootstrapped on GCloud. We then tried a VM on DigitalOcean with the same bootstrap scripts, and the problem doesn't seem to appear there.
In both cases we picked Ubuntu 16.04 64bit base image, but obviously there are some differences with kernel versions, base installed packages etc...
UPDATE 3
The problem simply vanished. See my answer below.
#gingerlime I had a similar situation with our Jenkins on GCP. We're using ChefDK 3.1.0 (embedded ruby 2.5.1p57) -- we tried other versions as well -- with Jenkins running under systemd (Ubuntu 16.04) and upstart (Ubuntu 14.04). We tried both; right now we're running 16.04 with the 4.15.0-1023-gcp kernel, running a few jobs with kitchen-docker, and this problem always emerges in a few situations.
I dug into it and found that this only happens when Etc.getlogin gets called (for me, here). The call itself doesn't return any error -- it returns the correct info, with the correct type (String) -- but once it has been called, the Unsupported digest algorithm error gets raised.
If I start the process manually as root or as the jenkins user, the problem doesn't happen. I tried replacing Etc.getlogin in several different ways -- using ENV['USER'], a fixed String, or other Etc methods like getpwuid -- simulating the return type and values of Etc.getlogin, and the error doesn't get raised.
I'm not sure whether this is some bug related to the ruby version and the custom kernel that GCP instances use, but it happens in a situation similar to yours, and for me Etc.getlogin was the problem. For now, I've fixed it with a custom configuration that avoids calling this function, and it's working normally.
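To give an idea of the pattern, a minimal check looks roughly like this (a sketch, not our exact code; the point is that in the failing environment the digest call works on its own but fails once Etc.getlogin has been called first):

ruby -ropenssl -e 'puts OpenSSL::Digest::SHA256.hexdigest("x")'
ruby -retc -ropenssl -e 'Etc.getlogin; puts OpenSSL::Digest::SHA256.hexdigest("x")'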
One possibility is that this isn't an issue of sysVinit vs systemd at all, and you just haven't triggered the issue with your sysVinit script yet.
When you run your sysVinit script through the systemctl command, it goes through a compatibility layer, and there may be a problem there. Your problem would be simpler, both for you and for us, if you reproduced the issue directly with a systemd service file and shared that file.
You mentioned debugging ENV, but didn't mention exactly what you checked in the ENV. This is definitely one place where systemd could make a difference. As seen in man systemd.exec, systemd sets $PATH in the environment to a fixed value:
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
If this is not exactly the same as when the script is run directly as a sysVinit script, that could be the issue.
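If PATH does turn out to differ, one quick test is to pin it in the unit itself (a sketch; the drop-in path and the value are illustrative, match them to what your init.d environment actually uses):

# e.g. /etc/systemd/system/unicorn.service.d/path.conf
[Service]
Environment="PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"

followed by systemctl daemon-reload and a restart of the service.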
I would also check all the copies of SSL on the system. Do you have more than one? Where? Do you have more than one copy of the ruby openssl module loaded?
locate -r lib/.*libssl.*so
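To see which OpenSSL the running interpreter actually picked up, something like the following can help (illustrative commands; run them as the same user and environment the service uses, and substitute the path printed by the second command into the third):

ruby -ropenssl -e 'puts OpenSSL::OPENSSL_VERSION'
ruby -ropenssl -e 'puts $LOADED_FEATURES.grep(/openssl\.so/)'   # which openssl.so got loaded
ldd /path/to/openssl.so | grep libssl                           # which libssl that .so links against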
Also see the answer to the FAQ: Why do things behave differently under systemd?
(also posted on this github issue)
It looks like the problem just vanished. We were testing and reproducing it consistently across several Compute Engine instances on Google Cloud. Under certain conditions (unicorn/puma started by systemd, etc.), it was completely reproducible both with our own rails app and with a plain vanilla rails app we set up for testing purposes. It was reproducible across several ruby versions as well (we tested 2.3.4, 2.3.6 and 2.5.0).
Suddenly, all the instances that were consistently failing started working without exhibiting these problems. Like it never existed. We didn't even reboot some of those instances, and we saw no evidence of any unattended upgrades taking place... We also had a snapshot of a system that had the problem and on which we could reliably reproduce it; instances created from that snapshot also stopped exhibiting it from that specific point in time a few hours ago.
We're totally confused as to what might have caused it, and what might have made it disappear... However, without being able to reproduce it now, I guess there's no point leaving this issue open, so I will close it. Chalk it up to Deus ex machina, I suppose. (Perhaps the Google support gods, but they haven't reported anything back to us yet.)
I've seen several Dockerfiles, and I have the feeling that people try to avoid using multiple RUN commands. But why?
So is there any reason (other than the repetitive text in this example) to prefer
RUN gem install \
jekyll \
github-pages
over
RUN gem install jekyll
RUN gem install github-pages
Each execution of a RUN command creates a temporary container from the last resulting image, executes your commands, and saves the result as a new layer. Minimizing RUN commands not only reduces the overhead of these intermediate containers, but can also dramatically shrink the size of the resulting image.
If, for example, you have two RUN commands, one that downloads a gigabyte of data and a second that deletes that gigabyte, the resulting image will still exceed a gigabyte, even though the data isn't visible in the running container.
Therefore, when you do a large download of cached files to install or build an app and then clean up that build environment when finished, it's good practice to do it all as a single step so the deleted files never make it into any layer of the image.
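For example (an illustrative Dockerfile fragment; the URL and paths are placeholders):

# download, build and clean up in one layer, so the archive never lands in any layer of the image
RUN mkdir -p /tmp/src \
 && curl -fsSL https://example.com/big-build-input.tar.gz | tar -xz -C /tmp/src \
 && make -C /tmp/src install \
 && rm -rf /tmp/src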
One last reason is the build cache. If you need to pull a new version of an app from a package repository, you also want to update your information about that remote repository (e.g. apt-get update) before installing, so you pull the latest version. If you separate the apt-get update from the apt-get install, the update command may be cached from an old build and the install will attempt to pull old or non-existent files.
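Concretely, the cache-safe pattern looks like this (the package names are just examples):

# update + install in one layer, so a stale cached update can never pair with a newer install
RUN apt-get update \
 && apt-get install -y --no-install-recommends curl ca-certificates \
 && rm -rf /var/lib/apt/lists/*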
Recently at work we were given an older version of Ubuntu (12.04) to work with on a Vagrant machine.
I wanted to migrate to fish or zsh because they're much better than the default bash, but I'm running into weird behaviour: if I navigate to the /vagrant folder (which is also shared with the local machine), every command I run stalls for 5-10 seconds. Outside of that folder everything runs fine, with no stalling.
Has anyone encountered this before or have any ideas on why this could be happening?
Probably some stat call or something similar is taking too long to finish. I'd suggest running strace on the shell (if you have the patience for that :) ).
Such a stat can be triggered by your prompt, for example; it's common to display the CWD there.
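A quick way to test that (commands are illustrative; run them inside /vagrant) is to strip the prompt down to something that does no filesystem work, and to capture where the time actually goes:

PROMPT='%% '                                # zsh: minimal prompt for the current session
function fish_prompt; echo '> '; end        # fish: minimal prompt for the current session
strace -f -tt -o /tmp/shell.trace bash -i   # trace an interactive bash and look for slow stat/open calls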
I'd also suggest not running commands from inside that folder unless necessary; you can run them from outside and pass the path in.
Of course you can also experiment with the sharing setup; there is probably a weak point there.
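For example, switching the synced folder type in the Vagrantfile is one such experiment (a sketch; NFS support depends on the host OS):

# Vagrantfile: try NFS (or rsync) instead of the default shared-folder driver
config.vm.synced_folder ".", "/vagrant", type: "nfs"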
I have installed msysgit, and I am attempting to use it inside Hudson. Whenever I run a command in an interactive shell, whether git-bash or a command prompt, the commands are instant. When I run them from Hudson, they lag for a very long time.
Running /bin/git help took 63 seconds when I just invoked it. I've never waited long enough to see a clone begin outputting (>10 minutes).
The Hudson mailing list is down, so I figured I would try here...
I've run into this problem as well and figured out a workaround. When Hudson runs as a service, something that your normal desktop environment provides is missing, and that forces something network-related to be re-loaded for each process: msys-1.0.dll attempts to load something from netapi32.dll, which is what takes so long. So I just downloaded plink.exe from PuTTY and set my GIT_SSH environment variable to use that instead. Problem averted.
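For reference, the workaround amounts to something like this (the plink path is just an example; set the variable in whatever environment the Hudson service actually sees, e.g. the system environment or the job/node configuration):

:: Windows: point git at plink instead of the msys ssh wrapper
setx GIT_SSH "C:\Program Files (x86)\PuTTY\plink.exe"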
Have you tried using the Git plugin for Hudson?