Cannot configure HDFS address using gethue/hue docker image

I'm trying to use the Hue docker image from gethue/hue, but it seems to ignore the configuration I give it and always looks for HDFS on localhost instead of the docker container I point it to.
Here is some context:
I'm using the following docker compose to launch a HDFS cluster:
hdfs-namenode:
  image: bde2020/hadoop-namenode:1.1.0-hadoop2.7.1-java8
  hostname: namenode
  environment:
    - CLUSTER_NAME=davidov
  ports:
    - "8020:8020"
    - "50070:50070"
  volumes:
    - ./data/hdfs/namenode:/hadoop/dfs/name
  env_file:
    - ./hadoop.env

hdfs-datanode1:
  image: bde2020/hadoop-datanode:1.1.0-hadoop2.7.1-java8
  depends_on:
    - hdfs-namenode
  links:
    - hdfs-namenode:namenode
  volumes:
    - ./data/hdfs/datanode1:/hadoop/dfs/data
  env_file:
    - ./hadoop.env
This launches images from BigDataEurope, which are already properly configured, including:
- the activation of webhdfs (in /etc/hadoop/hdfs-site.xml):
- dfs.webhdfs.enabled set to true
- the hue proxy user (in /etc/hadoop/core-site.xml):
- hadoop.proxyuser.hue.hosts set to *
- hadoop.proxyuser.hue.groups set to *
Then, I launch Hue following their instructions:
First, I launch a bash prompt inside the docker container:
docker run -it -p 8888:8888 gethue/hue:latest bash
Then, I modify desktop/conf/pseudo-distributed.ini to point to the correct Hadoop "node" (in my case a docker container with the address 172.30.0.2):
[hadoop]
# Configuration for HDFS NameNode
# ------------------------------------------------------------------------
[[hdfs_clusters]]
# HA support by using HttpFs
[[[default]]]
# Enter the filesystem uri
fs_defaultfs=hdfs://172.30.0.2:8020
# NameNode logical name.
## logical_name=
# Use WebHdfs/HttpFs as the communication mechanism.
# Domain should be the NameNode or HttpFs host.
# Default port is 14000 for HttpFs.
## webhdfs_url=http://172.30.0.2:50070/webhdfs/v1
# Change this if your HDFS cluster is Kerberos-secured
## security_enabled=false
# In secure mode (HTTPS), if SSL certificates from YARN Rest APIs
# have to be verified against certificate authority
## ssl_cert_ca_verify=True
And then I launch hue using the following command (still inside the hue container):
./build/env/bin/hue runserver_plus 0.0.0.0:8888
I then point my browser to localhost:8888, create a new user ('hdfs' in my case), and launch the HDFS file browser module. I then get the following error message:
Cannot access: /user/hdfs/.
HTTPConnectionPool(host='localhost', port=50070): Max retries exceeded with url: /webhdfs/v1/user/hdfs?op=GETFILESTATUS&user.name=hue&doas=hdfs (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 99] Cannot assign requested address',))
The interesting bit is that it still tries to connect to localhost (which of course cannot work), even though I modified its config file to point to 172.30.0.2.
Googling the issue, I found another config file, desktop/conf.dist/hue.ini. I tried modifying this one and launching Hue again, but got the same result.
Does anyone know how I could correctly configure Hue in my case?
Thanks in advance for your help.
Regards,
Laurent.

Your one-off docker run command is not on the same network as the docker-compose containers.
You would need something like this, replacing [projectname] with the folder name you ran docker-compose up in:
docker run -ti -p 8888:8888 --network="[projectname]_default" gethue/hue bash
I would suggest using Docker Compose for the Hue container as well, and volume-mounting an INI file under desktop/conf/ in which you can simply specify
fs_defaultfs=hdfs://namenode:8020
(since you put hostname: namenode in the compose file).
You'll also need to uncomment the webhdfs_url line for your changes to take effect.
All INI files in Hue's conf folder are merged.
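As a rough sketch of that suggestion (the conf path inside the gethue/hue image and the override file name are assumptions to adapt, not taken from the question), the Hue service could be added to the same compose file like this:

hue:
  image: gethue/hue:latest
  ports:
    - "8888:8888"
  depends_on:
    - hdfs-namenode
  volumes:
    - ./hue-overrides.ini:/usr/share/hue/desktop/conf/z-hue-overrides.ini

where ./hue-overrides.ini would contain something like:

[hadoop]
  [[hdfs_clusters]]
    [[[default]]]
      fs_defaultfs=hdfs://namenode:8020
      webhdfs_url=http://namenode:50070/webhdfs/v1

Because the Hue service then lives on the same Compose network as the namenode service, the hostname namenode resolves to the HDFS NameNode container.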


Why is curl shutting down the docker container?

Good day!
I have a microservice that runs in a container, and a registry that stores the addresses of the microservices.
I also have a script that runs when the container starts. The script gets the container's local IP and sends it to another server using curl. After the script finishes, exit code 0 is returned and the container exits. How can I fix this problem?
# docker-compose realtime logs
nginx_1 | "code":"SUCCESSFUL_REQUEST"
nginx_1 exited with code 0
My bash script
#!/bin/bash
address=$(hostname -i)
curl -X POST http://registry/service/register -H 'Content-Type: application/json' -d '{"name":"'"$MICROSERVICE_NAME"'","address":"'"$address"'"}'
The script itself runs fine with no problem, but unfortunately it ends the container's main process. Is it possible to somehow intercept this exit code so that it does not shut down the container?
I would be grateful for any help or comment!🙏
EDIT:
Dockerfile (the script is called after the container starts):
FROM nginx:1.21.1-alpine
WORKDIR /var/www/
COPY ./script.sh /var/www/script.sh
RUN apk add --no-cache --upgrade bash && \
apk add nano
#launch script
CMD /var/www/script.sh
EDIT 2:
my docker-compose.yml
version: "3.9"
services:
#database
pgsql:
hostname: pgsql
build: ./pgsql
ports:
- 5432:5432/tcp
volumes:
- ./pgsql/data:/var/lib/postgresql/data
#registry
registry_fpm:
build: ./fpm/registry
depends_on:
- pgsql
volumes:
- ./microservices/registry:/var/www/registry
registry_nginx:
hostname: registry
build: ./nginx/registry
depends_on:
- registry_fpm
volumes:
- ./microservices/registry:/var/www/registry
- ./nginx/registry/nginx.conf:/etc/nginx/nginx.conf
#server
nginx:
build: ./nginx
environment:
MICROSERVICE_NAME: Microservice_1
depends_on:
- registry_nginx
ports:
- 80:80/tcp
The purpose of the registry is only to store the IPs of all microservices. If you are familiar with microservices, you probably know that the registry acts as the custodian of all microservice addresses. The registry is used by other microservices to obtain addresses so that the microservices can communicate over HTTP.
There is no need for these addresses as far as I can tell; the microservices can simply use each other's hostnames.
You already do this with your curl: the POST request goes to the host registry, and so on.
Docker Compose may be all the orchestration you require for your microservices.
Regarding IPs and networking
If you prefer, for more isolation and consistency, you can configure in your compose.yaml:
- custom networks: virtualised network adapters; think of them as vLANs where the nodes are selected containers only (for additional info, refer to the Docker networking documentation)
- custom IP addresses for each container
- hostnames for each container
- links: deprecated; do not use; information only
Regarding heartbeat
Keeping track of a heartbeat shouldn't be necessary.
But if you really need one, doing it from within the container is a no-no: a container should run only one process, and creating a new record is redundant since the Docker daemon already keeps track of every container's IP and state (and plenty more).
The function of the registry (keeping track of lifecycle) is instead played by the Docker daemon; try docker compose ps.
However, you can configure the container to restart automatically when it fails using the restart option, for example as shown below.
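A minimal sketch of that in the question's compose file might be (on-failure restarts only on a non-zero exit code; always and unless-stopped are the other common policies):

nginx:
  build: ./nginx
  restart: on-failure
  environment:
    MICROSERVICE_NAME: Microservice_1
  depends_on:
    - registry_nginx
  ports:
    - 80:80/tcp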
If you need a way to monitor these without the CLI, listening on the Docker socket is the way to go.
You could make your own dashboard that taps into the Docker API, whose endpoints are listed in the Engine API reference. NB: the socket might need to be protected and, if possible, ought to be mounted read-only.
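For example (assuming curl is available on the host and the daemon uses the default socket path; read-only access suffices for this), listing running containers and their state through the Engine API can be as simple as:

curl --unix-socket /var/run/docker.sock http://localhost/containers/json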
But a better solution would be using an image that already does this; unfortunately I cannot give you recommendations, as I have not used any.

GitLab CI gives curl: (7) Failed to connect to localhost port 8090: Connection refused

The issue is that I get the curl: (7) Failed to connect to localhost port 8090: Connection refused error in GitLab CI, but this does not happen on my laptop, where I get the source HTML of the webpage. The .gitlab-ci.yml below is a simple reproduction of the issue. I have spent numerous hours trying to figure this out, and I'm sure someone else has too.
Aside: this isn't the same as a similar question, since that one doesn't offer a solution.
GitLab Repo: https://gitlab.com/mudassir-ahmed/wordpress-testing-with-gitlab-ci/tree/another-approach but the only file it contains is the .gitlab-ci.yml shown below...
image: docker:stable

variables:
  # When using dind service we need to instruct docker, to talk with the
  # daemon started inside of the service. The daemon is available with
  # a network connection instead of the default /var/run/docker.sock socket.
  #
  # The 'docker' hostname is the alias of the service container as described at
  # https://docs.gitlab.com/ee/ci/docker/using_docker_images.html#accessing-the-services
  #
  # Note that if you're using the Kubernetes executor, the variable should be set to
  # tcp://localhost:2375/ because of how the Kubernetes executor connects services
  # to the job container
  # DOCKER_HOST: tcp://localhost:2375/
  #
  # For non-Kubernetes executors, we use tcp://docker:2375/
  DOCKER_HOST: tcp://docker:2375/
  # When using dind, it's wise to use the overlayfs driver for
  # improved performance.
  DOCKER_DRIVER: overlay2

services:
  - docker:dind

before_script:
  - docker info

build:
  stage: build
  script:
    - apk update
    - apk add curl
    #- hostname -i
    - docker container ls
    - docker run -d -p 8090:80 --name nginx-server kitematic/hello-world-nginx
    - curl localhost:8090 # This works on my laptop but not on a GitLab runner.
Referring to the answer from here: gitlab-ci.yml & docker-in-docker (dind) & curl returns connection refused on shared runner.
There are two ways to fix this:
Option 1: Replace localhost in curl localhost:8090 with docker, like this: curl docker:8090
Option 2: Give the docker:dind service an alias of localhost:
services:
  - name: docker:dind
    alias: localhost
With that alias in place, the original job script works unchanged:
docker run -d -p 8090:80 --name nginx-server kitematic/hello-world-nginx
curl localhost:8090
Assuming that is your code, I think you should add some wait/timeout between docker run and curl.
I had similar issues some time ago: after starting a docker container on a GitLab runner machine, I wasn't able to access my URL either. When I added a command that checks whether the container is running, for about one minute, it resolved my problem:
docker inspect -f '{{.State.Running}}' <containerName>
but in order to do that check you will need some additional scripting.

Setting redis configuration with docker in windows

I want to set up redis configuration in docker.
I have my own redis.conf under D:/redis/redis.conf and have configured it to have bind 127.0.0.1 and have uncommented requirepass foobared
Then used this command to load this configuration in docker:
docker run --volume D:/redis/redis.conf:/usr/local/etc/redis/redis.conf --name myredis redis redis-server /usr/local/etc/redis/redis.conf
Next,
I have a docker-compose.yml in my application (a Maven project), under src/resources.
I have the following in my docker-compose.yml:
redis:
  image: redis
  ports:
    - "6379:6379"
And I execute the command:
docker-compose up
The server runs, but when I check with the command:
docker ps -a
it shows that the redis image runs at 0.0.0.0:6379.
I want it to run at 127.0.0.1.
How do i get that?
Isn't my configuration file loading, or is it wrong? Or are my commands wrong?
Any suggestions are of great help.
PS: I am using Windows.
Thanks
Try to execute:
docker inspect <container_id>
And use "NetworkSettings"->"Gateway" (it must be 172.17.0.1) value instead of 127.0.0.1.
You can't use 127.0.0.1 as your Redis was run in the isolated environment.
Or you can link your containers.
First of all, you should not be worried about Redis saying it is listening on 0.0.0.0:6379, because Redis is running inside the container, and if it didn't listen on 0.0.0.0 you wouldn't be able to make any connections at all.
Next, if you want Redis to be reachable only via the host's localhost, you need to bind the published port as below:
redis:
  image: redis
  ports:
    - "127.0.0.1:6379:6379"
PS: I have not run a container on Docker for Windows with a 127.0.0.1 port mapping, so you will have to see whether it works, because host networking on Windows, Mac and Linux differs and may not behave this way.
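For completeness, a sketch that combines the question's config-file mount with the loopback-only binding might look like this (the Windows path syntax is taken from the question and, as noted above, is untested on Docker for Windows):

redis:
  image: redis
  command: redis-server /usr/local/etc/redis/redis.conf
  volumes:
    - D:/redis/redis.conf:/usr/local/etc/redis/redis.conf
  ports:
    - "127.0.0.1:6379:6379"

Note that inside the container Redis should still bind 0.0.0.0 (as explained above); restrict exposure with the host-side port mapping rather than the bind directive.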

Volume mapped filebeat.yml permissions from Docker on a Windows host

I'm trying to run the official 5.4.3 Filebeat docker container via VirtualBox on a Windows host. Rather than creating a custom image, I'm using a volume mapping to pass the filebeat.yml file to the container using the automatically created VirtualBox mount /c/Users which points to C:\Users on my host.
Unfortunately I'm stuck on this error:
Exiting: error loading config file: config file ("filebeat.yml") can only be writable by the owner but the permissions are "-rwxrwxrwx" (to fix the permissions use: 'chmod go-w /usr/share/filebeat/filebeat.yml')
My docker-compose config is:
filebeat:
  image: "docker.elastic.co/beats/filebeat:5.4.3"
  volumes:
    - "/c/Users/Nathan/filebeat.yml:/usr/share/filebeat/filebeat.yml:ro"
    - "/c/Users/Nathan/log:/mnt/log:ro"
I've tried SSH-ing into the machine and running the chmod go-w command, but nothing changed. Is this some kind of permission limitation when working with VirtualBox shared folders on a Windows host?
It looks like this is a side effect of the Windows DACL permissions system. Fortunately I only need this for a development environment so I've simply disabled the permission check by overriding the container entry point and passing the strict.perms argument.
filebeat:
  image: "docker.elastic.co/beats/filebeat:5.4.3"
  entrypoint: "filebeat -e -strict.perms=false"
  volumes:
    - "/c/Users/Nathan/filebeat.yml:/usr/share/filebeat/filebeat.yml:ro"
    - "/c/Users/Nathan/log:/mnt/log:ro"

How to mount network Volume in Docker for Windows (Windows 10)

We're working to create a standard "data science" image in Docker in order to help our team maintain a consistent environment. In order for this to be useful for us, we need the containers to have read/write access to our company's network. How can I mount a network drive to a docker container?
Here's what I've tried using the rocker/rstudio image from Docker Hub:
This works:
docker run -d -p 8787:8787 -v //c/users/{insert user}:/home/rstudio/foobar rocker/rstudio
This does not work (where P is the mapped location of the network drive):
docker run -d -p 8787:8787 -v //p:/home/rstudio/foobar rocker/rstudio
This also does not work:
docker run -d -p 8787:8787 -v //10.1.11.###/projects:/home/rstudio/foobar rocker/rstudio
Any suggestions?
I'm relatively new to Docker, so please let me know if I'm not being totally clear.
I know this is relatively old, but for the sake of others, here is what usually works for me. In our case we use a Windows file server, so we use cifs-utils in order to map the drive. I assume the instructions below can be applied to NFS or anything else as well.
First, you need to run the container in privileged mode so that you can mount remote folders inside of the container (the --dns flag might not be required):
docker run --dns <company dns ip> -p 8000:80 --privileged -it <container name and tag>
Now (assuming CentOS with CIFS and being root in the container), hop into the container and run the steps below.
Install cifs-utils if not installed yet:
yum -y install cifs-utils
Create the local dir to be mapped:
mkdir /mnt/my-mounted-folder
Prepare a file with the username and credentials:
echo "username=<username-with-access-to-shared-drive>" > ~/.smbcredentials
echo "password=<password>" > ~/.smbcredentials
Map the remote folder:
mount <remote-shared-folder> <my-local-mounted-folder> -t cifs -o iocharset=utf8,credentials=/root/.smbcredentials,file_mode=0777,dir_mode=0777,uid=1000,gid=1000,cache=strict
Now you should have access.
Hope this helps.
I will describe my solution. I have a Synology NAS, and the shared folder uses the SMB protocol.
I managed to connect it in the following way. The most important thing was to specify version 1.0 (vers=1.0); it didn't work without it! I spent two days trying to solve the issue.
version: "3"
services:
redis:
image: redis
restart: always
container_name: 'redis'
command: redis-server
ports:
- '6379:6379'
environment:
TZ: "Europe/Moscow"
celery:
build:
context: .
dockerfile: celery.dockerfile
container_name: 'celery'
command: celery --broker redis://redis:6379 --result-backend redis://redis:6379 --app worker.celery_worker worker --loglevel info
privileged: true
environment:
TZ: "Europe/Moscow"
volumes:
- .:/code
- nas:/mnt/nas
links:
- redis
depends_on:
- redis
volumes:
nas:
driver: local
driver_opts:
type: cifs
o: username=user,password=pass,**vers=1.0**
device: "//192.168.10.10/main"
I have been searching for a solution over the last few days and I just got one working.
I am running the Docker container on an Ubuntu virtual machine and mapping a folder on another host on the same network which runs Windows 10, but I am almost sure the operating system where the container runs is not a problem, because the mapping is done from the container itself, so I think this solution should work on any OS.
Let's code.
First you should create the volume
docker volume create \
  --driver local \
  --opt type=cifs \
  --opt device=//<network-device-ip-folder> \
  --opt o=user=<your-user>,password=<your-pw> \
  <volume-name>
And then you have to run a container from an image
docker run \
  --name <desired-container-name> \
  -v <volume-name>:/<path-inside-container> \
  <image-name>
After this, a container is running with the volume assigned to it and mounted at the path you chose inside the container.
If you create a file in either of these folders, it will be replicated automatically to the other.
In case someone wants to get this running from docker-compose, I leave this here:
services:
  <image-name>:
    build:
      context: .
    container_name: <desired-container-name>
    volumes:
      - <volume-name>:/<path-inside-container>
    ...

volumes:
  <volume-name>:
    driver: local
    driver_opts:
      type: cifs
      device: //<network-device-ip-folder>
      o: "user=<your-user>,password=<your-pw>"
Hope I can help
Adding to the solution by @Александр Рублев, the trick that solved this for me was reconfiguring the Synology NAS to accept the SMB version used by Docker. In my case I had to enable SMBv3.
I know this is old, but I found it when looking for something similar, and I see it's still receiving comments from others who, like me, find it.
I have figured out how to get this to work for a similar situation; it took me a while to figure out.
The answers here are missing some key information that I'll include, possibly because it wasn't available at the time.
The CIFS storage is, I believe, only for when you are connecting to a Windows System as I do not believe it is used by Linux at all unless that system is emulating a Windows environment.
This same thing can be done with NFS, which is less secure, but is supported by almost everything.
You can create an NFS volume in a similar way to the CIFS one, just with a few changes. I'll list both so they can be seen side by side.
When using NFS on WSL2 you first need to install the NFS service into the Linux host OS. I believe CIFS requires a similar one, most likely the cifs-utils mentioned by @LevHaikin, but as I don't use it I'm not certain. In my case the host OS is Ubuntu, but you should be able to find the appropriate one by finding your system's equivalent of the nfs-common (or cifs-utils, if that's correct) installation:
sudo apt update
sudo apt install nfs-common
That's it. That will install the service so NFS works in Docker (it took me forever to realize that was the problem, since it doesn't seem to be mentioned as needed anywhere).
If using NFS, on the network device you need to have set NFS permissions for the NFS folder. In my case this is done at the parent folder, with the mount then pointing to a folder inside it; that's fine. (In my case the NAS that is my server mounts to #IP#/volume1/folder; within the NAS I never see the volume1 in the directory structure, but that full path to the shared folder is shown in the settings page when I set the NFS permissions. I'm not including the volume1 part as your system will likely be different.) You want the FULL PATH after the IP (use the numeric IP, NOT the hostname), according to your NFS share, whatever it may be.
If using a CIFS device, the same is true, just for CIFS permissions.
- The nolock option is often needed, but may not be on your system; it just disables the ability to "lock" files.
- The soft option means that if the system cannot connect to the mount directory it will not hang. If you need it to work only when the mount is available, change this to hard instead.
- The rw (read/write) option is for Read/Write; ro (read-only) would be for Read Only.
- As I don't personally use the CIFS volume, the options set there are just the ones in the examples I found; whether they are necessary for you will need to be looked into.
- The username & password are required & must be included for CIFS.
- uid & gid are Linux user & group settings & should be set, I believe, to what your container needs, as Windows doesn't use them to my knowledge.
- file_mode=0777 & dir_mode=0777 are Linux read/write permissions, essentially like chmod 0777, giving anything that can access the file read/write/execute permissions (more info in link #4 below); this should also be for the Docker container, not the CIFS host.
- noexec has to do with execution permissions but I don't think it actually functions here; it was included in most examples I found. nosuid limits the ability to access files specific to a particular user ID and shouldn't need to be removed unless you know you need it; as it's a protection, I'd recommend leaving it if possible. nosetuids means it won't set UID & GID for newly created files. nodev means no access to/creation of devices on the mount point. vers=1.0 is, I think, a fallback for compatibility; I personally would not include it unless there is a problem or it doesn't work without it.
In these examples I'm mounting //NET.WORK.DRIVE.IP/folder/on/addr/device to a volume named "my-docker-volume" in Read/Write mode. The CIFS volume is using the user supercool with password noboDyCanGue55
NFS from the CLI
docker volume create --driver local --opt type=nfs --opt o=addr=NET.WORK.DRIVE.IP,nolock,rw,soft --opt device=:/folder/on/addr/device my-docker-volume
CIFS from CLI (May not work if Docker is installed on a system other than Windows, will only connect to an IP on a Windows system)
docker volume create --driver local --opt type=cifs --opt o=user=supercool,password=noboDyCanGue55,rw --opt device=//NET.WORK.DRIVE.IP/folder/on/addr/device my-docker-volume
This can also be done within Docker Compose or Portainer.
When you do it there, you will need to add a volumes: section at the bottom of the compose file, with no indent, on the same level as services:
In this example I am mounting the volumes:
- my-nfs-volume from //10.11.12.13/folder/on/NFS/device in Read/Write mode, mounted in the container at /nfs
- my-cifs-volume from //10.11.12.14/folder/on/CIFS/device, with permissions from user supercool with password noboDyCanGue55, in Read/Write mode, mounted in the container at /cifs
version: '3'
services:
  great-container:
    image: imso/awesome/youknow:latest
    container_name: totally_awesome
    environment:
      - PUID=1000
      - PGID=1000
    ports:
      - 1234:5432
    volumes:
      - my-nfs-volume:/nfs
      - my-cifs-volume:/cifs

volumes:
  my-nfs-volume:
    name: my-nfs-volume
    driver_opts:
      type: "nfs"
      o: "addr=10.11.12.13,nolock,rw,soft"
      device: ":/folder/on/NFS/device"
  my-cifs-volume:
    driver_opts:
      type: "cifs"
      o: "username=supercool,password=noboDyCanGue55,uid=1000,gid=1000,file_mode=0777,dir_mode=0777,noexec,nosuid,nosetuids,nodev,vers=1.0"
      device: "//10.11.12.14/folder/on/CIFS/device/"
More details can be found here:
https://docs.docker.com/engine/reference/commandline/volume_create/
https://www.thegeekdiary.com/common-nfs-mount-options-in-linux/
https://web.mit.edu/rhel-doc/5/RHEL-5-manual/Deployment_Guide-en-US/s1-nfs-client-config-options.html
https://www.maketecheasier.com/file-permissions-what-does-chmod-777-means/
