Airflow EC2-Instance socket.getfqdn() Bug

I'm using Airflow version 1.9 and there is a bug in the software that you can read about in two of my previous Stack Overflow posts, as well as on Airflow's GitHub, where the bug is reported and discussed.
Long story short, there are a few locations in Airflow's code where it needs to get the IP address of the server. It does this by calling:
socket.getfqdn()
The problem is that on Amazon EC2 instances (Amazon Linux 1) this call doesn't return the IP address; instead it returns the hostname, like this:
ip-1-2-3-4
Whereas it needs the IP address, like this:
1.2.3.4
To get this IP value, I found that I can use this command:
socket.gethostbyname(socket.gethostname())
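For illustration, here is roughly what the two calls return in a Python shell on one of these instances (the 1-2-3-4 values are just placeholders for your instance's private IP):
import socket
print(socket.getfqdn())                            # e.g. 'ip-1-2-3-4' on Amazon Linux 1
print(socket.gethostbyname(socket.gethostname()))  # e.g. '1.2.3.4'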
I've tested the command out in a Python shell and it returns the proper value. So I ran a search on the Airflow package to find all occurrences of socket.getfqdn() and this is what I got back:
[airflow@ip-1-2-3-4 site-packages]$ cd airflow/
[airflow@ip-1-2-3-4 airflow]$ grep -r "fqdn" .
./security/utils.py: fqdn = host
./security/utils.py: if not fqdn or fqdn == '0.0.0.0':
./security/utils.py: fqdn = get_localhost_name()
./security/utils.py: return '%s/%s#%s' % (components[0], fqdn.lower(), components[2])
./security/utils.py: return socket.getfqdn()
./security/utils.py:def get_fqdn(hostname_or_ip=None):
./security/utils.py: fqdn = socket.gethostbyaddr(hostname_or_ip)[0]
./security/utils.py: fqdn = get_localhost_name()
./security/utils.py: fqdn = hostname_or_ip
./security/utils.py: if fqdn == 'localhost':
./security/utils.py: fqdn = get_localhost_name()
./security/utils.py: return fqdn
Binary file ./security/__pycache__/utils.cpython-36.pyc matches
Binary file ./security/__pycache__/kerberos.cpython-36.pyc matches
./security/kerberos.py: principal = configuration.get('kerberos', 'principal').replace("_HOST", socket.getfqdn())
./security/kerberos.py: principal = "%s/%s" % (configuration.get('kerberos', 'principal'), socket.getfqdn())
Binary file ./contrib/auth/backends/__pycache__/kerberos_auth.cpython-36.pyc matches
./contrib/auth/backends/kerberos_auth.py: service_principal = "%s/%s" % (configuration.get('kerberos', 'principal'), utils.get_fqdn())
./www/views.py: 'airflow/circles.html', hostname=socket.getfqdn()), 404
./www/views.py: hostname=socket.getfqdn(),
Binary file ./www/__pycache__/app.cpython-36.pyc matches
Binary file ./www/__pycache__/views.cpython-36.pyc matches
./www/app.py: 'hostname': socket.getfqdn(),
Binary file ./__pycache__/jobs.cpython-36.pyc matches
Binary file ./__pycache__/models.cpython-36.pyc matches
./bin/cli.py: hostname = socket.getfqdn()
Binary file ./bin/__pycache__/cli.cpython-36.pyc matches
./config_templates/default_airflow.cfg:# gets augmented with fqdn
./jobs.py: self.hostname = socket.getfqdn()
./jobs.py: fqdn = socket.getfqdn()
./jobs.py: same_hostname = fqdn == ti.hostname
./jobs.py: "{fqdn}".format(**locals()))
Binary file ./api/auth/backend/__pycache__/kerberos_auth.cpython-36.pyc matches
./api/auth/backend/kerberos_auth.py:from socket import getfqdn
./api/auth/backend/kerberos_auth.py: hostname = getfqdn()
./models.py: self.hostname = socket.getfqdn()
./models.py: self.hostname = socket.getfqdn()
I'm unsure whether I should just replace all occurrences of the socket.getfqdn() call with socket.gethostbyname(socket.gethostname()) or not. For one, this would be cumbersome to maintain, since I would no longer be running the stock Airflow package I installed from pip. I tried upgrading to Airflow version 1.10, but it was very buggy and I couldn't get it up and running. So for now it seems I'm stuck with Airflow version 1.9, but I need to correct this bug because it's causing my tasks to sporadically fail.

Just replace all occurrences of the faulty function call with the one that works. Here are the steps I ran. Make sure you do this on all Airflow servers (masters and workers) if you are using an Airflow cluster.
[ec2-user@ip-1-2-3-4 ~]$ cd /usr/local/lib/python3.6/site-packages/airflow
[ec2-user@ip-1-2-3-4 airflow]$ grep -r "socket.getfqdn()" .
./security/utils.py: return socket.getfqdn()
./security/kerberos.py: principal = configuration.get('kerberos', 'principal').replace("_HOST", socket.getfqdn())
./security/kerberos.py: principal = "%s/%s" % (configuration.get('kerberos', 'principal'), socket.getfqdn())
./www/views.py: 'airflow/circles.html', hostname=socket.getfqdn()), 404
./www/views.py: hostname=socket.getfqdn(),
./www/app.py: 'hostname': socket.getfqdn(),
./bin/cli.py: hostname = socket.getfqdn()
./jobs.py: self.hostname = socket.getfqdn()
./jobs.py: fqdn = socket.getfqdn()
./models.py: self.hostname = socket.getfqdn()
./models.py: self.hostname = socket.getfqdn()
[ec2-user@ip-1-2-3-4 airflow]$ sudo find . -type f -exec sed -i 's/socket.getfqdn()/socket.gethostbyname(socket.gethostname())/g' {} +
[ec2-user@ip-1-2-3-4 airflow]$ grep -r "socket.getfqdn()" .
[ec2-user@ip-1-2-3-4 airflow]$ grep -r "socket.gethostbyname(socket.gethostname())" .
./security/utils.py: return socket.gethostbyname(socket.gethostname())
./security/kerberos.py: principal = configuration.get('kerberos', 'principal').replace("_HOST", socket.gethostbyname(socket.gethostname()))
./security/kerberos.py: principal = "%s/%s" % (configuration.get('kerberos', 'principal'), socket.gethostbyname(socket.gethostname()))
./www/views.py: 'airflow/circles.html', hostname=socket.gethostbyname(socket.gethostname())), 404
./www/views.py: hostname=socket.gethostbyname(socket.gethostname()),
./www/app.py: 'hostname': socket.gethostbyname(socket.gethostname()),
./bin/cli.py: hostname = socket.gethostbyname(socket.gethostname())
./jobs.py: self.hostname = socket.gethostbyname(socket.gethostname())
./jobs.py: fqdn = socket.gethostbyname(socket.gethostname())
./models.py: self.hostname = socket.gethostbyname(socket.gethostname())
./models.py: self.hostname = socket.gethostbyname(socket.gethostname())
After making that update, simply restart the Airflow webserver, scheduler, and worker processes and you should be all set. Note that when I cd into the Airflow Python package I am on Python 3.6; some of you may be on, say, 3.7, so your path may need to be adjusted to something like /usr/local/lib/python3.7/site-packages/airflow. Just cd into /usr/local/lib and see which Python folder is there. I don't think Airflow ends up in it, but sometimes Python packages are also located under /usr/local/lib64/python3.6/site-packages (the difference being lib64 instead of lib). Also, keep in mind that this is fixed in Airflow version 1.10, so you should not need to make these changes in the latest version of Airflow.
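If you're not sure which site-packages directory Airflow was installed into, one quick way to check (using the same Python interpreter that runs Airflow) is to ask the package itself:
import os
import airflow
# prints something like /usr/local/lib/python3.6/site-packages/airflow
print(os.path.dirname(airflow.__file__))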

Related

the script I wrote for nmap does not work

I wrote a script for nmap but it doesn't work
Maybe a problem with the libraries, but not sure (I'm new to this)
import nmap
import sys
import traceback

def scan_hosts(hosts, scan_type):
    try:
        nm = nmap.PortScanner()
        if scan_type == 'SYN':
            nm.scan(hosts=hosts, arguments='-p- -sS -T4 -A --script=vulners')
        elif scan_type == 'UDP':
            nm.scan(hosts=hosts, arguments='-p- -sU -T4 -A --script=vulners')
        elif scan_type == 'FULL':
            nm.scan(hosts=hosts, arguments='-p- -sS -sU -T4 -A --script=vulners')
        for host in nm.all_hosts():
            print('Host: %s (%s)' % (host, nm[host].hostname()))
            for proto in nm[host].all_protocols():
                print('Protocol: %s' % proto)
                lport = nm[host][proto].keys()
                for port in lport:
                    print('port : %s\tstate : %s' % (port, nm[host][proto][port]['state']))
    except Exception as e:
        print(f"An error occurred while scanning the host: {e}")
        print("\n".join(traceback.format_exception(etype=type(e), value=e, tb=e.__traceback__)))
        # Attempt to correct the error
        if "timed out" in str(e):
            print("Error: Host timed out. Retrying with increased timeout...")
            nm.scan(hosts=hosts, arguments='-p- -sS -T4 -A --script=vulners --host-timeout=60')
        else:
            print("Error could not be corrected. Exiting...")
            sys.exit(1)

if __name__ == '__main__':
    hosts = '0.0.0.0/0'
    scan_type = 'FULL'  # can be one of 'SYN', 'UDP', or 'FULL'
    scan_hosts(hosts, scan_type)

Conda: list all environments that use a certain package

How can one get a list of all environments that use a certain package in conda?
conda search
Since Conda v4.5.0, the conda search command has had an --envs flag for searching local environments for installed packages. See conda search -h:
--envs Search all of the current user's environments. If run
as Administrator (on Windows) or UID 0 (on unix),
search all known environments on the system.
A use case given in the original Issue was finding environments with outdated versions of openssl packages:
conda search --envs 'openssl<1.1.1l'
Conda Python API
Here's an example of how to do it with the Conda Python package (run this in base environment):
import conda.gateways.logging
from conda.core.envs_manager import list_all_known_prefixes
from conda.cli.main_list import list_packages
from conda.common.compat import text_type

# package to search for; this can be a regex
PKG_REGEX = "pymc3"

for prefix in list_all_known_prefixes():
    exitcode, output = list_packages(prefix, PKG_REGEX)
    # only print envs with results
    if exitcode == 0 and len(output) > 3:
        print('\n'.join(map(text_type, output)))
This works as of Conda v4.10.0, but since it relies on internal methods, there's no guarantee it will keep working going forward. Perhaps this should be a feature request, say for a CLI command like conda list --any.
Script Version
Here is a version that uses arguments for package names:
conda-list-any.py
#!/usr/bin/env conda run -n base --no-capture-output python

## Usage: conda-list-any.py [PACKAGE ...]
## Example: conda-list-any.py numpy pandas

import conda.gateways.logging
from conda.core.envs_manager import list_all_known_prefixes
from conda.cli.main_list import list_packages
from conda.common.compat import text_type
import sys

for pkg in sys.argv[1:]:
    print("#"*80)
    print("# Checking for package '%s'..." % pkg)
    n = 0
    for prefix in list_all_known_prefixes():
        exitcode, output = list_packages(prefix, pkg)
        if exitcode == 0 and len(output) > 3:
            n += 1
            print("\n" + "\n".join(map(text_type, output)))
    print("\n# Found %d environment%s with '%s'." % (n, "" if n == 1 else "s", pkg))
    print("#"*80 + "\n")
The shebang at the top should ensure that it will run in base, at least on Unix/Linux systems.
I've run into the same issue and constructed a PowerShell script that:
takes a regex query as an argument
goes through all available conda environments (and shows a progress bar while doing so)
lists all packages in each environment
returns the environments that have a package that match the query
per returned environment shows the package name that matched the query
The script is available at the bottom of this answer. I have saved it in a folder that is available in the PATH under the name search-pkg-env.ps1.
The script can be run as follows (here I search for environments with the pip package pymupdf):
PS > search-pkg-env.ps1 pymupdf
Searching 19 conda environments...
py39
- pymupdf 1.18.13
Script
if ($args.Length -eq 0) {
    Write-Host "Please specify a search term that matches the package name (regex)." -BackgroundColor DarkYellow -ForegroundColor White
    exit
}
$pkg_regex = $args[0]

$conda_envs = ConvertFrom-JSON -InputObject $([string]$(conda env list --json))
$total = $conda_envs.envs.Count
Write-Host "Searching $total conda environments..."

$counter = 1
ForEach ($conda_env in $conda_envs.envs) {
    if ($total -gt 0) {
        $progress = $counter / $total * 100
    }
    else {
        $progress = 0
    }
    # Split the full path into its elements
    $parts = $conda_env -Split '\\'
    # The last element has the env name
    $env_name = $parts[-1]
    Write-Progress -Activity "Searching conda environment $env_name for $pkg_regex" -PercentComplete $progress

    # Now search the provided package name in this environment
    $search_results = ConvertFrom-JSON -InputObject $([string]$(conda list $pkg_regex -n $env_name --json))
    If ($search_results.Count -gt 0) {
        Write-Host $env_name
        foreach ($result in $search_results) {
            $pkg_name = $result.name
            $pkg_version = $result.version
            Write-Host " - $pkg_name $pkg_version"
        }
    }
    $counter++
}
Python + Terminal solution
Jump directly to solution:
Use the following GitHub script: Listing Environments Containing a Set of Packages.
You don't need to install any library.
Simply copy the script, run it, and provide the set of packages needed (one or more).
From Scratch Solution:
To find all environments that contain a set of packages (one or more):
Get all the conda environments using Popen and conda env list; this will list all conda envs on your station.
Loop over the packages and check if a package exists in a conda env using Popen and conda list -n <environment>; this will list all available packages in an environment.
Check the output for the package using grep, findstr, or your platform's alternative - or do the search with Python.
For each package, save the matching environments in a list.
Check the lists for common environments, and voilà! A minimal sketch of this approach is shown below.
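Here is a minimal Python sketch of that approach, assuming conda is on your PATH and using the subprocess module; the package name is just a placeholder:
import json
import subprocess

PKG = "numpy"  # placeholder: the package you want to find

# 1. List all conda environments known on this machine
envs = json.loads(subprocess.check_output(["conda", "env", "list", "--json"]))["envs"]

# 2. Check each environment for the package and report matches
for prefix in envs:
    out = json.loads(subprocess.check_output(["conda", "list", "--prefix", prefix, PKG, "--json"]))
    if out:  # non-empty list means at least one matching package
        versions = ", ".join("%s %s" % (p["name"], p["version"]) for p in out)
        print("%s: %s" % (prefix, versions))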

Ansible does not run without password file

I have a very simple setup:
ansible-playbook -vvv playbooks/init.yml -i inventories/dev/local.hosts
ansible-playbook 2.7.8
config file = /Users/user/code/lambda/pahan/ansible/ansible.cfg
configured module search path = ['/Users/user/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /usr/local/Cellar/ansible/2.7.8/libexec/lib/python3.7/site-packages/ansible
executable location = /usr/local/bin/ansible-playbook
python version = 3.7.2 (default, Feb 12 2019, 08:15:36) [Clang 10.0.0 (clang-1000.11.45.5)]
Using /Users/user/code/lambda/pahan/ansible/ansible.cfg as config file
ERROR! The vault password file /Users/user/.vault_pass.txt was not found
The password file is not referenced anywhere and there is no password used from the vault:
ansible.cfg
[defaults]
roles_path = ./roles
[ssh_connection]
control_path=~/%%h-%%r
pipelining = True
Why is it trying to access the password file anyways?
It turns out that an environment variable was still set in my shell, forcing Ansible to read the vault password file.
I was able to debug this by running: ansible-config -c ansible.cfg dump
ACTION_WARNINGS(default) = True
AGNOSTIC_BECOME_PROMPT(default) = False
ALLOW_WORLD_READABLE_TMPFILES(default) = False
ANSIBLE_COW_PATH(default) = None
ANSIBLE_COW_SELECTION(default) = default
ANSIBLE_COW_WHITELIST(default) = ['bud-frogs', 'bunny', 'cheese', 'daemon', 'default', 'dragon', 'elephant-in-snake', 'elephant', 'eyes
ANSIBLE_FORCE_COLOR(default) = False
ANSIBLE_NOCOLOR(default) = False
ANSIBLE_NOCOWS(default) = False
ANSIBLE_PIPELINING(/Users/user/code/lambda/pahan/ansible/ansible.cfg) = True
ANSIBLE_SSH_ARGS(default) = -C -o ControlMaster=auto -o ControlPersist=60s
ANSIBLE_SSH_CONTROL_PATH(/Users/user/code/lambda/pahan/ansible/ansible.cfg) = ~/%%h-%%r
ANSIBLE_SSH_CONTROL_PATH_DIR(default) = ~/.ansible/cp
ANSIBLE_SSH_EXECUTABLE(default) = ssh
ANSIBLE_SSH_RETRIES(default) = 0
ANY_ERRORS_FATAL(default) = False
BECOME_ALLOW_SAME_USER(default) = False
CACHE_PLUGIN(default) = memory
CACHE_PLUGIN_CONNECTION(default) = None
CACHE_PLUGIN_PREFIX(default) = ansible_facts
CACHE_PLUGIN_TIMEOUT(default) = 86400
...
After removing the environment variable, Ansible was able to run again.
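As a quick sanity check, you can list any Ansible- or vault-related variables still set in your environment; this small snippet is just one way to do it (ANSIBLE_VAULT_PASSWORD_FILE is the usual suspect for this particular error):
import os

# Print any Ansible- or vault-related environment variables still set in the shell
for key, value in sorted(os.environ.items()):
    if key.startswith("ANSIBLE_") or "VAULT" in key.upper():
        print("%s=%s" % (key, value))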

How to get active namenode hostname from Cloudera Manager REST API?

I'm able to access the Cloudera Manager REST API.
curl -u username:password http://cmhost:port/api/v10/clusters/clusterName
How do I find the active NameNode and Resource Manager hostnames?
I couldn't find anything relevant from API docs.
http://cloudera.github.io/cm_api/apidocs/v10/index.html
Note: Cluster is configured with high availability
You need to use this endpoint:
http://cloudera.github.io/cm_api/apidocs/v10/path__clusters_-clusterName-services-serviceName-roles-roleName-.html
Then do the following:
For each NameNode:
$ curl -u username:password \
http://cmhost:port/api/v10/clusters/CLNAME/services/HDFS/roles/NN_NAME
Replacing:
CLNAME with your clusterName
HDFS with your HDFS serviceName
NN_NAME with your NameNode name
This will return the apiRole object which has a field called haStatus. The one that shows "ACTIVE" is the active NameNode.
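As a rough sketch (the host, credentials, cluster, service, and role names below are placeholders you would replace with your own), you could loop over your NameNode roles with Python and requests and print each haStatus:
import requests

BASE = "http://cmhost:port/api/v10"   # placeholder Cloudera Manager host:port
AUTH = ("username", "password")       # placeholder credentials

# Placeholder role names; list yours via .../clusters/CLNAME/services/HDFS/roles
for role_name in ("NAMENODE-1", "NAMENODE-2"):
    role = requests.get(BASE + "/clusters/CLNAME/services/HDFS/roles/" + role_name, auth=AUTH).json()
    print(role_name, role.get("haStatus"))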
For the Resource Manager do similar steps:
For each Resource Manager:
$ curl -u username:password \
http://cmhost:port/api/v10/clusters/CLNAME/services/YARN/roles/RM_NAME
Where:
YARN with your YARN serviceName
RM_NAME with your Resource Manager name
Once you have the right NameNode and Resource Manager, use:
http://cloudera.github.io/cm_api/apidocs/v10/path__hosts_-hostId-.html
to map the hostId to the hostname.
You can get a bulk of HDFS-related information for hosts by using the REST API:
$ python build.py username:password cmhost:port
$ cat build.py
import sys
import json
import requests

args = sys.argv
if len(args) != 3:
    print("Usage: python %s login:password host:port" % args[0])
    sys.exit(1)

LP = args[1]
CM = args[2]

# Map hostId -> hostname for every host known to Cloudera Manager
host = {}
hosts = requests.get('http://'+LP+'@'+CM+'/api/v10/hosts').json()
for h in hosts['items']:
    host[h['hostId']] = h['hostname']

nameservices = requests.get('http://'+LP+'@'+CM+'/api/v10/clusters/cluster/services/hdfs/nameservices').json()
for ns in nameservices['items']:
    print('hdfs.NS:' + ns['name'])

# Collect HDFS roles grouped by type (and HA status, when present)
services = requests.get('http://'+LP+'@'+CM+'/api/v10/clusters/cluster/services').json()
for s in services['items']:
    if (s['name'] == 'hdfs'):
        roles = requests.get('http://'+LP+'@'+CM+'/api/v10/clusters/cluster/services/' + s['name'] + '/roles').json()
        srv = {}
        for r in roles['items']:
            suff = '.' + r.get('haStatus') if r.get('haStatus') else ''
            key = s['name'] + '.' + r['type'] + suff
            srv[key] = srv.get(key) + ',' + host[r['hostRef']['hostId']] if srv.get(key) else host[r['hostRef']['hostId']]
        for s in srv:
            print(s + ":" + ','.join(sorted(srv[s].split(','))))
Then you'll get something like this, just grep for hdfs.NAMENODE.ACTIVE (or slightly change the python script):
hdfs.NS:H1
hdfs.NAMENODE.ACTIVE:h6
hdfs.NAMENODE.STANDBY:h1
hdfs.FAILOVERCONTROLLER:h1,h2,h3
hdfs.DATANODE:h1
hdfs.HTTPFS:h1,h2,h3
hdfs.GATEWAY:h1,h2,h3
hdfs.JOURNALNODE:h4,h5
hdfs.BALANCER:h7

traefik - HTTP to HTTPS WWW Redirect

I could not find a question similar to this, there were others mentioning https redirects, but not about minimizing the redirects.
I've been looking for a solution but haven't sorted it out yet.
We use Docker with Traefik for WordPress and have www as the preferred version for WordPress. There are multiple WP instances. Domains are added dynamically.
However, with this config, I am getting two redirects, from http to https to https www:
http://example.com/
https://example.com/
https://www.example.com/
Is there any way to minimize the redirects? Ideally a single 301 redirect from http://example.com directly to https://www.example.com.
My Traefik config file is as follows:
defaultEntryPoints = ["http", "https"]
[web]
address = ":8080"
[entryPoints]
[entryPoints.http]
address = ":80"
[entryPoints.http.redirect]
entryPoint = "https"
[entryPoints.https]
address = ":443"
compress = true
[entryPoints.https.tls]
[acme]
email = "email#domain.com"
storage = "acme.json"
entryPoint = "https"
onDemand = false
OnHostRule = true
[docker]
endpoint = "unix:///var/run/docker.sock"
domain = "traefik.example.com"
watch = true
exposedbydefault = false
Try replacing your [entryPoints.http.redirect] entry with this:
[entryPoints.http.redirect]
#entryPoint = "https"
regex = "^http:\/\/(www\.)*(example\.com)(.*)"
replacement = "https://www.$2$3"
permanent = true
Regex101
It will not handle the https://example.com/ entry so you need to add:
[entryPoints.https.redirect]
regex = "^https:\/\/(example\.com)(.*)"
replacement = "https://www.$1/$2"
permanent = true
If you have multiple frontends, the regex can get hard to handle, so instead you can consider having a label on the container, like this:
traefik.frontend.headers.SSLRedirect=true
traefik.frontend.headers.SSLHost=www.example.com
As of 1.7 there is a new option, SSLForceHost, which forces even an existing SSL connection to be redirected:
traefik.frontend.headers.SSLForceHost=true
Here's what I had to do. The above answer was helpful, but Traefik wouldn't start because you actually need a double backslash to escape in the .toml.
Also, you still need to make sure you have the normal entry points and ports there.
Here's my complete entryPoints section:
[entryPoints]
[entryPoints.http]
address = ":80"
[entryPoints.https]
address = ":443"
[entryPoints.http.redirect]
regex = "^http:\\/\\/(www.)*(example\\.com)(.*)"
replacement = "https://www.$2/$3"
permanent = true
[entryPoints.https.redirect]
regex = "^https:\\/\\/(example.com)(.*)"
replacement = "https://www.$1/$2"
permanent = true
[entryPoints.https.tls]
This is how I got it to work with the Docker provider behind an AWS ELB.
traefik container
/usr/bin/docker run --rm \
--name traefik \
-p 5080:80 \
-p 5443:443 \
-v /etc/traefik/traefik.toml:/etc/traefik/traefik.toml \
-v /var/run/docker.sock:/var/run/docker.sock \
traefik
traefik.toml
defaultEntryPoints = ["http", "https"]
[entryPoints]
[entryPoints.http]
address = ":80"
[entryPoints.https]
address = ":443"
docker labels
-l traefik.enable=true \
-l traefik.http.middlewares.redirect.redirectregex.regex="^http://(.*)" \
-l traefik.http.middlewares.redirect.redirectregex.replacement="https://\$1" \
-l traefik.http.routers.web-redirect.rule="Host(\`domain.com\`)" \
-l traefik.http.routers.web-redirect.entrypoints="http" \
-l traefik.http.routers.web-redirect.middlewares="redirect" \
-l traefik.http.routers.web-secure.rule="Host(\`domain.com\`)" \
-l traefik.http.routers.web-secure.entrypoints="https" \
ELB listeners
