Why can't I scale an OpenShift cart that has an environment variable defined?

I am trying to integrate Hazelcast into the Tomcat cartridge (https://github.com/worldline/openshift-cartridge-tomcat). The problem is retrieving the ip:ports of the gears of a scalable application. I looked at how vert.x does it, and it is perfectly fine: it uses a pub/sub mechanism to set the IP and port in a future environment variable. I can see this in the hooks folder, in the "set-vertex-cluster" file:
echo $list > $OPENSHIFT_VERTX_DIR/env/OPENSHIFT_VERTX_HAZELCAST_CLUSTER
I did it the same way, replacing VERTX with TOMCAT (the short name of the cart).
But after creating the app there is no OPENSHIFT_TOMCAT_HAZELCAST_CLUSTER env variable.
I looked at how JBossEAP does it. It has:
touch ${OPENSHIFT_JBOSSEAP_DIR}/env/OPENSHIFT_JBOSSEAP_CLUSTER
https://github.com/openshift/origin-server/blob/master/cartridges/openshift-origin-cartridge-jbosseap/bin/install
It worked for me, and finally I see the OPENSHIFT_VERTX_HAZELCAST_CLUSTER env var, and it is populated with gear_ip:gear_port. That's good. But when I scale the app I get this error:
Activation of new gears failed: 53d2ed31e0b8cd2bba00051f: Error activating gear: CLIENT_ERROR: Failed to execute: 'control start' for /var/lib/openshift/53d2ed31e0b8cd2bba00051f/tomcat
Unable to complete the requested operation due to: An invalid exit code (1) was returned from the server ex-std-node92.prod.rhcloud.com. This indicates an unexpected problem during the execution of your request.
Reference ID: 0b4e8a465d1901e8317a18739586e6d1
OPENSHIFT_VERTX_HAZELCAST_CLUSTER is populated with gear1_ip:gear1_port and gear2_ip:gear2_port, but of course the second gear failed to start.
When I remove
touch ${OPENSHIFT_TOMCAT_DIR}/env/OPENSHIFT_TOMCAT_HAZELCAST_CLUSTER
from the bin/install file, everything is fine! Except that I don't have the list of cluster members...
I have been going mad, struggling with this problem all day long. Can anybody help me, please?
UPDATED:
• Customer A creates an OpenShift application A1 with the Git downloadable cartridge.
• OpenShift installs the downloadable cartridge onto Node N1 and installs it into application A1.
• Customer A now wants to scale application A1.
• OpenShift tries to scale application A1 by acquiring a new gear on Node N2 (notice that it is different from N1 above) and copies the content from A1 onto N2, but somehow does not copy all the environment variables and necessary settings that live in the .env folder of every gear.
• The gear creation now fails on N2, because the downloadable cartridge content is not available on N2, due to the following commands in bin/tomcat:
# Filter user-owned configuration files through sed to replace all
# ${OPENSHIFT_*} variables with their actual values, and write the
# resulting filtered files to the live Tomcat configuration location.
sed_replace_env=$(print_sed_exp_replace_env_var)
replacement_conf_files=(
  "server.xml"
  "context.xml"
)
for conf_file in "${replacement_conf_files[@]}"; do
  sed ${sed_replace_env} ${OPENSHIFT_REPO_DIR}/.openshift/config/${conf_file} > ${OPENSHIFT_TOMCAT_DIR}/conf/${conf_file}
done
The particular function
function print_sed_exp_replace_env_var {
  sed_exp=""
  for openshift_var in $(env | grep OPENSHIFT_ | awk -F '=' '{print $1}')
  do
    # environment variable values that contain " or / need to be escaped
    # or they will cause problems in the sed command line.
    variable_val=$(echo "${!openshift_var}" | sed -e "s#\/#\\\\/#g" | sed -e "s/\"/\\\\\"/g")
    # the entire sed s/search/replace/g command needs to be quoted in case the variable value
    # contains a space.
    sed_exp="${sed_exp} -e \"s/\\\${env.${openshift_var}}/${variable_val}/g\""
  done
  printf "%s\n" "$sed_exp"
}
As you can see, there is a danger that the sed command will turn server.xml and context.xml into blank files if the environment variables are not present.
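A defensive tweak I would consider (my own sketch, not part of the cartridge) is to skip the filtering when the template path does not resolve, for example because OPENSHIFT_REPO_DIR is not set yet on the new gear, so the loop does not overwrite server.xml and context.xml with blank files:

for conf_file in "${replacement_conf_files[@]}"; do
  template="${OPENSHIFT_REPO_DIR}/.openshift/config/${conf_file}"
  if [ ! -f "$template" ]; then
    # Template not available on this gear yet: keep whatever config is already
    # in place instead of truncating it to an empty file via the redirection.
    echo "WARNING: $template not found, skipping filtering of $conf_file" >&2
    continue
  fi
  sed ${sed_replace_env} "$template" > "${OPENSHIFT_TOMCAT_DIR}/conf/${conf_file}"
done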
The correct order that OpenShift should perform is:
• Customer A creates an OpenShift application A1 with the Git downloadable cartridge.
• OpenShift installs the downloadable cartridge onto Node N1 and installs it into application A1.
• Customer A now wants to scale application A1.
• OpenShift tries to scale application A1 by acquiring a new gear on Node N2 (notice that it is different from N1 above), and copies all the necessary environment variables into the new gear as well.
• The script within your cartridge that requires the server.xml and context.xml templates from the downloadable cartridge can now find and copy them successfully.

Related

gcloud cli failing to add record when contents start with dash

I'm working with the LetsEncrypt dns-01 challenge system, which entails dynamically creating a TXT record in Google Cloud DNS with specific content so LE can assert proof of ownership for generating a wildcard certificate (so I can't use http-01). The problem is that sometimes LE tells me to create a TXT record that starts with a "-", for example -E_DFDFHJKF1783FSHDJ. I cannot get the gcloud CLI to properly accept this data no matter what I do.
Example:
gcloud dns record-sets transaction start --zone=myzone
gcloud dns record-sets transaction add "-E_ASDFSDF" --ttl=30 --zone=myzone --name=test --type=TXT
gcloud dns record-sets transaction remove "-A_DSFKHSDF" --ttl=30 --zone=myzone --name=test2 --type=TXT
If you run those commands and inspect the resulting transaction.yaml, you can see whether it properly contains the right string. If it did it correctly, you should see something like:
- kind: dns#resourceRecordSet
  name: test.
  rrdatas:
  - '"ASDFASDF"'
  ttl: 30
  type: TXT
I am executing this via Node's child_process, but I have the issue even if I execute it directly from bash, so Node isn't really a meaningful factor at the moment. I've tried echoing the value in. I've tried setting an environment variable and using that in the string.
No matter what I do I get an error like the following:
ERROR: (gcloud.dns.record-sets.transaction.add) unrecognized arguments: -E_ASDFSDF
It turns out some characters need to be escaped in the CLI. I can confirm that the following works:
gcloud dns --project=myprojectid record-sets transaction add "\-test123" --name=test.mydomain.com. --ttl=300 --type=TXT --zone=myzoneid
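Since the values come from a script anyway, a small wrapper can apply the same escaping automatically. This is my own sketch; only the leading-backslash trick itself is what is confirmed above, and the zone/name/ttl values are the placeholders from the question:

add_txt_record() {
  local value="$1"
  # gcloud treats a leading dash as a flag, so escape it with a backslash.
  case "$value" in
    -*) value="\\${value}" ;;
  esac
  gcloud dns record-sets transaction add "$value" \
    --ttl=30 --zone=myzone --name=test --type=TXT
}

add_txt_record "-E_ASDFSDF"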

Perforce - how to prevent "p4 client" from creating a client when the template form is not saved?

The Perforce documentation for p4 client <no args> states:
The p4 client command puts the client spec into a temporary file and
invokes the editor configured by the environment variable P4EDITOR.
For new workspaces, the client name defaults to the P4CLIENT
environment variable, if set, or to the current host name. Saving the
file creates or modifies the client spec.
What I am seeing on our network is that the client is created no matter what, even when I exit without saving.
Ex.
[cad_test_user@sws-cab9-0 ~]$ pwd
/home/cad_test_user
[cad_test_user@sws-cab9-0 ~]$ env | grep P4
P4EDITOR=
P4PORT=tcp:p4p:1666
P4DIFF=tkdiff
P4CONFIG=.p4config
P4IGNORE=.ignore
P4USER=cad_test_user
[cad_test_user@sws-cab9-0 ~]$ p4 clients | grep sws-cab9-0
[cad_test_user@sws-cab9-0 ~]$ p4 client
Client: sws-cab9-0
Owner: cad_test_user
Host: sws-cab9-0.aus5.mythic-ai.com
Client sws-cab9-0 saved.
Root: /home/cad_test_user
Options: noallwrite noclobber nocompress unlocked nomodtime normdir
SubmitOptions: submitunchanged
LineEnd: local
View:
<quit without save>
Client sws-cab9-0 saved.
[cad_test_user@sws-cab9-0 ~]$ p4 clients | grep sws-cab9-0
Client sws-cab9-0 2021/04/06 root /home/cad_test_user 'Created by cad_test_user. '
Now, as another user outside of a .p4config hierarchy, I get an unexpected value for %clientRoot%:
[cad_test_user@sws-cab9-0 /]$ p4 -F %clientRoot% -ztag info
/home/cad_test_user
I am wondering if there is something wrong with our default settings; why is the client created and saved even without a write? Ideally, I'd want to manage the default specification to some degree, like:
1. synthesize the client name so that it is never the hostname, like c:$USER:foo
2. not have a "Host:"
3. define the "Root:" to be somewhere personal
4. not create the client unless the user does a write-quit!
Thanks for your answers!
Set up a trigger (a form-save trigger on the client form) that rejects a client which doesn't meet your criteria. It's hard to enforce #4 directly, but as long as at least one of your other criteria is something that requires the form to be edited, it's handled well enough indirectly.
Note that you can pair your form-save trigger with a form-out trigger that modifies the default client form -- you could for example replace Root with an obviously invalid field like --ENTER SOMETHING PERSONALIZED HERE-- and then make sure your form-save trigger rejects it. The Perforce sys admin guide has some nice simple example triggers, one of which demonstrates customizing client spec defaults: https://www.perforce.com/manuals/p4sag/Content/P4SAG/scripting.triggers.forms.out.html
On your criteria #2, I would recommend against this unless you're in an environment where it's commonplace for multiple host machines to share a single filesystem. The default Host guardrails are there to keep you from confusing yourself (and possibly losing data) by reusing a client spec in ways that throw the workspace state out of whack.
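To make that concrete, here is a rough sketch of the two trigger table entries plus a form-save check script. The trigger names, paths, and the placeholder text are my own assumptions, not a tested configuration; see the linked sys admin guide for the real examples.

Triggers:
	clientdefaults form-out  client "/p4/triggers/client_defaults.sh %formfile%"
	clientcheck    form-save client "/p4/triggers/client_check.sh %formfile%"

#!/bin/bash
# client_check.sh -- reject client specs that still contain the placeholder Root
# inserted by the (hypothetical) form-out script client_defaults.sh.
formfile="$1"

if grep -q -- '--ENTER SOMETHING PERSONALIZED HERE--' "$formfile"; then
  echo "Please set Root: to a personal location before saving your client spec."
  exit 1   # a non-zero exit rejects the form; the message above is shown to the user
fi
exit 0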

MapReduceIndexerTool output dir error "Cannot write parent of file"

I want to use Cloudera's MapReduceIndexerTool to understand how morphlines work. I created a basic morphline that just reads lines from the input file, and I tried to run the tool with this command:
hadoop jar /opt/cloudera/parcels/CDH/lib/solr/contrib/mr/search-mr-*-job.jar org.apache.solr.hadoop.MapReduceIndexerTool \
  --morphline-file morphline.conf \
  --output-dir hdfs:///hostname/dir/ \
  --dry-run true
Hadoop is installed on the same machine where I run this command.
The error I'm getting is the following:
net.sourceforge.argparse4j.inf.ArgumentParserException: Cannot write parent of file: hdfs:/hostname/dir
at org.apache.solr.hadoop.PathArgumentType.verifyCanWriteParent(PathArgumentType.java:200)
The /dir directory has 777 permissions on it, so it is definitely allowed to write into it. I don't know what I should do to allow it to write into that output directory.
I'm new to HDFS and I don't know how I should approach this problem. Logs don't offer me any info about that.
What I tried until now (with no result):
created a hierarchy of 2 directories (/dir/dir2) and put 777 permissions on both of them
changed the output-dir schema from hdfs:///... to hdfs://... because all the examples in the --help menu are built that way, but this leads to an invalid schema error
Thank you.
It states 'cannot write parent of file'. And the parent in your case is /. Take a look into the source:
private void verifyCanWriteParent(ArgumentParser parser, Path file) throws ArgumentParserException, IOException {
  Path parent = file.getParent();
  if (parent == null || !fs.exists(parent) || !fs.getFileStatus(parent).getPermission().getUserAction().implies(FsAction.WRITE)) {
    throw new ArgumentParserException("Cannot write parent of file: " + file, parser);
  }
}
The message prints file, which in your case is hdfs:/hostname/dir, so file.getParent() will be /.
Additionally, you can test the permissions with the hadoop fs command; for example, you can try to create a zero-length file in the path:
hadoop fs -touchz /test-file
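You can also inspect the permissions of the parent itself (/ in this case). This is just a generic HDFS inspection command, not something specific to the indexer tool:

# -d lists the directory entry itself rather than its contents,
# so you can read the owner and mode bits of / directly.
hadoop fs -ls -d /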
I solved this problem after days of working on it.
The problem was with the line --output-dir hdfs:///hostname/dir/.
First of all, there should not be 3 slashes at the beginning, as I had put in my repeated attempts to make this work; there are only 2 (as in any valid HDFS URI). I actually put 3 slashes because otherwise the tool throws an invalid schema exception! You can easily see in this code that the schema check is done before the verifyCanWriteParent check.
I tried to get the hostname by simply running the hostname command on the CentOS machine I was running the tool on. This was the main issue. I analyzed the /etc/hosts file and saw that there are 2 hostnames for the same local IP. I took the second one and it worked. (I also attached the port to the hostname, so the final format is the following: --output-dir hdfs://correct_hostname:8020/path/to/file/from/hdfs)
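For reference, the corrected invocation would look roughly like this (a sketch based on the description above; correct_hostname and port 8020 are placeholders for your NameNode address, and as noted below you may still need ZooKeeper/Solr options for a real run):

hadoop jar /opt/cloudera/parcels/CDH/lib/solr/contrib/mr/search-mr-*-job.jar org.apache.solr.hadoop.MapReduceIndexerTool \
  --morphline-file morphline.conf \
  --output-dir hdfs://correct_hostname:8020/dir/ \
  --dry-run true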
This error is very confusing, because everywhere you look for the namenode hostname you will see the same thing that the hostname command returns. Moreover, the errors are not structured in a way that lets you diagnose the problem and take a logical path to solving it.
Additional information regarding this tool and debugging it:
If you want to see the actual code that runs behind it, check the Cloudera version that you are running and select the same branch on the official repository. The master branch is not up to date.
If you want to just run this tool to play with the morphline (using the --dry-run option) without connecting to Solr, you can't. You have to specify a ZooKeeper endpoint and a Solr collection or a Solr config directory, which involves additional research. This is something that could be improved in this tool.
You don't need to run the tool with -u hdfs; it works with a regular user.

Bash case not properly evaluating value

The Problem
I have a script with a case statement that I expect to execute based on the value of a variable. The case statement appears to either ignore the value or not evaluate it properly, instead dropping to the default.
The Scenario
I pull a specific character out of our server hostnames which indicates where in our environment the server resides. We have six different locations:
Management(m): servers that are part of the infrastructure such as monitoring, email, ticketing, etc
Development(d): servers that are for developing code and application functionality
Test(t): servers that are used for initial testing of the code and application functionality
Implementation(i): servers that the code is pushed to for pre-production evaluation
Production(p): self-explanatory
Services(s): servers that the customer needs to integrate that provide functionality across their project. These are separate from the Management servers in that these are customer servers while Management servers are owned and operated by us.
After pulling the character from the hostname I pass it to a case block. I expect the case block to evaluate the character and add a couple lines of text to our rsyslog.conf file. What is happening instead is that the case block returns the default which does nothing but tell the person building the server to manually configure the entry due to an unrecognized character.
I've tested this manually against a server I recently built and verified that the character I am pulling from the hostname (an 's') is expected and accounted for in the case block.
The Code
# Determine which environment our server resides in
host=$(hostname -s)
env=${host:(-8):1}
OLDFILE=/etc/rsyslog.conf
NEWFILE=/etc/rsyslog.conf.new
# This is the configuration we need on every server regardless of environment
read -d '' common <<- EOF
...
TEXT WHICH IS ADDED TO ALL CONFIG FILES REGARDLESS OF FURTHER CODE EXECUTION
SNIPPED
....
EOF
# If a server is in the Management, Dev or Test environments send logs to lg01
read -d '' lg01conf <<- EOF
# Relay messages to lg01
*.notice @@xxx.xxx.xxx.100
#### END FORWARDING RULE ####
EOF
# If a server is in the Imp, Prod or is a non-affiliated Services zone server send logs to lg02
read -d '' lg02conf <<- EOF
# Relay messages to lg02
*.notice @@xxx.xxx.xxx.101
#### END FORWARDING RULE ####
EOF
# The general rsyslog configuration remains the same; pull it out and write it to a new file
head -n 63 $OLDFILE > $NEWFILE
# Add the common language to our config file
echo "$common" >> $NEWFILE
# Depending on which environment ($env) our server is in, add the appropriate
# remote log server to the configuration with the $common settings.
case $env in
  m) echo "$lg01conf" >> $NEWFILE;;
  d) echo "$lg01conf" >> $NEWFILE;;
  t) echo "$lg01conf" >> $NEWFILE;;
  i) echo "$lg02conf" >> $NEWFILE;;
  p) echo "$lg02conf" >> $NEWFILE;;
  s) echo "$lg02conf" >> $NEWFILE;;
  *) echo "Unknown environment; Manually configure"
esac
# Keep a dated backup of the original rsyslog.conf file
cp $OLDFILE $OLDFILE.$(date +%Y%m%d)
# Replace the original rsyslog.conf file with the new version
mv $NEWFILE $OLDFILE
An Aside
I've already determined that I can combine the different groups of commands from the case block onto single lines (a total of two) using the | operator, as sketched below. I've listed it in the manner above since this is how it was coded while I was having the issue.
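For reference, a sketch of that combined form (behavior identical to the six-branch version above):

case $env in
  m|d|t) echo "$lg01conf" >> $NEWFILE;;
  i|p|s) echo "$lg02conf" >> $NEWFILE;;
  *) echo "Unknown environment; Manually configure"
esac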
I can't see what's wrong with your code. Maybe add another ;; to the default clause. To find the problem, add set -vx as the first line; it will show you lots of debug information.
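A minimal sketch of that kind of debugging, reusing the variable names from the script above; printf %q makes any stray whitespace or carriage return in the extracted character visible:

set -vx                                    # print each line as read and each command as expanded
host=$(hostname -s)
env=${host:(-8):1}
printf 'host=%q env=%q\n' "$host" "$env"   # %q exposes hidden characters such as \r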

What gems do you recommend to use for this kind of automation?

I have to create a script to manage a maintenance-page server for my hosting company.
I will need to build a CLI interface that would act like this (example scenario).
(Here, let's suppose that mcli is the name of the script and 1.1.1.1 the original server address that hosts the website, www.exemple.com.)
Here I just create the loopback interface on the maintenance server with the original IP address and create the Nginx site-specific config file in sites-enabled:
$ mcli register www.exemple.com 1.1.1.1
[DEBUG] Adding IP 1.1.1.1 to new loopback interface lo:001001001001
[WARNING] No root directory specified, setting default maintenance page.
[DEBUG] Registering www.exemple.com maintenance page and reloading Nginx: OK
Then, when I want to enable the maintenance page and completely shut down the website:
$ mcli maintenance www.exemple.com
[DEBUG] Connecting to router with SSH: OK
[DEBUG] Setting new route to 1.1.1.1 to maintenance server: OK
[DEBUG] Writing configuration: Ok
Then removing the maintenance page:
$ mcli nomaintenance www.exemple.com
[DEBUG] Connecting to router with SSH: OK
[DEBUG] Removing route to 1.1.1.1: Ok
[DEBUG] Writing configuration: Ok
And I would need a function to see the current states of the websites:
$ mcli list
+------------------+-----------------+------------------+
| Site Name | Server I.P | Maintenance mode |
+------------------+-----------------+------------------+
| www.example.com | 1.1.1.1 | Enabled |
| www.example.org | 1.1.1.2 | Disabled |
+------------------+-----------------+------------------+
$ mcli show www.example.org
Site Name: www.example.org
Server I.P: 1.1.1.1
Maintenance Mode: Enabled
Root Directory : /var/www/maintenance/default/
But I have never done this kind of scripting with Ruby. What gems do you recommend for this kind of thing? For command-line parsing? Column/colorized output? SSH connections (needed to connect to Cisco routers)?
Do you recommend using a local database (SQLite) to store metadata (state changes, current states), or do you recommend computing it on the fly by analyzing the nginx/interfaces configuration files and using syslog to monitor changes made with this script?
This script will be used at first for a massive physical datacenter migration, and afterwards for standard use during scheduled downtimes.
Thank you.
First of all, I'd recommend you get a copy of Build awesome command-line applications in Ruby.
That said, you might want to check:
• GLI: command-line parsing like git
• OptionParser: command-line parsing
Personally, I'd go for the SQLite approach for storing data, but I'm biased (having a strong SQL background).
Thor is a good gem for handling CLI options. It allows this type of organization in your script:
class Maintenance < Thor
  desc "maintenance", "put up maintenance page"
  method_option :switch, :aliases => '-s', :type => :string

  # The method name is the name of the task that would be run => mcli maintenance
  def maintenance
    # do stuff
  end

  no_tasks do
    # methods that you don't want cli tasks for go here
  end
end

Maintenance.start
I don't really have any good suggestions for column/colorized output, though.
I definitely recommend using some kind of database to store state. Maybe not SQLite; I would probably opt for a Redis database that stores key/value pairs with the information you are looking for.
We have a similar task. I use the following architecture:
• A small application (in C) that generates the config file.
• A new update_clusters switch added to the nginx init.d script; it restarts nginx only if the config file has changed:
update_clusters() {
  ${CONF_GEN} --outfile=/tmp/nginx_clusters.conf
  RETVAL=$?
  if [[ "$RETVAL" != "0" ]]; then
    return 5
  fi
  if ! diff ${CLUSTER_CONF_FILE} /tmp/nginx_clusters.conf > /dev/null; then
    echo "Cluster configuration changed. Reload service"
    mv -f /tmp/nginx_clusters.conf ${CLUSTER_CONF_FILE}
    reload
  fi
}
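Hypothetical usage, assuming the switch is wired into the init script's case statement:

/etc/init.d/nginx update_clusters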
• A set of bash scripts to add records to the database.
• A web console to add/modify/delete records in the database (ExtJS + nginx module).
