run.as option does not work for any user other than the nifi user - apache-nifi

I want to run my NiFi application as ec2-user rather than the default nifi user. I changed run.as=ec2-user in bootstrap.conf, but it did not work. It will not let me start the NiFi application; I get the following error while starting the nifi service:
./nifi.sh start
nifi.sh: JAVA_HOME not set; results may vary
Java home:
NiFi home: /opt/nifi/current
Bootstrap Config File: /opt/nifi/current/conf/bootstrap.conf
User Runnug Nifi Application : sudo -u ec2-user
Error: Could not find or load main class org.apache.nifi.bootstrap.RunNiFi
Any pointers on this issue?

This is most likely a file permission problem, which is not covered by installing the service with nifi.sh install. A summary of the required permissions includes:
Read access to the entire distribution in the NIFI_HOME directory
Write access to the NIFI_HOME directory itself - NiFi will create a number of directories and files at runtime including logs, work, state, and various repositories.
Write access to the bin directory
Write access to the conf directory
Write access to the lib directory, and to all of the files in the lib directory
It is certainly possible to narrow the permissions by creating the working directories manually, and by adjusting NiFi's settings to rearrange the directory layout. But the permissions above should get you started.
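As a minimal sketch of granting those permissions (assuming NiFi is installed under /opt/nifi/current as shown in the output above and that ec2-user should own the whole tree; the JAVA_HOME path is only an example, added because of the "JAVA_HOME not set" warning in the log):
sudo chown -R ec2-user:ec2-user /opt/nifi/current   # gives ec2-user the read/write access listed above
export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk    # example path; silences the "JAVA_HOME not set" warning
# with run.as=ec2-user left in conf/bootstrap.conf
sudo ./nifi.sh start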

Related

spring.flyway.locations property is unable to read files from specified location in kubernetes

I am using Flyway scripts to migrate the database.
In my project setup I have the dev env scripts in the default location: /db/migration/
There is also a specific location (under root) for the pre-prod (pp) scripts: /my_app/flyway_pp/
application-preprod.properties
spring.flyway.baseline-on-migrate = true
spring.flyway.locations=filesystem:/my_app/flyway-pp
My application runs as an image in a Kubernetes cluster. I have tried both classpath: and filesystem:, but nothing is getting picked up from the flyway_pp folder; migrations are always picked up from the default location /db/migration/ instead.
My suspicion is that in the Kubernetes environment it is not able to resolve the path correctly.
What should I do in this case?
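A quick way to test that suspicion, as a hedged sketch (the pod and container names are placeholders): a filesystem: location can only work if the SQL files actually exist at that path inside the running container, either baked into the image or mounted as a volume. Note also that the question shows both /my_app/flyway_pp and /my_app/flyway-pp; the configured location has to match the real directory name.
# Placeholders: replace my-app-pod / my-app with the real pod and container names
kubectl exec -it my-app-pod -c my-app -- ls -l /my_app/flyway-pp
kubectl exec -it my-app-pod -c my-app -- ls -l /my_app/flyway_pp
# If both listings are empty or missing, the scripts never made it into the container,
# and Flyway can only see what is on the classpath (/db/migration).
# Also confirm the preprod profile is active so application-preprod.properties is actually loaded.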

Unable to save output from Rscripts in system directory using Devops Pipeline

I am running R scripts on a self-hosted DevOps agent. My Windows agent is able to access the system directory where it is hosted. Below is the directory structure for my code:
Agent location: F:/agent
Source code: F:/agent/deployment/projects/project1/sourcecode
DWH dump: F:/agent/deployment/DWH_dump/2021/
Output location: F:/agent/deployment/projects/project1/output_data/2021
The agent is using CMD in the devops pipeline to trigger R from the system and use the libraries from the system directory.
Problem statement: I am unable to save the output from my R script into the output location directory. It gives an error pointing to that directory, with "permission denied" as the probable reason.
Output file format: file_name.rds, but the same issue happens even for a CSV file.
Command leading to failure: saveRDS(object, file = paste0(output_location, "/", "file_name.rds"))
Workaround: I found a workaround: save the files to the source code directory first and then save the same files to the output location directory. This works perfectly fine but costs me 2 extra hours of run time, because I have to keep all intermediary files and delete them at the end. Keeping the intermediary files in memory eats up my RAM.
I have not opened that directory anywhere on the machine; the only application open is the browser where the pipeline is running. I spent hours trying to figure out the reason, with no success. I even checked the system PATH to see whether that directory is mentioned there, and it is not.
When I run the same script directly on the machine using RStudio, I have no issues saving the file to any directory.
I have spent 2 full days on this already. Any pointers to the root cause could save me a few hours of runtime.
The solution was to set the Azure Pipelines agent service in Windows to run with admin credentials. The agent was not configured as an admin during creation, so after reconfiguring it with my user ID, which has admin access on the VM, the pipelines were able to save files without any trouble.
Feels great, saved a few hours of run time!
I was able to achieve this by following this post.
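As a hedged sketch of that change (the service name and account below are placeholders; the Azure Pipelines agent registers a Windows service whose exact name can be checked in services.msc), the service logon account can be switched from an elevated command prompt:
REM Placeholders: replace the service name and account with the real values from services.msc
sc config "vstsagent.myorg.mypool.myagent" obj= "MYVM\adminuser" password= "********"
sc stop "vstsagent.myorg.mypool.myagent"
sc start "vstsagent.myorg.mypool.myagent"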

The installation of Workload Scheduler agent component fails during start up

The installation of Workload Scheduler agent component fails during the step Start up IBM Workload Scheduler showing the following message:
tebctl-tws_cpa_agent_agt94 agent not installed properly
I suggest you check that the installation directory has the proper permissions set.
If you are installing in the /opt/IBM directory and its permissions are set to 750, change the permissions to 755.
This problem can arise for one of the following three reasons:
1) your TWS user "agt94" is not able to read the file "ita.ini" located in the "TWS_HOME/ITA/cpa/ita" folder;
2) your TWS user "agt94" does not have the needed execute permission on the file "agent.sh" located in the "TWS_HOME/ITA/cpa/ita" folder;
3) your TWS user "agt94" does not have the needed execute permission on the file "agent" located in the "TWS_HOME/ITA/cpa/ita" folder.
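A minimal sketch of checking and fixing those three items (assuming TWS_HOME points at the agent installation directory and the commands are run as root; whether access should be granted through the owner or the group depends on how agt94 is set up):
ls -ld /opt/IBM                           # should be 755, not 750
ls -l "$TWS_HOME/ITA/cpa/ita/ita.ini"     # agt94 needs read access
ls -l "$TWS_HOME/ITA/cpa/ita/agent.sh"    # agt94 needs execute permission
ls -l "$TWS_HOME/ITA/cpa/ita/agent"       # agt94 needs execute permission
chmod 755 /opt/IBM
chmod a+r "$TWS_HOME/ITA/cpa/ita/ita.ini"
chmod a+x "$TWS_HOME/ITA/cpa/ita/agent.sh" "$TWS_HOME/ITA/cpa/ita/agent"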

How to use Ambari service to deploy a jar on all hadoop nodes?

I have a requirement where I want to deploy a jar file to a particular location on all Hadoop cluster nodes using the Ambari server. For that purpose I think I can use the custom service feature.
So I created a sample service and was able to deploy it as a client or slave on all nodes.
I added a new folder named Testservice inside /var/lib/ambari-server/resources/stacks/HDP/2.2/services/ and it has the following files/directories:
[machine]# cd /var/lib/ambari-server/resources/stacks/HDP/2.2/services/Testservice^C
[machine]#
[machine]# pwd
/var/lib/ambari-server/resources/stacks/HDP/2.2/services/Testservice
[machine]# ls
configuration metainfo.xml package
[machine]# ls package/*
package/archive.zip
package/files:
filesmaster.py test1.jar
package/scripts:
test_client.py
[machine]#
With this, my service is added and installed on all nodes. On each node, a corresponding directory "/var/lib/ambari-agent/cache/stacks/HDP/2.2/services/Testservice" is created with the same file structure as mentioned above. As of now the test_client.py script has no code at all, just dummy implementations of the install and configure functions.
So here I want to add code such that package/files/test1.jar is copied on each host to a defined destination location, say the /lib folder.
I need help on this point. How can I make use of the test_client.py script? How can I write generic code to copy my jar file?
test_client.py has an install method as shown below:
class TestClient(Script):
    def install(self, env):
        pass  # dummy implementation for now
I need more details on how the env variable can be used to get all the required base paths for the Ambari service directory and the Hadoop install base paths.
You are correct in thinking that you can use a Custom Ambari Service to ensure a file is present on various nodes in your cluster. Your custom service should have a CLIENT component which handles laying down the files you need on various hosts in the cluster. It should be a client component because it has no running processes.
However, using the files folder is not the correct approach to distribute the file you have (test1.jar). All the Ambari services rely on linux packages to install the necessary files on the system. So what you should be doing is creating a software package that takes care of laying down that lib file to the correct location on disk. This could be an rpm and/or deb file depending on what OSs you are planning to support. Once you have the software package you can accomplish your goal by modifying two files you already have outlined above.
metainfo.xml - You will list the necessary software packages required for your service to function correctly. For example if you were planning on supporting RHEL6 and RHEL7 you would create an rpm package named my_package_name and include it with this code:
<osSpecifics>
  <osSpecific>
    <osFamily>redhat6,redhat7</osFamily>
    <packages>
      <package>
        <name>my_package_name</name>
      </package>
    </packages>
  </osSpecific>
</osSpecifics>
test_client.py - You will need to replace the starter code you have in your question with:
class TestClient(Script):
    def install(self, env):
        self.install_packages(env)
The self.install_packages(env) call will ensure that the packages you have listed in metainfo.xml file get installed when your custom service CLIENT component is installed.
Note: Your software package (rpm, deb, etc.) will have to be hosted in an online repository in order for Ambari to access it and install it. You could create a local repository on the node running Ambari Server using httpd and createrepo. This process can be gleaned from the HDP Documentation.
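As a hedged sketch of such a local repository on the Ambari Server node (the package and directory names are placeholders, and the commands assume a RHEL/CentOS 7 host; on RHEL6 use service httpd start instead of systemctl):
sudo yum install -y httpd createrepo
sudo mkdir -p /var/www/html/myrepo
sudo cp my_package_name-1.0-1.noarch.rpm /var/www/html/myrepo/   # placeholder rpm name
sudo createrepo /var/www/html/myrepo
sudo systemctl start httpd
# Each node then needs a .repo file pointing at http://<ambari-server-host>/myrepo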
Alternative approach (Not Recommended)
Now that I have explained the way it SHOULD be done, let me tell you how you can achieve this using the package/files folder. Again, this is not the recommended approach to installing software on a Linux system; the package management system for your distribution should be handling this.
test_client.py - Update your starter file to include the content below. For this example we will copy your test1.jar to the /lib folder with file permissions 0644, owner 'guest', and group 'hadoop':
def configure(self, env):
    # Add this method inside the TestClient(Script) class; it lays down
    # package/files/test1.jar from the service's files folder into /lib on the host.
    File("/lib/test1.jar",
         mode=0644,
         group="hadoop",
         owner="guest",
         content=StaticFile("test1.jar")
    )
Why is this approach not recommended? Installing software on a Linux distribution should be managed so that it is easy to upgrade and remove. Ambari does not have full uninstall functionality for its services: the most you can do is remove a service from being managed in your Ambari cluster, and after doing so all of those files will remain on the system and would have to be removed with a custom script or manually. If you had used package management to install the files, you could easily remove the software with the same package management system.

How do I change the location of httpd.conf for Apache on Windows?

I am working on setting up a load-balancing cluster on Windows Server 2012 and have a shared drive where I want the Apache configuration files to live, so that each member of the LB can load the exact same config files. How do I change where the config file is located independently of where the ServerRoot is?
Start the Apache process with the -d parameter and give your alternative ServerRoot as an argument, though I'd imagine it would be a much better idea for you to use some mechanism to sync the files locally to each server.
Also read http://httpd.apache.org/docs/2.4/mod/core.html#mutex, as it's advised if you're running from a networked file system.
If you just want to specify the main config file, start the process with the -f parameter and the path to the config file as an argument.
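A minimal sketch of the two invocations (the drive letter and paths are placeholders):
REM Option 1: use an alternative ServerRoot; conf\httpd.conf is then resolved relative to it
httpd.exe -d "Z:\apache"
REM Option 2: keep the ServerRoot but load a specific configuration file, e.g. from the shared drive
httpd.exe -f "Z:\apache\conf\httpd.conf"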
