IBM Cloud: How to open Analytics Engine port 7070?

I want to use big data services on IBM Cloud, so I found Analytics Engine (AE) and BigInsights. Unfortunately, BigInsights is being discontinued, so I can only choose AE. However, IBM AE is different from the AWS and GCP big data services: AE prohibits users from having root permissions, so I cannot change some configurations on the clusters. I want to install Kylin on the cluster, and I need to open Kylin's port 7070. Later, I found that Knox can map ports, but it looks like IBM changed it. So how do I open port 7070 for external access? Can I get root permissions? Or is there any other big data service besides Analytics Engine and BigInsights?

The Analytics Engine service on IBM Cloud can be customized, and additional services and packages can be installed. However, there are restrictions; as you pointed out, these include that you cannot have root permissions or open ports, for security reasons.
To be entirely flexible and have root permissions, you could deploy a Hadoop cluster to a Kubernetes or OpenShift cluster, or run it on virtual machines, where exposing a port like 7070 is under your control.
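If you go the Kubernetes route, opening Kylin's port becomes an ordinary Service operation. As a minimal sketch, assuming a hypothetical Deployment named kylin that already listens on 7070 (the names are placeholders, not anything IBM-specific):
kubectl expose deployment kylin --name=kylin-web --type=LoadBalancer --port=7070 --target-port=7070
kubectl get service kylin-web    # shows the external IP once the load balancer is provisioned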

Related

Show Windows and Linux Server firewall data in an Azure solution

This is Yaseen Zafar, DevOps Engineer at Integrated Dealer Systems. We have multiple customers whose servers are hosted at multiple locations across Canada and the United States. They are hosted on premises (i.e. they are not currently on Azure), though we are currently using Microsoft Azure Log Analytics to get some insights into the Windows and Linux servers. So far it has been a very good experience.
I wanted to know if there is any solution available on Azure that can show me firewall-related logs, rules, and IP and port details ingested from the Windows and Linux servers that are hosted on premises.
Best Regards.
Yaseen Zafar
• Yes, there is a way you can forward your on-premises firewall logs to an Azure Log Analytics workspace, since almost every firewall device has syslog functionality built in for forwarding logs to a log management server on a specific port. Thus, on-premises firewall logs, which include all the data collected about traffic passing inbound and outbound through the environment, can be forwarded to a Linux virtual machine and from there to Azure Log Analytics.
• Syslog is the cross-platform equivalent of the Windows Event Log, and it can be leveraged by forwarding syslog messages to Azure Log Analytics through a Linux machine. This Linux system should be deployed as a virtual appliance (VM), either on-premises or in the Azure cloud, so that the syslog-generating firewalls can communicate directly with it. The Linux forwarder can be on-premises, physically near the firewall, or it can be in Azure or another cloud, connected to your firewall by an IPsec tunnel. The Linux machine runs a Log Analytics agent configured to communicate with your Log Analytics workspace.
• Once your firewall is connected to Azure Log Analytics you should create a custom dashboard solution that suits your needs. You will have excellent visibility and gain a lot of insight into your firewall operation by studying the collected and indexed syslog data in the Log search feature of the Azure portal. You will notice which types of data your firewall is delivering and learn what to monitor to meet your business and security needs.
Please find the links below for more information on how to configure the Linux virtual machine as a syslog forwarder and how to implement the above-stated solution as a whole:
https://blog.johnjoyner.net/connect-your-firewall-to-azure-log-analytics-for-security-insights/
https://accountabilit.com/azure-log-analytics-best-syslog-destination/
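As a rough illustration of the forwarder piece from the links above, a minimal rsyslog sketch on the Linux VM could look like this (the file name and ports are assumptions to verify against your agent's documentation; 25224 is the usual local syslog port of the Log Analytics agent for Linux at the time of writing):
# /etc/rsyslog.d/10-firewall.conf (hypothetical file name)
module(load="imudp")              # accept syslog from the firewalls over UDP
input(type="imudp" port="514")
*.* @127.0.0.1:25224              # hand everything to the local Log Analytics agent
sudo systemctl restart rsyslog    # apply the change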

How to configure JDBC for Cloud Data Fusion to connect to MySQL installed on localhost:3306

I'm trying to connect my local standalone MySQL with Cloud Fusion to create and test a data pipeline. I have deployed the driver successfully.
Also, I have configured the pipeline properties with the correct values for the JDBC string, user name, and password, but connectivity isn't getting established.
Connection String: jdbc:mysql://localhost:3306/test_database
I have also tried to test the connectivity via the data wrangling option, but that is also failing.
Do I need to bring both the environments under same network by setting up some VPC and tunneling?
In your example, I see that you specified localhost in your connection string. localhost only resolves for services running locally on the same machine, so Cloud Data Fusion (running in GCP) will not be able to reach the MySQL instance (running on your machine). Hence you're seeing the connectivity issue.
I highly recommend looking at this answer on SO that will help you setup a quick proof-of-concept.
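In other words, the connection string has to point at an address Cloud Data Fusion can actually reach, for example a publicly reachable host or one available over a VPN (the IP below is a placeholder):
Connection String: jdbc:mysql://203.0.113.10:3306/test_database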
I think your question is more about how to connect an on-premises environment to GCP's networking model, which groups Google Cloud instances and resources through VPCs.
Given that GCP offers several connection methods for hybrid cloud setups, I would encourage you to learn the fundamental principles of Cloud VPN, which is the essential part of establishing a secure connection between a particular VPN peer gateway and a Cloud VPN gateway, and then creating a VPN tunnel between the parties.
There is also a dedicated chapter in the GCP documentation about Data Fusion VPC peering that might be helpful in your use case.
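For orientation only, a Classic VPN setup along those lines might look like the gcloud sketch below; every name, region, address, and IP range is a placeholder, and the Cloud VPN documentation remains the authoritative reference:
# reserve a public IP and create the Cloud VPN gateway
gcloud compute addresses create vpn-ip --region=us-central1
gcloud compute target-vpn-gateways create onprem-gw --network=default --region=us-central1
# forwarding rules for IPsec traffic (ESP, IKE on UDP 500/4500)
gcloud compute forwarding-rules create fr-esp --region=us-central1 --ip-protocol=ESP --address=vpn-ip --target-vpn-gateway=onprem-gw
gcloud compute forwarding-rules create fr-udp500 --region=us-central1 --ip-protocol=UDP --ports=500 --address=vpn-ip --target-vpn-gateway=onprem-gw
gcloud compute forwarding-rules create fr-udp4500 --region=us-central1 --ip-protocol=UDP --ports=4500 --address=vpn-ip --target-vpn-gateway=onprem-gw
# the tunnel to the on-premises VPN peer, plus a route to the on-premises range
gcloud compute vpn-tunnels create onprem-tunnel --region=us-central1 --peer-address=203.0.113.10 --shared-secret=CHANGE_ME --ike-version=2 --local-traffic-selector=0.0.0.0/0 --remote-traffic-selector=192.168.0.0/24 --target-vpn-gateway=onprem-gw
gcloud compute routes create to-onprem --network=default --destination-range=192.168.0.0/24 --next-hop-vpn-tunnel=onprem-tunnel --next-hop-vpn-tunnel-region=us-central1
Once the tunnel is up, the JDBC string would use the MySQL host's on-premises IP (e.g. jdbc:mysql://192.168.0.5:3306/test_database) instead of localhost.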

Automating Cloudera Management Services

I am using Cloudera Express. The Cloudera Manager version is 5.12.0. I am trying to automate the bring-up of services like HDFS, HBase, etc. I am able to do so by specifying the necessary information for each service in a host template and pushing the host template to Cloudera Manager with a curl command that uses the Cloudera Manager API. Now I want to automate the bring-up of the Cloudera Management Services: Host Monitor, Service Monitor, Event Server, Activity Monitor, and Alert Publisher. I tried adding the corresponding role types and service types of each service to the host template, but when I push the host template to Cloudera Manager using curl, Cloudera Manager shows an error that it could not find service type 'MGMT' with version CDH 5.12.0. Since the management services are different from cluster services like HDFS, YARN, and HBase, how should I automate their bring-up? Is there a dedicated API to automate the Management Services?
Unfortunately, the host template only applies to clusters, not to Cloudera Manager itself. To configure CM, look at:
https://cloudera.github.io/cm_api/
https://cloudera.github.io/cm_api/apidocs/v17/index.html (in particular the /cm/service/* APIs)
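A hedged sketch of that flow against the v17 API (credentials, the CM host, and the hostId values are placeholders; on CM 5.x the hostId is the identifier returned by GET /api/v17/hosts, which is not necessarily the hostname, and the exact request body shape is documented under /cm/service in the API docs linked above):
# set up the Cloudera Management Service with its roles in one PUT
curl -u admin:admin -X PUT -H "Content-Type: application/json" \
  -d '{"roles": [
        {"type": "SERVICEMONITOR", "hostRef": {"hostId": "HOST_ID_1"}},
        {"type": "HOSTMONITOR",    "hostRef": {"hostId": "HOST_ID_1"}},
        {"type": "EVENTSERVER",    "hostRef": {"hostId": "HOST_ID_1"}},
        {"type": "ALERTPUBLISHER", "hostRef": {"hostId": "HOST_ID_1"}}
      ]}' \
  http://cm-host:7180/api/v17/cm/service
# then start it
curl -u admin:admin -X POST http://cm-host:7180/api/v17/cm/service/commands/start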

Jelastic Tomcat 8 access to storage

I am evaluating Jelastic for use with Tomcat 8 and Postgres 9.5.
Does a user have ssh access to the instance that is running these services?
Does Tomcat have access to the local storage, or can you attach storage on which Tomcat can create and read files?
Does a user have ssh access to the instance that is running these services?
Yes, a user has SSH access to any instance. The authentication procedure in the Jelastic SSH Gateway is divided into two independent parts:
connection from end user to Gateway (external authentication)
connection from Gateway to users’ container (internal authentication)
Both parts of the authentication procedure are based on a standard SSH protocol, using public/private keypairs.
With Jelastic SSH Gateway, you can easily access:
the whole account where you can navigate across your environments and containers using an interactive menu without extra authentication
separate containers directly while working with them remotely via additional tools (e.g. Capistrano) or using SFTP and FISH protocols.
While accessing containers via SSH, a user receives all required permissions and additionally can manage the main services with sudo commands of the following kind (and others):
sudo /etc/init.d/jetty start
sudo /etc/init.d/mysql stop
sudo /etc/init.d/tomcat restart
sudo /etc/init.d/memcached status
sudo /etc/init.d/mongod reload
sudo /etc/init.d/nginx upgrade
sudo /etc/init.d/httpd help
Using our documentation you'll find out how to:
generate an SSH key
add an SSH key
access environments and containers
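For illustration, a gateway connection looks roughly like this (the connection string below is a placeholder following the pattern from the docs; your dashboard shows the exact one for your account):
ssh 12345-678@gate.jelastic.com -p 3022    # opens the interactive menu over your environments and containers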
Does Tomcat have access to the local storage, or can you attach storage on which Tomcat can create and read files?
Jelastic supports both local storage and a dedicated storage container.
Jelastic Dedicated Storage Container is a special type of node, based on the Docker CentOS 7 image. Having been developed specifically for data storage, it provides a number of benefits:
it is delivered with the corresponding software (i.e. NFS & RPC) already pre-installed, so such a container can be used as storage immediately after creation, without any additional configuration
compared to other general-purpose Jelastic nodes, the Dedicated Storage Container provides an enlarged amount of disk space, which allows it to persist comparatively larger data volumes (the particular value depends on your service provider's settings and can vary according to your account type)
Tips on using this container type, and examples of how it can best be leveraged, are covered within the corresponding use case description.
And below we'll consider how to set up such a storage server inside your cloud, along with some tips on its management:
Storage container creation
Storage container management
If you don't have root permissions, please contact your hosting provider.
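Under the hood, the storage container shares its directories over NFS, so from a client node the attach step is essentially a mount. A rough sketch (the IP and paths are hypothetical):
sudo mount -t nfs 10.100.1.1:/data /mnt/data    # mount the storage container's NFS export
# files Tomcat writes under /mnt/data now live on the dedicated storage node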
Whether applications you run on Tomcat have access to storage on the running system depends on several things; there are layers of security. Ultimately, Tomcat has access to whatever the user you run it under has access to. That's true in both Windows and Linux environments.
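A quick way to verify that on Linux (assuming the service user is called tomcat and /mnt/data is the path of interest; both are placeholders):
ps -o user= -p "$(pgrep -f catalina)"                   # which user owns the Tomcat process
sudo -u tomcat test -w /mnt/data && echo "writable"     # can that user write to the path?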

Session replication in Glassfish Cluster on EC2

I've built a cluster on Glastfish administered via SSH, with 2 instances. I deployed an application that shows the session ID.
This application has in its web.xml:
<distributable/>
And in the sun-web.xml:
<session-config>
    <cookie-properties>
        <property name="cookieDomain" value="compute.amazonaws.com"/>
    </cookie-properties>
</session-config>
I enabled "Availability" in Edit Application.
But when I access the web app on the two instances, I see different session IDs.
Can anyone help me?
EDIT: As some users noticed, multicast is not supported in EC2. A solution comes with Glassfish v3.1.2, which allows two other ways to discover a cluster when multicast is not permitted (by listing instance IPs or by having it auto-generate the list). How to start a cluster in a non-multicast environment is specified here: Administering Glassfish Server Clusters
Read the High Availability Administration Guide for v3.1.2, specifically the section "Discovering a Cluster When Multicast Transport Is Unavailable". Haven't tried it yet, but looking forward. Cheers!
The first thing to try would be to validate whether multicast works in your setup, using the asadmin command below.
asadmin validate-multicast
You can check out this short YouTube video on how to do that:
http://www.youtube.com/watch?v=sJTDao9OpWA
In case multicast does not work, you may want to try the non-multicast option supported in the recent GlassFish 3.1.2 release.
The release notes say that it supports non-multicast clustering:
New Support for non-Multicast clustering. GlassFish High-Availability clustering is now possible in environments where multicast is disabled.
I was not able to find any documentation that provides steps for setting up a non-multicast cluster. There may be one for enterprise support customers.
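For reference, the property-based approach described in the v3.1.2 High Availability Administration Guide (see the answer above) looks roughly like this; the cluster name, hosts, and ports are placeholders, and the property names should be verified against your version's documentation:
asadmin set "clusters.cluster.mycluster.property.GMS_DISCOVERY_URI_LIST=generate"
# ...or enumerate the instance locations explicitly:
asadmin set "clusters.cluster.mycluster.property.GMS_DISCOVERY_URI_LIST=ec2-host1:9090,ec2-host2:9090"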
