Hbase MuleSoft Cloudhub Connectivity - hadoop

I have to connect Cloudhub to Hbase. I have trid from community edition HBase connector but not succeeded. Then I tried with Java Code and again failed. From HBase Team, they have given only Master IP (10.99.X.X) and Port(2181) and userName (hadoop).
I have tried with following options:
Through Java Code:
public Object transformMessage(MuleMessage message, String outputEncoding) throws TransformerException {
try {
Configuration conf = HBaseConfiguration.create();
//conf.set("hbase.rotdir", "/hbase");
conf.set("hbase.zookeeper.quorum", "10.99.X.X");
conf.set("hbase.zookeeper.property.clientPort", "2181");
conf.set("hbase.client.retries.number", "3");
logger.info("############# Config Created ##########");
// Create a get api for consignment table
logger.info("############# Starting Consignment Test ##########");
// read from table
// Creating a HTable instance
HTable table = new HTable(conf, "consignment");
logger.info("############# HTable instance Created ##########");
// Create a Get object
Get get = new Get(Bytes.toBytes("6910358750"));
logger.info("############# RowKey Created ##########");
// Set column family to be queried
get.addFamily(Bytes.toBytes("consignment_detail"));
logger.info("############# CF Created ##########");
// Perform get and capture result in a iterable
Result result = table.get(get);
logger.info("############# Result Created ##########");
// Print consignment data
logger.info(result);
logger.info(" #### Ending Consignment Test ###");
// Begining Consignment Item Scanner api
logger.info("############# Starting Consignmentitem test ##########");
HTable table1 = new HTable(conf, "consignmentitem");
logger.info("############# HTable instance Created ##########");
// Create a scan object with start rowkey and end rowkey (partial
// row key scan)
// actual rowkey design: <consignment_id>-<trn>-<orderline>
Scan scan = new Scan(Bytes.toBytes("6910358750"),Bytes.toBytes("6910358751"));
logger.info("############# Partial RowKeys Created ##########");
// Perform a scan using start and stop rowkeys
ResultScanner scanner = table1.getScanner(scan);
// Iterate over result and print them
for (Result result1 = scanner.next(); result1 != null; result1 = scanner.next()) {
logger.info("Printing Records\n");
logger.info(result1);
}
return scanner;
} catch (MasterNotRunningException e) {
logger.error("HBase connection failed! --> MasterNotRunningException");
logger.error(e);
} catch (ZooKeeperConnectionException e) {
logger.error("Zookeeper connection failed! -->ZooKeeperConnectionException");
logger.error(e);
} catch (Exception e) {
logger.error("Main Exception Found! -- Exception");
logger.error(e);
}
return "Not Connected";
}
Above Code giving below Error
java.net.UnknownHostException: unknown host: ip-10-99-X-X.ap-southeast-2.compute.internal
It Seems that CloudHub is not able to find host name because cloudHub is not configured with DNS
When I tried with Community Edition HBase Connector it is giving following Exception:
org.apache.hadoop.hbase.MasterNotRunningException: Retried 3 times
Please suggest some way...
Rgeards
Nilesh
Email: bit.nilesh.kumar#gmail.com

It appears that you are configuring your client to try to connect to the zookeeper quorum at a private IP address (10.99.X.X). I'll assume you've already set up a VPC, which is required for your CloudHub worker to connect to your private network.
Your UnknownHostException implies that the HBase server you are connecting to is hosted on AWS, which defines private domain names similar to the one in the error message.
So what might be happening is this:
Mule connects to Zookeeper, asks what HBase nodes there are, and gets back ip-10-99-X-X.ap-southeast-2.compute.internal.
Mule tries to connect to that to find the HTable "consignment", but can't resolve an IP address for that name.
Unfortunately, if this is what's going on, it will take some networking changes to fix it. The FAQ in the VPC discovery form says this about private DNS:
Currently we don't have the ability to relay DNS queries to internal DNS servers. You would need to either use IP addresses or public DNS entries. Beware of connecting to systems which may redirect to a Virtual IP endpoint by using an internal DNS entry.
You could use public DNS and possibly an Elastic IP to get around this problem, but that would require you to expose your HBase cluster to the internet.

I believe the answer of your question is covered in the cloudhub networking guide.
https://developer.mulesoft.com/docs/display/current/CloudHub+Networking+Guide

Related

Unable to create a client-server connection in ignite

I have successfully created a connection between client and server(localhost) in ignite.But while trying to connect the ignite server which is running in remote IP(eg: 192.168.33.44), I am not able to establish connection. The client side configuration given below.
#Bean(name = "igniteConfiguration")
public IgniteConfiguration igniteConfiguration() {
IgniteConfiguration igniteConfiguration = new IgniteConfiguration();
igniteConfiguration.setClientMode(true);
igniteConfiguration.setPeerClassLoadingEnabled(true);
igniteConfiguration.setLocalHost("127.0.0.1");
TcpDiscoverySpi tcpDiscoverySpi = new TcpDiscoverySpi();
TcpDiscoveryMulticastIpFinder ipFinder = new TcpDiscoveryMulticastIpFinder();
ipFinder.setAddresses(Collections.singletonList("127.0.0.1:47500..47509"));
tcpDiscoverySpi.setIpFinder(ipFinder);
tcpDiscoverySpi.setLocalPort(47500);
// Changing local port range. This is an optional action.
tcpDiscoverySpi.setLocalPortRange(9);
tcpDiscoverySpi.setLocalAddress("localhost");
igniteConfiguration.setDiscoverySpi(tcpDiscoverySpi);
TcpCommunicationSpi communicationSpi = new TcpCommunicationSpi();
communicationSpi.setLocalAddress("localhost");
communicationSpi.setLocalPort(48100);
communicationSpi.setSlowClientQueueLimit(1000);
igniteConfiguration.setCommunicationSpi(communicationSpi);
igniteConfiguration.setCacheConfiguration(cacheConfiguration());
return igniteConfiguration;
}
Can anyone help me to make code change for creating a successful client-server connecion.Thanks in advance.
Since you are moving from localhost deployment, you need to do the following changes:
TcpDiscoveryMulticastIpFinder ipFinder = new TcpDiscoveryMulticastIpFinder();
ipFinder.setAddresses(Collections.singletonList("192.168.33.44:47500..47509"));
Most likely, the server configurations need to be changed as well.

Query Oracle using .net core throws exception for Connect Identifier

I'm quite new to Oracle and never used that before, now, I'm trying to query an Oracle database from a .Net Core Web Application having nuget package oracle.manageddataaccess.core installed and using an alias as the Data Source but I'm receiving the following error:
If I use the full connection string the query will work correctly
ORA-12154: TNS:could not resolve the connect identifier specified
at OracleInternal.Network.AddressResolution..ctor(String TNSAlias, SqlNetOraConfig SNOConfig, Hashtable ObTnsHT, String instanceName, ConnectionOption CO)
at OracleInternal.Network.OracleCommunication.Resolve(String tnsAlias, ConnectionOption& CO)
at OracleInternal.ConnectionPool.PoolManager`3.ResolveTnsAlias(ConnectionString cs, Object OC)
at OracleInternal.ServiceObjects.OracleConnectionImpl.Connect(ConnectionString cs, Boolean bOpenEndUserSession, OracleConnection connRefForCriteria, String instanceName)
So, from a few links I could understand that there is a tsnnames.ora file which must contain the map between connect identifiers and connect descriptors. And that this file can be found at the machine on which the Oracle has been installed with the path ORACLE_HOME\network\admin.
Question is:
Does the alias name that I'm using in my connection string which reads as Data Source: <alias_name>; User ID=<user>; Password=<password> need to be specified in the tsnnames.ora file? I don't have access to the machine where the Oracle database resides on otherwise I would have checked it earlier.
Here's the code snippet for more info: connection string and query values are mocked out
public static string Read()
{
const string connectionString = "Data Source=TestData;User ID=User1;Password=Pass1";
const string query = "select xyz from myTable";
string result;
using (var connection = new OracleConnection(connectionString))
{
try
{
connection.Open();
var command = new OracleCommand(query) { Connection = connection, CommandType = CommandType.Text };
OracleDataReader reader = command.ExecuteReader();
reader.Read();
result = reader.GetString(0);
}
catch (Exception exception)
{
Console.WriteLine(exception);
throw;
}
}
return result;
}
Is this enough or something else needs to be added/changed in here? Or probably the issue is with the tsnNames.ora file which might not contain the alias name here?
Thanks in advance
For the Data source
Data Source=TestData;User Id=myUsername;Password=myPassword;
Your tnsnames.ora probably should have the below entry
TestData=(DESCRIPTION=
(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=MyHost) (PORT=MyPort)))
(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=MyOracleSID)))
Since
Data Source=TestData;User Id=myUsername;Password=myPassword;
is same as
Data Source=(DESCRIPTION=(ADDRESS_LIST=(ADDRESS=(PROTOCOL=TCP)(HOST=MyHost) (PORT=MyPort)))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=MyOracleSID)));User Id=myUsername;Password=myPassword;

JDBC statement pooling with DB2 does not have significant time difference

I'm using JDBC db2 driver, a.k.a. JT400 to connect to db2 server on Application System/400, a midrange computer system.
My goal is to insert into three Tables, from outside of IBM mainframe, which would be cloud instance(eg. Amazon WS).
To make the performance better
1) I am using already established connections to connect to db2.
(using org.apache.commons.dbcp.BasicDataSource or com.ibm.as400.access.AS400JDBCManagedConnectionPoolDataSource, both are working fine.)
public class AS400JDBCManagedConnectionPoolDataSource extends AS400JDBCManagedDataSource implements ConnectionPoolDataSource, Referenceable, Serializable {
}
public class AS400JDBCManagedDataSource extends ToolboxWrapper implements DataSource, Referenceable, Serializable, Cloneable {
}
2) I want to cache the insert into statements for all three tables, so that I don't have to send query every time and compile every time, which is expensive. I would instead just pass the parameters only. (Obviously I am doing this using JDBC prepared statements)
Based on an official IBM document Optimize Access to DB2 for i5/OS
from Java and WebSphere, page 17-20 - Enabling Extended Dynamic Support, it's possible to cache the statement with AS400JDBCManagedConnectionPoolDataSource.
BUT, the problem is the insert into queries are being compiled each time, which is taking 200ms * 3 queries = 600ms each time.
Example I'm using,
public class CustomerOrderEventHandler extends MultiEventHandler {
private static Logger logger = LogManager.getLogger(CustomerOrderEventHandler.class);
//private BasicDataSource establishedConnections = new BasicDataSource();
//private DB2SimpleDataSource nativeEstablishedConnections = new DB2SimpleDataSource();
private AS400JDBCManagedConnectionPoolDataSource dynamicEstablishedConnections =
new AS400JDBCManagedConnectionPoolDataSource();
private State3 orderState3;
private State2 orderState2;
private State1 orderState1;
public CustomerOrderEventHandler() throws SQLException {
dynamicEstablishedConnections.setServerName(State.server);
dynamicEstablishedConnections.setDatabaseName(State.DATABASE);
dynamicEstablishedConnections.setUser(State.user);
dynamicEstablishedConnections.setPassword(State.password);
dynamicEstablishedConnections.setSavePasswordWhenSerialized(true);
dynamicEstablishedConnections.setPrompt(false);
dynamicEstablishedConnections.setMinPoolSize(3);
dynamicEstablishedConnections.setInitialPoolSize(5);
dynamicEstablishedConnections.setMaxPoolSize(50);
dynamicEstablishedConnections.setExtendedDynamic(true);
Connection connection = dynamicEstablishedConnections.getConnection();
connection.close();
}
public void onEvent(CustomerOrder orderEvent){
long start = System.currentTimeMillis();
Connection dbConnection = null;
try {
dbConnection = dynamicEstablishedConnections.getConnection();
long connectionSetupTime = System.currentTimeMillis() - start;
state3 = new State3(dbConnection);
state2 = new State2(dbConnection);
state1 = new State1(dbConnection);
long initialisation = System.currentTimeMillis() - start - connectionSetupTime;
int[] state3Result = state3.apply(orderEvent);
int[] state2Result = state2.apply(orderEvent);
long state1Result = state1.apply(orderEvent);
dbConnection.commit();
logger.info("eventId="+ getEventId(orderEvent) +
",connectionSetupTime=" + connectionSetupTime +
",queryPreCompilation=" + initialisation +
",insertionOnlyTimeTaken=" +
(System.currentTimeMillis() - (start + connectionSetupTime + initialisation)) +
",insertionTotalTimeTaken=" + (System.currentTimeMillis() - start));
} catch (SQLException e) {
logger.error("Error updating the order states.", e);
if(dbConnection != null) {
try {
dbConnection.rollback();
} catch (SQLException e1) {
logger.error("Error rolling back the state.", e1);
}
}
throw new CustomerOrderEventHandlerRuntimeException("Error updating the customer order states.", e);
}
}
private Long getEventId(CustomerOrder order) {
return Long.valueOf(order.getMessageHeader().getCorrelationId());
}
}
And the States with insert commands look like below,
import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
public class State2 extends State {
private static Logger logger = LogManager.getLogger(DetailState.class);
Connection connection;
PreparedStatement preparedStatement;
String detailsCompiledQuery = "INSERT INTO " + DATABASE + "." + getStateName() +
"(" + DetailState.EVENT_ID + ", " +
State2.ORDER_NUMBER + ", " +
State2.SKU_ID + ", " +
State2.SKU_ORDERED_QTY + ") VALUES(?, ?, ?, ?)";
public State2(Connection connection) throws SQLException {
this.connection = connection;
this.preparedStatement = this.connection.prepareStatement(detailsCompiledQuery); // this is taking ~200ms each time
this.preparedStatement.setPoolable(true); //might not be required, not sure
}
public int[] apply(CustomerOrder event) throws StateException {
event.getMessageBody().getDetails().forEach(detail -> {
try {
preparedStatement.setLong(1, getEventId(event));
preparedStatement.setString(2, getOrderNo(event));
preparedStatement.setInt(3, detail.getSkuId());
preparedStatement.setInt(4, detail.getQty());
preparedStatement.addBatch();
} catch (SQLException e) {
logger.error(e);
throw new StateException("Error setting up data", e);
}
});
long startedTime = System.currentTimeMillis();
int[] inserted = new int[0];
try {
inserted = preparedStatement.executeBatch();
} catch (SQLException e) {
throw new StateException("Error updating allocations data", e);
}
logger.info("eventId="+ getEventId(event) +
",state=details,insertionTimeTaken=" + (System.currentTimeMillis() - startedTime));
return inserted;
}
#Override
protected String getStateName() {
return properties.getProperty("state.order.details.name");
}
}
So the flow is each time an event is received(eg. CustomerOrder), it gets the establishedConnection and then asks the states to initialise their statements.
The metrics for timing look as below,
for the first event, it takes 580ms to create the preparedStatements for 3 tables.
{"timeMillis":1489982655836,"thread":"ScalaTest-run-running-CustomerOrderEventHandlerSpecs","level":"INFO","loggerName":"com.xyz.customerorder.events.handler.CustomerOrderEventHandler",
"message":"eventId=1489982654314,connectionSetupTime=1,queryPreCompilation=580,insertionOnlyTimeTaken=938,insertionTotalTimeTaken=1519","endOfBatch":false,"loggerFqcn":"org.apache.logging.log4j.spi.AbstractLogger","threadId":1,"threadPriority":5}
for the second event, takes 470ms to prepare the statements for 3 tables, which is less than the first event but just < 100ms, I assume it to be drastically less, as it should not even make it to compilation.
{"timeMillis":1489982667243,"thread":"ScalaTest-run-running-PurchaseOrderEventHandlerSpecs","level":"INFO","loggerName":"com.xyz.customerorder.events.handler.CustomerOrderEventHandler",
"message":"eventId=1489982665456,connectionSetupTime=0,queryPreCompilation=417,insertionOnlyTimeTaken=1363,insertionTotalTimeTaken=1780","endOfBatch":false,"loggerFqcn":"org.apache.logging.log4j.spi.AbstractLogger","threadId":1,"threadPriority":5}
What I'm thinking is since I'm closing preparedStatement for that particular connection, it does not even exist for new connection. If thats the case whats the point of having statement caching at all in multi-threaded environment.
The documentation has similar example, where its making transactions inside the same connection which is not the case for me, as I need to have multiple connections at the same time.
Questions
Primary
Q1) Is DB2 JDBC driver caching the statements at all, between multiple connections? Because I don't see much difference while preparing the statement. (see example, first one takes ~600ms, second one takes ~500ms)
References
ODP = Open Data Path
SQL packages
SQL packages are permanent objects used to store information related
to prepared SQL statements. They can be used by the IBM iSeries Access
for the IBM Toolbox for
Java JDBC driver. They are also used by applications which use the
QSQPRCED (SQL Process Extended Dynamic) API interface.
In the case JDBC, the existence of the SQL package is
checked when the client application issues the first prepare of a SQL
Statement. If the package does not exist, it is created at that time
(even though it may not yet contain any SQL statements)
Tomcat jdbc connection pool configuration - DB2 on iSeries(AS400)
IBM Data Server Driver for JDBC and SQLJ statement caching
A couple of important things to note regarding statement caching:
Because Statement objects are child objects of a given Connection, once the Connection is closed all child objects (e.g. all Statement objects) must also be closed.
It is not possible to associate a statement from one connection with a different connection.
Statement pooling may or may not be done be by a given JDBC driver. Statement pooling may also be performed by a connection management layer (i.e. application server)
Per JDBC spec, default value for Statement.isPoolable() == false and PreparedStatement.isPoolable() == true, however this flag is only a hint to the JDBC driver. There is no guarantee from the spec that statement pooling will occur.
First off, I am not sure if the JT400 driver does statement caching. The document you referenced in your question comment, Optimize Access to DB2 for i5/OS from Java and WebSphere, is specific to using the JT400 JDBC driver with WebSphere application server, and on slide #3 it indicates that statement caching comes from the WebSphere connection management layer, not the native JDBC driver layer. Given that, I'm going to assume that the JT400 JDBC driver doesn't support statement caching on its own.
So at this point you are probably going to want to plug into some sort of app server (unless you want to implement statement caching on your own, which is sort of re-inventing the wheel). I know for sure that both WebSphere Application Server products (traditional and Liberty) support statement caching for any JDBC driver.
For WebSphere Liberty (the newer product), the data source config is the following:
<dataSource jndiName="jdbc/myDS" statementCacheSize="10">
<jdbcDriver libraryRef="DB2iToolboxLib"/>
<properties.db2.i.toolbox databaseName="YOURDB" serverName="localhost"/>
</dataSource>
<library id="DB2iToolboxLib">
<fileset dir="/path/to/jdbc/driver/dir" includes="jt400.jar"/>
</library>
The key bit being the statementCacheSize attribute of <dataSource>, which has a default value of 10.
(Disclaimer, I'm a WebSphere dev, so I'm going to talk about what I know)
First off, the IBM i Java documentation is here: IBM Toolbox for Java
Secondly, I don't see where you are setting the "extended dynamic" property to true which provides
a mechanism for caching dynamic SQL statements on the server. The first
time a particular SQL statement is prepared, it is stored in a SQL
package on the server. If the package does not exist, it is
automatically created. On subsequent prepares of the same SQL
statement, the server can skip a significant part of the processing by
using information stored in the SQL package. If this is set to "true",
then a package name must be set using the "package" property.
I think you're missing some steps in using the managed pool...here's the first example in the IBM docs
import javax.naming.Context;
import javax.naming.InitialContext;
import javax.sql.DataSource;
import com.ibm.as400.access.AS400JDBCManagedConnectionPoolDataSource;
import com.ibm.as400.access.AS400JDBCManagedDataSource;
public class TestJDBCConnPoolSnippet
{
void test()
{
AS400JDBCManagedConnectionPoolDataSource cpds0 = new AS400JDBCManagedConnectionPoolDataSource();
// Set general datasource properties. Note that both connection pool datasource (CPDS) and managed
// datasource (MDS) have these properties, and they might have different values.
cpds0.setServerName(host);
cpds0.setDatabaseName(host);//iasp can be here
cpds0.setUser(userid);
cpds0.setPassword(password);
cpds0.setSavePasswordWhenSerialized(true);
// Set connection pooling-specific properties.
cpds0.setInitialPoolSize(initialPoolSize_);
cpds0.setMinPoolSize(minPoolSize_);
cpds0.setMaxPoolSize(maxPoolSize_);
cpds0.setMaxLifetime((int)(maxLifetime_/1000)); // convert to seconds
cpds0.setMaxIdleTime((int)(maxIdleTime_/1000)); // convert to seconds
cpds0.setPropertyCycle((int)(propertyCycle_/1000)); // convert to seconds
//cpds0.setReuseConnections(false); // do not re-use connections
// Set the initial context factory to use.
System.setProperty(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.fscontext.RefFSContextFactory");
// Get the JNDI Initial Context.
Context ctx = new InitialContext();
// Note: The following is an alternative way to set context properties locally:
// Properties env = new Properties();
// env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.fscontext.RefFSContextFactory");
// Context ctx = new InitialContext(env);
ctx.rebind("mydatasource", cpds0); // We can now do lookups on cpds, by the name "mydatasource".
// Create a standard DataSource object that references it.
AS400JDBCManagedDataSource mds0 = new AS400JDBCManagedDataSource();
mds0.setDescription("DataSource supporting connection pooling");
mds0.setDataSourceName("mydatasource");
ctx.rebind("ConnectionPoolingDataSource", mds0);
DataSource dataSource_ = (DataSource)ctx.lookup("ConnectionPoolingDataSource");
AS400JDBCManagedDataSource mds_ = (AS400JDBCManagedDataSource)dataSource_;
boolean isHealthy = mds_.checkPoolHealth(false); //check pool health
Connection c = dataSource_.getConnection();
}
}

Error trying to connect using Astyanax to Cassandra hosted on a EC2 instance

I am getting the following error "astyanax.connectionpool.exceptions.PoolTimeoutException:" when trying to use client Astyanax to connect to Cassandra on a EC2 instance. Need help
Following is my code snippet.
import org.mortbay.jetty.servlet.Context;
import com.netflix.astyanax.AstyanaxContext;
import com.netflix.astyanax.Keyspace;
import com.netflix.astyanax.MutationBatch;
import com.netflix.astyanax.connectionpool.NodeDiscoveryType;
import com.netflix.astyanax.connectionpool.OperationResult;
import com.netflix.astyanax.connectionpool.exceptions.ConnectionException;
import com.netflix.astyanax.connectionpool.impl.ConnectionPoolConfigurationImpl;
import com.netflix.astyanax.connectionpool.impl.ConnectionPoolType;
import com.netflix.astyanax.connectionpool.impl.CountingConnectionPoolMonitor;
import com.netflix.astyanax.impl.AstyanaxConfigurationImpl;
import com.netflix.astyanax.model.Column;
import com.netflix.astyanax.model.ColumnFamily;
import com.netflix.astyanax.model.ColumnList;
import com.netflix.astyanax.model.CqlResult;
import com.netflix.astyanax.serializers.StringSerializer;
import com.netflix.astyanax.thrift.ThriftFamilyFactory;
public class MetadataRS {
public static void main(String args[]){
AstyanaxContext<Keyspace> context = new AstyanaxContext.Builder()
.forCluster("ClusterName")
.forKeyspace("KeyspaceName")
.withAstyanaxConfiguration(new AstyanaxConfigurationImpl()
.setDiscoveryType(NodeDiscoveryType.RING_DESCRIBE)
.setConnectionPoolType(ConnectionPoolType.ROUND_ROBIN)
)
.withConnectionPoolConfiguration(new ConnectionPoolConfigurationImpl("MyConnectionPool")
.setPort(9042)
.setMaxConnsPerHost(40)
.setSeeds("<EC2-IP>:9042")
.setConnectTimeout(5000)
)
.withConnectionPoolMonitor(new CountingConnectionPoolMonitor())
.buildKeyspace(ThriftFamilyFactory.getInstance());
context.start();
Keyspace keyspace = context.getEntity();
System.out.println(keyspace);
ColumnFamily<String, String> CF_USER_INFO =
new ColumnFamily<String, String>(
"Standard1", // Column Family Name
StringSerializer.get(), // Key Serializer
StringSerializer.get()); // Column
OperationResult<ColumnList<String>> result = null;
try {
result = keyspace.prepareQuery(CF_USER_INFO)
.getKey("user_id_hash")
.execute();
} catch (ConnectionException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
ColumnList<String> columns = result.getResult();
// Lookup columns in response by name
String uid = columns.getColumnByName("user_id_hash").getStringValue();
System.out.println(uid);
// Or, iterate through the columns
for (Column<String> c : result.getResult()) {
System.out.println(c.getName());
}
}
}
Error
com.netflix.astyanax.thrift.ThriftKeyspaceImpl#1961f4
com.netflix.astyanax.connectionpool.exceptions.PoolTimeoutException: PoolTimeoutException: [host=():9042, latency=5001(5001), attempts=1] Timed out waiting for connection
at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.waitForConnection(SimpleHostConnectionPool.java:201)
at com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.borrowConnection(SimpleHostConnectionPool.java:158)
at com.netflix.astyanax.connectionpool.impl.RoundRobinExecuteWithFailover.borrowConnection(RoundRobinExecuteWithFailover.java:60)
at com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:50)
at com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:229)
at com.netflix.astyanax.thrift.ThriftColumnFamilyQueryImpl$1.execute(ThriftColumnFamilyQueryImpl.java:180)
at com.rjil.jiodrive.rs.MetadataRS.main(MetadataRS.java:57)
Exception in thread "main" java.lang.NullPointerException
at com.rjil.jiodrive.rs.MetadataRS.main(MetadataRS.java:62)
Since you are running cassandra on EC2 instance, check that the cassandra's port no. (which you have choosen as 9042) is in the allowed list of ec2 security group and that you have access to it. If not add the port no. in the inbound list of the ec2 security group and set the ip ranges as 0.0.0.0.
Alos check the firewall on the ec2 is turned off. By default its false, but its good to check it anyway.
If you have done this, then your client might be behind a firewall that prevents outbound traffic to your choosen port (9042).
Lastly if you have not used any elastic ip, its better to use the ec2 instance dns name, both in your setSeeds section and in the rpc_address of cassandra.yaml
Your issue is not in your code. You have a connectivity problem to the node you specified as your seed. So either that node isn't running, or you can't reach it from the machine running your client.
I finally upgraded libthrift to 0.9 and changed my code to the following nd it working fine now.
public Keyspace getDBConnection() {
if (poolConfig == null) {
poolConfig = new ConnectionPoolConfigurationImpl(
"CassandraPool").setPort(port).setMaxConnsPerHost(1)
.setSeeds((new StringBuilder(seedHost).append(":").append(port).toString()))
.setLatencyAwareUpdateInterval(latencyAwareUpdateInterval) // Will resort hosts per
// token partition every
// 10 seconds
.setLatencyAwareResetInterval(latencyAwareResetInterval) // Will clear the latency
// every 10 seconds. In
// practice I set this
// to 0 which is the
// default. It's better
// to be 0.
.setLatencyAwareBadnessThreshold(latencyAwareBadnessThreshold) // Will sort hosts if a host
// is more than 100% slower
// than the best and always
// assign connections to the
// fastest host, otherwise
// will use round robin
.setLatencyAwareWindowSize(latencyAwareWindowSize) // Uses last 100 latency
// samples. These samples are in
// a FIFO q and will just cycle
// themselves.
.setTimeoutWindow(60000)
;
}
AstyanaxContext<Keyspace> context = new AstyanaxContext.Builder()
.forCluster(clusterName)
.forKeyspace(keyspaceName)
.withAstyanaxConfiguration(
new AstyanaxConfigurationImpl().setDiscoveryType(
NodeDiscoveryType.NONE)
.setConnectionPoolType(
ConnectionPoolType.ROUND_ROBIN)
.setCqlVersion("3.0.0")
.setTargetCassandraVersion("2.0"))
.withConnectionPoolConfiguration(poolConfig)
.withConnectionPoolMonitor(new CountingConnectionPoolMonitor())
.buildKeyspace(ThriftFamilyFactory.getInstance());
context.start();
return context.getClient();
}

Check Session with Cassandra Datastax Java Driver

Is there any direct way to check if a Cluster/Session is connected/valid/ok?
I mean, I have a com.datastax.driver.core.Session created into a neverending thread and I'd like to assure the session is ok every time is needed. I use the next cluster initialization, but I'm not sure this is enough...
Cluster.builder().addContactPoint(url)
.withRetryPolicy(DowngradingConsistencyRetryPolicy.INSTANCE)
.withReconnectionPolicy(new ConstantReconnectionPolicy(1000L)).build());
In fact when using the DataStax Java Driver, you have a hidden/magic capability embedded:
The driver is aware of the full network topology (nodes topology across datacenters and nodes availabilities).
Thus, the only thing you have to do is to initialise your cluster with a few nodes(1) and then you can be sure at every moment that if there is at least one available node your request will be performed correctly. Because the driver is topology aware, if one node (even initialisation nodes) goes out of availability, the driver will automagically route your request to another available node.
In summary, your code is good(1).
(1): You should provide a few nodes in order to be fault tolerant in the cluster initialisation phase. Indeed, if one initialisation node is down, the driver has then the possibility to contact another one to discover the full topology.
I have a local development environment setup where I am starting up my java application and Cassandra (Docker) container at the same time, so Cassandra will normally not be in a ready state when the java application first attempts to connect.
When this is starting up the application will throw a NoHostAvailableException when the Cluster instance attempts to create a Session. Subsequent attempts to create a Session from the Cluster will then throw an IllegalStateException because the cluster instance was closed after the first exception.
What I did to rememdy this was to create a check method that attempts to create a Cluster and Session and then immediately closes these. See this:
private void waitForCassandraToBeReady(String keyspace, Cluster.Builder builder) {
RuntimeException exception = null;
int retries = 0;
while (retries++ < 40) {
Session session = null;
Cluster cluster = null;
try {
cluster = builder.build();
session = cluster.connect(keyspace);
log.info("Cassandra is available");
return;
} catch (RuntimeException e) {
log.warn("Cassandra not available, try {}", retries);
exception = e;
} finally {
if (session != null && !session.isClosed()) session.close();
if (cluster != null && !cluster.isClosed()) cluster.close();
}
sleep();
}
log.error("Retries exceeded waiting for Cassandra to be available");
if (exception != null) throw exception;
else throw new RuntimeException("Cassandra not available");
}
After this method returns, I then create a create a Cluster and Session independent of this check method.

Resources