Astyanax/Cassandra - Getting "Re-preparing already prepared query" warning with caching enabled - caching

I'm trying to insert some data to Cassandra with Astyanax, by I'm getting a lot of "Re-preparing already prepared query" warnings even if have caching enabled:
22:08:03,703 WARN Cluster:1702 - Re-preparing already prepared query INSERT INTO test.test (key,c0,c1,c2,c3,c4,c5,c6,c7,c8,c9) VALUES (?,?,?,?,?,?,?,?,?,?,?) . Please note that preparing the same query more than once is generally an anti-pattern and will likely affect performance. Consider preparing the statement only once.
22:08:03,707 WARN Cluster:1702 - Re-preparing already prepared query INSERT INTO test.test (key,c0,c1,c2,c3,c4,c5,c6,c7,c8,c9) VALUES (?,?,?,?,?,?,?,?,?,?,?) . Please note that preparing the same query more than once is generally an anti-pattern and will likely affect performance. Consider preparing the statement only once.
22:08:03,708 WARN Cluster:1702 - Re-preparing already prepared query INSERT INTO test.test (key,c0,c1,c2,c3,c4,c5,c6,c7,c8,c9) VALUES (?,?,?,?,?,?,?,?,?,?,?) . Please note that preparing the same query more than once is generally an anti-pattern and will likely affect performance. Consider preparing the statement only once.
Source code:
Connect: (executed once)
#Override
public void connect() throws ClientException {
AstyanaxContext<Keyspace> context = new AstyanaxContext.Builder()
.forCluster(clusterName)
.forKeyspace(keyspaceName)
.withHostSupplier(new Supplier<List<Host>>() {
#Override
public List<Host> get() {
return Collections.singletonList(new Host(host, 9160));
}
})
.withAstyanaxConfiguration(
new AstyanaxConfigurationImpl().setDiscoveryType(NodeDiscoveryType.DISCOVERY_SERVICE)
.setDiscoveryDelayInSeconds(60000))
.withConnectionPoolConfiguration(new JavaDriverConfigBuilder().build())
.buildKeyspace(CqlFamilyFactory.getInstance());
context.start();
keyspace = context.getClient();
columnFamilyTemplate = new ColumnFamily<String, String>(columnFamily,
StringSerializer.get(), StringSerializer.get());
try {
columnFamilyTemplate.describe(keyspace);
} catch (ConnectionException e) {
throw new ClientException(e);
}
insert = keyspace.prepareMutationBatch().withCaching(true);
}
Insert: (executed multiple times)
insert.discardMutations();
final ColumnListMutation<String> row = insert.withRow(columnFamilyTemplate, key);
for (Map.Entry<String, String> pair : columnValues.entrySet()) {
final String column = pair.getKey();
final String value = pair.getValue();
row.putColumn(column, value, null);
}
try {
insert.withCaching(true).execute();
} catch (ConnectionException e) {
throw new ClientException(e);
}
The warning message suggests that the caching is not actually working. Any idea how to fix it?

Related

How to pass dynamic SQL statement to the JDBCIO connector in apache beam?

Apache beam provides the JDBCIO connector to connect to CloudSql postgreSQL. My job reads an event from pub/sub. The event body is as below:
tableName,
list<value>
I need to write to the table based on the table name that I get in from my message.
The JDBCIO has prepared statement which will let me parameterize the values in my insert query. But I need to generate the insert query dynamically based on the information present in the event.
pipeline
.apply(PubsubIO.readStrings().fromSubscription())
.apply(convertToKV())
.apply(JdbcIO.<List<String>>>write()
.withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
"com.mysql.jdbc.Driver", "jdbc:mysql://hostname:3306/mydb")
.withUsername("username")
.withPassword("password"))
.withStatement("insert into Person values(?, ?)")
.withPreparedStatementSetter(new JdbcIO.PreparedStatementSetter<KV<Integer, String>>() {
public void setParameters(KV<Integer, String> element, PreparedStatement query)
throws SQLException {
i=0
for each element in list
query.setInt(i, element.get(i);
i++;
}
})
);
I should be able to create the SQL statement dynamically based on the input event from the pcollection.
My select statement should be dynamically generated based on the list value and the table name. Please let me know whether we can do this or not.
Update:-
im trying to manually call the jdbc driver inside the parDo function but getting the below error.
No suitable driver found for jdbcURL.
Please let me know if im missing anyting:
#Setup
public void doAnyRequiredSetup() throws SQLException
{
LoggingContextUtil.installContext(loggingContext);
connection=DriverManager.getConnection(JdbcUrl,user,password);
statement=connection.createStatement();
if (LOGGER.isDebugEnabled()) {
LOGGER.debug("In doAnyRequiredSetup logging Context is now set and JDBC connection is .");
}
}
#SuppressWarnings("unchecked")
#ProcessElement
public void processElement(ProcessContext context)
{
JsonNode element=context.element();
try {
String query=formatQuery(baseQuery);
boolean result=statement.execute(query);
if(LOGGER.isDebugEnabled()) {
LOGGER.debug("Executed query : "+query+" and the result is "+ result);
}
} catch (IllegalArgumentException | SQLException e) {
ErrorMessage em = new ErrorMessage(element.toString(), "Insert Query Failed", e.getMessage());
context.output(ValidateTagHelper.FAILURE_TAG,em);
}
}
You can not have dynamic queries on JdbcIO based on the input elements.
The ParDo has to reset as you like, you can rewrite your ParDo in which you would call the JDBC driver manually.
If find this other workaround, you can split the input PColleciton into multiple outputs. That will work if your use case is limited to some predefined set of queries that you can choose from based on the input. This way you split the input into multiple PCollections and then attach differently configured IOs to each.
https://cloud.google.com/blog/products/gcp/guide-to-common-cloud-dataflow-use-case-patterns-part-1
You can try to read pubsub messages with attributes and in attributes, you can pass the table name and values in the form of Key-value pair.
PCollection<PubsubMessage> pubsubMessage = pipeline
.apply(PubsubIO.readMessagesWithAttributes().fromSubscription("")

Parallel Stream repeating items

I am retrieving big chunks of data from DB and using this data to write it somewhere else. In order to avoid a long processing time, I'm trying to use parallel streams to write it. When I run this as sequential streams, it works perfectly. However, if I change it to parallel, the behavior is odd: it prints the same object multiple times (more than 10).
#PostConstruct
public void retrieveAllTypeRecords() throws SQLException {
logger.info("Retrieve batch of Type records.");
try {
Stream<TypeRecord> typeQueryAsStream = jdbcStream.getTypeQueryAsStream();
typeQueryAsStream.forEach((type) -> {
logger.info("Printing Type with field1: {} and field2: {}.", type.getField1(), type.getField2()); //the same object gets printed here multiple times
//write this object somewhere else
});
logger.info("Completed full retrieval of Type data.");
} catch (Exception e) {
logger.error("error: " + e);
}
}
public Stream<TypeRecord> getTypeQueryAsStream() throws SQLException {
String sql = typeRepository.getQueryAllTypesRecords(); //retrieves SQL query in String format
TypeMapper typeMapper = new TypeMapper();
JdbcStream.StreamableQuery query = jdbcStream.streamableQuery(sql);
Stream<TypeRecord> stream = query.stream()
.map(row -> {
return typeMapper.mapRow(row); //maps columns values to object values
});
return stream;
}
public class StreamableQuery implements Closeable {
(...)
public Stream<SqlRow> stream() throws SQLException {
final SqlRowSet rowSet = new ResultSetWrappingSqlRowSet(preparedStatement.executeQuery());
final SqlRow sqlRow = new SqlRowAdapter(rowSet);
Supplier<Spliterator<SqlRow>> supplier = () -> Spliterators.spliteratorUnknownSize(new Iterator<SqlRow>() {
#Override
public boolean hasNext() {
return !rowSet.isLast();
}
#Override
public SqlRow next() {
if (!rowSet.next()) {
throw new NoSuchElementException();
}
return sqlRow;
}
}, Spliterator.CONCURRENT);
return StreamSupport.stream(supplier, Spliterator.CONCURRENT, true); //this boolean sets the stream as parallel
}
}
I've also tried using typeQueryAsStream.parallel().forEach((type) but the result is the same.
Example of output:
[ForkJoinPool.commonPool-worker-1] INFO TypeService - Saving Type with field1: L6797 and field2: P1433.
[ForkJoinPool.commonPool-worker-1] INFO TypeService - Saving Type with field1: L6797 and field2: P1433.
[main] INFO TypeService - Saving Type with field1: L6797 and field2: P1433.
[ForkJoinPool.commonPool-worker-1] INFO TypeService - Saving Type with field1: L6797 and field2: P1433.
Well, look at you code,
final SqlRow sqlRow = new SqlRowAdapter(rowSet);
Supplier<Spliterator<SqlRow>> supplier = () -> Spliterators.spliteratorUnknownSize(new Iterator<SqlRow>() {
…
#Override
public SqlRow next() {
if (!rowSet.next()) {
throw new NoSuchElementException();
}
return sqlRow;
}
}, Spliterator.CONCURRENT);
You are returning the same object every time. You achieve your desired effects by implicitly modifying the state of this object when calling rowSet.next().
This obviously can’t work when multiple threads try to access that single object concurrently. Even buffering some items, to hand them over to another thread will cause trouble. Therefore, such interference can cause problems with sequential streams as well, as soon as stateful intermediate operations are involved, like sorted or distinct.
Assuming that typeMapper.mapRow(row) will produce an actual data item which has no interference to other data items, you should integrate this step into the stream source, to create a valid stream.
public Stream<TypeRecord> stream(TypeMapper typeMapper) throws SQLException {
SqlRowSet rowSet = new ResultSetWrappingSqlRowSet(preparedStatement.executeQuery());
SqlRow sqlRow = new SqlRowAdapter(rowSet);
Spliterator<TypeRecord> sp = new Spliterators.AbstractSpliterator<TypeRecord>(
Long.MAX_VALUE, Spliterator.CONCURRENT|Spliterator.ORDERED) {
#Override
public boolean tryAdvance(Consumer<? super TypeRecord> action) {
if(!rowSet.next()) return false;
action.accept(typeMapper.mapRow(sqlRow));
return true;
}
};
return StreamSupport.stream(sp, true); //this boolean sets the stream as parallel
}
Note that for a lot of use cases, like this one, implementing a Spliterator is simpler than implementing an Iterator (which needs to be wrapped via spliteratorUnknownSize anyway). Also, there is no need to encapsulate this instantiation into a Supplier.
As a final note, the current implementation does not perform well for streams with an unknown size, as it treats Long.MAX_VALUE like a very large number, ignoring the “unknown” semantic assigned to it by the specification. It will be very beneficial to the parallel performance to provide an estimate size, it doesn’t need to be precise, in fact, with the current implementation, even a completely made up number, say 1000 may perform better than correctly using Long.MAX_VALUE to denote an entirely unknown size.

Querying single database row using rxjava2

I am using rxjava2 for the first time on an Android project, and am doing SQL queries on a background thread.
However I am having trouble figuring out the best way to do a simple SQL query, and being able to handle the case where the record may or may not exist. Here is the code I am using:
public Observable<Record> createRecordObservable(int id) {
Callable<Record> callback = new Callable<Record>() {
#Override
public Record call() throws Exception {
// do the actual sql stuff, e.g.
// select * from Record where id = ?
return record;
}
};
return Observable.fromCallable(callback).subscribeOn(Schedulers.computation());
}
This works well when there is a record present. But in the case of a non-existent record matching the id, it treats it like an error. Apparently this is because rxjava2 doesn't allow the Callable to return a null.
Obviously I don't really want this. An error should be only if the database failed or something, whereas a empty result is perfectly valid. I read somewhere that one possible solution is wrapping Record in a Java 8 Optional, but my project is not Java 8, and anyway that solution seems a bit ugly.
This is surely such a common, everyday task that I'm sure there must be a simple and easy solution, but I couldn't find one so far. What is the recommended pattern to use here?
Your use case seems appropriate for the RxJava2 new Observable type Maybe, which emit 1 or 0 items.
Maybe.fromCallable will treat returned null as no items emitted.
You can see this discussion regarding nulls with RxJava2, I guess that there is no many choices but using Optional alike in other cases where you need nulls/empty values.
Thanks to #yosriz, I have it working with Maybe. Since I can't put code in comments, I'll post a complete answer here:
Instead of Observable, use Maybe like this:
public Maybe<Record> lookupRecord(int id) {
Callable<Record> callback = new Callable<Record>() {
#Override
public Record call() throws Exception {
// do the actual sql stuff, e.g.
// select * from Record where id = ?
return record;
}
};
return Maybe.fromCallable(callback).subscribeOn(Schedulers.computation());
}
The good thing is the returned record is allowed to be null. To detect which situation occurred in the subscriber, the code is like this:
lookupRecord(id)
.observeOn(AndroidSchedulers.mainThread())
.subscribe(new Consumer<Record>() {
#Override
public void accept(Record r) {
// record was loaded OK
}
}, new Consumer<Throwable>() {
#Override
public void accept(Throwable throwable) {
// there was an error
}
}, new Action() {
#Override
public void run() {
// there was an empty result
}
});

Is it possible to get a key value after DuplicateKeyException in spring?

I wonder if it is possible to get a key of a value after DuplicateKeyException in spring?
For example like this:
try {
KeyHolder keyHolder = new GeneratedKeyHolder();
getJdbcTemplate().update(
new PreparedStatementCreator() {
#Override
public PreparedStatement createPreparedStatement(Connection con) throws SQLException {
PreparedStatement ps = con.prepareStatement(SQL, new String[]{"ID"});
ps.setString(1, user.getUSERNAME());
ps.setString(2, user.getEMAIL());
return ps;
}
},
keyHolder);
logger.info("Insert success");
return (BigDecimal) keyHolder.getKey();
} catch (DuplicateKeyException dke) {
logger.info("Insert failed");
// get key and do sth with it
return new BigDecimal(-1);
}
Any suggestions here are welcome.
I don't see an optimal solution for this problem. Try something like this:
Preselection - Select all counts of values. (if you have constraint for mail, check whether the mail exists or not. If you have another constraint for name, check whether the name exists or not. (SELECT COUNT(name) FROM table WHERE name='SOME_NAME') You can do this, before or after experiencing a DuplicateKeyException.
Extraction - Try to extract the column which causes problems from the Exception-Message (this is dirty). In this case, you don't have to execute multiple queries.

Query returning values from Oracle and no records when run from Java

This query is returning the record with Min Create time Stamp for the Person Pers_ID when I run it in SQL Developer and the same query is not returning any value from Java JDBC connection.
Can you please help?
select PERS_ID,CODE,BEG_DTE
from PRD_HIST H
where PERS_ID='12345'
and CODE='ABC'
and CRTE_TSTP=(
select MIN(CRTE_TSTP)
from PRD_HIST S
where H.PERS_ID=S.PERS_ID
and PERS_ID='12345'
and EFCT_END_DTE is null
)
Java Code
public static List<String[]> getPersonwithMinCreateTSTP(final String PERS_ID,final String Category,final Connection connection){
final List<String[]> personRecords = new ArrayList<String[]>();
ResultSet resultSet = null;
Statement statement = null;
String PersID=null;
String ReportCode=null;
String effBegDate=null;
try{
statement = connection.createStatement();
final String query="select PERS_ID,CODE,EFCT_BEG_DTE from PRD_HIST H where PERS_ID='"+PERS_ID+"'and CODE='"+Category+"'and CRTE_TSTP=(select MIN(CRTE_TSTP) from PRD_HIST S where H.PERS_ID=S.PERS_ID and PERS_ID='"+PERS_ID+"' and EFCT_END_DTE is null)";
if (!statement.execute(query)) {
//print error
}
resultSet = statement.getResultSet();
while (resultSet.next()) {
PersID=resultSet.getString("PERS_ID");
ReportCode=resultSet.getString("CODE");
effBegDate=resultSet.getString("EFCT_BEG_DTE");
final String[] personDetails={PersID,ReportCode,effBegDate};
personRecords.add(personDetails);
}
} catch (SQLException sqle) {
CTLoggerUtil.logError(sqle.getMessage());
}finally{ // Finally is added to close the connection and resultset
try {
if (resultSet!=null) {
resultSet.close();
}if (statement!=null) {
statement.close();
}
} catch (SQLException e) {
//print error
}
}
return personRecords;
}
Print out your SQL SELECT statement from your java program and paste it into SQL*Plus and see what is happening. It's likely you're not getting your variables set to what you think you are. In fact, you're likely to see the error when you print out the SELECT statement without even running it - lower case values when upper is needed, etc.
If you still can't see it, post the actual query from your java code here.
I came here with similar problem - just thought I'd post my solution for others following - I hadn't run "COMMIT" after the inserts I'd made (via sqlplus) - doh!
The database table has records but the JDBC client can't retrieve the records.
Means the JDBC client doesn't have the select privileges. Please run the below query on command line:
grant all on emp to hr;

Resources