SnappyData - SQL put into on jobserver doesn't aggregate values - spark-streaming

I'm trying to create a jar to run on the snappy-job shell with streaming.
I have an aggregation function and it works perfectly within windows, but I need a table with one value for each key. Based on an example from GitHub I created a jar file, and now I have a problem with the put into SQL command.
My code for aggregation:
val resultStream: SchemaDStream = snsc.registerCQ("select publisher, cast(sum(bid)as int) as bidCount from " +
  "AggrStream window (duration 1 seconds, slide 1 seconds) group by publisher")
val conf = new ConnectionConfBuilder(snsc.snappySession).build()

resultStream.foreachDataFrame(df => {
  df.write.insertInto("windowsAgg")
  println("Data received in streaming window")
  df.show()

  println("Updating table updateTable")
  val conn = ConnectionUtil.getConnection(conf)
  val result = df.collect()
  val stmt = conn.prepareStatement("put into updateTable (publisher, bidCount) values " +
    "(?,?+(nvl((select bidCount from updateTable where publisher = ?),0)))")

  result.foreach(row => {
    println("row" + row)
    val publisher = row.getString(0)
    println("publisher " + publisher)
    val bidCount = row.getInt(1)
    println("bidcount : " + bidCount)
    stmt.setString(1, publisher)
    stmt.setInt(2, bidCount)
    stmt.setString(3, publisher)
    println("Prepared Statement after bind variables set: " + stmt.toString())
    stmt.addBatch()
  })

  stmt.executeBatch()
  conn.close()
})
snsc.start()
snsc.awaitTermination()
}
I have to update or insert into the updateTable table, but during the update the current value has to be added to the one coming from the stream.
Now, here is what I see when I execute the code:
select * from updateTable;
PUBLISHER |BIDCOUNT
--------------------------------------------
publisher333 |10
Then I sent a message to Kafka:
1488487984048,publisher333,adv1,web1,geo1,11,c1
and selected from updateTable again:
select * from updateTable;
PUBLISHER |BIDCOUNT
--------------------------------------------
publisher333 |11
The bidCount value is overwritten instead of added to.
But when I execute the put into command from the snappy-sql shell, it works perfectly:
put into updateTable (publisher, bidcount) values ('publisher333',4+
(nvl((select bidCount from updateTable where publisher =
'publisher333'),0)));
1 row inserted/updated/deleted
snappy> select * from updateTable;
PUBLISHER |BIDCOUNT
--------------------------------------------
publisher333 |15
Could you help me with this case? Maybe someone has another solution for inserting or updating values using SnappyData?
Thank you in advance.

The bidCount value is read from the tomi_update table in the streaming case, but it is read from updateTable in the snappy-sql case. Is this intentional? Maybe you wanted to use updateTable in both cases?
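If the streaming job's statement really is reading the running total from a different table (tomi_update) than the one it writes to, pointing both at updateTable should make the streamed bidCount accumulate instead of overwrite. A minimal sketch of the corrected prepared statement, assuming the rest of the foreachDataFrame block stays as posted:
val stmt = conn.prepareStatement(
  // read the current total from the same table the put into writes to,
  // so the streamed bidCount is added to it rather than replacing it
  "put into updateTable (publisher, bidCount) values " +
    "(?, ? + (nvl((select bidCount from updateTable where publisher = ?), 0)))")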

Related

Not getting right value from state store

I am trying to use a state store to merge multiple Kafka streams. As part of this, I am consuming messages from multiple topics and putting them in the state store with keys, for example:
message from topic1 saved in state store as key_p1 and value1
message from topic2 saved in state store as key_p2 and value2
message from topic3 saved in state store as key_p3 and value3.
To meet an SLA, I tried to query the state store to verify whether I had received the mandatory transactions (for example on topic2 and topic3, with keys key_p2 and key_p3).
val priorityTxns: KeyValueIterator[String, ValueAndTimestamp[String]] = kvStore.range(key_p2,key_p3)
Though I have the messages in the state store, most of the time (not always) I get only one message.
Is there a way to refresh the store before querying?
Code in my transform method:
override def transform(kafkaKey: String, value: String): KeyValue[String, String] = {
  var key = ""
  var taggedMsg = ""
  if ((kafkaKey != null) && (kafkaKey != "")) {
    key = kafkaKey + txnKeys.get(msgTag).get // this will be like key_p1, key_p2, key_p3 etc.
    kvStore.put(key, ValueAndTimestamp.make(taggedMsg, context.timestamp))
    log.info("saving value in state store with key " + key + " and value " + kvStore.get(key))
  }
}
and code in init(context: ProcessorContext):
val priorityTxns: KeyValueIterator[String, ValueAndTimestamp[String]] = kvStore.range(key_p2,key_p3)
val tempP3 = kvStore.get(rangeKey + "_p3")
val tempP2 = kvStore.get(rangeKey + "_p2")
while (priorityTxns.hasNext) {
  log.info("available priority keys " + priorityTxns.peekNextKey())
  val e = priorityTxns.next()
}

GORM hangs with parameter

The query below hangs after exactly 5 calls, every time:
tx := db.Raw("select count(*) as hash from transaction_logs left join blocks on transaction_logs.block_number = blocks.number"+
" where (transaction_logs.address = ? and transaction_logs.topic0 = ?) and blocks.total_confirmations >= 7 "+
"group by transaction_hash", strings.ToLower("0xa11265a58d9f5a2991fe8476a9afea07031ac5bf"),
"0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef").Scan(&totalIds)
If we run it without the arguments, it works.
db.Raw("select count(*) as hash from transaction_logs left join blocks on transaction_logs.block_number = blocks.number"+
" where (transaction_logs.address = #tokenAddress and transaction_logs.topic0 = '0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef') and blocks.total_confirmations >= 7 "+
"group by transaction_hash", sql.Named("tokenAddress", strings.ToLower("0xa11265a58d9f5a2991fe8476a9afea07031ac5bf"))
I even tried with a named parameter (as shown above), but got the same result.
Can anyone help here?

How to solve pyodbc.ProgrammingError: The second parameter to executemany must not be empty

Hi, I'm having an issue with transferring data from one database to another. I created a list using a field in a table on an MS SQL DB, used that list to query an Oracle DB table (using the initial list in the WHERE clause to filter results), and then loaded the query results back into the MS SQL DB.
The program runs for the first few iterations but then errors out with the following error:
Traceback (most recent call last):
File "C:/Users/1/PycharmProjects/DataExtracts/BuyerGroup.py", line 67, in
insertIntoMSDatabase(idString)
File "C:/Users/1/PycharmProjects/DataExtracts/BuyerGroup.py", line 48, in insertIntoMSDatabase
mycursor.executemany(sql, val)
pyodbc.ProgrammingError: The second parameter to executemany must not be empty.
I can't seem to find any guidance online to troubleshoot this error message. I feel it may be a simple fix, but I just can't get there...
# import libraries
import cx_Oracle
import pyodbc
import logging
import time
import re
import math
import numpy as np

logging.basicConfig(level=logging.DEBUG)

conn = pyodbc.connect('''Driver={SQL Server Native Client 11.0};
Server='servername';
Database='dbname';
Trusted_connection=yes;''')
b = conn.cursor()

dsn_tns = cx_Oracle.makedsn('Hostname', 'port', service_name='name')
conn1 = cx_Oracle.connect(user=r'uid', password='pwd', dsn=dsn_tns)
c = conn1.cursor()

beginTime = time.time()

bind = (b.execute('''select distinct field1
from [server].[db].[dbo].[table]'''))
print('MSQL table(s) queried, List Generated')

# formats ids for sql string
def surroundWithQuotes(id):
    return "'" + re.sub(",|\s$", "", str(id)) + "'"

def insertIntoMSDatabase(idString):
    osql = '''SELECT distinct field1, field2
    FROM Database.Table
    WHERE field2 is not null and field3 IN ({})'''.format(idString)
    c.execute(osql)
    claimsdata = c.fetchall()
    print('Oracle table(s) queried, Data Pulled')
    mycursor = conn.cursor()
    sql = '''INSERT INTO [dbo].[tablename]
    (
    [fields1]
    ,[field2]
    )
    VALUES (?,?)'''
    val = claimsdata
    mycursor.executemany(sql, val)
    conn.commit()

ids = []
formattedIdStrings = []

# adds all the ids found in bind to an iterable array
for row in bind:
    ids.append(row[0])

# splits the ids[] array into multiple arrays < 1000 in length
batchedIds = np.array_split(ids, math.ceil(len(ids) / 1000))

# formats the value inside each batchedId to be a string
for batchedId in batchedIds:
    formattedIdStrings.append(",".join(map(surroundWithQuotes, batchedId)))

# runs insert into MS database for each batch of IDs
for idString in formattedIdStrings:
    insertIntoMSDatabase(idString)

print("MSQL table loaded, Data inserted into destination")
endTime = time.time()
print("Program Time Elapsed: ", endTime - beginTime)
conn.close()
conn1.close()
mycursor.executemany(sql, val)
pyodbc.ProgrammingError: The second parameter to executemany must not be empty.
Before calling .executemany() you need to verify that val is not an empty list (as would be the case if .fetchall() is called on a SELECT statement that returns no rows), e.g.,
if val:
    mycursor.executemany(sql, val)

CT_FETCH error in PowerBuilder Program

I'm still learning PowerBuilder and trying to get familiar with it. I'm receiving the following error when I try to run a program against a specific document in my database:
ct_fetch(): user api layer: internal common library error: The bind of result set item 4 resulted in an overflow. ErrCode: 2.
What does this error mean? What is item 4? This happens only when I run this program against a specific document in my database; any other document works fine. Please see the code below:
string s_doc_nmbr, s_doc_type, s_pvds_doc_status, s_sql
long l_rtn, l_current_fl, l_apld_fl, l_obj_id
integer l_pvds_obj_id, i_count
IF cbx_1.checked = True THEN
SELECT dsk_obj.obj_usr_num,
dsk_obj.obj_type,
preaward_validation_doc_status.doc_status,
preaward_validation_doc_status.obj_id
INTO :s_doc_nmbr, :s_doc_type, :s_pvds_doc_status, :l_pvds_obj_id
FROM dbo.dsk_obj dsk_obj,
preaward_validation_doc_status
WHERE dsk_obj.obj_id = :gx_l_doc_obj_id
AND preaward_validation_doc_status.obj_id = dsk_obj.obj_id
using SQLCA;
l_rtn = sqlca.uf_sqlerrcheck("w_pdutl095_main", "ue_run_script", TRUE)
IF l_rtn = -1 THEN
RETURN -1
END IF
//check to see if document (via obj_id) exists in the preaward_validation_doc_status table.
SELECT count(*)
into :i_count
FROM preaward_validation_doc_status
where obj_id = :l_pvds_obj_id
USING SQLCA;
IF i_count = 0 THEN
//document doesn't exist
// messagebox("Update Preaward Validation Doc Status", + gx_s_doc_nmbr + ' does not exist in the Preaward Validation Document Status table.', Stopsign!)
//MC - 070815-0030-MC Updating code to insert row into preaward_validation_doc_status if row doesn't already exist
// s_sql = "insert into preaward_validation_doc_status(obj_id, doc_status) values (:gx_l_doc_obj_id, 'SUCCESS') "
INSERT INTO preaward_validation_doc_status(obj_id, doc_status)
VALUES (:gx_l_doc_obj_id, 'SUCCESS')
USING SQLCA;
IF sqlca.sqldbcode <> 0 then
messagebox('SQL ERROR Message',string(sqlca.sqldbcode)+'-'+sqlca.sqlerrtext)
return -1
end if
MessageBox("PreAward Validation ", 'Document number ' + gx_s_doc_nmbr + ' has been inserted and marked as SUCCESS for PreAward Validation.')
return 1
Else
//Update document status in the preaward_validation_doc_status table to SUCCESS
Update preaward_validation_doc_status
Set doc_status = 'SUCCESS'
where obj_id = :l_pvds_obj_id
USING SQLCA;
IF sqlca.sqldbcode <> 0 then
messagebox('SQL ERROR Message',string(sqlca.sqldbcode)+'-'+sqlca.sqlerrtext)
return -1
end if
MessageBox("PreAward Validation ", 'Document number '+ gx_s_doc_nmbr + ' has been marked as SUCCESS for PreAward Validation.')
End IF
update crt_script
set alt_1 = 'Acknowledged' where
ticket_nmbr = :gx_s_ticket_nmbr and
alt_2 = 'Running' and
doc_nmbr = :gx_s_doc_nmbr
USING SQLCA;
Return 1
ElseIF cbx_1.checked = False THEN
messagebox("Update Preaward Validation Doc Status", 'The acknowledgment checkbox must be selected for the script to run successfully. The script will now exit. Please relaunch the script and try again . ', Stopsign!)
Return -1
End IF
Save yourself a ton of headaches and use datawindows... You'd reduce that entire script to about 10 lines of code.
Paul Horan gave you good advice. This would be simple using DataWindows or DataStores. Terry Voth is on the right track for your problem.
In your code, the variable l_pvds_obj_id needs to be the same type as gx_l_doc_obj_id because, if you get a result, it will always be equal to it. From the apparent naming scheme it was intended to be a long. This is the kind of thing we look for in peer reviews.
A few other things:
Most of the time you want SQLCode, not SQLDbCode, but you didn't say what database you're using.
After you UPDATE crt_script you need to check the result.
I don't see COMMIT or ROLLBACK. Autocommit isn't suitable when you need to update multiple tables.
You aren't using most of the values from the first SELECT. Perhaps you've simplified your code for posting or troubleshooting.

jdbcTemplate.queryForList returns list of Map where all column values are NULL

Any call using jdbcTemplate.queryForList returns a list of Maps that have NULL values for all columns. The columns should have had string values.
I do get the correct number of rows, compared to the result I get when I run the same query in a native SQL client.
I am using the JDBC-ODBC bridge, and the database is MS SQL Server 2008.
I have the following code in my DAO:
public List internalCodeDescriptions(String listID) {
    List rows = jdbcTemplate.queryForList("select CODE, DESCRIPTION from CODE_DESCRIPTIONS where LIST_ID=? order by sort_order asc", new Object[] {listID});
    //debugcode start
    try {
        Connection conn1 = jdbcTemplate.getDataSource().getConnection();
        Statement stat = conn1.createStatement();
        boolean sok = stat.execute("select code, description from code_descriptions where list_id='TRIGGER' order by sort_order asc");
        if (sok) {
            ResultSet rs = stat.getResultSet();
            ResultSetMetaData rsmd = rs.getMetaData();
            String columnname1 = rsmd.getColumnName(1);
            String columnname2 = rsmd.getColumnName(2);
            int type1 = rsmd.getColumnType(1);
            int type2 = rsmd.getColumnType(2);
            String tn1 = rsmd.getColumnTypeName(1);
            String tn2 = rsmd.getColumnTypeName(2);
            log.debug("Testquery gave resultset with:");
            log.debug("Column 1 -name:" + columnname1 + " -typeID:" + type1 + " -typeName:" + tn1);
            log.debug("Column 2 -name:" + columnname2 + " -typeID:" + type2 + " -typeName:" + tn2);
            int i = 1;
            while (rs.next()) {
                String cd = rs.getString(1);
                String desc = rs.getString(2);
                log.debug("Row #" + i + ": CODE='" + cd + "' DESCRIPTION='" + desc + "'");
                i++;
            }
        } else {
            log.debug("Query execution returned false");
        }
    } catch (SQLException se) {
        log.debug("Something went haywire in the debug code:" + se.toString());
    }
    log.debug("Original jdbcTemplate list result gave:");
    Iterator<Map<String, Object>> it1 = rows.iterator();
    while (it1.hasNext()) {
        Map mm = (Map) it1.next();
        log.debug("Map:" + mm);
        String code = (String) mm.get("CODE");
        String desc = (String) mm.get("description");
        log.debug("CODE:" + code + " : " + desc);
    }
    //debugcode end
    return rows;
}
As you can see, I've added some debugging code to list the results from queryForList, and I also obtain the connection from the jdbcTemplate object and use it to send the same query using the basic JDBC methods (listID='TRIGGER').
What is puzzling me is that the log outputs something like this:
Testquery gave resultset with:
Column 1 -name:code -typeID:-9 -typeName:nvarchar
Column 2 -name:decription -typeID:-9 -typeName:nvarchar
Row #1: CODE='C1' DESCRIPTION='BlodoverxF8rin eller bruk av blodprodukter'
Row #2: CODE='C2' DESCRIPTION='Kodetilfelle, hjertestans/respirasjonstans'
Row #3: CODE='C3' DESCRIPTION='Akutt dialyse'
...
Row #58: CODE='S14' DESCRIPTION='Forekomst av hvilken som helst komplikasjon'
...
Original jdbcTemplate list result gave:
Map:(CODE=null, DESCRIPTION=null)
CODE:null : null
Map:(CODE=null, DESCRIPTION=null)
CODE:null : null
...
58 repetitions total.
Why does the result from the queryForList method return NULL in all columns for every row? How can I get the result I want using jdbcTemplate.queryForList?
The xF8 should be the letter ø, so I have some encoding issues, but I can't see how that would cause all values - including strings without any strange letters (see row #2) - to turn into NULL values in the list of maps returned from the jdbcTemplate.queryForList method.
The same code ran fine on another server against a MySQL Server 5.5 database using the jdbc driver for MySQL.
The issue was resolved by using the MS SQL Server JDBC driver rather than the JDBC-ODBC bridge. I don't know why it didn't work with the bridge, though.
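For reference, a minimal sketch of what that swap can look like when the JdbcTemplate is wired up programmatically (shown in Scala; the driver class and URL format are the standard ones for the Microsoft driver, the host, database and credentials are placeholders, and the driver jar is assumed to be on the classpath):
import org.springframework.jdbc.core.JdbcTemplate
import org.springframework.jdbc.datasource.DriverManagerDataSource

// Point the DataSource at the Microsoft SQL Server JDBC driver
// instead of the JDBC-ODBC bridge (connection details are placeholders).
val ds = new DriverManagerDataSource()
ds.setDriverClassName("com.microsoft.sqlserver.jdbc.SQLServerDriver")
ds.setUrl("jdbc:sqlserver://dbhost:1433;databaseName=mydb")
ds.setUsername("user")
ds.setPassword("secret")

val jdbcTemplate = new JdbcTemplate(ds)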
