Questions of Generating Simulated Data - quantitative-finance

I generated simulated data with the following script:
def writeData(mutable t, dbName, tableName, days){
pt = loadTable(dbName, tableName);
for(i in days){
update t set date = i;
pt.append!(t);
}
}
def main(dbName, tableName, days){
pt = loadTable(dbName, tableName);
mr(pt,writeData(, dbName, tableName, days), parallel=true);
}
dbName = "dfs://level2";
tableName = `quotes;
days = (2020.06.01..2020.06.30)[weekday(2020.06.01..2020.06.30) between 1:5 ];
main(dbName, tableName, days);
The error reports that the object being updated is not a table object. I then changed the script to use loadTableBySQL, but the error message is the same:
t = loadTableBySQL(<select * from pt where date=2020.06.01>);
mr(t,writeData(, dbName, tableName, days), parallel=true);

The arguments you passed to the mr function are wrong:
mr(pt,writeData(, dbName, tableName, days), parallel=true);
It should be changed to the following:
mr(t,writeData{, dbName, tableName, days}, parallel=true);
The {} denotes a partial application: it fixes dbName, tableName and days and leaves the first parameter open, so mr can supply it. Please refer to partialApplication for more details.
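For readers who know Java better than DolphinDB, the idea can be sketched as below. This is only an analogy, not DolphinDB code: the class, the partial helper, and the string that stands in for the actual write are all invented for illustration.

```java
import java.util.function.Function;

final class PartialApplicationSketch {
    // Hypothetical stand-in for writeData(t, dbName, tableName, days):
    // it returns a description instead of writing anything.
    static String writeData(String t, String dbName, String tableName, int days) {
        return t + " -> " + dbName + "/" + tableName + " x" + days;
    }

    // Fixes every argument except the first, in the spirit of
    // writeData{, dbName, tableName, days}.
    static Function<String, String> partial(String dbName, String tableName, int days) {
        return t -> writeData(t, dbName, tableName, days);
    }

    public static void main(String[] args) {
        Function<String, String> f = partial("dfs://level2", "quotes", 22);
        // The caller (mr, in the original script) supplies only the first argument.
        System.out.println(f.apply("partition-1"));
    }
}
```

The result of partial is a one-argument function, which is the shape a map step expects; calling writeData(...) directly would instead try to run it on the spot with a missing argument.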

Why does this ADO.NET query return no results?

I have the following code that executes a SQL statement and looks for a result.
var sql = @"select BOQ_IMPORT_ID "
+ "from ITIS_PRJ.PRJ_BOQ_IMPORT_HEADER "
+ "where PROJECT_ID = :Projectid "
+ "order by CREATED_ON desc "
+ "fetch first 1 row only";
using (var conn = new OracleConnection(ApplicationSettings.ConnectionString))
using (var cmd = new OracleCommand(sql, conn))
{
conn.Open();
cmd.Parameters.Add(LocalCreateParameterRaw("ProjectId", projectId));
var reader = cmd.ExecuteReader();
if (reader.Read())
{
byte[] buffer = new byte[16];
reader.GetBytes(0, 0, buffer, 0, 16);
var boqId = new Guid(buffer);
return boqId;
}
return null;
}
Where LocalCreateParameterRaw is declared as:
public static OracleParameter LocalCreateParameterRaw(string name, object value)
{
OracleParameter oracleParameter = new OracleParameter();
oracleParameter.ParameterName = name;
oracleParameter.OracleDbType = OracleDbType.Raw;
oracleParameter.Size = 16;
oracleParameter.Value = value;
return oracleParameter;
}
The underlying type for 'projectId' is 'Guid'.
The if (reader.Read()) always evaluates to false, despite there being exactly one row in the table. It normally should return only one row.
Using GI Oracle Profiler I can catch the SQL sent to the db, but only once did the profiler provide a value for the :ProjectId parameter, and it was in lower case. Run with that value, the query returned no results, but as soon as I applied UPPER to it, I got a result.
It looks like I somehow have to get my parameter into uppercase for the query to work, but I have no idea how. Yet if I do a ToString().ToUpper() on the projectId GUID, I get a parameter binding error.
VERY IMPORTANT:
I have tried removing the where clause altogether, and no longer add a parameter, so all rows in the table should be returned, yet still no results.
I don't know how, but making the SQL string a verbatim string (prefixed with @) causes the proc to work. So, it doesn't work with:
var sql = @"SELECT BOQ_IMPORT_ID "
+ "FROM ITIS_PRJ.PRJ_BOQ_IMPORT_HEADER "
+ "WHERE PROJECT_ID = :projectid "
+ "ORDER BY CREATED_ON DESC "
+ "FETCH FIRST ROW ONLY";
Yet the same command string in SQL Developer executes and returns results. When I make my SQL string verbatim, as below, I get results.
var sql = @"select BOQ_IMPORT_ID
from ITIS_PRJ.PRJ_BOQ_IMPORT_HEADER
where PROJECT_ID = :ProjectId
order by CREATED_ON desc
fetch first 1 row only";
Using a more general approach, try the following:
var sql = "SELECT BOQ_IMPORT_ID "
+ "FROM ITIS_PRJ.PRJ_BOQ_IMPORT_HEADER "
+ "WHERE PROJECT_ID = :projectid "
+ "ORDER BY CREATED_ON DESC "
+ "FETCH FIRST ROW ONLY";
using (DbConnection conn = new OracleConnection(ApplicationSettings.ConnectionString))
using (DbCommand cmd = conn.CreateCommand()) {
DbParameter parameter = cmd.CreateParameter();
parameter.ParameterName = "projectid";
parameter.Value = projectId.ToString("N").ToUpper(); //<-- NOTE FORMAT USED
cmd.Parameters.Add(parameter);
cmd.CommandType = CommandType.Text;
cmd.CommandText = sql;
conn.Open();
var reader = cmd.ExecuteReader();
if (reader.Read()) {
var boqId = new Guid((byte[])reader[0]);
return boqId;
}
return null;
}
You wrote: "It looks like I somehow have to get my parameter into uppercase for the query to work, but I have no idea how. Yet if I do a ToString().ToUpper() on the projectId GUID, I get a parameter binding error."
Reference Guid.ToString Method
Specifier N formats it to 32 digits: 00000000000000000000000000000000
When no format is provided the default format is D which would include 32 digits separated by hyphens.
00000000-0000-0000-0000-000000000000
That would explain your binding error.
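To make the difference between the two formats concrete, here is a small sketch in Java rather than C#. java.util.UUID prints the hyphenated, lower-case form by default, which corresponds to the D format; stripping the hyphens and upper-casing gives the equivalent of ToString("N").ToUpper(). The class and method names are made up for illustration.

```java
import java.util.Locale;
import java.util.UUID;

final class GuidFormatSketch {
    // Counterpart of the "D" format: 32 hex digits in five hyphen-separated groups.
    static String formatD(UUID id) {
        return id.toString();
    }

    // Counterpart of ToString("N").ToUpper(): 32 hex digits, no hyphens, upper case.
    static String formatNUpper(UUID id) {
        return id.toString().replace("-", "").toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        UUID id = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");
        System.out.println(formatD(id));      // 36 characters including hyphens
        System.out.println(formatNUpper(id)); // 32 characters
    }
}
```

The hyphenated form is 36 characters long, so it cannot be interpreted as 16 bytes of hex, while the 32-character form can.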

How to set Hive configuration property hive.exec.dynamic.partition from Java code

I have written a Java program that connects to Hive through HiveServer2 and creates and manages tables. Simple operations such as create, drop, and insert work fine.
I want to create an external table with partitions. For this I need to change the values of the following Hive properties:
hive.exec.dynamic.partition = true
hive.exec.dynamic.partition.mode = nonstrict
In the Hive CLI I can do this using SET and the property name, but how can it be done in Java code?
Here is my Java code:
public class HiveJdbcClient {
private static String strDriverName = "org.apache.hive.jdbc.HiveDriver";
public static void main(String[] args) throws SQLException {
try{
Class.forName(strDriverName);
} catch (ClassNotFoundException e){
e.printStackTrace();
System.out.println("No class found");
System.exit(1);
}
Connection con = DriverManager.getConnection("jdbc:hive2://172.11.1.11:10000/default","root","root123");
Statement stmt = con.createStatement();
String strTableName = "testtable";
//stmt.execute("drop table " + strTableName);
//creating staging table that will load the data to partition data
String strStagingTableSql = "create table if not exists "+strTableName+"_staging "+ " (SEQUENCE_NO DECIMAL, DATE_KEY INT, ACTIVITY_TIME_KEY INT, Ds_KEY INT, Ds_VALUE DECIMAL, TL_DATE_KEY INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '~'";
String strMainTableSql = "create external table if not exists "+strTableName+" (SEQUENCE_NO DECIMAL, ACTIVITY_TIME_KEY INT, Ds_KEY INT, Ds_VALUE DECIMAL, TL_DATE_KEY INT) PARTITIONED BY (DATE_KEY INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '~' LOCATION '/informatica/dwh/teradata/testtable'";
String strCreateSql = "create external table if not exists "+ strTableName + " (key int, value string) row format delimited fields terminated by ','";
boolean res = stmt.execute(strCreateSql);
//show tables
String sql = "show tables '" + strTableName + "'";
ResultSet res1 = stmt.executeQuery(sql);
if (res1.next()){
System.out.println(res1.getString(1));
}
sql = "describe "+ strTableName;
System.out.println("Running: "+ sql);
res1 = stmt.executeQuery(sql);
while (res1.next()){
System.out.println(res1.getString(1) + "\t" + res1.getString(2));
}
// load data into table
// NOTE: filepath has to be local to the hive server
// NOTE: /tmp/a.txt is a ctrl-A separated file with two fields per line
String strFilepath = "/informatica/testing_hive_client_java.txt";
sql = "load data inpath '" + strFilepath + "' into table " + strTableName;
System.out.println("Running: " + sql);
res = stmt.execute(sql);
sql = "select count(1) from "+ strTableName;
System.out.println("Running: "+ sql);
res1 = stmt.executeQuery(sql);
while(res1.next()){
System.out.println(res1.getString(1));
}
}// end of main
}// end of class
Experts please pour in your thoughts.
I was able to solve my problem with the following code:
boolean resHivePropertyTest = stmt
.execute("SET hive.exec.dynamic.partition = true");
resHivePropertyTest = stmt
.execute("SET hive.exec.dynamic.partition.mode = nonstrict");
Because this is JDBC client code, execute simply sends the statement to Hive, which runs it there, so this worked for me.
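If more than a couple of properties are needed, the same idea can be wrapped in a small helper. The sketch below is one possible shape, not an official Hive API: the class HiveSessionSettings and its methods are hypothetical, and it assumes the Statement comes from the same HiveServer2 connection that later runs the partitioned insert, since SET applies to the current session.

```java
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class HiveSessionSettings {
    // The two properties from the question, kept in insertion order.
    static Map<String, String> dynamicPartitionSettings() {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("hive.exec.dynamic.partition", "true");
        props.put("hive.exec.dynamic.partition.mode", "nonstrict");
        return props;
    }

    // Builds one "SET key=value" command per property.
    static List<String> toSetCommands(Map<String, String> properties) {
        List<String> commands = new ArrayList<>();
        for (Map.Entry<String, String> e : properties.entrySet()) {
            commands.add("SET " + e.getKey() + "=" + e.getValue());
        }
        return commands;
    }

    // Sends each SET command through the given JDBC statement.
    static void apply(Statement stmt, Map<String, String> properties) throws SQLException {
        for (String command : toSetCommands(properties)) {
            stmt.execute(command);
        }
    }

    public static void main(String[] args) {
        // Prints the commands without needing a running Hive server.
        toSetCommands(dynamicPartitionSettings()).forEach(System.out::println);
    }
}
```

In the program from the question, a call such as HiveSessionSettings.apply(stmt, HiveSessionSettings.dynamicPartitionSettings()) would go right after con.createStatement().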

mariadb jdbc driver blob update not supported

After I replaced the MySQL JDBC driver 5.1 with the MariaDB JDBC driver 1.1.5 and tested the existing code base against MySQL Server 5.0 and MariaDB Server 5.2, everything worked fine except a JDBC call that updates a blob field in a table.
The blob field contains an XML configuration file. I can read it out, convert it to XML, and insert some values.
I then convert it to a ByteArrayInputStream object and call the method
statement.updateBinaryStream(columnLabel, the ByteArrayInputStream object, its length)
but an exception is thrown:
Perhaps you have some incorrect SQL syntax?
java.sql.SQLFeatureNotSupportedException: Updates are not supported
at org.mariadb.jdbc.internal.SQLExceptionMapper.getFeatureNotSupportedException(SQLExceptionMapper.java:165)
at org.mariadb.jdbc.MySQLResultSet.updateBinaryStream(MySQLResultSet.java:1642)
at org.apache.commons.dbcp.DelegatingResultSet.updateBinaryStream(DelegatingResultSet.java:511)
I tried the updateBlob method; the same exception was thrown.
The code works well with the MySQL JDBC driver 5.1.
Any suggestions on how to work around this situation?
See the ticket "updating blob with updateBinaryStream", where a comment states that it isn't supported.
A workaround would be to use two SQL statements: one to select the data and another to update it. Something like this:
final Statement select = connection.createStatement();
try {
final PreparedStatement update = connection.prepareStatement( "UPDATE table SET blobColumn=? WHERE idColumn=?" );
try {
final ResultSet selectSet = select.executeQuery( "SELECT idColumn,blobColumn FROM table" );
try {
while( selectSet.next() ) { // move the cursor onto each row before reading it
final int id = selectSet.getInt( "idColumn" );
final InputStream stream = workWithSTreamAndRetrunANew( selectSet.getBinaryStream( "blobColumn" ) );
update.setBinaryStream( 1,stream );
update.setInt( 2,id );
update.execute();
}
}
finally {
if( selectSet != null )
selectSet.close();
}
}
finally {
if( update != null )
update.close();
}
}
finally {
if( select != null )
select.close();
}
But be aware that you need some way to uniquely identify a table entry; in this example the column idColumn is used for that purpose. Furthermore, if you stored an empty stream in the database, you might get an SQLException.
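A simpler workaround is to use binary literals (like X'2a4b54') and concatenation (UPDATE table SET blobcol = blobcol || X'2a4b54'). Note that MySQL and MariaDB treat || as concatenation only when the PIPES_AS_CONCAT SQL mode is enabled; otherwise CONCAT(blobcol, X'2a4b54') does the same job. Like this: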
int iBUFSIZ = 4096;
byte[] buf = new byte[iBUFSIZ];
int iLength = 0;
int iUpdated = 1;
for (int iRead = stream.read(buf, 0, iBUFSIZ);
(iUpdated == 1) && (iRead != -1) && (iLength < iTotalLength);
iRead = stream.read(buf, 0, iBUFSIZ))
{
String sValue = "X'" + toHex(buf,0,iRead) + "'";
if (iLength > 0)
sValue = sBlobColumn + " || " + sValue;
String sSql = "UPDATE "+sTable+" SET "+sBlobColumn+"= "+sValue;
Statement stmt = connection.createStatement();
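iUpdated = stmt.executeUpdate(sSql);
stmt.close();
iLength += iRead; // count the bytes written so that later chunks are appended rather than overwriting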
}
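The loop above relies on a toHex helper that is not shown. A minimal sketch of what it might look like follows; the signature is inferred from the call toHex(buf,0,iRead), and the class name and the toBinaryLiteral wrapper are assumptions added for illustration.

```java
final class HexLiteralSketch {
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    // Converts length bytes of buf, starting at offset, into lower-case hex digits.
    static String toHex(byte[] buf, int offset, int length) {
        StringBuilder sb = new StringBuilder(length * 2);
        for (int i = offset; i < offset + length; i++) {
            int b = buf[i] & 0xff;               // treat the byte as unsigned
            sb.append(HEX[b >>> 4]).append(HEX[b & 0x0f]);
        }
        return sb.toString();
    }

    // Wraps the hex digits in the X'...' binary literal syntax used above.
    static String toBinaryLiteral(byte[] buf, int offset, int length) {
        return "X'" + toHex(buf, offset, length) + "'";
    }

    public static void main(String[] args) {
        byte[] sample = {0x2a, 0x4b, 0x54};
        System.out.println(toBinaryLiteral(sample, 0, sample.length));
    }
}
```

Because the literal contains only hex digits, the binary content itself cannot break out of the SQL string; the table and column names are still concatenated as-is and must come from a trusted source.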

Return one value from a table using a function

I have this code:
public int GetUserIdByEmail(string email)
{
using (SqlConnection conn = new SqlConnection(ZincModelContainer.CONNECTIONSTRING))
{
using (SqlCommand cmd = conn.CreateCommand())
{
conn.Open();
cmd.CommandType = System.Data.CommandType.Text;
cmd.CommandText = String.Concat("SELECT [Zinc].[GetUserIdByEmail] (", email, ")"); //is this correct??? the problem lies here
return (int)cmd.ExecuteScalar();
}
}
}
I get an error in the code above; it is still not right.
I now have my function as shown below, as suggested by veljasije.
Thanks.
Modify your procedure:
CREATE PROCEDURE [Zinc].[GetUserIdByEmail]
(
@Email varchar (100)
)
AS
BEGIN
SELECT zu.UserId from Zinc.Users zu WHERE Email = @Email
END
And in your code, change the type of the parameter from NVarChar to VarChar.
Function
CREATE FUNCTION [Zinc].[GetUserIdByEmail]
(
@Email varchar(100)
)
RETURNS int
AS
BEGIN
DECLARE @UserId int;
SET @UserId = (SELECT zu.UserId from Zinc.Users zu WHERE Email = @Email)
RETURN @UserId
END
Firstly, specify the size of the @Email parameter in the sproc. Without it, the parameter defaults to 1 character, so the query will not be matching against the value you expect.
Always specify the size explicitly to avoid any issues (e.g. per Marc_s's comment, plus demo I blogged about here, it behaves differently by defaulting to 30 chars when using CAST/CONVERT).
Secondly, use SqlCommand.ExecuteScalar()
e.g.
userId = (int)cmd.ExecuteScalar();

Storing .NET double value in Oracle DB

I'm using ODP.NET to access Oracle DB from C# .NET.
Please see the following code:
OracleConnection con = new OracleConnection();
con.ConnectionString = "User Id=user;Password=pass;Data Source=localhost/orcl";
con.Open();
/* create table */
DbCommand command = con.CreateCommand();
command.CommandType = CommandType.Text;
try
{
command.CommandText = "DROP TABLE TEST";
command.ExecuteNonQuery();
}
catch
{
}
//command.CommandText = "CREATE TABLE TEST (VALUE BINARY_DOUBLE)";
command.CommandText = "CREATE TABLE TEST (VALUE FLOAT(126))";
command.ExecuteNonQuery();
/* now insert something */
double val = 0.8414709848078965;
command.CommandText = "INSERT INTO TEST VALUES (" + val.ToString(System.Globalization.CultureInfo.InvariantCulture) + ")";
command.ExecuteNonQuery();
/* and now read inserted value */
command.CommandText = "SELECT * FROM TEST";
DbDataReader reader = command.ExecuteReader();
reader.Read();
double res = (double) (decimal)reader[0];
Console.WriteLine("Inserted " + val + " selected " + res);
The output from this is always:
Inserted 0,841470984807897 selected 0,841470984807897
But looking at the variable values in the debugger:
val == 0.8414709848078965
res == 0,841470984807897
Why is res rounded?
I looked into the DB and the rounded value is what is stored there.
On the other hand, when I used Oracle SQL Developer to modify this value, I was able to store 0.8414709848078965 in the database.
I tried the types NUMBER, FLOAT(126), BINARY_DOUBLE... always with the same result.
Why is there a problem when using ODP.NET?
OK, I have found that it works if the parameter type is OracleDbType.BinaryDouble. But that makes my code dependent on ODP.NET. I wanted to use ADO.NET types (DbType) to keep my code provider-independent.
Oracle actually has a higher precision for its numbers than .NET!
I tried this in straight Oracle and it works fine. I recommend switching to a parameter,
e.g.
-- CREATE TABLE TEST (VALUE NUMBER(38,38)); (initial test)
INSERT INTO TEST VALUES (0.8414709848078965);
SELECT * FROM TEST;
VALUE
----------------------
0.8414709848078965
(recommendation)
OracleParameter param = new OracleParameter();
param.ParameterName = "NUMBERVALUE";
param.Direction = ParameterDirection.Input;
param.OracleDbType = OracleDbType.Decimal;
param.Value = "0.8414709848078965";
command.Parameters.Add(param);
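As a side note, the value read back in the question is exactly what 0.8414709848078965 becomes when rounded to 15 significant digits, so it may be worth checking whether some step (for example the string conversion used to build the INSERT) only emits 15 digits. The Java sketch below merely reproduces that arithmetic with BigDecimal; it does not touch Oracle or ODP.NET, and the class name is invented for illustration.

```java
import java.math.BigDecimal;
import java.math.MathContext;

final class SignificantDigitsSketch {
    // Rounds a decimal literal to the given number of significant digits (half-up).
    static BigDecimal roundTo(String literal, int significantDigits) {
        return new BigDecimal(literal).round(new MathContext(significantDigits));
    }

    public static void main(String[] args) {
        // 15 significant digits: the last digit is rounded away.
        System.out.println(roundTo("0.8414709848078965", 15).toPlainString());
        // 16 significant digits: the value is kept as written.
        System.out.println(roundTo("0.8414709848078965", 16).toPlainString());
    }
}
```

Passing the value as a bound parameter, as recommended above, avoids going through a text representation at all.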
