Role of H2 database in Apache Ignite - h2

I have an Apache Spark job, and one of its components queries the Apache Ignite data grid through Ignite SQL using a SqlFieldsQuery. While going through a thread dump, I saw the following in one of the executor logs:
org.h2.mvstore.db.TransactionStore.begin(TransactionStore.java:229)
org.h2.engine.Session.getTransaction(Session.java:1580)
org.h2.engine.Session.getStatementSavepoint(Session.java:1588)
org.h2.engine.Session.setSavepoint(Session.java:793)
org.h2.command.Command.executeUpdate(Command.java:252)
org.h2.jdbc.JdbcStatement.executeUpdateInternal(JdbcStatement.java:130)
org.h2.jdbc.JdbcStatement.executeUpdate(JdbcStatement.java:115)
org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.connectionForThread(IgniteH2Indexing.java:428)
org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.connectionForSpace(IgniteH2Indexing.java:360)
org.apache.ignite.internal.processors.query.h2.IgniteH2Indexing.queryLocalSqlFields(IgniteH2Indexing.java:770)
org.apache.ignite.internal.processors.query.GridQueryProcessor$5.applyx(GridQueryProcessor.java:892)
org.apache.ignite.internal.processors.query.GridQueryProcessor$5.applyx(GridQueryProcessor.java:886)
org.apache.ignite.internal.util.lang.IgniteOutClosureX.apply(IgniteOutClosureX.java:36)
org.apache.ignite.internal.processors.query.GridQueryProcessor.executeQuery(GridQueryProcessor.java:1666)
org.apache.ignite.internal.processors.query.GridQueryProcessor.queryLocalFields(GridQueryProcessor.java:886)
org.apache.ignite.internal.processors.cache.IgniteCacheProxy.query(IgniteCacheProxy.java:698)
com.test.ignite.cache.CacheWrapper.queryFields(CacheWrapper.java:1019)
The last line of the trace is in my code, which executes a SQL fields query as follows:
SqlFieldsQuery sql = new SqlFieldsQuery(queryString).setArgs(args);
cache.query(sql);
According to my understanding, Ignite has its own data grid, which it uses to store the cache data and indexes. It only uses the H2 database to parse the SQL query and get a query execution plan.
But the thread dump shows that updates are being executed and transactions are involved. I don't understand the need for transactions or updates in a SQL SELECT query.
I want to know the following about the role of the H2 database in Ignite:
I went into the open source code of Apache Ignite (version 1.7.0) and saw that it opens a connection to a specific schema in the H2 database by executing the query SET SCHEMA schema_name (in the connectionForThread() method of the IgniteH2Indexing class). Is one schema or one table created for every cache? If yes, what information does it contain, since all the data is stored in Ignite's data grid?
I also came across another interesting thing in the open source code: Ignite derives the H2 schema name from the space name (see the queryLocalSqlFields() method of the IgniteH2Indexing class). I want to know what this space name indicates, and whether it is internal to Ignite or configurable.
Does setting the schema and connecting to the H2 database happen for each of my SQL queries? If yes, is there any way to avoid this?

Yes, we call executeUpdate to set the schema. In Ignite 2.x we will be able to switch to Connection.setSchema for that. Right now we create a SQL schema for each cache, and you can create multiple tables in it, but this is going to change in the future. It does not actually contain anything; we just use some H2 APIs.
The space name is basically the same thing as a cache name. You can configure the SQL schema name for a cache using CacheConfiguration.setSqlSchema.
If you run queries using the same cache instance, the schema will not change.

Related

H2 database table getting clear automatically

I am using the H2 database to test my Spring Boot application. I do not use a file to store the data; instead, I just use the in-memory database. In the properties file, my JDBC URL looks like this:
spring.datasource.url=jdbc:h2:mem:;MODE=MSSQLServer;INIT=runscript from 'classpath:/schema.sql'\\;runscript from 'classpath:/data.sql'
Now, when I run the tests, I have the following test scenario:
1) Add some entities to a table (this adds some rows to the table)
2) Search for those entities by some criteria
3) Do the assertion
Sometimes this runs successfully, but sometimes the search query returns an empty list, which causes the test to fail.
To check whether my entities are inserted properly, I added print statements to the insert function. After each insertion, I run the query below:
SELECT * FROM tableName;
This returns the correct list, which means each insertion writes to the table correctly. In the search function, before running the actual search query, I run the same query again:
SELECT * from tableName;
Surprisingly, this also returns an empty result, which means there is no data in the table. Please suggest what I should check.
Pretty sure @Evgenij Ryazanov's comment is correct here.
Closing the last connection to a database closes the database.
When using in-memory databases this means the content is lost.
After step 1) (add some entities to a table), is the connection closing?
If so, to keep the database open, add ;DB_CLOSE_DELAY=-1 to the database URL.
e.g.
spring.datasource.url=jdbc:h2:mem:;DB_CLOSE_DELAY=-1;MODE=MSSQLServer;INIT=runscript from 'classpath:/schema.sql'\\;runscript from 'classpath:/data.sql'
Note, this can create a memory leak!
see: http://www.h2database.com/html/features.html#in_memory_databases

Embedded H2 Database for dynamic files

In our application, we need to load large CSV files and fetch some data out of them, for example the distinct values of a column. For this, we decided to go with an in-memory database such as H2, as there is no need to keep the data in persistent storage.
However, the files are so dynamic that the columns may not be the same from one file to the next. I need to load each file into an H2 table that is temporary for that session.
The tech stack is Spring Boot and H2.
The examples I see on forums use a standard entity that knows which fields the table has. In my case, however, the table columns will be dynamic.
I tried the following in Spring Boot:
public interface ImportCSVRepository extends JpaRepository<Object, String>
with
@Query(value = "CREATE TABLE TEST AS SELECT * FROM CSVREAD('test.csv');", nativeQuery = true)
But this gives an unmanaged entity error. I understand why the error is thrown, but I am not sure how to achieve this. Also, please clarify whether I should use Spring Batch.
You can use JdbcTemplate to manually create tables and query/update the data in them.
An example of how to create a table with JdbcTemplate
Dynamically creating tables and defining new entities (or modifying existing ones) is hardly possible with Spring Data repositories and @Entity classes. You should probably also look at some NoSQL databases such as MongoDB, where it is easier to define documents with dynamic structures (or key-value objects, as in Redis).
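As a rough sketch of the JdbcTemplate suggestion above, the SQL can be built as plain strings and then handed to JdbcTemplate.execute(String) and JdbcTemplate.queryForList(String, Class). The class and method names below (CsvImportSketch, createTableSql, distinctSql, importAndListDistinct) are hypothetical. The last method uses plain java.sql so the block has no Spring dependency, and it assumes an open connection to an H2 database, since CSVREAD is an H2 function.

```java
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: the names here are illustrative, not part of Spring or H2.
class CsvImportSketch {

    // Table and column names cannot be bound as query parameters,
    // so only plain identifiers are accepted before they are concatenated.
    static String requireIdentifier(String name) {
        if (name == null || !name.matches("[A-Za-z_][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("Not a plain SQL identifier: " + name);
        }
        return name;
    }

    // Builds the H2 statement that creates a table whose columns
    // are taken from the header row of the CSV file.
    static String createTableSql(String table, String csvPath) {
        String pathLiteral = "'" + csvPath.replace("'", "''") + "'";
        return "CREATE TABLE " + requireIdentifier(table)
                + " AS SELECT * FROM CSVREAD(" + pathLiteral + ")";
    }

    // Builds a query for the distinct values of one column.
    static String distinctSql(String table, String column) {
        return "SELECT DISTINCT " + requireIdentifier(column)
                + " FROM " + requireIdentifier(table);
    }

    // Assumes conn is an open H2 connection. With Spring, the same two strings
    // could be passed to JdbcTemplate.execute and JdbcTemplate.queryForList instead.
    static List<String> importAndListDistinct(Connection conn, String table,
                                              String csvPath, String column) throws SQLException {
        List<String> values = new ArrayList<>();
        try (Statement st = conn.createStatement()) {
            st.execute(createTableSql(table, csvPath));
            try (ResultSet rs = st.executeQuery(distinctSql(table, column))) {
                while (rs.next()) {
                    values.add(rs.getString(1));
                }
            }
        }
        return values;
    }
}
```

Because the column list is never declared in Java, no entity class is needed. The trade-off is that the column name passed to distinctSql has to be checked by the caller against the actual CSV header.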

Running multiple hive queries using tHiveRow component in Talend

Hi, I want to run multiple Hive queries through a single component. With tHiveRow I am able to run a single query, but I am unable to run multiple queries at a time.
From the following link I know that multiple SQL queries can be run this way: http://www.vikramtakkar.com/2013/05/example-to-execute-multiple-sql-queries.html
Does anyone have an idea how to run multiple Hive queries?
The link you reference shows a MySQL connection. That says nothing about the capabilities of the Hive JDBC driver, since running multiple statements in one JDBC statement is a driver-specific feature!
To run multiple queries:
Start with a tFixedFlowInput component. Configure one String column and choose the table input option; you will get a table with one column. Each line you add will be one Hive statement. Now connect it to a tHiveRow component and reference the column of the incoming flow in the SQL text area as <flowName>.<columnName>, e.g. row1.sqlStatement (if the String column in your tFixedFlowInput is named "sqlStatement" and the connection between the tFixedFlowInput and the tHiveRow component is called "row1").
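Outside Talend, the same idea (one statement per call, rather than relying on driver-specific multi-statement support) can be sketched in plain JDBC. This is only an illustration: the class and method names are hypothetical, the split on ';' is deliberately naive, and runAll assumes an already opened connection to whatever database the statements target.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper illustrating "one statement per execute call".
class OneStatementPerCall {

    // Naive split on ';'. Assumes no semicolons inside string literals or comments.
    static List<String> splitStatements(String script) {
        List<String> statements = new ArrayList<>();
        for (String part : script.split(";")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                statements.add(trimmed);
            }
        }
        return statements;
    }

    // Sends each statement to the driver separately, in order.
    static void runAll(Connection conn, List<String> statements) throws SQLException {
        try (Statement st = conn.createStatement()) {
            for (String sql : statements) {
                st.execute(sql);
            }
        }
    }
}
```

Each element of the returned list plays the role of one row of the tFixedFlowInput table described above.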

Index Check in OpenEdge 10.2b which uses Oracle schema

How can I find out which indexes a particular module uses in OpenEdge 10.2, when it runs against an Oracle DB schema?
I have used XREF, but the .xrf file does not give any index details for my module. I then ran the simple query below and checked the .xrf file again, but no index details were available.
FOR EACH tablename NO-LOCK USE-INDEX indexname:
DISPLAY tablename.field.
END.
Please help me get index details for a Progress DB that uses an Oracle schema.
First, I assume you are using the Oracle DataServer from Progress.
If that is the case, bear in mind that every USE-INDEX is basically translated into an ORDER BY in the resulting query, so it is mostly used to order the data, not to access it.
If you want to know how your data is accessed, you need to enable qt_debug when connecting to the schema holder. That lets you print a lot of information about how your Progress code is translated into SQL to access the Oracle DB. You then need to analyze those SQL statements (with SQL EXPLAIN, for example) to see how your queries perform and how they access the DB.

How to query the session in ASP.NET MVC with a dynamic query

I want to store some user data in memory, as in an in-memory NoSQL database.
Later on, I want to query that data with a dynamic query constructed by the user. That query is stored as a string in a classic DB, so when I need to query the data stored in memory, I would like to parse that string and construct the desired query (by some known rules).
I looked at Redis and found that it isn't maintained for Windows anymore. I have also looked at RavenDB, but its main query language is LINQ, even though a dynamic Lucene query can be created.
Can you suggest another in-memory DB that works with ASP.NET and can be queried with a dynamically created query? Maybe I haven't seen all the options.
I prefer a name-value or JSON-based NoSQL store, so its schema can easily be modified without the constraints of relational DBs.
I would suggest simply using SQLite. It can easily be used as an in-memory database (just open the database using ":memory:" instead of a file name).
You can use a simple two-column table with a primary key to emulate a key/value store.
Here are a few links you might find helpful:
http://www.sqlite.org/inmemorydb.html
How to create asp.net web application using sqlite
