TTL on Ignite 2.5.0 not working - spring

I tried enabling TTL for records in Ignite using two approaches, but neither seems to be working. I need help understanding whether I am missing something.
IgniteCache cache = ignite.getOrCreateCache(IgniteCfg.CACHE_NAME);
cache.query(new SqlFieldsQuery(
        "CREATE TABLE IF NOT EXISTS City (id LONG primary key, name varchar, region varchar)"))
    .getAll();
cache.withExpiryPolicy(new CreatedExpiryPolicy(new Duration(TimeUnit.SECONDS, 10)))
    .query(new SqlFieldsQuery(
        "INSERT INTO City (id, name, region) VALUES (?, ?, ?)").setArgs(1, "Forest Hill1", "GLB"))
    .getAll();
So, as you can see above, I created a table in the cache and inserted a record with an expiry TTL of 10 seconds, but it seems the record never expires.
I also tried another approach: rather than setting the TTL while inserting the record, I set it on the CacheConfiguration when initializing Ignite. Below is the code sample:
Ignition.setClientMode(true);
IgniteConfiguration cfg = new IgniteConfiguration();
// Disabling peer-class loading feature.
cfg.setPeerClassLoadingEnabled(false);
CacheConfiguration ccfg = createCacheConfiguration();
ccfg.setEagerTtl(true);
ccfg.setExpiryPolicyFactory(CreatedExpiryPolicy.factoryOf(new Duration(TimeUnit.SECONDS, 5)));
cfg.setCacheConfiguration(ccfg);
TcpCommunicationSpi commSpi = new TcpCommunicationSpi();
cfg.setCommunicationSpi(commSpi);
TcpDiscoveryVmIpFinder tcpDiscoveryFinder = new TcpDiscoveryVmIpFinder();
String[] addresses = { "127.0.0.1" };
tcpDiscoveryFinder.setAddresses(Arrays.asList(addresses));
TcpDiscoverySpi discoSpi = new TcpDiscoverySpi();
discoSpi.setIpFinder(tcpDiscoveryFinder);
cfg.setDiscoverySpi(discoSpi);
return Ignition.start(cfg);
I am running Ignite locally (not just embedded in memory), as my final goal is to be able to connect to the same Ignite cluster from multiple instances of the app, or even from multiple apps.

Ignite SQL currently doesn't interact with expiry policies and doesn't update TTL. There is a Feature Request for that: https://issues.apache.org/jira/browse/IGNITE-7687.
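Until that feature lands, entries written through the key-value API do honor expiry policies, so one workaround is to insert via put() rather than SQL INSERT. A minimal sketch, not the original code: the cache name, key/value types, and the inline Ignition.start() are illustrative assumptions.

import java.util.concurrent.TimeUnit;

import javax.cache.expiry.CreatedExpiryPolicy;
import javax.cache.expiry.Duration;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;

public class TtlPutExample {
    public static void main(String[] args) {
        try (Ignite ignite = Ignition.start()) {
            IgniteCache<Long, String> cities = ignite.getOrCreateCache("cityCache");

            // Puts made through this projection expire 10 seconds after creation.
            IgniteCache<Long, String> citiesWithTtl = cities.withExpiryPolicy(
                    new CreatedExpiryPolicy(new Duration(TimeUnit.SECONDS, 10)));

            citiesWithTtl.put(1L, "Forest Hill1");
            // After roughly 10 seconds the entry is evicted (subject to eager TTL cleanup).
        }
    }
}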

Related

Quartz creating two triggers for one job

I am using Quartz 2.3.0 with my Spring Boot project, and I have one job that runs in a clustered environment every 45 minutes. Below is my quartz.properties file content:
org.quartz.scheduler.instanceName = SSDIClusteredScheduler
org.quartz.scheduler.instanceId = AUTO
# thread-pool
org.quartz.threadPool.class=org.quartz.simpl.SimpleThreadPool
org.quartz.threadPool.threadCount=1
org.quartz.threadPool.threadsInheritContextClassLoaderOfInitializingThread=true
org.quartz.jobStore.isClustered = true
org.quartz.jobStore.clusterCheckinInterval = 20000
# Enable these properties for a JDBCJobStore using JobStoreTX
org.quartz.jobStore.class=org.quartz.impl.jdbcjobstore.JobStoreTX
org.quartz.jobStore.driverDelegateClass=org.quartz.impl.jdbcjobstore.StdJDBCDelegate
org.quartz.jobStore.dataSource=quartzDataSource
# Enable this property for JobStoreCMT
#org.quartz.jobStore.nonManagedTXDataSource=quartzDataSource
#============================================================================
# Configure Datasources
#============================================================================
org.quartz.dataSource.quartzDataSource.driver=oracle.jdbc.driver.OracleDriver
org.quartz.dataSource.quartzDataSource.URL=${quartz_datasource_url}
org.quartz.dataSource.quartzDataSource.user=${quartz_datasource_username}
org.quartz.dataSource.quartzDataSource.maxConnections = 5
org.quartz.dataSource.quartzDataSource.validationQuery=select 0 from dual
And below is my code to create a trigger:
@Bean
public Trigger someTrigger(@Qualifier("someJob") JobDetail jobDetail) {
    Trigger trigger = TriggerBuilder.newTrigger().withIdentity(jobDetail.getKey().getName()).forJob(jobDetail)
            .withSchedule(CronScheduleBuilder.cronSchedule(someCronExpression)).build();
    return trigger;
}
But when I run the job, two triggers get created for one single job: one with the name quartzScheduler and another with my instance name, i.e. SSDIClusteredScheduler, for the same job; one trigger says it is clustered and the other says it is non-clustered.
I am not able to understand how this is happening. I have explored a lot of documentation, but I am not able to find the cause.
[Screenshot: content of the Triggers table]
[Screenshot: content of the Fired_Triggers table]
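One possible explanation (an assumption on my part, not confirmed by the post): if quartz.properties is not picked up by the Spring-managed SchedulerFactoryBean, Spring Boot's auto-configured scheduler keeps its defaults (named quartzScheduler, non-clustered) while a second, clustered SSDIClusteredScheduler is built from the properties file, so the same job gets scheduled twice. A hedged sketch of pointing the Spring-managed factory at the properties file explicitly, assuming quartz.properties is on the classpath:

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.core.io.ClassPathResource;
import org.springframework.scheduling.quartz.SchedulerFactoryBean;

// Hypothetical configuration: loads quartz.properties into the Spring-managed
// scheduler so only one scheduler instance (SSDIClusteredScheduler) is created.
@Configuration
public class QuartzSchedulerConfig {

    @Bean
    public SchedulerFactoryBean schedulerFactoryBean() {
        SchedulerFactoryBean factory = new SchedulerFactoryBean();
        // Without this, the default scheduler ("quartzScheduler", non-clustered
        // RAM job store) is used instead of the clustered JDBC configuration.
        factory.setConfigLocation(new ClassPathResource("quartz.properties"));
        return factory;
    }
}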

Increasing redis key expiry on fetching that key data from redis cache

We all know that Redis cache entries have TTL timeouts. I would like to know if there is a provision in Redis to increase the TTL of a key every time that key's data is fetched.
That means that if the data for a key is fetched from Redis, its TTL is automatically increased.
Please help me find some info on that.
You can achieve that with Lua scripting: get the value and its TTL (in milliseconds), increment the TTL, then set the new TTL:
local key = KEYS[1]
local pttl_incr = ARGV[1]
local val = redis.call("get", key)
if not val then return nil end
local pttl = redis.call("pttl", key)
pttl = pttl + pttl_incr
-- PTTL is in milliseconds, so use PEXPIRE (not EXPIRE) to apply the new value
redis.call("pexpire", key, pttl)
return val
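If the application is Spring-based, the script can be invoked from Java via Spring Data Redis. A minimal sketch, assuming a StringRedisTemplate bean and string values; the key and increment passed in are just examples.

import java.util.Collections;

import org.springframework.data.redis.core.StringRedisTemplate;
import org.springframework.data.redis.core.script.DefaultRedisScript;

// Hedged sketch: runs the Lua script above through Spring Data Redis.
public class SlidingTtlFetcher {

    private static final String SCRIPT =
            "local key = KEYS[1]\n" +
            "local pttl_incr = ARGV[1]\n" +
            "local val = redis.call('get', key)\n" +
            "if not val then return nil end\n" +
            "local pttl = redis.call('pttl', key)\n" +
            "pttl = pttl + pttl_incr\n" +
            "redis.call('pexpire', key, pttl)\n" +
            "return val";

    private final StringRedisTemplate redisTemplate;

    public SlidingTtlFetcher(StringRedisTemplate redisTemplate) {
        this.redisTemplate = redisTemplate;
    }

    /** Fetches the value and extends its TTL by ttlIncrementMillis. */
    public String getAndExtend(String key, long ttlIncrementMillis) {
        DefaultRedisScript<String> script = new DefaultRedisScript<>(SCRIPT, String.class);
        return redisTemplate.execute(script,
                Collections.singletonList(key),
                String.valueOf(ttlIncrementMillis));
    }
}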

How to set the starting point when using the Redis scan command in spring boot

I want to migrate 70 million keys from Redis (sentinel mode) to Redis (cluster mode).
ScanOptions options = ScanOptions.scanOptions().build();
Cursor<byte[]> c = sentinelTemplate.getConnectionFactory().getConnection().scan(options);
while (c.hasNext()) {
    count++;
    String key = new String(c.next());
    key = key.trim();
    String value = (String) sentinelTemplate.opsForHash().get(key, "tc");
    //Thread.sleep(1);
    clusterTemplate.opsForHash().put(key, "tc", value);
}
I want to be able to resume the scan from a certain point, because the Redis connection gets disconnected at some point.
How do I set the starting point when using the Redis SCAN command in Spring Boot?
Moreover, whenever the program is executed using the above code, the connection breaks after almost 20 million keys have been moved.
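The cursor returned by each SCAN call encodes the iteration position, so it can be persisted and passed back to SCAN to resume after a reconnect; the Cursor returned by RedisConnection.scan(ScanOptions) always starts from cursor 0, so one option is to drop down to the driver. A hedged sketch using Jedis directly (Jedis 3.x API assumed; the host, port, and cursor bookkeeping are illustrative, not from the original code):

import redis.clients.jedis.Jedis;
import redis.clients.jedis.ScanParams;
import redis.clients.jedis.ScanResult;

// Hedged sketch: resumes a SCAN from a previously saved cursor.
public class ResumableScan {

    public static String scanFrom(String startCursor) {
        try (Jedis jedis = new Jedis("127.0.0.1", 6379)) {
            ScanParams params = new ScanParams().count(1000);
            String cursor = startCursor; // "0" for a fresh scan, or a previously saved cursor
            do {
                ScanResult<String> page = jedis.scan(cursor, params);
                for (String key : page.getResult()) {
                    // migrate the key here, e.g. copy the "tc" hash field to the cluster
                }
                cursor = page.getCursor();
                // persist `cursor` somewhere durable so the scan can be resumed later
            } while (!"0".equals(cursor));
            return cursor;
        }
    }
}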

Access Always Encrypted data from Databricks

I have a table in an Azure SQL managed instance with 'Always Encrypted' columns. I stored the column and master keys in Azure Key Vault.
My first question is: how do I access the decrypted data in Azure SQL from Databricks? For that, I connected to Azure SQL via JDBC. For the username and password, I am passing my credentials manually:
val jdbcHostname = "XXXXXXXXXXX.database.windows.net"
val jdbcPort = 1433
val jdbcDatabase = "ABCD"
val jdbcUrl = s"jdbc:sqlserver://${jdbcHostname}:${jdbcPort};database=${jdbcDatabase}"
// Create a Properties() object to hold the parameters.
import java.util.Properties
val connectionProperties = new Properties()
connectionProperties.put("user", s"${jdbcUsername}")
connectionProperties.put("password", s"${jdbcPassword}")
val driverClass = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
connectionProperties.setProperty("Driver", driverClass)
import java.sql.DriverManager
val connection = DriverManager.getConnection(jdbcUrl, jdbcUsername, jdbcPassword)
connection.isClosed()
val user = spark.read.jdbc(jdbcUrl, "dbo.bp_mp_user_test", connectionProperties)
display(user)
When I do this I am able to display the data, but it is still encrypted. How do I see the decrypted data?
I am new to the Azure and Databricks combo, so I am still learning the Azure/Microsoft stack. Are there other forms of JDBC connection syntax that allow you to decrypt?
I have the keys in Azure Key Vault. So how do I make use of those keys and the security associated with them, so that when someone accesses this table from Databricks, it shows encrypted or decrypted data accordingly?
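Not an authoritative answer, but with the Microsoft JDBC driver for SQL Server, client-side decryption of Always Encrypted columns is controlled by the columnEncryptionSetting connection property, and the driver also needs a registered column-encryption key store provider so it can reach the column master key in Azure Key Vault (see SQLServerConnection.registerColumnEncryptionKeyStoreProviders in the driver documentation; the exact wiring depends on the driver version and how you authenticate to Key Vault). A minimal sketch of the connection side, shown in Java with placeholder host, database, and credentials:

import java.sql.Connection;
import java.sql.DriverManager;
import java.util.Properties;

// Hedged sketch: enabling Always Encrypted decryption on the JDBC connection.
// Assumes the mssql-jdbc driver is on the classpath and that an Azure Key Vault
// key store provider has already been registered with the driver.
public class AlwaysEncryptedConnection {

    public static Connection open(String host, String database, String user, String password)
            throws Exception {
        String url = "jdbc:sqlserver://" + host + ":1433;database=" + database
                + ";columnEncryptionSetting=Enabled";

        Properties props = new Properties();
        props.put("user", user);
        props.put("password", password);

        // With columnEncryptionSetting=Enabled, the driver transparently decrypts
        // Always Encrypted columns in result sets (given access to the keys).
        return DriverManager.getConnection(url, props);
    }
}

The same columnEncryptionSetting=Enabled fragment should also work when appended to the jdbcUrl built in the Scala notebook code above.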

How to stabilize spark streaming application with a handful of super big sessions?

I am running a Spark Streaming application based on the mapWithState DStream function. The application transforms input records into sessions based on a session ID field inside the records.
A session is simply all of the records with the same ID. I then perform some analytics at the session level to find an anomaly score.
I couldn't stabilize my application because a handful of sessions keep getting bigger at each batch interval over an extended period (more than 1 hour). My understanding is that a single session (key-value pair) is always processed by a single core in Spark. I want to know if I am mistaken, and whether there is a solution to mitigate this issue and make the streaming application stable.
I am using Hadoop 2.7.2 and Spark 1.6.1 on YARN. Changing the batch time, blocking interval, number of partitions, number of executors, and executor resources didn't solve the issue, as one single task always makes the application choke. However, filtering out those super long sessions did solve the issue.
Below is the updateState function I am using:
val updateState = (batchTime: Time, key: String, value: Option[scala.collection.Map[String,Any]], state: State[Seq[scala.collection.Map[String,Any]]]) => {
  val session = Seq(value.getOrElse(scala.collection.Map[String,Any]())) ++ state.getOption.getOrElse(Seq[scala.collection.Map[String,Any]]())
  if (state.isTimingOut()) {
    Option(null)
  } else {
    state.update(session)
    Some((key, value, session))
  }
}
and the mapWithState call:
def updateStreamingState(inputDstream: DStream[scala.collection.Map[String,Any]]): DStream[(String, Option[scala.collection.Map[String,Any]], Seq[scala.collection.Map[String,Any]])] = {
  // previously typed as MapWithStateDStream[(String, Option[scala.collection.Map[String,Any]], Seq[scala.collection.Map[String,Any]])]
  val spec = StateSpec.function(updateState)
  spec.timeout(Duration(sessionTimeout))
  spec.numPartitions(192)
  inputDstream.map(ds => (ds(sessionizationFieldName).toString, ds)).mapWithState(spec)
}
Finally, I compute features for each session in the resulting DStream, as defined below:
def computeSessionFeatures(sessionId: String, sessionRecords: Seq[scala.collection.Map[String,Any]]): Session = {
  val features = Functions.getSessionFeatures(sessionizationFeatures, recordFeatures, sessionRecords)
  val resultSession = new Session(sessionId, sessionizationFieldName, sessionRecords)
  resultSession.features = features
  return resultSession
}
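For reference, a hedged sketch of the filtering workaround mentioned above, written against the Spark Streaming Java API; the field name and the broadcast set of oversized session IDs are hypothetical, not part of the original job.

import java.util.Map;
import java.util.Set;

import org.apache.spark.broadcast.Broadcast;
import org.apache.spark.streaming.api.java.JavaDStream;

// Hypothetical sketch of the "filter out super long sessions" workaround.
// Assumes the raw records arrive as a JavaDStream<Map<String, Object>> and that the
// IDs of sessions known to grow without bound are available as a broadcast Set<String>.
public final class SessionFilter {

    public static JavaDStream<Map<String, Object>> dropOversizedSessions(
            JavaDStream<Map<String, Object>> input,
            Broadcast<Set<String>> oversizedSessionIds,
            String sessionFieldName) {
        // Records belonging to oversized sessions are dropped before sessionization,
        // so no single mapWithState key accumulates unbounded state.
        return input.filter(record ->
                !oversizedSessionIds.value().contains(String.valueOf(record.get(sessionFieldName))));
    }
}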
