Composite index is not getting registered. It is stuck in ENABLED status forever - janusgraph

I'm running janusgraph with cassandra and ES as backend. Following is my script used in building the composite index.
JanusGraphManagement mgmt = graph.openManagement()
def addPropertyKeyIfNotExists(JanusGraphManagement mgmt, String keyName, Class keyType, org.janusgraph.core.Cardinality cardinalityType) {
    if (!mgmt.containsPropertyKey(keyName)) mgmt.makePropertyKey(keyName).dataType(keyType).cardinality(cardinalityType).make()
}
vertexCompositeIndexName = "vertex_data_composite"
addPropertyKeyIfNotExists(mgmt, "vertex_id", String.class, org.janusgraph.core.Cardinality.SINGLE)
addPropertyKeyIfNotExists(mgmt, "tenant", String.class, org.janusgraph.core.Cardinality.SINGLE)
addPropertyKeyIfNotExists(mgmt, "entity", String.class, org.janusgraph.core.Cardinality.SINGLE)
addPropertyKeyIfNotExists(mgmt, "entity_type", String.class, org.janusgraph.core.Cardinality.SINGLE)
addPropertyKeyIfNotExists(mgmt, "first_seen_at", Long.class, org.janusgraph.core.Cardinality.SINGLE)
addPropertyKeyIfNotExists(mgmt, "first_seen_source", String.class, org.janusgraph.core.Cardinality.SINGLE)
addPropertyKeyIfNotExists(mgmt, "last_seen_at", Long.class, org.janusgraph.core.Cardinality.SINGLE)
vertexId = mgmt.getPropertyKey("vertex_id")
tenant = mgmt.getPropertyKey("tenant")
entity = mgmt.getPropertyKey("entity")
entityType = mgmt.getPropertyKey("entity_type")
firstSeenAt = mgmt.getPropertyKey("first_seen_at")
firstSeenSource = mgmt.getPropertyKey("first_seen_source")
lastSeenAt = mgmt.getPropertyKey("last_seen_at")
if (!mgmt.containsGraphIndex(vertexCompositeIndexName)) {
    mgmt.buildIndex(vertexCompositeIndexName, Vertex.class).
        addKey(vertexId).
        addKey(tenant).
        addKey(entity).
        addKey(firstSeenSource).
        addKey(entityType).
        addKey(firstSeenAt).
        addKey(lastSeenAt).
        buildCompositeIndex()
}
println(mgmt.printSchema())
mgmt.commit()
mgmt.close()
graph.close()
graph = JanusGraphFactory.open("/etc/opt/janusgraph/janusgraph.properties")
mgmt = graph.openManagement()
mgmt.awaitGraphIndexStatus(graph, vertexCompositeIndexName).call()
It has been more than an hour and still the composite index status is in ENABLED. It never became REGISTERED.
gremlin> mgmt.printSchema()
==>------------------------------------------------------------------------------------------------
Graph Index (Vertex) | Type | Unique | Backing | Key: Status |
---------------------------------------------------------------------------------------------------
vertex_data_composite | Composite | false | internalindex | vertex_id: ENABLED |
| | | | tenant: ENABLED |
| | | | entity: ENABLED |
| | | | first_seen_source: ENABLED |
| | | | entity_type: ENABLED |
| | | | first_seen_at: ENABLED |
| | | | last_seen_at: ENABLED |
---------------------------------------------------------------------------------------------------
I see the following in my logs
jce-janusgraph | 7953215 [gremlin-server-worker-1] INFO org.janusgraph.graphdb.database.management.GraphIndexStatusWatcher - Some key(s) on index vertex_data_composite do not currently have status(es) [REGISTERED]: entity_type=ENABLED,vertex_id=ENABLED,first_seen_at=ENABLED,first_seen_source=ENABLED,last_seen_at=ENABLED,tenant=ENABLED,entity=ENABLED
jce-janusgraph | 7953216 [gremlin-server-worker-1] INFO org.janusgraph.graphdb.database.management.GraphIndexStatusWatcher - Timed out (PT1M) while waiting for index vertex_data_composite to converge on status(es) [REGISTERED]
Composite index not used while querying.
gremlin> g.V().has("vertex_id","ddd").profile()
==>Traversal Metrics
Step Count Traversers Time (ms) % Dur
=============================================================================================================
JanusGraphStep([],[vertex_id.eq(ddd)]) 49.819 100.00
constructGraphCentricQuery 15.133
constructGraphCentricQuery 0.074
GraphCentricQuery 19.326
\_condition=(vertex_id = ddd)
\_orders=[]
\_isFitted=false
\_isOrdered=true
\_query=[]
scan 17.445
\_query=[]
\_fullscan=true
\_condition=VERTEX
>TOTAL - - 49.819 -
We see following in the logs
jce-janusgraph | 339186 [gremlin-server-session-1] WARN org.janusgraph.graphdb.transaction.StandardJanusGraphTx - Query requires iterating over all vertices [(vertex_id = ddd)]. For better performance, use indexes

According to the JanusGraph index lifecycle, your index is already ENABLED and can be used. In this particular case you don't need to wait for it to become REGISTERED.
Normally an index transitions through the following states, in order:
INSTALLED - The index is installed in the system but not yet registered with all instances in the cluster.
REGISTERED - The index is registered with all instances in the cluster but not (yet) enabled.
ENABLED - The index is enabled and in use.
DISABLED - The index is disabled and no longer in use.
Usually, after you create an index it is INSTALLED. You then wait until all JanusGraph nodes pick up the newly created index, at which point its status changes to REGISTERED. Once it is REGISTERED (meaning all JanusGraph nodes know about it), you can enable the index or start a REINDEX process, which automatically enables the index when the reindex finishes.
So why did your index go straight to ENABLED instead of moving from state to state? JanusGraph has a special optimization that enables a newly created index immediately if all of its keys were created in the same transaction. In your case all the properties were (most likely) created in the same transaction, so your index is now ENABLED. You don't need to do anything else because the index is already in use.
P.S. As a side topic, not directly related to this use case: if an index cannot change its state from INSTALLED to REGISTERED, you can check out the following suggestions.
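The rule described above can be pictured with a small toy model. This is only a sketch under stated assumptions, not JanusGraph's implementation: the IndexStatus enum and the initialStatus and next methods are hypothetical names that mirror the four states listed in the answer.

```java
import java.util.Arrays;
import java.util.List;

// Toy model of the index lifecycle described above. All names here are
// hypothetical; this is not JanusGraph's actual code.
class IndexLifecycleSketch {

    enum IndexStatus { INSTALLED, REGISTERED, ENABLED, DISABLED }

    // keyIsNewInThisTx holds one flag per indexed key: true if the key was
    // created in the same transaction that builds the index.
    static IndexStatus initialStatus(List<Boolean> keyIsNewInThisTx) {
        boolean allNew = !keyIsNewInThisTx.contains(Boolean.FALSE);
        // Shortcut from the answer: all keys new => index starts out ENABLED.
        return allNew ? IndexStatus.ENABLED : IndexStatus.INSTALLED;
    }

    // Next state in the normal order; DISABLED is terminal in this model.
    static IndexStatus next(IndexStatus current) {
        switch (current) {
            case INSTALLED:  return IndexStatus.REGISTERED;
            case REGISTERED: return IndexStatus.ENABLED;
            default:         return IndexStatus.DISABLED;
        }
    }

    public static void main(String[] args) {
        // All keys created together with the index.
        System.out.println(initialStatus(Arrays.asList(true, true, true)));
        // One key already existed before this transaction.
        System.out.println(initialStatus(Arrays.asList(true, false, true)));
    }
}
```

In this model an index whose keys all come from the same transaction never passes through REGISTERED, which is why waiting for that status would time out.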


Spring batch step read_count, write_count and commit_count keep on increasing without any rollbacks or skips

I am investigating a spring batch job developed by someone else.
Each step reads from the database in its reader, transforms the read JPA entities to DTOs in processor and the writer does several operations on each of these DTOs that may involve database queries, third party API calls, or both.
The JPQL used in the reader picks only 105 records (limit 105 is attached to the resulting SQL query) - I am using org.springframework.batch.item.database.JpaPagingItemReader and passing pageSize=105. Also, I have overridden org.springframework.batch.item.database.AbstractPagingItemReader#getPage to always return 0. This is because the table continuously gets new entries inserted; keeping getPage to its default implementation could risk missing some records. The JPQL used in reader itself takes care of the ordering and filtering of relevant records.
Each step is configured to operate on the read records in chunks of 15 records.
However, read_count and write_count are much higher than the record limit and chunk size would suggest:
postgres_database=> select * from batch_step_execution where step_execution_id = 445124;
step_execution_id | version | step_name | job_execution_id | start_time | end_time | status | commit_count | read_count | filter_count | write_count | read_skip_count | write_skip_count | process_skip_count | rollback_count | exit_code | exit_message | last_updated
-------------------+---------+-------------------------------------+------------------+-------------------------+-------------------------+-----------+--------------+------------+--------------+-------------+-----------------+------------------+--------------------+----------------+-----------+--------------+-------------------------
445124 | 896 | step1 | 278076 | 2023-01-27 16:08:02.074 | 2023-01-29 21:41:09.375 | COMPLETED | 894 | 13395 | 0 | 13395 | 0 | 0 | 0 | 0 | COMPLETED | | 2023-01-29 21:41:09.375
(1 row)
As can be seen, the read_count is 13395 (while a limit of 105 gets attached to the reader JPQL as mentioned earlier), write_count = read_count. I was thinking in terms of possible retries, but based on what I found about Spring Batch, the rollback_count should be >0 (since a retry is preceded by a rollback).
Adding the step configuration for reference:
@Bean
public Step step1(
        StepBuilderFactory stepBuilderFactory,
        ItemReader<JPAEntity> reader,
        ItemWriter<DTO> writer) {
    return stepBuilderFactory
            .get("step1")
            .<JPAEntity, DTO>chunk(15)
            .reader(reader)
            .processor((ItemProcessor<JPAEntity, DTO>) Transformer::fromEntityToDto)
            .writer(writer)
            .faultTolerant()
            .noRollback(Exception.class)
            .build();
}
I have been unable to figure out why this could be happening.
I found Spring Batch docs weren't that helpful, and couldn't see anything similar being discussed in any other thread (on StackOverflow, or anywhere else).
Any kind of help would be highly appreciated.
Thank you.
I went through the Spring Batch docs and read a few articles and Stack Overflow answers, but found nothing. I even ran a simple Spring Batch job, but never saw this happen in that application.
The rollback count is zero because of .noRollback(Exception.class). With that setting the step does not roll back for Exception or any of its subclasses, and every checked and runtime exception in Java extends Exception. More details are in the documentation below.
https://docs.spring.io/spring-batch/docs/current/api/org/springframework/batch/core/step/builder/FaultTolerantStepBuilder.html#noRollback(java.lang.Class)
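On the counts themselves: pageSize limits each query, not the whole step. A paging reader keeps fetching pages until a query comes back empty, and with getPage pinned to 0 every fetch re-runs the query from the first row. The following self-contained simulation is a rough sketch of that loop, not Spring Batch code. It assumes that the writer makes each written record drop out of the reader's filter, and that 13395 records matched over the life of the step. The names run and Counts are hypothetical.

```java
// Simulation of a paging reader that always asks for page 0.
// Hypothetical sketch; it does not use or reproduce Spring Batch classes.
class PagingReadCountSketch {

    // Counters named after the columns in batch_step_execution.
    static final class Counts {
        final int readCount;
        final int writeCount;
        final int commitCount;

        Counts(int readCount, int writeCount, int commitCount) {
            this.readCount = readCount;
            this.writeCount = writeCount;
            this.commitCount = commitCount;
        }
    }

    // Assumes pageSize is a multiple of chunkSize, as in the question (105 and 15).
    static Counts run(int totalPending, int pageSize, int chunkSize) {
        int pending = totalPending; // rows that still match the reader's filter
        int buffered = 0;           // items left in the current page buffer
        int read = 0, write = 0, commit = 0;
        boolean exhausted = false;

        while (!exhausted) {
            int chunk = 0;
            while (chunk < chunkSize) {
                if (buffered == 0) {
                    // Page 0 again: the query returns the first pageSize pending rows.
                    buffered = Math.min(pageSize, pending);
                    if (buffered == 0) {
                        exhausted = true; // empty page ends the step
                        break;
                    }
                }
                buffered--;
                chunk++;
                read++;
            }
            // Assumption: written rows no longer match the reader's filter.
            pending -= chunk;
            write += chunk;
            commit++; // one transaction per chunk, including the final empty one
        }
        return new Counts(read, write, commit);
    }

    public static void main(String[] args) {
        Counts c = run(13395, 105, 15);
        System.out.println(c.readCount + " " + c.writeCount + " " + c.commitCount);
    }
}
```

Under those assumptions the sketch ends with 13395 reads and writes (893 * 15) and 894 commits: 893 full chunks plus one last transaction in which the reader found nothing. Those are the same figures as in the batch_step_execution row, so they are consistent with ordinary paging rather than with retries.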

Azure Log analytics - availability of a service

I am trying to visualize the availability of a windows service by using the System event log of a machine, which is stored into Azure Log analytics. When the service is running, I consider it as a 1 and when it is stopped, I consider it as a 0, so that I can graph it out
Event
| where EventLog == "System" and Source == "Service Control Manager" and EventID == 7036 and Computer =="COMPNAME"
| parse kind=relaxed EventData with * '<Data Name="param1">' Windows_Service_Name '</Data><Data Name="param2">' Windows_Service_State '</Data>' *
| where Windows_Service_Name == "Svc_Name"
| extend status = iif(Windows_Service_State == "running", "1", "0")
| summarize min(status) by bin(TimeGenerated , 1m)
| order by TimeGenerated, min_status desc
However, I am not getting any chart when I select the chart options in the Azure Log Analytics window. How can I produce a graph of availability using this query?
Regards,
There are two issues with this approach. The first is the syntax that is causing your immediate problem: you are treating the status as a string rather than a number. If you remove the quotation marks around the numbers, the chart will work.
The second issue is that these logs are event based: one event is logged when the service stops and one when it starts. Nothing is recorded in between, so you cannot look up the state at an arbitrary point in time. Your graph will show only the most recent state changes, and once the log retention period has passed with no changes there will be no events at all.
Event
| where EventLog == "System" and Source == "Service Control Manager" and EventID == 7036 and Computer =="COMPNAME"
| parse kind=relaxed EventData with * '<Data Name="param1">' Windows_Service_Name '</Data><Data Name="param2">' Windows_Service_State '</Data>' *
| where Windows_Service_Name == "Svc_Name"
| extend status = iif(Windows_Service_State == "running", 1, 0)
| summarize min(status) by bin(TimeGenerated , 1m)
| order by TimeGenerated, min_status desc
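The second point can be illustrated with a forward fill: carry the last known state into every time bin that has no event of its own. The sketch below is in Java rather than KQL and is only meant to show the idea. The fillForward method, its parameters, and the initialState assumed for bins before the first event are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Forward-fill sketch: turn sparse start/stop events into one value per minute.
// Hypothetical helper, not part of any Azure or Log Analytics API.
class AvailabilityFillSketch {

    // changes maps a minute index to the state logged in that minute (1 = running, 0 = stopped).
    // initialState is an assumption about the service state before the first logged event.
    static List<Integer> fillForward(TreeMap<Integer, Integer> changes,
                                     int startMinute, int endMinute, int initialState) {
        List<Integer> series = new ArrayList<>();
        for (int minute = startMinute; minute <= endMinute; minute++) {
            // Most recent event at or before this minute, if any.
            Map.Entry<Integer, Integer> last = changes.floorEntry(minute);
            series.add(last != null ? last.getValue() : initialState);
        }
        return series;
    }

    public static void main(String[] args) {
        TreeMap<Integer, Integer> changes = new TreeMap<>();
        changes.put(2, 0); // service stopped in minute 2
        changes.put(5, 1); // service running again in minute 5
        System.out.println(fillForward(changes, 0, 6, 1)); // [1, 1, 0, 0, 0, 1, 1]
    }
}
```

Without a step like this, minutes 3 and 4 would have no value at all, even though the service was down for both of them.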

Why does the first response take more time in a GAS web app?

Edit: I am using clasp. Updated the code to the actual GAS!
I have a GAS deployed as a web app. We send POST requests from Slack via a slash command and it needs a response in less than 3000ms because GAS can't handle asynchronous code.
At the first request, it takes more than 3000ms to send a response but on the following requests, it is around 1500ms.
The doPost function looks like the following.
var exports = exports || {};
var module = module || { exports: exports };
Logger = BetterLog.useSpreadsheet('spreadsheetId');
function doPost(request) {
  var startExecutionDate = new Date();
  var path = request.parameter.path;
  Logger.log("Request received with path: " + path);
  var response = Responses.Error;
  var token = request.parameter.token;
  if (path.startsWith('/slack')) {
    Logger.log("Slack request");
    var slackRouter = new SlackRouter();
    response = slackRouter.post(request);
    // ...
  }
  // ...
}
And this is the code for the Slack Router.
var exports = exports || {};
var module = module || { exports: exports };
var SlackRouter = (function () {
  function SlackRouter() {
  }
  SlackRouter.prototype.post = function (request) {
    var path = request.parameter.path;
    switch (path) {
      case Routes.Team:
        Logger.log("For team");
        // ...
    }
  };
  return SlackRouter;
}());
exports.SlackRouter = SlackRouter;
I have the timestamps for each log.
First attempt
| Timestamp | Delta in ms | Log Message |
|--------------|-------------|---------------|
| 11:22:34:164 | 0 | Path: ... |
| 11:22:35:354 | 1190 | Slack request |
| 11:22:35:462 | 108 | For team |
Second attempt
| Timestamp | Delta in ms | Log Message |
|--------------|-------------|---------------|
| 11:22:45:047 | 0 | Path: ... |
| 11:22:45:164 | 117 | Slack request |
| 11:22:45:350 | 186 | For team |
I already had several ideas, such as the web app going into a sleep state, but since we calculate the delta from the first log message that doesn't make sense.
So what is going on behind the scenes? Are you aware of any easy workarounds? If possible I don't want to build a microservice to send a response to Slack in time and later send the actual response.
The Apps Script servers don't keep every script ever written or deployed loaded in memory, and so scripts that haven't been run in a while need to be loaded from disk first. This is usually referred to as a "cold start time" in Cloud providers.
Answered by Eric Koleda on Google Apps Script Community forum
The most glaring issue is your use of ES6 syntax in your doPost() method.
Google Apps Script does not support ES6 template string syntax and only partially supports destructuring assignments. So that might be your issue. Your doPost() probably fails to return a value as a result so Slack likely repeats the request until it times out.
Try removing the BetterLog library. That may be causing the initial first-time delay.
https://developers.google.com/apps-script/guides/libraries
Warning: A script that uses a library does not run as quickly as it would if all the code were contained within a single script project. Although libraries can make development and maintenance more convenient, you should use them sparingly in projects where speed is critical. Because of this issue, library use should be limited in add-ons.

Spring Data JPA 1.5.2, one to Many self mapping issue

I have a Process table that maintains the logs of the executed processes along with their STATUS.
Now I have another piece of functionality where the user can revert (roll back) a process. For each process there can be at most one successful rollback and many failed rollbacks.
Example:
Main process : Load Data
Revert process 1:failed
Revert process 2: failed
Revert process 3: Success
These revert processes are also kept in the same Process table, so that they carry all the same details, such as who ran them, when, and the name of the revert process.
Table : Process
Class : Process
Class members :
@Column(name="process_name")
private String processName;
@Temporal(TemporalType.DATE)
@Column(name="trn_date")
private Date transactionDate;
@Column(name="user_name")
private String user;
@Temporal(TemporalType.TIMESTAMP)
@Column(name="start_time")
private Date startDateTime;
@Temporal(TemporalType.TIMESTAMP)
@Column(name="end_time")
private Date endDateTime;
@Column(name="process_status")
private String status;
@Column(name="status_detail")
private String detail;
I have created another table called Process_Reverse, which holds the IDs of a main process and its reverse process. Both IDs are present in the Process table, so this table acts as a join table.
Sample Data :
Table : Process
Process Id | Process name | Date | Status | .....
1121 | DATA LOAD | 25-05-2018 | Success | .....
1125 | REV DATA LOAD | 26-05-2018 | Failed | .....
1127 | REV DATA LOAD | 26-05-2018 | Failed | .....
Table : Process_reverse
Process Id | Rev Process Id
1121 | 1125
1121 | 1127
Can you please advise what configuration I should use in the Process Java class to join it to itself via a JoinTable?
I am using dependency
<dependency>
<groupId>org.springframework.data</groupId>
<artifactId>spring-data-jpa</artifactId>
<version>1.5.2.RELEASE</version>
</dependency>
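One common way to express this in JPA is a self-referencing @OneToMany through a @JoinTable. The following is a minimal mapping sketch, not a tested configuration for this schema. The question shows only display headings for the join table, so the column names process_id and rev_process_id are assumptions, as are the id field and the reverseProcesses name.

```java
import java.util.ArrayList;
import java.util.List;
import javax.persistence.Column;
import javax.persistence.Entity;
import javax.persistence.Id;
import javax.persistence.JoinColumn;
import javax.persistence.JoinTable;
import javax.persistence.OneToMany;
import javax.persistence.Table;

@Entity
@Table(name = "Process")
public class Process {

    @Id
    @Column(name = "process_id") // assumed primary key column name
    private Long id;

    // ... existing columns (process_name, trn_date, user_name, ...) ...

    // Main process -> its revert attempts, resolved through Process_reverse.
    @OneToMany
    @JoinTable(
        name = "Process_reverse",
        joinColumns = @JoinColumn(name = "process_id"),            // assumed: ID of the main process
        inverseJoinColumns = @JoinColumn(name = "rev_process_id")) // assumed: ID of the revert process
    private List<Process> reverseProcesses = new ArrayList<>();

    public List<Process> getReverseProcesses() {
        return reverseProcesses;
    }
}
```

With the sample data above, loading process 1121 and calling getReverseProcesses() would be expected to return processes 1125 and 1127, provided the real column names match the assumed ones.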

Powershell - Add (default display) object property values from pipe to string

Update 2: I ended up figuring this out while writing it. I figured I'd post it for anyone else muddling through. See sub-heading 'Resolution - Get only default properties', or the answer. Please feel free to respond with alternate (especially better!) methods or comments.
Update 1: I started out without a way to do this, I've since found a way for ALL properties. I've left the build-up for anyone else confused like I was, but my current problem is that I want JUST the default display properties - see sub-heading 'Get all Properties'
Let's say I have a collection of objects in powershell - the specific example I'm working with is a collection of events acquired using the get-winevent cmdlet.
Does anyone know an elegant way to get the values of all the (default) properties of each object, from the pipeline, and add them to the end of a string? Especially a way that doesn't involve needing to know which properties you want.
For example, using variable $events containing some event log entries, if I simply call $events powershell will make some assumptions about the properties I want and format them as a list:
PS C:\> $events
TimeCreated ProviderName Id Message
----------- ------------ -- -------
11/09/2014 3:59:... Microsoft-Window... 4634 An account was l...
11/09/2014 3:58:... Microsoft-Window... 4634 An account was l...
However, if I try to precede the returned entries with a string, I get the object's type name rather than its property values:
PS C:\> $events | %{"NEW RECORD" + $_}
NEW RECORDSystem.Diagnostics.Eventing.Reader.EventLogRecord
NEW RECORDSystem.Diagnostics.Eventing.Reader.EventLogRecord
PS C:\> $events | %{"NEW RECORD" + $_.properties}
NEW RECORDSystem.Diagnostics.Eventing.Reader.EventProperty System.Diagnostics.E
venting.Reader.EventProperty System.Diagnostics.Eventing.Reader.EventProperty S
ystem.Diagnostics.Eventing.Reader.EventProperty System.Diagnostics.Eventing.Rea
der.EventProperty
The easiest work around I could think of involved using (and therefore knowing) the property values, and also losing the notation that format-table or format-list would provide:
PS C:\> $events | %{"NEW RECORD - TimeCreated: " + $_.TimeCreated + "; ProviderName: "`
+ $_.ProviderName + "; ID: " + $_.ID + "; Message: " + $_.message}
NEW RECORD - TimeCreated: 09/11/2014 15:58:08; ProviderName: Microsoft-Windows-
Security-Auditing; ID: 4672; Message: Special privileges assigned to new logon.
Subject:
Security ID: S-*-*-**-*********-**********-**********-*****
Account Name: **********
Account Domain: **********
Logon ID: 0x**********
Privileges: SeSecurityPrivilege
Get all Properties
So I've discovered I CAN get ALL the properties, and their names, like this:
PS C:\> $events | %{"NEW RECORD" + ($_.psobject.properties | %{$_.name ; ":" ; $_.value})}
NEW RECORDMessage : Special privileges assigned to new logon.
Subject:
Security ID: S-*-*-**-*********-**********-**********-*****
Account Name: **********
Account Domain: **********
Logon ID: 0x**********
Privileges: SeSecurityPrivilege Id : 4672 Version : 0 Qualifiers :
Level : 0 <and so on>
However, I'm now pulling a bunch of stuff the consumers of my data won't need, since I only need the default properties and their names, plus a self-defined delimiter.
Is anyone aware of a notation that will return all values of all default display properties without said properties needing to be spelled out? Either a generic container for values (eg. $_.properties.value, though I tried that and it didn't work), or something like expand-property only without needing to specify a particular property name?
Resolution - Get only default properties
So it turns out I was overthinking this. FOREACH (%{}) can obviously preserve data from the pipeline across statements, so if I use two statements I can achieve the desired effect:
PS C:\> $events | format-list | %{"NEW RECORD" ; $_}
NEW RECORD
Message : An account was successfully logged on.
<and etc>
I answered this while writing it, the details are above. In order to collect all properties from an object and their values, and include both as part of a string:
PS C:\> $events | %{"NEW RECORD" + ($_.psobject.properties | %{$_.name ; ":IMASTRING:" ; $_.value})}
The above method owes a lot to the answer by Shay Levy to this question.
To include only the default properties and their values, preceded by a string:
PS C:\> $events | format-list | %{"NEW RECORD" ; $_}
To include all properties and their values, preceded by a string but retain the default formatting:
PS C:\> $events | select-object * | format-list | %{"NEW RECORD"; $_}
I think you've done things the easy and maybe best way for your situation. There actually is a way to know the names of the default properties
PS Scripts:\> $x = gwmi -Class win32_operatingsystem
PS Scripts:\> $x.psstandardmembers
PSStandardMembers {DefaultDisplayPropertySet}
PS Scripts:\> $x.psstandardmembers.DefaultDisplayPropertySet
ReferencedPropertyNames : {SystemDirectory, Organization, BuildNumber, RegisteredUser...}
MemberType : PropertySet
Value : DefaultDisplayPropertySet {SystemDirectory, Organization, BuildNumber, RegisteredUser,
SerialNumber, Version}
TypeNameOfValue : System.Management.Automation.PSPropertySet
Name : DefaultDisplayPropertySet
IsInstance : False
PS Scripts:\> $x.psstandardmembers.DefaultDisplayPropertySet.ReferencedPropertyNames
SystemDirectory
Organization
BuildNumber
RegisteredUser
SerialNumber
Version
This is the post where I found this information on PSStandardMembers.
