How to troubleshoot the snapshot command in Oracle NoSQL Database

The “snapshot create” command shows the following message when something goes wrong, but there are no details about what happened:
kv -> snapshot create -name TEST
Create data snapshot succeeded but not on all components
Successfully backup configurations on sn1, sn2, sn3
Is there a way to know which components failed?
There is not much information in the documentation about how to do this.

Yes, you can use the -json flag. The JSON output shows more information and lets you see exactly what happened.
Here is an example when everything is OK:
snapshot create -name BACKUP -json
{
  "operation" : "snapshot operation",
  "returnCode" : 5000,
  "description" : "Operation ends successfully",
  "returnValue" : {
    "snapshotName" : "210705-133631-BACKUP",
    "successSnapshots" : [ "admin1", "admin2", "rg1-rn1", "rg1-rn2", "rg1-rn3", "rg2-rn1", "rg2-rn2", "rg2-rn3", "rg3-rn1", "rg3-rn2", "rg3-rn3" ],
    "failureSnapshots" : [ ],
    "successSnapshotConfigs" : [ "sn1", "sn2", "sn3" ],
    "failureSnapshotConfigs" : [ ]
  }
}
Here is an example when there is a failure:
snapshot create -name BACKUP -json
{
  "operation" : "snapshot operation",
  "returnCode" : 5500,
  "description" : "Operation ends successfully",
  "returnValue" : {
    "snapshotName" : "210705-133737-BACKUP",
    "successSnapshots" : [ "admin1", "admin2", "rg1-rn1", "rg1-rn2", "rg1-rn3", "rg2-rn1", "rg2-rn2", "rg2-rn3", "rg3-rn1", "rg3-rn2" ],
    "failureSnapshots" : [ "rg3-rn3" ],
    "successSnapshotConfigs" : [ "sn1", "sn2", "sn3" ],
    "failureSnapshotConfigs" : [ ]
  }
}
See the returnCode field: the fully successful run returns 5000, while the run with a failing component returns 5500, and the failureSnapshots array names the components that failed (rg3-rn3 in this example).
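If you want to act on this output programmatically, here is a minimal Python sketch that flags the failing components. It assumes the JSON output of the command was captured to a file (snapshot.json is just an example name):

import json

# Assumes the output of "snapshot create -name BACKUP -json" was
# saved to snapshot.json; any capture mechanism works the same way.
with open("snapshot.json") as f:
    result = json.load(f)

rv = result["returnValue"]
if result["returnCode"] == 5000:
    print("Snapshot", rv["snapshotName"], "succeeded on all components")
else:
    print("Snapshot", rv["snapshotName"], "had failures:")
    for component in rv["failureSnapshots"]:
        print("  data snapshot failed on", component)
    for sn in rv["failureSnapshotConfigs"]:
        print("  config backup failed on", sn)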

Related

Databricks Autoloader is getting stuck and does not pass to the next batch

I have a simple job scheduled every 5 minutes. Basically, it listens for cloud files on a storage account and writes them into a delta table; extremely simple. The code is something like this:
df = (spark
      .readStream
      .format("cloudFiles")
      .option('cloudFiles.format', 'json')
      .load(input_path, schema=my_schema)
      .select(cols)
      .writeStream
      .format("delta")
      .outputMode("append")
      .option("checkpointLocation", f"{output_path}/_checkpoint")
      .trigger(once=True)
      .start(output_path))
Sometimes there are new files, sometimes not. After 40-60 batches it gets stuck on one particular batchId, as if there were no new files in the folder. If I run the script manually I get the same result: it points to the last actually processed batch.
{
  "id" : "xxx",
  "runId" : "xxx",
  "name" : null,
  "timestamp" : "2022-01-13T15:25:07.512Z",
  "batchId" : 64,
  "numInputRows" : 0,
  "inputRowsPerSecond" : 0.0,
  "processedRowsPerSecond" : 0.0,
  "durationMs" : {
    "latestOffset" : 663,
    "triggerExecution" : 1183
  },
  "stateOperators" : [ ],
  "sources" : [ {
    "description" : "CloudFilesSource[/mnt/source/]",
    "startOffset" : {
      "seqNum" : 385,
      "sourceVersion" : 1,
      "lastBackfillStartTimeMs" : 1641982820801,
      "lastBackfillFinishTimeMs" : 1641982823560
    },
    "endOffset" : {
      "seqNum" : 385,
      "sourceVersion" : 1,
      "lastBackfillStartTimeMs" : 1641982820801,
      "lastBackfillFinishTimeMs" : 1641982823560
    },
    "latestOffset" : null,
    "numInputRows" : 0,
    "inputRowsPerSecond" : 0.0,
    "processedRowsPerSecond" : 0.0,
    "metrics" : {
      "numBytesOutstanding" : "0",
      "numFilesOutstanding" : "0"
    }
  } ],
  "sink" : {
    "description" : "DeltaSink[/mnt/db/table_name]",
    "numOutputRows" : -1
  }
}
But if I run only the readStream part, it correctly reads the entire list of files (and starts a new batchId: 0). The strangest part is: I have absolutely no idea what causes it, or why it takes around 40-60 batches to hit this kind of error. Can anyone help, or give me some suggestions?
I was thinking about using foreachBatch() to append new data, or using .trigger(continuous='5 minutes').
I'm new to Auto Loader.
Thank you so much!
I resolved it by using
.option('cloudFiles.useIncrementalListing', 'false')
My filenames are composed of flow name + timestamp, like this:
flow_name_2022-01-18T14-19-50.018Z.json
So my guess is: some combination of dots makes RocksDB go into a non-existing directory, which is why it reports "found no new files". Once I disabled incremental listing, RocksDB stopped making its mini checkpoints based on filenames and now reads the whole directory. This is the only explanation I have.
If anyone is having the same issue, try changing the filename.
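In the stream definition from the question, the option belongs on the reader. A sketch under the same assumptions as the original snippet (spark, input_path, my_schema, cols and output_path defined elsewhere):

df = (spark
      .readStream
      .format("cloudFiles")
      .option('cloudFiles.format', 'json')
      # Disable incremental (file-name based) listing so Auto Loader
      # lists the whole input directory on every batch.
      .option('cloudFiles.useIncrementalListing', 'false')
      .load(input_path, schema=my_schema)
      .select(cols)
      .writeStream
      .format("delta")
      .outputMode("append")
      .option("checkpointLocation", f"{output_path}/_checkpoint")
      .trigger(once=True)
      .start(output_path))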

Cosmos DB Collection not using _id index when querying by _id?

I have a Cosmos DB (MongoDB API) collection that I'm using purely as a key/value store for arbitrary data, where _id is the key for my collection.
When I run the query below:
globaldb:PRIMARY> db.FieldData.find({_id : new BinData(3, "xIAPpVWVkEaspHxRbLjaRA==")}).explain(true)
I get this result:
{
  "_t" : "ExplainResponse",
  "ok" : 1,
  "queryPlanner" : {
    "plannerVersion" : 1,
    "namespace" : "data.FieldData",
    "indexFilterSet" : false,
    "parsedQuery" : {
      "$and" : [ ]
    },
    "winningPlan" : {
    },
    "rejectedPlans" : [ ]
  },
  "executionStats" : {
    "executionSuccess" : true,
    "nReturned" : 1,
    "executionTimeMillis" : 106,
    "totalKeysExamined" : 0,
    "totalDocsExamined" : 3571,
    "executionStages" : {
    },
    "allPlansExecution" : [ ]
  },
  "serverInfo" : #REMOVED#
}
Notice that totalKeysExamined is 0, totalDocsExamined is 3571, and the query took 106 ms. If I run without .explain() it does find the document.
I would have expected this query to be lightning quick, given that the _id field is automatically indexed as a unique primary key on the collection. As this collection grows, I only expect this problem to get worse.
I'm definitely not understanding something about the index and how it works here. Any help would be most appreciated.
Thanks!
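To reproduce the same check from Python, here is a minimal pymongo sketch (the connection string is a placeholder; the database and collection names are taken from the question, and subtype 3 is the legacy UUID binary subtype that BinData(3, ...) denotes in the shell):

import base64

from bson.binary import Binary
from pymongo import MongoClient

client = MongoClient("<cosmos-connection-string>")  # placeholder
coll = client["data"]["FieldData"]

# BinData(3, "xIAPpVWVkEaspHxRbLjaRA==") in the shell corresponds to
# a Binary value with subtype 3 (legacy UUID) in pymongo.
key = Binary(base64.b64decode("xIAPpVWVkEaspHxRbLjaRA=="), 3)

plan = coll.find({"_id": key}).explain()
stats = plan["executionStats"]
print("keys examined:", stats["totalKeysExamined"])
print("docs examined:", stats["totalDocsExamined"])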

Google Places API - Can I separate out the output?

Hitting the endpoint:
https://maps.googleapis.com/maps/api/place/findplacefromtext/json?input=Hoboken%20NJ&fields=formatted_address,name&inputtype=textquery&key=xxxxxxxxxxxxxxxxxxx
Getting the result:
{
  "candidates" : [
    {
      "formatted_address" : "New Jersey, USA",
      "name" : "Hoboken"
    }
  ],
  "debug_log" : {
    "line" : []
  },
  "status" : "OK"
}
What bugs me is that I can't find a way to separate out the region and country. Yes, I know I can parse the result myself, but is there an option I can send to the Places API so that the response separates out city/state (or region)/country in the returned JSON?
Something like:
{
  "candidates" : [
    {
      "state" : "New Jersey",
      "country" : "USA",
      "name" : "Hoboken"
    }
  ],
  "debug_log" : {
    "line" : []
  },
  "status" : "OK"
}
As far as I know, it isn't possible; you'll have to parse it. The Places API is designed to search for businesses and POIs in the first place.
Google does have, however, a Geocoding API, which returns postal code, country, state, and address separately.
There are also some free alternatives.
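To illustrate the Geocoding API route, a minimal Python sketch using the requests library (the endpoint and response shape are from Google's Geocoding documentation; the API key is a placeholder) that pulls the locality, state, and country out of address_components:

import requests

resp = requests.get(
    "https://maps.googleapis.com/maps/api/geocode/json",
    params={"address": "Hoboken NJ", "key": "YOUR_API_KEY"},
)
result = resp.json()["results"][0]

# Each address component is tagged with types such as "locality",
# "administrative_area_level_1" (state/region) and "country".
parts = {}
for component in result["address_components"]:
    for t in component["types"]:
        parts[t] = component["long_name"]

print(parts.get("locality"))                     # e.g. "Hoboken"
print(parts.get("administrative_area_level_1"))  # e.g. "New Jersey"
print(parts.get("country"))                      # e.g. "United States"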

Why is the Distance Matrix API returning the status "ZERO_RESULTS" even though the coordinates are correctly specified?

{
  "destination_addresses" : [ "14.868924,79.873609" ],
  "origin_addresses" : [ "14.843799,79.862726" ],
  "rows" : [
    {
      "elements" : [
        {
          "status" : "ZERO_RESULTS"
        }
      ]
    }
  ],
  "status" : "OK"
}
The URL for the above response is given below; just change the API key to verify it:
https://maps.googleapis.com/maps/api/distancematrix/json?units=metric&origins=14.843799,79.862726&destinations=14.868924,79.873609&mode=walking&key=XXXX
Although the Bing distance calculation API successfully returns a distance for these same coordinates.
I reported this issue to the Google issue tracker, and they fixed it within approximately one month:
https://issuetracker.google.com/u/1/issues/38478121
Thanks
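Note that the top-level status only says the request itself was accepted; each origin/destination pair carries its own element-level status, which is where ZERO_RESULTS shows up. A minimal Python sketch of that check with the requests library (the API key is a placeholder):

import requests

resp = requests.get(
    "https://maps.googleapis.com/maps/api/distancematrix/json",
    params={
        "units": "metric",
        "origins": "14.843799,79.862726",
        "destinations": "14.868924,79.873609",
        "mode": "walking",
        "key": "YOUR_API_KEY",
    },
)
body = resp.json()

# The top-level status covers the request; each element has its own.
print("request status:", body["status"])
element = body["rows"][0]["elements"][0]
print("element status:", element["status"])
if element["status"] == "OK":
    print(element["distance"]["text"], element["duration"]["text"])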

Mongo says coordinates out of bounds

I have a Spring app that's connected to a MongoDB database, and I am using the following code to get documents from a certain collection that are within a radius of a point on Earth:
"query" : {
"location" : {
"$geoWithin" : {
"$centerSphere" : [
[
37.33240731,
-122.03046898
],
0.0018924144710663706
]
}
}
}
Usually this works (for locations in England); however, for some reason these coordinates (in America) give this error:
'{ "ok" : 0.0, "errmsg" : "longitude/latitude is out of bounds, lng: 37.3323
lat: -122.031", "code" : 2 }
What could be causing this?
I faced the same problem while using Spring Data MongoDB; the very silly mistake was swapping the lat and long values. MongoDB expects [longitude, latitude] order:
Query query = new BasicQuery(
    "{ location: { $geoWithin: { $centerSphere: [ [ -122.031, 37.3323 ], "
    + rangeInMiles / 3963.2
    + " ] } }, visible : true, status : 'ACTIVE' }"
);
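For reference, the same query from Python with pymongo (the connection string and database/collection names are placeholders), showing the [longitude, latitude] order and the miles-to-radians conversion:

from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")  # placeholder
coll = client["mydb"]["places"]                    # placeholder names

range_in_miles = 7.5  # example radius
radius_radians = range_in_miles / 3963.2  # Earth's radius in miles

# $centerSphere takes [ [longitude, latitude], radius-in-radians ]
docs = coll.find({
    "location": {
        "$geoWithin": {
            "$centerSphere": [[-122.031, 37.3323], radius_radians]
        }
    },
    "visible": True,
    "status": "ACTIVE",
})
for doc in docs:
    print(doc["_id"])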
