I have a APP that monitors some external jobs (among other things). This does this monitoring every 5 Mins
I'm trying to create a prometheus gauge to get the count of currently running jobs.
Here is how I declared my gauge
JobStats= promauto.NewGaugeVec(
prometheus.GaugeOpts{
Namespace: "myapi",
Subsystem: "app",
Name: "job_count",
Help: "Current running jobs in the system",
ConstLabels: nil,
},
[]string{"l1", "l2", "l3"},
)
in the code that actually counts the jobs I do
metrics.JobStats.WithLabelValues(l1,l2,l3).add(float64(jobs_cnt))
when I query the /metrics endpoint I get the number
The thing is, this metrics only keeps increasing. If I restart the app this get resets to zer & again keeps increasing
I'm using grafana to graph this in a dashboard.
My question is
Get the graph to show the actual number of jobs (instead of ever increasing line)?
Should this be handled in code (like setting this to zero before every collection?) or in grafana?
I have some Azure Storage Accounts (StorageV2) located in West Europe. All blobs uploaded are by default in the Hot tier and I have this lifecycle rule defined on them:
{
"rules": [
{
"enabled": true,
"name": "moveToCool",
"type": "Lifecycle",
"definition": {
"actions": {
"baseBlob": {
"enableAutoTierToHotFromCool": true,
"tierToCool": {
"daysAfterLastAccessTimeGreaterThan": 1
}
}
},
"filters": {
"blobTypes": [
"blockBlob"
]
}
}
}
]
}
Somehow the uploaded blobs are moved to cool but then after I access them again, in the portal they still appear under Cool tier. Any idea why? (I have waited more than 24 for the rule to be in effect)
Some more questions about: "enableAutoTierToHotFromCool": true:
does it depend on the blob size? (for example if some blobs were moved to cool and then they accessed simultaneously the time between a 1 Gib is moved back to hot is the same for 10KiB blob)
does it depend on the number of blobs that are accessed? (it there a queue and if multiple blobs from cool are accessed in the same time, the requests are served based on a queue order)
The enableAutoTierToHotFromCool property is a Boolean value that indicates whether a blob should automatically be tiered from cool back to hot if it is accessed again after being tiered to cool.
And to apply new policy it takes 48hrs, and enableAutoTierToHotFromCool": true doesn’t depend on size of blob , and not depends on the number of blobs
If you enable firewall rules for your storage account, lifecycle management requests may be blocked. You can unblock these requests by providing exceptions for trusted Microsoft services. For more information, refer this document the Exceptions section in Configure firewalls and virtual networks.
A lifecycle management policy must be read or written in full. Partial updates are not supported. So try with writing
"prefixMatch": [
"containerName/log"
]
For more details refer this document:
If I have developed a NiFi flow and a support person wants to view what's the current state and which processor is currently running, which processor already ran, which ones completed?
I mean to say any dashboard kind of utility provided by NiFi to monitor activities ?
You can use the Reporting tasks and NiFi itself, or a new NiFi instance, that is what I choose.
To do that you must do the following:
Open the reporting task menu
And add the desired reporting tasks
And configure it properly
Then create a flow to manage the reporting data
In my case I am putting the information into an Elasticsearch
There are numerous ways to monitor NiFi flows and status. The status bar along the top of the UI shows running/stopped/invalid processor counts, and cluster status, thread count, etc. The global menu at the top right has options for monitoring JVM usage, flowfiles processed/in/out, CPU, etc.
Each individual processor will show a status icon for running/stopped/invalid/disabled, and can be right-clicked for the same JVM usage, flowfile status, etc. graphs as the global view, but for the individual processor. There are also some Reporting Tasks provided by default to integrate with external monitoring systems, and custom reporting tasks can be written for any other desired visualization or monitoring dashboard.
NiFi doesn’t have the concept of batch/job processing, so processors aren’t “complete”.
1. In built monitoring in Apache NiFi.
Bulletin Board
The bulletin board shows the latest ERROR and WARNING getting generated by NiFi processors in real time. To access the bulletin board, a user will have to go the right hand drop down menu and select the Bulletin Board option. It refreshes automatically and a user can disable it also. A user can also navigate to the actual processor by double-clicking the error. A user can also filter the bulletins by working out with the following −
by message
by name
by id
by group id
Data provenance UI
To monitor the Events occurring on any specific processor or throughout NiFi, a user can access the Data provenance from the same menu as the bulletin board. A user can also filter the events in data provenance repository by working out with the following fields −
by component name
by component type
by type
NiFi Summary UI
Apache NiFi summary also can be accessed from the same menu as the bulletin board. This UI contains information about all the components of that particular NiFi instance or cluster. They can be filtered by name, by type or by URI. There are different tabs for different component types. Following are the components, which can be monitored in the NiFi summary UI −
Processors
Input ports
Output ports
Remote process groups
Connections
Process groups
In this UI, there is a link at the bottom right hand side named system diagnostics to check the JVM statistics.
2. Reporting Tasks
Apache NiFi provides multiple reporting tasks to support external monitoring systems like Ambari, Grafana, etc. A developer can create a custom reporting task or can configure the inbuilt ones to send the metrics of NiFi to the externals monitoring systems. The following table lists down the reporting tasks offered by NiFi 1.7.1.
Reporting Task:
AmbariReportingTask - To setup Ambari Metrics Service for NiFi.
ControllerStatusReportingTask - To report the information from the NiFi summary UI for the last 5 minute.
MonitorDiskUsage - To report and warn about the disk usage of a specific directory.
MonitorMemory To monitor the amount of Java Heap used in a Java Memory pool of JVM.
SiteToSiteBulletinReportingTask To report the errors and warning in bulletins using Site to Site protocol.
SiteToSiteProvenanceReportingTask To report the NiFi Data Provenance events using Site to Site protocol.
3. NiFi API
There is an API named system diagnostics, which can be used to monitor the NiFI stats in any custom developed application.
Request
http://localhost:8080/nifi-api/system-diagnostics
Response
{
"systemDiagnostics": {
"aggregateSnapshot": {
"totalNonHeap": "183.89 MB",
"totalNonHeapBytes": 192819200,
"usedNonHeap": "173.47 MB",
"usedNonHeapBytes": 181894560,
"freeNonHeap": "10.42 MB",
"freeNonHeapBytes": 10924640,
"maxNonHeap": "-1 bytes",
"maxNonHeapBytes": -1,
"totalHeap": "512 MB",
"totalHeapBytes": 536870912,
"usedHeap": "273.37 MB",
"usedHeapBytes": 286652264,
"freeHeap": "238.63 MB",
"freeHeapBytes": 250218648,
"maxHeap": "512 MB",
"maxHeapBytes": 536870912,
"heapUtilization": "53.0%",
"availableProcessors": 4,
"processorLoadAverage": -1,
"totalThreads": 71,
"daemonThreads": 31,
"uptime": "17:30:35.277",
"flowFileRepositoryStorageUsage": {
"freeSpace": "286.93 GB",
"totalSpace": "464.78 GB",
"usedSpace": "177.85 GB",
"freeSpaceBytes": 308090789888,
"totalSpaceBytes": 499057160192,
"usedSpaceBytes": 190966370304,
"utilization": "38.0%"
},
"contentRepositoryStorageUsage": [
{
"identifier": "default",
"freeSpace": "286.93 GB",
"totalSpace": "464.78 GB",
"usedSpace": "177.85 GB",
"freeSpaceBytes": 308090789888,
"totalSpaceBytes": 499057160192,
"usedSpaceBytes": 190966370304,
"utilization": "38.0%"
}
],
"provenanceRepositoryStorageUsage": [
{
"identifier": "default",
"freeSpace": "286.93 GB",
"totalSpace": "464.78 GB",
"usedSpace": "177.85 GB",
"freeSpaceBytes": 308090789888,
"totalSpaceBytes": 499057160192,
"usedSpaceBytes": 190966370304,
"utilization": "38.0%"
}
],
"garbageCollection": [
{
"name": "G1 Young Generation",
"collectionCount": 344,
"collectionTime": "00:00:06.239",
"collectionMillis": 6239
},
{
"name": "G1 Old Generation",
"collectionCount": 0,
"collectionTime": "00:00:00.000",
"collectionMillis": 0
}
],
"statsLastRefreshed": "09:30:20 SGT",
"versionInfo": {
"niFiVersion": "1.7.1",
"javaVendor": "Oracle Corporation",
"javaVersion": "1.8.0_151",
"osName": "Windows 7",
"osVersion": "6.1",
"osArchitecture": "amd64",
"buildTag": "nifi-1.7.1-RC1",
"buildTimestamp": "07/12/2018 12:54:43 SGT"
}
}
}
}
You also can use nifi-api for monitoring. You can receive detailed information about each processor group, controller service or processor.
You can use MonitoFi. It is an open source tool that is highly configurable, uses nifi-api to collect stats about various nifi processors and stores those in influxdb. It also comes with Grafana Dashboards and Alerting functionality.
http://www.monitofi.com
or
https://github.com/microsoft/MonitoFi
I have a RabbitMQ server setup with thousands of queues. Of which only about 5 of these are persistent queues. Every now and then there is a back up of a queue that will have about 5-10 messages in a ready state. These messages do not appear to be in the persistent queues. I want to find out which queues had the messages in a ready state, but the only indication that it is happening is on the overview page of the web management console which is for all queues.
Is there a way to query Rabbit to tell me the stat info for messages that were in a ready state for a period of minutes and which queue they were in?
I would use the HTTP API.
http://rabbit-broker:15672/api/queues
This will give you a list of the current queue states in JSON so you'll have to keep polling it. Store the "messages_ready" for given queue "name" for the period you want to monitor. Now you'll be able to see which queues have that backlog spike.
You can use simple curl as well as whichever platform you prefer with an HTTP client.
Please note: the user you'll connect will have to have monitor tag to access all the queue information.
Out of the box there is no easy way AFAIK, you'd have to manually click through the queues and look at their graphs in the UI for the last hour, which is tedious.
I had similar requirements and I found a better way than polling. The docs say that you may get raw samples via api if you use special parameters in the request.
For example in your case, if you are interested in messages with ready state, you may ask your queue for a history of queue lengths, for example last 60 seconds with samples every 1 second (note 15672 is the default port used by rabbitmq_management):
http://rabbitHost:15672/api/queues/vhost/queue?lengths_age=60&lengths_incr=1
For default vhost=/ it will be:
http://rabbitHost:15672/api/queues/%2F/queue?lengths_age=60&lengths_incr=1
Then in the result json there will be some additional _details objects like this:
"messages_ready_details": {
"avg": 8.524590163934427,
"avg_rate": 0.08333333333333333,
"samples": [{
"timestamp": 1532699694000,
"sample": 5
}, {
"timestamp": 1532699693000,
"sample": 11
},
<... more samples ...>
],
"rate": -6.0
},
"messages_ready": 5,
Then on this raw data you may do any stats you need.
Other raw data samples appear if you use differen parameters in
What sampling will appear? What parameters are required for it to appear?
Messages sent and received msg_rates_age / msg_rates_incr
Bytes sent and received data_rates_age / data_rates_incr
Queue lengths lengths_age / lengths_incr
Node statistics (e.g. file descriptors, disk space free) node_stats_age / node_stats_incr
I am using translate API to translate some texts in my page, those texts are large html formated texts, so I had to develop a function that splits these texts into smaller pieces less than 4500 characters (including html tags) to avoid the limit of 5000 characters per request, also I had to modify the Google PHP API to allow send requests via POST.
I have enabled the paid version of the api in Goole Developers Console, and changed the total quota to 50M of characters per day and 500 requests/second/urser.
Now I am translating the whole database of texts with a script, it works fine but at some random points I revive the error "(403) User Rate Limit Exceeded", and I have to wait some minutes to re-run the script because when reached the error the api is returning the same error over and over until I wait some time.
I don't know why it keeps returning the error if I don't pass the number of requests, it's like it has some kind of maximum chaaracters per each interval of time or something...
You probably exceed the quota limits you set before: this is either the daily billable or the limit on the request characters per second.
To change the usage limits or request an increase to your quota, do the following:
1. Go to the Google Developers Console "https://console.developers.google.com/".
2. Select a project.
3. On the left sidebar, expand APIs & auth.
4. Click APIs.
5. Click the name of an activated API you're interested in "i.e. The Translate API".
6. Near the top of the info page for the API, click Quota.
If you have the billing enabled, just click Quota and it will take you to the quota page where you can view and change the quota-related settings.
If not, clicking Quota shows information about any free quota and limits that apply to the Translate API.
Google Developer Console has a rate limit of 10 requests per second, regardless of the settings or limits you may have changed.
You may be exceeding this limit.
I was unable to find any documentation around this, but could verify it myself with various API requests.
You control the characters limitation but not the concurrency
You are either making more than 500 concurrent request/second or you are using another Google API that is hitting such concurrency limitation.
The referer header is not set by default, but it is possible to add the headers to a request like so:
$result = $t->translate('Hola Mundo', [
'restOptions' => [
'headers' => [
'referer' => 'https://your-uri.com'
]
]
]);
If it makes more sense for you to set the referer at the client level (so all requests flowing through the client receive the header), this is possible as well:
$client = new TranslateClient([
'key' => 'my-api-key',
'restOptions' => [
'headers' => [
'referer' => 'https://your-uri.com'
]
]
]);
This worked for me!
Reference
In my case, this error was caused by my invalid payment information. Go to Billing area and make sure everything is ok.