Slack's files.list API gives warning max_page_limit

I am using the API below to list 200 files per page:
https://slack.com/api/files.list?count=200&page={{pageNumber}}
I have 60000 files in my Slack account, so the first API call returned 200 files with a paging response like this:
"paging": {
"count": 200,
"total": 60000,
"page": 1,
"pages": 300
}
We continue fetching files, increasing the page number in the API query parameter: 2, 3, 4, ...
https://slack.com/api/files.list?count=200&page=2
"paging": {
"count": 200,
"total": 60000,
"page": 2,
"pages": 300
}
When we reached page number 101, the page field in the paging response became 1 along with the warning max_page_limit. Can't we list all files in the same pagination fashion, or does the Slack files.list API only allow listing files up to page 100? We didn't find anything about this in the Slack documentation. Any help regarding this issue will be much appreciated.
https://slack.com/api/files.list?count=200&page=101
"paging": {
"count": 200,
"total": 60000,
"page": 1,
"pages": 300,
"warnings": [
"max_page_limit"
]
}

Here is the reply I got from the Slack forum:
There is indeed a page limit of 100 pages on files.list. I've contacted the documentation team to add this detail to the documentation for the method. You should be able to get your 60000 files with a higher count of 600, though.
There are other ways to filter down the expected number of results. For example, you could specify a time period for the file creation date using the ts_from and ts_to arguments and do batches of calls within specified time periods, or batch your searches by channel by passing the channel argument. These techniques should always allow you to keep a batch within 100,000 files, since 1000 is the maximum accepted count.
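To illustrate the batching suggestion from that reply, here is a minimal Python sketch that walks files.list in creation-time windows using ts_from and ts_to, so no single window needs more than 100 pages. The count, page, ts_from and ts_to parameters come from the question and the reply; the token, the window length and the use of the requests library are assumptions.

import time
import requests

TOKEN = "xoxp-your-token"     # assumption: a token with files:read access
WINDOW = 7 * 24 * 3600        # assumption: one-week windows; shrink if a window exceeds 100 pages
COUNT = 600                   # 600 files * 100 pages = 60000 files per window at most

def fetch_window(ts_from, ts_to):
    """Page through files.list for one creation-time window."""
    page = 1
    while True:
        resp = requests.get(
            "https://slack.com/api/files.list",
            headers={"Authorization": "Bearer " + TOKEN},
            params={"count": COUNT, "page": page,
                    "ts_from": ts_from, "ts_to": ts_to},
        ).json()
        if not resp.get("ok"):
            raise RuntimeError(resp.get("error"))
        yield from resp["files"]
        if page >= min(resp["paging"]["pages"], 100):   # stay under the 100-page cap
            break
        page += 1

def fetch_all(start_ts, end_ts):
    """Split [start_ts, end_ts) into windows and fetch each one."""
    ts_from = start_ts
    while ts_from < end_ts:
        ts_to = min(ts_from + WINDOW, end_ts)
        yield from fetch_window(ts_from, ts_to)
        ts_from = ts_to

# usage: everything created in the last year
now = int(time.time())
for f in fetch_all(now - 365 * 24 * 3600, now):
    print(f["id"], f.get("name"))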

Related

ElasticSearch BulkShardRequest failed due to org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor

I am storing logs in Elasticsearch from my reactive Spring application. I am getting the following error from Elasticsearch:
Elasticsearch exception [type=es_rejected_execution_exception, reason=rejected execution of processing of [129010665][indices:data/write/bulk[s][p]]: request: BulkShardRequest [[logs-dev-2020.11.05][1]] containing [index {[logs-dev-2020.11.05][_doc][0d1478f0-6367-4228-9553-7d16d2993bc2], source[n/a, actual length: [4.1kb], max length: 2kb]}] and a refresh, target allocation id: WwkZtUbPSAapC3C-Jg2z2g, primary term: 1 on EsThreadPoolExecutor[name = 10-110-23-125-common-elasticsearch-apps-dev-v1/write, queue capacity = 200, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@6599247a[Running, pool size = 2, active threads = 2, queued tasks = 221, completed tasks = 689547]]]
My index settings:
{
"logs-dev-2020.11.05": {
"settings": {
"index": {
"highlight": {
"max_analyzed_offset": "5000000"
},
"number_of_shards": "3",
"provided_name": "logs-dev-2020.11.05",
"creation_date": "1604558592095",
"number_of_replicas": "2",
"uuid": "wjIOSfZOSLyBFTt1cT-whQ",
"version": {
"created": "7020199"
}
}
}
}
}
I have gone through this article:
https://www.elastic.co/blog/why-am-i-seeing-bulk-rejections-in-my-elasticsearch-cluster
I thought adjusting the "write" queue size in the thread pool would resolve this, but the article says that is not recommended:
Adjusting the queue sizes is therefore strongly discouraged, as it is like putting a temporary band-aid on the problem rather than actually fixing the underlying issue.
So what else can we do to improve the situation?
Other info:
Elasticsearch version 7.2.1
Cluster health is good and there are 3 nodes in the cluster
Indices are created on a daily basis, with 3 shards per index
While you are right that increasing the thread_pool size is not a permanent solution, you will be glad to know that Elasticsearch itself increased the queue size of the write thread_pool (used by your bulk requests) from 200 to 10k in just a minor version upgrade: see the size of 200 in ES 7.8 versus 10k in ES 7.9.
If you are using an ES 7.x version, you can also increase the queue size to, if not 10k, then at least 1k (to avoid rejecting requests); a config sketch follows at the end of this answer.
If you want a proper fix, you need to do the following:
Find out whether the rejections are consistent or just a short-duration burst of write requests that clears up after some time.
If they are consistent, figure out whether all the write optimizations are in place; please refer to my short tips to improve indexing speed.
Check whether you have reached the full capacity of your data nodes, and if so, scale your cluster to handle the increased/legitimate load.
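For reference, the queue-size override is a static node setting; a minimal sketch of what it could look like in elasticsearch.yml on each node, assuming ES 7.x where the setting is named thread_pool.write.queue_size (a restart is required, and the value 1000 is just the answer's "at least 1k" suggestion):

# elasticsearch.yml (per node): raise the write thread pool queue from the 7.x default of 200
thread_pool.write.queue_size: 1000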

Connect 3+ OpenDaylight controllers to mininet topology

I have created a cluster according to this guide:
https://docs.opendaylight.org/en/stable-magnesium/getting-started-guide/clustering.html
I would like to verify that it is working; can someone help me with how to do that?
Also, is it possible to connect this cluster, or those 3 controllers, to one mininet topology, or can't that be done?
EDIT
I would also like to ask why not all bundles are active. Is that going to cause a problem?
I'm not sure if you can specify multiple controllers on the mininet command line, but it's worth a try. Otherwise you can try, as this person explains in this post, setting up the controllers in a mininet .py config file; a rough sketch of that approach follows.
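As an illustration of the .py approach, here is a minimal Mininet script that attaches a single-switch topology to three remote controllers. The controller IP addresses, the OpenFlow port 6633 and the single-switch topology are assumptions for the example, not taken from the question.

#!/usr/bin/env python
"""Minimal sketch: one mininet topology, three remote (ODL) controllers."""
from mininet.net import Mininet
from mininet.node import RemoteController
from mininet.topo import SingleSwitchTopo
from mininet.cli import CLI

def run():
    # Assumed addresses of the three ODL cluster members
    controller_ips = ['10.0.0.1', '10.0.0.2', '10.0.0.3']

    net = Mininet(topo=SingleSwitchTopo(k=3), controller=None)
    for i, ip in enumerate(controller_ips):
        # each cluster member is added as a RemoteController on the OpenFlow port
        net.addController('c%d' % i, controller=RemoteController, ip=ip, port=6633)
    net.start()
    CLI(net)
    net.stop()

if __name__ == '__main__':
    run()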
To verify the cluster is working, there are many ways, but you can try some
rest calls to check the status of things. We have some examples in upstream
CSIT tests. If you install the feature odl-jolokia, you can send a GET to:
jolokia/read/org.opendaylight.controller:Category=Shards,name=member-1-shard-default-config,type=DistributedConfigDatastore
that is checking the default shard status for the config datastore. You'll get
some output like this:
content={
"request": {
"mbean": "org.opendaylight.controller:Category=Shards,name=member-1-shard-default-config,type=DistributedConfigDatastore",
"type": "read"
},
"status": 200,
"timestamp": 1588524930,
"value": {
"AbortTransactionsCount": 0,
"CommitIndex": 70,
"CommittedTransactionsCount": 0,
"CurrentTerm": 7,
"FailedReadTransactionsCount": 0,
"FailedTransactionsCount": 0,
"FollowerInfo": [],
"FollowerInitialSyncStatus": true,
"InMemoryJournalDataSize": 33,
"InMemoryJournalLogSize": 1,
"LastApplied": 70,
"LastCommittedTransactionTime": "1970-01-01 00:00:00.000",
"LastIndex": 70,
"LastLeadershipChangeTime": "2020-05-03 16:54:45.034",
"LastLogIndex": 70,
"LastLogTerm": 7,
"LastTerm": 7,
"Leader": "member-2-shard-default-config",
"LeadershipChangeCount": 1,
"PeerAddresses": "member-3-shard-default-config: akka.tcp://opendaylight-cluster-data#10.30.170.119:2550/user/shardmanager-config/member-3-shard-default-config, member-2-shard-default-config: akka.tcp://opendaylight-cluster-data#10.30.170.113:2550/user/shardmanager-config/member-2-shard-default-config",
"PeerVotingStates": "member-3-shard-default-config: true, member-2-shard-default-config: true",
"PendingTxCommitQueueSize": 0,
"RaftState": "Follower",
"ReadOnlyTransactionCount": 0,
"ReadWriteTransactionCount": 0,
"ReplicatedToAllIndex": 69,
"ShardName": "member-1-shard-default-config",
"SnapshotCaptureInitiated": false,
"SnapshotIndex": 69,
"SnapshotTerm": 7,
"StatRetrievalError": null,
"StatRetrievalTime": "557.3 \u03bcs",
"TxCohortCacheSize": 0,
"VotedFor": "member-2-shard-default-config",
"Voting": true
}
}
Lots of info there, but the RaftState says Follower, so you know this node is one of the two followers; one node will be the leader.
Another thing we check is the sync status, to make sure it's "true". Use this URI:
jolokia/read/org.opendaylight.controller:Category=ShardManager,name=shard-manager-operational,type=DistributedOperationalDatastore
example output
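If you prefer to script those checks, here is a small Python sketch that reads both jolokia MBeans above and prints the shard's RaftState and the shard manager's SyncStatus. The host, the port 8181 and the admin/admin credentials are assumptions (typical defaults for a stock ODL install); the MBean names are the ones from this answer.

import requests

BASE = "http://127.0.0.1:8181/jolokia/read/"   # assumption: default ODL jolokia port/path
AUTH = ("admin", "admin")                      # assumption: default karaf credentials

SHARD = ("org.opendaylight.controller:Category=Shards,"
         "name=member-1-shard-default-config,type=DistributedConfigDatastore")
SHARD_MANAGER = ("org.opendaylight.controller:Category=ShardManager,"
                 "name=shard-manager-operational,type=DistributedOperationalDatastore")

def read(mbean):
    resp = requests.get(BASE + mbean, auth=AUTH)
    resp.raise_for_status()
    return resp.json()["value"]

shard = read(SHARD)
print("RaftState:", shard["RaftState"], "- Leader:", shard["Leader"])

manager = read(SHARD_MANAGER)
print("SyncStatus:", manager.get("SyncStatus"))  # expect true on a healthy member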

increase pageSize of Generic Test Data results in sonarqube 6.2

I have imported test results according to https://docs.sonarqube.org/display/SONAR/Generic+Test+Data into SonarQube 6.2.
I can look at the detailed test results in SonarQube by navigating to the test file and then clicking the menu "Show Measures". The opened page shows me the correct total number of tests, 293, of which 31 failed. The test result details section, however, only shows 100 test results.
This page seems to get its data through a request like http://localhost:9000/api/tests/list?testFileId=AVpC5Jod-2ky3xCh908m, with a result of:
{
paging: {
pageIndex: 1,
pageSize: 100,
total: 293
},
tests: [
{
id: "AVpDK1X_-2ky3xCh91QQ",
name: "GuiButton:Type Checks->disabledBackgroundColor",
fileId: "AVpC5Jod-2ky3xCh908m",
fileKey: "org.sonarqube:Scripting-Tests-Publishing:dummytests/ScriptingEngine.Objects.GuiButtonTest.js",
fileName: "dummytests/ScriptingEngine.Objects.GuiButtonTest.js",
status: "OK",
durationInMs: 8
...
}
From this I gather that the page size is set to 100 in the backend. Is there a way to increase it so that I can see all test results?
You can certainly call the web service with a larger page size parameter value, but you cannot change the page size requested by the UI.
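For example, here is a small Python sketch that asks the web service for all 293 results in one page. Only testFileId appears in the question; the ps (page size) and p (page) parameters are the usual SonarQube web-API paging parameters and, like the credentials, are assumptions here.

import requests

SONAR = "http://localhost:9000"
FILE_ID = "AVpC5Jod-2ky3xCh908m"   # testFileId from the question
AUTH = ("admin", "admin")          # assumption: adjust to your credentials/token

resp = requests.get(
    SONAR + "/api/tests/list",
    params={"testFileId": FILE_ID, "ps": 500, "p": 1},  # ps = page size (assumed to allow up to 500)
    auth=AUTH,
)
resp.raise_for_status()
data = resp.json()

print(data["paging"]["total"], "tests in total")
for test in data["tests"]:
    print(test["status"], test["name"])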

Golang: healthd and healthtop of the library "gocraft/health"

I'm using gocraft/health to check the health of my service and to get metrics for each endpoint.
I'm using the JSON polling sink to get the metrics:
sink := health.NewJsonPollingSink(time.Minute*5, time.Minute*5)
stream.AddSink(sink)
I want to use healthd and healthtop; here (Link) they explain how.
I set the environment variables as they said: export HEALTHD_MONITORED_HOSTPORTS=:5001 HEALTHD_SERVER_HOSTPORT=:5002 healthd
After that they say "Now you can run it", but they don't give any command to do it, and I didn't really understand what they mean.
I navigated to src/github.com/gocraft/health/cmd/healthd and found main.go; when I run it I get this in the console:
[openrtb#sd-69536 healthd]$ go run main.go
[2015-06-17T23:04:20.871743758Z]: job:general event:starting kvs:[health_host_port::5002 monitored_host_ports::5001,:5002 server_host_port::5002]
[2015-06-17T23:04:20.87810814Z]: job:poll status:success time:4 ms kvs:[host_port::5002]
[2015-06-17T23:04:20.881896459Z]: job:poll status:success time:8 ms kvs:[host_port::5001]
[2015-06-17T23:04:20.882338024Z]: job:recalculate status:success time:231 μs
[2015-06-17T23:04:23.275370787Z]: job:recalculate status:success time:6 μs
[2015-06-17T23:04:30.875230839Z]: job:poll status:success time:1573 μs kvs:[host_port::5002]
[2015-06-17T23:04:30.881415193Z]: job:poll status:success time:7 ms kvs:[host_port::5001]
.
.
but there are no results on those endpoints:
localhost:5002/jobs: Lists top jobs
localhost:5002/hosts: Lists all monitored hosts and their statuses
They give me {"error": "not_found"}, except for localhost:5002/health, where I get this JSON response:
{
"instance_id": "sd-69536.1291",
"interval_duration": 3600000000000,
"aggregations": [
{
"interval_start": "2015-06-18T01:00:00+02:00",
"serial_number": 48,
"jobs": {
"general": {
"timers": {},
"events": {
"starting": 1
},
"event_errs": {},
"count": 0,
"nanos_sum": 0,
"nanos_sum_squares": 0,
"nanos_min": 0,
"nanos_max": 0,
"count_success": 0,
"count_validation_error": 0,
"count_panic": 0,
"count_error": 0,
"count_junk": 0
},
"poll": {
"timers": {},
"events": {},
"event_errs": {},
"count": 24,
"nanos_sum": 107049159,
"nanos_sum_squares": 6.06770682813009e+14,
"nanos_min": 1581783,
"nanos_max": 8259442,
"count_success": 24,
"count_validation_error": 0,
"count_panic": 0,
"count_error": 0,
"count_junk": 0
},
"recalculate": {
"timers": {},
"events": {},
"event_errs": {},
"count": 23,
"nanos_sum": 3501601,
"nanos_sum_squares": 6.75958305123e+11,
"nanos_min": 70639,
"nanos_max": 290877,
"count_success": 23,
"count_validation_error": 0,
"count_panic": 0,
"count_error": 0,
"count_junk": 0
}
},
"timers": {},
"events": {
"starting": 1
},
"event_errs": {}
}
]
}
but I have no idea what this result means, because it has no relation to my localhost:5001/health endpoint, which should normally be aggregated as they said.
What you downloaded is a binary, so you can just invoke it with healthd if you're in the correct directory; they actually provide this example:
HEALTHD_MONITORED_HOSTPORTS=:5020 HEALTHD_SERVER_HOSTPORT=:5032 healthd
Which isn't setting env vars so much as invoking healthd with those two values (export or something similar would be required to persist the change beyond the one command). healthtop more clearly states what it is, but as you can see from their paths, they're both commands under gocraft/health/cmd/. They have several examples of using healthtop from bash; they're not as explicit about healthd, but it's the same.
If you ran that command (as you show in your question), then you may want to try healthtop jobs or something to that effect. I don't know a ton about this project and don't care to research it, but from what I can tell healthd is just a service that collects results from various /health endpoints and makes them available in one API. It seems like they intend for you to use healthtop on top of it to view reports.
Also note this:
Great! To get a sense of the type of data healthd serves, you can manually navigate to:
/jobs: Lists top jobs
/aggregations: Provides a time series of aggregations
/aggregations/overall: Squishes all time series aggregations into one aggregation.
/hosts: Lists all monitored hosts and their statuses.
However, viewing raw JSON is just to give you a sense of the data. See the next section...
I'm not sure what the domain is (localhost:5032 if you're running locally?), but you should probably just be able to go to localhost:5032/jobs and see that healthd is running and doing something. Also check your apps to confirm they're up and running. Don't expect any output from healthd directly; that's what healthtop is for.
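If you want to script that quick check instead of using a browser, here is a tiny Python sketch. It uses the :5002 server port from the question's environment variables; the paths are the ones listed in the quoted docs, and requests is an assumed dependency.

import requests

HEALTHD = "http://localhost:5002"   # HEALTHD_SERVER_HOSTPORT from the question

# Raw JSON endpoints listed in the docs, just to confirm healthd is collecting data
for path in ("/jobs", "/hosts", "/aggregations/overall"):
    resp = requests.get(HEALTHD + path)
    print(path, resp.status_code)
    print(resp.text[:200])          # first few hundred bytes of the raw JSON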

The best approach to index dynamic query in MongoDB

I am working on a log management system where the user will be able to upload logs from a file. I have an 'event' collection where I store all events from all sources (each source can have a different log format, and one collection can hold e.g. 10 000 000 records: 5 000 000 for 'source1' and 5 000 000 for 'source2'). I want to provide a filter option to the user (filtering will always be per source, so the user can filter data by level, date, etc.), and for better performance I want to create indexes, including compound indexes. Before the user uploads logs to the system, he/she will decide which filter queries to use during filtering, so I can end up with many different queries. The problem is that I can have many different sources in one collection, which means many different filter queries, but MongoDB only allows 64 indexes per collection.
So what is the best solution if I want good read performance and want to let the user filter logs (the user decides how to filter the data before the logs are uploaded to the system)? I was thinking of creating a new collection for each source, as I would then never reach 64 indexes per collection.
Sample indexes (one per filter query):
db.events.ensureIndex({"source_id": 1, "timestamp" : 1})
db.events.ensureIndex({"source_id": 1, "timestamp" : 1, "level": 1})
db.events.ensureIndex({"source_id": 1, "diagnostic_context": 1})
db.events.ensureIndex({"source_id": 1, "timestamp" : 1, "statusCode": 1})
db.events.ensureIndex({"source_id": 1, "host" : 1})
Event collection sample:
{ _id: ObjectId("507f1f77bcf86cd799439011"),
timestamp: ISODate("2012-09-27T03:42:10Z"),
thread: "[http-8080-3]",
level: "INFO",
diagnostic_context: "User 99999",
message: "existing customer saving"},
source_id: "source1"
{ _id: ObjectId("507f1f77bcf86cd799439012"),
host: "144.18.39.44",
timestamp: ISODate("2012-09-01T03:42:10Z"),
request: "GET /resources.html HTTP/1.1",
statusCode: 200
bytes_sent: 3458,
url: "http://www.aivosto.com/",
agent: "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)",
source_id: "source2"
}
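To make the per-source-collection idea concrete, here is a minimal Python sketch using pymongo (an assumed choice of driver; the database name, collection naming scheme and index shapes are illustrative, mirroring the samples above minus source_id, which is now implied by the collection itself). Each source gets its own collection, so each collection stays well under the 64-index limit.

from datetime import datetime
from pymongo import MongoClient, ASCENDING

client = MongoClient("mongodb://localhost:27017")   # assumption: local MongoDB
db = client["logs"]                                 # assumption: database name

# Index shapes chosen per source when the user defines the filters
INDEXES_BY_SOURCE = {
    "source1": [[("timestamp", ASCENDING), ("level", ASCENDING)],
                [("diagnostic_context", ASCENDING)]],
    "source2": [[("timestamp", ASCENDING), ("statusCode", ASCENDING)],
                [("host", ASCENDING)]],
}

for source, indexes in INDEXES_BY_SOURCE.items():
    coll = db["events_" + source]          # one collection per source, e.g. events_source1
    for keys in indexes:
        coll.create_index(keys)

# A filter query then goes straight to the matching collection:
cursor = db["events_source1"].find(
    {"level": "INFO", "timestamp": {"$gte": datetime(2012, 9, 1)}}
)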
