Google Workflows only returning the first 20 documents in a collection - google-workflows

I have a simple workflow that starts by retrieving the documents from a Firebase collection, but for some reason the call only returns the first 20 documents in the collection.
The workflow is written in YAML. Is there a limitation I am unaware of?
main:
  params: [input]
  steps:
    - initialize:
        assign:
          - project: "dev"
          - collection: "shops"
    - shops:
        steps:
          - get_documents:
              call: googleapis.firestore.v1.projects.databases.documents.get
              args:
                name: ${"projects/" + project + "/databases/(default)/documents/" + collection}
              result: documents
    - endit:
        return: ${documents.documents}

Resolved: I added a pageSize to the query params.
https://cloud.google.com/firestore/docs/reference/rest/v1beta1/projects.databases.documents/list
- shops:
    steps:
      - get_documents:
          call: googleapis.firestore.v1.projects.databases.documents.get
          args:
            name: ${"projects/" + project + "/databases/(default)/documents/" + collection}
            query:
              pageSize: 10000
          result: documents
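Note that the list endpoint is paginated and the server may cap the page size, so a single large pageSize can still silently truncate a big collection. A more defensive pattern is to follow nextPageToken in a loop. Below is a minimal, untested sketch against the Firestore REST endpoint (the step names and the page size of 300 are illustrative, not from the original):
- init_paging:
    assign:
      - all_docs: []
      - page_token: ""
- fetch_page:
    call: http.get
    args:
      url: ${"https://firestore.googleapis.com/v1/projects/" + project + "/databases/(default)/documents/" + collection}
      auth:
        type: OAuth2
      query:
        pageSize: 300
        pageToken: ${page_token}  # assumed to be accepted as empty on the first call
    result: page
- collect:
    for:
      value: doc
      # default() guards against a page with no "documents" key
      in: ${default(map.get(page.body, "documents"), [])}
      steps:
        - append:
            assign:
              - all_docs: ${list.concat(all_docs, doc)}
- check_more:
    switch:
      - condition: ${"nextPageToken" in page.body}
        next: set_token
      - condition: ${true}
        next: done
- set_token:
    assign:
      - page_token: ${page.body.nextPageToken}
    next: fetch_page
- done:
    return: ${all_docs}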

Related

Redocly OpenAPI structure error. Property `openapi` is not expected here

I am trying to create API documentation using Redocly.
My openapi.yaml links to a YAML file that contains the API docs, called kpi-documentation.yaml.
link/$ref in openapi.yaml
/kpiDocumentation:
  $ref: paths/kpi-documentation.yaml
I get an error in my Visual Studio Code Redocly preview extension that says
We found structural problems in your definition, please check the files below before running the preview.
File: /Users/xx/Desktop/Projects/api-docs/openapi/paths/kpi-documentation.yaml
Problem: Property `openapi` is not expected here.
File: /Users/xx/Desktop/Projects/api-docs/openapi/paths/kpi-documentation.yaml
Problem: Property `info` is not expected here.
File: /Users/xx/Desktop/Projects/api-docs/openapi/paths/kpi-documentation.yaml
Problem: Property `paths` is not expected here.
File: /Users/xx/Desktop/Projects/api-docs/openapi/paths/kpi-documentation.yaml
Problem: Property `components` is not expected here.
The part of kpi-documentation.yaml that appears to be throwing the error is
openapi: "3.1"
info:
title: KPI API
version: '1.0'
description: Documentation of API endpoints of KPI
servers:
- url: https://api.redocly.com
paths:
I have checked the documentation on the Redocly website and it looks like my YAML structure is fine.
Also of note: kpi-documentation.yaml previews fine on its own, but not when I preview openapi.yaml, which is the root file that needs to work.
https://redocly.com/docs/openapi-visual-reference/openapi/#OAS-3.0
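For comparison, a $ref under paths is defined to resolve to a Path Item Object, i.e. a file that starts directly at the HTTP methods rather than a complete document with its own openapi, info, and paths keys, which is what the four errors above are flagging. A minimal sketch of that shape (the operation details are illustrative, not from the original files):
# paths/kpi-documentation.yaml shaped as a Path Item Object
get:
  summary: Example operation for the /kpiDocumentation path
  operationId: kpi_documentation_retrieve
  responses:
    '200':
      description: OK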
root file
openapi.yaml
openapi: 3.1.0
info:
  version: 1.0.0
  title: KPI API documentation
  termsOfService: 'https://example.com/terms/'
  contact:
    name: Brendan
    url: 'http://example.com/contact'
  license:
    name: Apache 2.0
    url: 'http://www.apache.org/licenses/LICENSE-2.0.html'
  x-logo:
    url: 'https://www.feedgy.solar/wp-content/uploads/2022/07/Sans-titre-1.png'
tags:
  - name: Insert Service 1
    description: Example echo operations.
  - name: Insert Service 2
    description: Operations about users.
  - name: Insert Service 3
    description: This is a tag description.
  - name: Insert Service 4
    description: This is a tag description.
servers:
  - url: 'https://{tenant}/api/v1'
    variables:
      tenant:
        default: www
        description: Your tenant id
  - url: 'https://example.com/api/v1'
paths:
  '/users/{username}':
    $ref: 'paths/users_{username}.yaml'
  /echo:
    $ref: paths/echo.yaml
  /pathItem:
    $ref: paths/path-item.yaml
  /pathItemWithExamples:
    $ref: paths/path-item-with-examples.yaml
  /kpiDocumentation:
    $ref: 'paths/kpi-documentation.yaml'
path item file
kpi-documentation.yaml
openapi: "3.1"
info:
title: KPI API
version: '1.0'
description: Documentation of API endpoints of KPI
servers:
- url: https://api.redocly.com
paths:
"/api/v1/corrected-performance-ratio/plants/{id}":
get:
summary: Same as the Performance Ratio, but the ratio is done using Corrected Reference Yield, so it considers thermal losses in the panels as normal. The WCPR represents the losses in the BoS (balance of system), so everything from the panel DC output to the AC output.
operationId: corrected_performance_ratio_plants_retrieve
parameters:
- in: query
name: date_end
schema:
type: string
format: date
required: true
- in: query
name: date_start
schema:
type: string
format: date
required: true
- in: query
name: frequency
schema:
enum:
- H
- D
- M
- Y
type: string
default: H
minLength: 1
- in: path
name: id
schema:
type: integer
description: A unique integer value identifying this plant.
required: true
- in: query
name: threshold
schema:
type: integer
default: 50
tags:
- corrected-performance-ratio
security:
- tokenAuth: []
- cookieAuth: []
- {}
responses:
"200":
content:
application/json:
schema:
$ref: "#/components/schemas/KPIResponse"
description: ""
"/api/v1/corrected-performance-ratio/plants/{id}/inverters":

Drop log lines to Loki using multiple conditions with Promtail

I want to drop lines in Promtail using an AND condition across two different JSON fields.
I have JSON log lines like this.
{"timestamp":"2022-03-26T15:40:41+00:00","remote_addr":"1.2.3.4","remote_user":"","request":"GET / HTTP/1.1","status": "200","body_bytes_sent":"939","request_time":"0.000","http_referrer":"http://5.6.7.8","http_user_agent":"user agent 1"}
{"timestamp":"2022-03-26T15:40:41+00:00","remote_addr":"1.2.3.4","remote_user":"","request":"GET /path HTTP/1.1","status": "200","body_bytes_sent":"939","request_time":"0.000","http_referrer":"http://5.6.7.8","http_user_agent":"user agent 1"}
{"timestamp":"2022-03-26T15:40:41+00:00","remote_addr":"1.2.3.4","remote_user":"","request":"GET / HTTP/1.1","status": "200","body_bytes_sent":"939","request_time":"0.000","http_referrer":"http://5.6.7.8","http_user_agent":"user agent 2"}
My local Promtail config looks like this.
clients:
  - url: http://localhost:3100/loki/api/v1/push
scrape_configs:
  - job_name: testing-my-job-drop
    pipeline_stages:
      - match:
          selector: '{job="my-job"}'
          stages:
            - json:
                expressions:
                  http_user_agent:
                  request:
            - drop:
                source: "http_user_agent"
                expression: "user agent 1"
            # I want this to be AND
            - drop:
                source: "request"
                expression: "GET / HTTP/1.1"
                drop_counter_reason: my_job_healthchecks
    static_configs:
      - labels:
          job: my-job
A Promtail config like this drops lines matching either of my two JSON fields, i.e. an OR.
How can I adjust my config so that I only drop lines where http_user_agent = user agent 1 AND request = GET / HTTP/1.1?
From the Promtail drop-stage documentation: if you provide multiple options they will be treated like an AND clause, where each option has to be true to drop the log.
If you wish to drop with an OR clause, then specify multiple drop stages.
https://grafana.com/docs/loki/latest/clients/promtail/stages/drop/#drop-stage
Drop logs by time OR length.
This would drop all logs older than 24h OR longer than 8 KB:
- json:
    expressions:
      time:
      msg:
- timestamp:
    source: time
    format: RFC3339
- drop:
    older_than: 24h
- drop:
    longer_than: 8kb
Drop logs by regex AND length.
This would drop all logs that contain the word "debug" AND are longer than 1 KB:
- drop:
    expression: ".*debug.*"
    longer_than: 1kb
clients:
  - url: http://localhost:3100/loki/api/v1/push
scrape_configs:
  - job_name: testing-my-job-drop
    pipeline_stages:
      - match:
          selector: '{job="my-job"}'
          stages:
            - json:
                expressions:
                  http_user_agent:
                  request:
            - labels:
                http_user_agent:
                request:
            #### method 1
            - match:
                selector: '{http_user_agent="user agent 1"}'
                stages:
                  - drop:
                      source: "request"
                      expression: "GET / HTTP/1.1"
                      drop_counter_reason: my_job_healthchecks
                      ## drops only when both conditions match
            #### method 2
            - match:
                selector: '{http_user_agent="user agent 1",request="GET / HTTP/1.1"}'
                action: drop
            #### method 3, in case a regex pattern is needed
            - match:
                selector: '{http_user_agent="user agent 1"} |~ "(?i).*GET / HTTP/1.1.*"'
                action: drop
    static_configs:
      - labels:
          job: my-job
A match stage can contain nested match stages.
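As an alternative to nesting match stages, newer Promtail releases document a list-valued source on the drop stage: the extracted values are concatenated with separator before expression is applied, which also expresses an AND in a single stage. A sketch under that assumption (verify that your Promtail version's drop-stage docs list source as "name or list of names" before relying on it):
- json:
    expressions:
      http_user_agent:
      request:
- drop:
    # values are joined as "<http_user_agent>;<request>" before matching
    source: [http_user_agent, request]
    separator: ";"
    expression: "user agent 1;GET / HTTP/1.1"
    drop_counter_reason: my_job_healthchecks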

Is it possible to take a snapshot and restore with elasticsearch-curator without losing the updates in the destination index?

I am able to run Curator to take a snapshot of the source index and restore that snapshot into the destination index.
But all the updates that I made to the destination index are lost after the next snapshot-and-restore run.
Is it possible to specify not to overwrite the updates in the destination index?
source index: test_index
destination index: dest_test_index
snapshot-action.yml file
actions:
  1:
    action: snapshot
    description: Snapshot selected indices to 'repository' with the snapshot name or name pattern in 'name'. Use all other options as assigned
    options:
      repository: esbackup
      name:
      wait_for_completion: True
      max_wait: 3600
      wait_interval: 10
    filters:
      - filtertype: pattern
        kind: regex
        value: '^(test_index)$'
        exclude:
restore-action.yml file
actions:
  1:
    action: create_index
    description: "Create the temporary index with dest_index_v2 name"
    options:
      name: dest_index_v2
  2:
    action: close
    description: >-
      Close index dest_index_v2.
    options:
      ignore_empty_list: True
      skip_flush: False
      delete_aliases: False
      ignore_sync_failures: True
      disable_action: False
    filters:
      - filtertype: pattern
        kind: prefix
        value: dest_index_v2
  3:
    action: restore
    description: >-
      Restore test_index from the most recent snapshot into the temp index dest_index_v2.
    options:
      repository: esbackup
      # If name is blank, the most recent snapshot by age will be selected
      name:
      # If indices is blank, all indices in the snapshot will be restored
      indices: ['test_index']
      rename_pattern: test_index
      rename_replacement: dest_index_v2
      wait_for_completion: True
      max_wait: 3600
      wait_interval: 10
    filters:
      - filtertype: none
  4:
    action: open
    description: >-
      Open index pattern dest_index_v2.
    options:
      disable_action: False
    filters:
      - filtertype: pattern
        kind: prefix
        value: dest_index_v2
        exclude:
  5:
    description: "Reindex dest_index_v2 into dest_test_index"
    action: reindex
    options:
      wait_interval: 9
      max_wait: -1
      request_body:
        source:
          index: dest_index_v2
        dest:
          index: dest_test_index
    filters:
      - filtertype: none
  6:
    action: delete_indices
    description: >-
      Delete index dest_index_v2. Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: False
      disable_action: False
    filters:
      - filtertype: pattern
        kind: prefix
        value: dest_index_v2
If you are looking for an Elasticsearch setting that will merge the updated destination index with the source index you are restoring from the snapshot, the short answer is NO.
You can write custom code that performs the following operations to make sure destination index updates are not lost:
Restore the source index (test_index) to a temporary index in the cluster; let's call this index temp_index.
Retrieve documents from temp_index and insert them into the destination index (dest_test_index) with op_type=create.
Operation type create ensures the index operation fails if a document with that id already exists in the index.
You can refer to the documentation here.
Hope this solves your purpose.
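Within the Curator actions above, one way to sketch this is to pass the extra _reindex options through the reindex action's request_body, since Elasticsearch's _reindex API supports dest.op_type and conflicts directly. A hedged rewrite of step 5, assuming Curator forwards request_body verbatim to the _reindex API:
5:
  description: "Reindex dest_index_v2 into dest_test_index without overwriting existing docs"
  action: reindex
  options:
    wait_interval: 9
    max_wait: -1
    request_body:
      conflicts: proceed  # skip version conflicts instead of aborting the run
      source:
        index: dest_index_v2
      dest:
        index: dest_test_index
        op_type: create   # only index docs whose _id does not already exist
  filters:
    - filtertype: none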

How do you start a workflow from another workflow and retrieve the return value of the called workflow?

I am testing Google Workflows and would like to call a workflow from another workflow, but as a separate process (not a subworkflow).
I am able to start the execution but am currently unable to retrieve the return value. Instead I receive an instance of the execution:
{
  "argument": "null",
  "name": "projects/xxxxxxxxxxxx/locations/us-central1/workflows/child-workflow/executions/9fb4aa01-2585-42e7-a79f-cfb4b57b22d4",
  "startTime": "2020-12-09T01:38:07.073406981Z",
  "state": "ACTIVE",
  "workflowRevisionId": "000003-cf3"
}
parent-workflow.yaml
main:
  params: [args]
  steps:
    - callChild:
        call: http.post
        args:
          url: 'https://workflowexecutions.googleapis.com/v1beta/projects/my-project/locations/us-central1/workflows/child-workflow/executions'
          auth:
            type: OAuth2
            scope: 'https://www.googleapis.com/auth/cloud-platform'
        result: callresult
    - returnValue:
        return: ${callresult.body}
child-workflow.yaml:
- getCurrentTime:
    call: http.get
    args:
      url: https://us-central1-workflowsample.cloudfunctions.net/datetime
    result: CurrentDateTime
- readWikipedia:
    call: http.get
    args:
      url: https://en.wikipedia.org/w/api.php
      query:
        action: opensearch
        search: ${CurrentDateTime.body.dayOfTheWeek}
    result: WikiResult
- returnOutput:
    return: ${WikiResult.body[1]}
Also, as an added question: how can I create a dynamic URL from a variable? ${} doesn't seem to work there.
As executions are asynchronous API calls, you need to poll the execution to see when it has finished.
You can use the following algorithm:
main:
  steps:
    - callChild:
        call: http.post
        args:
          url: ${"https://workflowexecutions.googleapis.com/v1beta/projects/" + sys.get_env("GOOGLE_CLOUD_PROJECT_ID") + "/locations/us-central1/workflows/http_bitly_secrets/executions"}
          auth:
            type: OAuth2
            scope: 'https://www.googleapis.com/auth/cloud-platform'
        result: workflow
    - waitExecution:
        call: CloudWorkflowsWaitExecution
        args:
          execution: ${workflow.body.name}
        result: workflow
    - returnValue:
        return: ${workflow}

CloudWorkflowsWaitExecution:
  params: [execution]
  steps:
    - init:
        assign:
          - i: 0
          - valid_states: ["ACTIVE", "STATE_UNSPECIFIED"]
          - result:
              state: ACTIVE
    - check_condition:
        switch:
          - condition: ${result.state in valid_states and i < 100}
            next: iterate
        next: exit_loop
    - iterate:
        steps:
          - sleep:
              call: sys.sleep
              args:
                seconds: 10
          - process_item:
              call: http.get
              args:
                url: ${"https://workflowexecutions.googleapis.com/v1beta/" + execution}
                auth:
                  type: OAuth2
              result: result
          - assign_loop:
              assign:
                - i: ${i + 1}
                - result: ${result.body}
        next: check_condition
    - exit_loop:
        return: ${result}
Here we have a CloudWorkflowsWaitExecution subworkflow that loops at most 100 times with a 10-second delay between polls; it stops when the workflow has finished and returns the result.
The output is:
argument: 'null'
endTime: '2020-12-09T13:00:11.099830035Z'
name: projects/985596417983/locations/us-central1/workflows/call_another_workflow/executions/05eeefb5-60bb-4b20-84bd-29f6338fa66b
result: '{"argument":"null","endTime":"2020-12-09T13:00:00.976951808Z","name":"projects/985596417983/locations/us-central1/workflows/http_bitly_secrets/executions/2f4b749c-4283-4c6b-b5c6-e04bbcd57230","result":"{\"archived\":false,\"created_at\":\"2020-10-17T11:12:31+0000\",\"custom_bitlinks\":[],\"deeplinks\":[],\"id\":\"j.mp/2SZaSQK\",\"link\":\"//<edited>/2SZaSQK\",\"long_url\":\"https://cloud.google.com/blog\",\"references\":{\"group\":\"https://api-ssl.bitly.com/v4/groups/Bg7eeADYBa9\"},\"tags\":[]}","startTime":"2020-12-09T13:00:00.577579042Z","state":"SUCCEEDED","workflowRevisionId":"000001-478"}'
startTime: '2020-12-09T13:00:00.353800247Z'
state: SUCCEEDED
workflowRevisionId: 000012-cb8
In result there is a subkey that holds the output of the external workflow execution.
The best method is now the workflows.executions.run helper method, which formats the request and blocks until the workflow execution has completed:
- run_execution:
    try:
      call: googleapis.workflowexecutions.v1.projects.locations.workflows.executions.run
      args:
        workflow_id: ${workflow}
        location: ${location}   # Defaults to current location
        project_id: ${project}  # Defaults to current project
        argument: ${arguments}  # Arguments could be specified inline as a map instead.
      result: r1
    except:
      as: e
      steps: ... # handle a failed execution
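One follow-up worth noting: the returned execution's result field is a JSON-encoded string, so it typically needs decoding before individual fields can be read. A small sketch, assuming the child workflow returns JSON:
- decode_result:
    assign:
      # r1.result is a string; json.decode turns it back into a map/list
      - child_output: ${json.decode(r1.result)}
- final:
    return: ${child_output}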

Curator 4.0: Unable to take a snapshot or run any action, following examples from the documentation

I am trying to take a snapshot of an Elasticsearch index using Curator 4 (on a Windows machine).
I am getting the error below (the same error for all actions):
Failed to complete action: snapshot. : Not an IndexList object. Type:
Any idea when we get this?
I am following the examples provided in the documentation:
https://www.elastic.co/guide/en/elasticsearch/client/curator/current/snapshot.html
Action YAML file:
actions:
  1:
    action: snapshot
    description: >-
      Snapshot logstash- prefixed indices older than 1 day (based on index
      creation_date) with the default snapshot name pattern of
      'curator-%Y%m%d%H%M%S'. Wait for the snapshot to complete. Do not skip
      the repository filesystem access check. Use the other options to create
      the snapshot.
    options:
      repository: myrepo
      name: shan
      ignore_unavailable: False
      include_global_state: True
      partial: False
      wait_for_completion: True
      skip_repo_fs_check: False
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
      - filtertype: age
        source: creation_date
        direction: younger
        unit: days
        unit_count: 1
        field:
        stats_result:
        epoch:
        exclude:
Output:
2016-07-25 22:16:40,929 INFO Action #1: snapshot
2016-07-25 22:16:40,929 INFO Starting new HTTP connection (1): 127.0.0.1
2016-07-25 22:16:40,944 INFO GET http://127.0.0.1:9200/ [status:200 request:0.015s]
2016-07-25 22:16:40,946 INFO GET http://127.0.0.1:9200/_all/_settings?expand_wildcards=open%2Cclosed [status:200 request:0.002s]
2016-07-25 22:16:40,950 INFO GET http://127.0.0.1:9200/_cluster/state/metadata/.marvel-es-1-2016.06.27,.marvel-es-1-2016.06.28,.marvel-es-1-2016.06.29,.marvel-es-1-2016.06.30,.marvel-es-data-1,shan-claim-1 [status:200 request:0.004s]
2016-07-25 22:16:40,993 INFO GET http://127.0.0.1:9200/.marvel-es-1-2016.06.27,.marvel-es-1-2016.06.28,.marvel-es-1-2016.06.29,.marvel-es-1-2016.06.30,.marvel-es-data-1,shan-claim-1/_stats/store,docs [status:200 request:0.042s]
2016-07-25 22:16:40,993 ERROR Failed to complete action: snapshot. <class 'TypeError' at 0x000000001DFCC400>: Not an IndexList object. Type: <class 'curator.indexlist.IndexList' at 0x0000000002DB39B8>.
You need to add another filtertype so Curator knows which indices to run against. For example, if your indices are named with the logstash- prefix, your filters would look like:
filters:
  - filtertype: pattern
    kind: prefix
    value: logstash-
    exclude:
  - filtertype: age
    source: creation_date
    direction: younger
    unit: days
    unit_count: 1
    field:
    stats_result:
    epoch:
    exclude:
There is bad indentation at the beginning of your file. The action list should be nested under the "actions" keyword, which is your root level.
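Putting the two answers together, a corrected action file might look like the sketch below (repository, name, and prefix values carried over from the question; note the original description says "older than 1 day" while the filter says direction: younger, so adjust whichever reflects your intent):
actions:
  1:
    action: snapshot
    description: "Snapshot logstash- prefixed indices older than 1 day"
    options:
      repository: myrepo
      name: shan
      wait_for_completion: True
    filters:
      - filtertype: pattern
        kind: prefix
        value: logstash-
      - filtertype: age
        source: creation_date
        direction: younger  # use 'older' if you really mean indices older than 1 day
        unit: days
        unit_count: 1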
