ZAP (Docker) API scan against GraphQL: specifying queries to include or exclude

Thank you in advance for your time on this.
Is there a way to tell the ZAP API scan, run with docker run -i owasp/zap2docker-stable zap-api-scan.py, which queries and/or mutations from my GraphQL schema to hit during the scan and which to exclude? Or do I need to set up my schema file so that it only includes what I want scanned?
My problem is that the schema I am trying to scan is massive. I only want to scan like 15 mutations out of about 200...
Something like:
docker run -i owasp/zap2docker-stable zap-api-scan.py \
-t https://mytarget.com \
-f graphql \
-schema schema-file.graphql \
--include-mutations file-with-list-of-mutations-to-include

The packaged scans are quite flexible, and do allow you to specify exactly which scan rules to run and which 'strength' to use for each rule.
However, there are limits to what you can easily achieve, so you might want to look at the Automation Framework, which is much more flexible.
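If the packaged scan cannot be restricted to specific operations, the fallback mentioned in the question (a schema file that contains only what should be scanned) can be generated rather than edited by hand. A minimal sketch in Python, assuming the SDL declares each mutation on a single line inside one type Mutation { ... } block; keep_mutations and the sample names are hypothetical and not part of ZAP:

```python
import re

def keep_mutations(sdl: str, wanted: set[str]) -> str:
    """Return the SDL with only the wanted fields left in `type Mutation`."""
    out, in_mutation = [], False
    for line in sdl.splitlines():
        stripped = line.strip()
        if re.match(r"type\s+Mutation\b.*\{\s*$", stripped):
            in_mutation = True
            out.append(line)
            continue
        if in_mutation:
            if stripped == "}":
                in_mutation = False
                out.append(line)
                continue
            field = re.match(r"(\w+)\s*[(:]", stripped)
            if field and field.group(1) not in wanted:
                continue  # drop mutations that are not on the allow-list
        out.append(line)
    return "\n".join(out)

# Made-up schema fragment used only for illustration.
SAMPLE = """type Mutation {
  addUser(name: String!): User
  deleteUser(id: ID!): Boolean
}
type User {
  id: ID!
}"""

print(keep_mutations(SAMPLE, {"addUser"}))
```

The trimmed text can be written to a file and used in place of the full schema. Types that are no longer referenced stay in the output; this sketch does not prune them.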


How to get only the priority from the entire JSON object in Bash Script?

I am trying to get the highest priority used for the application gateway rules. I can get the rules using the below command:
az network application-gateway rule list -g $resource_group_name --gateway-name $app_gateway_name
How can we retrieve the priority from that data in a bash script?
Try the jq tool. It requires installation, but it lets you query and even modify JSON structures.
I see that the az command also supports JMESPath, so you can filter out exactly what you need:
--query JMESPath query string. See http://jmespath.org/ for more information and examples.
I don't know the output structure, so as an example the command could look like this:
az network application-gateway rule list \
-g $resource_group_name \
--gateway-name $app_gateway_name \
--query '[].priority | sort(@) | [0]'
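For comparison, the same extraction can be done on the JSON that the command prints. A small Python sketch, assuming the output is a JSON array of rule objects that may each carry a numeric priority field (the sample data below is made up):

```python
import json

def sorted_priorities(rules_json: str) -> list[int]:
    """Return every rule priority found in the JSON text, ascending."""
    rules = json.loads(rules_json)
    return sorted(r["priority"] for r in rules if r.get("priority") is not None)

# Made-up sample standing in for the output of `az ... rule list`.
sample = '[{"name": "a", "priority": 20}, {"name": "b", "priority": 5}, {"name": "c"}]'
values = sorted_priorities(sample)
print(values[0], values[-1])  # smallest and largest priority value
```

Take values[0] or values[-1] depending on whether "highest" means the smallest number or the largest one.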

AsyncAPI: Only generate payload

Is it possible to skip generation of specific files using asyncapi-generator?
I am using the Go generator but I only need the payload.go. Right now it always generates all files:
handlers.go payloads.go publishers.go router.go server.go subscribers.go
The command I am using is:
$ docker run --rm -it \
-v ${PWD}/asyncapi.yaml:/app/asyncapi.yml \
-v ${PWD}/output:/app/output \
asyncapi/generator -o /app/output /app/asyncapi.yml @asyncapi/go-watermill-template --force-write
You cannot yet generate only selected files. I encourage you to join the related discussion on GitHub.
As I understand it, you are only interested in model generation, so you could use Modelina directly; it is the tool that go-watermill-template uses for that.
Modelina is already integrated with the AsyncAPI CLI, so you can run asyncapi generate models golang asyncapi.yml

How to bulk load data into a dgraph/standalone:graphql container?

Assuming I've a db like the quick-start of https://graphql.dgraph.io/docs/quick-start/
i.e.
type Product {
productID: ID!
name: String @search(by: [term])
reviews: [Review] @hasInverse(field: about)
}
type Customer {
custID: ID!
name: String @search(by: [hash, regexp])
reviews: [Review] @hasInverse(field: by)
}
type Review {
id: ID!
about: Product! @hasInverse(field: reviews)
by: Customer! @hasInverse(field: reviews)
comment: String @search(by: [fulltext])
rating: Int @search
}
Now I would like to import millions of entries and therefore would like to use the bulk loader. My dataset is a big folder full of .json files.
From what I've seen, I should be able to run a command like
dgraph bulk -f folderOfJsonFiles -s goldendata.schema --map_shards=4 --reduce_shards=2 --http localhost:8000 --zero=localhost:5080
But to run my server, I am using the dgraph/standalone:graphql image, started with docker run -v $(pwd):/dgraph -p 9000:9000 -it dgraph/standalone:graphql
Now, how do I start the bulk import?
1: Should I run the command within the Docker container itself (and share the volume (folder) containing all my .json files), or install dgraph on my host and run the dgraph bulk command from the host?
2: What should be the format of the .json files?
3: Would the bulk loader support blank nodes (ids which are not _:0x1234)?
[edit]
The bulk loader does not seem to support a GraphQL schema; the schema has to be converted to RDF first. To achieve this, I exported the schema and data right after importing the GraphQL schema: curl 'localhost:8080/admin/export?format=json'
Here are a few things to understand:
The bulk loader is not an offline version of the live loader. It is a tool whose purpose is to prepare the data for the Dgraph Alpha server(s).
The bulk loader seems to be able to load triples only.
The bulk loader can load a schema and files, but this is not the GraphQL schema; the GraphQL schema must be loaded separately, later.
So to answer the question:
Start the Dgraph GraphQL server using
docker run -v $(pwd)/dgraph:/dgraph -p 8000:8000 -p 9000:9000 -p 8080:8080 -p 9080:9080 -p 5080:5080 -it dgraph/standalone:graphql
For your information, this image launches the /tmp/run.sh script, which in turn runs dgraph-ratel & dgraph zero & dgraph alpha --lru_mb $lru_mb & dgraph graphql (where lru_mb is the memory you give to dgraph alpha). Keep the container's id for later; find it using docker ps if you lose it.
Unless you have more than 5 million entries (or no time), try using the live loader. If you have trouble with the live loader, for example it becomes very slow after a few hundred thousand entries (300k in my case), this is very likely because your Alpha does not have sufficient memory. In my case, I had to tune Docker to provide 16 GB of memory to the engine; the script gives the $lru_mb variable a third of the host memory.
Once you have imported your full set of data using the live loader, you can export it using docker exec -it yourDockerContainerId curl localhost:8080/admin/export?format=json. The export generates 2 files, for instance g01.json.gz and g01.schema.gz, which correspond to your entries and their schema (which is not the GraphQL schema).
To import those 2 files, g01.json.gz and g01.schema.gz, back into your Dgraph GraphQL instance, you need to convert them to the group's "p" directory output. As far as I understand, the "p" directory holds all the data for the Dgraph Alpha. If you delete it, you lose your data; if you replace it with another set, you replace / restore the data with the one you just copied. The bulk loader is not an instance of Dgraph, it is only the tool which generates those "p" directory outputs. I have been successful running it within the container. Just run
docker exec -it yourDockerContainerId dgraph bulk -f export/pathTo/g01.json.gz -s export/pathTo/g01.schema.gz --map_shards=1 --reduce_shards=1 --http localhost:8001 --zero=localhost:5080
I will be honest, I do not understand the purpose of the --http localhost:8001 argument in this command. If the bulk loader ran successfully, it created an out/0/p folder containing the data you can use in your Dgraph Alpha. Stop your Docker container with docker stop yourDockerContainerId, then replace your current Dgraph Alpha's p folder with the one generated by the bulk loader. (Re)start your Docker container and you should have your imported data. (Perhaps trash the w and zw folders as well; I have no clue about their use.)
The data is imported, but you will get a warning saying something like there is no GraphQL schema. Okay, let's import our schema (assuming you have it at path dgraph/schemas/schema.graphql):
schema=$(cat dgraph/schemas/schema.graphql | tr '\n' ' ');jq -n --arg schema "$schema" '{ query: "mutation addSchema($sch: String!) { addSchema(input: { schema: $sch }) { schema { schema } } }", variables: { sch: $schema }}' | curl -X POST -H "Content-Type: application/json" http://localhost:9000/admin -d @-
This might take a few minutes, as Dgraph will likely have to index your data according to your GraphQL schema's indexing rules (typically related to the @search directive).
You're done…
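The schema upload in the last step can also be done without jq. A minimal Python sketch that builds the same request body as the command above; the helper names are hypothetical, and the endpoint is simply the one used in this answer:

```python
import json
import urllib.request

ADD_SCHEMA = (
    "mutation addSchema($sch: String!) "
    "{ addSchema(input: { schema: $sch }) { schema { schema } } }"
)

def build_payload(schema_text: str) -> bytes:
    """Wrap the GraphQL schema text in the addSchema mutation request body."""
    body = {"query": ADD_SCHEMA, "variables": {"sch": schema_text}}
    return json.dumps(body).encode("utf-8")

def upload_schema(path: str, url: str = "http://localhost:9000/admin") -> bytes:
    """POST the schema file to the admin endpoint and return the raw response."""
    with open(path, encoding="utf-8") as handle:
        payload = build_payload(handle.read())
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

Because json.dumps escapes newlines itself, the schema does not need to be flattened onto one line first. Calling upload_schema("dgraph/schemas/schema.graphql") would send the request.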
Now, I am still not completely answering the question, because the data we are importing back is the data we just exported (and the data we originally imported using the live loader). Unfortunately, the bulk loader cannot import nice data like the live loader does; you have to feed it triples. Therefore you have to prepare the data you want to load with the bulk loader in that format. To help you with this task, I suggest you:
Run the Dgraph GraphQL server: docker run -v $(pwd)/dgraph:/dgraph -p 8000:8000 -p 9000:9000 -p 8080:8080 -p 9080:9080 -p 5080:5080 -it dgraph/standalone:graphql
Import a GraphQL schema (assuming the schema is at path dgraph/schemas/schema.graphql): schema=$(cat dgraph/schemas/schema.graphql | tr '\n' ' ');jq -n --arg schema "$schema" '{ query: "mutation addSchema($sch: String!) { addSchema(input: { schema: $sch }) { schema { schema } } }", variables: { sch: $schema }}' | curl -X POST -H "Content-Type: application/json" http://localhost:9000/admin -d @-
Create one or two basic / template entries using a GraphQL client. You can install the Altair Chrome extension, connect to http://localhost:9000/graphql, then add some data, something like:
mutation {
addCustomer(input:{name:"Toto"}){
name
}
}
You can also use a file and the live loader.
Then export your small template data: docker exec -it yourDockerContainerId curl localhost:8080/admin/export?format=json
Open g01.json.gz and you will find an example of the data the bulk loader expects to be fed with.
What about blank ids? I am not sure, but as the bulk loader does a two-level mapping on ids, I imagine you can provide your own ids and they will be converted to Dgraph ids later.
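To make the "feed it triples" point concrete, here is a rough Python sketch that turns plain records into RDF N-Quad lines with blank-node labels. It assumes that Dgraph's GraphQL layer stores a field as the predicate Type.field and tags nodes through dgraph.type (check this against your own export), that the key values are safe to use inside a blank-node label, and it writes every value as a plain string literal; to_nquads is a hypothetical helper:

```python
import json

def to_nquads(type_name: str, key: str, records: list[dict]) -> list[str]:
    """Build one blank node per record and one triple per remaining field."""
    lines = []
    for rec in records:
        node = f"_:{type_name}_{rec[key]}"  # blank-node label, assumed safe
        lines.append(f'{node} <dgraph.type> "{type_name}" .')
        for field, value in rec.items():
            if field == key:
                continue
            # json.dumps quotes and escapes the value as a string literal.
            literal = json.dumps(str(value), ensure_ascii=False)
            lines.append(f"{node} <{type_name}.{field}> {literal} .")
    return lines

# Made-up record mirroring the addCustomer mutation shown earlier.
for line in to_nquads("Customer", "id", [{"id": "c1", "name": "Toto"}]):
    print(line)
```

Comparing these lines with the contents of your own small export is the quickest way to see whether the assumed predicate names match your schema.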

Disable scheduling on second instance of same project on AWS

I have 2 instances of the same deployment/project on AWS Elastic Beanstalk.
Both contain a Laravel project with scheduling code that runs various commands; these are defined in the schedule method of the Kernel.php class within 'app/Console'. The problem I have is that if a command runs on one instance, it also runs on the second instance, which is not what I want to happen.
What I would like to happen is that the commands get run from only one instance and not the other. How do I achieve this in the easiest way possible?
Is there a Laravel package which could help me achieve this?
From Laravel 5.6:
Laravel provides an onOneServer method which you can use if your applications share a single cache server. You could use something like ElastiCache to host Redis or Memcached and use it as the cache server for both of your application instances. Then you would be able to use the onOneServer method like this:
$schedule->command('report:generate')
->fridays()
->at('17:00')
->onOneServer();
For older versions of Laravel:
You could use the jdavidbakr/multi-server-event package. Once you have it set up you should be able to use it like:
$schedule->command('inspire')
->daily()
->withoutOverlappingMultiServer();
I had the same issue when running some cron jobs (nothing related to Laravel) and I found a nice solution (I don't remember where I found it).
What I do is check whether the instance running the code is the first instance in the Auto Scaling Group. If it is the first, I execute the command; otherwise I just exit.
This is the way it's implemented:
#!/bin/bash
INSTANCE_ID=`curl http://169.254.169.254/latest/meta-data/instance-id 2>/dev/null`
REGION=`curl -s http://169.254.169.254/latest/dynamic/instance-identity/document 2>/dev/null | jq -r .region`
# Find the Auto Scaling Group name from the Elastic Beanstalk environment
ASG=`aws ec2 describe-tags --filters "Name=resource-id,Values=$INSTANCE_ID" \
--region $REGION --output json | jq -r '.[][] | select(.Key=="aws:autoscaling:groupName") | .Value'`
# Find the first instance in the Auto Scaling Group
FIRST=`aws autoscaling describe-auto-scaling-groups --auto-scaling-group-names $ASG \
--region $REGION --output json | \
jq -r '.AutoScalingGroups[].Instances[] | select(.LifecycleState=="InService") | .InstanceId' | sort | head -1`
# If the instance ids are the same exit 0
[ "$FIRST" = "$INSTANCE_ID" ]
Try implementing those calls using PHP and it should work.
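The decision logic of the script is small enough to port to any language. A sketch in Python rather than PHP, operating on an already-parsed describe-auto-scaling-groups response; is_first_instance and the sample ids are hypothetical:

```python
def is_first_instance(asg_response: dict, instance_id: str) -> bool:
    """True if instance_id sorts first among the InService instances."""
    in_service = sorted(
        inst["InstanceId"]
        for group in asg_response["AutoScalingGroups"]
        for inst in group["Instances"]
        if inst["LifecycleState"] == "InService"
    )
    return bool(in_service) and in_service[0] == instance_id

# Made-up response with the same shape the jq filter above reads.
sample_response = {
    "AutoScalingGroups": [
        {
            "Instances": [
                {"InstanceId": "i-0bbb", "LifecycleState": "InService"},
                {"InstanceId": "i-0aaa", "LifecycleState": "InService"},
                {"InstanceId": "i-0000", "LifecycleState": "Terminating"},
            ]
        }
    ]
}

print(is_first_instance(sample_response, "i-0aaa"))
```

As in the shell version, only instances in the InService state are considered, so an instance that is shutting down never becomes the one that runs the scheduled commands.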

Dump all documents of Elasticsearch

Is there any way to create a dump file that contains all the data of an index along with its settings and mappings?
Something similar to what MongoDB does with mongodump,
or to Solr, where the data folder is copied to a backup location.
Cheers!
Here's a new tool we've been working on for exactly this purpose https://github.com/taskrabbit/elasticsearch-dump. You can export indices into/out of JSON files, or from one cluster to another.
Elasticsearch supports a snapshot function out of the box:
https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-snapshots.html
We can use elasticdump to take a backup and restore it, and to move data from one server/cluster to another.
1. Commands to move the data of one index from one server/cluster to another using elasticdump.
# Copy an index from production to staging with analyzer and mapping:
elasticdump \
--input=http://production.es.com:9200/my_index \
--output=http://staging.es.com:9200/my_index \
--type=analyzer
elasticdump \
--input=http://production.es.com:9200/my_index \
--output=http://staging.es.com:9200/my_index \
--type=mapping
elasticdump \
--input=http://production.es.com:9200/my_index \
--output=http://staging.es.com:9200/my_index \
--type=data
2. Commands to move the data of all indices from one server/cluster to another using multielasticdump.
Backup
multielasticdump \
--direction=dump \
--match='^.*$' \
--limit=10000 \
--input=http://production.es.com:9200 \
--output=/tmp
Restore
multielasticdump \
--direction=load \
--match='^.*$' \
--limit=10000 \
--input=/tmp \
--output=http://staging.es.com:9200
Note:
If the --direction is dump, which is the default, --input MUST be a URL for the base location of an Elasticsearch server (e.g. http://localhost:9200) and --output MUST be a directory. Each index that matches will have a data, mapping, and analyzer file created.
To load files that you have dumped with multielasticdump, --direction should be set to load, --input MUST be a directory containing a multielasticdump dump and --output MUST be an Elasticsearch server URL.
The second command (multielasticdump) backs up settings, mappings, templates and the data itself as JSON files.
The --limit should not be more than 10000; otherwise it will raise an exception.
Get more details here.
For your case Elasticdump is the perfect answer.
First, you need to download the mapping and then the index
# Install the elasticdump
npm install elasticdump -g
# Dump the mapping
elasticdump --input=http://<your_es_server_ip>:9200/index --output=es_mapping.json --type=mapping
# Dump the data
elasticdump --input=http://<your_es_server_ip>:9200/index --output=es_index.json --type=data
If you want to dump the data on any server, I advise you to install esdump through Docker. You can get more info from the linked blog post.
ElasticSearch itself provides a way to create data backup and restoration. The simple command to do it is:
curl -XPUT 'localhost:9200/_snapshot/<backup_folder name>/<backupname>' -H 'Content-Type: application/json' -d '{
"indices": "<index_name>",
"ignore_unavailable": true,
"include_global_state": false
}'
How to create this folder, how to include its path in the Elasticsearch configuration so that it is available to Elasticsearch, and how to restore are all well explained here. To see a practical demo, surf here.
At the time of writing this answer (2021), the official way of backing up an Elasticsearch cluster is to snapshot it. Refer to: https://www.elastic.co/guide/en/elasticsearch/reference/current/snapshot-restore.html
The data itself is one or more Lucene indices, since you can have multiple shards. What you also need to back up is the cluster state, which contains all sorts of information regarding the cluster: the available indices, their mappings, the shards they are composed of, etc.
It's all within the data directory though; you can just copy it. Its structure is pretty intuitive. Right before copying, it's better to disable automatic flush (in order to back up a consistent view of the index and avoid writes to it while copying files), issue a manual flush, and disable allocation as well. Remember to copy the directory from all nodes.
Also, the next major version of Elasticsearch is going to provide a new snapshot/restore API that will allow you to perform incremental snapshots and restore them via the API too. Here is the related GitHub issue: https://github.com/elasticsearch/elasticsearch/issues/3826.
You can also dump elasticsearch data in JSON format by http request:
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html
curl -XPOST 'https://ES/INDEX/_search?scroll=10m'
curl -XPOST 'https://ES/_search/scroll' -H 'Content-Type: application/json' -d '{"scroll": "10m", "scroll_id": "ID"}'
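The two requests above form a loop: the first opens the scroll, and the second is repeated with the returned _scroll_id until a page comes back empty. A minimal Python sketch of that loop, with the HTTP calls passed in as functions so the example stays independent of any client library; scroll_all, search and scroll are hypothetical names, and the pages are fake data:

```python
from typing import Callable, Iterator

def scroll_all(
    search: Callable[[], dict],
    scroll: Callable[[str], dict],
) -> Iterator[dict]:
    """Yield every hit, following _scroll_id until a page has no hits."""
    page = search()
    while page["hits"]["hits"]:
        yield from page["hits"]["hits"]
        page = scroll(page["_scroll_id"])

# Fake pages standing in for real HTTP responses.
pages = iter([
    {"_scroll_id": "s1", "hits": {"hits": [{"_id": "1"}, {"_id": "2"}]}},
    {"_scroll_id": "s1", "hits": {"hits": [{"_id": "3"}]}},
    {"_scroll_id": "s1", "hits": {"hits": []}},
])
docs = list(scroll_all(lambda: next(pages), lambda scroll_id: next(pages)))
print(len(docs))
```

In real use, search would issue the first request and scroll the second one, each returning the parsed JSON response.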
To export all documents from Elasticsearch into JSON, you can use the esbackupexporter tool. It works with index snapshots. It takes the container with snapshots (S3, Azure blob or file directory) as the input and outputs one or several zipped JSON files per index per day. It is quite handy when exporting your historical snapshots. To export your hot index data, you may need to take a snapshot first (see the answers above).
If you want to massage the data on its way out of Elasticsearch, you might want to use Logstash. It has a handy Elasticsearch input plugin.
You can then export to anything, from a CSV file to reindexing the data on another Elasticsearch cluster, though for the latter you also have Elasticsearch's own Reindex API.
