Thank you in advance for your time on this.
Is there a way to tell the ZAP API scan (run as docker run -i owasp/zap2docker-stable zap-api-scan.py) which queries and/or mutations from my GraphQL schema to hit during the scan and which to exclude, or do I need to set up my schema file so it contains only what I want scanned?
My problem is that the schema I am trying to scan is massive: I only want to scan about 15 mutations out of roughly 200...
Something like:
docker run -i owasp/zap2docker-stable zap-api-scan.py \
-t https://mytarget.com -f graphql \
-schema schema-file.graphql \
--include-mutations file-with-list-of-mutations-to-include
The packaged scans are quite flexible and do allow you to specify exactly which scan rules to run and which 'strength' to use for each rule.
However, there are limits to what you can easily achieve, so you might want to look at the Automation Framework, which is much more flexible.
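To make the Automation Framework pointer concrete, a plan for a GraphQL target might look roughly like the sketch below. This is illustrative only: the job and parameter names (graphql, endpoint, schemaFile) should be checked against the GraphQL add-on documentation, and as far as I know there is still no per-mutation include/exclude list, so trimming the schema file remains the practical filter.

```yaml
env:
  contexts:
    - name: "api"
      urls:
        - "https://mytarget.com"
jobs:
  - type: graphql            # provided by the GraphQL add-on
    parameters:
      endpoint: "https://mytarget.com/graphql"
      schemaFile: "/zap/wrk/schema-file.graphql"   # a schema trimmed to the ~15 mutations
  - type: activeScan
    parameters:
      context: "api"
```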
I am running standalone Elasticsearch/Kibana servers for multiple tenants. I would like to pull the cluster stats from each instance and import them into my own Elasticsearch/Kibana setup. How would I go about doing this? I have already started exporting the cluster stats to a file.
curl -XGET 'http://localhost:9200/_cluster/stats?human&pretty' > tenant01.json
I then transfer the tenant01.json file to my own ElasticSearch/Kibana. How would I start to import the data into a new index?
You should use the Bulk API to load the data into the new index:
curl -XPUT localhost:9200/newIndex/_bulk --data-binary @shakespeare.json
Follow the Bulk API documentation for the correct file format; note that it uses \n to separate metadata from source values, so the file must not be pretty-printed:
action_and_meta_data\n
optional_source\n
action_and_meta_data\n
optional_source\n
....
action_and_meta_data\n
optional_source\n
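For a single exported document like tenant01.json, building that payload takes only a few lines. Here is a sketch in Python; the index name newIndex comes from the command above, while the _type and _id values are illustrative (older Elasticsearch versions require a _type in the metadata):

```python
import json

def to_bulk(doc, index, doc_type, doc_id):
    """Wrap one JSON document in the two-line bulk-API format:
    an action/metadata line, then the compact (not pretty-printed) source line."""
    action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
    return json.dumps(action) + "\n" + json.dumps(doc) + "\n"

# Illustrative stand-in for the exported cluster stats.
stats = {"cluster_name": "tenant01", "indices": {"count": 12}}
payload = to_bulk(stats, "newIndex", "stats", "tenant01")
print(payload, end="")
```

Writing the result to a file gives you something you can post with the curl command shown above.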
I am attempting to load a JSON document from Hadoop HDFS into Couchbase using sqoop. I am able to load the documents correctly, but the TTL of the document is 0. I would like to expire the documents over a period of time and not have them live forever. Is that possible with the Couchbase connector for Sqoop?
As I said, the documents are loaded correctly, just without a TTL.
The document looks like this:
key1#{"key": "key1", "message": "A message here"}
key2#{"key": "key2", "message": "Another message"}
The sqoop call looks like this:
sqoop export -D mapred.map.child.java.opts="-Xmx4096m" \
-D mapred.job.map.memory.mb=6000 \
--username ${COUCHBASE_BUCKET} \
--password-file ${COUCHBASE_PASSWORD_FILE} \
--table ignored \
--connect ${COUCHBASE_URL} \
--export-dir ${INPUT_DIR} \
--verbose \
--input-fields-terminated-by '#' \
--lines-terminated-by '\n' \
-m 2
Thank you for your help.
I do not think there is a straightforward UI or setting to do it. The code would have to be modified within the connector.
There is no TTL option in the current sqoop plugin version. However, if you just want to set the same TTL for all the imported objects, you can quite easily add the code yourself. Take a look at line 212 here: https://github.com/couchbase/couchbase-hadoop-plugin/blob/master/src/java/com/couchbase/sqoop/mapreduce/db/CouchbaseOutputFormat.java#L212
You just need to add a TTL parameter to the set calls. If you want to be thorough about it, you can take the TTL value from the command line and put it in the DB configuration object, so you can use it in code.
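One wrinkle if you do patch those set calls: Couchbase interprets expiry values up to 30 days as relative seconds, and anything larger as an absolute Unix timestamp. The actual patch would be Java in CouchbaseOutputFormat; this Python snippet just sketches the normalization rule you would need:

```python
# Couchbase treats expiry values <= 30 days as relative seconds,
# and larger values as an absolute Unix timestamp.
THIRTY_DAYS = 30 * 24 * 60 * 60

def couchbase_expiry(ttl_seconds, now):
    """Return the expiry value to pass to a Couchbase set call
    for a desired TTL of ttl_seconds, starting at time now."""
    if ttl_seconds <= THIRTY_DAYS:
        return ttl_seconds          # short TTLs stay relative
    return now + ttl_seconds        # long TTLs become absolute timestamps

print(couchbase_expiry(3600, 1700000000))     # a one-hour TTL, passed through as-is
print(couchbase_expiry(5184000, 1700000000))  # a 60-day TTL, converted to a timestamp
```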
How do I move Elasticsearch data from one server to another?
I have server A running Elasticsearch 1.1.1 on one local node with multiple indices.
I would like to copy that data to server B running Elasticsearch 1.3.4
Procedure so far
Shut down ES on both servers and
scp all the data to the correct data dir on the new server. (data seems to be located at /var/lib/elasticsearch/ on my debian boxes)
change permissions and ownership to elasticsearch:elasticsearch
start up the new ES server
When I look at the cluster with the ES head plugin, no indices appear.
It seems that the data is not loaded. Am I missing something?
The selected answer makes it sound slightly more complex than it is; the following is all you need (install npm on your system first).
npm install -g elasticdump
elasticdump --input=http://mysrc.com:9200/my_index --output=http://mydest.com:9200/my_index --type=mapping
elasticdump --input=http://mysrc.com:9200/my_index --output=http://mydest.com:9200/my_index --type=data
You can skip the first elasticdump command for subsequent copies if the mappings remain constant.
I have just done a migration from AWS to Qbox.io with the above without any problems.
More details over at:
https://www.npmjs.com/package/elasticdump
Help page (as of Feb 2016) included for completeness:
elasticdump: Import and export tools for elasticsearch
Usage: elasticdump --input SOURCE --output DESTINATION [OPTIONS]
--input
Source location (required)
--input-index
Source index and type
(default: all, example: index/type)
--output
Destination location (required)
--output-index
Destination index and type
(default: all, example: index/type)
--limit
How many objects to move in bulk per operation
limit is approximate for file streams
(default: 100)
--debug
Display the elasticsearch commands being used
(default: false)
--type
What are we exporting?
(default: data, options: [data, mapping])
--delete
Delete documents one-by-one from the input as they are
moved. Will not delete the source index
(default: false)
--searchBody
Perform a partial extract based on search results
(when ES is the input,
(default: '{"query": { "match_all": {} } }'))
--sourceOnly
Output only the json contained within the document _source
Normal: {"_index":"","_type":"","_id":"", "_source":{SOURCE}}
sourceOnly: {SOURCE}
(default: false)
--all
Load/store documents from ALL indexes
(default: false)
--bulk
Leverage elasticsearch Bulk API when writing documents
(default: false)
--ignore-errors
Will continue the read/write loop on write error
(default: false)
--scrollTime
Time the nodes will hold the requested search in order.
(default: 10m)
--maxSockets
How many simultaneous HTTP requests can we make?
(default:
5 [node <= v0.10.x] /
Infinity [node >= v0.11.x] )
--bulk-mode
The mode can be index, delete or update.
'index': Add or replace documents on the destination index.
'delete': Delete documents on destination index.
'update': Use 'doc_as_upsert' option with bulk update API to do partial update.
(default: index)
--bulk-use-output-index-name
Force use of destination index name (the actual output URL)
as destination while bulk writing to ES. Allows
leveraging Bulk API copying data inside the same
elasticsearch instance.
(default: false)
--timeout
Integer containing the number of milliseconds to wait for
a request to respond before aborting the request. Passed
directly to the request library. If used in bulk writing,
it will result in the entire batch not being written.
Mostly used when you don't care too much if you lose some
data when importing but rather have speed.
--skip
Integer containing the number of rows you wish to skip
ahead from the input transport. When importing a large
index, things can go wrong, be it connectivity, crashes,
someone forgetting to `screen`, etc. This allows you
to start the dump again from the last known line written
(as logged by the `offset` in the output). Please be
advised that since no sorting is specified when the
dump is initially created, there's no real way to
guarantee that the skipped rows have already been
written/parsed. This is more of an option for when
you want to get most data as possible in the index
without concern for losing some rows in the process,
similar to the `timeout` option.
--inputTransport
Provide a custom js file to use as the input transport
--outputTransport
Provide a custom js file to use as the output transport
--toLog
When using a custom outputTransport, should log lines
be appended to the output stream?
(default: true, except for `$`)
--help
This page
Examples:
# Copy an index from production to staging with mappings:
elasticdump \
--input=http://production.es.com:9200/my_index \
--output=http://staging.es.com:9200/my_index \
--type=mapping
elasticdump \
--input=http://production.es.com:9200/my_index \
--output=http://staging.es.com:9200/my_index \
--type=data
# Backup index data to a file:
elasticdump \
--input=http://production.es.com:9200/my_index \
--output=/data/my_index_mapping.json \
--type=mapping
elasticdump \
--input=http://production.es.com:9200/my_index \
--output=/data/my_index.json \
--type=data
# Backup an index to a gzip using stdout:
elasticdump \
--input=http://production.es.com:9200/my_index \
--output=$ \
| gzip > /data/my_index.json.gz
# Backup ALL indices, then use Bulk API to populate another ES cluster:
elasticdump \
--all=true \
--input=http://production-a.es.com:9200/ \
--output=/data/production.json
elasticdump \
--bulk=true \
--input=/data/production.json \
--output=http://production-b.es.com:9200/
# Backup the results of a query to a file
elasticdump \
--input=http://production.es.com:9200/my_index \
--output=query.json \
--searchBody '{"query":{"term":{"username": "admin"}}}'
------------------------------------------------------------------------------
Learn more @ https://github.com/taskrabbit/elasticsearch-dump
Use ElasticDump
1) yum install epel-release
2) yum install nodejs
3) yum install npm
4) npm install elasticdump
5) cd node_modules/elasticdump/bin
6) Run:
./elasticdump \
--input=http://192.168.1.1:9200/original \
--output=http://192.168.1.2:9200/newCopy \
--type=data
You can use the snapshot/restore feature available in Elasticsearch for this. Once you have set up a filesystem-based snapshot repository, you can move it between clusters and restore it on a different cluster.
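Concretely, the flow is three requests: register a shared-filesystem repository, take a snapshot, then restore it on the target cluster. A sketch of the request bodies follows; the repository name, snapshot name, and path are placeholders, and the path must be listed under path.repo in elasticsearch.yml on the nodes:

```python
import json

# 1) PUT /_snapshot/migration_repo -- register a filesystem repository
register_repo = {"type": "fs", "settings": {"location": "/mnt/es_backups"}}

# 2) PUT /_snapshot/migration_repo/snap_1?wait_for_completion=true
#    (no body needed to snapshot all open indices)

# 3) POST /_snapshot/migration_repo/snap_1/_restore -- run on the new cluster
restore_body = {"indices": "*", "include_global_state": False}

print(json.dumps(register_repo))
print(json.dumps(restore_body))
```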
There is also the _reindex option
From documentation:
Through the Elasticsearch reindex API, available in version 5.x and later, you can connect your new Elasticsearch Service deployment remotely to your old Elasticsearch cluster. This pulls the data from your old cluster and indexes it into your new one. Reindexing essentially rebuilds the index from scratch and it can be more resource intensive to run.
POST _reindex
{
"source": {
"remote": {
"host": "https://REMOTE_ELASTICSEARCH_ENDPOINT:PORT",
"username": "USER",
"password": "PASSWORD"
},
"index": "INDEX_NAME",
"query": {
"match_all": {}
}
},
"dest": {
"index": "INDEX_NAME"
}
}
I've always had success simply copying the index directory over to the new server and restarting it. You'll find the index id with GET /_cat/indices; the folder matching this id is in data\nodes\0\indices (usually inside your elasticsearch folder, unless you moved it).
On Ubuntu, I moved data from ELK 2.4.3 to ELK 5.1.1.
Following are the steps
$ sudo apt-get update
$ sudo apt-get install -y python-software-properties python g++ make
$ sudo add-apt-repository ppa:chris-lea/node.js
$ sudo apt-get update
$ sudo apt-get install npm
$ sudo apt-get install nodejs
$ npm install colors
$ npm install nomnom
$ npm install elasticdump
In your home directory, go to:
$ cd node_modules/elasticdump/
Then execute the command below.
If you need basic http auth, you can use it like this:
--input=http://name:password@localhost:9200/my_index
Copy an index from production:
$ ./bin/elasticdump --input="http://Source:9200/Sourceindex" --output="http://username:password@Destination:9200/Destination_index" --type=data
If you can add the second server to the cluster, you may do this:
Add Server B to cluster with Server A
Increment number of replicas for indices
ES will automatically copy indices to server B
Close server A
Decrement number of replicas for indices
This will only work if the number of replicas equals the number of nodes.
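Steps 2 and 5 above are each a single call to the index settings API; a minimal sketch of the request bodies (the index name and replica counts are placeholders):

```python
import json

def replica_settings(count):
    """Body for PUT /<index>/_settings to change the replica count."""
    return json.dumps({"index": {"number_of_replicas": count}})

# Step 2: raise replicas so server B receives a copy of every shard.
increase = replica_settings(1)
# Step 5: drop back down once server A is gone.
decrease = replica_settings(0)
print(increase)
print(decrease)
```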
If anyone encounters the same issue: when trying to dump from elasticsearch <2.0 to >2.0, you need to do:
elasticdump --input=http://localhost:9200/$SRC_IND --output=http://$TARGET_IP:9200/$TGT_IND --type=analyzer
elasticdump --input=http://localhost:9200/$SRC_IND --output=http://$TARGET_IP:9200/$TGT_IND --type=mapping
elasticdump --input=http://localhost:9200/$SRC_IND --output=http://$TARGET_IP:9200/$TGT_IND --type=data --transform "delete doc._source['_id']"
We can use elasticdump or multielasticdump to take a backup and restore it; this lets us move data from one server/cluster to another.
Please find a detailed answer which I have provided here.
You can take a snapshot of the complete status of your cluster (including all data indices) and restore them (using the restore API) in the new cluster or server.
If you simply need to transfer data from one elasticsearch server to another, you could also use elasticsearch-document-transfer.
Steps:
Open a directory in your terminal and run
$ npm install elasticsearch-document-transfer
Create a file config.js
Add the connection details of both elasticsearch servers in config.js
Set appropriate values in options.js
Run in the terminal
$ node index.js
I guess you can copy the data folder.
Another great new tool which uses the _bulk API to reindex data between servers is esm:
Download and Install
wget https://github.com/medcl/esm/releases/download/v0.6.1/migrator-linux-amd64
mv migrator-linux-amd64 esm
chmod +x esm
Migrate One Index
Migrate a single index between 2 servers using 40 workers:
./esm -s https://my.source.server.com:9200 \
-m elastic:*** \
-d http://my.destination.server.com:9200 \
-n elastic:*** \
-x myindex \
-w 40
It may be necessary to create your index (or index template) on the destination server first.
See docs for further examples of how to migrate all or multiple indices.
If you don't want to use elasticdump as a console tool, you can use the following Node.js script.
I need to search a text in a SVN repository, inside all revisions.
I am trying this software, which indexes all repositories and then you can search:
http://www.supose.org/projects/supose/wiki
#create the index
./bin/supose sc -U "svn://..." --username ... -p ... --fromrev 0 --create
#search
./bin/supose se -Q "+contents:class"
If I understood correctly, this should search across all files for the text "class".
This should return a lot of matches, as there are a lot of Java files.
However, it only returns matches in some .xml files; it is ignoring Java files.
Why?
And what is the "search --fields" option? What is a field?