How to solve malformed URI while using elasticdump? - elasticsearch

I am using Elasticsearch 7.9 and elasticdump 6.7.
I am trying to get a dump (.json) file from Elasticsearch with all the documents of an index.
I am getting
Thu, 27 May 2021 06:26:35 GMT | starting dump
Thu, 27 May 2021 06:26:35 GMT | Error Emitted => URI malformed
Thu, 27 May 2021 06:26:35 GMT | Error Emitted => URI malformed
Thu, 27 May 2021 06:26:35 GMT | Total Writes: 0
Thu, 27 May 2021 06:26:35 GMT | dump ended with error (get phase) => URIError: URI malformed
My command
elasticdump \
--input=https://username:password@elasticsearchURL:9200/index \
--output=/home/ubuntu/dump.json \
--type=data
The problem here is that the password contains many special characters, and I cannot change the password.
I tried:
quoting the password,
escaping the special characters,
encoding the URL.
In all cases I get the same error.
Please help me send a password containing special characters (# & % ^ * : ) , $).
Thanks in advance.

Actually, I tried elasticdump and it did not work properly for me; you should use an Elasticsearch snapshot instead. elasticdump may work for smaller amounts of data, but if the data is big it won't.
Have a look at Elasticsearch snapshots.
For example, register a snapshot repository:
PUT /_snapshot/my_fs_backup?pretty
{
"type": "fs",
"settings": {
"location": "/home/zeus/Desktop/denesnapshot",
"compress": true
}
}
# remember to chmod 777 /home/zeus/Desktop/denesnapshot
# create a snapshot named deneme in the repository
PUT /_snapshot/my_fs_backup/deneme
# list all snapshots in the repository
GET /_snapshot/my_fs_backup/_all
# delete the snapshot if you want
DELETE /_snapshot/my_fs_backup/deneme
# restore the snapshot
POST /_snapshot/my_fs_backup/deneme/_restore
Additionally, you should add
path.repo: /home/zeus/Desktop/denesnapshot
to the elasticsearch.yml file.
I imported 1.5 million documents (about 300 GB) this way.
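If you prefer to run these calls from the shell instead of Kibana Dev Tools, the same requests can be made with curl. This is only a sketch of the commands above; the credentials, host, repository name and snapshot name are taken from this answer and the question, so replace them with your own (wait_for_completion just makes the call block until the snapshot finishes):
# register the filesystem repository (its location must be listed under path.repo)
curl -X PUT "https://username:password@elasticsearchURL:9200/_snapshot/my_fs_backup?pretty" \
  -H 'Content-Type: application/json' \
  -d '{"type": "fs", "settings": {"location": "/home/zeus/Desktop/denesnapshot", "compress": true}}'
# take a snapshot named deneme and wait for it to finish
curl -X PUT "https://username:password@elasticsearchURL:9200/_snapshot/my_fs_backup/deneme?wait_for_completion=true"
# restore it later
curl -X POST "https://username:password@elasticsearchURL:9200/_snapshot/my_fs_backup/deneme/_restore"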

I made the mistake of encoding the complete URL. Don't encode the complete URL; encode only the password when adding it to the URL.
For example:
Password - a@b)c#d%e*f.^gh
Encoded password - a%40b%29c%23d%25e*f.%5Egh
My script will be:
elasticdump \
--input=https://username:a%40b%29c%23d%25e*f.%5Egh@elasticsearchURL:9200/index \
--output=/home/ubuntu/dump.json \
--type=data
Please refer to an ASCII/URL encoding reference for encoding the password.
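If you don't want to encode the password by hand, a shell one-liner can do it. This is only a sketch, assuming python3 is available on the machine running elasticdump; the example password is the one from this answer, and the output may encode a few extra characters (such as *), which is harmless:
# percent-encode everything that is not URL-safe (safe='' also encodes '/')
ENCODED_PASS=$(python3 -c 'import sys, urllib.parse; print(urllib.parse.quote(sys.argv[1], safe=""))' 'a@b)c#d%e*f.^gh')
elasticdump \
  --input="https://username:${ENCODED_PASS}@elasticsearchURL:9200/index" \
  --output=/home/ubuntu/dump.json \
  --type=data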

Related

AMP update-cache resulting in 404 or 410 error from origin

I've been trying to update the AMP cached pages on my website for a couple of days now to no avail.
While the documentation for updating the cache exists, it was probably written by a Google engineer, and as a result, isn't the easiest read.
https://developers.google.com/amp/cache/update-cache
I've followed the directions to the best of my ability.
I've created a private key and a public key, created a signature.bin, and verified it using the procedure in Google's own documentation.
~$ openssl dgst -sha256 -signature signature.bin -verify public-key.pem url.txt
Verified OK
The public-key.pem has been renamed to apikey.pub and uploaded to the following directory:
https://irecover.ca/.well-known/amphtml/apikey.pub
To validate that there has been no issue in the copying, I checked the signature using the following:
$ openssl dgst -sha256 -signature signature.bin -verify <(curl https://irecover.ca/.well-known/amphtml/apikey.pub) url.txt
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   450  100   450    0     0   2653      0 --:--:-- --:--:-- --:--:--  2662
Verified OK
Now I convert the signature file to base64, replace the / with _ and the + with -, and strip the = padding and newlines:
cat signature.bin | base64 > signature.b64
sed 's/\//_/g' signature.b64 > signature.b64a
sed 's/+/-/g' signature.b64a > signature.b64b
sed 's/=//g' signature.b64b > signature.b64c
cat signature.b64c | tr -d '\n' > signature.b64
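The same conversion can be done in a single pipeline, assuming GNU coreutils base64 and tr are available:
base64 signature.bin | tr -d '\n' | tr '+/' '-_' | tr -d '=' > signature.b64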
I have made a script that builds the update-cache URL for me. It creates a timestamp at that moment and uses it for the amp_ts variable (so amp_ts is never off by more than 1 second). I then append that to the end of the query which is about to be cURL'd by the script, so it looks like this:
https://irecover-ca.cdn.ampproject.org/update-cache/c/s/irecover.ca/article?amp_action=flush&amp_ts=1581446499&amp_url_signature=KDaKbX0AbVbllwkTpDMFPOsFCRNw2sbk6Vd552bbG3u5QrecEmQ1SoMzmMR7iSXinO7LfM2bRCgJ1aD4y2cCayzrQuICrGz6b_PH7gKpo6tqETz06WmVeiP89xh_pBOu-pyN5rRHf0Pbu8oRkD2lRqgnGrLXDfIrFTTMRmHlO0bsa8GknyXL8RNXxk9ZQaufXAz-UJpoKaZBvT6hJWREAzxoZ-rGnDPVaC3nlBCu3yPorFcTbbr0CBz2svbfGgAYLQl54lLQmUpxI8661AEe1rdOLqAyLIUb4ZiSbO65-PmIkdZWVPFHMdbpSv4GMNdvodleCWBfMAcG2C09v-LR6g
However, this always results in the same error code from google.
Invalid public key due to ingestion error: 404 or 410 error from origin
Does anyone have any idea what I'm doing wrong?
A couple of things to check for apikey.pub accessibility:
The /.well-known/amphtml/apikey.pub file is accessible to both mobile and desktop user agents (e.g. no redirect for non-mobile, as the AMP cache client may be redirected); see the user-agent check sketched below.
The public key is not excluded by robots.txt, e.g.:
User-agent: *
Allow: /.well-known/amphtml/apikey.pub
The public key response has the expected headers (e.g. content-type: text/plain):
curl -I https://amp.example.com/.well-known/amphtml/apikey.pub
HTTP/2 200
date: Sun, 26 Jul 2020 23:48:55 GMT
content-type: text/plain
vary: Accept-Encoding
etag: W/"1c3-173478a8840"
last-modified: Sun, 26 Jul 2020 23:48:55 GMT
With those things in place, I get an "OK" success response from the update-cache endpoint.
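To check the first point, something like the following can be used; the user-agent strings are only illustrative examples, not the exact ones the AMP cache sends:
# fetch the key with a desktop and a mobile user agent and compare the responses
curl -I -A "Mozilla/5.0 (Windows NT 10.0; Win64; x64)" https://amp.example.com/.well-known/amphtml/apikey.pub
curl -I -A "Mozilla/5.0 (Linux; Android 10; Mobile)" https://amp.example.com/.well-known/amphtml/apikey.pub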

formatting email output in bash

I'm writing a script that sends a user an email when their AWS access keys are too old.
Right now the output looks like this:
Hello tdunphy, \n Your access key: AKIAICDOHVTMEAB6RM5Q was created on Wed Feb 7 22:55:51 EST 2018 and needs to be replaced. All AWS Keys need to be replaced if they are older than 90 days. \n Regards, \n Cloud Ops
I want the message to look like this:
Hello tdunphy,
Your access key: AKIAICDOHVTMEAB6RM5Q was created on Wed Feb 7 22:55:51 EST 2018 and needs to be replaced. All AWS Keys need to be replaced if they are older than 90 days.
Regards,
Cloud Ops
This is the line that sets up the body of the message:
MAIL_TXT1="Hello $user_name, \\n Your access key: $user_access_key1 was created on $date1 and needs to be replaced. All AWS Keys need to be replaced if they are older than 90 days. \\n Regards, \\n Cloud Ops"
Why are the newlines not working in this example? How can I get this to work?
Backslash escapes like \n are not interpreted inside regular double quotes; bash's ANSI-C quoting ($'...') does interpret them. You can use:
template=$'Hello tdunphy, \n Your access key: AKIAICDOHVTMEAB6RM5Q was created on Wed Feb 7 22:55:51 EST 2018 and needs to be replaced. All AWS Keys need to be replaced if they are older than 90 days. \n Regards, \n Cloud Ops'
Then either echo works:
$ echo "$template"
Hello tdunphy,
Your access key: AKIAICDOHVTMEAB6RM5Q was created on Wed Feb 7 22:55:51 EST 2018 and needs to be replaced. All AWS Keys need to be replaced if they are older than 90 days.
Regards,
Cloud Ops
Or printf:
$ printf "%s" "$template"
Your access key: AKIAICDOHVTMEAB6RM5Q was created on Wed Feb 7 22:55:51 EST 2018 and needs to be replaced. All AWS Keys need to be replaced if they are older than 90 days.
Regards,
Cloud Ops
With printf you can assemble the template fields more easily as well:
$ t2=$'Fields:\n\tf1=%s\n\tf2=%s'
$ printf "$t2" 101 102
Fields:
        f1=101
        f2=102
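Applied to the script in the question, the same idea looks something like the sketch below. It reuses the variable names from the question ($user_name, $user_access_key1, $date1 are assumed to be set earlier in the script); the mail command and address are only illustrative:
MAIL_TXT1=$(printf 'Hello %s,\nYour access key: %s was created on %s and needs to be replaced. All AWS Keys need to be replaced if they are older than 90 days.\nRegards,\nCloud Ops' \
    "$user_name" "$user_access_key1" "$date1")
# quote the variable when sending it so the newlines survive
echo "$MAIL_TXT1" | mail -s "AWS access key rotation" "$user_name@example.com"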

Get source code of this website

I would like to get some data about some books I want to buy, but for that I need the source code of the page and I cannot get it.
An example URL is:
http://www.mcu.es/webISBN/tituloDetalle.do?sidTitul=793927&action=busquedaInicial&noValidating=true&POS=0&MAX=50&TOTAL=0&prev_layout=busquedaisbn&layout=busquedaisbn&language=es
I'm testing with various possibilities in curl, wget, lynx, accepting cookies, etc.
# curl http://www.mcu.es/webISBN/tituloDetalle.do?sidTitul=793927&action=busquedaInicial&noValidating=true&POS=0&MAX=50&TOTAL=0&prev_layout=busquedaisbn&layout=busquedaisbn&language=es
[1] 1680
[2] 1681
[3] 1682
[4] 1683
[5] 1684
[6] 1685
[7] 1686
[8] 1687
If I look at the headers, I see a 302:
curl -I 'http://www.mcu.es/webISBN/tituloDetalle.do?sidTitul=793927&action=busquedaInicial&noValidating=true&POS=0&MAX=50&TOTAL=0&prev_layout=busquedaisbn&layout=busquedaisbn&language=es'
HTTP/1.1 302 Movido temporálmente
Date: Fri, 08 Jul 2016 09:31:07 GMT
Server: Apache
X-Powered-By: Servlet 2.4; JBoss-4.2.1.GA (build: SVNTag=JBoss_4_2_1_GA date=200707131605)/Tomcat-5.5
Location: http://www.mcu.es/paginaError.html
Vary: Accept-Encoding,User-Agent
Content-Type: text/plain; charset=ISO-8859-1
The same thing happens if I use '', "", \?, \&, wget, lynx -source, accepting cookies, etc. The only thing I can download is the error page (where the 302 sends me).
Do you know how I can download the source code of the example URL? (Bash, PHP, Python, Perl ...)
Thank you very much.
The page you are looking for isn't available. Try visiting the website in your browser; you will still not be able to get the information you need. If you need the source, you need to pass curl the -L flag and it will fetch the source code.
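For example, something like the following; note that the URL must be quoted, otherwise the shell treats each & as a background operator (which is what produced the [1] 1680 ... job numbers above), and -L makes curl follow the 302. This is only a sketch:
curl -sL -c cookies.txt -b cookies.txt \
  'http://www.mcu.es/webISBN/tituloDetalle.do?sidTitul=793927&action=busquedaInicial&noValidating=true&POS=0&MAX=50&TOTAL=0&prev_layout=busquedaisbn&layout=busquedaisbn&language=es' \
  -o page.html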

set Expiry Date S3 AWS cloud front, update with s3cmd

I have uploaded images for my website to S3 and now I want to update the Expiry Date recursively.
I have used the following command:
s3cmd --recursive modify --add-header="Cache-Control:max-age=31536000" s3://ccc-public/
but when I view an image in the AWS console, it shows the Cache-Control metadata as specified, yet the Expiry Date is still set to None.
I have also tried:
s3cmd --recursive modify --add-header="Expires: Sat, 02 Aug 2016 18:46:39 GMT" --add-header="Cache-Control:max-age=31536000" s3://ccc-public/
and again, this set the Expires metadata, but the images still don't have an Expiry Date.
How do I modify all the files so that there is an Expiry Date, using the s3cmd tool?
Any advice much appreciated.
Your command looks correct. Try removing the space after the colon in the Expires header:
--add-header="Expires:Sat, 02 Aug 2016 18:46:39 GMT"
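Combined with the command from the question, the full invocation would look something like this (just a sketch; the bucket and dates are the ones from the question):
s3cmd --recursive modify \
  --add-header="Expires:Sat, 02 Aug 2016 18:46:39 GMT" \
  --add-header="Cache-Control:max-age=31536000" \
  s3://ccc-public/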

How to properly handle a gzipped page when using curl?

I wrote a bash script that gets output from a website using curl and does a bunch of string manipulation on the html output. The problem is when I run it against a site that is returning its output gzipped. Going to the site in a browser works fine.
When I run curl by hand, I get gzipped output:
$ curl "http://example.com"
Here's the header from that particular site:
HTTP/1.1 200 OK
Server: nginx
Content-Type: text/html; charset=utf-8
X-Powered-By: PHP/5.2.17
Last-Modified: Sat, 03 Dec 2011 00:07:57 GMT
ETag: "6c38e1154f32dbd9ba211db8ad189b27"
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Cache-Control: must-revalidate
Content-Encoding: gzip
Content-Length: 7796
Date: Sat, 03 Dec 2011 00:46:22 GMT
X-Varnish: 1509870407 1509810501
Age: 504
Via: 1.1 varnish
Connection: keep-alive
X-Cache-Svr: p2137050.pubip.peer1.net
X-Cache: HIT
X-Cache-Hits: 425
I know the returned data is gzipped, because this returns html, as expected:
$ curl "http://example.com" | gunzip
I don't want to pipe the output through gunzip, because the script works as-is on other sites, and piping through gunzip would break that functionality for them.
What I've tried
changing the user-agent (I tried the same string my browser sends, "Mozilla/4.0", etc)
man curl
google search
searching stackoverflow
Everything came up empty
Any ideas?
curl will automatically decompress the response if you set the --compressed flag:
curl --compressed "http://example.com"
--compressed
(HTTP) Request a compressed response using one of the algorithms libcurl supports, and save the uncompressed document. If this option is used and the server sends an unsupported encoding, curl will report an error.
gzip is most likely supported, but you can check this by running curl -V and looking for libz somewhere in the "Features" line:
$ curl -V
...
Protocols: ...
Features: GSS-Negotiate IDN IPv6 Largefile NTLM SSL libz
Note that it's really the website in question that is at fault here. If curl did not pass an Accept-Encoding: gzip request header, the server should not have sent a compressed response.
In the relevant bug report, "Raw compressed output when not using --compressed but server returns gzip data #2836", the developer says:
The server shouldn't send content-encoding: gzip without the client having signaled that it is acceptable.
Besides, when you don't use --compressed with curl, you tell the command line tool you rather store the exact stream (compressed or not). I don't see a curl bug here...
So if the server could be sending gzipped content, use --compressed to let curl decompress it automatically.
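In the context of the script from the question, that means the only change needed is adding the flag to the existing curl call; a minimal sketch (the URL and variable name are placeholders):
# --compressed sends Accept-Encoding and transparently decompresses the response;
# sites that don't compress are unaffected, so the rest of the script stays the same
html=$(curl -s --compressed "http://example.com")
# ... existing string manipulation on "$html" ...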
