How to configure NiFi InvokeHTTP to post insert in Clickhouse - apache-nifi

I've in a NiFi attribute the full INSERT command that works with CURL:
echo "INSERT INTO default.PERFTEST_BUFFER VALUES (1, '2020-04-09 19:06:02', 48.8644, 'A')" \
| curl 'http://xxx.xxx.xxx.xxx:8123/?' --data-binary #-
Insert in Attribut :
INSERT INTO default.PERFTEST_BUFFER
VALUES (1, '2020-04-09 19:06:02', 48.8644, 'A')
I can't figure out what should be all the parameters of the InvokeHTTP processor.
I've used for the first two :
POST
http://xxx.xxx.xxx.xxx:8123/
then I'm lost.
Any idea how to configure it ?

If your query is static you can pass it in the query string in the encoded format:
# GET
curl 'http://xxx.xxx.xxx.xxx:8123/?query=INSERT%20INTO%20default.PERFTEST_BUFFER%20VALUES%20%281%2C%20%272020-04-09%2019%3A06%3A02%27%2C%2048.8644%2C%20%27A%27%29'
# POST
curl -d '' 'http://xxx.xxx.xxx.xxx:8123/?query=INSERT%20INTO%20default.PERFTEST_BUFFER%20VALUES%20%281%2C%20%272020-04-09%2019%3A06%3A02%27%2C%2048.8644%2C%20%27A%27%29'
It needs to pass the required values to just two params: HTTP Method and Remote URL of InvokeHTTP-processor.

The way I made it work thanks to daggett and and example here :
https://github.com/zezutom/NiFiByExample
Use before the InvokeHTTP a processor ReplaceText that replace the flowfile content by the variable "Insert" value that have the full Insert command
Search Value (?s)(^.*$)
Replacement Value ${Insert}
So the flow file content is
INSERT INTO default.PERFTEST_BUFFER VALUES (1, '2020-04-09 19:06:02', 48.8644, 'A')
Follow by a processor InvokeHTTP the following parameters
HTTP Method POST
Remote URL http://xxx.xxx.xxx.xxx:8123/
SSL Context Service No value set
Connection Timeout 5 secs
Read Timeout 15 secs
Include Date Header True
Follow Redirects True
Attributes to Send No value set
Basic Authentication Username No value set
Basic Authentication Password No value set
Proxy Configuration Service No value set
Proxy Host No value set
Proxy Port No value set
Proxy Type http
Proxy Username No value set
Proxy Password No value set
Put Response Body In Attribute No value set
Max Length To Put In Attribute 256
Use Digest Authentication false
Always Output Response true
Add Response Headers to Request false
Content-Type ${mime.type}
Send Message Body true
Use Chunked Encoding false
Penalize on "No Retry" false
Use HTTP ETag false
Maximum ETag Cache Size 10MB

Related

Does AWS apigateway change http body? How can I stop it from doing this?

Does AWS apigateway change http body? How can I stop it from doing this?
My application:
(1) A front end "UI" that sends a "http request" using "POST method" that contains a "zip file" in "body" through "form-data".
(2) AWS "apigateway" receives this request and forward it to "Lambda Proxy"
(3) AWS "Lambda" implemented by python coding receives this request and decompresses this zip file to a temporary folder.
The problem I'm facing:
(1) and (2) works fine, but in (3) the pythong program at lambda failed to decompress the file.
My finding:
It seems that when sending from the "UI" the body contains the binary data of the zip file
like below:
"PK\x03\x04\n\x00\x00\x00\x00\x00\xd6;TO\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x06\x00\x00\x00x2.txtPK\x03\x04\n\x00\x00\x00\x00\x00\xd6;TO\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x06\x00\x00\x00x1.txtPK\x01\x02\x14\x00\n\x00\x00\x00\x00\x00\xd6;TO\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x06\x00\x00\x00\x00\x00\x00\x00\x00\x00
\x00\x00\x00\x00\x00\x00\x00x2.txtPK\x01\x02\x14\x00\n\x00\x00\x00\x00\x00\xd6;TO\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x06\x00\x00\x00\x00\x00\x00\x00\x00\x00
\x00\x00\x00$\x00\x00\x00x1.txtPK\x05\x06\x00\x00\x00\x00\x02\x00\x02\x00h\x00\x00\x00H\x00\x00\x00\x00\x00"
But at (3) the python code at lambda, if we just simply returns the response like below:
response = {
"statusCode": 200,
"headers": {
"lambda-response": str(event["body"])
},
"body": "",
"isBase64Encoded": False
}
return response
will find that the binary data in the body,
seems like apigateway has changed the content
from:
"PK\x03\x04\n\x00\x00\x00\x00\x00\xd6;TO\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x06\x00\x00\x00x2.txtPK\x03\x04\n\x00\x00\x00\x00\x00\xd6;TO\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x06\x00\x00\x00x1.txtPK\x01\x02\x14\x00\n\x00\x00\x00\x00\x00\xd6;TO\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x06\x00\x00\x00\x00\x00\x00\x00\x00\x00
\x00\x00\x00\x00\x00\x00\x00x2.txtPK\x01\x02\x14\x00\n\x00\x00\x00\x00\x00\xd6;TO\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x06\x00\x00\x00\x00\x00\x00\x00\x00\x00
\x00\x00\x00$\x00\x00\x00x1.txtPK\x05\x06\x00\x00\x00\x00\x02\x00\x02\x00h\x00\x00\x00H\x00\x00\x00\x00\x00"
into:
"PK\u0003\u0004\n\u0000\u0000\u0000\u0000\u0000\ufffd;TO\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0006\u0000\u0000\u0000x2.txtPK\u0003\u0004\n\u0000\u0000\u0000\u0000\u0000\ufffd;TO\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0006\u0000\u0000\u0000x1.txtPK\u0001\u0002\u0014\u0000\n\u0000\u0000\u0000\u0000\u0000\ufffd;TO\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0006\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000
\u0000\u0000\u0000\u0000\u0000\u0000\u0000x2.txtPK\u0001\u0002\u0014\u0000\n\u0000\u0000\u0000\u0000\u0000\ufffd;TO\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0006\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000\u0000
\u0000\u0000\u0000$\u0000\u0000\u0000x1.txtPK\u0005\u0006\u0000\u0000\u0000\u0000\u0002\u0000\u0002\u0000h\u0000\u0000\u0000H\u0000\u0000\u0000\u0000\u0000\r\n"
Which is weird, what can I do to stop this?
(2019/12/17 update) below the lambda code I'm using.
import json # to decode json
import os # file IO
import shutil # file IO (use this to recursively force remove a directory)
print('Loading function')
def decompress_zip_file(src_file_path, dest_dir_path):
'''
Decompress a zip file into a directory.
Args:
src_file_path (Srting): source zip file's path.
dest_dir_path (Srting): the destination of the output directory.
Returns:
isSuccess (bool): the operation is successful or not.
'''
error_msg = "Nothing."
try:
if(os.path.isdir(dest_dir_path)):
shutil.rmtree(dest_dir_path)
with zipfile.ZipFile(src_file_path, 'r') as zip_ref:
zip_ref.extractall(dest_dir_path)
except Exception as ep:
error_msg = "Error in decompress_zip_file(), ep={}:{}".format(type(ep).__name__, str(ep))
print(error_msg)
return (False, error_msg)
return (True, error_msg)
def decompress_zip_file_from_content_in_binary(src_file_in_binary, dest_dir_path):
'''
Decompress a zip file content into a directory.
Args:
src_file_in_binary (byte array): source zip file's content in binary format.
dest_dir_path (Srting): the destination of the output directory.
Returns:
isSuccess (bool): the operation is successful or not.
'''
# write the obtained binary data into a tmp zip file
tmp_file_path = "/tmp/tmp.zip"
if(os.path.isfile(tmp_file_path)):
os.remove(tmp_file_path)
output_file = open(tmp_file_path, 'wb')
output_file.write(src_file_in_binary)
output_file.close()
(isSuccess, error_msg) = decompress_zip_file(tmp_file_path, dest_dir_path)
return (isSuccess, error_msg)
def convert_from_http_body_encoding_to_local_binary(extracted_file_from_http_body_str):
'''
Extract the file (in binary string format) from event['body'] encoding to local binary encoding.
Args:
extracted_file_from_http_body_str (string): the event['body'] file (in binary string format),.
Returns:
zipfile_binary1 (binary array): the conversion result.
'''
zipfile_binary1 = bytes(extracted_file_from_http_body_str, encoding = "ascii") # convert into a zipfile in binary format
return zipfile_binary1
def extract_zipfile_binary_from_body(body_str):
'''
Extract the zipfile (in binary format) from event['body'] string.
Args:
body_str (string): the event['body'] string.
Returns:
(binary array): the conversion result.
'''
retValue = ""
tmpArray = body_str.split("application/zip") # split the content based on MIME part field data; cut the head
if(len(tmpArray) > 1):
retValue += "entered-Lv1."
tmpArray = tmpArray[1].split("PK") # split the content based on zip file header.
if(len(tmpArray) > 1):
retValue += "entered-Lv2."
zipfile_str = "PK" + 'PK'.join(tmpArray[1:]) # add back the zip file header
tmpArray = zipfile_str.split("------WebKitFormBoundary") # split the content based on MIME part field data; cut the tail
if(len(tmpArray) > 1):
zipfile_str = tmpArray[0]
zipfile_binary = convert_from_http_body_encoding_to_local_binary(zipfile_str)
retValue = zipfile_binary
return retValue
def handler(event, context):
'''Provide an event that contains the following keys:
- operation: one of the operations in the operations dict below
- payload: a parameter to pass to the operation being performed
'''
# set the mapping table for "operation" x "return value"
operations = {
'unzip': lambda x: decompress_zip_file_from_content_in_binary(**x), # unzip an uploaded file
'ping': lambda x: 'pong' # respond to ping req.
}
# because we use "Lambda Proxe", means we have api-gateway forward the whole packet without resolving it for lambda.
event_headers = event["headers"]
operation = event_headers['operation']
event_body = event["body"]
if(operation == 'unzip'):
src_file_in_binary = extract_zipfile_binary_from_body(event_body)
payload_json = {}
payload_json['src_file_in_binary'] = src_file_in_binary
payload_json['dest_dir_path'] = "/tmp/tmp_zipfile_output"
event_headers["payload"] = payload_json
if operation in operations:
responseBody = operations[operation](event_headers.get('payload'))
response = {
"statusCode": 200,
"headers": {
"lambda-response": str(responseBody) # the api-gateway will forward the header to the front end.
},
"body": "",
"isBase64Encoded": False
}
return response
else:
raise ValueError('Unrecognized operation "{}"'.format(operation))
Below is a response from AWS support. LGTM. Leave it here so that people can see the solution to this issue in the future.
=====================Below is the response from AWS support ==================
Hi,
Thank you for contacting AWS Premium Support. I am Jyoti, and I will assist you with this case today.
From the case correspondence, I understand that you are concerned that API Gateway modifies
the binary data payload before proxying to your Lambda function. Please correct me if my understanding is wrong.
Expected Behaviour:
API gateway does modify the binary data payload into UTF-8 encoded JSON strings if
the API is configured at its default settings. Hence this is an expected behaviour.
Kindly note, as per [1], we must configure the API to support binary payloads for
our API in API Gateway. API Gateway can not send binary as is, since it has to send
a JSON body to the lambda proxy. Hence, it encodes the data/payload in UTF-8 by default.
Solution:
In order to overcome the aforementioned challenge, we need to add the desired
binary media types (application/zip in this case) to the binaryMediaTypes list
on the RestApi resource's settings page. For further information on how to achieve
this, please refer here --> [2]. If this property is not defined, the payloads
are handled as UTF-8 encoded JSON strings as mentioned in [1].
This is why the file in your request looks UTF-8 encoded. After configuring the API,
the event received by the Lambda would be a Base64-encoded string.
If you want to conduct operations on this object (the encoded request body or 'event["body"]'),
then you may decode the base64-encoded string to its orginal binary form by following
the below lines (in case of python runtime) :
import base64
coded_string = str(event["body"])
base64.b64decode(coded_string)
Troubleshooting:
I tried to replicate your setup in my environment. Instead of the frontend 'UI' of the application,
I used Postman as a client, while the rest of the setup (API Gateway and Lambda) are similar.
I am making a POST request to my API from Postman, with the request headers 'Content-Type' and 'Accept',
both set to the value 'application/zip', which is the binary media type that is being sent and
also being expected in the response. My API has been configured to support binary media types being
passed in the request body. I have added 'application/zip' in the binaryMediaTypes list for the API.
Finally, in the Lambda function I am decoding the base64-encoded request body (i.e. event["body"])
to its original binary form by using the base64 library (in python).
If you still want to confirm the consistency of your request's form-data out by returning the binary
data in your response, you can refer to the following snippet:
response {
'isBase64Encoded': True, #Ensure the body is base encoded
'statusCode': 200,
'headers': { "Content-Type": "applicaiton/zip" }, #Define the Content-Type
'body': event["body"] #Response Body returns the Base64-encoded value
}
We set the isBase64Encoded parameter to True and API Gateway automatically decodes the
response body depending on the Content-Type (i.e. the binary data/media type) that the
client (in my case Postman), is set to receive (i.e. application/zip). Kindly note, the 'Accept'
header that I had sent in my header, is to validate that the response body contains the binary
data type, the request was made for.
The above response body was the same as the request body binary data that was first sent
through the API, in my setup.
Hope I have addressed your concerns. However, if you still need help with the implementation,
please contact us again and I will be happy to assist you.
References:
=-=-=-=--=-=-=-=-=-=
[1] Support Binary Payloads in API Gateway: https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-payload-encodings.html
[2] Enable Binary Support Using the API Gateway Console: https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-payload-encodings-configure-with-console.html
Best regards,
Jyoti Prakash P.
Amazon Web Services
2019/12/20 update
I realize that my content type is actually multipart instead of application/zip so I modified again the setting then it worked.
Below is the help from AWS support. Many thanks to their help.
Hi,
Thanks a lot for elaborating your application flow and the logs. I understand now that your HTTP Request header 'content-type' is set to 'multipart/form-data'. I agree that for a web form to upload a file it is quite common to set content type as form-data and AWS API Gateway does support it. You would like to know if you could prevent UTF-8 encoding without changing the front end code. Please correct me if my understanding is wrong.
For the ease of discussion, I would like to separate the approach of troubleshooting for the HTTP request and response.
For the request to the API:
Please add 'multipart/form-data' as one of the values in the binaryMediaType list in your "API settings page in the API Gateway console. You would not have to alter your code or HTTP request or any of it's headers. Kindly note to handle binary media/data in API Gateway, the HTTP Request Content-Type header must match the values in binaryMediaType list.
In your use case, if you want to send the binary media back in a response for your request, the HTTP Request 'Content-Type' and 'Accept' headers, the binaryMediaType value of the API and the HTTP Response 'Content-Type' must all be set to 'multipart/form-data'. I tried the above and it worked for me with Postman Client. The 'boundary' directive is set up by Postman automatically if the HTTP Request 'Content-Type' is set to 'multipart/form-data'. Hence, you would have to only add 'multipart/form-data' in the 'binaryMediaType' list. Please have a look at my HTTP request, below:
POST /stg-with-logs HTTP/1.1
Host: <some-api-id>.execute-api.us-east-1.amazonaws.com
Content-Type: multipart/form-data; boundary=----WebKitFormBoundary7MA4YWxkTrZu0gW
Accept: multipart/form-data
Cache-Control: no-cache
Postman-Token: 123b64f9-5669-f794-b9df-34a7561e9708
------WebKitFormBoundary7MA4YWxkTrZu0gW
Content-Disposition: form-data; name="File"; filename="archive.zip"
Content-Type: application/zip
------WebKitFormBoundary7MA4YWxkTrZu0gW--
For the response from the API:
I noticed while going through your API Gateway Logs, the header 'isBase64Encoded' was not set. Kindly set that to true. API Gateway automatically decodes any base64-encoded string in the body of your HTTP response, if 'isBase64Encoded' is set to true. Please have a look at the HTTP Response from my lambda below:
(a6729f56-b245-45a4-9ac4-7e00b23c8957) Endpoint response body before transformations:
{
"isBase64Encoded": true,
"statusCode": 200,
"headers": {
"Content-Type": "multipart/form-data",
"Accpet": "multipart/form-data"
},
"body": "LS0tLS0tV2ViS2l0Rm9ybUJvdW5kYXJ5SmxkSW1aV1lHczlSTndPWQ0KQ29udGVudC1EaXNwb3NpdGlvbjogZm9ybS1kYXRhOyBuYW1lPSJGaWxlIjsgZmlsZW5hbWU9ImFyY2hpdmUuemlwIg0KQ29udGVudC1UeXBlOiBhcHBsaWNhdGlvbi96aXANCg0KUEsDBBQAAAAIAKF4kE9/Mo7/XgAAAJcAAAAaABwASGVsbG8tV29ybGQtNjY3MzMxNTI4MS50eHRVVAkAA8ZP910SUPdddXgLAAEEHZHreQTMewNxNY1BDgIxDAPvvIVPOY3SEC+9WCrfJ13EZWTNHKwKkzMmxIp5dpsnFMlqrjzBF/SKxCW2/8dl3ttGGjTqnkdMG+Wwj96jA3/YJsC2QF9iesuLUXPfv80KrpaVYeDjC1BLAQIeAxQAAAAIAKF4kE9/Mo7/XgAAAJcAAAAaABgAAAAAAAEAAACkgQAAAABIZWxsby1Xb3JsZC02NjczMzE1MjgxLnR4dFVUBQADxk/3XXV4CwABBB2R63kEzHsDcVBLBQYAAAAAAQABAGAAAACyAAAAAAANCi0tLS0tLVdlYktpdEZvcm1Cb3VuZGFyeUpsZEltWldZR3M5Uk53T1kNCkNvbnRlbnQtRGlzcG9zaXRpb246IGZvcm0tZGF0YTsgbmFtZT0iVGVzdCBEYXRhIg0KDQpUZXN0aW5nIEJvdW5kYXJ5IGluIG11bHRpcGFydC9mb3JtLWRhdGENCi0tLS0tLVdlYktpdEZvcm1Cb3VuZGFyeUpsZEltWldZR3M5Uk53T1ktLQ0K"
}
Along with this correspondence I am attaching my API Gateway Swagger file and Lambda function code for your reference. The setup worked fine for me and I was able to return the binary payload upon making an HTTP Request. If you want to test it out in your environment, please set the appropriate credentials and lambda uri in the Swagger file.
Hope this addresses your concern. However, if the issue still persists or you have any further questions, please contact us again and I will be happy to assist you.
To see the file named 'binaryPost-stg-with-logs-oas30-apigateway.yaml,python-binary-response.py' included with this correspondence, please use the case link given below the signature.
Best regards,
Jyoti Prakash P.
Amazon Web Services
Check out the AWS Support Knowledge Center, a knowledge base of articles and videos that answer customer questions about AWS services: https://aws.amazon.com/premiumsupport/knowledge-center/?icmpid=support_email_category

Airflow SimpleHttpOperator for HTTPS

I'm trying to use SimpleHttpOperator for consuming a RESTful API. But, As the name suggests, it only supporting HTTP protocol where I need to consume a HTTPS URI. so, now, I have to use either "requests" object from Python or handle the invocation from within the application code. But, It may not be a standard way. so, I'm looking for any other options available to consume HTTPS URI from within Airflow. Thanks.
I dove into this and am pretty sure that this behavior is a bug in airflow. I have created a ticket for it here:
https://issues.apache.org/jira/browse/AIRFLOW-2910
For now, the best you can do is override SimpleHttpOperator as well as HttpHook in order to change the way that HttpHook.get_conn works (to accept https). I may end up doing this, and if I do I'll post some code.
Update:
Operator override:
from airflow.operators.http_operator import SimpleHttpOperator
from airflow.exceptions import AirflowException
from operators.https_support.https_hook import HttpsHook
class HttpsOperator(SimpleHttpOperator):
def execute(self, context):
http = HttpsHook(self.method, http_conn_id=self.http_conn_id)
self.log.info("Calling HTTP method")
response = http.run(self.endpoint,
self.data,
self.headers,
self.extra_options)
if self.response_check:
if not self.response_check(response):
raise AirflowException("Response check returned False.")
if self.xcom_push_flag:
return response.text
Hook override
from airflow.hooks.http_hook import HttpHook
import requests
class HttpsHook(HttpHook):
def get_conn(self, headers):
"""
Returns http session for use with requests. Supports https.
"""
conn = self.get_connection(self.http_conn_id)
session = requests.Session()
if "://" in conn.host:
self.base_url = conn.host
elif conn.schema:
self.base_url = conn.schema + "://" + conn.host
elif conn.conn_type: # https support
self.base_url = conn.conn_type + "://" + conn.host
else:
# schema defaults to HTTP
self.base_url = "http://" + conn.host
if conn.port:
self.base_url = self.base_url + ":" + str(conn.port) + "/"
if conn.login:
session.auth = (conn.login, conn.password)
if headers:
session.headers.update(headers)
return session
Usage:
Drop-in replacement for SimpleHttpOperator.
This is a couple of months old now, but for what it is worth I did not have any issue with making an HTTPS call on Airflow 1.10.2.
In my initial test I was making a request for templates from sendgrid, so the connection was set up like this:
Conn Id : sendgrid_templates_test
Conn Type : HTTP
Host : https://api.sendgrid.com/
Extra : { "authorization": "Bearer [my token]"}
and then in the dag code:
get_templates = SimpleHttpOperator(
task_id='get_templates',
method='GET',
endpoint='/v3/templates',
http_conn_id = 'sendgrid_templates_test',
trigger_rule="all_done",
xcom_push=True
dag=dag,
)
and that worked. Also notice that my request happens after a Branch Operator, so I needed to set the trigger rule appropriately (to "all_done" to make sure it fires even when one of the branches is skipped), which has nothing to do with the question, but I just wanted to point it out.
Now to be clear, I did get an Insecure Request warning as I did not have certificate verification enabled. But you can see the resulting logs below
[2019-02-21 16:15:01,333] {http_operator.py:89} INFO - Calling HTTP method
[2019-02-21 16:15:01,336] {logging_mixin.py:95} INFO - [2019-02-21 16:15:01,335] {base_hook.py:83} INFO - Using connection to: id: sendgrid_templates_test. Host: https://api.sendgrid.com/, Port: None, Schema: None, Login: None, Password: XXXXXXXX, extra: {'authorization': 'Bearer [my token]'}
[2019-02-21 16:15:01,338] {logging_mixin.py:95} INFO - [2019-02-21 16:15:01,337] {http_hook.py:126} INFO - Sending 'GET' to url: https://api.sendgrid.com//v3/templates
[2019-02-21 16:15:01,956] {logging_mixin.py:95} WARNING - /home/csconnell/.pyenv/versions/airflow/lib/python3.6/site-packages/urllib3/connectionpool.py:847: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
InsecureRequestWarning)
[2019-02-21 16:15:05,242] {logging_mixin.py:95} INFO - [2019-02-21 16:15:05,241] {jobs.py:2527} INFO - Task exited with return code 0
I was having the same problem with HTTP/HTTPS when trying to set the connections using environment variables (although it works when i set the connection on the UI).
I've checked the issue #melchoir55 opened (https://issues.apache.org/jira/browse/AIRFLOW-2910) and you don't need to make a custom operator for that, the problem is not that HttpHook or HttpOperator can't use HTTPS, the problem is the way get_hook parse the connection string when dealing with HTTP, it actually understand that the first part (http:// or https://) is the connection type.
In summary, you don't need a custom operator, you can just set the connection in your env as the following:
AIRFLOW_CONN_HTTP_EXAMPLE=http://https%3a%2f%2fexample.com/
Instead of:
AIRFLOW_CONN_HTTP_EXAMPLE=https://example.com/
Or set the connection on the UI.
It is not a intuitive way to set up a connection but I think they are working on a better way to parse connections for Ariflow 2.0.
In Airflow 2.x you can use https URLs by passing https for schema value while setting up your connection and can still use SimpleHttpOperator like shown below.
my_api = SimpleHttpOperator(
task_id="my_api",
http_conn_id="YOUR_CONN_ID",
method="POST",
endpoint="/base-path/end-point",
data=get_data,
headers={"Content-Type": "application/json"},
)
Instead of implementing HttpsHook, we could just put one line of codes into HttpsOperator(SimpleHttpOperator)#above as follows
...
self.extra_options['verify'] = True
response = http.run(self.endpoint,
self.data,
self.headers,
self.extra_options)
...
in Airflow 2, the problem is been resolved.
just check out that :
host name in Connection UI Form, don't end up with /
'endpoint' parameter of SimpleHttpOperator starts with /
I am using Airflow 2.1.0,and the following setting works for https API
In connection UI, setting host name as usual, no need to specify 'https' in schema field, don't forget to set login account and password if your API server request ones.
Connection UI Setting
When constructing your task, add extra_options parameter in SimpleHttpOperator, and put your CA_bundle certification file path as the value for key verify, if you don't have a certification file, then use false to skip verification.
Task definition
Reference: here

HAproxy+Lua: Return requests if validation fails from Lua script

We are trying to build an incoming request validation platform using HAProxy+Lua.
Our use-case is to create a LUA scripts that will essentially make a socket call to a Validation API, and based on the response
from Validation API we want to redirect the request to a backend API, and if the validation fails we would want to return
the request right from the LUA script. For example, for 200 response we would want to redirect the request to backend api, and for 404 we would want
to return the request. From the documentation, I understand that there are various default functions available
with Lua-Haproxy integration.
core.register_action() --> I'm using this. Take TXN as input
core.register_converters() --> Essentially used for string manipulations.
core.register_fetches() --> Takes TXN as input and returns string; Mainly used for representing dynamic backend profiles in haproxy config
core.register_init() --> Used for initialization
core.register_service() --> You have to return the response mandatorily while using this function, which doesn't satisfy our requirements
core.register_task() --> For using normal functions. No mandatory input class. TXN is required to fetch header details from request
I have tried all of the functions from above list, I understand that core.register_service is basically to return a response from the
Lua script. However, what is problematic is, we must send the response from the LUA script and it will not redirect the request to BACKEND.
Currently, I am using core.register_action to interrupt the requests, but I'm not able to return the request using this function. Here's
what my code looks like:
local http_socket = require("socket.http")
local pretty_print = require("pl.pretty")
function add_http_request_header(txn, name, value)
local headerName = name
local headerValue = value
txn.http:req_add_header(headerName, headerValue)
end
function call_validation_api()
local request, code, header = http_socket.request {
method = "GET", -- Validation API Method
url = "http://www.google.com/" -- Validation API URL
}
-- Using core.log; Print in some cases is a blocking operation http://www.arpalert.org/haproxy-lua.html#h203
core.Info( "Validation API Response Code: " .. code )
pretty_print.dump( header )
return code
end
function failure_response(txn)
local response = "Validation Failed"
core.Info(response)
txn.res:send(response)
-- txn:close()
end
core.register_action("validation_action", { "http-req", "http-res" }, function(txn)
local validation_api_code = call_validation_api()
if validation_api_code == 200 then
core.Info("Validation Successful")
add_http_request_header(txn, "test-header", "abcdefg")
pretty_print.dump( txn.http:req_get_headers() )
else
failure_response(txn) --->>> **HERE I WANT TO RETURN THE RESPONSE**
end
end)
Following is the configuration file entry:
frontend http-in
bind :8000
mode http
http-request lua.validation_action
#Capturing header of the incoming request
capture request header test-header len 64
#use_backend %[lua.fetch_req_params]
default_backend app
backend app
balance roundrobin
server app1 127.0.0.1:9999 check
Any help is much appreciated in achieving this functionality. Also, I understand that SOCKET call from Lua script is a blocking call, which is opposite to HAProxy's default nature of keep-alive connection. Please feel free to suggest any other utility to achieve this functionality, if you have already used it.
Ok I have figured out the answer to this question:
I created 2 backends for success and failure of requests, and based on the response I am returning 2 different strings. In "failure_backend", I have called a different service, which essentially is a core.register_service and can return the response. I'm pasting code for both the configuration file and lua script
HAProxy conf file:
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
# to have these messages end up in /var/log/haproxy.log you will
# need to:
#
# 1) configure syslog to accept network log events. This is done
# by adding the '-r' option to the SYSLOGD_OPTIONS in
# /etc/sysconfig/syslog
#
# 2) configure local2 events to go to the /var/log/haproxy.log
# file. A line like the following can be added to
# /etc/sysconfig/syslog
#
# local2.* /var/log/haproxy.log
#
log 127.0.0.1 local2
maxconn 4000
user haproxy
group haproxy
daemon
#lua file load
lua-load /home/aman/coding/haproxy/http_header.lua
# turn on stats unix socket
stats socket /var/lib/haproxy/stats
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
mode http
log global
option httplog
option dontlognull
retries 3
timeout http-request 90s
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout http-keep-alive 10s
timeout check 10s
maxconn 3000
#---------------------------------------------------------------------
# main frontend which proxys to the backends
#---------------------------------------------------------------------
frontend http-in
bind :8000
mode http
use_backend %[lua.validation_fetch]
default_backend failure_backend
#---------------------------------------------------------------------
# round robin balancing between the various backends
#---------------------------------------------------------------------
backend success_backend
balance roundrobin
server app1 172.23.12.94:9999 check
backend failure_backend
http-request use-service lua.failure_service
# For displaying HAProxy statistics.
frontend stats
bind :8888
default_backend stats
backend stats
stats enable
stats hide-version
stats realm Haproxy Statistics
stats uri /haproxy/stats
stats auth aman:rjil#123
Lua script:
local http_socket = require("socket.http")
local pretty_print = require("pl.pretty")
function add_http_request_header(txn, name, value)
local headerName = name
local headerValue = value
txn.http:req_add_header(headerName, headerValue)
end
function call_validation_api()
local request, code, header = http_socket.request {
method = "GET", -- Validation API Method
url = "http://www.google.com/" -- Validation API URL
}
-- Using core.log; Print in some cases is a blocking operation http://www.arpalert.org/haproxy-lua.html#h203
core.Info( "Validation API Response Code: " .. code )
pretty_print.dump( header )
return code
end
function failure_response(txn)
local response = "Validation Failed"
core.Info(response)
return "failure_backend"
end
-- Decides back-end based on Success and Failure received from validation API
core.register_fetches("validation_fetch", function(txn)
local validation_api_code = call_validation_api()
if validation_api_code == 200 then
core.Info("Validation Successful")
add_http_request_header(txn, "test_header", "abcdefg")
pretty_print.dump( txn.http:req_get_headers() )
return "success_backend"
else
failure_response(txn)
end
end)
-- Failure service
core.register_service("failure_service", "http", function(applet)
local response = "Validation Failed"
core.Info(response)
applet:set_status(400)
applet:add_header("content-length", string.len(response))
applet:add_header("content-type", "text/plain")
applet:start_response()
applet:send(response)
end)

JMeter: doesn't increment counter value

I'm working on a load test for Eclipse RAP-based application in JMeter. I need to replace requestCounter in each http request with variable. Variable should be incremented by 1 in next requests.
Tried to use JMeter's __counter function ${__counter(FALSE)} for my ${COUNTER} variable:
1) in User Defined Variables (didn't increment counter http://screencast.com/t/5MP1ovdHJN5p),
2) in CSV Data Set (function body was inserted instead of number http://screencast.com/t/mUIaYF3Oc).
Replaced ${__counter(FALSE)} with ${__javaScript(${__counter(FALSE)})} - still no success.
3) Counter config element brings the same result as User Defined Variable - not incrementing value for next request.
Plan:
+Thread_group
+Counter config element/User Defined Variables/CSV Data Set
+Request group
++ HTTP request1: {"head":{"requestCounter":${COUNTER}} #should be 1
++ HTTP request2: {"head":{"requestCounter":${COUNTER}} #should be 2
++ HTTP request2: {"head":{"requestCounter":${COUNTER}} #should be 3
...
Any suggestions?
you don't need to update the request counter manually. For RAP 2.1+ we updated the Wiki with the current info about load testing:
https://wiki.eclipse.org/RAP/LoadTesting

Sending An HTTP Request using Intersystems Cache

I have the following Business Process defined within a Production on an Intersystems Cache Installation
/// Makes a call to Merlin based on the message sent to it from the pre-processor
Class sgh.Process.MerlinProcessor Extends Ens.BusinessProcess [ ClassType = persistent, ProcedureBlock ]
{
Property WorkingDirectory As %String;
Property WebServer As %String;
Property CacheServer As %String;
Property Port As %String;
Property Location As %String;
Parameter SETTINGS = "WorkingDirectory,WebServer,Location,Port,CacheServer";
Method OnRequest(pRequest As sgh.Message.MerlinTransmissionRequest, Output pResponse As Ens.Response) As %Status
{
Set tSC=$$$OK
Do ##class(sgh.Utils.Debug).LogDebugMsg("Packaging an HTTP request for Saved form "_pRequest.DateTimeSaved)
Set dateTimeSaved = pRequest.DateTimeSaved
Set patientId = pRequest.PatientId
Set latestDateTimeSaved = pRequest.LatestDateTimeSaved
Set formName = pRequest.FormName
Set formId = pRequest.FormId
Set episodeNumber = pRequest.EpisodeNumber
Set sentElectronically = pRequest.SentElectronically
Set styleSheet = pRequest.PrintName
Do ##class(sgh.Utils.Debug).LogDebugMsg("Creating HTTP Request Class")
set HTTPReq = ##class(%Net.HttpRequest).%New()
Set HTTPReq.Server = ..WebServer
Set HTTPReq.Port = ..Port
do HTTPReq.InsertParam("DateTimeSaved",dateTimeSaved)
do HTTPReq.InsertParam("HospitalNumber",patientId)
do HTTPReq.InsertParam("Episode",episodeNumber)
do HTTPReq.InsertParam("Stylesheet",styleSheet)
do HTTPReq.InsertParam("Server",..CacheServer)
Set Status = HTTPReq.Post(..Location,0) Quit:$$$ISERR(tSC)
Do ##class(sgh.Utils.Debug).LogDebugMsg("Sent the following request: "_Status)
Quit tSC
}
}
The thing is when I check the debug value (which is defined as a global) all I get is the number '1' - I have no idea therefore if the request has succeeded or even what is wrong (if it has not)
What do I need to do to find out
A) What is the actual web call being made?
B) What the response is?
There is a really slick way to get the answer the two questions you've asked, regardless of where you're using the code. Check the documentation out on the %Net.HttpRequest object here: http://docs.intersystems.com/ens20102/csp/docbook/DocBook.UI.Page.cls?KEY=GNET_http and the class reference here: http://docs.intersystems.com/ens20102/csp/documatic/%25CSP.Documatic.cls?APP=1&LIBRARY=ENSLIB&CLASSNAME=%25Net.HttpRequest
The class reference for the Post method has a parameter called test, that will do what you're looking for. Here's the excerpt:
method Post(location As %String = "", test As %Integer = 0, reset As %Boolean = 1) as %Status
Issue the Http 'post' request, this is used to send data to the web server such as the results of a form, or upload a file. If this completes correctly the response to this request will be in the HttpResponse. The location is the url to request, e.g. '/test.html'. This can contain parameters which are assumed to be already URL escaped, e.g. '/test.html?PARAM=%25VALUE' sets PARAM to %VALUE. If test is 1 then instead of connecting to a remote machine it will just output what it would have send to the web server to the current device, if test is 2 then it will output the response to the current device after the Post. This can be used to check that it will send what you are expecting. This calls Reset automatically after reading the response, except in test=1 mode or if reset=0.
I recommend moving this code to a test routine to view the output properly in terminal. It would look something like this:
// To view the REQUEST you are sending
Set sc = request.Post("/someserver/servlet/webmethod",1)
// To view the RESPONSE you are receiving
Set sc = request.Post("/someserver/servlet/webmethod",2)
// You could also do something like this to parse your RESPONSE stream
Write request.HttpResponse.Data.Read()
I believe the answer you want to A) is in the Server and Location properties of your %Net.HttpRequest object (e.g., HTTPReq.Server and HTTPReq.Location).
For B), the response information should be in the %Net.HttpResponse object stored in the HttpResponse property (e.g. HTTPReq.HttpResponse) after your call is completed.
I hope this helps!
-Derek
(edited for formatting)
From that code sample it looks like you're using Ensemble, not straight-up Cache.
In that case you should be doing this HTTP call in a Business Operation that uses the HTTP Outbound Adapter, not in your Business Process.
See this link for more info on HTTP Adapters:
http://docs.intersystems.com/ens20102/csp/docbook/DocBook.UI.Page.cls?KEY=EHTP
You should also look into how to use the Ensemble Message Browser. That should help with your logging needs.

Resources