Get metadata for a column with google Sheets API - google-api

I have a Google spreadsheet that I am connecting to and interacting with using the google-python-api-client package. Following this description on metadata search, and the links in it for the request body, I have written a function to get metadata for a range:
def get_metadata_by_range(range_: Union[dict, str]) -> dict:
if isinstance(range_, str):
print("String range: ", range_)
request_body = {"dataFilters": \
{"a1Range": range_}}
elif isinstance(range_, dict):
print("Dict range: ", range_)
request_body = {"dataFilters": \
[{"gridRange": range_}]}
else:
return None
request = service.spreadsheets().developerMetadata().\
search(spreadsheetId=SPREADSHEET_ID, body=request_body)
return request.execute()
Calling this with a range, either A1 notation or a gridRange will cause an error to occur though. For example, calling it with this line get_metadata_by_range("Metadata!A:A") will cause the following traceback.
String range: Metadata!A:A
Traceback (most recent call last):
File "oqc_server/fab/gapc.py", line 82, in <module>
get_metadata_by_range("Metadata!A:A")
File "oqc_server/fab/gapc.py", line 69, in get_metadata_by_range
return request.execute()
File "/media/kajsa/Storage/Projects/oqc_server/venv/lib/python3.7/site-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
return wrapped(*args, **kwargs)
File "/media/kajsa/Storage/Projects/oqc_server/venv/lib/python3.7/site-packages/googleapiclient/http.py", line 856, in execute
raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 500 when requesting https://sheets.googleapis.com/v4/spreadsheets/1RhheCsI3kHrm8yK2Yio2kAOU4VOzYdz-eK0vjiMY7co/developerMetadata:search?alt=json returned "Internal error encountered."
Any ideas on what is causing this and how to solve it?

You want to search and retrieve the developer metadata from the range using the method of Method: spreadsheets.developerMetadata.search of Sheets API.
You want to achieve this using google-api-python-client with python.
You have already been able to get and put values for Spreadsheet with Sheets API.
If my understanding is correct, how about this answer? Please think of this as just one of several possible answers.
Modification points:
When you want to search the developer metadata with the range, please set the gridrange to dataFilters[].developerMetadataLookup.metadataLocation.dimensionRange.
When the range is set to dataFilters[].a1Range and dataFilters[].gridRange, I could confirm that the same error occurs.
Sample script:
The sample script for retrieving the developer metadata from the range is as follows. Before you use this, please set the variables of spreadsheet_id and sheet_id.
service = build('sheets', 'v4', credentials=creds)
spreadsheet_id = '###' # Please set the Spreadsheet ID.
sheet_id = ### # Please set the sheet ID.
search_developer_metadata_request_body = {
"dataFilters": [
{
"developerMetadataLookup": {
"metadataLocation": {
"dimensionRange": {
"sheetId": sheet_id,
"dimension": "COLUMNS",
"startIndex": 0,
"endIndex": 1
}
}
}
}
]
}
request = service.spreadsheets().developerMetadata().search(
spreadsheetId=spreadsheet_id, body=search_developer_metadata_request_body)
response = request.execute()
print(response)
Above script retrieves the developer metadata from the column "A" of sheet_id.
Note:
Please modify above script for your actual script.
In the current stage, the Developer Metadata can be added to the Spreadsheet, each sheet in the Spreadsheet and row and column. Please be careful this. Ref
References:
Method: spreadsheets.developerMetadata.search
Adding Developer Metadata- DeveloperMetadataLookup
If I misunderstood your question and this was not the direction you want, I apologize.

There is a bug with developerMetadata related to a1Range objects being passed as filters.
Edit
I've checked the bug again and a fix has been implemented.

Related

boto3 and lambda: Invalid type for parameter KeyConditionExpression when using DynamoDB resource and localstack

I'm having very inconsistent results when trying to use boto3 dynamodb resources from my local machine vs from within a lambda function in localstack. I have the following simple lambda handler, that just queries a table based on the Hash Key:
import boto3
from boto3.dynamodb.conditions import Key
def handler(event, context):
dynamodb = boto3.resource(
"dynamodb", endpoint_url=os.environ["AWS_EP"]
)
table = dynamodb.Table("precalculated_scores")
items = table.query(
KeyConditionExpression=Key("customer_id").eq(event["customer_id"])
)
return items
The environment variable "AWS_EP" is set to my localstack DNS when protyping (http://localstack:4566).
When I call this lamdba I get the following error:
{
"errorMessage": "Parameter validation failed:\nInvalid type for parameter KeyConditionExpression, value: <boto3.dynamodb.conditions.Equals object at 0x7f7440201960>, type: <class 'boto3.dynamodb.conditions.Equals'>, valid types: <class 'str'>",
"errorType": "ParamValidationError",
"stackTrace": [
" File \"/opt/code/localstack/localstack/services/awslambda/lambda_executors.py\", line 1423, in do_execute\n execute_result = lambda_function_callable(inv_context.event, context)\n",
" File \"/opt/code/localstack/localstack/services/awslambda/lambda_api.py\", line 782, in exec_local_python\n return inner_handler(event, context)\n",
" File \"/var/lib/localstack/tmp/lambda_script_l_dbef16b3.py\", line 29, in handler\n items = table.query(\n",
" File \"/opt/code/localstack/.venv/lib/python3.10/site-packages/boto3/resources/factory.py\", line 580, in do_action\n response = action(self, *args, **kwargs)\n",
" File \"/opt/code/localstack/.venv/lib/python3.10/site-packages/boto3/resources/action.py\", line 88, in __call__\n response = getattr(parent.meta.client, operation_name)(*args, **params)\n",
" File \"/opt/code/localstack/.venv/lib/python3.10/site-packages/botocore/client.py\", line 514, in _api_call\n return self._make_api_call(operation_name, kwargs)\n",
" File \"/opt/code/localstack/.venv/lib/python3.10/site-packages/botocore/client.py\", line 901, in _make_api_call\n request_dict = self._convert_to_request_dict(\n",
" File \"/opt/code/localstack/.venv/lib/python3.10/site-packages/botocore/client.py\", line 962, in _convert_to_request_dict\n request_dict = self._serializer.serialize_to_request(\n",
" File \"/opt/code/localstack/.venv/lib/python3.10/site-packages/botocore/validate.py\", line 381, in serialize_to_request\n raise ParamValidationError(report=report.generate_report())\n"
]
}
Which is a weird error - From what I researched on other question it usually happens when using the boto3 client, but I am using boto3 resources. Furthermore, when I run the code locally in my machine it runs fine.
At first I thought that it might be due to different versions for boto3 (My local machine is using version 1.24.96, while the version inside the lambda runtime is 1.16.31). However I downgraded my local version to the same as the one in the runtime, and I keep getting the same results.
After some answers on this question I managed to get the code working against actual AWS services, but it still won't work when running against localstack.
Am I doing anything wrong? Os might this be a bug with localstack?
--- Update 1 ---
Changing the return didn't solve the problem:
return {"statusCode": 200, "body": json.dumps(items)}
--- Update 2 ---
The code works when running against actual AWS services instead of running against localstack. Updating the question with this information.
This works fine from both my local machine and Lambda:
import json, boto3
from boto3.dynamodb.conditions import Key
def lambda_handler(event, context):
dynamodb = boto3.resource(
"dynamodb",
endpoint_url="https://dynamodb.eu-west-1.amazonaws.com"
)
table = dynamodb.Table("test")
items = table.query(
KeyConditionExpression=Key("pk").eq("1")
)
print(items)
return {
'statusCode': 200,
'body': json.dumps('Hello from Lambda!')
}
Also be sure that event["customer_id"] is in fact a string value as expected by the eq function.
I would check to ensure you have the endpoint setup correctly and that you have the current version deployed.
It may also be the fact you are trying to return the results of your API call via the handler, instead of a proper JSON response as expected:
return {
'statusCode': 200,
'body': json.dumps(items)
}

Bubble.io API Connector with Lambda + AWS Rekognition Problem

Hi I was trying to use AWS Rekognition -> Lambda -> AWS API -> Bubble.io
I want to pass a image base64 code to process the image recognition.
Here is the lambda code:
import json
import boto3
import base64
def lambda_handler(event, context):
def detect_labels():
# 1. create a client object, to connect to rekognition
client=boto3.client('rekognition')
image = base64.b64decode(event['face'])
# 4. call Rekognition, store result in 'response'
response = client.detect_labels(
Image={
'Bytes': image
},
MaxLabels=20,
)
#6. Return response from function
return response
# Call detect_labels
response = detect_labels()
# Return results to API gateway
return {
'statusCode': 200,
'body': json.dumps(response)
}
and if I test it within the lambda test with config:
{
"face": "<imgbase64codehere>"
}
It works fine and return success message.
However I set the same thing in Bubble.io
Setting img. It wont work.
I check the Cloudwatch, the error is:
[ERROR] KeyError: 'face'
Traceback (most recent call last):
File "/var/task/lambda_function.py", line 30, in lambda_handler
response = detect_labels()
File "/var/task/lambda_function.py", line 14, in detect_labels
image = base64.b64decode(event['face'])
[ERROR] KeyError: 'face' Traceback (most recent call last): File "/var/task/lambda_function.py", line 30, in lambda_handler response = detect_labels() File "/var/task/lambda_function.py", line 14, in detect_labels image = base64.b64decode(event['face'])
I am sure there is not connection problem (if I dont use JSON it works fine)
So what do I do wrong in the setting?
Thanks for your help in advance
In your test you are sending an input of just a JSON request with face.
It looks like the implementation you've setup uses Amazon API Gateway. If you are doing that your Lambda function might receive the event in a different format depending on how you've implemented the API Gateway integration.
The Lambda developer guide has an example of the JSON format. You will probably want to access the body attribute.

Google Cloud Translation API: Creating glossary error

I tried to test Cloud Translation API using glossary.
So I created a sample glossary file(.csv) and uploaded it on Cloud Storage.
However when I ran my test code (copying sample code from official documentation), an error occurred. It seems that there is a problem in my sample glossary file, but I cannot find it.
I attached my code, error message, and screenshot of the glossary file.
Could you please tell me how to fix it?
And can I use the glossary so that the original language is used when translated into another language?
Ex) Translation English to Korean
I want to visit California. >>> 나는 California에 방문하고 싶다.
Sample Code)
from google.cloud import translate_v3 as translate
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"]="my_service_account_json_file_path"
def create_glossary(
project_id="YOUR_PROJECT_ID",
input_uri="YOUR_INPUT_URI",
glossary_id="YOUR_GLOSSARY_ID",
):
"""
Create a equivalent term sets glossary. Glossary can be words or
short phrases (usually fewer than five words).
https://cloud.google.com/translate/docs/advanced/glossary#format-glossary
"""
client = translate.TranslationServiceClient()
# Supported language codes: https://cloud.google.com/translate/docs/languages
source_lang_code = "ko"
target_lang_code = "en"
location = "us-central1" # The location of the glossary
name = client.glossary_path(project_id, location, glossary_id)
language_codes_set = translate.types.Glossary.LanguageCodesSet(
language_codes=[source_lang_code, target_lang_code]
)
gcs_source = translate.types.GcsSource(input_uri=input_uri)
input_config = translate.types.GlossaryInputConfig(gcs_source=gcs_source)
glossary = translate.types.Glossary(
name=name, language_codes_set=language_codes_set, input_config=input_config
)
parent = client.location_path(project_id, location)
# glossary is a custom dictionary Translation API uses
# to translate the domain-specific terminology.
operation = client.create_glossary(parent=parent, glossary=glossary)
result = operation.result(timeout=90)
print("Created: {}".format(result.name))
print("Input Uri: {}".format(result.input_config.gcs_source.input_uri))
create_glossary("my_project_id", "file_path_on_my_cloud_storage_bucket", "test_glossary")
Error Message)
Traceback (most recent call last):
File "C:/Users/ME/py-test/translation_api_test.py", line 120, in <module>
create_glossary("my_project_id", "file_path_on_my_cloud_storage_bucket", "test_glossary")
File "C:/Users/ME/py-test/translation_api_test.py", line 44, in create_glossary
result = operation.result(timeout=90)
File "C:\Users\ME\py-test\venv\lib\site-packages\google\api_core\future\polling.py", line 127, in result
raise self._exception
google.api_core.exceptions.GoogleAPICallError: None No glossary entries found in input files. Check your files are not empty. stats = {total_examples = 0, total_successful_examples = 0, total_errors = 3, total_ignored_errors = 3, total_source_text_bytes = 0, total_target_text_bytes = 0, total_text_bytes = 0, text_bytes_by_language_map = []}
Glossary File)
https://drive.google.com/file/d/1RaladmLjgygai3XsZv3Ez4ij5uDH5EdE/view?usp=sharing
I solved my problem by changing encoding of the glossary file to UTF-8.
And I also found that I can use the glossary so that the original language is used when translated into another language.

gspread update_cells always return 502 with httpsession error

I am currently trying to overwrite and google spreadsheet with new data using gspread api (version 0.4.1) with sheet.update_cells but it keeps giving me 502 with err msg as follows:
The server encountered a temporary error and could not complete your request.<p>Please try again in 30 seconds. <ins>That’s all we know.
It seems to be http session problem from the stacktrace:
File "/usr/local/lib/python2.7/dist-packages/gspread/models.py", line 476, in update_cells
self.client.post_cells(self, ElementTree.tostring(feed))
File "/usr/local/lib/python2.7/dist-packages/gspread/client.py", line 303, in post_cells
r = self.session.post(url, data, headers=headers)
File "/usr/local/lib/python2.7/dist-packages/gspread/httpsession.py", line 81, in post
return self.request('POST', url, data=data, headers=headers)
File "/usr/local/lib/python2.7/dist-packages/gspread/httpsession.py", line 67, in request
response = func(url, data=data, headers=request_headers)
File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 111, in post
return request('post', url, data=data, json=json, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/api.py", line 57, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/requests/sessions.py", line 475, in request
I did a little bit investigation but it seems that the answer sort of varies, so i think i should better just my version here.
The code snippet is really nothing special like something as follows:
sheet = gspread.authorize(credentials).open_by_key(spreadsheet_key).worksheet(worksheet_title)
if not sheet:
return
if not len(new_rows):
return
sheet.resize(len(new_rows), sheet.col_count)
active_range = 'A1:{0}{1}'.format(last_col, len(new_rows))
cell_list = sheet.range(active_range)
k = 0
for row in new_rows:
for field in row:
cell_list[k].value = field
k+=1
sheet.update_cells(cell_list)
where my new_rows are just the new cell value i want to overwrite the sheet with. I don't think it is an authentication issue, as the same code snippet used to work but somehow at some point it keeps giving the 502.

Unique Key Error when exporting Scrapy data to Elasticsearch

I'm attempting to use the scrapy elasticsearch pipeline (here: https://github.com/knockrentals/scrapy-elasticsearch) to put data into elasticsearch. however i get the following error, i'm aware that it's related to the ELASTICSEARCH_UNIQ_KEY value that is currently set at 'url' but i have no idea what it should be set to.
Similar posts on here recommend solutions that involve creating a field for the unique key but i don't understand what this means.
Here's my error message:
2015-08-05 11:34:40 [scrapy] ERROR: Error processing {'link': [u'http://www.meetup.com/Search-Meetup-Karlsruhe/events/192357732/'],
'title': [u'Suchen in der vernetzten Welt']}
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/twisted/internet/defer.py", line 588, in _runCallbacks
current.result = callback(current.result, *args, **kw)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scrapyelasticsearch/scrapyelasticsearch.py", line 70, in process_item
self.index_item(item)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scrapyelasticsearch/scrapyelasticsearch.py", line 52, in index_item
local_id = hashlib.sha1(item[uniq_key]).hexdigest()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scrapy/item.py", line 56, in __getitem__
return self._values[key]
KeyError: 'url'
Similar posts on here recommend solutions that involve creating a
field for the unique key but i don't understand what this means.
Declare a field in your Item with the name you configured in the ELASTICSEARCH_UNIQ_KEY.
import scrapy
class DemoItem(scrapy.Item):
url = scrapy.Field() # ELASTICSEARCH_UNIQ_KEY
class DemoSpider(scrapy.Spider):
name = 'demo'
start_urls = ['http://www.example.com']
def parse(self, response):
demoItem = DemoItem()
demoItem['url'] = response.url
yield demoItem

Resources