I am not able to create a feature store in Vertex AI using labels

I am passing the labels as below to create a featurestore. But after the featurestore is created, I do not see any labels on it. Is this still not supported in Vertex AI?
fs = aiplatform.Featurestore.create(
    featurestore_id=featurestore_id,
    labels=dict(project='retail', env='prod'),
    online_store_fixed_node_count=online_store_fixed_node_count,
    sync=sync,
)

As mentioned in this featurestore documentation:
A featurestore is a top-level container for entity types, features,
and feature values.
With this, the "labels" shown in the GCP console UI are the labels at the Feature level, not the featurestore level.
Once a featurestore is created, you will need to create an entity type and then create a Feature with the labels parameter, as shown in the sample Python code below.
from google.cloud import aiplatform

test_label = {'key1': 'value1'}

def create_feature_sample(
    project: str,
    location: str,
    feature_id: str,
    value_type: str,
    entity_type_id: str,
    featurestore_id: str,
):
    aiplatform.init(project=project, location=location)
    my_feature = aiplatform.Feature.create(
        feature_id=feature_id,
        value_type=value_type,
        entity_type_name=entity_type_id,
        featurestore_id=featurestore_id,
        labels=test_label,
    )
    my_feature.wait()
    return my_feature

create_feature_sample('your-project', 'us-central1', 'test_feature3', 'STRING', 'test_entity3', 'test_fs3')
Below is a screenshot of the GCP console, which shows that the labels for the test_feature3 feature have the values defined in the sample Python code above.
You may refer to this documentation on creating a feature using Python for more details.
On the other hand, you can still view the labels you defined for your featurestore using the REST API, as shown in the sample below.
curl -X GET \
-H "Authorization: Bearer "$(gcloud auth application-default print-access-token) \
"https://<your-location>-aiplatform.googleapis.com/v1/projects/<your-project>/locations/<your-location>/featurestores"
Below is the result of the REST API call, which also shows the values of the labels I defined for my "test_fs3" featurestore.
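If you prefer to check this from Python instead of curl, here is a minimal sketch (the featurestore ID and region are illustrative, and it assumes the SDK exposes the underlying proto through gca_resource):
from google.cloud import aiplatform

aiplatform.init(project='your-project', location='us-central1')

# Read the labels back from the underlying Featurestore proto.
fs = aiplatform.Featurestore(featurestore_name='test_fs3')
print(fs.gca_resource.labels)  # e.g. {'project': 'retail', 'env': 'prod'}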

Migrate Kubeflow Docker image components to a Vertex AI pipeline

I am trying to migrate a custom component created in Kubeflow to Vertex AI.
In Kubeflow I used to create components as Docker container images and then load them into my pipeline as follows:
def my_custom_component_op(gcs_dataset_path: str, some_param: str):
    return kfp.dsl.ContainerOp(
        name='My Custom Component Step',
        image='gcr.io/my-project-23r2/my-custom-component:latest',
        arguments=["--gcs_dataset_path", gcs_dataset_path,
                   '--component_param', some_param],
        file_outputs={
            'output': '/app/output.csv',
        }
    )
I would then use them in the pipeline as follows:
@kfp.dsl.pipeline(
    name='My custom pipeline',
    description='The custom pipeline'
)
def generic_pipeline(project_id, some_param):
    output_component = my_custom_component_op(
        gcs_dataset_path=gcs_dataset_path,
        some_param=some_param
    )
    output_next_op = next_op(
        gcs_dataset_path=dsl.InputArgumentPath(output_component.outputs['output']),
        next_op_param="some other param"
    )
Can I reuse the same component Docker image from Kubeflow v1 in a Vertex AI pipeline? How can I do that, hopefully without changing anything in the component itself?
I have found examples online of Vertex AI pipelines that use the @component decorator as follows:
@component(base_image=PYTHON37, packages_to_install=[PANDAS])
def my_component_op(
    gcs_dataset_path: str,
    some_param: str,
    dataset: Output[Dataset],
):
    ...perform some op....
But this would require me to copy-paste the Docker code into my pipeline, and that is not really something I want to do. Is there a way to reuse the Docker image and pass the parameters? I couldn't find any example of that anywhere.
You need to prepare a component YAML file and load it with load_component_from_file; the sketch below shows the idea.
It's well documented on the kfp v2 Kubeflow documentation page, and it's also covered here.
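A minimal sketch of that approach, assuming a kfp SDK with v2 (Vertex-compatible) support. The YAML text, pipeline name, and --output_path flag are illustrative, and the output mapping assumes your container can accept an output path argument (the hard-coded /app/output.csv from file_outputs has no direct equivalent in the v2 spec):
import kfp
from kfp.v2 import dsl, compiler

# The same container image, described declaratively instead of via ContainerOp.
component_text = """
name: My Custom Component Step
inputs:
- {name: gcs_dataset_path, type: String}
- {name: component_param, type: String}
outputs:
- {name: output, type: Dataset}
implementation:
  container:
    image: gcr.io/my-project-23r2/my-custom-component:latest
    args:
    - --gcs_dataset_path
    - {inputValue: gcs_dataset_path}
    - --component_param
    - {inputValue: component_param}
    - --output_path
    - {outputPath: output}
"""

# load_component_from_text keeps the example in one file;
# load_component_from_file('component.yaml') works the same way.
my_custom_component_op = kfp.components.load_component_from_text(component_text)

@dsl.pipeline(name='my-custom-pipeline')
def generic_pipeline(gcs_dataset_path: str, some_param: str):
    my_custom_component_op(
        gcs_dataset_path=gcs_dataset_path,
        component_param=some_param,
    )

compiler.Compiler().compile(
    pipeline_func=generic_pipeline, package_path='pipeline.json'
)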

Reading CloudWatch log query status in the Go SDK v2

I'm running a CloudWatch log query through the v2 SDK for Go. I've successfully submitted the query using the StartQuery method, however I can't seem to process the results.
I've got my query ID in a variable (queryID) and am using the GetQueryResults method as follows:
results, err := svc.GetQueryResults(context.TODO(), &cloudwatchlogs.GetQueryResultsInput{
    QueryId: queryID,
})
How do I actually read the contents? Specifically, I'm looking at the Status field. If I run the query at the command line, this comes back as a string description. According to the SDK docs, this is a bespoke type "QueryStatus", which is defined as a string with enumerated constants.
I've tried comparing to the constant names, e.g.
if results.Status == cloudwatchlogs.GetQueryResultsOutput.QueryStatus.QueryStatusComplete
but the compiler doesn't accept this. How do I either reference the constants or get to the string value itself?
The QueryStatus type is defined in the separate types package. The Go SDK services are all organised this way.
import "github.com/aws/aws-sdk-go-v2/service/cloudwatchlogs/types"
if res.Status == types.QueryStatusComplete {
fmt.Println("complete!")
}

Azure Form Recognizer - saving output results with the Python SDK

When I used the Form Recognizer API, it returned a JSON file. Now I am using Form Recognizer with the Python SDK, and it returns a data type that seems to be specific to the azure.ai.formrecognizer library.
Does anyone know how to save the data acquired from the Form Recognizer Python SDK in a JSON file like the one returned by the Form Recognizer API?
import os

from azure.ai.formrecognizer import FormRecognizerClient
from azure.identity import ClientSecretCredential

client_secret_credential = ClientSecretCredential(tenant_id, client_id, client_secret)
form_recognizer_client = FormRecognizerClient(endpoint, client_secret_credential)

with open(os.path.join(path, file_name), "rb") as fd:
    form = fd.read()

poller = form_recognizer_client.begin_recognize_content(form)
form_pages = poller.result()
Thanks for your question! The Azure Form Recognizer SDK for Python provides to_dict and from_dict helper methods on its models to make it easy to convert the library's data types to and from a dictionary. You can use the dictionary you get from to_dict directly or convert it to JSON.
For your example above, in order to get a JSON output you could do something like:
import json

poller = form_recognizer_client.begin_recognize_content(form)
form_pages = poller.result()

d = [page.to_dict() for page in form_pages]
json_string = json.dumps(d)
I hope that answers your question; please let me know if you need more information related to the library.
Also, there's more information about our models and their methods on our documentation page here. You can use the dropdown to select the version of the library that you're using.
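If you later need to rebuild the SDK models from the saved JSON, here is a minimal sketch (the file name is illustrative, and it assumes from_dict is available on FormPage as described above):
import json

from azure.ai.formrecognizer import FormPage

# Persist the per-page dictionaries to disk.
with open('form_pages.json', 'w') as f:
    json.dump(d, f)

# Rebuild the SDK models from the saved JSON (assumes FormPage.from_dict).
with open('form_pages.json') as f:
    restored_pages = [FormPage.from_dict(page) for page in json.load(f)]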

How to structure container logs in Vertex AI?

I have a model in Vertex AI. From the logs it seems that Vertex AI ingests the log into a message field within the jsonPayload field, but I would like to structure jsonPayload so that every key in message becomes a field within jsonPayload, i.e. flatten/extract message.
The logs in Stackdriver follow a defined LogEntry schema. Cloud Logging uses structured logs, where log entries use the jsonPayload field to add structure to their payload.
For Vertex AI, the parameters are passed inside the message field that we see in the logs. The structure of these logs is predefined. However, if you want to extract the fields that are present inside the message block, you can refer to the workarounds below:
1. Create a sink:
You can export your logs to a Cloud Storage bucket, BigQuery, Pub/Sub, etc.
If you use BigQuery as the sink, you can then use BigQuery's JSON functions to extract the required data.
2. Download the logs and write your own code:
You can download the log entries and then write your own logic to extract data as per your requirements.
You can use the Cloud Logging client library (Python) together with Python's JSON functions; a sketch follows this list.
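A minimal sketch of workaround 2, assuming the entries sit under a Vertex AI endpoint resource and that the message field itself contains JSON (the project, endpoint ID, and filter are illustrative):
import json

from google.cloud import logging

client = logging.Client(project='your-project')

# Pull the endpoint's log entries; the filter values are placeholders.
log_filter = (
    'resource.type="aiplatform.googleapis.com/Endpoint" '
    'AND resource.labels.endpoint_id="YOUR_ENDPOINT_ID"'
)

for entry in client.list_entries(filter_=log_filter):
    payload = entry.payload
    if not isinstance(payload, dict):  # skip plain-text entries
        continue
    message = payload.get('message', '')
    try:
        fields = json.loads(message)   # only works if message is itself JSON
    except (TypeError, json.JSONDecodeError):
        continue                       # nothing to flatten
    print(fields)                      # every key of message as its own field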
Using the Cloud Logging client to write structured logs against a Vertex AI endpoint:
(Make sure you have a service account with permissions to write logs to the project, and, for clean logs, make sure you don't stream any other logs to stderr or stdout.)
import json
import logging
from logging import Handler, LogRecord

import google.cloud.logging_v2 as logging_v2
from google.api_core.client_options import ClientOptions
from google.oauth2 import service_account

# Placeholder: the structured fields you want to log.
data_to_write_to_endpoint = {key1: value1, ...}

# JSON key for a service account permitted to write logs into the GCP
# project where your endpoint is.
credentials = service_account.Credentials.from_service_account_info(
    json.loads(SERVICE_ACCOUNT_KEY_JSON)
)

client = logging_v2.client.Client(
    credentials=credentials,
    client_options=ClientOptions(api_endpoint="logging.googleapis.com"),
)

# This represents your Vertex AI endpoint.
resource = logging_v2.Resource(
    type="aiplatform.googleapis.com/Endpoint",
    labels={"endpoint_id": YOUR_ENDPOINT_ID, "location": ENDPOINT_REGION},
)

logger = client.logger("LOGGER NAME")
logger.log_struct(
    info=data_to_write_to_endpoint,
    severity=severity,  # e.g. "INFO"
    resource=resource,
)

How can I get ALL records from Route 53?

How can I get ALL records from Route 53?
I'm referring to the code snippet here, which seemed to work for someone but isn't clear to me: https://github.com/aws/aws-sdk-ruby/issues/620
I'm trying to get all of them (I have about 7,000 records) via resource record sets, but I can't seem to get the pagination to work with list_resource_record_sets. Here's what I have:
route53 = Aws::Route53::Client.new
response = route53.list_resource_record_sets({
  start_record_name: fqdn(name),
  start_record_type: type,
  max_items: 100, # fyi - aws api maximum is 100 so we'll need to page
})
response.last_page?
response = response.next_page until response.last_page?
I verified I'm hooked into the right region, and I can see the record I'm trying to get (so I can delete it later) in the AWS console, but I can't seem to get it through the API. I used https://github.com/aws/aws-sdk-ruby/issues/620 as a starting point.
Any ideas what I'm doing wrong? Or is there an easier way, perhaps another method in the API I'm not finding, to get just the record I need given the hosted_zone_id, type, and name?
The issue you linked is for the Ruby AWS SDK v2, but the latest is v3. It also looks like things may have changed around a bit since 2014, as I'm not seeing the #next_page or #last_page? methods in the v2 API or the v3 API.
Consider using #next_record_name and #next_record_type from the response when #is_truncated is true. That's more consistent with how pagination works elsewhere in the Ruby AWS SDK, such as with DynamoDB scans, for example.
Something like the following should work (though I don't have an AWS account with records to test it out):
route53 = Aws::Route53::Client.new
hosted_zone = ? # Required field according to the API docs
next_name = fqdn(name)
next_type = type

loop do
  response = route53.list_resource_record_sets(
    hosted_zone_id: hosted_zone,
    start_record_name: next_name,
    start_record_type: next_type,
    max_items: 100, # fyi - aws api maximum is 100 so we'll need to page
  )
  records = response.resource_record_sets

  # Break here if you find the record you want
  # Also break if we've run out of pages
  break unless response.is_truncated

  next_name = response.next_record_name
  next_type = response.next_record_type
end
