BigQuery Storage gRPC Write API - Go

My use case is to read rows from a database and write them into a BigQuery table.
For this I am trying to use the gRPC API, and I am following this example file. Being new to protobuf and Golang, I am unable to figure out how to write a DB row into a BigQuery table. I am specifically confused about this part: I cannot find any example of building the append request as a protobuf byte sequence and streaming it.
Any help is much appreciated.

The Go client provides a managedwriter package that you can use to stream data more easily. You can see how it is used in the integration tests.
Also, if you are new to Go, would you consider using Java instead? There is a JsonStreamWriter available in Java that lets you append JSONArray objects (as opposed to protobuf rows), and the samples are here: https://github.com/googleapis/java-bigquerystorage/tree/main/samples/snippets/src/main/java/com/example/bigquerystorage
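For reference, here is a minimal sketch of that Java approach, assuming a recent google-cloud-bigquerystorage version and the org.json dependency used in the samples; the project/dataset/table names and the "account"/"amount" columns are placeholders for whatever your DB rows contain. If your library version lacks the builder overload that takes a BigQueryWriteClient, pass a TableSchema to newBuilder instead.

import com.google.api.core.ApiFuture;
import com.google.cloud.bigquery.storage.v1.AppendRowsResponse;
import com.google.cloud.bigquery.storage.v1.BigQueryWriteClient;
import com.google.cloud.bigquery.storage.v1.JsonStreamWriter;
import com.google.cloud.bigquery.storage.v1.TableName;
import org.json.JSONArray;
import org.json.JSONObject;

public class WriteDbRowsToBigQuery {
  public static void main(String[] args) throws Exception {
    String parentTable = TableName.of("my-project", "my_dataset", "my_table").toString();

    try (BigQueryWriteClient client = BigQueryWriteClient.create();
         // The writer looks up the table schema through the client and converts each
         // JSONObject to protobuf internally, so you never build the byte sequence yourself.
         JsonStreamWriter writer = JsonStreamWriter.newBuilder(parentTable, client).build()) {

      // One JSONObject per DB row; keys must match the BigQuery column names.
      JSONArray rows = new JSONArray();
      JSONObject row = new JSONObject();
      row.put("account", "abc-123");
      row.put("amount", 42);
      rows.put(row);

      // Rows go to the table's default write stream; get() blocks until they are accepted.
      ApiFuture<AppendRowsResponse> future = writer.append(rows);
      AppendRowsResponse response = future.get();
      System.out.println("append error present: " + response.hasError());
    }
  }
}

The Go managedwriter package follows the same shape (create a client, open a managed stream against the table, append serialized rows), which is what the integration tests mentioned above demonstrate.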

Related

Can I use native Parquet file tools in Azure Logic Apps?

Hi there, I need to convert a Parquet file to CSV using only native Logic App tools. Is that even possible?
I researched similar issues and found how to use Azure Functions for the conversion, but that is not a native Logic App tool.
There's a custom connector that will transform Parquet to JSON for you.
It will also allow you to perform filtering and sorting operations on the data before it is returned.
Documentation can be found here: https://www.statesolutions.com.au/parquet-to-json/

Datadog data validation between on-prem and cloud database

Very new to Datadog and in need of some help. I have crafted two SQL queries (one for the on-prem database and one for the cloud database), and I would like to run those queries through Datadog, display the query results, and validate that the daily results fall within an expected variance between the two systems.
I have already set up Datadog on the cloud environment and believe I should use DogStatsD to create a custom metric, but I am pretty lost as to how I can incorporate my SQL queries in the code that creates the metric for eventual display on a dashboard. Any help will be greatly appreciated!
You probably want to use the MySQL integration and configure the 'custom queries' option: https://docs.datadoghq.com/integrations/faq/how-to-collect-metrics-from-custom-mysql-queries
You can follow those instructions after you configure the base integration: https://docs.datadoghq.com/integrations/mysql/#pagetitle (this will give you a lot of useful metrics in addition to the custom queries you want to run).
As you mentioned, DogStatsD is a library you can import into whatever script or application you like in order to submit metrics. But it isn't common practice to modify the underlying code of your database, so it makes more sense to externally run a query against the database, take the results, and send them to Datadog. You could write a Python script to do this, but the Datadog Agent already has this capability built in, so it's probably easier to just use that.
I'm also assuming SQL refers to MySQL; there are other integrations for SQL Server, PostgreSQL, and pretty much every other SQL implementation. The same pattern applies: configure the integration, then add an extra section to the config file where the check runs your queries.
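As a sketch of that last step, a hypothetical conf.d/mysql.d/conf.yaml fragment might look like the following; the query, metric name, hosts, and tags are placeholders, and it assumes this one agent can reach both databases (otherwise run an agent in each environment and tag accordingly).

init_config:

instances:
  - host: onprem-db.example.com
    port: 3306
    username: datadog
    password: <PASSWORD>
    custom_queries:
      - query: SELECT COUNT(*) FROM orders WHERE created_at >= CURDATE()
        columns:
          - name: mysql.validation.daily_row_count
            type: gauge
        tags:
          - source:onprem
  - host: cloud-db.example.com
    port: 3306
    username: datadog
    password: <PASSWORD>
    custom_queries:
      - query: SELECT COUNT(*) FROM orders WHERE created_at >= CURDATE()
        columns:
          - name: mysql.validation.daily_row_count
            type: gauge
        tags:
          - source:cloud

With both sides reporting the same metric under different source tags, a dashboard query or monitor can compare them and alert when the variance exceeds your threshold.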

Spring Cloud Data Flow with 2 sources feeding one processor/sink

I'm looking for some advice on setting up a Spring Data Flow stream for a specific use case.
My use case:
I have 2 RDBMS and I need to compare the results of queries run against each. The queries should be run roughly simultaneously. Based on the result of the comparison, I should be able to send an email through a custom email sink app which I have created.
I envision the stream topology to look something like this (sorry for the Paint diagram):
The problem is that SDF does not, to my knowledge, allow a stream to be composed of two sources. It seems to me that something like this ought to be possible without pushing the limits of the framework too far. I'm looking for answers that provide a good approach to this scenario while working within the SDF framework.
I am using Kafka as a message broker and the data flow server is using mysql to persist stream information.
I have considered creating a custom Source app which polls two datasources and sends the messages on the output channel. This would eliminate my requirement of 2 sources, but it looks like it would require a significant amount of customization of the jdbc source application.
Thanks in advance.
I have not really tried this, but you should be able to use named destinations to achieve that. Take a look here: http://docs.spring.io/spring-cloud-dataflow/docs/current-SNAPSHOT/reference/htmlsingle/#spring-cloud-dataflow-stream-advanced
stream create --name jdbc1 --definition "jdbc > :dbSource"
stream create --name jdbc2 --definition "jdbc > :dbSource"
stream create --name processor --definition ":dbSource > aggregator | sink"
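If the stock aggregator app doesn't fit your comparison logic, the middle app can be a small custom processor, so the third stream becomes something like ":dbSource > db-comparator | email-sink". Below is a hedged sketch using the Spring Cloud Stream annotation model of that era; the app name, the "source" header, and the db1/db2 values are illustrative, and each jdbc stream would need to tag its messages with that header.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.stream.annotation.EnableBinding;
import org.springframework.cloud.stream.annotation.StreamListener;
import org.springframework.cloud.stream.messaging.Processor;
import org.springframework.messaging.handler.annotation.Header;
import org.springframework.messaging.handler.annotation.SendTo;

@SpringBootApplication
@EnableBinding(Processor.class)
public class DbComparatorApplication {

  // Latest payload seen from each database, keyed by a "source" header set upstream.
  private final Map<String, String> latest = new ConcurrentHashMap<>();

  @StreamListener(Processor.INPUT)
  @SendTo(Processor.OUTPUT)
  public String compare(String payload, @Header("source") String source) {
    latest.put(source, payload);
    String db1 = latest.get("db1");
    String db2 = latest.get("db2");
    if (db1 == null || db2 == null) {
      return null; // returning null emits nothing until both sides have reported
    }
    // The custom email sink downstream decides how to react to this message.
    return db1.equals(db2) ? "MATCH" : "MISMATCH: db1=" + db1 + ", db2=" + db2;
  }

  public static void main(String[] args) {
    SpringApplication.run(DbComparatorApplication.class, args);
  }
}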

Spring Data Couchbase - Search without having admin rights on the cluster

I'm currently working on a POC with Couchbase, using Spring Data to put & get documents on/off a bucket on a cluster.
As I'm working in a big company, I'm lucky they gave me a bucket, but still I don't have the admin rights on the cluster, so I only have access to the bucket.
But as I dig into the Spring Data documentation, I am not able to find a way to retrieve documents without creating views on the server (I'm getting errors like "Unknown query param"). With the Couchbase Java SDK I am able to, through N1QL queries, but the use of the Spring Data layer is mandatory.
The answers I found always point me in the direction of server-side functions,
e.g.: https://stackoverflow.com/a/30928169/3744307
What I would like is a way to add a repository method like
List<Receipt> findReceiptByAccount(String account)
without having to specifically declare the function server-side.
Is this possible, or do I have to ask the administrators to create functions for me every time I add a findByX method?
Thanks for your time,
What version of Couchbase is it?
I think that prior to 4.5, N1QL access (which you seem to have) is enough to build your index yourself!
With Spring Data Couchbase 2.x that would use an N1QL index in the background, and it would work with a single primary index (although having one index per repository entity class would be best for performance). Maybe you can ask your admin to create that index once?
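To make that concrete, here is a hedged sketch of what the repository could look like with Spring Data Couchbase 2.x and N1QL; the Receipt entity, its fields, and the index annotation are illustrative, and index auto-creation also requires the IndexManager to be enabled in your configuration.

// Receipt.java
import org.springframework.data.annotation.Id;
import org.springframework.data.couchbase.core.mapping.Document;

@Document
public class Receipt {
  @Id
  private String id;
  private String account;
  // getters and setters omitted for brevity
}

// ReceiptRepository.java
import java.util.List;
import org.springframework.data.couchbase.core.query.N1qlPrimaryIndexed;
import org.springframework.data.couchbase.core.query.Query;
import org.springframework.data.couchbase.repository.CouchbaseRepository;

// Backed by N1QL, so no server-side view has to be declared for each finder.
@N1qlPrimaryIndexed
public interface ReceiptRepository extends CouchbaseRepository<Receipt, String> {

  // Derived query, translated to N1QL at runtime.
  List<Receipt> findByAccount(String account);

  // Explicit N1QL if you need more control over the statement.
  @Query("#{#n1ql.selectEntity} WHERE #{#n1ql.filter} AND account = $1")
  List<Receipt> findReceiptsForAccount(String account);
}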

Neo4j native API data not picked up by Spring Data repositories

I've got some data that has been pumped into a Neo4j instance using the native API. The same instance is used by an app backed by Spring Data Graph, but the repositories fail to find the data. I'm assuming that this is an issue with indexes and/or missing properties.
When the data is pumped in, the following property is set:
node.setProperty("__type__", "com.x.x.Class");
Index is set as follows:
Index<Node> typeIndex = indexManager.forNodes("__types__");
typeIndex.add(node, "className", "com.x.x.Class");
Any clues/help is appreciated.
imamc,
I'd appreciate it if you posted a simple test that reproduces the problem, preferably to https://groups.google.com/forum/?fromgroups#!forum/neo4j
Offhand, what you describe makes sense, so I don't have any other tips, but if we get some code or a test to work with, we might be able to help.
Lasse
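For completeness, here is a hedged sketch of the Spring Data Graph side that would be expected to pick those nodes up, assuming Spring Data Neo4j 2.x, where the default type representation strategy reads the __types__ index (key "className") and the __type__ property; the Invoice name and fields are illustrative, and its fully qualified name must equal the value stored natively ("com.x.x.Class" in the question).

// Invoice.java - fully qualified name must match the value written into __type__ / __types__
import org.springframework.data.neo4j.annotation.GraphId;
import org.springframework.data.neo4j.annotation.NodeEntity;

@NodeEntity
public class Invoice {
  @GraphId
  private Long id;
  private String number;
}

// InvoiceRepository.java - findAll()/findOne() resolve entities through the __types__ index
import org.springframework.data.neo4j.repository.GraphRepository;

public interface InvoiceRepository extends GraphRepository<Invoice> {
}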
