I am working on a project splitting services off of a monolithic codebase. Currently, the services I am working on are a login/authentication service, a service providing a few REST API endpoints, and a data retrieval service feeding the REST API. I need the services to exchange data with each other; in particular, data retrieval needs to send data to the REST API, and the REST API needs to exchange authentication requests/responses with the login service.
The technology of choice in the Go world seems to be Protocol Buffers and gRPC. I have been looking into using these, but it seems really inconvenient, so I wonder if I am just doing things wrong.
For example, my data retrieval service gets records from an RDBMS, and my REST API serves this data as JSON. Normally, I would define a "model" struct for each type of record in the database/API, put both database and JSON struct tags in the struct definitions, use something like sqlx to scan query results into structs, and use encoding/json to serialize structs into JSON. But if I want to pass the data around using gRPC and protobufs, this whole setup goes out the window. Since protobuf generates its own struct types, I would have to manually implement conversion from SQL rows into protobufs and from protobufs into JSON for every single message type I define. Not that implementing conversions is hard, but it introduces more opportunities for bugs and code fragility, and feels like reinventing the wheel.
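To make the duplication concrete, here is its shape sketched in TypeScript for brevity (all type and field names are made up; in Go the same split shows up between the sqlx-tagged struct, the protoc-generated struct, and the JSON encoding):

```ts
// Hand-written model type: one definition drives both DB scanning and JSON.
interface User {
  id: number;
  email: string;
}

// Stand-in for a protoc-generated message type; generated code has its own
// types and often its own naming, so it never lines up with the model above.
interface UserMessage {
  userId: number;
  emailAddress: string;
}

// The boilerplate in question: hand-written converters in both directions,
// repeated for every message type.
function toMessage(u: User): UserMessage {
  return { userId: u.id, emailAddress: u.email };
}

function fromMessage(m: UserMessage): User {
  return { id: m.userId, email: m.emailAddress };
}
```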
This seems like it should be a very common problem. Am I missing some obvious solution?
We have a use case of streaming data from the main transactional system to downstream consumers such as the data analytics and machine learning teams.
One of the requirements is data governance: the data source should be able to control who can read which column, and potentially the lifecycle of the data, ensuring that data sitting in another domain gets purged when the source removes it. For example, if a user deletes their account, we need to make sure the data is removed from all downstream systems.
We are considering Thrift, Avro, and Protobuf. What are the common frameworks we can use for such data governance? Do any of these protocols support metadata for data governance around authorization and lifecycle?
Let me get this straight:
Protobuf is not a security device; to someone with the right tools it is just as readable as XML or JSON, with the slight caveat that it can be uncertain how to interpret some values without the schema.
It is not much different from JSON or XML; it is just an interface language. Sure, its encoding is a bit different and a lot more customizable, but it does not address security in any way. It is up to you to secure the channel between sender and receiver.
I need to develop my backend application with Node.js, Express, and GraphQL, and I am using the Apollo GraphQL server for this. Now I have to connect GraphQL to Elasticsearch so that I can write Elasticsearch queries directly in Apollo Playground, the same way I was writing GraphQL queries before.
Can someone help me with this scenario?
There are multiple ways to structure this. Some projects use a dedicated file for the Elasticsearch logic, while others put the logic directly in the GraphQL resolvers and then just add the main setup method in the GraphQL/Node.js server declaration so that initialization (index creation, etc.) runs at startup; some call that file index.ts, it depends.
Use objects and single responsibility.
Create a frontend observable that watches an API; that API can then take data from the Elasticsearch cluster.
The problem, as you pointed out, is that you are using GraphQL directly. GraphQL is mainly there to create a layer between the frontend and the backend, but what you are doing makes the API layer connect directly with the backend. That needs to change: introduce a new object that exists only for the API, so that no matter what happens to your backend, the API stays the same. That is why GraphQL is important, and it needs to be used in that specific way.
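As a concrete sketch of the dedicated-file approach, assuming the official @elastic/elasticsearch client (v8-style API) and a made-up products index:

```ts
// elasticsearch.ts - the object that exists only for the API layer
import { Client } from "@elastic/elasticsearch";

const es = new Client({ node: "http://localhost:9200" });

// The resolver layer only knows about this function, not about Elasticsearch.
export async function searchProducts(term: string) {
  const result = await es.search({
    index: "products",
    query: { match: { name: term } },
  });
  return result.hits.hits.map((hit) => hit._source);
}
```

```ts
// resolvers.ts - GraphQL stays a thin layer between front and back
import { searchProducts } from "./elasticsearch";

export const resolvers = {
  Query: {
    products: (_: unknown, args: { term: string }) => searchProducts(args.term),
  },
};
```

The index-creation/setup method mentioned above would live in the same dedicated module and be called once from the server startup code.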
When trying to get started with GraphQL, you meet a lot of new terms: some are related to the concept of GraphQL itself (mutations, subscriptions, …), but there is also an entire ecosystem around it, which – unfortunately – is not always clearly separated from GraphQL itself. I find it quite hard to tell where one thing ends, where the other one starts, what the differences are, and what is needed when.
So, to name a few of these terms:
GraphQL
Apollo
Apollo Client
Relay
Can you explain maybe in a few sentences what these things (except GraphQL) are, what they are good for, and how they relate to each other (or don't)? And, which important tools / concepts are missing here?
GraphQL, the language
GraphQL is a query language. It has a specification that defines the language, schemas, and also the execution of GraphQL queries. Learning these things is a great place to start, and it is completely programming-language agnostic.
GraphQL implementations
Then there are different GraphQL implementations in different languages that allow you to create a schema and describe how the query resolves to values. Usually these implementations validate the query against the schema that you have defined and take over the execution. Pretty much all of the JavaScript ecosystem uses GraphQL.js but there are many more implementations in other languages.
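For example, with GraphQL.js (v16-style API; the schema and values are made up), the implementation does exactly that: you hand it a schema, a query string, and root values, and it validates and executes the query:

```ts
import { graphql, buildSchema } from "graphql";

// Build a schema (here from an SDL string; GraphQL.js can also construct it in code) ...
const schema = buildSchema(`
  type Query {
    hello: String
  }
`);

// ... describe how the query resolves to values ...
const rootValue = { hello: () => "Hello world!" };

// ... and let the implementation validate and execute.
const result = await graphql({ schema, source: "{ hello }", rootValue });
console.log(result.data); // { hello: 'Hello world!' }
```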
GraphQL Servers
GraphQL is also transport-layer agnostic. That means the GraphQL implementations usually don't come with an HTTP server. But we often use HTTP to make GraphQL queries, which is why there are libraries that build on these implementations and provide an easy way to create an HTTP server on top (e.g., by providing middleware for an HTTP framework or shipping a whole server). I think in JavaScript pretty much everyone uses Apollo Server, because it brings some more features and integrates smoothly with the Apollo ecosystem and the services offered by Apollo, the company.
Apollo Server has also very much popularized the SDL (Schema Definition Language) approach to defining a GraphQL API. With the SDL approach, the GraphQL schema is not created in code but by defining the schema in a special language (also part of the GraphQL spec) and writing the individual resolver functions separately from that definition. This gets you started quickly, but my feeling is that it is not very popular for large APIs. You can also simply pass your GraphQL.js schema to Apollo Server.
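A minimal sketch of the SDL approach with Apollo Server (v4-style imports; the Book type and data are made up):

```ts
import { ApolloServer } from "@apollo/server";
import { startStandaloneServer } from "@apollo/server/standalone";

// The schema is defined in SDL, not constructed in code.
const typeDefs = `
  type Book {
    title: String
  }
  type Query {
    books: [Book]
  }
`;

// Resolver functions are defined separately from the schema definition.
const resolvers = {
  Query: {
    books: () => [{ title: "The Awakening" }],
  },
};

const server = new ApolloServer({ typeDefs, resolvers });
const { url } = await startStandaloneServer(server, { listen: { port: 4000 } });
console.log(`Server ready at ${url}`);
```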
GraphQL Clients
When we have a server running, we can make queries to it using a simple HTTP client like the browser's fetch function. That works pretty well, but the power of GraphQL really shines when we use a client that supports caching and automatically fetches queries when they are needed. This way we can reach the promised land of declarative data fetching / data dependencies. Facebook has published its own client library, designed for the unique requirements of a large web enterprise. The library is called Relay, and the newer version (breaking from the older one) is usually referred to as Relay Modern. But Relay is relatively complicated and needs a specific build chain, so GraphQL became really interesting when Apollo released a lightweight client alternative known as Apollo Client. Apollo Client has developed a lot over the years and now supports a lot of configuration. Also, react-apollo allows you to use Apollo Client with React, whereas Relay is built specifically for React. With Apollo gaining quite some weight over the years, the folks at Formidable Labs created Urql.
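For a feel of the client side, a minimal Apollo Client query (the endpoint and query are made up) looks like this:

```ts
import { ApolloClient, InMemoryCache, gql } from "@apollo/client";

const client = new ApolloClient({
  uri: "http://localhost:4000/graphql",
  // The normalized cache is where the declarative data-fetching power comes from.
  cache: new InMemoryCache(),
});

const { data } = await client.query({
  query: gql`
    query Books {
      books {
        title
      }
    }
  `,
});
console.log(data.books);
```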
Conclusion
You can use all of these technologies together. Many people simply choose to use the Apollo ecosystem throughout, which is probably a solid choice. If you have used Redux before, you will probably feel at home with Apollo Client or Urql. If you are building a large app with a performance focus, you should consider Relay and understand what constraints it puts on the way you build your GraphQL schema.
I've recently read about the advantages (and disadvantages) of GraphQL over REST APIs.
I am developing a webpage that consumes several different REST APIs and SOAP services. Some of those services are dependent, meaning that a result from Rest1 will be passed as a parameter to Rest2, whose result will in turn be passed to a SOAP service for a final return value.
From what I understood, GraphQL deals with multiple data sources and query nesting, but I have not yet understood if it will handle those nested dependent queries.
Can anyone who has worked with several dependent data sources in GraphQL tell me if this can be done? My project should be up in two weeks, and investing time in learning and setting up GraphQL only to end up not using it because it doesn't support my case would be a big failure for me.
Note: the APIs and services are not mine; I am consuming them from an outside source.
I'm assuming you haven't yet set up a GraphQL server. Once you do, you can see how this isn't too difficult. So I'd recommend you set up your own server first. The Egghead course "Build a GraphQL Server" got me started, but it's not free.
In essence, you'll set up your schema, then define how each field resolves to data. Inside a resolver you can query a database (e.g., behind an Express server), hit a REST interface, or hit your SOAP interface. How you retrieve the data is up to you, as long as you return it in compliance with your defined schema.
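To make that concrete, here is a rough sketch of a resolver that chains the dependent calls you describe; every URL, field name, and helper here is hypothetical:

```ts
// Hypothetical helpers for the SOAP leg; a real project would use a SOAP
// client library rather than hand-built envelopes and regex parsing.
function buildEnvelope(value: string): string {
  return `<soap:Envelope><soap:Body><GetFinal><key>${value}</key></GetFinal></soap:Body></soap:Envelope>`;
}
function parseEnvelope(xml: string): string {
  return xml.match(/<result>(.*?)<\/result>/)?.[1] ?? "";
}

export const resolvers = {
  Query: {
    // One GraphQL field, three dependent backend calls.
    finalValue: async (_: unknown, args: { id: string }) => {
      // Rest1: fetch the first record.
      const r1 = await (await fetch(`https://rest1.example.com/items/${args.id}`)).json();
      // Rest2: use Rest1's result as a parameter.
      const r2 = await (await fetch(`https://rest2.example.com/lookup?key=${r1.key}`)).json();
      // SOAP: use Rest2's result to get the final value.
      const soapRes = await fetch("https://soap.example.com/service", {
        method: "POST",
        headers: { "Content-Type": "text/xml" },
        body: buildEnvelope(r2.value),
      });
      return parseEnvelope(await soapRes.text());
    },
  },
};
```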
Hope that makes sense. A fully mocked-up mini app would demonstrate this more completely, but the sketch above and this advice are the best I can offer.
Thrift sounds awesome, but I can't find some basic stuff I'm used to in RPC frameworks (such as HttpServlet). Examples of the things I can't find: session management, filtering, upload/download progress.
I understand that the missing stuff might be a management layer on top of Thrift. If so, is there any example of such a layer? Perhaps AOP (aspect-oriented)?
I can't imagine such a layer that compiles to all languages, and that's what I'm missing. Taking session management as an example: there might be several clients that all need to do some authentication and pass the session_id on each RPC. I would expect a similar API for doing so in all languages.
Does anyone know of a management layer for Thrift?
So Thrift itself is not going to help you out a lot here.
I have had similar desires, and have a few suggestions:
1. Put your management objects into the IDL
Simply add an API token or a common transfer-data struct as a parameter to all of your service methods. Set it as parameter id 15 so that it will always be the last parameter, even if you add others in the middle.
As the first step in your handler you can validate/store/do whatever with the extra data.
This has the advantage that it works on any platform that Thrift supports.
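For example, on the handler side (sketched in TypeScript; the service, struct, and session check are all made up), the method simply grows one trailing parameter:

```ts
// Mirrors a hypothetical Thrift struct added as parameter id 15 to every method:
//   struct RequestMeta { 1: string sessionId }
interface RequestMeta {
  sessionId: string;
}

interface Account {
  id: string;
  name: string;
}

// Made-up session check standing in for your real auth logic.
function isValidSession(sessionId: string): boolean {
  return sessionId.length > 0;
}

const handler = {
  // Validate/store/use the extra data first, then do the actual work.
  async getAccount(accountId: string, meta: RequestMeta): Promise<Account> {
    if (!isValidSession(meta.sessionId)) {
      throw new Error("invalid session");
    }
    return { id: accountId, name: "example" };
  },
};
```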
2. Use Thrift over HTTP
If you use HTTP as your transport, you can include whatever data you want as HTTP headers, with the Thrift content as the body.
This will often require a custom HTTP client for every platform you use, to inject the data, and a custom handler on the server to consume it, but neither of those is prohibitively difficult.
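A client-side sketch of the idea (plain fetch; the URL and header name are made up, and `payload` would come from your generated Thrift serializer):

```ts
// Send a serialized Thrift message as the HTTP body, with the management
// data (here, a session id) carried in headers instead of in the IDL.
async function callOverHttp(payload: Uint8Array, sessionId: string): Promise<Uint8Array> {
  const res = await fetch("https://api.example.com/thrift", {
    method: "POST",
    headers: {
      "Content-Type": "application/x-thrift",
      "X-Session-Id": sessionId,
    },
    body: payload,
  });
  // The response body is again a serialized Thrift message.
  return new Uint8Array(await res.arrayBuffer());
}
```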
3. Hack the protocol
It is possible to create your own custom protocol that wraps another protocol and injects custom data. Take a look at how the multiplexed protocol works in the Thrift library for most languages (the C# implementation, for example): it sends the method name across the wire as service:method, and the multiplexed processor unwraps this encoding and passes it on to the appropriate processor.
I have used a similar method to encode arbitrary key/value pairs (like HTTP headers) inside the method name.
The downside to this is that you need to write a more complicated extension for each platform you will be using. Once. How this works varies a bit from language to language, but it is generally simple enough once you have figured it out.
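The core trick, independent of any one Thrift binding, is just string mangling on the method name, something like this sketch (the separator choices are made up, and values containing ":" or "," would need escaping):

```ts
// Encode key/value pairs into the RPC method name, the same way the
// multiplexed protocol encodes "service:method".
function encodeMethodName(method: string, meta: Record<string, string>): string {
  const pairs = Object.entries(meta)
    .map(([k, v]) => `${k}=${v}`)
    .join(",");
  return pairs ? `${pairs}:${method}` : method;
}

// The server-side wrapper strips the metadata back off before dispatching
// to the real processor.
function decodeMethodName(wire: string): { method: string; meta: Record<string, string> } {
  const idx = wire.lastIndexOf(":");
  if (idx === -1) return { method: wire, meta: {} };
  const meta: Record<string, string> = {};
  for (const pair of wire.slice(0, idx).split(",")) {
    const [k, v] = pair.split("=");
    meta[k] = v;
  }
  return { method: wire.slice(idx + 1), meta };
}
```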
These are just a few ideas I have had, and I am sure there are others. The nice thing about Thrift is how the individual components are decoupled from each other. If you have special needs, you can swap any of them out as needed to add specific functionality.