Index data entered in the smart contract for offchain computation - nearprotocol

I want to index the data entered in the near protocol smart contract for offchain computation.
How to trigger a new entry of the smart contract in offchain sql database or elasticsearch for real-time data indexing?
I can do that in the frontend, but don't know if it's the right/best method as different users can use different frontend for querying the blockchain.

Great question. There is currently not an "event" system in NEAR. So the solution will have to be polling for now.
I would recommend looking at this experimental method:
https://docs.near.org/docs/api/rpc-experimental#example-of-data-changes
The code contained in that resource is meant to be run in Terminal, similar to running a curl command. These types of RPC calls are fairly easy to understand if you look in near-api-js. For an example of how a NodeJS app uses this library which talks to the RPC, I would recommend looking in this directory of near-shell:
https://github.com/near/near-shell/tree/master/commands

Related

Off-Chain Worker Framework

I haven’t entirely given up on the idea of validators moonlighting as oracles for off-chain computation…based on this extensive discussion: https://gov.near.org/t/off-chain-computation-framework/1400/6
So far from studying Sputnik’s code, I have figured out the mechanics of how to upload a blob to a smart contract. Let's say that a blob represents a storage-less contract, having only stateless functions that act only on input to the function, and return those inputs modified.
Now I’m missing the piece of how Validators can download and execute the blob. As mentioned by Ilya in the link above, the NearSDK would be able to interpret the blob (if the blob is essentially a compiled contract), but it needs to be a modified version of the SDK...
Think of this like sandbox mode…blob cannot modify state of any other contract, but can read state (forget about the internet access part for now). Results of the blob execution are then fed back to a smart contract, where they have to match the results of every other validator who executed the blob. This can be done by hash comparison (rather than looping through the results individually), so it’s not an expensive comparison, especially because it’s all or nothing.
Question: how can a Validator download the blob and execute it via a sandboxed SDK, and post the result via the regular SDK to the blockchain? I am missing a lot of architectural context…and this is bringing me to the edge of giving on the idea. Please help prevent that from happening!
If you are implementing this as a separate binary, your binary will be doing next things:
Use RPC to load the WASM file from the blockchain. See RPC reference
Use runtime-standalone to run this WASM with specific inputs. An example of using runtime standalone is here, but you will need to customize this with few things.
The result should be sent as a transaction signed by this binary again via RPC.
If you want these WASM files to have access to state, you will need to load state inside this binary. There are two options:
Modify a nearcore node to also do the above items
Run nearcore in parallel, and open the database on read when you are initializing Trie (e.g. here load from disk instead).
If you want to add more host functions (like accessing internet), you will need to fork runtime-standalone to expose those functions.

Can AWS Lambda be used as the backend for getstream.io?

I didn't find any posts related to this topic. It seems natural to use Lambda as a getstream backend, but I'm not sure if it heavily depends on persistent connections or other architectural choices that would rule it out. Is it a sensible approach? Has anyone made it work? Any advice?
While you can build an entire website only in Lambda, you have to consider the followings:
Lambda behind API Gateway has a timeout limit of 30 seconds and a Payload size limit (both received and sended) of 6MB. While for most of the cases this is fine, if you have some really big operations or you need to send some really big datas (like a high resolution image), you can't do it with this approach, but you need to think about something else (for instance you can send an SNS to another Lambda function with higher timeout that can do all this asynchronously and then send the result to the client when it's done, supposing the client is capable of receiving events)
Lambda has cold starts, which in terms slow down your APIs when a client calls them for the first time in a while. The cold start time depends on the language you are doing your Lambdas, so you might consider this too. If you are using C# or Java for your Lambdas, than this is probably not the best choice. From this point of view, Node.JS and Python seems to be the best choices, with Golang rising. You can find more about it here. And by the way, you can now specify a Provisioned Throughput for your Lambda, which aims to fix the cold start issue, but I haven't used it yet so I can't tell if there is any difference (but I'm sure there is)
If done correctly you'll end up managing hundreds of Lambda functions, while with a standard Docker Container under ECS you'll manage few APIs with multiple endpoints. This point should not be underestimated, as on one side it will make changes easier in the future, since lambda will be small and you'll easily find the bug and fix it, but on the other side you have to move across these functions, which if you don't know exactly which lambda is responsible of what can be a long process
Lambda can't handle sessions as far as I know. Because after some time the Lambda container gets dropped, you can't store any session inside the Lambda itself. You'll always need a structure to store the session so it can be shared across multiple Lambda invocations, such as some records in a DynamoDB table or something else, but this mean that you have to write the code for this, while in a classic API (like a .NET Core one) all of this is handled by the language itself and you only need to store or retrieve items from the session (most of the times)
So... yeah! A backed written entirely in Lambda is possible. The company I work in does it and I must say is a lot better, both in terms of speed and development time. But those benefits comes later, since you need to face all of the reasons I listed above before, and is not as easy as it could seem
Yes, you can use AWS Lambda as backend and integrate with Stream API there.
Building an entire application on Lambda directly is going to be very complex and requires writing lot of boiler plate code just to enforce some basic organization and structure to your project.
My recommendation is use a serverless framework to do this that takes care of keeping your application well organized and to deploy new versions (and environments).
Serverless is a good option for that: https://serverless.com/framework/docs/providers/aws/guide/intro/

System API in mulesoft

I have a requirement to persist some data in a table (single table). The data is coming from UI. Do i need to write just the system API and persist the data OR i need to write process and system API both? I don't see a use of process API in this case. Please suggest. Is it always necessary to access system API through process API or system API can be invoked without process API as well.
I would recommend a fine-grained approach to this. We should be following it through the experience layer even though we do not have must customization to the data.
In short, an experience layer API and directly calling System layer API (if there is no orchestration/data conversion/formatting needed)
Why we need a system API & experience API? A couple of points.
System API should be more attached to the underlying system. And if
in case, in the future, it changes then it should not impact any of
the clients.
Secondly, giving an upper layer gives us the feasibility to add
different SLAs, policies, logging and lots more, to different
clients. Even if you have a single client right now it's better to
architect for the future. Reusing is the key advantage of these APIs.
Please check Pattern 2 in this document
That is a question for the enterprise architect in your organisation. In this case, the process API would probably be a simple proxy for the system API, but that might not always be the case in future. Also, it is sometimes useful to follow a standard architectural pattern even if it creates some spurious complexity in the implementation. As always, there are design trade-offs and the answer will depend on factors that cannot be known by people outside of your organisation.

Setting up multiple network layers in Relay Modern

I am using a react-native app with relay modern.
Currently our app's fetchQuery implementation, just does a fetch on the network (like in https://facebook.github.io/relay/docs/en/network-layer.html),
Although there is a possibility of another local-network layer like https://github.com/relay-tools/relay-local-schema which returns data from a local-db like sqlite/realm.
Is there a way to setup offline-first response from local-network layer, followed by automatic request to real network which also populates the store with fresher data (along with writing to local-db)?
Also should/can they share the same store?
From the requirements of Network.create(), it should return a promise containing the payload, there does not seem a possibility to return multiple values.
Any ideas/help/suggestions are appreciated.
What you trying to achieve its complex, and ill go for the easy approach which is long time cache.
As you might know relay modern uses a local storage and its exact copy of the data you are fetching, you can configure this store cache as per your needs, no cache on mutations.
To understand how this is achieve the best library around to customise Relay Modern or Classic network layer you can find in https://github.com/nodkz/react-relay-network-modern
My recommendation: setup your cache and watch your request.... (you going to love it)
Thinking in Relay,
https://facebook.github.io/relay/docs/en/thinking-in-relay.html

Will it be a safe to access elasticsearch directly from Javascript API

I am learning elasticsearch. I wanted to know how safe (in terms of access control & validating user access) it is to access ES server directly from JavaScript API rather than accessing it through your backend ? Will it be a safe to access ES directly from Javascript API ?
Depends on what you mean by "safe".
If you mean "safe to expose to the internet", then no, definitely not, as there isn't any access control and anyone will be able to insert data or even drop all the indexes.
This discussion gives a good overview of the issue. Relevant section:
Just as you would not expose a database directly to the Internet and let users send arbitrary SQL, you should not expose Elasticsearch to the world of untrusted users without sanitizing the input. Specifically, these are the problems we want to prevent:
Exposing private data. This entails limiting the searches to certain indexes, and/or applying filters to the searches.
Restricting who can update what.
Preventing expensive requests that can overwhelm or crash nodes and/or the entire cluster.
Preventing arbitrary code execution through dynamic scripts.
Its most certainly possible, and it can be "safe" if say you're using it as an internal tool behind some kind of authentication. In general, no, its not secure. Elasticsearch API can create, delete, update, and search, which is not something you want to give a client access to or they could essentially do any or all of these things.
You could, in theory, create role based auth with ElasticSearch Shield, but it's far from standard practice. Its not really anymore difficult to implement search on your backend then just have a simple call to return search results.

Resources