How to use Cloud Firestore in a non-realtime fashion? - angularfire2

I've really been investing a lot of time integrating Firestore into my new apps and I'm loving it so far. One thing I've noticed is that it's very much "realtime", as it touts. That's awesome for writing chat apps, etc., but for the non-realtime features of those apps it seems excessive to use the Observable pattern when you're really only trying to grab a list (not an add, update, or delete event) or even a single item, and doing the map on a snapshotChange gets a little repetitive in this case as well.
Is there a suggested or recommended way to use Firestore in an almost "RESTful" manner? I'm using the AngularFire2 library if that matters at all. Is it just suggested to use the firestore methods from the firebase npm packages?

I think you might be confusing Firebase Realtime Database and Cloud Firestore.
Firebase Realtime Database is very much as you describe in that it is expected you will subscribe to a stream of updates for the data you are interested in. Functionality is provided for "one-time" reads but the product name, documentation and samples often portray real-time scenarios.
Cloud Firestore, on the other hand, does not make these assumptions but still provides the option of real-time stream updates or "one-time" reads.
To decide between the two, you should read Google's own document on where it believes each is suitable.
If you decide to use Firestore, a sample "one-time" read is documented here.
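For reference, a one-time read looks something like this minimal sketch using the plain firebase JS SDK's get() (v8-style API; the collection and document names are made up, and it assumes firebase.initializeApp(config) has already run):

```typescript
import firebase from "firebase/app";
import "firebase/firestore";

const db = firebase.firestore();

// One-time read of a collection: no realtime listener stays attached.
db.collection("items")
  .get()
  .then(snapshot => {
    const items = snapshot.docs.map(doc => ({ id: doc.id, ...doc.data() }));
    console.log(items);
  });

// One-time read of a single document.
db.collection("items").doc("some-id").get()
  .then(doc => {
    if (doc.exists) {
      console.log(doc.data());
    }
  });
```

If you're on AngularFire2, you can also drop down to the underlying SDK for these calls rather than going through the Observable-based wrappers.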

Related

Should I minimise the number of subscriptions in my relay application?

I am new to GraphQL. We have built a backend GraphQL server using Elixir, and we are building a frontend app using React and react-relay.
My question is whether it is better to have one large subscription at the root of my query renderer instead of loads of smaller subscriptions for individual components. I think I would prefer using lots and lots of smaller subscriptions rather than fewer (or even one) very large subscriptions, but there are concerns that too many subscriptions will be very heavy. Is this concern valid?
TIA
There are a few things to consider here, and really, they all depend on what your definition of "very heavy" is. Note "very heavy" might mean something very different for your Elixir server implementation than it does on the client, so I will attempt to cover some directions you may want to investigate for both here.
What is your subscription transport? WebSockets can be expensive and difficult to scale on both ends at a certain point, but if you can deal with unidirectional data flow (server to client only), SSE (Server-Sent Events) are a great option. See more on a breakdown between SSE and WS here. This is more a comment on your server than on your client.
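If SSE fits your case, the client side can stay very small. A sketch (the endpoint and payload shape are my assumptions, not part of any GraphQL spec):

```typescript
// Browser-side SSE subscription; the endpoint and payload shape are hypothetical.
const source = new EventSource("/graphql/stream?channel=comments");

source.onmessage = (event: MessageEvent) => {
  const payload = JSON.parse(event.data); // server pushes JSON-encoded GraphQL results
  console.log("subscription update:", payload);
};

source.onerror = () => {
  // EventSource reconnects automatically; log so you can watch it happen.
  console.warn("SSE connection error, browser will retry");
};
```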
From an API design perspective, I'd caution against the few (or one) large subscriptions idea. Why? Inevitably, you are going to be pushing data to the client that it never asked for; this causes unnecessary work for both client and server. Furthermore, an individual component should only be able to subscribe to data streams with data specifically designated for it. If you go the large-subscription route, then you'll have to write a good deal of defensive code to filter the event stream, looking for the data you need. That shouldn't be your responsibility to micromanage, not to mention the dirty event stream on your server.
This is not necessarily to lead you down the "small subscription" route either. Ultimately, you might want to look at this hybrid approach, which articulates my opinions on the matter better than I can myself. TL;DR: design the subscription API so that you get the tightly scoped benefits of lots of small subscriptions ("per entity," as the author titles them), while still being able to share payloads and reuse the same handlers that your mutations use to resolve data.
Plus, if you want to use persisted queries, the hybrid approach is going to serve you better.
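To make the "per entity" idea concrete, here is a sketch of a tightly scoped Relay subscription using react-relay's requestSubscription; the schema names (todoUpdated and its fields) are hypothetical:

```typescript
import { graphql, requestSubscription } from "react-relay";
import type { Environment } from "relay-runtime";

// A per-entity subscription: only the fields this component renders,
// keyed by the entity id.
const todoSubscription = graphql`
  subscription TodoUpdatedSubscription($id: ID!) {
    todoUpdated(id: $id) {
      id
      text
      complete
    }
  }
`;

export function subscribeToTodo(environment: Environment, todoId: string) {
  return requestSubscription(environment, {
    subscription: todoSubscription,
    variables: { id: todoId },
    onError: error => console.error("subscription failed", error),
    // No updater needed here: payloads carrying an `id` are merged
    // into the Relay store automatically.
  });
}
```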

Need defense against wacky challenge to Event Sourcing architecture w/CosmosDB

In the current plan, incoming commands are handled via Function Apps, resulting in events being sent to an Event Hub, with the views then being materialized from those events.
Someone is arguing that, instead of storing events in something like table storage and materializing views based on events and snapshots, we should:
Just stream events to a log in Azure Monitor to have auditing
We can make changes to a domain object immediately in response to a command and use the change feed as our source of events for materialized views.
He doesn’t see the advantage of even having a materialized view. Why not just use a query? His argument is that we don’t expect a lot of traffic.
He wants to fulfill the whole audit-log requirement by saving events to the Azure Monitor log, i.e. just an application log. Commands would instead directly modify the representation of an entity in Cosmos, and we'd use the change feed from Cosmos DB as our domain-object events, or we would create new events off of it via subscribers to that stream.
Is this actually an advantageous approach? Can y'all think of any reasons why we wouldn't want to do that? It seems like we'd be losing something here.
He's saying we'd no longer need to be concerned with eventual consistency, as we'd have immediate consistency.
Every reference implementation I've evaluated does NOT do it the way he's suggesting. I'm not deeply versed in the advantages/disadvantages of the event sourcing / CQRS paradigm, so I'm at a loss at the moment. Currently researching furiously.
This is a conceptual issue, so there's not much of a code example. However, here are some references that seem to back up the approach I'm taking:
https://medium.com/@thomasweiss_io/planet-scale-event-sourcing-with-azure-cosmos-db-48a557757c8d
https://sajeetharan.com/2019/02/03/event-sourcing-with-azure-eventhub-and-cosmosdb/
https://learn.microsoft.com/en-us/azure/architecture/patterns/event-sourcing
If your goal is only to have the audit log, state-based persistence could be a good choice. Event sourcing adds some complexity to the implementation side and unless you can identify more advantages of using it, you might not convince your team to bring this complexity to the system. There are numerous questions and answers on SO, as well as in some blog posts, about pros and cons of event sourcing, so I won't get into that discussion here.
I can warn you, though, that the second article in your list is very weak and would most probably lead you into many difficulties. The role of Event Hub there is completely unclear, and it doesn't explain anything about projections and read-models (what you call "materialised views"). Only a very limited number of use-cases can live with getting one entity by id alone, without being able to execute a query across multiple entities. That also probably answers your concern about having read-models at all: you will need them very soon, the first time you have to figure out how to get a list of entities based on some condition (a query).
Using CosmosDb as the event store is completely feasible, as described in the first article, if you can manage the costs involved. Just remember to set the change feed TTL to -1; otherwise you won't be able to replay your projections when you need to.
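For a concrete feel, here is a minimal sketch of that append-only shape with the @azure/cosmos SDK; the database/container names and event shape are my own assumptions, not taken from the articles:

```typescript
import { CosmosClient } from "@azure/cosmos";

const client = new CosmosClient(process.env.COSMOS_CONNECTION_STRING!);
const events = client.database("shop").container("events"); // hypothetical names

interface DomainEvent {
  id: string;       // unique event id
  streamId: string; // partition key: one logical partition per aggregate
  version: number;  // position of the event within its stream
  type: string;
  data: unknown;
}

// Appending an event is just creating a new document; existing documents
// are never updated, which is what makes the container an event store.
async function append(event: DomainEvent): Promise<void> {
  await events.items.create(event);
}

// Replaying a stream is an ordered query over a single partition.
async function readStream(streamId: string): Promise<DomainEvent[]> {
  const { resources } = await events.items
    .query<DomainEvent>({
      query: "SELECT * FROM c WHERE c.streamId = @id ORDER BY c.version",
      parameters: [{ name: "@id", value: streamId }],
    })
    .fetchAll();
  return resources;
}
```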
To summarise:
Keeping the audit log can be done without event-sourcing, but you need to ensure that events are published reliably, preferably in the same transaction as the entity state update. It is often hard or impossible, but you might accept the risk if your audit requirement is not strict. You can also base your audit log on the CosmosDb change feed, just collecting document changes and logging them somewhere.
Event sourcing is a powerful technique but it has both pros and cons. The most common prejudice against using event sourcing is its implementation complexity. It might not be a big issue if you have a team that is somewhat experienced in building event-sourced systems. If you don't have such a team, you might want to build a small-scale spike to get some experience.
If you don't get full buy-in from the team to use event sourcing, you will later get all the blame if anything goes wrong. And it will go wrong at some point, especially with little experience in this area.
Spend some time reading books and trying out things yourself, before going wild in production.
Don't use Event Hub for anything it is not designed for. Event Hub is a powerful event-ingestion transport with a limited TTL, and it should be used for that purpose.
Don't use Table Storage as the event store, unless you only read entities by id. I used it in production for such a scenario and it worked (to some extent) but you can't project read-models from there.
A simple rule of thumb is to not use products for tasks they weren't designed for.
Azure Monitor was not designed to store application domain data. Azure Monitor is designed to store telemetry data from your applications and services and provides features such as alerts and other types of integration into DevOps tools for managing the operation and health of your apps.
There is a simple reason why you were able to find articles on event sourcing with Cosmos DB and why our own docs talk about it: it was designed to be used this way. It is simple to set up Cosmos DB as an append-only event store for your applications and use Change Feed to fire off messages to other apps or services or, in your case, to maintain a materialized-view state of domain objects within your app.
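As a rough illustration of that last point, pulling the change feed to refresh a view might look like this sketch (hypothetical names throughout, and it assumes the v3 @azure/cosmos change feed iterator):

```typescript
import { CosmosClient } from "@azure/cosmos";

const client = new CosmosClient(process.env.COSMOS_CONNECTION_STRING!);
const events = client.database("shop").container("events");
const views = client.database("shop").container("order-views");

async function projectChanges(streamId: string): Promise<void> {
  const iterator = events.items.changeFeed(streamId, { startFromBeginning: true });

  while (iterator.hasMoreResults) {
    const response = await iterator.fetchNext();
    for (const event of response.result ?? []) {
      // Fold each event into the read model; upsert keeps this idempotent,
      // which matters because the feed may be replayed.
      await views.items.upsert(applyToView(event));
    }
  }
}

declare function applyToView(event: unknown): { id: string }; // projection logic elided
```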

Amazon SimpleDB or DynamoDB

We are building a mobile app with a Rails CMS to manage it.
What does our app look like?
Every admin user of the app can set up one private channel with a very small amount of data: about 50 short strings.
Users can then download the app, register a few different channels, and fetch the data from the server to their devices. The data will be stored locally and will not be fetched again unless the admin user updates it (but we assume that won't happen very often). Every channel will be available to no more than 500 devices.
The users can contribute to the channel, but this data will be stored on S3 and not in the database.
2 important points:
Most of the channels will be active for about 5 months, for roughly 500 users each, but most of the activity will happen within the same couple of days.
Every channel serves a small number of users (500), but we hope :) to get to hundreds of thousands of admin users.
Building the CMS with Rails, we found that using SimpleDB is more straightforward than using DynamoDB. But, as we are not server experts, we saw the limitations of SimpleDB and we don't know if SimpleDB can handle the amount of data transfer that we will have (if our app succeeds). Another important point is that DynamoDB's costs are much higher and do not depend on usage, while SimpleDB will be much cheaper at the beginning.
The question is:
Can SimpleDB fit our needs?
Could we migrate to DynamoDB later if our service grows in the future?
Starting out with a new project and not really knowing what to expect from the usage, I'd say the better option is to go with SimpleDB. It doesn't sound like your usage is going to be very high; SimpleDB should be able to handle that no problem. The real power of DynamoDB comes in when you really have a lot of load. It doesn't seem like you fall into that category.
If you design your application correctly, switching between SimpleDB and DynamoDB should be a simple task if you decide at some point that SimpleDB is not working out. I do these kinds of switches all the time with other components in my software. Since both databases are NoSQL, you shouldn't have a problem converting between the two. Just make sure that any features you use in SimpleDB are available in DynamoDB. Design your schema with both in mind: DynamoDB has stricter requirements around indexes, so make sure the two will be compatible.
That being said, plenty of people have been using SimpleDB for their applications, and I don't expect you would see any performance problems unless your product really takes off, at which point you can invest the resources to move to DynamoDB.
Aside from all that, there's the pricing, as you already mentioned. SimpleDB is the obvious solution for your use case.
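To make the "design for both" point concrete, the seam can be as simple as this sketch; the interface and its methods are illustrative, not part of either AWS SDK:

```typescript
// The app codes against this interface, and you swap implementations if
// SimpleDB stops working out. Method names here are hypothetical.
interface ChannelStore {
  getChannel(channelId: string): Promise<Record<string, string> | null>;
  putChannel(channelId: string, attributes: Record<string, string>): Promise<void>;
}

class SimpleDbChannelStore implements ChannelStore {
  // ...would call SimpleDB's GetAttributes / PutAttributes under the hood
  async getChannel(channelId: string) { /* elided */ return null; }
  async putChannel(channelId: string, attributes: Record<string, string>) { /* elided */ }
}

class DynamoDbChannelStore implements ChannelStore {
  // ...would call DynamoDB's GetItem / PutItem; note the key schema must be
  // declared up front here, unlike with SimpleDB
  async getChannel(channelId: string) { /* elided */ return null; }
  async putChannel(channelId: string, attributes: Record<string, string>) { /* elided */ }
}

// The rest of the app only ever sees ChannelStore:
async function loadChannel(store: ChannelStore, id: string) {
  return store.getChannel(id);
}
```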

Google Visualization API

I want a real and honest opinion: what do you think of the Google Visualization API?
Is it reliable to use? When I was reading the documentation I noticed that there are a lot of issues and defects to overcome. And can I use it to retrieve data from a MySQL database?
Thank you.
I am currently evaluating it. Compared to other JavaScript data visualization frameworks, I think it has a lot going for it:
dynamic loading is built-in
diverse, many things to choose from.
looks really great!
framework mostly takes care of picking whatever implementation fits the current browser
service based, you don't need to download anything in advance
unified data source: just create one data table, and have multiple visualizations draw from that data (see the sketch after this list).
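To illustrate that last point, here is a minimal sketch with one DataTable feeding two charts (element ids and data values are made up):

```typescript
// Assumes the loader script from gstatic.com is already on the page.
declare const google: any;

google.charts.load("current", { packages: ["corechart"] });
google.charts.setOnLoadCallback(() => {
  // One data table...
  const data = new google.visualization.DataTable();
  data.addColumn("string", "Month");
  data.addColumn("number", "Sales");
  data.addRows([["Jan", 120], ["Feb", 90], ["Mar", 150]]);

  // ...drawn by two different visualizations.
  new google.visualization.PieChart(document.getElementById("pie")!)
    .draw(data, { title: "Sales share" });
  new google.visualization.ColumnChart(document.getElementById("bars")!)
    .draw(data, { title: "Sales by month" });
});
```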
As a disadvantage, I'd like to mention security. I mean, because it's all service based, it is not so transparent what happens when you pass data into these API calls. And as far as I know, the API is free, but not open source, so I can't really check what is going on behind the covers.
I think the Google Visualization API really shines if you want to very quickly whip up a visualization gadget for use in a blog or so, and you are not interested in deploying all kinds of plugins and libraries (for example, with jQuery-based frameworks, you may need to manage multiple JavaScript libraries that work together to deliver the goods). If, on the other hand, you are creating an application that you want to sell, you might want to keep more control over what components you are using, and I would probably consider using something like Flot.
But like I said, I am only evaluating at the moment; I am not using this in production.
Works really great for me. Can be customized fairly easily. Haven't seen any scaling issues. No data is exposed so security should not be an issue. - Arunabh Das
One point I want to add here is that the Google Visualization API cannot be downloaded; it's not available for offline usage. So an application that is going to use it must always be connected to the internet, otherwise I think it won't be able to render charts. Due to this limitation, this API cannot be used in applications for which an internet connection is not available.
I am currently working on a web-based application that will have the Google Visualization API added to it, and from the perspective of a developer the Google Visualization API is very limited in what you can do with each individual chart. If I had a choice, I would probably look at dojox charting, just because of the extra flexibility that framework gives you.
If you are doing any kind of large web application that will use charting extensively, then I would not recommend the Google Visualization API; it does not have enough flexibility for a large web application.
I am using the Google Visualization API, and I want to stress that they still won't let you download it, which means that if their servers are down, your app will be down if you depend on it. I have been using it for about 4 months, and it has crashed on me once, so I'd say it's pretty reliable, and the documentation is really nice.

Streaming, Daemons, Cronjobs, how do you use them? (in Ruby)

I've finally had a second to look into streaming, daemons, and cron tasks and all the neat gems built around them! But I'm not clear on how/when to use these things.
I have a few questions:
1) If I wanted to have a website that stayed constantly updated, in realtime, with my Facebook friends' activity feeds, up-to-the-minute Amazon book reviews on my favorite books, and my Twitter feed, would I just create some custom streaming implementation using the Daemon gem, the ruby-yali gem for streaming the content, and the Whenever gem, which could, say, check those sites every 3-10 seconds to see if the content I'm looking for has changed? Is that how it would work? Or is it typically/preferably done differently?
2) Is (1) too processor intensive? Is there a better way you do it, a better way for live content streaming, given that the website you want realtime updates on doesn't have a streaming api? I'm thinking about just sending a request every few seconds in a separate small ruby app (with daemons and cronjobs), getting the json/xml result, using nokogiri to remove the stuff I don't need, and then just going through the small list of comments/books/posts/etc., building a feed of what's changed, and using Juggernaut or something to push those changes to some rails app. Would that work?
I guess it all boils down to the question:
How does real-time streaming of the latest content of some website work? How do YOU do it?
...so if someone is on my site, they can see in real time the new message or new book that just came out?
Looking forward to your answers,
Lance
Well, first: if a website doesn't provide an API, that's a strong indication that it may not be legal to parse and extract their data; however, you'd better check their terms of use and privacy policy.
Personally I'm not aware of something called a "Streaming API", but supposing they have an API, you still need to pull the results it provides (XML, JSON, ...), parse them, and present them back to the user. The strategy will vary depending on your app type:
Desktop app: you can just pull the data directly, parse it, and provide it to the user; many apps work like that, Twhirl for example.
Web app: you need to cut down the time spent extracting the data. Typically you will pull the data from the API and store it. However, storing the data is a bit tricky! You don't want your database to become a bottleneck for the app under the extreme pull queries it's going to get to retrieve the data. One way to handle this is to use a push methodology; follow option 2 in this case to get the data and then push it to the user. If you want instant updates, like chat for example, you can have a look at Orbited. If it's OK to save the data to some kind of user and followers' 'inboxes', then the simplest way as far as I can tell is to use IMAP to send the updates to the user's inbox.
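The question is framed around Ruby gems, but the poll-diff-push loop itself is language-agnostic. Here is a sketch of it in TypeScript, with the endpoint, the item shape, and the push function all hypothetical:

```typescript
// Poll a source every few seconds, keep only what's new, push the diff out.
const seen = new Set<string>();

async function poll(): Promise<void> {
  const res = await fetch("https://example.com/api/reviews.json"); // hypothetical endpoint
  const items: Array<{ id: string; text: string }> = await res.json();

  // Keep only items we haven't pushed before.
  const fresh = items.filter(item => !seen.has(item.id));
  for (const item of fresh) {
    seen.add(item.id);
  }
  if (fresh.length > 0) {
    pushToClients(fresh); // e.g. hand off to Juggernaut / WebSockets / SSE
  }
}

declare function pushToClients(items: Array<{ id: string; text: string }>): void;

// Check every 10 seconds, the upper end of the interval in the question.
setInterval(() => { poll().catch(console.error); }, 10_000);
```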