I'm writing a web socket service for my Ember app. The service will subscribe to a URL, and receive data. The data will push models into Ember Data's store.
The URL scheme does not represent standard RESTful routes; it's not /posts and /users, for example, it's something like /inbound. Once I subscribe it will just be a firehose of various events.
For each of the routes I subscribe to, I will need to implement data munging specific to that route to get the data into a format Ember Data expects. My question is: where is the best place to do this?
An example event object I'll receive:
event: {
0: "device:add",
1: {
device: {
devPath: "/some/path",
label: "abc",
mountPath: "/some/path",
serial: "abc",
uuid: "5406-12F6",
uniqueIdentifier: "f5e30ccd7a3d4678681b580e03d50cc5",
mounted: false,
files: [ ],
ingest: {
uniqueIdentifier: 123,
someProp: 123,
anotherProp: 'abc'
}
}
}
}
I'd like to munge the data into a standardized shape, like this:
device: {
id: "f5e30ccd7a3d4678681b580e03d50cc5",
devPath: "/some/path",
label: "abc",
mountPath: "/some/path",
serial: "abc",
uuid: "5406-12F6",
mounted: false,
files: [ ],
ingestId: 123
},
ingest: {
id: 123,
someProp: 123,
anotherProp: 'abc'
}
and then hand that off to something that knows how to add both the device model and the ingest model to the store. I'm just getting confused by all the abstractions in Ember Data.
Questions:
Which method should I pass that final, standardized JSON to in order to add the records to the store? store.push?
Where is the appropriate place for the initial data munging, i.e. getting the event data out of the array? The application serializer's extractSingle? pushPayload? Most of the munging will be non-standard across the different routes.
Should per-type serializers be used for each key in the data after I've done the initial munging? I.e., should I hand the initial "blob" to the application serializer, which would then delegate each key to the per-model serializers?
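For clarity, here's a plain-JavaScript sketch of the reshaping I have in mind, independent of where it ends up living (the function name is just a placeholder):
// Reshapes the raw socket event above into the standardized { device, ingest } payload.
function normalizeDeviceEvent(event) {
  // event[0] is the event name ("device:add"); event[1] carries the payload
  const { ingest, uniqueIdentifier, ...deviceAttrs } = event[1].device;
  const { uniqueIdentifier: ingestId, ...ingestAttrs } = ingest;
  return {
    device: { id: uniqueIdentifier, ...deviceAttrs, ingestId },
    ingest: { id: ingestId, ...ingestAttrs },
  };
}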
This is the real structure of my current JSON data; the events are wrapped in an array ([]) in the first place.
{"events":[{"type":"message","message":{"type":"text"}}]}
So basically I just want text-message-type data from SQS, but now I don't know what the filter should be.
If you are using YAML in Serverless, I'd try this:
filterPatterns:
- body: {events: {type: ["message"], message: {type: ["text"]}}}
In case it helps, I had a similar scenario: in my case, I wanted a function to be triggered only when, inside the body of the SQS message, the field "type" has the value "create" or "delete".
In my case, the following code worked:
filterPatterns:
- body : {type: [create, delete]}
From Redux docs:
This [normalized] state structure is much flatter overall. Compared to
the original nested format, this is an improvement in several ways...
From https://github.com/paularmstrong/normalizr:
Many APIs, public or not, return JSON data that has deeply nested objects. Using data in this kind of structure is often very difficult for JavaScript applications, especially those using Flux or Redux.
Seems like normalized, database-ish data structures are better to work with on the front end. Then why is GraphQL so popular, if its whole query style revolves around quickly getting arbitrarily nested data? Why do people use it?
This kind of discussion is off-topic on SO ...
it's not only about [normalized] structures ...
A GraphQL client (like Apollo) also takes care of all the data-fetching nuances (error handling, caching, refetching, data conversion, and many more), which is hardly doable with Redux alone.
They serve different use cases, and you can use both:
keep (complex) app state in redux,
handle data fetching in apollo (you can use it for local state, too).
Let's look at why we want to normalize the cache and what kind of work we have to do to get a normalized cache.
For the main page we fetch a list of TODOs and a list of high-priority TODOs. Our two endpoints return the following data:
{
  all: [{ id: 1, title: "TODO 1" }, { id: 2, title: "TODO 2" }, { id: 3, title: "TODO 3" }],
  highPrio: [{ id: 1, title: "TODO 1" }]
}
If we stored the data like this in our cache, we would have a difficult time updating a single todo, because we would have to update it in every array we have in our store, or might have in our store in the future.
We can normalize the data and only store references in the array. This way we can easily update a single todo in a single place:
{
  queries: {
    all: [{ ref: "Todo:1" }, { ref: "Todo:2" }, { ref: "Todo:3" }],
    highPrio: [{ ref: "Todo:1" }]
  },
  refs: {
    "Todo:1": { id: 1, title: "TODO 1" },
    "Todo:2": { id: 2, title: "TODO 2" },
    "Todo:3": { id: 3, title: "TODO 3" }
  }
}
The downside is that this shape of data is now much harder to use in our list component. We will have to transform the cache a lot, roughly like so:
function denormalise(cache) {
  return {
    all: cache.queries.all.map(({ ref }) => cache.refs[ref]),
    highPrio: cache.queries.highPrio.map(({ ref }) => cache.refs[ref]),
  };
}
Notice how updating Todo:1 inside the cache now automatically updates every query that references that todo, as long as we run this function inside the React component (such a function is often called a selector in Redux).
The magical thing about GraphQL is that it is a strict specification with a type system. This allows GraphQL clients like Apollo to globally identify objects and normalise the cache. At the same time it can also automatically denormalise the cache for you and update objects in the cache automatically after a mutation. This means that most of the time you have to write no caching logic at all. And this should explain why it is so popular: the best code is no code!
// useQuery and gql are imported from Apollo Client (@apollo/client)
const { data, loading, error } = useQuery(gql`
  { all { id title } highPrio { id title } }
`);
This code automatically fetches the query on load, normalizes the response, and writes it into the cache. It then denormalizes the cache back into the shape of the query. Updates to elements in the cache automatically update all subscribed components.
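The same applies to mutations. As a rough sketch (assuming the server exposes an updateTodo mutation, which is not part of the example above): because the result includes the object's id, Apollo can merge it into the normalized Todo:1 entry, and every query referencing it re-renders.
const [updateTodo] = useMutation(gql`
  mutation Rename($id: ID!, $title: String!) {
    updateTodo(id: $id, title: $title) { id title }
  }
`);
// Merging the result into the cache updates both the `all` and `highPrio` views.
updateTodo({ variables: { id: 1, title: "TODO 1 (renamed)" } });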
I'm trying to get the possible values for multiple dropdown menus from my GraphQL API.
For example, say I have a schema like so:
type Employee {
id: ID!
name: String!
jobRole: Lookup!
address: Address!
}
type Address {
street: String!
line2: String
city: String!
state: Lookup!
country: Lookup!
zip: String!
}
type Lookup {
id: ID!
value: String!
}
jobRole, city and state are all fields that have a predetermined list of values that are needed in various dropdowns in forms around the app.
What would be the best practice in the schema design for this case? I'm considering the following option:
query {
lookups {
jobRoles {
id
value
}
}
}
This has the advantage of being data-driven, so I can update my job roles without having to update my schema, but I can see this becoming cumbersome. I've only added a few of our business objects and already have about 25 different types of lookups in my schema. As I add more data to the API, I'll need to somehow keep track of which lookups are used for which fields, deal with general lookups that are used in multiple places vs. ultra-specific lookups that will only ever apply to one field, etc.
Has anyone else come across a similar issue and is there a good design pattern to handle this?
And for the record, I don't want to use enums with introspection, for two reasons:
With the number of lookups we have in our existing data, there would be a need for very frequent schema updates.
With an enum you only get one value; I need a code that will be used as the primary key in the DB and a descriptive value that will be displayed in the UI.
//bad
enum jobRole {
MANAGER
ENGINEER
SALES
}
//needed
[
{
id: 1,
value: "Manager"
},
{
id: 2,
value: "Engineer"
},
{
id: 3,
value: "Sales"
}
]
EDIT
I wanted to give another example of why enums probably aren't going to work. We have a lot of descriptions that should show up in a drop down that contain special characters.
// Client Type
[
{
id: 'ENDOW',
value: 'Foundation/Endowment'
},
{
id: 'PUBLIC',
value: 'Public (Government)'
},
{
id: 'MULTI',
value: 'Union/Multi-Employer'
}
]
There are others that are worse; they contain <, >, %, etc. And some of them are complete sentences, so the restrictive naming of enum values really isn't going to work for this case. I'm leaning towards just making a bunch of lookup queries and treating each lookup as a distinct business object.
I found a way to make enums work the way I needed: I can get the value by putting it in the description.
Here's my GraphQL schema definition:
enum ClientType {
"""
Public (Government)
"""
PUBLIC
"""
Union/Multi-Employer
"""
MULTI
"""
Foundation/Endowment
"""
ENDOW
}
When I retrieve it with an introspection query like so:
{
__type(name: "ClientType") {
enumValues {
name
description
}
}
}
I get my data in the exact structure I was looking for!
{
"data": {
"__type": {
"enumValues": [{
"name": "PUBLIC",
"description": "Public (Government)"
}, {
"name": "MULTI",
"description": "Union/Multi-Employer"
}, {
"name": "ENDOW",
"description": "Foundation/Endowment"
}]
}
}
}
This has exactly what I need: I can use all the special characters, numbers, etc. found in our descriptions. If anyone is wondering how I keep my schema in sync with our database, I have a simple code-generation script that queries the tables that store this info and generates an enums.ts file exporting all these enums. Whenever the data is updated (which doesn't happen that often), I just re-run the generator and publish the schema changes to production.
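For anyone curious, the generator is roughly along these lines (a simplified sketch: the table access is stubbed out with an inline array, and the output file name is illustrative):
const fs = require('fs');

// Turn lookup rows into an SDL enum whose descriptions carry the display values.
function toEnumSDL(name, rows) {
  const values = rows
    .map(({ id, value }) => `  """\n  ${value}\n  """\n  ${id}`)
    .join('\n');
  return `enum ${name} {\n${values}\n}\n`;
}

// In the real script these rows come from the lookup table in the database.
const clientTypes = [
  { id: 'ENDOW', value: 'Foundation/Endowment' },
  { id: 'PUBLIC', value: 'Public (Government)' },
  { id: 'MULTI', value: 'Union/Multi-Employer' },
];

fs.writeFileSync('client-type.graphql', toEnumSDL('ClientType', clientTypes));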
You can still use enums for this if you want.
Introspection queries can be used client-side just like any other query. Depending on what implementation/framework you're using server-side, you may have to explicitly enable introspection in production. Your client can query the possible enum values when your app loads -- regardless of how many times the schema changes, the client will always have the correct enum values to display.
Enum values are not limited to all caps, although they cannot contain spaces. So you can have Engineer but not Human Resources. That said, if you substitute underscores for spaces, you can just transform the value client-side.
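For example, a minimal sketch of that client-side transform (the helper name is mine):
// Turn an enum value like "HUMAN_RESOURCES" into a display label.
function enumToLabel(enumValue) {
  const text = enumValue.toLowerCase().replace(/_/g, ' ');
  return text.charAt(0).toUpperCase() + text.slice(1);
}
// enumToLabel("HUMAN_RESOURCES") === "Human resources"
// enumToLabel("ENGINEER") === "Engineer"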
I can't speak to non-JavaScript implementations, but GraphQL.js supports assigning a value property for each enum value. This property is only used internally. For example, if you receive the enum as an argument, you'll get 2 instead of Engineer. Likewise, you would return 2 instead of Engineer inside a resolver. You can see how this is done with Apollo Server here.
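As a rough sketch of what that looks like with an Apollo Server-style resolver map (the numeric codes just mirror the jobRole lookup example from the question):
const { gql } = require('apollo-server');

const typeDefs = gql`
  enum JobRole {
    MANAGER
    ENGINEER
    SALES
  }
`;

// Internal values: clients still see MANAGER/ENGINEER/SALES,
// while resolvers receive and return 1/2/3.
const resolvers = {
  JobRole: {
    MANAGER: 1,
    ENGINEER: 2,
    SALES: 3,
  },
};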
My app currently uses Parse and it's time to migrate. I am considering either hosting my own Parse Server or using Firebase. I am looking for guidance on how to approach the data migration to Firebase based on my current data model.
I have a table Users, and that table, aside from all the normal columns, has a Partner column which is of type User.
The flow, works like this:
User 1 signs up
User 1 invites User 2
User 2 receives invite e-mail with invite code
User 2 goes to app and signs-up using that code
And then I have a Parse cloud function that joins both users as partners, adding each other's IDs to the respective column.
The partners are connected via a GCM topic name, so I can push notifications just to these two people.
So this is what I would like to achieve in Firebase: I would like to have the two users connected to each other in some way.
Maybe I could have JSON like this:
partners: {
topic_name_partner1: {
user1: {info about user1},
user2: {info about user2}
},
topic_name_partner2: {
user1: {info about user1},
user2: {info about user2}
},
topic_name_partner3: {
user1: {info about user1},
user2: {info about user2}
}
....etc
}
Would this approach make sense? Obviously I want a scalable application, so I'm also looking for help on how best to represent the data with that in mind.
And, lastly, does Firebase have Cloud Functions like Parse? If not, how can I connect both users when the second user is registering? Maybe I have to look up a ref for the topic_name_partner1 string and then, once I find it, update user2 with a reference to that user?
Thanks!
Based on this Firebase structure guide, here is what I would do:
users: {
user1: {
name: "user1",
partner: "topic_name_partner1",
... other info
},
user2: {
name: "user2",
partner: "topic_name_partner1",
... other info
},
user3: {
name: "user3",
partner: "topic_name_partner2",
... other info
}
}
partners: {
topic_name_partner1: {
user1: true,
user2: true
},
topic_name_partner2: {
user3: true
}
}
This way the data will not be so big when I just want to get the list of users in a partnership without their details.
And currently Firebase does not have a Cloud Functions feature like Parse, so you have to write the data from the client (or probably use their Firebase SDK on your own server).
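A rough sketch of that client-side write against the structure above (Firebase Web SDK; joinPartner and inviteTopic are placeholder names):
// Called once the second user signs up with an invite code that
// resolves to the shared topic name (e.g. "topic_name_partner1").
function joinPartner(uid, inviteTopic) {
  const db = firebase.database();
  return Promise.all([
    db.ref('users/' + uid + '/partner').set(inviteTopic),
    db.ref('partners/' + inviteTopic + '/' + uid).set(true),
  ]);
}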
I would like to create a click-stream application using HBase. In SQL this would be a pretty simple task, but in HBase I haven't got the first clue. Can someone advise me on a schema design and keys to use in HBase?
I have provided a rough data model and several questions that I would like to be able to answer from the data.
Questions I would like to ask of the data:
What events led to a conversion?
What was the last page / how many pages were viewed?
On what pages does a customer drop off?
What products does a male customer between 20 and 30 like to buy?
Is a customer who has bought product x also likely to buy product y?
Conversion amount from the first page?
{
PageViews: [
{
date: "19700101 00:00",
domain: "http://foobar.com",
path: "pageOne.html",
timeOnPage: "10",
pageViewNumber: 1,
events: [
{ name: "slideClicked", value: 0, time: "00:00"},
{ name: "conversion", value: 100, time: "00:05"}
],
pageData: {
category: "home",
pageTitle: "Home Page"
}
},
{
date: "19700101 00:01",
domain: "http://foobar.com",
path: "pageTwo.html",
timeOnPage: "20",
pageViewNumber: 2,
events: [
{ name: "addToCart", value: 50.00, time: "00:02"}
],
pageData: {
category: "product",
pageTitle: "Mans Shirt",
itemValue: 50.00
}
},
{
date: "19700101 00:03",
domain: "http://foobar.com",
path: "pageThree.html",
timeOnPage: "30",
pageViewNumber: 3,
events: [],
pageData: {
category: "basket",
pageTitle: "Checkout"
}
}
],
Customer: {
IPAddress: "127.0.0.1",
Browser: "Chrome",
FirstName: "John",
LastName: "Doe",
Email: "john.doe#email.com",
isMobile: 1,
returning: 1,
age: 25,
sex: "Male"
}
}
Well, your data is mainly in a one-to-many relationship: one customer and an array of page-view entities. And since all your queries are customer-centric, it makes sense to store each customer as a row in HBase and have the customer id (maybe the email in your case) as part of the row key.
If you decide to store one row per customer, each page view's details would be stored nested. The video link regarding HBase design will help you understand that. So for your example above, you get one row and three nested entities.
Another approach would be a denormalized form, so that HBase can do fast lookups. Here each row would be a page view, and the customer data gets appended to every row. So for your example above, you end up with three rows. Data would be duplicated. Again, the video covers that too (compression, and so on).
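To make the denormalized option concrete, here is a rough sketch of what one row per page view could look like for the sample data (the row-key format and column names are only illustrative, not an HBase API):
// One row per page view; customer columns are repeated on every row.
function toRow(customer, pageView) {
  // Customer id first so one customer's rows scan contiguously,
  // then the page-view timestamp to keep the visit in order.
  const rowKey = customer.Email + '#' + pageView.date;
  return {
    rowKey: rowKey,
    columns: {
      'page:path': pageView.path,
      'page:category': pageView.pageData.category,
      'page:timeOnPage': pageView.timeOnPage,
      'events:json': JSON.stringify(pageView.events),
      'customer:age': customer.age,
      'customer:sex': customer.sex,
    },
  };
}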
You have more nested levels inside each page view, like events and pageData. So it will only get worse with respect to denormalization. As everything in HBase is a key-value pair, it is difficult to query and match these nested levels. Hope this helps you kick off.
Good video link here