I've arrived at a point where I have to use GraphQL in a SQL-esque way to pull data en masse (I wish it weren't so), and I also need to use a canned query called fooQuery, which I cannot edit, to extract the data.
Having looked at other answers here and here, I wrote a Python script to generate a big ugly superquery from a list of unique ids. It looks like this:
{
  query155051: fooQuery(id: 155051) {
    ...details
  }
  query414989: fooQuery(id: 414989) {
    ...details
  }
  .
  .
  .
  query265014: fooQuery(id: 265014) {
    ...details
  }
}
fragment details on fooQuery {
  categories {
    id
    name
  }
}
The three lines of dots represent thousands of analogous subqueries. The db does fine for up to ~30 such subqueries, but it fails with any more, and I have no idea why.
My question is then this – is there a better way to batch thousands of queries in a single query, or should I just break up the thousands into many superqueries of ~30 subqueries?
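For what it's worth, the second option (many superqueries of ~30 subqueries each) is easy to script. The following is only a sketch under assumptions: the batch size of 30 comes from the limit observed above, the helper names (`chunked`, `build_superquery`, `build_batches`) are made up for illustration, and it only builds the query strings; sending them is left to whatever client you already use.

```python
from typing import Iterable, Iterator, List

# Fragment text copied from the question; the type name `fooQuery` is kept as written there.
DETAILS_FRAGMENT = """fragment details on fooQuery {
  categories {
    id
    name
  }
}"""


def chunked(ids: List[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of `ids`, each at most `size` long."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def build_superquery(ids: Iterable[int]) -> str:
    """Build one document with an aliased fooQuery call per id."""
    subqueries = "\n".join(
        f"  query{i}: fooQuery(id: {i}) {{\n    ...details\n  }}" for i in ids
    )
    return "{\n" + subqueries + "\n}\n" + DETAILS_FRAGMENT


def build_batches(ids: List[int], batch_size: int = 30) -> List[str]:
    """Split `ids` into batches and build one superquery per batch."""
    return [build_superquery(batch) for batch in chunked(ids, batch_size)]
```

Each string returned by `build_batches` is a complete document with its own copy of the fragment, so the batches can be sent independently and the aliased results merged afterwards.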
Hullo everyone,
This has been discussed a bit before, but it's one of those things where there is so much scattered discussion resulting in various proposed "hacks" that I'm having a hard time determining what I should do.
I would like to use the result of a query as an argument for another nested query.
query {
  allStudents {
    nodes {
      courseAssessmentInfoByCourse(courseId: "2b0df865-d7c6-4c96-9f10-992cd409dedb") {
        weightedMarkAverage
        # getting result for specific course is easy enough
      }
      coursesByStudentCourseStudentIdAndCourseId {
        nodes {
          name
          # would like to be able to do something like this
          # to get a list of all the courses and their respective
          # assessment infos
          assessmentInfoByStudentId (studentId: student_node.studentId) {
            weightedMarkAverage
          }
        }
      }
    }
  }
}
Is there a way of doing this that is considered to be best practice?
Is there a standard way to do it built into GraphQL now?
Thanks for any help!
The only means to substitute values in a GraphQL document is through variables, and these must be declared in your operation definition and then included alongside your document as part of your request. There is no inherent way to reference previously resolved values within the same document.
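To make the variables mechanism concrete, here is a small sketch of a request body in which the course id is declared in the operation definition and supplied alongside the document. The operation name `CourseAssessment` and the variable type `UUID!` are assumptions for illustration; use whatever type your schema declares for `courseId`.

```python
import json

# The variable is declared in the operation definition and referenced as $courseId.
# `CourseAssessment` and `UUID!` are illustrative assumptions, not taken from a real schema.
QUERY = """
query CourseAssessment($courseId: UUID!) {
  allStudents {
    nodes {
      courseAssessmentInfoByCourse(courseId: $courseId) {
        weightedMarkAverage
      }
    }
  }
}
"""


def build_request_body(course_id: str) -> str:
    """Serialize the document together with its variable values."""
    return json.dumps({"query": QUERY, "variables": {"courseId": course_id}})
```

The value has to be known before the request is sent, which is why a variable cannot stand in for something resolved earlier in the same document.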
If you get to a point where you think you need this functionality, it's generally a symptom of poor schema design in the first place. What follows are some suggestions for improving your schema, assuming you have control over that.
For example, minimally, you could eliminate the studentId argument on assessmentInfoByStudentId altogether. coursesByStudentCourseStudentIdAndCourseId is a field on the student node, so its resolver can already access the student's id. It can pass this information down to each course node, which can then be used by assessmentInfoByStudentId.
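A minimal sketch of that idea, written as plain Python functions rather than against any particular GraphQL library. The in-memory dictionaries and the function names are hypothetical stand-ins for your storage layer and resolvers.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory data standing in for the real storage layer.
ENROLMENTS: Dict[str, List[str]] = {"s1": ["c1", "c2"], "s2": ["c1"]}
COURSE_NAMES: Dict[str, str] = {"c1": "Algebra", "c2": "Physics"}
MARKS: Dict[Tuple[str, str], float] = {
    ("s1", "c1"): 71.5,
    ("s1", "c2"): 64.0,
    ("s2", "c1"): 88.0,
}


def resolve_courses(student: dict) -> List[dict]:
    """Resolver for the courses field on a student node.

    Each returned course node carries the parent student's id.
    """
    return [
        {"id": cid, "name": COURSE_NAMES[cid], "studentId": student["id"]}
        for cid in ENROLMENTS[student["id"]]
    ]


def resolve_assessment_info(course: dict) -> dict:
    """Resolver for the assessment info field on a course node.

    It reads the student id from its parent, so no studentId argument is needed.
    """
    return {"weightedMarkAverage": MARKS[(course["studentId"], course["id"])]}
```

Here `resolve_assessment_info(resolve_courses({"id": "s1"})[0])` looks up the mark for student `s1` in course `c1` without the client ever passing a student id.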
That said, you're probably better off totally rethinking how you've got your connections set up. I don't know what your underlying storage layer looks like, or the shape your client needs the data to be in, so it's hard to make any specific recommendations. However, for the sake of example, let's assume we have three types -- Course, Student and AssessmentInfo. A Course has many Students, a Student has many Courses, and an AssessmentInfo has a single Student and a single Course.
We might expose all three entities as root level queries:
query {
  allStudents {
    # fields
  }
  allCourses {
    # fields
  }
  allAssessmentInfos {
    # fields
  }
}
Each node could have a connection to the other two types:
query {
  allStudents {
    courses {
      edges {
        node {
          id
        }
      }
    }
    assessmentInfos {
      edges {
        node {
          id
        }
      }
    }
  }
}
If we want to fetch all students, and for each student know what courses s/he is taking and his/her weighted mark average for that course, we can then write a query like:
query {
  allStudents {
    assessmentInfos {
      edges {
        node {
          id
          weightedMarkAverage
          course {
            id
            name
          }
        }
      }
    }
  }
}
Again, this exact schema might not work for your specific use case, but it should give you an idea of how you can approach your problem from a different angle. A couple more tips when designing a schema:
Add filter arguments on connection fields, instead of creating separate fields for each scenario you need to cover. A single courses field on a Student type can have a variety of arguments like semester, campus or isPassing -- this is cleaner and more flexible than creating different fields like coursesBySemester, coursesByCampus, etc.
If you're dealing with aggregate values like average, min, max, etc., it might make sense to expose those values as fields on each connection type, in the same way a count field is sometimes available alongside the nodes field. There's a [proposal](https://github.com/prisma/prisma/issues/1312) for Prisma that illustrates one fairly neat way to handle these aggregate values. Doing something like this would mean that if you already have, for example, an Assessment type, a connection field might be sufficient to expose aggregate data about that type (like grade averages) without needing to expose a separate AssessmentInfo type.
Filtering is relatively straightforward; grouping is a bit tougher. If you do find that you need the nodes of a connection grouped by a particular field, again this may be best done by exposing an additional field on the connection itself, [like Gatsby does it](https://www.gatsbyjs.org/docs/graphql-reference/#group).
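The first two tips can be sketched together in plain Python. This is only an illustration under assumptions: `Course`, `CourseConnection`, `courses_field` and the `mark` attribute are hypothetical names, and the optional `semester` and `campus` parameters mirror the filter arguments mentioned above.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List, Optional


@dataclass
class Course:
    name: str
    semester: str
    campus: str
    mark: float  # hypothetical per-course mark used for the aggregate


@dataclass
class CourseConnection:
    """Connection object exposing aggregates next to its nodes."""
    nodes: List[Course]

    @property
    def count(self) -> int:
        return len(self.nodes)

    @property
    def average_mark(self) -> Optional[float]:
        # No nodes means there is no meaningful average.
        return mean(c.mark for c in self.nodes) if self.nodes else None


def courses_field(
    courses: List[Course],
    semester: Optional[str] = None,
    campus: Optional[str] = None,
) -> CourseConnection:
    """One field with optional filters instead of coursesBySemester, coursesByCampus, etc."""
    selected = [
        c for c in courses
        if (semester is None or c.semester == semester)
        and (campus is None or c.campus == campus)
    ]
    return CourseConnection(nodes=selected)
```

A single field covers every filter combination, and the aggregate always describes exactly the nodes the filters selected.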
In my app there are many entities which get exposed by GraphQL. All those entities get resolvers, and those have many methods (I think they are called "fields" in GraphQL). Since there is only one Query type allowed, I get an "endless" list of fields which belong to many different contexts, e.g.:
query {
  newsRss (...)
  newsCurrent (...)
  userById(...)
  weatherCurrent (...)
  weatherForecast(...)
  # ... many more
}
As you can see, there are already 3 different contexts here: news, users and weather. Now I can go on and prefix all fields ([contextName]FieldName), as I did in the example, but the list gets longer and longer.
Is there a way to "group" some of them together, if they relate to the same context? Like so, in case of the weather context:
query {
  weather {
    current(...)
    forecast(...)
  }
}
Thanks in advance!
If you want to group them together, you need a type which contains all fields under the same context. Taking weather as an example, you need a type which contains the currentWeather and forecastWeather fields. Does this concept make sense in your application, such that you can name it easily and users will not find it strange? If yes, you can change the schema to achieve your purpose.
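As a library-independent sketch of such a grouping type: the root field returns an object that holds no data of its own and only carries the grouped fields. The class names, the `city` and `days` arguments, and the returned values are all hypothetical, since the original fields only show `(...)`.

```python
from typing import Dict, List


class Weather:
    """Namespace type: no data of its own, it only groups the weather fields."""

    def current(self, city: str) -> Dict[str, str]:
        # Placeholder result; a real resolver would call the weather backend.
        return {"city": city, "summary": "sunny"}

    def forecast(self, city: str, days: int) -> List[Dict[str, object]]:
        # Placeholder result with one entry per requested day.
        return [{"city": city, "day": d, "summary": "sunny"} for d in range(1, days + 1)]


class Query:
    """Root type: one field per context instead of one per operation."""

    def weather(self) -> Weather:
        return Weather()
```

With this shape, `Query().weather().forecast("Oslo", 3)` corresponds to selecting `weather { forecast(...) }` in the query.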
On the other hand, if all fields of the same context actually return the same type and just filter different things, you can consider defining arguments on the root query field to specify the condition you want to filter by, something like:
query {
  weather(type: CURRENT) {
    # fields
  }
}
and
query {
  weather(type: FORECAST) {
    # fields
  }
}
to query the current weather and forecast weather respectively.
So it is a question about how you design the schema.
I have gone through the docs and also Googled. I see little mention of returning multiple queries on the same sheet from Maatwebsite's Laravel Excel. I presume, therefore, that it is 1 query for 1 downloaded spreadsheet. I also presume that if you do have multiple queries, you will need to place each query on an additional sheet.
Have I got this right?
Many thanks
In a perfect world, every query would get its own sheet. But in reality, it will export whatever you give it so long as it receives a single array or collection for the output, depending on your configuration. It would be up to you to determine how to combine your queries into a format that could be interpreted as rows and columns.
Basic example with two queries:
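use App\Models\User; // assumption: default model namespace in Laravel 8+; adjust to your app
use Maatwebsite\Excel\Concerns\FromCollection;

class ExportSample implements FromCollection
{
    // ...
    public function collection()
    {
        // query 1
        $a = User::where('id', 2)->get();
        // query 2
        $b = User::where('id', 4)->get();
        // merge both result sets into a single collection of rows
        return $a->merge($b);
    }
}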
Of course, if your queries result in different column structures, there may be additional obstacles.
So basically how do you handle permissions?
Let's say we have a list of Post(s) of some kind, with an argument first to limit the number of posts. Only the owner and approved users can read the posts; everyone else can't. What is the best way to implement this?
query {
  viewer {
    posts(first: 10) {
      id
      text
    }
  }
}
What I'm currently thinking of is to have a single source of truth for whether a user can read the post or not, and to hook it up with the dataloader module.
But how do I query for exactly 10 posts? If I query my DB for exactly 10 rows and then later filter them with some business logic, I can end up with, for example, only 8 posts returned.
A solution is to not put a limit on the query, but that's not very efficient. So what is a good way to go about this?
Inspiration from here
(1) https://dev-blog.apollodata.com/auth-in-graphql-part-2-c6441bcc4302
(2) https://dev-blog.apollodata.com/graphql-at-facebook-by-dan-schafer-38d65ef075af
(1) solved it by
export const DB = {
  Lists: {
    all: (user_id) => {
      return sql.raw("SELECT id FROM lists WHERE owner_id IS NULL OR owner_id = %s", user_id);
    }
  }
}
as the query, and then to filter out which rows can be read:
resolve: (root, _, ctx) => {
  // factor out data fetching
  return DB.Lists.all(ctx.user_id)
    .then( lists => {
      // enforce auth on each node
      return lists.map(auth.List.enforce_read_perm(ctx.user_id));
    });
}
So, we can clearly see that it's querying for all the rows, even if, say, the first argument was 1, which is what I'm trying to avoid.
Maybe I'm approaching the problem wrong in some way, as the business logic lives on another layer than the DB one, so there's no way but to query all the rows. Any help appreciated.
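One middle ground between "limit in the DB and come up short" and "fetch every row" is to fetch in pages and keep filtering until enough readable posts are collected. To be clear, this is not what either linked post does; it is just a sketch of that alternative, with `fetch_page`, `can_read` and `page_size` as hypothetical names for the DB call, the permission check and the page length.

```python
from typing import Callable, List


def first_readable(
    fetch_page: Callable[[int, int], List[dict]],
    can_read: Callable[[dict], bool],
    first: int,
    page_size: int = 20,
) -> List[dict]:
    """Collect up to `first` posts that pass `can_read`, fetching page by page.

    `fetch_page(offset, limit)` stands in for a limited DB query.
    """
    readable: List[dict] = []
    offset = 0
    while len(readable) < first:
        page = fetch_page(offset, page_size)
        if not page:
            break  # no more rows in the DB
        readable.extend(post for post in page if can_read(post))
        offset += len(page)
    # The last page may have pushed us past `first`, so trim.
    return readable[:first]
```

The number of DB round trips depends on how many rows the permission check rejects, so this only pays off when most rows are readable.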
For future reference and other people searching for solutions.
I used DataLoader to solve the authentication problem.
I essentially implemented what they did in https://dev-blog.apollodata.com/graphql-at-facebook-by-dan-schafer-38d65ef075af and used this boilerplate repo as guidance. Not much more to say than that.