I’m running a joined SQL query for locations and events (occurring at the locations). In the results, the location data is naturally replicated per row, as there’s a one-to-many relationship: one location holds multiple events.
What’s an optimal approach to clean up the multiplied location data?
Staying with a single SQL operation, what makes the most sense is performing a check while looping through the query results (rows).
However I cannot seem to access the locations object to check for a pre-existing location ID.
Edit:
This is the SQL output. As you can see, location data naturally occurs multiple times, because it's shared across events. Ultimately this will be sent out as JSON, with nested structs: one for locations, one for events.
locations.id  locations.title  locations.latlng        events.id  events.title  events.locationid
1             Fox Thea...      43.6640673,-79.4213863  1          Bob's Event   1
1             Fox Thea...      43.6640673,-79.4213863  2          Jill's Event  1
2             Wrigley ...      43.6640673,-79.4213863  3          Mary's Event  2
3             Blues Bar        43.6640673,-79.4213863  4          John's Event  3
1             Fox Thea...      43.6640673,-79.4213863  5          Monthly G...  1
1             Fox Thea...      43.6640673,-79.4213863  6          A Special...  1
1             Fox Thea...      43.6640673,-79.4213863  7          The Final...  1
The JSON output. As you can see, location data is multiplied, making for a larger JSON file.
{
"Locations": [
{
"ID": 1,
"Title": "Fox Theatre",
"Latlng": "43.6640673,-79.4213863",
},
{
"ID": 1,
"Title": "Fox Theatre",
"Latlng": "43.6640673,-79.4213863",
},
{
"ID": 2,
"Title": "Wrigley Field",
"Latlng": "43.6640673,-79.4213863",
},
{
"ID": 3,
"Title": "Blues Bar",
"Latlng": "43.6640673,-79.4213863",
},
{
"ID": 1,
"Title": "Fox Theatre",
"Latlng": "43.6640673,-79.4213863",
},
{
"ID": 1,
"Title": "Fox Theatre",
"Latlng": "43.6640673,-79.4213863",
},
{
"ID": 1,
"Title": "Fox Theatre",
"Latlng": "43.6640673,-79.4213863",
}
],
"Events": [
{
"ID": 1,
"Title": "Bob's Event",
"Location": 1
},
{
"ID": 2,
"Title": "Jill's Event",
"Location": 1
},
{
"ID": 3,
"Title": "Mary's Event",
"Location": 2
},
{
"ID": 4,
"Title": "John's Event",
"Location": 3
},
{
"ID": 5,
"Title": "Monthly Gathering",
"Location": 1
},
{
"ID": 6,
"Title": "A Special Event",
"Location": 1
},
{
"ID": 7,
"Title": "The Final Contest",
"Location": 1
}
]
}
Structs:
// Event type
type Event struct {
ID int `schema:"id"`
Title string `schema:"title"`
LocationID int `schema:"locationid"`
}
// Location type
type Location struct {
ID int `schema:"id"`
Title string `schema:"title"`
Latlng string `schema:"latlng"`
}
// LocationsEvents type
type LocationsEvents struct {
Locations []Location `schema:"locations"`
Events []Event `schema:"events"`
}
Function running the query and looping through rows:
func getLocationsEvents(db *sql.DB, start, count int) ([]Location, []Event, error) {
var locations = []Location{}
var events = []Event{}
rows, err := db.Query("SELECT locations.id, locations.title, locations.latlng, events.id, events.title, events.locationid FROM locations LEFT JOIN events ON locations.id = events.locationid LIMIT ? OFFSET ?", count, start)
if err != nil {
return locations, events, err
}
defer rows.Close()
for rows.Next() {
var location Location
var event Event
err := rows.Scan(&location.ID, &location.Title, &location.Latlng, &event.ID, &event.Title, &event.LocationID)
if err != nil {
return locations, events, err
}
// Here I can print locations and see it getting longer with each loop iteration
fmt.Println(locations)
// How can I check if an ID exists in locations?
// Ideally, if location.ID already exists in locations, then only append event, otherwise, append both the location and event
locations = append(locations, location)
events = append(events, event)
}
return locations, events, nil
}
Function called on by router:
func (a *App) getLocationsEventsJSON(w http.ResponseWriter, r *http.Request) {
count := 99
start := 0
if count > 10 || count < 1 {
count = 10
}
if start < 0 {
start = 0
}
locations, events, err := getLocationsEvents(a.DB, start, count)
if err != nil {
respondWithError(w, http.StatusInternalServerError, err.Error())
return
}
var locationsEvents LocationsEvents
locationsEvents.Locations = locations
locationsEvents.Events = events
respondWithJSON(w, http.StatusOK, locationsEvents)
}
Function sending data out as JSON (part of REST API):
func respondWithJSON(w http.ResponseWriter, code int, payload interface{}) {
response, _ := json.Marshal(payload)
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(code)
w.Write(response)
}
UPDATE:
Reverting to doing this with the SQL query, what are the possibilities? Using GROUP BY? Here is an example SQL:
SELECT locations.id, locations.title, locations.latlng, events.id, events.title, events.locationid
FROM locations
LEFT JOIN events ON locations.id = events.locationid
GROUP BY locations.id, events.id
The result set still contains duplicated location data, however it's nicely grouped and sorted.
Then there's the possibility of sub-queries:
http://www.w3resource.com/sql/subqueries/understanding-sql-subqueries.php but now I'm running multiple SQL queries, something I wanted to avoid.
In reality, I don't think I can avoid the duplicated location data when using a single join query like mine. How else would I receive a result set of joined data without the location data being replicated? By having the SQL server send me pre-made JSON data the way I need it (locations and events separated)? From my understanding, it's better to do that work after receiving the results.
I think you can split your request in two, locations (SELECT * FROM locations) and events (SELECT * FROM events), and then pass the results to the JSON marshaller.
These two requests will be very easy and fast for the database to perform, and their intermediate results will be easier to cache.
"but now I'm running multiple SQL queries, something I wanted to avoid."
Could you please clarify this point: why do you want to avoid multiple queries? What task are you trying to solve, and what limitations do you have? Sometimes a set of small, simple queries is better than one overcomplicated query.
If you are writing the query yourself, you should be able to avoid any duplicates in the first place.
At the end of your query, add "GROUP BY {unique field}".
Example that should give a unique list of the locations that appear in your event list:
SELECT locations.id, locations.title, locations.latlng
FROM locations
INNER JOIN events ON events.locationid = locations.id
GROUP BY locations.id, locations.title, locations.latlng
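The in-loop check asked about in the question ("does this location ID already exist in locations?") can be sketched with a map used as a set. This is only a minimal sketch: joinedRow and splitRows are hypothetical names standing in for the scanned rows and the loop body of getLocationsEvents, and the structs are trimmed copies of the ones in the question.

```go
package main

import "fmt"

type Location struct {
	ID     int
	Title  string
	Latlng string
}

type Event struct {
	ID         int
	Title      string
	LocationID int
}

// joinedRow is a hypothetical stand-in for one scanned row of the JOIN.
type joinedRow struct {
	Location Location
	Event    Event
}

// splitRows keeps every event but appends each location only the first
// time its ID is seen, preserving first-seen order.
func splitRows(rows []joinedRow) ([]Location, []Event) {
	seen := make(map[int]bool) // location IDs already appended
	locations := []Location{}
	events := []Event{}
	for _, r := range rows {
		if !seen[r.Location.ID] {
			seen[r.Location.ID] = true
			locations = append(locations, r.Location)
		}
		events = append(events, r.Event)
	}
	return locations, events
}

func main() {
	fox := Location{ID: 1, Title: "Fox Theatre", Latlng: "43.6640673,-79.4213863"}
	wrigley := Location{ID: 2, Title: "Wrigley Field", Latlng: "43.6640673,-79.4213863"}
	rows := []joinedRow{
		{Location: fox, Event: Event{ID: 1, Title: "Bob's Event", LocationID: 1}},
		{Location: fox, Event: Event{ID: 2, Title: "Jill's Event", LocationID: 1}},
		{Location: wrigley, Event: Event{ID: 3, Title: "Mary's Event", LocationID: 2}},
	}
	locations, events := splitRows(rows)
	fmt.Println(len(locations), len(events)) // 2 3
}
```

Inside the real rows.Next() loop the same `seen` map would be declared before the loop and checked right after rows.Scan. Note that with a LEFT JOIN, a location without events yields NULL event columns, which cannot be scanned into plain int and string fields; sql.NullInt64 and sql.NullString are one way to handle that.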
I have a question regarding GraphQL because I do not know if it is possible or not.
I have a simple schema like this:
enum Range{
D,
D_1,
D_7
}
type Data {
id: Int!
levels(range: [Range!]):[LevelEntry]
}
type LevelEntry{
range: Range!
levelData: LevelData
}
type LevelData {
range: Range!
users: Int
name: String
stairs: Int
money: Float
}
Basically I want to do a query so I can retrieve different attributes for the different entries on the levelData property of levels array which can be filtered by some levels range.
For instance:
data {
"id": 1,
"levels": [
{
"range": D,
"levelData": {
"range": D,
"users": 1
}
},
{
"range": D_1,
"levelData": {
"range": D_1,
"users": 1,
"name": "somename"
}
}
]
This means I want the "range, users" properties for D and the "range, users, name" properties for D_1.
I have written an example query, but I do not know if this is possible:
query data(range: [D,D_1]){
id,
levels {
range
... on D {
range,
users
}
... on D_1 {
range,
users,
name
}
}
}
Is it possible? If it is, how can I do it?
{
"rules": [
{
"rank": 1,
"grades": [
{
"id": 100,
"hierarchyCode": 32
},
{
"id": 200,
"hierarchyCode": 33
}
]
},
{
"rank": 2,
"grades": []
}
]
}
I have JSON like the above, and I'm using streams to return "hierarchyCode" based on some condition. For example, if I pass "200", my result should print 33. So far I have done something like this:
request.getRules().stream()
.flatMap(ruleDTO -> ruleDTO.getGrades().stream())
.map(gradeDTO -> gradeDTO.getHierarchyCode())
.forEach(hierarchyCode -> {
//I'm doing some business logic here
Optional<SomePojo> dsf = someList.stream()
.filter(pojo -> hierarchyCode.equals(pojo.getId())) // lets say pojo.getId() returns 200
.findFirst();
System.out.println(dsf.get().getCode());
});
So in the first iteration it returns the expected output, 33, but in the second iteration it fails with a NullPointerException instead of just skipping the loop, since the "grades" array is empty this time. How do I handle the null pointer exception here?
You can use the below code snippet using Java 8:
int valueToFilter = 200;
// Flatten all non-null, non-empty grade lists into a single list.
List<Grade> gradeList = data.getRules().stream()
        .map(Rule::getGrades)
        .filter(x -> x != null && !x.isEmpty())
        .flatMap(Collection::stream)
        .collect(Collectors.toList());
// Find the first grade whose id matches, if any.
Optional<Grade> optional = gradeList.stream()
        .filter(x -> x.getId() == valueToFilter)
        .findFirst();
if (optional.isPresent()) {
    int result = optional.get().getHierarchyCode();
    System.out.println(result);
}
I have created POJOs to match my code; you can try this approach with your own code structure.
If you need the POJOs used in this code, I will share them as well.
Thanks,
Girdhar
Given the format at the end of the question, what's the best way to get the top-level name for a given item?
Top-level names are the ones with parentId = 1.
def getTopLevel(name: String): String = {
// Environment(150) -> Environment(150) - since its parentId is 1
// Assassination -> Security - since Assassination(12) -> Terrorism(10) -> Security(2)
}
Here's my current approach but is there something better?
unmapped = categories.size
Loop through this list while there are still unmapped items.
- build a Map(Int, String) for top levels.
- build a Map(Int, Int) - that maps an id to top level id.
- keep track of unmapped items
once loop exits, I can use both Maps to get the job done.
[
{
"name": "Destination Overview",
"id": 1,
"parentId": null
},
{
"name": "Environment",
"id": 150,
"parentId": 1
},
{
"name": "Security",
"id": 2,
"parentId": 1
},
{
"name": "Armed Conflict",
"id": 10223,
"parentId": 2
},
{
"name": "Civil Unrest",
"id": 21,
"parentId": 2
},
{
"name": "Terrorism",
"id": 10,
"parentId": 2
},
{
"name": "Assassination",
"id": 12,
"parentId": 10
}
]
This is actually two questions:
1. Parsing JSON into a Scala collection, and
2. Using that collection to trace items back to the top parent.
For the first question, you can use play-json. The second part can be handled with a tail-recursive function. Here is the full program that solves both problems:
import play.api.libs.json.{Json, Reads}
case class Node(name: String, id: Int, parentId: Option[Int])
object JsonParentFinder {
def main(args: Array[String]): Unit = {
val s =
"""
|[
| {
| "name": "Destination Overview",
| "id": 1,
| "parentId": null
| },
| {
| "name": "Environment",
| "id": 150,
| "parentId": 1
| },
// rest of the json
|]
|""".stripMargin
implicit val NodeReads : Reads[Node] =Json.reads[Node]
val r = Json.parse(s).as[Seq[Node]]
.map(x => x.id -> x).toMap
println(getTopLevelNode(150, r))
println(getTopLevelNode(12, r))
}
def getTopLevelNode(itemId : Int, nodes: Map[Int, Node], path : List[Node] = List.empty[Node]) : List[Node] = {
if(nodes(itemId).id == 1)
nodes(itemId) +: path
else
getTopLevelNode(nodes(nodes(itemId).parentId.get).id, nodes, nodes(itemId) +: path)
}
}
Output will be:
List(Node(Destination Overview,1,None), Node(Environment,150,Some(1)))
List(Node(Destination Overview,1,None), Node(Security,2,Some(1)), Node(Terrorism,10,Some(2)), Node(Assassination,12,Some(10)))
A few notes:
I have not implemented comprehensive error-handling logic. The implicit assumption is that the only item with parentId==None is the root node. nodes(itemId).parentId.get could lead to failure.
Also, in creating the map, the assumption is that all items have unique ids.
Another assumption is that all nodes eventually have a path to the root node. If that is not the case, this will fail. But it should be straightforward to fix these cases by adding more stop conditions.
I am prepending items to the accumulator list (named path here) because the prepend operation on Scala's lists takes constant time. You can just reverse the resulting list, or use another data structure like Vector to efficiently build the path.
So I have a struct that holds data and has an AddedByUser field, which links to my User struct.
What I want to be able to do is remove the UserLevel from the AddedByUser.
I want to do this from this one function only, so using json:"-" is not an option: that would remove it from all JSON output. I only want to remove it from this one function.
I should also say that these are Gorm models, and when I have been trying to remove the 10 option (UserLevels) it only removes the outer data set, not the UserLevel from all of the data.
{
"ID": 1,
"CreatedAt": "2019-01-08T16:33:09.514711Z",
"UpdatedAt": "2019-01-08T16:33:09.514711Z",
"DeletedAt": null,
"UUID": "00000000-0000-0000-0000-000000000000",
"Title": "title000",
"Information": "info999",
"EventDate": "2006-01-02T15:04:05Z",
"AddedByUser": {
"ID": 2,
"CreatedAt": "2019-01-08T15:27:52.435397Z",
"UpdatedAt": "2019-01-08T15:27:52.435397Z",
"DeletedAt": null,
"UUID": "b019df80-a7e4-4397-814a-795e7e84b4ca",
"Firstname": "Me",
"Surname": "admin",
"Password": "....",
"Email": "admin#email.co.uk",
"UserLevel": {
"ID": 0,
"CreatedAt": "0001-01-01T00:00:00Z",
"UpdatedAt": "0001-01-01T00:00:00Z",
"DeletedAt": null,
"LevelTitle": "",
"UserLevel": null
},
So this is what I have tried,
data := []models.MyData{}
data = append(data[0:2])
I have about 14 results. Without the append it loads all the results, but with it, it only loads two results. The idea was to remove either UpdatedAt or Title, as I am not sure if the gorm model information is all 0 or if the slice sees the fields as 0, 1, 2, 3, 4, etc.
I have also tried to range over the slice of models. While I can access each of the sections, I cannot seem to find a simple method to remove data by name from a struct. Maps seem to have that but not structs, and I am not sure why.
Thanks.
UPDATE
This is the model I am using:
//Model
type MyData struct {
gorm.Model
UUID uuid.UUID
Title string
Information string
EventDate time.Time
AddedByUser Users `gorm:"ForeignKey:added_by_user_fk"`
AddedByUserFK uint
}
//Users Model
type Users struct {
gorm.Model
UUID uuid.UUID
Firstname string
Surname string
Password string
Email string
UserLevel UserLevels `gorm:"ForeignKey:user_level_fk" json:",omitempty"`
UserLevelFK uint
}
As mentioned in the comments, you cannot remove fields from a struct value, because that would yield a value of a different type.
However, you can set fields to their zero value. Combined with the omitempty JSON tag, you can exclude fields from the JSON encoding. To make this work properly, you have to change the UserLevel field to a pointer type (otherwise you end up with empty objects in the JSON document).
Types shortened for brevity:
package main
import (
"encoding/json"
"fmt"
)
type MyData struct {
Title string
AddedByUser Users
}
type Users struct {
ID int
UserLevel *UserLevels `json:",omitempty"` // pointer type with omitempty
}
type UserLevels struct {
LevelTitle string
}
func main() {
var x MyData
x.Title = "foo"
x.AddedByUser.ID = 2
x.AddedByUser.UserLevel = &UserLevels{}
f(x)
b, _ := json.MarshalIndent(x, "", " ")
fmt.Println("main:\n" + string(b))
}
func f(x MyData) {
// "unset" UserLevel. Since we are receiving a copy of MyData, this is
// invisible to the caller.
x.AddedByUser.UserLevel = nil
b, _ := json.MarshalIndent(x, "", " ")
fmt.Println("f:\n" + string(b))
}
// Output:
// f:
// {
// "Title": "foo",
// "AddedByUser": {
// "ID": 2
// }
// }
// main:
// {
// "Title": "foo",
// "AddedByUser": {
// "ID": 2,
// "UserLevel": {
// "LevelTitle": ""
// }
// }
// }
Try it on the playground: https://play.golang.org/p/trUgnYamVOA
Alternatively, you can define new types that exclude the UserLevel field. However, since this field isn't at the top level, this is a lot of work, and it's easy to forget to update those types when new fields are added to the original types.
If the field were at the top level, the compiler would do most of the work for you, because types that only differ in their field tags can be directly converted to one another:
package main

import (
	"encoding/json"
	"fmt"
)

type MyData struct {
	ID    int
	Title string
}

func main() {
	var x MyData
	x.ID = 1
	x.Title = "foo"
	f(x)
}

func f(x MyData) {
	type data struct { // same as MyData, except the field tags
		ID    int
		Title string `json:"-"`
	}
	// The conversion data(x) is allowed because only the tags differ.
	b, _ := json.MarshalIndent(data(x), "", " ")
	fmt.Println("f:\n" + string(b))
}
I have a (nested) JSON file whose content I am unmarshalling into a map[string]interface{}. Now I have to implement pagination, as the data is large. The client side will send the desired page as a query parameter. How can I slice the data I have?
This is a snippet of the data I am dealing with:
"packages":{
"pkg1": {
"meta": {
"description": "description1",
"name": "pkg1.1"
},
"name": "pkg1.1"
},
"pkg2": {
"meta": {
"description": "description2",
"name": "pkg2.2"
},
"name": "pkg2.2"
},
}
So what I did is that I recursively iterated through the data and created an array of a custom type containing the data I need (name, description) for each entry so that I can use it for pagination. Here is the code I used:
type Object struct {
name string
description string
}
func iterate(aMap map[string]interface{}, result *[]Object){
for key, val := range aMap {
switch val.(type) {
case map[string]interface{}:
if(key == "meta"){
switch reflect.TypeOf(val).Kind() {
case reflect.Map:
s := reflect.ValueOf(val)
var tmpData Object
if(s.MapIndex(reflect.ValueOf("name")).IsValid()){
tmpData.name = s.MapIndex(reflect.ValueOf("name")).Interface().(string)
}
if(s.MapIndex(reflect.ValueOf("description")).IsValid()){
tmpData.description = s.MapIndex(reflect.ValueOf("description")).Interface().(string)
}
*result = append(*result, tmpData)
}
}
iterate(val.(map[string]interface{}), result)
default: //DO NOTHING!!
}
}
}
If you're doing pagination, somewhere the data must be represented as a list instead of an object? I assume at some place in your JSON, you have a list of items, otherwise pagination doesn't make sense.
It shouldn't be very hard, something simple like this should work:
const (
itemsPerPage = 10
)
var data []map[string]interface{}
// GetDataPage returns the items on the given page. Pages start at 1;
// a page of 0 or less, or one past the end of the data, yields nil.
func GetDataPage(page int) []map[string]interface{} {
	if page < 1 {
		return nil
	}
	start := (page - 1) * itemsPerPage
	if start >= len(data) {
		return nil
	}
	stop := start + itemsPerPage
	if stop > len(data) {
		stop = len(data)
	}
	return data[start:stop]
}
You are unmarshalling your json into a map which has no order by itself. In order to be able to paginate your results you need to order them in some way.
One way of doing it is to sort your data and then store it into an array. But in order to paginate you need to have ordered data and that is not possible with a map.
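One possible way to get that ordering is to sort the map's keys and paginate over the sorted key slice. The following is a minimal sketch under that assumption: pageKeys and the page size of 2 are hypothetical choices for illustration, and the "pkg3" entry is invented sample data in the shape of the question's snippet.

```go
package main

import (
	"fmt"
	"sort"
)

const itemsPerPage = 2 // hypothetical page size, kept small for the demo

// pageKeys returns the keys of m that fall on the given 1-based page,
// after sorting all keys alphabetically. It returns nil for a page
// below 1 or past the end of the data.
func pageKeys(m map[string]interface{}, page int) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys) // map iteration order is random, so sort first

	start := (page - 1) * itemsPerPage
	if page < 1 || start >= len(keys) {
		return nil
	}
	stop := start + itemsPerPage
	if stop > len(keys) {
		stop = len(keys)
	}
	return keys[start:stop]
}

func main() {
	packages := map[string]interface{}{
		"pkg3": map[string]interface{}{"name": "pkg3.3"},
		"pkg1": map[string]interface{}{"name": "pkg1.1"},
		"pkg2": map[string]interface{}{"name": "pkg2.2"},
	}
	fmt.Println(pageKeys(packages, 1)) // [pkg1 pkg2]
	fmt.Println(pageKeys(packages, 2)) // [pkg3]
}
```

The caller can then look up each returned key in the original map to build the response for that page. Sorting on every request is wasteful for large data, so in practice the sorted key slice would probably be built once after unmarshalling and reused.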