GORM Find returns an empty struct instead of ErrRecordNotFound - Go

I use the Find method to look up some records, but when I send a wrong condition (one that matches nothing), it should return
ErrRecordNotFound. Instead it returns an empty struct:
var projectSetBudget domain.ProjectSetBudget
filter := &domain.ProjectSetBudget{}
filter.TenantID = tenantID
filter.ID = ProjectSetBudgetID
res := r.db.Where(filter).Find(&projectSetBudget)
I just print the result:
zap.S().Info(errors.Is(res.Error, gorm.ErrRecordNotFound))
It prints false and returns the empty struct:
{
"ID": 0,
"CreatedAt": "0001-01-01T00:00:00Z",
"UpdatedAt": "0001-01-01T00:00:00Z",
"DeletedAt": null,
"title": "",
}

GORM provides First, Take, Last methods to retrieve a single object from the database, it adds LIMIT 1 condition when querying the database, and it will return the error ErrRecordNotFound if no record is found.
https://gorm.io/docs/query.html
Only First, Take, and Last raise that error; Find does not.
If you use First, Take, or Last, you will get ErrRecordNotFound when nothing matches.
First, Take, and Last set the following flag, but Find does not. Find can return any number of rows, so you can tell whether it found anything by checking the count (RowsAffected):
tx.Statement.RaiseErrorOnNotFound = true
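A minimal sketch of both workarounds, reusing the variables from the question:

// Keep Find, but map "zero rows" to the error First would have returned.
res := r.db.Where(filter).Find(&projectSetBudget)
if res.Error == nil && res.RowsAffected == 0 {
    res.Error = gorm.ErrRecordNotFound
}
zap.S().Info(errors.Is(res.Error, gorm.ErrRecordNotFound)) // now true when nothing matches

// Or switch to First, which adds LIMIT 1 and raises ErrRecordNotFound itself.
err := r.db.Where(filter).First(&projectSetBudget).Error
zap.S().Info(errors.Is(err, gorm.ErrRecordNotFound)) // true when nothing matches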

Related

FaunaDB search document and get its ranking based on a score

I have the following Collection of documents with structure:
type Streak struct {
UserID string `fauna:"user_id"`
Username string `fauna:"username"`
Count int `fauna:"count"`
UpdatedAt time.Time `fauna:"updated_at"`
CreatedAt time.Time `fauna:"created_at"`
}
This looks like the following in FaunaDB Collections:
{
"ref": Ref(Collection("streaks"), "288597420809388544"),
"ts": 1611486798180000,
"data": {
"count": 1,
"updated_at": Time("2021-01-24T11:13:17.859483176Z"),
"user_id": "276989300",
"username": "yodanparry"
}
}
Basically I need a lambda or a function that takes in a user_id and spits out its rank within the collection. Rank is simply determined by sorting on the count field. For example, let's say I have the following documents (I ignored other fields for simplicity):
user_id | count
abc     | 12
xyz     | 10
fgh     | 999
If I throw in fgh as an input for this lambda function, I want it to spit out 1 (or 0 if you start counting from 0).
I already have an index for user_id so I can query and match a document reference from this index. I also have an index sorted_count that sorts document based on count field ascendingly.
My current solution was to query all documents by sorted_count index, then get the rank by iterating through the array. I think there should be a better solution for this. I'm just not seeing it.
Please help. Thank you!
Counting things in Fauna isn't as easy as one might expect. But you might still be able to do something more efficient than you describe.
Assuming you have:
CreateIndex(
{
name: "sorted_count",
source: Collection("streaks"),
values: [
{ field: ["data", "count"] }
]
}
)
Then you can query this index like so:
Count(
Paginate(
Match(Index("sorted_count")),
{ after: 10, size: 100000 }
)
)
Which will return an object like this one:
{
before: [10],
data: [123]
}
Which tells you that there are 123 documents with count >= 10, which I think is what you want.
This means that, in order to get a user's rank based on their user_id, you'll need to implement this two-step process (see the sketch after the notes below):
1. Determine the count of the user in question using your index on user_id.
2. Query sorted_count using the user's count as described above.
Note that, in case your collection has more than 100,000 documents, you'll need your Go code to iterate through all the pages based on the returned object's after field. 100,000 is Fauna's maximum allowed page size. See the Fauna docs on pagination for details.
Also note that this might not reflect whatever your desired logic is for resolving ties.
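Putting the two steps together, a sketch in FQL could look like this (the index name streak_by_user_id, with user_id as a term, is an assumption; sorted_count is the index from above):
Let(
  {
    userCount: Select(
      ["data", "count"],
      Get(Match(Index("streak_by_user_id"), "276989300"))
    )
  },
  Count(
    Paginate(
      Match(Index("sorted_count")),
      { after: Var("userCount"), size: 100000 }
    )
  )
)
The data field of the result is the number of documents with count >= the user's count, which for the example data gives 1 for fgh (ties counted as noted above).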

Elastic Ingest Pipeline split field and create a nested field

Dear friendly helpers,
I have an index that is fed by a database via Kafka. This database holds a field that aggregates a couple of pieces of information like so: key/value; key/value; (don't ask for the reason, I have no idea who designed it like that and why ;-) )
93/4; 34/12;
it can be empty, or it can hold 1..n key/value pairs.
I want to use an ingest pipeline and ideally have a "nested" field which holds all the values that are in that field.
Probably like this:
{"categories":
{ "93": 7,
"82": 4
}
}
The use case is the following: we want to visualize the sum of a filtered number of these categories (they tell me how many minutes a specific process took longer) and relate them in ranges.
Example: I filter categories x, y, z and then group how many documents for the day had no delay, which had a delay of up to 5 minutes, and which had a delay between 5 and 15 minutes.
I have tried to get the fields neatly separated with the kv processor and wanted to work from there, but I guess it was a completely wrong approach.
"kv": {
"field": "IncomingField",
"field_split": ";",
"value_split": "/",
"target_field": "delays",
"ignore_missing": true,
"trim_key": "\\s",
"trim_value": "\\s",
"ignore_failure": true
}
When I test the pipeline, it looks OK:
"delays": {
"62": "3",
"86": "2"
}
but there are two things that don't work:
1. I can't know upfront how many of these combinations I have, so converting the values from string to int in the same pipeline is an issue.
2. When I want to create a Kibana index pattern, I end up with many fields like delay.82 and delay.82.keyword, which does not make sense at all for the use case, as I can't filter (get only the sum of delays where the key is one of x, y, z) and aggregate.
I have looked into other processors (dot_expander) but can't really get my head around how to get this working.
I hope my question is clear (I lack English skills, sorry) and that someone can point me in the right direction.
Thank you very much!
You should rather structure them as an array of objects with shared accessors, for instance:
[ {key: 93, value: 7}, ...]
That way, you'll be able to aggregate on categories.key and categories.value.
So this means iterating the categories' entrySet() using a custom script processor like so:
POST _ingest/pipeline/_simulate
{
"pipeline": {
"description": "extracts k/v pairs",
"processors": [
{
"script": {
"source": """
def categories = ctx.categories;
def kv_pairs = new ArrayList();
for (def pair : categories.entrySet()) {
def k = pair.getKey();
def v = pair.getValue();
kv_pairs.add(["key": k, "value": v]);
}
ctx.categories = kv_pairs;
"""
}
}
]
},
"docs": [
{
"_source": {
"categories": {
"82": 4,
"93": 7
}
}
}
]
}
P.S.: Do make sure your categories field is mapped as nested b/c otherwise you'll lose the connections between the keys & the values (also called flattening).
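For reference, a minimal mapping sketch for that nested field (the index name is a placeholder; value is mapped as integer, which by default also accepts the numeric strings the kv processor emits):
PUT my-index
{
  "mappings": {
    "properties": {
      "categories": {
        "type": "nested",
        "properties": {
          "key": { "type": "keyword" },
          "value": { "type": "integer" }
        }
      }
    }
  }
}
With this mapping you can combine a nested filter on categories.key (only x, y, z) with a nested sum aggregation on categories.value, which is the use case described above.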

How can I filter if any value of an array is contained in another array in rethinkdb/reql?

I want to find any user who is a member of a group I can manage (using the web interface / JavaScript):
Users:
{
"id": 1,
"member_in_groups": ["all", "de-south"]
},
{
"id": 2,
"member_in_groups": ["all", "de-north"]
}
I tried:
r.db('mydb').table('users').filter(r.row('member_in_groups').map(function(p) {
return r.expr(['de-south']).contains(p);
}))
but both users are always returned. Which command do I have to use, and how can I use an index for this? (I read about multi indexes in https://rethinkdb.com/docs/secondary-indexes/python/#multi-indexes but there only a single value is searched for.)
I got the correct answer in the Slack channel, so I'm posting it here in case anyone else comes to this thread through googling:
First create a multi index as described in
https://rethinkdb.com/docs/secondary-indexes/javascript/, e. g.
r.db('<db-name>').table('<table-name>').indexCreate('<some-index-name>', {multi: true}).run()
(you can omit .run() if using the webadmin)
Then query the data with
r.db('<db-name>').table('<table-name>').getAll('de-north', 'de-west', {index:'<some-index-name>'}).distinct()
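Applied to the users table from the question, that could look like this (a sketch assuming you create the multi index directly on the member_in_groups field):

r.db('mydb').table('users').indexCreate('member_in_groups', {multi: true}).run()

r.db('mydb').table('users')
  .getAll('de-south', 'de-north', {index: 'member_in_groups'})
  .distinct()

With a multi index, every element of the member_in_groups array becomes its own index key, so getAll returns a user as soon as any of the requested groups appears in their array; distinct() removes the duplicates you'd otherwise get for users matching more than one of the requested groups.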

Offset calculation in Spring Pagination

I have a service with a pagination index starting from 1. I get the list of entities, and after some logic I return them (responses) as below:
totalCount = responses.size();
return new PageImpl<>(responses, pageable, totalCount);
and when I request the 1st page as
new PageRequest(1, 100)
I get back the response as
{"content": [
{
"id": "e1",
}{
"id": "2",
}
],
"last": false,
"totalElements": 102,
"totalPages": 2,
"size": 100,
"number": 1,
"sort": null,
"first": true,
"numberOfElements": 2
}
Here, even though I have "numberOfElements": 2, I get back "totalElements": 102.
The issue which I found is because of pageable.getOffset() calculation in PageImpl
this.total = !content.isEmpty() && pageable != null && pageable.getOffset()
+ pageable.getPageSize() > total
? pageable.getOffset() + content.size() : total;
In my scenario, I'm getting the offset as 100 (1*100) for the 1st page. How do I resolve this?
Note: I use a third-party service to get the responses, and it is indexed from 1, so I am trying to align my service with it so that the entire logic follows the same indexing.
The result you get is correct, since PageRequest uses zero-based pages, as stated in the API docs:
Parameters:
page - zero-based page index.
size - the size of the page to be returned.
So that means you're retrieving the second page (not the first one), and since you have a limit of 100 records and a total of 102 records, you'll only retrieve the last two of them.
You can still expose a 1-based number though:
new PageRequest(page-1, 100);
Alternatively, you can customize this by implementing Pageable. This allows you to override the actual offset being used by Spring data.
Nonetheless, this doesn't change the fact that Spring data expects getPageNumber() to be a zero based number. You cannot change that, you can only add an abstraction layer on top of it to make it meet your requirements.
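A minimal sketch of the first option, keeping the externally visible page number 1-based while Spring Data stays zero-based (the Response type and variable names are placeholders, not from the question):

// requestedPage is the 1-based page number used by the external service.
int requestedPage = 1;
int pageSize = 100;

// Shift to Spring Data's zero-based index before building the request.
Pageable pageable = new PageRequest(requestedPage - 1, pageSize);

// Note: the third argument of PageImpl is the total number of elements in the
// data source, so passing responses.size() there will also skew totalElements.
Page<Response> page = new PageImpl<>(responses, pageable, totalCount);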
And what's wrong with that? totalElements tells you how many elements are stored within the data source, while numberOfElements tells you how many elements the current page contains.
When you have 102 elements in total and you request page 2 with size 100, you should get exactly the response you received.
What probably confuses you:
With new PageRequest(1, 100) you are requesting the 2nd page as the index starts at 0.

DynamoDB: What's the best way to structure and query a sorted list of timestamped logs?

In the interest of better understanding Amazon's DynamoDB, Lambda functions and IAM roles (I'll stick to DynamoDB in this question), I'm setting up a Linux device to listen for new DynamoDB items and audibly read out updates that are being added by other functions at a regular interval. My goal is to query or scan items, returning those items in ascending order since a specific timestamp (the last time the device checked).
Here's the item structure I'm using so far:
{
"id": {
"S": "1eb4520d44715b6daa5f9d907fe43aab" //md5sum of "time"
},
"message": {
"S": "I'm creating the audible reporting log now."
},
"status": {
"S": "working"
},
"time": {
"S": "1452297505" //timestamp: should probably add milliseconds for sake of unique "id"
}
}
"id" is the partition key. "time" is the sort key. Looking at this now, I'm guessing I should probably make "time" a number, not a string...
Query or scan? Query seems like the correct option for sorting, but it requires a specific partition key value in the query (at least in the AWS website query tool), so perhaps I'm adding those incorrectly. Scan loads all items, and I'm guessing that sorting is not automatic or even an option (at least not in the AWS website query tool). I really only want to load items greater than a timestamp value, sorted.
Where am I off in my thinking? I appreciate the assistance in advance.
UPDATE
After further experimentation with AWS-CLI and DynamoDB, I ended up using a slightly different solution. Since this is a small scale "hello world" type of project, all update items are added to the same table with a single partition key, "SF Reporter", for now. This could scale if I decide to start monitoring additional "reporter"/service updates with separate queries and/or devices.
{
"datetime": { //sort key
"S": "2016-01-11T05:15:02"
},
"message": {
"S": "It is all good."
},
"reporter": { //primary partition key
"S": "SF Reporter"
},
"status": {
"S": "ok"
}
}
The query itself looks something like this (abbreviated Node.js example):
var AWS = require("aws-sdk");
AWS.config.credentials = new AWS.SharedIniFileCredentials({ profile: 'default' });
AWS.config.update({"region": "us-west-2"});
var docClient = new AWS.DynamoDB.DocumentClient();
var params = {
TableName: "spoken_reports",
KeyConditionExpression: "#reporter = :reporter and #datetime >= :datetime",
ExpressionAttributeNames:{
"#reporter": "reporter",
"#datetime": "datetime"
},
ExpressionAttributeValues: {
":reporter":"SF Reporter",
":datetime":"2016-01-11T05:15:02"
}
};
var onUpdatesReceived = function(err, data) {
if (err) {
console.log(err, err.stack);
} else {
console.log(data);
}
};
// Define the callback before passing it to query(); otherwise onUpdatesReceived is still undefined at the call site.
docClient.query(params, onUpdatesReceived);
The query gets the latest updates sorted by a string timestamp (defaults to ascending order in this example). This allows for some scaling as I can have multiple devices checking the same table for the latest updates. I would create a scheduled query/function to clear out old updates once in a while to keep things light.
Dead simple way:
You should set up a global secondary index with "isNew" as its partition/hash key and the timestamp as its range key.
On creation of an entry, set isNew to a UUID or something. This will make the table item project into the index.
When you need to check for data, scan the secondary index - it will only contain the items that are new. Then updateItem each item you have read to delete its isNew attribute; the item will drop out of the secondary index, so it is not read again.
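A rough sketch of that read-then-clear loop with the DocumentClient, using the spoken_reports table and its reporter/datetime keys from the update above (the index name isNew-datetime-index is an assumption, and paging via LastEvaluatedKey is omitted):

var AWS = require("aws-sdk");
var docClient = new AWS.DynamoDB.DocumentClient({ region: "us-west-2" });

// Scan only the sparse GSI, which holds nothing but the unread items.
docClient.scan({
  TableName: "spoken_reports",
  IndexName: "isNew-datetime-index"
}, function(err, data) {
  if (err) { return console.log(err, err.stack); }
  data.Items.forEach(function(item) {
    // ...read the update aloud here...
    // Remove isNew so the item drops out of the index and is not read again.
    docClient.update({
      TableName: "spoken_reports",
      Key: { reporter: item.reporter, datetime: item.datetime },
      UpdateExpression: "REMOVE isNew"
    }, function(updateErr) {
      if (updateErr) { console.log(updateErr, updateErr.stack); }
    });
  });
});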
If you stick with this table design, scanning the entire table is the only option you have, for the reasons you've mentioned: for querying, you need a partition key, which is something your devices have no way of knowing beforehand.
There is another solution that comes to my mind:
Let's say your current table is called T1. Create another table, T2, that has deviceID as partition key and timestamp as sort key.
You define an AWS Lambda function on T1's stream that will, on any update, push that row into T2 as well, once per device.
Now whenever one of your devices wakes up, it queries (not scans) T2 with its own device ID, processes all the rows, and deletes them.
In other words, T2 will always have all the rows that a given device is yet to process.
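A hedged Node.js sketch of that stream fan-out Lambda (the T2 table name, its deviceID/datetime keys, the DEVICE_IDS list and the stream trigger configuration are all assumptions you'd set up yourself):

var AWS = require("aws-sdk");
var docClient = new AWS.DynamoDB.DocumentClient();

// Devices that should each receive a copy of every new row from T1.
var DEVICE_IDS = ["living-room-pi", "kitchen-pi"];

exports.handler = function(event, context, callback) {
  var writes = [];
  event.Records.forEach(function(record) {
    // Skip REMOVE events; only fan out rows that have a new image.
    if (!record.dynamodb || !record.dynamodb.NewImage) { return; }
    // Convert the stream's DynamoDB JSON into a plain object.
    var row = AWS.DynamoDB.Converter.unmarshall(record.dynamodb.NewImage);
    DEVICE_IDS.forEach(function(deviceID) {
      writes.push(docClient.put({
        TableName: "T2",
        Item: {
          deviceID: deviceID,     // partition key of T2
          datetime: row.datetime, // sort key of T2
          message: row.message,
          status: row.status
        }
      }).promise());
    });
  });
  Promise.all(writes).then(function() { callback(null); }, callback);
};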
