#Basic(fetch = FetchType/Lazy) for some TEXT field doesn't work - spring

I want to use #Basic(fetch = FetchType/Lazy) for the field assigned to the text, although I add a plugin for bytecode enhancement but again selects this text field in query. Does not hibernate give the opportunity to ignore some field? I need this when I select 3 tables, otherwise I know that I can do this with a projection, my structure is one parent and two child with #OnetoMany relation.

This requires the use of byte code enhancement. See the documentation for details: https://docs.jboss.org/hibernate/orm/5.6/userguide/html_single/Hibernate_User_Guide.html#basic-annotation

Related

parent-child documents relation retrieval

could you help me with little problem regarding parent-child documents relation?
Considering JSON, I have objects, each of them contains an array of sub-objects. Sub-objects contain some text fields.
I need to maintain full-text-search on these objects and construct snippets. I need highlighting for building snippets.
If I use nested objects, highlighting does not deal with them.
Therefore, I use Parent-Child relationships.
Now I need to retrieve Parent-documents, which children match the query_string. Furthermore, I need to get highlighted fields of matched children and associate each one(each child) with corresponding parent to construct snippets in my application.
Is it possible to accomplish my goal in one query?
I think that you should consider using the children aggregation. With that you can retrieve children items within their parents. It's aggregation so you are not able to get the whole document (just id) but with that you retrieve the relationship... Then with another query you can get document details quickly.
Link here : https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-children-aggregation.html
And more details : https://www.elastic.co/guide/en/elasticsearch/guide/current/children-agg.html

difference between source and fields

Whats the differenet between source and fields
acording to documentation they booth are used to listing fields which we want to from index.
Fields is best used for fields that are stored. When not stored it behaves similar to source.
So in case all the fields you want in result are all stored it would be faster filtering using "fields" instead of source.
Also fields can be used to get metadata fields if they are stored.
However one of the limitations of fields is that it can be used only to fetch leaf fields i.e it cannot be used on nested fields/object.
The following article in found provides a good explanation.

Elasticsearch indexed database table column structure

I have a question regarding the setup of my elasticsearch database index... I have created a table which I have rivered to index in elasticsearch. The table is built from a script that queries multiple tables to denormalize data making it easier to index by a unique id 1:1 ratio
An example of a set of fields I have is street, city, state, zip, which I can query on, but my question is , should I be keeping those fields individually indexed , or be concatenating them as one big field like address which contains all of the previous fields into one? Or be putting in the extra time to setup parent-child indexes?
The use case example is I have a customer with billing info coming from one direction, I want to query elasticsearch to see if that customer already exists, or at least return the closest result
I know this question is more conceptual than programming, I just can't find any information of best practices.
Concatenation
For the first part of your question: I wouldn't concatenate the different fields into a field containing all information. Having multiple fields gives you the advantage of calculating facets and aggregates on those fields, e.g. how many customers are from a specific city or have a specific zip. You can still use a match or multimatch query to query for information from different fields.
In addition to having the information in separate fields I would use multifields with an analyzed and not_analyzed part (fieldname.raw). This again allows for aggregates, facets and sorting.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/0.90/mapping-multi-field-type.html
Think of 'New York': if you analyze it will be stored as ['New', 'York'] and you will not be able to see all People from 'New York'. What you'd see are all people from 'New' and 'York'.
_all field
There is a special _all field in elasticsearch which does the concatenation in the background. You don't have to do it yourself. It is possible to enable/disable it.
Parent Child relationship
Concerning the part whether to use nested objects or parent child relationship: I think that using a parent child relationship is more appropriate for your case. Nested objects are stored in a 'flattened' way, i.e. the information from the nested objects in arrays is stored as being part of one object. Consider the following example:
You have an order for a client:
client: 'Samuel Thomson'
orderline: 'Strong Thinkpad'
orderline: 'Light Macbook'
client: 'Jay Rizzi'
orderline: 'Strong Macbook'
Using nested objects if you search for clients who ordered 'Strong Macbook' you'd get both clients. This because 'Samuel Thomson' and his orders are stored altogether, i.e. ['Strong' 'Thinkpad' 'Light' 'Macbook'], there is no distinction between the two orderlines.
By using parent child documents, the orderlines for the same client are not mixed and preserve their identity.

Query is not understandable - using field Fulltext search [Tags] = "foo"

I have a problem that only happens rarely with FT search. but once it happens it stays. I use the following search term in the FT search box in Lotus Notes
[Tags] = "foo"
in most application this search term work fine. but for some applications this search term gives the error "query is not understandable".
It does not matter if I replace the value, e.g [Tags] = "boo" produce the same result. and also FIELD Tags = "boo". for the record [Tag] = "foo" works fine so it seem be issues with the field or field name.
The problem only happens in some applications. Once this problem start happening no views can be searched using that search query and I get the error message everytime I search.
It does not help to remove, compact and re-create the FT index.
I get the same error in xpages when using the same search query in a view data source.
I have seen this problem using other fieldnames as well in other application.
If I remove the FT index the search query works
Creating a new copy of the "broken" database does not resolve the problem
I tried to have only one document in database, create a new FT index. the document in view does not have the field "Tags" still not working. (there are other forms in db with the fieldname "Tags")
This is a real show stopper for me as I have built some of my XPages based on search values from specific fields
In my own invstigation of this problem I think it has to do with some sort of bug in the FT index. There seem to be some data contained in documents or forms that causes the FT index to not work correctly.
I am looking for a solution to this problem as I have not found a way to repair it once it has become broken.
Update:
It does not help to follow this procedure
https://www-304.ibm.com/support/docview.wss?uid=swg21261002
Here is my debug info
[1078:0002-2250] IN FTGSearch
[1078:0002-2250] option = 0x400219
[1078:0002-2250] Query: ( FIELD Tags = "foo")
[1078:0002-2250] OUT FTGSearch error = F09
[1078:0002-2250] FTGSearch: found=0, returned=0, start=0, count=0, limit=0
It sounds like you need to fix the UNK table with a compact. Here is the listing of compact options, use a copy style not in place.
http://www-01.ibm.com/support/docview.wss?uid=swg21084388
If Tags field is sometimes numeric, I would advise looking at the database design. The UNK table is a table of all fields in the NSF. The first time a field name is used, it is stored in the UNK table as that data type. Full text searching uses that data type and only that data type. If you have a field Tags on more than one form in a database, once numeric and once text, you're in for big trouble with full text searches. The datatype in searches will depend on which datatype the field was on the first document saved which had that field. Even if you delete all documents that have it as numeric, you won't change the UNK table without the compact. Sounds like that's what you have here. Ensure the database never stores Tags as numeric. Delete or change all docs with it stored numeric. Then compact.
Thank you all for answering. I learned a whole lot about UNK tables and FT index today.
The problem was that I had a numeric field called "Tags" in a form that I hadn't looked at and really didn't think that it would contain a field by that name.
after using the DDE search I found all instances of the tags field and could eaily locate the problem form. I renamed the field in the form, removed the FT indx , used compact -c and recreated the ft index. now everythig is working fine.
One other thing to notice is that I have several databases with the same design but only a few of them had the ft index problem, the reason for this is probably because some of these databases was created after the form with the faulty Tags field was created.
I am so happy to have solved this.
lessons learned
If you plan to use fulltext index in your application. make sure you do not have the same field name in different forms and use different field types.
from now on I will probably use shared fields more :-)
One more thing we discovered
You actually do not need notes peek to find out which field tpe is stored in the UNK table. you can use the "Fields" button in the searchbar. if you select the field and the right hand box displays "contains" you know the unk table has a text field type set.

Lucene filter with docIds

I'm trying to do the following: I want to create a set of candidates by querying each field separately and then adding the top k matches to this set. After I'm done with that, I need to run another query on this candidate set.
The way how I implemented it right now is using a QueryWrapperFilter with a BooleanQuery that matches the unique id field of each candidate document. However, this means I have to call IndexSearcher.doc().get("docId") for each candidate document before I can add it to my BooleanQuery, which is the major bottleneck. I'm only loading the docId field via MapFieldSelector("docId).
I wanted to create my own Filter class, but I can't use the internal Lucene doc ids directly, because they are specified per segment. Any thoughts on how to approach this?
Instead of reading the stored docId, index the field (it probably already is) and use the FieldCache to retrieve docIds much faster. Then instead of using the docIds in a BooleanQuery, try using a TermsFilter or FieldCacheTermsFilter. The latter documentation describes the performance trade-offs.

Resources