I want to use a custom Analyzer as the default. Searching the Hibernate Search documentation, I saw that it is possible to change it in the Hibernate configuration, in particular with the property "hibernate.search.analyzer". So I set this property:
<property name="hibernate.search.analyzer">Class of Analyzer </property>
My question is: how can I create an analyzer class to pass to this property?
In particular, I want to use "EdgeNGram". I tried to pass the EdgeNGram tokenizer factory, but it does not work.
<property name="hibernate.search.analyzer">EdgeNGramTokenizerFactory.class</property>
Can you show me an example of a class that I can pass to this property? Thanks
EDIT: Hibernate Search 6+ users, what follows is mostly irrelevant to you. Read this section of the documentation instead.
First, let me warn you that the default analyzer should generally be a general-purpose one, one that may not be great, but is good enough for most fields. As new requirements are added to your application, you are very unlikely to be able to use the same analyzer everywhere, and will ultimately have to use specific analyzers in at least some of your index fields. That's why I personally prefer to use org.apache.lucene.analysis.core.KeywordAnalyzer as a default, and specify an analyzer wherever I need one.
EDIT: With Hibernate Search 6 this advice (using a keyword analyzer by default) has become less relevant, since keyword fields and full-text fields are clearly separated. Still, that's good advice for Hibernate Search 5.
Now, you've been warned: using an EdgeNGramTokenizerFactory for your default analyzer is probably a bad idea. If you still want to do it, read on...
The default analyzer doesn't have to be a class. It can be the fully qualified name of a class, but here you want a custom analyzer, and writing your own analyzer class can be complex, so if you're not used to Lucene I wouldn't recommend it.
What you can do instead is use the name of a named analyzer defined using an @AnalyzerDef annotation or an analysis definition provider. These definitions use "off the shelf" analysis components and assemble them into a fully-fledged analyzer, which is much easier to do.
So, for example, you can define this class, which is not an analyzer class, but rather a class that provides analyzer definitions:
package com.acme.search;
import org.apache.lucene.analysis.miscellaneous.ASCIIFoldingFilterFactory;
import org.apache.lucene.analysis.core.LowerCaseFilterFactory;
import org.apache.lucene.analysis.ngram.EdgeNGramTokenizerFactory;
import org.hibernate.search.analyzer.definition.LuceneAnalysisDefinitionProvider;
import org.hibernate.search.analyzer.definition.LuceneAnalysisDefinitionRegistryBuilder;
public class CustomAnalyzerProvider implements LuceneAnalysisDefinitionProvider {
@Override
public void register(LuceneAnalysisDefinitionRegistryBuilder builder) {
builder
.analyzer( "myAnalyzer" )
.tokenizer( EdgeNGramTokenizerFactory.class )
.param( "minGramSize", "1" )
.param( "maxGramSize", "5" )
.tokenFilter( ASCIIFoldingFilterFactory.class )
.tokenFilter( LowerCaseFilterFactory.class );
}
}
Then define the following properties in your persistence.xml:
<property name="hibernate.search.lucene.analysis_definition_provider">com.acme.search.CustomAnalyzerProvider</property>
<property name="hibernate.search.analyzer">myAnalyzer</property>
And you should be good to go.
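To get a feel for what this analyzer chain emits, here is a minimal plain-Java sketch. It does not use Lucene at all: the class and method names are hypothetical, and the accent stripping is only an approximation of what ASCIIFoldingFilterFactory does. It assumes the tokenizer produces the leading prefixes of the whole input, with lengths from minGramSize to maxGramSize.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; this is not how Lucene implements its analysis chain.
public class EdgeNGramSketch {

    // Approximates ASCII folding: decompose accented characters, then drop the combining marks.
    public static String foldToAscii(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}", "");
    }

    // Returns the leading prefixes of the folded, lowercased input whose length
    // lies between minGram and maxGram (both inclusive).
    public static List<String> edgeNGrams(String input, int minGram, int maxGram) {
        String normalized = foldToAscii(input).toLowerCase(Locale.ROOT);
        List<String> grams = new ArrayList<>();
        int longest = Math.min(maxGram, normalized.length());
        for (int size = minGram; size <= longest; size++) {
            grams.add(normalized.substring(0, size));
        }
        return grams;
    }

    public static void main(String[] args) {
        // Same parameters as "myAnalyzer": minGramSize = 1, maxGramSize = 5.
        System.out.println(edgeNGrams("H\u00e9llo World", 1, 5)); // prints [h, he, hel, hell, hello]
    }
}
```

Every field indexed with such an analyzer stores only these short prefixes, which is handy for autocomplete but is also why it makes a questionable default for all fields.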
EDIT: If you use the Elasticsearch integration, then 1) using a custom Lucene Analyzer class will never work and 2) you need to do this to define named analyzers instead:
Define this class, which is not an analyzer class, but rather a class that provides analyzer definitions:
package com.acme.search;
import org.hibernate.search.elasticsearch.analyzer.definition.ElasticsearchAnalysisDefinitionProvider;
import org.hibernate.search.elasticsearch.analyzer.definition.ElasticsearchAnalysisDefinitionRegistryBuilder;
public class CustomAnalyzerProvider implements ElasticsearchAnalysisDefinitionProvider {
@Override
public void register(ElasticsearchAnalysisDefinitionRegistryBuilder builder) {
builder.analyzer( "myAnalyzer" )
.withTokenizer( "myEdgeNgram" )
.withTokenFilters( "asciifolding", "lowercase" );
builder.tokenizer( "myEdgeNgram" )
.type( "edge_ngram" )
.param( "min_gram", "1" )
.param( "max_gram", "5" );
}
}
Then define the following properties in your persistence.xml (note the properties are different from my example with Lucene):
<property name="hibernate.search.elasticsearch.analysis_definition_provider">com.acme.search.CustomAnalyzerProvider</property>
<property name="hibernate.search.analyzer">myAnalyzer</property>
If you need more information, the documentation might help.
I am trying to map a JSON object to a Spring Boot model class. The contract says that one property may only take a small set of allowed values (no more than 3).
Example:
Suppose the JSON has a field "name" and the contract says the allowed values for "name" are john, todd and phil.
Anything other than john, todd or phil won't be accepted.
Is there any way to achieve this constraint using annotations?
You can use either of the following solutions.
Solution 1:
Use the @Pattern annotation with a regex; if you want the match to be case-insensitive, add the appropriate flag:
@Pattern(regexp = "john|todd|phil", flags = Pattern.Flag.CASE_INSENSITIVE)
Solution 2:
Create an enum type with the allowed values:
public enum Name { // the enum name is illustrative; the original snippet omitted it
JOHN, TODD, PHIL
}
In your model class, use @Enumerated(EnumType.STRING) on the name field.
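The two checks can be sketched in plain Java, without any validation framework, to show what each one accepts. This is only an illustration: the class, enum and method names are hypothetical, and it assumes (as far as I know) that @Pattern requires the whole value to match the regex. Unlike the annotation-based approach, this sketch simply rejects null.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical stand-alone illustration of the two solutions above.
public class AllowedNameSketch {

    // Hypothetical enum name; the constants mirror the allowed values from the question.
    public enum AllowedName { JOHN, TODD, PHIL }

    private static final Pattern ALLOWED =
            Pattern.compile("john|todd|phil", Pattern.CASE_INSENSITIVE);

    // Solution 1: the whole value must match the regex, ignoring case.
    public static boolean matchesPattern(String value) {
        return value != null && ALLOWED.matcher(value).matches();
    }

    // Solution 2: the value must name one of the enum constants, ignoring case.
    public static boolean isEnumConstant(String value) {
        if (value == null) {
            return false;
        }
        try {
            AllowedName.valueOf(value.toUpperCase(Locale.ROOT));
            return true;
        } catch (IllegalArgumentException e) {
            return false; // not one of JOHN, TODD, PHIL
        }
    }

    public static void main(String[] args) {
        System.out.println(matchesPattern("Todd"));   // prints true
        System.out.println(matchesPattern("johnny")); // prints false
        System.out.println(isEnumConstant("phil"));   // prints true
        System.out.println(isEnumConstant("bob"));    // prints false
    }
}
```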
I'm trying to use this library for an existing application that already has records in Cassandra, and its table columns are written in snake case. Is there a way to have my phantom model objects map to snake case?
Just override the column name:
abstract class User extends Table[...] {
object lastName extends StringColumn {
override val name: String = "last_name"
}
}
Have a look here, at the official documentation.
import com.outworkers.phantom.NamingStrategy.SnakeCase.caseSensitive
import com.outworkers.phantom.NamingStrategy.SnakeCase.caseInsensitive
You need to import one of these in every file where you define tables. There is also a test suite that might be of help; the tests are here.
I wrote the following object type class.
public class ResponseType<T> : ObjectType<ResponseEntry<T>>
{
protected override void Configure(IObjectTypeDescriptor<ResponseEntry<T>> descriptor)
{
descriptor.Name("Response");
}
}
I want to use it like this as the outermost type in the resolver definition.
descriptor.Field<SharedResolvers>(r => r.GetObject1(default, default, default, default))
.Type<ResponseType<ListType<Object1>>>()
.Name("object1");
descriptor.Field<SharedResolvers>(r => r.GetObject2(default, default, default, default))
.Type<ResponseType<ListType<Object2>>>()
.Name("object2");
This code works if I only implement object1; however, as soon as I add object2, I get the following error.
System.Collections.Generic.KeyNotFoundException: 'The given key 'HotChocolate.Configuration.RegisteredType' was not present in the dictionary.'
It seems as though there may be some issue with declaring two resolvers of the same class type. Is that the case? And if so, what are my options?
I was able to resolve the issue by setting the descriptor.Name to a unique value based on T.
descriptor.Name($"Response_{typeof(T).GetHashCode()}");
Then I realized my real issue was that I was defining the name at all. If you don't override the name it automatically comes up with a unique name/key based on the type definition.
In my model I set the @TextIndexed annotation to add a field to the full-text index of MongoDB:
@TextIndexed
private String descriptionShort;
This works so far.
But how can I set the default_language to "De" for the index?
I noticed that the language is automatically set by Spring when a language property is found on the model entity.
At least the behaviour pointed to this conclusion.
However, I did not find any docs on this.
My model has no language property at this point so I wonder how to achieve this?
According to the unit tests, the default language can be set on the class through the @Document annotation. There is also a section in the reference documentation. Basically, using the same code as in the unit test:
@Document(language = "german")
static class TextIndexedDocumentRoot {
@TextIndexed String textIndexedPropertyWithDefaultWeight;
@TextIndexed(weight = 5) String textIndexedPropertyWithWeight;
TextIndexedDocumentWihtLanguageOverride nestedDocument;
}
static class TextIndexedDocumentWihtLanguageOverride {
@Language String lang;
@TextIndexed String textIndexedPropertyInNestedDocument;
String nonTextIndexedProperty;
}
Note that the @Language annotation there serves as the language_override setting. It would take effect within the "sub-document" with the default field name of "language" anyway, and it is a common pattern for enabling multi-language support, with phrases in different languages stored in the same document.
Also note the language can be "german" or "de" as the ISO code, or anything that is supported on the Text Search Languages as listed in the documentation. Other options are available in the Enterprise Edition only.
I'd like to use a 128-bit UUID rather than Long for the id field on all of my Grails domains. I'd rather not have to specify all of the mapping information on every domain. Is there a simple way to achieve this in a generic/global way? I'm using Grails 2.3.x, the Hibernate 3.6.10.2 plugin, the Database Migration Plugin 1.3.8, and Oracle 11g (11.2.0.2.0).
There seem to be a number of questions related to this, but none provide complete, accurate, and up-to-date answers that actually work.
Related Questions
What's the best way to define custom id generation as default in Grails?
grails using uuid as id and mapping to to binary column
Configuring Grails/Hibernate/Postgres for UUID
Problems mapping UUID in JPA/hibernate
Custom 16 digit ID Generator in Grails Domain
Using UUID and RAW(16)
If you want to use a UUID in your Grails domain and a RAW(16) in your database, you'll need to add the following.
1. For every domain, specify the id field. Here's an example using ExampleDomain.groovy:
class ExampleDomain {
UUID id
}
2. Add the following mapping to Config.groovy:
grails.gorm.default.mapping = {
id(generator: "uuid2", type: "uuid-binary", length: 16)
}
For details on the three values I've selected, please see these links.
http://docs.jboss.org/hibernate/core/3.6/reference/en-US/html/mapping.html#d0e5294
http://docs.jboss.org/hibernate/core/3.6/reference/en-US/html/types.html#types-basic-value-uuid
How should I store a GUID in Oracle?
3. Add a custom dialect to your data source entry in DataSource.groovy. If you are using Hibernate 4.0.0.CR5 or higher, you can skip this step.
dataSource {
// Other configuration values removed for brevity
dialect = com.example.hibernate.dialect.BinaryAwareOracle10gDialect
}
4. Implement the custom dialect you referenced in step #3. Here is BinaryAwareOracle10gDialect implemented in Java. If you are using Hibernate 4.0.0.CR5 or higher, you can skip this step.
package com.example.hibernate.dialect;
import java.sql.Types;
import org.hibernate.dialect.Oracle10gDialect;
public class BinaryAwareOracle10gDialect extends Oracle10gDialect {
@Override
protected void registerLargeObjectTypeMappings() {
super.registerLargeObjectTypeMappings();
registerColumnType(Types.BINARY, 2000, "raw($l)");
registerColumnType(Types.BINARY, "long raw");
}
}
For more information about this change, please see the related Hibernate defect https://hibernate.atlassian.net/browse/HHH-6188.
Using UUID and VARCHAR2(36)
If you want to use a UUID in your Grails domain and a VARCHAR2(36) in your database, you'll need to add the following.
1. For every domain, specify the id field. Here's an example using ExampleDomain.groovy:
class ExampleDomain {
UUID id
}
2. Add the following mapping to Config.groovy:
grails.gorm.default.mapping = {
id(generator: "uuid2", type: "uuid-char", length: 36)
}
For details on the three values, please see the links in step #2 from the previous section.
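The two length values used in the mappings (16 and 36) follow from the size of a UUID in binary and in text form. The plain-Java sketch below illustrates this. The class and method names are hypothetical, and the byte order (most significant bits first) is an assumption made for illustration, not a statement about how Hibernate's uuid-binary type lays out the bytes.

```java
import java.nio.ByteBuffer;
import java.util.UUID;

// Hypothetical illustration of why RAW(16) and VARCHAR2(36) are large enough.
public class UuidSizeSketch {

    // Packs the 128 bits of a UUID into 16 bytes, most significant bits first (assumed order).
    public static byte[] toBytes(UUID uuid) {
        return ByteBuffer.allocate(16)
                .putLong(uuid.getMostSignificantBits())
                .putLong(uuid.getLeastSignificantBits())
                .array();
    }

    // Rebuilds the UUID from the 16 bytes produced by toBytes.
    public static UUID fromBytes(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        long mostSignificant = buffer.getLong();
        long leastSignificant = buffer.getLong();
        return new UUID(mostSignificant, leastSignificant);
    }

    public static void main(String[] args) {
        UUID id = UUID.fromString("123e4567-e89b-12d3-a456-426614174000");
        System.out.println(toBytes(id).length);                // prints 16
        System.out.println(id.toString().length());            // prints 36
        System.out.println(fromBytes(toBytes(id)).equals(id)); // prints true
    }
}
```

The binary form is less than half the size of the text form, at the cost of being harder to read when querying the table by hand.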
I think there is an easy way:
String id = UUID.randomUUID().toString()
static mapping = {
id generator:'assigned'
}