ElasticSearch NEST 5.6.1 Query for unit test - elasticsearch

I wrote a bunch of queries to elastic search and I wanted to write a unit test for them. using this post moq an elastic connection I was able to preform a general mocking. But When I tried to view the Json which is being generated from my query I didn't manage to get it in any way.
I tried to follow this post elsatic query moq, but it is relevant only to older versions of Nest because the method ConnectionStatus and RequestInformation is no longer available for an ISearchResponse object.
My test look as follow:
[TestMethod]
public void VerifyElasticFuncJson()
{
//Arrange
var elasticService = new Mock<IElasticService>();
var elasticClient = new Mock<IElasticClient>();
var clinet = new ElasticClient();
var searchResponse = new Mock<ISearchResponse<ElasticLog>>();
elasticService.Setup(es => es.GetConnection())
.Returns(elasticClient.Object);
elasticClient.Setup(ec => ec.Search(It.IsAny<Func<SearchDescriptor<ElasticLog>,
ISearchRequest>>())).
Returns(searchResponse.Object);
//Act
var service = new ElasticCusipInfoQuery(elasticService.Object);
var FindFunc = service.MatchCusip("CusipA", HostName.GSMSIMPAPPR01,
LogType.Serilog);
var con = GetConnection();
var search = con.Search<ElasticLog>(sd => sd
.Type(LogType.Serilog)
.Index("logstash-*")
.Query(q => q
.Bool(b => b
.Must(FindFunc)
)
)
);
**HERE I want to get the JSON** and assert it look as expected**
}
Is there any other way to achieve what I ask?

The best way to do this would be to use the InMemoryConnection to capture the request bytes and compare this to the expected JSON. This is what the unit tests for NEST do. Something like
private static void Main()
{
var pool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
var connectionSettings = new ConnectionSettings(pool, new InMemoryConnection())
.DefaultIndex("default")
.DisableDirectStreaming();
var client = new ElasticClient(connectionSettings);
// Act
var searchResponse = client.Search<Question>(s => s
.Query(q => (q
.Match(m => m
.Field(f => f.Title)
.Query("Kibana")
) || q
.Match(m => m
.Field(f => f.Title)
.Query("Elasticsearch")
.Boost(2)
)) && +q
.Range(t => t
.Field(f => f.Score)
.GreaterThan(0)
)
)
);
var actual = searchResponse.RequestJson();
var expected = new
{
query = new {
#bool = new {
must = new object[] {
new {
#bool = new {
should = new object[] {
new {
match = new {
title = new {
query = "Kibana"
}
}
},
new {
match = new {
title = new {
query = "Elasticsearch",
boost = 2d
}
}
}
},
}
},
new {
#bool = new {
filter = new [] {
new {
range = new {
score = new {
gt = 0d
}
}
}
}
}
}
}
}
}
};
// Assert
Console.WriteLine(JObject.DeepEquals(JToken.FromObject(expected), JToken.Parse(actual)));
}
public static class Extensions
{
public static string RequestJson(this IResponse response) =>
Encoding.UTF8.GetString(response.ApiCall.RequestBodyInBytes);
}
I've used an anonymous type for the expected JSON as it's easier to work with than an escaped JSON string.
One thing to note is that Json.NET's JObject.DeepEquals(...) will return true even when there are repeated object keys in a JSON object (so long as the last key/value matches). It's not likely something you'll encounter if you're only serializing NEST searches though, but something to be aware of.
If you're going to have many tests checking serialization, you'll want to create a single instance of ConnectionSettings and share with all, so that you can take advantage of the internal caches within it and your tests will run quicker than instantiating a new instance in each test.

Related

How to read distance from Elasticsearch response

I am using Elasticsearch V6, NEST V6.
I am searching ES as below and I am using ScriptFields to calculate the distance and include it the result.
var searchResponse = _elasticClient.Search<MyDocument>(new SearchRequest<MyDocument>
{
Query = new BoolQuery
{
Must = new QueryContainer[] { matchQuery },
Filter = new QueryContainer[] { filterQuery },
},
Source = new SourceFilter
{
Includes = resultFields // fields to be included in the result
},
ScriptFields = new ScriptField
{
Script = new InlineScript("doc['geoLocation'].planeDistance(params.lat, params.lng) * 0.001") // divide by 1000 to convert to km
{
Lang = "painless",
Params = new FluentDictionary<string, object>
{
{ "lat", _center.Latitude },
{ "lng", _center.Longitude }
}
}
}
});
Now, I am trying to read the search result and I am not sure how to read the distance from the response, this is what I have tried:
// this is how I read the Document, all OK here
var docs = searchResponse.Documents.ToList<MyDocument>();
// this is my attempt to read the distance from the result
var hits = searchResponse.Hits;
foreach (var h in hits)
{
var d = h.Fields["distance"];
// d is of type Nest.LazyDocument
// I am not sure how to get the distance value from object of type LazyDocument
}
While debugging I can see the distance value, I am just not sure how to read the value?
I found the answer here
To read search document and distance:
foreach (var hit in searchResponse.Hits)
{
MyDocument doc = hit.Source;
double distance = hit.Fields.Value<double>("distance");
}
And if you are only interested in distance:
foreach (var fieldValues in searchResponse.Fields)
{
var distance = fieldValues.Value<double>("distance");
}

Elasticsearch.net Index Settings + Analyzer

i can use elasticsearch 2.3.0 version with C# (Nest)
i want to use the analysis with index,
but index settings doesn't change and i don't know why.
Here's my code:
private void button1_Click_1(object sender, EventArgs e)
{
var conn = new Uri("http://localhost:9200");
var config = new ConnectionSettings(conn);
var client = new ElasticClient(config);
string server = cmb_Serv.Text.Trim();
if (server.Length > 0)
{
string ser = server;
string uid = util.getConfigValue("SetUid");
string pwd = util.getConfigValue("SetPwd");
string dbn = cmb_Db.Text;
string tbl = cmb_Tbl.Text;
setWorkDbConnection(ser, uid, pwd, dbn);
string query = util.getConfigValue("SelectMC");
query = query.Replace("###tbl###",tbl);
using (SqlCommand cmd1 = new SqlCommand())
{
using (SqlConnection con1 = new SqlConnection())
{
con1.ConnectionString = util.WorkConnectionString;
con1.Open();
cmd1.CommandTimeout = 0; cmd1.Connection = con1;
cmd1.CommandText = query;
int id_num =0;
SqlDataReader reader = cmd1.ExecuteReader();
while (reader.Read())
{ id_num++;
Console.Write("\r" + id_num);
var mc = new mc
{
Id = id_num,
code = reader[0].ToString(),
mainclass = reader[1].ToString().Trim()
};
client.Index(mc, idx => idx.Index("mctest_ilhee"));
client.Alias(x => x.Add(a => a.Alias("mcAlias").Index("mctest_ilhee")));
client.Map<mc>(d => d
.Properties(props => props
.String(s => s
.Name(p => p.mainclass)
.Name(p2 => p2.code).Index(FieldIndexOption.Analyzed).Analyzer("whitespace"))));
} reader.Dispose();
reader.Close();
}
IndexSettings Is = new IndexSettings();
Is.Analysis.Analyzers.Add("snowball", new SnowballAnalyzer());
Is.Analysis.Analyzers.Add("whitespace", new WhitespaceAnalyzer());
}
}
}
Well first of all you code is strange.
Why you are doing mapping in while? Do mapping only once.
Its impossible to help you because you are not even providing error you get. I would recomend to add simple debug method.
protected void ValidateResponse(IResponse response)
{
if (!response.IsValid ||
(response is IIndicesOperationResponse && !((IIndicesOperationResponse) response).Acknowledged))
{
var error = string.Format("Request to ES failed with error: {0} ", response.ServerError != null ? response.ServerError.Error : "Unknown");
var esRequest = string.Format("URL: {0}\n Method: {1}\n Request: {2}\n",
response.ConnectionStatus.RequestUrl,
response.ConnectionStatus.RequestMethod,
response.ConnectionStatus.Request != null
? Encoding.UTF8.GetString(response.ConnectionStatus.Request)
: string.Empty);
}
}
All requests such as client.Alias, client.Map returns status. So you can do
var result = client.Map<mc>(.....YOUR_CODE_HERE....)
ValidateResponse(result);
Then you will see two things, propper error which ES return + request which NEST sends to ES

Scroll example in ElasticSearch NEST API

I am using .From() and .Size() methods to retrieve all documents from Elastic Search results.
Below is sample example -
ISearchResponse<dynamic> bResponse = ObjElasticClient.Search<dynamic>(s => s.From(0).Size(25000).Index("accounts").AllTypes().Query(Query));
Recently i came across scroll feature of Elastic Search. This looks better approach than From() and Size() methods specifically to fetch large data.
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-scroll.html
I looking for example on Scroll feature in NEST API.
Can someone please provide NEST example?
Thanks,
Sameer
Here's an example of using scroll with NEST and C#. Works with 5.x and 6.x
public IEnumerable<T> GetAllDocumentsInIndex<T>(string indexName, string scrollTimeout = "2m", int scrollSize = 1000) where T : class
{
ISearchResponse<T> initialResponse = this.ElasticClient.Search<T>
(scr => scr.Index(indexName)
.From(0)
.Take(scrollSize)
.MatchAll()
.Scroll(scrollTimeout));
List<T> results = new List<T>();
if (!initialResponse.IsValid || string.IsNullOrEmpty(initialResponse.ScrollId))
throw new Exception(initialResponse.ServerError.Error.Reason);
if (initialResponse.Documents.Any())
results.AddRange(initialResponse.Documents);
string scrollid = initialResponse.ScrollId;
bool isScrollSetHasData = true;
while (isScrollSetHasData)
{
ISearchResponse<T> loopingResponse = this.ElasticClient.Scroll<T>(scrollTimeout, scrollid);
if (loopingResponse.IsValid)
{
results.AddRange(loopingResponse.Documents);
scrollid = loopingResponse.ScrollId;
}
isScrollSetHasData = loopingResponse.Documents.Any();
}
this.ElasticClient.ClearScroll(new ClearScrollRequest(scrollid));
return results;
}
It's from: http://telegraphrepaircompany.com/elasticsearch-nest-scroll-api-c/
Internal implementation of NEST Reindex uses scroll to move documents from one index to another.
It should be good starting point.
Below you can find interesting for you code from github.
var page = 0;
var searchResult = this.CurrentClient.Search<T>(
s => s
.Index(fromIndex)
.AllTypes()
.From(0)
.Size(size)
.Query(this._reindexDescriptor._QuerySelector ?? (q=>q.MatchAll()))
.SearchType(SearchType.Scan)
.Scroll(scroll)
);
if (searchResult.Total <= 0)
throw new ReindexException(searchResult.ConnectionStatus, "index " + fromIndex + " has no documents!");
IBulkResponse indexResult = null;
do
{
var result = searchResult;
searchResult = this.CurrentClient.Scroll<T>(s => s
.Scroll(scroll)
.ScrollId(result.ScrollId)
);
if (searchResult.Documents.HasAny())
indexResult = this.IndexSearchResults(searchResult, observer, toIndex, page);
page++;
} while (searchResult.IsValid && indexResult != null && indexResult.IsValid && searchResult.Documents.HasAny());
Also you can take a look at integration test for Scroll
[Test]
public void SearchTypeScan()
{
var scanResults = this.Client.Search<ElasticsearchProject>(s => s
.From(0)
.Size(1)
.MatchAll()
.Fields(f => f.Name)
.SearchType(SearchType.Scan)
.Scroll("2s")
);
Assert.True(scanResults.IsValid);
Assert.False(scanResults.FieldSelections.Any());
Assert.IsNotNullOrEmpty(scanResults.ScrollId);
var results = this.Client.Scroll<ElasticsearchProject>(s=>s
.Scroll("4s")
.ScrollId(scanResults.ScrollId)
);
var hitCount = results.Hits.Count();
while (results.FieldSelections.Any())
{
Assert.True(results.IsValid);
Assert.True(results.FieldSelections.Any());
Assert.IsNotNullOrEmpty(results.ScrollId);
var localResults = results;
results = this.Client.Scroll<ElasticsearchProject>(s=>s
.Scroll("4s")
.ScrollId(localResults.ScrollId));
hitCount += results.Hits.Count();
}
Assert.AreEqual(scanResults.Total, hitCount);
}
I took the liberty of rewriting the fine answer from Michael to async and a bit less verbose (v. 6.x Nest):
public async Task<IList<T>> RockAndScroll<T>(
string indexName,
string scrollTimeoutMinutes = "2m",
int scrollPageSize = 1000
) where T : class
{
var searchResponse = await this.ElasticClient.SearchAsync<T>(sd => sd
.Index(indexName)
.From(0)
.Take(scrollPageSize)
.MatchAll()
.Scroll(scrollTimeoutMinutes));
var results = new List<T>();
while (true)
{
if (!searchResponse.IsValid || string.IsNullOrEmpty(searchResponse.ScrollId))
throw new Exception($"Search error: {searchResponse.ServerError.Error.Reason}");
if (!searchResponse.Documents.Any())
break;
results.AddRange(searchResponse.Documents);
searchResponse = await ElasticClient.ScrollAsync<T>(scrollTimeoutMinutes, searchResponse.ScrollId);
}
await this.ElasticClient.ClearScrollAsync(new ClearScrollRequest(searchResponse.ScrollId));
return results;
}
I took Frederick's answer and made it an extension method that leverages IAsyncEnumerable:
// Adopted from https://stackoverflow.com/a/56261657/1072030
public static async IAsyncEnumerable<T> ScrollAllAsync<T>(
this IElasticClient elasticClient,
string indexName,
string scrollTimeoutMinutes = "2m",
int scrollPageSize = 1000,
[EnumeratorCancellation] CancellationToken ct = default
) where T : class
{
var searchResponse = await elasticClient.SearchAsync<T>(
sd => sd
.Index(indexName)
.From(0)
.Take(scrollPageSize)
.MatchAll()
.Scroll(scrollTimeoutMinutes),
ct);
try
{
while (true)
{
if (!searchResponse.IsValid || string.IsNullOrEmpty(searchResponse.ScrollId))
throw new Exception($"Search error: {searchResponse.ServerError.Error.Reason}");
if (!searchResponse.Documents.Any())
break;
foreach(var item in searchResponse.Documents)
{
yield return item;
}
searchResponse = await elasticClient.ScrollAsync<T>(scrollTimeoutMinutes, searchResponse.ScrollId, ct: ct);
}
}
finally
{
await elasticClient.ClearScrollAsync(new ClearScrollRequest(searchResponse.ScrollId), ct: ct);
}
}
I'm guessing we should really be using search_after instead of the scroll api, but meh.

elasticsearch nest stopword filter does not work

I am tiring to implement elasticsearch NEST client and indexing documents and SQL data and able to search these perfectly. But I am not able to apply stopwords on these records. Below is the code. Please note I put "abc" as my stopword.
public IndexSettings GetIndexSettings()
{
var stopTokenFilter = new StopTokenFilter();
string stopwordsfilePath = Convert.ToString(ConfigurationManager.AppSettings["Stopwords"]);
string[] stopwordsLines = System.IO.File.ReadAllLines(stopwordsfilePath);
List<string> words = new List<string>();
foreach (string line in stopwordsLines)
{
words.Add(line);
}
stopTokenFilter.Stopwords = words;
var settings = new IndexSettings { NumberOfReplicas = 0, NumberOfShards = 5 };
settings.Settings.Add("merge.policy.merge_factor", "10");
settings.Settings.Add("search.slowlog.threshold.fetch.warn", "1s");
settings.Analysis.Analyzers.Add("xyz", new StandardAnalyzer { StopWords = words });
settings.Analysis.Tokenizers.Add("keyword", new KeywordTokenizer());
settings.Analysis.Tokenizers.Add("standard", new StandardTokenizer());
settings.Analysis.TokenFilters.Add("standard", new StandardTokenFilter());
settings.Analysis.TokenFilters.Add("lowercase", new LowercaseTokenFilter());
settings.Analysis.TokenFilters.Add("stop", stopTokenFilter);
settings.Analysis.TokenFilters.Add("asciifolding", new AsciiFoldingTokenFilter());
settings.Analysis.TokenFilters.Add("word_delimiter", new WordDelimiterTokenFilter());
return settings;
}
public void CreateDocumentIndex(string indexName = null)
{
IndexSettings settings = GetIndexSettings();
if (!this.client.IndexExists(indexName).Exists)
{
this.client.CreateIndex(indexName, c => c
.InitializeUsing(settings)
.AddMapping<Document>
(m => m.Properties(ps => ps.Attachment
(a => a.Name(o => o.Documents)
.TitleField(t => t.Name(x => x.Name)
.TermVector(TermVectorOption.WithPositionsOffsets))))));
}
var r = this.client.GetIndexSettings(i => i.Index(indexName));
}
Indexing Data
var documents = GetDocuments();
documents.ForEach((document) =>
{
indexRepository.IndexData<Document>(document, DOCindexName, DOCtypeName);
});
public bool IndexData<T>(T data, string indexName = null, string mappingType = null)
where T : class, new()
{
if (client == null)
{
throw new ArgumentNullException("data");
}
var result = this.client.Index<T>(data, c => c.Index(indexName).Type(mappingType));
return result.IsValid;
}
In one of my document I have put a single line "abc" and I do not expect this to be returned as "abc" is in my stopword list. But On Searching Document It is also returning the above document. Below is the search query.
public IEnumerable<dynamic> GetAll(string queryTerm)
{
var queryResult = this.client.Search<dynamic>(d => d
.Analyzer("xyz")
.AllIndices()
.AllTypes()
.QueryString(queryTerm)).Documents;
return queryResult;
}
Please suggest where I am going wrong.

How can I create a MetadataWorkspace using metadata loading delegates?

I followed this example Changing schema name on runtime - Entity Framework where I can create a new EntityConnection from a MetaDataWorkspace that I then use to construct a DbContext with a different schema, but I get compiler warnings saying that RegisterItemCollection method is obsolete and to "Construct MetadataWorkspace using constructor that accepts metadata loading delegates."
How do I do that? Here is the code that is working but gives the 3 warnings for the RegsiterItemCollection calls. I'm surprised it works since warning says obsolete not just deprecated.
public static EntityConnection CreateEntityConnection(string schema, string connString, string model)
{
XmlReader[] conceptualReader = new XmlReader[]
{
XmlReader.Create(
Assembly
.GetExecutingAssembly()
.GetManifestResourceStream(model + ".csdl")
)
};
XmlReader[] mappingReader = new XmlReader[]
{
XmlReader.Create(
Assembly
.GetExecutingAssembly()
.GetManifestResourceStream(model + ".msl")
)
};
var storageReader = XmlReader.Create(
Assembly
.GetExecutingAssembly()
.GetManifestResourceStream(model + ".ssdl")
);
//XNamespace storageNS = "http://schemas.microsoft.com/ado/2009/02/edm/ssdl"; // this would not work!!!
XNamespace storageNS = "http://schemas.microsoft.com/ado/2009/11/edm/ssdl";
var storageXml = XElement.Load(storageReader);
foreach (var entitySet in storageXml.Descendants(storageNS + "EntitySet"))
{
var schemaAttribute = entitySet.Attributes("Schema").FirstOrDefault();
if (schemaAttribute != null)
{
schemaAttribute.SetValue(schema);
}
}
storageXml.CreateReader();
StoreItemCollection storageCollection =
new StoreItemCollection(
new XmlReader[] { storageXml.CreateReader() }
);
EdmItemCollection conceptualCollection = new EdmItemCollection(conceptualReader);
StorageMappingItemCollection mappingCollection =
new StorageMappingItemCollection(
conceptualCollection, storageCollection, mappingReader
);
//var workspace2 = new MetadataWorkspace(conceptualCollection, storageCollection, mappingCollection);
var workspace = new MetadataWorkspace();
workspace.RegisterItemCollection(conceptualCollection);
workspace.RegisterItemCollection(storageCollection);
workspace.RegisterItemCollection(mappingCollection);
var connectionData = new EntityConnectionStringBuilder(connString);
var connection = DbProviderFactories
.GetFactory(connectionData.Provider)
.CreateConnection();
connection.ConnectionString = connectionData.ProviderConnectionString;
return new EntityConnection(workspace, connection);
}
I was able to get rid of the 3 warning messages. Basically it wants you to register the collections in the constructor of the MetadataWorkspace.
There are 3 different overloads for MetadataWorkspace, I chose to use the one which requires to to supply a path (array of strings) to the workspace metadata. To do this I saved readers to temp files and reloaded them.
This is working for me without any warnings.
public static EntityConnection CreateEntityConnection(string schema, string connString, string model) {
var conceptualReader = XmlReader.Create(Assembly.GetExecutingAssembly().GetManifestResourceStream(model + ".csdl"));
var mappingReader = XmlReader.Create(Assembly.GetExecutingAssembly().GetManifestResourceStream(model + ".msl"));
var storageReader = XmlReader.Create(Assembly.GetExecutingAssembly().GetManifestResourceStream(model + ".ssdl"));
XNamespace storageNS = "http://schemas.microsoft.com/ado/2009/11/edm/ssdl";
var storageXml = XElement.Load(storageReader);
var conceptualXml = XElement.Load(conceptualReader);
var mappingXml = XElement.Load(mappingReader);
foreach (var entitySet in storageXml.Descendants(storageNS + "EntitySet")) {
var schemaAttribute = entitySet.Attributes("Schema").FirstOrDefault();
if (schemaAttribute != null) {
schemaAttribute.SetValue(schema);
}
}
storageXml.Save("temp.ssdl");
conceptualXml.Save("temp.csdl");
mappingXml.Save("temp.msl");
MetadataWorkspace workspace = new MetadataWorkspace(new List<String>(){
#"temp.csdl",
#"temp.ssdl",
#"temp.msl"
}
, new List<Assembly>());
var connectionData = new EntityConnectionStringBuilder(connString);
var connection = DbProviderFactories.GetFactory(connectionData.Provider).CreateConnection();
connection.ConnectionString = connectionData.ProviderConnectionString;
return new EntityConnection(workspace, connection);
}
Not wanting to create temp files which slows the process down, I found an alternate answer to this is fairly simple. I replaced these lines of code -
//var workspace2 = new MetadataWorkspace(conceptualCollection, storageCollection, mappingCollection);
var workspace = new MetadataWorkspace();
workspace.RegisterItemCollection(conceptualCollection);
workspace.RegisterItemCollection(storageCollection);
workspace.RegisterItemCollection(mappingCollection);
with this one line of code -
var workspace = new MetadataWorkspace(() => conceptualCollection, () => storageCollection, () => mappingCollection);
and that works fine.

Resources