Building Categories with Prismic? - graphql

I have an existing Gatsby/Prismic blog. I'm now trying to organize my content using categories. Does anyone know a good tutorial? There's literally just one piece of documentation that I've found, and it's not really helping. Does anyone know the steps to create and display categories using Gatsby and Prismic?

In "Prismic" add a new "Content Type" that is of "Repeatable Type" called "category".
To build the category structure you can use "Build Mode" or "JSON editor" on the right side. If you choose "JSON editor"
paste in .
{
  "Main" : {
    "name" : {
      "type" : "Text",
      "config" : {
        "label" : "name",
        "placeholder" : "Name of the category"
      }
    }
  }
}
...and save.
Now add new categories the same way you add new blog posts: click "New content", choose "category", enter a category name (e.g. "events"), then press "Save" and "Publish".
Now edit your blog post Custom Type. In the JSON editor, add a Meta section next to Main:
{
  "Main" : {
    // your existing blog post fields stay here
  },
  "Meta" : {
    "categories" : {
      "type" : "Group",
      "config" : {
        "fields" : {
          "category" : {
            "type" : "Link",
            "config" : {
              "select" : "document",
              "customtypes" : [ "category" ],
              "label" : "category",
              "placeholder" : "Category"
            }
          }
        },
        "label" : "Categories"
      }
    }
  }
}
Click on the old blog posts you want to add the category to. There should now be a Meta tab next to Main. When you click it you will see a category field which, when clicked, lists the categories you made. Select one.
Now you can filter your blog posts by category. How you do this is up to you; perhaps a query to get all the category names, which you put into a drop-down?
A good example is this Gatsby Starter https://github.com/LekoArts/gatsby-starter-prismic-i18n

Related

How do I combine different indexes in a "more like this" query?

The docs of the MLT query give the following example (abbreviated by me) to retrieve documents similar to an existing document:
"query": {
"more_like_this" : {
"fields" : ["title", "description"],
"like" : [
{
"_index" : "imdb",
"_id" : "1"
}],
"min_term_freq" : 1,
"max_query_terms" : 12
}
}
Which seems to compare the "title" and "description" fields among movie titles to the one movie with ID 1. Suppose I have an index for people's comments though and I would like to get all movie titles which have a "title" or "description" similar to one particular comment.
I know that I could provide free text as a value for the "like" field - the document (comment) is already part of another index though, so I would like to use that one. Just not based on the "title" and "description" fields (which would not exist on a comment), but let's say its "body" field. How would I do that?
You can add the same alias to both indexes (https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-aliases.html) and run the query against the alias.
Note: this will cause a higher load on your Elasticsearch cluster.
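As a rough sketch, the two request bodies for that approach could look like the dicts below. The alias name `movies-and-comments` and the `comments` index name are placeholders, not from the question, and adding the comment's `body` field to the MLT `fields` list is one way to make its terms participate in the comparison:

```python
# Sketch only: Elasticsearch request bodies built as plain dicts.
# "imdb", "comments", and "movies-and-comments" are placeholder names.

# POST _aliases -- put the same alias on both indexes.
alias_actions = {
    "actions": [
        {"add": {"index": "imdb", "alias": "movies-and-comments"}},
        {"add": {"index": "comments", "alias": "movies-and-comments"}},
    ]
}

# GET movies-and-comments/_search -- "like" points at a document in the
# comments index; MLT extracts terms from the listed fields, so the
# comment's "body" field is listed alongside the movie fields.
mlt_query = {
    "query": {
        "more_like_this": {
            "fields": ["title", "description", "body"],
            "like": [{"_index": "comments", "_id": "1"}],
            "min_term_freq": 1,
            "max_query_terms": 12,
        }
    }
}
```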

Aggregating Nested Fields in Kibana / Elasticsearch

I have defined an index in Elasticsearch 6:
PUT my_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "user": {
          "type": "nested"
        }
      }
    }
  }
}
and loaded some sample data as follows:
PUT my_index/_doc/1
{
  "group" : "coach",
  "user" : [
    {
      "first" : "John",
      "last" : "Frank"
    },
    {
      "first" : "Hero",
      "last" : "tim"
    }
  ]
}
PUT my_index/_doc/2
{
  "group" : "team",
  "user" : [
    {
      "first" : "John",
      "last" : "term"
    },
    {
      "first" : "david",
      "last" : "gayle"
    }
  ]
}
Now I am trying to search in the Discover page or the Visualize page, but I get blank results.
After a bit of trial and error and googling around, I found that Kibana does not support the nested type for aggregation and search out of the box. To enable this you must install a plugin, and the best plugin I found is listed below.
https://ppadovani.github.io/knql_plugin/overview/
The plugin provides all the features from the discover tab to the visualization tab.
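For completeness: the limitation is in Kibana, not Elasticsearch itself, which supports nested aggregations in its query DSL. A sketch of such a request body against the mapping above (the `.keyword` sub-field assumes the default dynamic mapping was applied to the `first` text field):

```python
# Sketch: body for GET my_index/_search that buckets users by first
# name inside the nested "user" objects. "user.first.keyword" assumes
# the default dynamic mapping created a keyword sub-field.
nested_agg = {
    "size": 0,  # we only want the aggregation, not the hits
    "aggs": {
        "users": {
            "nested": {"path": "user"},
            "aggs": {
                "first_names": {
                    "terms": {"field": "user.first.keyword"}
                }
            }
        }
    }
}
```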

How to find similar documents in Elasticsearch

My documents are made up of various fields. Now, given an input document, I want to find similar documents using the input document's fields. How can I achieve this?
{
  "query": {
    "more_like_this" : {
      "ids" : ["12345"],
      "fields" : ["field_1", "field_2"],
      "min_term_freq" : 1,
      "max_query_terms" : 12
    }
  }
}
With this you will get documents similar to ID 12345. Here you need to specify only the IDs and field names like title, category, name, etc., not their values.
Here is another way to do it without IDs, but then you need to specify fields with values. Example: get documents whose title is similar to:
elasticsearch is fast
{
  "query": {
    "more_like_this" : {
      "fields" : ["title"],
      "like" : "elasticsearch is fast",
      "min_term_freq" : 1,
      "max_query_terms" : 12
    }
  }
}
You can add more fields and their values.
You haven't mentioned the types of your fields. A general approach is to use a catch-all field (using copy_to) with the more_like_this query.
{
  "query": {
    "more_like_this" : {
      "fields" : ["first name", "last name", "address", "etc"],
      "like" : "your_query",
      "min_term_freq" : 1,
      "max_query_terms" : 12
    }
  }
}
Put everything in your_query. You can increase or decrease min_term_freq and max_query_terms.
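The copy_to part of that approach can be sketched as follows. The field names (`first_name`, `all_text`, etc.) are illustrative, and the typeless mapping syntax assumes a recent Elasticsearch version:

```python
# Sketch: a mapping that copies several fields into one catch-all
# field, and an MLT query that runs against only that field.
# All field names here are illustrative placeholders.
mapping = {
    "mappings": {
        "properties": {
            "first_name": {"type": "text", "copy_to": "all_text"},
            "last_name": {"type": "text", "copy_to": "all_text"},
            "address": {"type": "text", "copy_to": "all_text"},
            "all_text": {"type": "text"},
        }
    }
}

# The query then only needs to name the catch-all field.
mlt = {
    "query": {
        "more_like_this": {
            "fields": ["all_text"],
            "like": "your query text here",
            "min_term_freq": 1,
            "max_query_terms": 12,
        }
    }
}
```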

Elasticsearch highlight: how to get entire text of the field in Java client

I am new to Elasticsearch. I am hoping to get the highlighted field in the Java client. If I run the following query at the Windows prompt:
{
  "query": {
    "filtered" : {
      "query" : {
        "term" : {
          "title" : "western"
        }
      },
      "filter" : {
        "term" : { "year" : 1961 }
      }
    }
  },
  "highlight" : {
    "fields" : {
      "title" : {}
    }
  }
}
I get nice highlighted text as follows:
{
  "_index" : "book",
  "_type" : "history",
  "_id" : "1",
  "_score" : 0.095891505,
  "_source" : { "title": "All Quiet on the Western great Front", "year": 1961 },
  "highlight" : {
    "title" : [ "All Quiet on the <em>Western</em> great Front dead" ]
  }
}
The highlight section
"highlight" : {
  "title" : [ "All Quiet on the <em>Western</em> great Front dead" ]
}
can be easily converted into a Java Map object, and the "title" property contains the entire text of the matched field, which is really what I want.
However, in the Java client I get highlight fragments, which put different segments of the highlighted text of the same field into an array of strings.
Thanks and regards.
In the Java API the default number of fragments returned is 5, so if you only want one fragment to be returned you need to set that:
client.prepareSearch("book")
.setTypes("history")
.addHighlightedField("title")
.setQuery(query)
.setHighlighterFragmentSize(2000)
.setHighlighterNumOfFragments(1);
You may also set the number of fragments to 0 which will display the entire field with highlighting tags. This will also ignore fragment_size.
.setHighlighterNumOfFragments(0)
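For comparison, the REST equivalent of that last setting is a `number_of_fragments: 0` option in the highlight section of the search body (sketch only; the query is simplified from the one above):

```python
# Sketch: search body where number_of_fragments: 0 makes the
# highlighter return the whole "title" field as one highlighted
# string, ignoring fragment_size.
search_body = {
    "query": {"term": {"title": "western"}},
    "highlight": {
        "fields": {
            "title": {"number_of_fragments": 0}
        }
    }
}
```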
Here is what I found, and I am not sure whether it is the right or best solution. In the Java client, use the setHighlighterFragmentSize method:
SearchResponse sr = client.prepareSearch("book")
    .setTypes("history")
    .addHighlightedField("title")
    .setQuery(query)
    // set this larger than the size of the field so that only one
    // fragment is returned and it contains the entire text of the field
    .setHighlighterFragmentSize(2000)
    .execute().actionGet();
I really want to hear what experts out there say and choose their reply as the answer.
Regards.

How to index dump of html files to elasticsearch?

I am totally new to Elasticsearch, so my knowledge comes only from the Elasticsearch site, and I need some help.
My task is to index a large amount of raw data in HTML format into Elasticsearch. I have already crawled my data and stored it on disk (200,000 HTML files). My question is: what is the simplest way to index all the HTML files into Elasticsearch? Should I do it manually, making a PUT request to Elasticsearch for each document? For example:
curl -XPUT 'http://localhost:9200/registers/tomas/1' -d '{
  "user" : "tomasko",
  "post_date" : "2009-11-15T14:12:12",
  "field 1" : "field data",
  "field 2" : "field 2 data"
}'
My second question is whether I have to parse the HTML documents to retrieve the data for "field 1", as in the example above.
And finally, after indexing, may I delete all the HTML documents? Thanks for all.
I'd look at the bulk API, which allows you to send more than one document in a single request, in order to speed up your indexing process. You can send batches of 10, 20 or more documents, depending on how big they are.
Depending on what you want to index you might need to parse the html, unless you want to index the whole html as a single field (you might want to use the html strip char filter in that case to strip out the html tags from the indexed text).
After indexing, I'd suggest making sure the mapping is correct and that you can find what you're looking for. You can always reindex using the _source special field that Elasticsearch stores under the hood, but if you already wrote your indexer code you might want to use it again to reindex when needed (of course with the same HTML documents). In practice you never index your data just once, so be careful :) Even though Elasticsearch always helps you out with the _source field, reindexing is just a matter of querying the existing index and indexing all its documents into another index.
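To make the bulk idea concrete, here is a minimal sketch (not the asker's code) that turns a directory of HTML files into an NDJSON body for the _bulk endpoint, keeping the whole markup in a single field. The `html` field name is a placeholder, and the index/type names just mirror the curl example above:

```python
import json
from pathlib import Path

def build_bulk_payload(html_dir, index="registers", doc_type="tomas"):
    """Build an NDJSON _bulk body from a directory of HTML files.

    Each file becomes one document whose "html" field (a placeholder
    name) holds the raw markup.
    """
    lines = []
    for i, path in enumerate(sorted(Path(html_dir).glob("*.html")), start=1):
        # Action line, then source line -- the _bulk wire format.
        lines.append(json.dumps(
            {"index": {"_index": index, "_type": doc_type, "_id": str(i)}}))
        lines.append(json.dumps({"html": path.read_text(encoding="utf-8")}))
    return "\n".join(lines) + "\n"  # a _bulk body must end with a newline
```

You would then POST the returned string to `http://localhost:9200/_bulk` with `Content-Type: application/x-ndjson`, in batches rather than all 200,000 files at once.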
#javanna's suggestion to look at the Bulk API will definitely lead you in the right direction. If you are using NEST, you can store all your objects in a list, which you can then serialize into JSON objects for indexing the content.
Specifically, if you want to strip the html tags out prior to indexing and storing the content as is, you can use the mapper attachment plugin - in which when you define the mapping, you can categorize the content_type to be "html."
The mapper attachment is useful for many things especially if you are handling multiple document types, but most notably - I believe just using this for the purpose of stripping out the html tags is sufficient enough (which you cannot do with the html_strip char filter).
Just a forewarning though - NONE of the html tags will be stored. So if you do need those tags somehow, I would suggest defining another field to store the original content. Another note: You cannot specify multifields for mapper attachment documents, so you would need to store that outside of the mapper attachment document. See my working example below.
You'll end up with a mapping like this:
{
  "html5-es" : {
    "aliases" : { },
    "mappings" : {
      "document" : {
        "properties" : {
          "delete" : {
            "type" : "boolean"
          },
          "file" : {
            "type" : "attachment",
            "fields" : {
              "content" : {
                "type" : "string",
                "store" : true,
                "term_vector" : "with_positions_offsets",
                "analyzer" : "autocomplete"
              },
              "author" : {
                "type" : "string",
                "store" : true,
                "term_vector" : "with_positions_offsets"
              },
              "title" : {
                "type" : "string",
                "store" : true,
                "term_vector" : "with_positions_offsets",
                "analyzer" : "autocomplete"
              },
              "name" : {
                "type" : "string"
              },
              "date" : {
                "type" : "date",
                "format" : "strict_date_optional_time||epoch_millis"
              },
              "keywords" : {
                "type" : "string"
              },
              "content_type" : {
                "type" : "string"
              },
              "content_length" : {
                "type" : "integer"
              },
              "language" : {
                "type" : "string"
              }
            }
          },
          "hash_id" : {
            "type" : "string"
          },
          "path" : {
            "type" : "string"
          },
          "raw_content" : {
            "type" : "string",
            "store" : true,
            "term_vector" : "with_positions_offsets",
            "analyzer" : "raw"
          },
          "title" : {
            "type" : "string"
          }
        }
      }
    },
    "settings" : { /* insert your own settings here */ },
    "warmers" : { }
  }
}
In NEST, I assemble the content as follows:
Attachment attachment = new Attachment();
attachment.Content = Convert.ToBase64String(File.ReadAllBytes("path/to/document"));
attachment.ContentType = "html";
Document document = new Document();
document.File = attachment;
document.RawContent = InsertRawContentFromString(originalText);
I have tested this in Sense - results are as follows:
"file": {
"_content": "PGh0bWwgeG1sbnM6TWFkQ2FwPSJodHRwOi8vd3d3Lm1hZGNhcHNvZnR3YXJlLmNvbS9TY2hlbWFzL01hZENhcC54c2QiPg0KICA8aGVhZCAvPg0KICA8Ym9keT4NCiAgICA8aDE+VG9waWMxMDwvaDE+DQogICAgPHA+RGVsZXRlIHRoaXMgdGV4dCBhbmQgcmVwbGFjZSBpdCB3aXRoIHlvdXIgb3duIGNvbnRlbnQuIENoZWNrIHlvdXIgbWFpbGJveC48L3A+DQogICAgPHA+wqA8L3A+DQogICAgPHA+YXNkZjwvcD4NCiAgICA8cD7CoDwvcD4NCiAgICA8cD4xMDwvcD4NCiAgICA8cD7CoDwvcD4NCiAgICA8cD5MYXZlbmRlci48L3A+DQogICAgPHA+wqA8L3A+DQogICAgPHA+MTAvNiAxMjowMzwvcD4NCiAgICA8cD7CoDwvcD4NCiAgICA8cD41IDA5PC9wPg0KICAgIDxwPsKgPC9wPg0KICAgIDxwPjExIDQ3PC9wPg0KICAgIDxwPsKgPC9wPg0KICAgIDxwPkhhbGxvd2VlbiBpcyBpbiBPY3RvYmVyLjwvcD4NCiAgICA8cD7CoDwvcD4NCiAgICA8cD5qb2c8L3A+DQogIDwvYm9keT4NCjwvaHRtbD4=",
"_content_length": 0,
"_content_type": "html",
"_date": "0001-01-01T00:00:00",
"_title": "Topic10"
},
"delete": false,
"raw_content": "<h1>Topic10</h1><p>Delete this text and replace it with your own content. Check your mailbox.</p><p> </p><p>asdf</p><p> </p><p>10</p><p> </p><p>Lavender.</p><p> </p><p>10/6 12:03</p><p> </p><p>5 09</p><p> </p><p>11 47</p><p> </p><p>Halloween is in October.</p><p> </p><p>jog</p>"
},
"highlight": {
"file.content": [
"\n <em>Topic10</em>\n\n Delete this text and replace it with your own content. Check your mailbox.\n\n  \n\n asdf\n\n  \n\n 10\n\n  \n\n Lavender.\n\n  \n\n 10/6 12:03\n\n  \n\n 5 09\n\n  \n\n 11 47\n\n  \n\n Halloween is in October.\n\n  \n\n jog\n\n "
]
}
