I have a URL for my website, which is as follows: http://ashgavs.cloudant.com/site/_design/AshGavsCouch/main/index.html
I added a field to my design document called rewrites and it's as follows:
[
{
"from": "",
"to": "main/index.html"
}
]
However, when I go to this URL: http://ashgavs.cloudant.com/site/_design/AshGavsCouch/, the rewrite isn't happening. Am I doing it wrong? Is there a way to see where it's rewriting to so that I can debug this?
As mentioned in the docs, the default rewrite base is /site/_design/AshGavsCouch/_rewrite. If you want the rewrite to apply from /site/_design/AshGavsCouch/, then you need to specify that URL in your "from" field.
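For example, here is a minimal sketch of the design document field (assuming index.html is stored so that it is reachable at main/index.html relative to the design document):
{
"rewrites": [
{
"from": "/",
"to": "main/index.html"
}
]
}
With that in place, requesting http://ashgavs.cloudant.com/site/_design/AshGavsCouch/_rewrite/ should serve main/index.html.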
While developing a pipeline that will use Elasticsearch as a source, I ran into an issue with paging. I am using the Elasticsearch SQL API. Basically, I started by making the request in Postman, and it works well. The request body looks like the following:
{
"query":"SELECT Id,name,ownership,modifiedDate FROM \"core\" ORDER BY Id",
"fetch_size": 20,
"cursor" : ""
}
After the first run, the response body contains a cursor string, which is a pointer to the next page. If I send the request in Postman and provide the cursor value from the previous request, it returns the data for the second page, and so on. I am trying to achieve the same result in Azure Data Factory. For this I am using a copy activity, which stores the response to Azure Blob Storage. The setup for the source is the following:
[screenshot: copy activity source configuration]
This is the expression for the body:
{
"query": "SELECT Id,name,ownership,modifiedDate FROM \"#{variables('TableName')}\" ORDER BY Id",
"fetch_size": #{variables('Rows')},
"cursor": ""
}
I have no idea how to correctly set up the pagination rule. The pipeline works properly, but only for the first request. I've tried to set up Headers.cursor with the expression $.cursor, but this setup leads to an infinite loop, and the pipeline fails with the Elasticsearch restriction.
I've also tried to read the documentation at https://learn.microsoft.com/en-us/azure/data-factory/connector-rest#pagination-support, but it seems pretty limited in terms of usage examples and difficult to understand.
Could somebody help me understand how to build the pipeline with paging?
The response with the cursor looks like:
{
"columns": [
{
"name": "companyId",
"type": "integer"
},
{
"name": "name",
"type": "text"
},
{
"name": "ownership",
"type": "keyword"
},
{
"name": "modifiedDate",
"type": "datetime"
}
],
"rows": [
[
2,
"mic Inc.",
"manufacture",
"2021-03-31T12:57:51.000Z"
]
],
"cursor": "g/WuAwFaAXNoRG5GMVpYSjVWR2hsYmtabGRHTm9BZ0FBQUFBRUp6VGxGbUpIZWxWaVMzcGhVWEJITUhkbmJsRlhlUzFtWjNjQUFBQUFCQ2MwNWhaaVIzcFZZa3Q2WVZGd1J6QjNaMjVSVjNrdFptZDP/////DwQBZgljb21wYW55SWQBCWNvbXBhbnlJZAEHaW50ZWdlcgAAAAFmBG5hbWUBBG5hbWUBBHRleHQAAAABZglvd25lcnNoaXABCW93bmVyc2hpcAEHa2V5d29yZAEAAAFmDG1vZGlmaWVkRGF0ZQEMbW9kaWZpZWREYXRlAQhkYXRldGltZQEAAAEP"
}
I finally found the solution; hopefully it will be useful for the community.
Basically, what needs to be done is to split the solution into the following steps.
Step 1: Make the first request as in the question description and stage the file to blob storage.
Step 2: Read the blob file, get the cursor value, and set it to a variable (see the sketch after these steps).
Step 3: Keep requesting data with a changed body:
{"cursor" : "#{variables('cursor')}" }
The pipeline looks like this:
[screenshot: pipeline]
The pagination configuration looks like the following:
[screenshot: pagination configuration] It is a workaround, as the server ignores this header, but we need something that allows sending the request in a loop.
I'm trying to send this request using the Java API:
curl -XPUT 'http://localhost:9201/living/_alias/living_team' -d '
{
"routing": "living_team",
"filter": {
"term": {
"user": "living_team" //user property must exists
}
}
}'
Up to now, I've not been able to figure out how exactly to build the Java request:
this.elasticsearchResources.getElasticsearchClient()
.admin()
.indices()
.prepareAliases()
.addAlias(
ElasticsearchRepository.ELASTICSEARCH_INDEX,
alias,
QueryBuilders.termQuery("user", alias));
This line only creates an ADD ALIAS request with a filter; nevertheless, I don't know how to set the routing path...
How could I set the routing path up on request?
You mean how to set "http://localhost:9201/living/_alias/living_team" on your request?
P.S. I wrote this as an answer because I cannot comment yet.
Edited:
Hope this can help you
// Build an aliases request that adds the alias with explicit routing
IndicesAliasesRequest request = new IndicesAliasesRequest();
request.addAliasAction(new AliasAction(AliasAction.Type.ADD)
        .alias("the_alias")
        .index(index)
        .searchRouting("the_search_routing")  // routing used when searching through the alias
        .indexRouting("the_index_routing"));  // routing used when indexing through the alias
IndicesAliasesResponse response = elasticsearchClient.admin().indices().aliases(request).get();
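To also reproduce the term filter and routing from the original curl request in one action, something like the following sketch may work; the routing(...) and filter(...) setters are assumptions about what your client version exposes on AliasAction:
// Sketch: add the alias with routing and a term filter in one action.
// Assumes AliasAction.routing(String) and AliasAction.filter(String) exist in your client version.
IndicesAliasesRequest request = new IndicesAliasesRequest();
request.addAliasAction(new AliasAction(AliasAction.Type.ADD)
        .alias("living_team")                               // alias name
        .index("living")                                    // index the alias points to
        .routing("living_team")                             // sets both search and index routing
        .filter("{\"term\":{\"user\":\"living_team\"}}"));  // alias filter as a JSON string
elasticsearchClient.admin().indices().aliases(request).get();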
I have documents which at first contain only "url" (analyzed) and "respsize" (not_analyzed) fields. I want to update the documents that match a URL and add a new field, "category".
I mean, at first doc1 is:
{
"url":"http://stackoverflow.com/users/4005632/mehmet-yener-yilmaz",
"respsize":"500"
}
I have external data, and I know "stackoverflow.com" belongs to category 10,
and I need to update the doc to make it like:
{
"url":"http://stackoverflow.com/users/4005632/mehmet-yener-yilmaz",
"respsize":"500",
"category":"10"
}
Of course I will do this for all documents whose url field contains "stackoverflow.com",
and I need to update each doc only once, because the category of a URL is not changeable, so there is no need to update it again.
I think I need to use the _update API with the _version number to check this, but I can't compose the DSL query.
EDIT
I ran this and it looks like it works fine:
But the documents are not changed.
Although the query result looks correct, the new field is not added to the docs. Is a refresh or something else needed?
You could use the update by query plugin in order to do just that. The idea is to select all documents without a category whose url matches a certain string, and add the category you wish.
curl -XPOST 'localhost:9200/webproxylog/_update_by_query' -H "Content-Type: application/json" -d '
{
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{
"term": {
"url": "stackoverflow.com"
}
},
{
"missing": {
"field": "category"
}
}
]
}
}
}
},
"script" : "ctx._source.category = \"10\";"
}'
After running this, all your documents with url: stackoverflow.com that don't have a category will get category: 10. You can run the same query again later to fix new stackoverflow.com documents that have been indexed in the meantime.
Also make sure to enable scripting in elasticsearch.yml and restart ES:
script.inline: on
script.indexed: on
In the script, you're free to add as many fields as you want, e.g.
...
"script" : "ctx._source.category1 = \"10\"; ctx._source.category2 = \"20\";"
UPDATE
ES 2.3 now features the update by query functionality natively. You can still use the above query exactly as is and it will work (except that filtered and missing are deprecated, but still work ;).
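For reference, here is a sketch of the same call with the deprecated constructs replaced by a bool query with must_not/exists (ES 2.3-era syntax):
curl -XPOST 'localhost:9200/webproxylog/_update_by_query' -H "Content-Type: application/json" -d '
{
  "query": {
    "bool": {
      "must": [
        { "term": { "url": "stackoverflow.com" } }
      ],
      "must_not": [
        { "exists": { "field": "category" } }
      ]
    }
  },
  "script": { "inline": "ctx._source.category = \"10\"" }
}'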
That all sounds great, but just to add to @Val's answer: Update By Query is available from Elasticsearch 2.x, but not for earlier versions. In our case we're using 1.4 for legacy reasons, and there is no chance of upgrading in the foreseeable future, so another solution is using the update by query plugin provided here: https://github.com/yakaz/elasticsearch-action-updatebyquery
I am trying to do an application-specific place search with the Google Places API. Here is how I am adding a place:
Request:
{
"location": {
"lat": 37.760538,
"lng": -121.900879
},
"accuracy": 50,
"name": "p2p",
"types": ["other"]
}
I get success response as shown below:
Response:
{
"id" : "dfe583b1ac058750cf524f958afc5e82ade455d7",
"place_id" : "qgYvCi0wMDAwMDBhNWE4OWU4NTMzOjgwOGZlZTBhNjI3OjBjNTU1OTU4M2Q2NDI5YmM",
"reference" : "CkQxAAAAsPE72V-jhHUjj6vPy2HdC__2MhAdXanL6mlFBA4bcayRabKyMlfKFiah7U2vkoCj1P_0w9ESFSv5mfDkyufaZhIQTHBHY_jPGRHEE3EmEAGElhoUXTSylMslwHSTK5tYdstW2rOZKbw",
"scope" : "APP",
"status" : "OK"
}
When I search for this place using radar search, I get ZERO_RESULTS.
Request:
https://maps.googleapis.com/maps/api/place/radarsearch/json?key=key&radius=5000&location=37.761926,-121.891856&keyword=p2p
Response:
{
"html_attributions": [ ],
"results": [ ],
"status": "ZERO_RESULTS"
}
Is there something I am not doing the right way? Please help.
Thanks & Regards,
--Rajani
Your scope is "APP". That means you can access it (via place_id) only from the application that created the entry. If the location passes Google's moderation process, it will gain scope "GOOGLE" and become accessible from general searches.
scope — Indicates the scope of the place_id. The possible values are:
APP: The place ID is recognised by your application only. This is because your application added the place, and the place has not yet passed the moderation process.
GOOGLE: The place ID is available to other applications and on Google Maps.
Note: The scope field is included only in Nearby Search results and Place Details results. You can only retrieve app-scoped places via the Nearby Search and the Place Details requests. If the scope field is not present in a response, it is safe to assume the scope is GOOGLE.
See: https://developers.google.com/places/documentation/search
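Until the place passes moderation, you can still verify it from your own application by requesting Place Details with the place_id returned above (YOUR_API_KEY is a placeholder):
https://maps.googleapis.com/maps/api/place/details/json?placeid=qgYvCi0wMDAwMDBhNWE4OWU4NTMzOjgwOGZlZTBhNjI3OjBjNTU1OTU4M2Q2NDI5YmM&key=YOUR_API_KEY
The scope field in that response should read "APP" until moderation completes.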
I'm trying to create an entire site hosted purely on CouchDB (no nginx reverse proxy either) using a lot of client-side jQuery/AJAX magic. Now I'm in the process of making it SEO-friendly. I'm using vhosts and URL rewrites to route traffic from the root to my index.html file:
vhost:
example.com /dbname/_design/dd/_rewrite/
In my rewrite definition:
rewrites:[
{
"from": "/db/*",
"to": "/../../../*",
"query": {
}
},
{
"from": "/",
"to": "../../static/index.html",
"query": {
}
}
]
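If I'm reading the relative paths right, these rules should resolve roughly like this (a sketch; CouchDB resolves the "to" paths relative to /dbname/_design/dd):
http://example.com/         ->  ../../static/index.html  =>  /dbname/static/index.html
http://example.com/db/x     ->  /../../../x              =>  /x (three levels up, i.e. the server root)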
When optimizing a site for SEO, Google requires you to do a few things:
Use the hashbang (#!) in your friendly URL to tell the web crawler that you are an AJAX site with crawlable material: http://example.com/index.html#!home
Use an HTTP query argument to provide an HTML-escaped fragment of that AJAX page: http://example.com/index.html?_escaped_fragment=home
I tried the following with no luck:
rewrites:[
{
"from": "/db/*",
"to": "/../../../*",
"query": {
}
},
{
"from": "/",
"to": "../../static/index.html",
"query": {
}
}, /* FIRST ATTEMPT */
{
"from": "/?_escaped_fragment=:_escaped_fragment",
"to": "/_show/escaped_fragment/:_escaped_fragment",
"query": {
}
}, /* SECOND ATTEMPT */
{
"from": "/?_escaped_fragment=*",
"to": "/_show/escaped_fragment/*",
"query": {
}
}, /* THIRD ATTEMPT */
{
"from": "/",
"to": "/_show/escaped_fragment/:_escaped_fragment",
"query": {
}
}
]
From what I've seen, CouchDB's URL rewriter is not capable of distinguishing between URLs with query arguments and URLs without them. Has anyone had luck creating such a rule with CouchDB URL rewrites?
I don't have an answer to the question, but I've developed a solution for the bigger problem of making crawlable sites hosted on CouchDB. It is a system that makes use of Facebook's React, list and show functions, AJAX on the client, and window.history to render the same HTML components, filled with data, both at CouchDB and in the browser:
https://github.com/fiatjaf/reactive-couch
This solution doesn't need the hashbang, because for each unique URL the browser navigates to, using AJAX and window.history or simple links (be it _list/listName/viewName/_show/displayKind/c305ee4d-8611-4e08-b9d3-3318835632a9 or something rewritten as /name//kind/c305ee4d-8611-4e08-b9d3-3318835632a9), the server can render the pertinent content.