Comparing a URL containing a UTF-8 encoded string with a plain string

I have a URL:
"/complete/search?nolabels=t&client=chrome&ds=yt&q=%D1%8D%D1%82%D0%BE%20%D1%80%D1%8D%D0%BF%20%D0%B8%D0%BC%D0%BF%D1%80&hl=ru"
and a list of keywords. I need to determine whether the URL contains any of those words.
I checked the string %D1%8D%D1%82%D0%BE%20%D1%80%D1%8D%D0%BF%20%D0%B8%D0%BC%D0%BF%D1%80 using a UTF-8 encoder, though I had to change % to \x for it to work.
I tried using a regular expression for this:
select "/complete/search?nolabels=t&client=chrome&ds=yt&q=%D1%8D%D1%82%D0%BE%20%D1%80%D1%8D%D0%BF%20%D0%B8%D0%BC%D0%BF%D1%80&hl=ru" rlike encode("это рэп импр", "UTF-8");
but rlike doesn't accept a binary as an argument, and base64(encode("это рэп импр", "UTF-8")) doesn't return the kind of string I need.
My question is: how do I turn это рэп импр into %D1%8D%D1%82%D0%BE%20%D1%80%D1%8D%D0%BF%20%D0%B8%D0%BC%D0%BF%D1%80?

I found this, which decodes the URL instead:
SELECT reflect("java.net.URLDecoder", "decode", "/complete/search?nolabels=t&client=chrome&ds=yt&q=%D1%8D%D1%82%D0%BE%20%D1%80%D1%8D%D0%BF%20%D0%B8%D0%BC%D0%BF%D1%80&hl=ru", "UTF-8");
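The conversion asked about is percent-encoding of the keyword's UTF-8 bytes. As a minimal sketch in Python rather than Hive (the variable names and the keyword list are made up for illustration), you can either encode the keyword before matching or decode the URL first, which is what the URLDecoder call does:

```python
from urllib.parse import quote, unquote

url = ("/complete/search?nolabels=t&client=chrome&ds=yt"
       "&q=%D1%8D%D1%82%D0%BE%20%D1%80%D1%8D%D0%BF%20%D0%B8%D0%BC%D0%BF%D1%80&hl=ru")
keywords = ["это рэп импр", "джаз"]  # hypothetical keyword list

# Option 1: percent-encode the keyword (UTF-8 bytes, space becomes %20).
encoded = quote("это рэп импр")
found_encoded = encoded in url

# Option 2: decode the URL once, then search for the plain keywords.
decoded_url = unquote(url)
matches = [kw for kw in keywords if kw in decoded_url]
```

Decoding the URL once is probably the more robust of the two, since the same text can be escaped in more than one way (a space may arrive as %20 or as +, and hex digits may be lower case).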

Related

Ruby string conversion from one format to another for a Solr query

solr_query_string= '{!term f=f_type_of_work tag=item_type}Thesis", "{!term f=f_type_of_work tag=item_type}Dataset'
How can I convert this string to look like:
solr_query_string='f_type_of_work:(\"Thesis\") OR f_type_of_work:(\"Dataset\")'
The field in "{!term f=f_type_of_work}" varies.
It's not always f_type_of_work.
Sometimes the string contains more than one field, e.g. f=f_type_of_funder.
I want to OR together only the clauses for the same field, i.e.:
solr_query_string= '{!term f=computed__contains_full_text}true","{!term f=f_type_of_work tag=item_type}Thesis", "{!term f=f_type_of_work tag=item_type}Dataset"'
How can I convert this string to look like:
solr_query_string='computed__contains_full_text:(\"true\")","f_type_of_work:(\"Thesis\") OR f_type_of_work:(\"Dataset\")'
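One possible approach, sketched here in Python rather than Ruby, is to pull out each (field, value) pair with a regular expression, group the values by field, and join each group with OR. The function name `to_solr_query` is hypothetical, and the sketch assumes values never contain a comma or a double quote:

```python
import re

# Matches "{!term f=<field> ...}<value>"; an optional "tag=..." part is skipped.
TERM = re.compile(r'\{!term f=(\w+)[^}]*\}([^",]+)')

def to_solr_query(raw):
    """Return one clause per field, OR-ing the values that share a field."""
    grouped = {}  # dicts keep insertion order, so field order is preserved
    for field, value in TERM.findall(raw):
        grouped.setdefault(field, []).append(value.strip())
    return [
        " OR ".join(f'{field}:("{v}")' for v in values)
        for field, values in grouped.items()
    ]

single = to_solr_query(
    '{!term f=f_type_of_work tag=item_type}Thesis", '
    '"{!term f=f_type_of_work tag=item_type}Dataset'
)
mixed = to_solr_query(
    '{!term f=computed__contains_full_text}true",'
    '"{!term f=f_type_of_work tag=item_type}Thesis", '
    '"{!term f=f_type_of_work tag=item_type}Dataset"'
)
```

The result is a list with one clause per field, which can then be joined with whatever separator the query needs. The same pattern should carry over to Ruby using String#scan and a hash of arrays.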

How to make Get Request with Request param in Postman

I have created an endpoint that accepts a string in its request param
@GetMapping(value = "/validate")
private void validateExpression(@RequestParam(value = "expression") String expression) {
    System.out.println(expression);
    // code to validate the input string
}
When I send this request from Postman:
https://localhost:8443/validate?expression=Y07607=Curr_month:Y07606/Curr_month:Y07608
(let's say this is a valid input), the console displays:
Y07607=Curr_month:Y07606/Curr_month:Y07608 Valid
But when I send:
https://localhost:8443/validate?expression=Y07607=Curr_month:Y07606+Curr_month:Y07608
(which is also a valid input), the console displays:
Y07607=Curr_month:Y07606 Curr_month:Y07608 Invalid
I don't understand why "+" is not accepted in the parameter.
The "+" vanishes before it reaches the API. Why?
I suggest adding this regular expression to your code to handle the '+' character:
@GetMapping(value = "/validate")
private void validateExpression(@RequestParam(value = "expression:.+") String expression) {
    System.out.println(expression);
    // code to validate the input string
}
I didn't find a server-side solution, but the reason is that + is a special character in a URL: in a query string it is the escape for a space. That's why + is replaced with " ", i.e. a space.
So apparently I have to encode it in my front end.
It's wise to encode special characters in a URL, such as \ or :.
For +, the encoded value is %2B. You can read more about URL encoding here. This is actually the preferred method, because these special characters can sometimes cause unintended behavior; / or =, for example, can mean something else in the URL.
You need not worry about manually decoding it on the backend, because in most cases and frameworks it is decoded automatically. In your case, I assume you are using Spring Boot, so you don't need to worry about decoding.
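The behaviour is not specific to Spring. As a small illustration in Python (standing in for both the client and the server; the variable names are made up), a standard query-string parser turns a raw + into a space, while a value that was percent-encoded first survives the round trip:

```python
from urllib.parse import parse_qs, quote

expression = "Y07607=Curr_month:Y07606+Curr_month:Y07608"

# Sent raw, the "+" in the query string is decoded as a space.
raw = parse_qs("expression=" + expression)["expression"][0]

# Percent-encoding the value first ("+" becomes %2B) lets it survive decoding.
encoded = quote(expression, safe="")
round_trip = parse_qs("expression=" + encoded)["expression"][0]
```

In Postman, the equivalent would be to send the already-encoded value (with %2B in place of +) in the request URL.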

MimeUtility.DecodeText() returns ? characters

"=?utf-8?B?MDJf5rW35aSW5oug54K5X0ZZMTXjgrfjgrnjg4bjg6DlgZzmraLlpInm?= =?utf-8?B?m7TkvZzmpa3kuojlrppfMjAxNTA1MDdWMS5wcHR4?="
I am trying to decode the above string using the MimeUtility.DecodeText method. The resulting string is:
02_海外拠点_FY15システム停止変���作業予定_20150507V1.pptx
I expect the string to be:
02_海外拠点_FY15システム停止変更作業予定_20150507V1.pptx
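A likely cause, judging only from the bytes in the header above, is that the sender split one multi-byte character across the two encoded-words: the first payload ends with the byte E6 and the second begins with 9B B4, which together form 更. A decoder that converts each encoded-word to text on its own cannot recover that character. The sketch below reproduces this in Python; it illustrates the byte-level problem and is not the MimeUtility implementation:

```python
import base64

# The two Base64 payloads from the encoded-words in the header above.
parts = [
    "MDJf5rW35aSW5oug54K5X0ZZMTXjgrfjgrnjg4bjg6DlgZzmraLlpInm",
    "m7TkvZzmpa3kuojlrppfMjAxNTA1MDdWMS5wcHR4",
]
chunks = [base64.b64decode(p) for p in parts]

# Decoding each encoded-word separately breaks the character that straddles them.
separate = "".join(c.decode("utf-8", errors="replace") for c in chunks)

# Concatenating the raw bytes first and decoding once restores it.
joined = b"".join(chunks).decode("utf-8")
```

If the library cannot be told to merge adjacent encoded-words before charset decoding, a workaround along these lines (decode the Base64 payloads, join the bytes, then decode as UTF-8) may be needed.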

How to return localized content from WebAPI? Strings work but not numbers

Given this ApiController:
public string TestString() {
    return "The value is: " + 1.23;
}
public double TestDouble() {
    return 1.23;
}
With the browser's language set to "fr-FR", the following happens:
/apiController/TestString yields
<string xmlns="http://schemas.microsoft.com/2003/10/Serialization/">The value is: 1,23</string>
/apiController/TestDouble yields
<double xmlns="http://schemas.microsoft.com/2003/10/Serialization/">1.23</double>
I would expect TestDouble() to yield 1,23 in the XML. Can anyone explain why this isn't the case and, more importantly, how to make it so that it does?
This is because the conversion from double to string happens at a different stage for each API. For TestString, double.ToString() converts the number to a string using the CurrentCulture of the current thread, and this happens when the TestString method is called. The double returned by TestDouble, by contrast, is converted to a string during the serialization step, which uses GlobalConfiguration.Configuration.Formatters.JsonFormatter.SerializerSettings.Culture.
In my opinion, both should use InvariantCulture. On the consumer side, the values can then be parsed and formatted with the correct culture.
Update: this setting applies only to JsonFormatter. XmlFormatter doesn't have such a setting.
Update 2:
It seems (decimal) numbers need a special converter to be culture-aware; see "Handling decimal values in Newtonsoft.Json".
By the way, if you want to change the data format per action/request, you can try the last piece of code at the following link: http://tostring.it/2012/07/18/customize-json-result-in-web-api/

Prevent Go's json.Marshal from coercing string to valid UTF-8 unicode?

I'm trying to POST some JSON to a server using Go where special characters like ampersand must exist intact in the JSON string I'm sending.
http://play.golang.org/p/bPt9kl88-y
package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	type TransactionStatement struct {
		Query string
	}
	statement := &TransactionStatement{
		Query: "my query with this & that",
	}
	data, _ := json.Marshal(statement)
	fmt.Printf("> %v", string(data))
}
Go's json.Marshal converts certain characters like '&' to Unicode escapes, which is great, as the documentation explains:
String values encode as JSON strings coerced to valid UTF-8, replacing invalid bytes with the Unicode replacement rune. The angle brackets "<" and ">" are escaped to "\u003c" and "\u003e" to keep some browsers from misinterpreting JSON output as HTML. Ampersand "&" is also escaped to "\u0026" for the same reason.
It may sound ridiculous, but the JSON I'm sending is for a database query using a RESTful endpoint, where this literal string will be used to perform a free-text database query, so I can't use Unicode escapes since they are interpreted literally.
So is there any way to prevent this conversion to Unicode escapes? Or will I just have to undo these replacements after the marshaling is done?
Thanks
You can turn off this escaping if you wish, using Encoder.SetEscapeHTML:
https://golang.org/pkg/encoding/json/#Encoder.SetEscapeHTML
https://play.golang.org/p/gfy1WaDrEIH
I just realized after posting that other characters are being encoded as Unicode escapes as well, and those are indeed working.
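That observation fits how JSON works: an escape such as \u0026 is only a different spelling of the same character, so a conforming parser on the receiving side should produce the identical string. A quick check, sketched in Python rather than Go (the two documents below are written by hand to mirror the escaped and unescaped output):

```python
import json

escaped = '{"Query":"my query with this \\u0026 that"}'  # escaped form
literal = '{"Query":"my query with this & that"}'        # unescaped form

# Both documents decode to the same value.
same = json.loads(escaped) == json.loads(literal)
query = json.loads(escaped)["Query"]
```

Disabling the escaping therefore only matters when the receiver treats the request body as raw text instead of parsing it as JSON.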

Resources