I am trying a Solr query which is like this
+field1:* AND (field2:1 OR field2:10) NOT(field3:value1 OR field3:value2)
But the field3 part of the query is not having any effect. It still returns records that have value1 or value2 in field3.
Why is this?
Try this
+field1:* +(field2:1 OR field2:10) -(field3:value1 OR field3:value2)
I think an AND or OR is missing between the last two blocks. The query would then become something like:
+field1:* AND (field2:1 OR field2:10) AND NOT(field3:value1 OR field3:value2)
You need to URL-encode certain characters in the Solr query so that the request forms a valid URL, and the + (plus) symbol is one of them, as are spaces, brackets, etc.
Things to encode are:
Space => +
+ => %2B
( => %28
) => %29
and so forth. You can see an example of an encoded URL on the Solr wiki:
https://wiki.apache.org/solr/SolrQuerySyntax
Try:
str_replace(array('+','(',')',' '), array('%2B','%28','%29','+'), '+field1:* +(field2:1 field2:10) -(field3:value1 field3:value2)');
This should give you:
%2Bfield1:*+%2B%28field2:1+field2:10%29+-%28field3:value1+field3:value2%29
If your default query parser operator is set to OR, then any space between clauses will be interpreted as an OR operator.
The above result is far from clean and readable, but it is a correctly URL-encoded string, which is what Solr requires you to pass to it. You'll notice the difference as soon as you run it.
Why str_replace instead of urlencode? You could use urlencode, since it will also percent-encode the string correctly, but it may encode some components (such as : and *) that don't need to be encoded.
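For comparison, the same encoding can be reproduced outside PHP. Here is a quick Python sketch (an illustration only; urllib.parse.quote_plus stands in for the str_replace calls, with : and * marked as safe so they pass through unescaped):

```python
from urllib.parse import quote_plus

query = '+field1:* +(field2:1 field2:10) -(field3:value1 field3:value2)'
# quote_plus percent-encodes reserved characters and turns spaces into '+';
# ':' and '*' are listed as safe so they are left untouched.
encoded = quote_plus(query, safe=':*')
print(encoded)
# %2Bfield1:*+%2B%28field2:1+field2:10%29+-%28field3:value1+field3:value2%29
```

The output matches the string shown below: + becomes %2B, parentheses become %28/%29, and spaces become +.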
Related
I am trying to filter rows with: name eq 'nameFromQueryData'
One name is breaking the filter: Paul O'Meefe
It's due to the filter needing to escape the ' inside the name. It's being read as: name eq 'Paul O' (end of string) Meefe' (start of string) = error.
Since this is dynamic data, I can't just manually insert an escape character.
So how would one go about escaping it?
Try using FetchXML instead - it's a lot more adaptable and forgiving than an OData query in Power Automate.
We know that to replace a word we can use the REPLACE function, like below:
RELATION = FOREACH data GENERATE REPLACE(string,'a','b');
The above statement replaces all 'a' characters with 'b'.
But what if I want to replace the dollar sign ($)? In Pig, '$' indicates a column number, so I can't use it directly. For example, I want to strip the '$' from a string like '$1234.56' to get '1234.56':
RELATION = FOREACH data GENERATE REPLACE(string,'$','');
But this does not work for me.
Can anyone please help? Thanks in advance.
Using Unicode:
REPLACE(string,'\u0024','')
It can be helpful to look at how string regexes work in Java, for instance: https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html
In your particular case, you can use the following:
REPLACE(string, '[$]', '')
For increased flexibility (when dealing with other currency symbols, for instance), it might be a good idea to remove all non-numeric characters except '.'. In that case use:
REPLACE(string, '[^\\d.]', '')
This worked for me (triple backslashes):
REPLACE(string,'\\\$','')
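Pig's REPLACE delegates to Java's String.replaceAll, so the second argument is a Java regular expression. The patterns above can be sanity-checked in any engine with similar syntax; a quick Python sketch (Python's re is close enough to Java regex for these simple patterns):

```python
import re

s = '$1234.56'
# '$' inside a character class loses its special meaning (end-of-string anchor)
assert re.sub(r'[$]', '', s) == '1234.56'
# Backslash-escaping also works: '\$' is a literal dollar sign
assert re.sub(r'\$', '', s) == '1234.56'
# Strip everything that is not a digit or a dot, e.g. for other currencies
assert re.sub(r'[^\d.]', '', '€1,234.56') == '1234.56'
```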
I would like to extract a line of strings but am having difficulties using the correct RegEx. Any help would be appreciated.
String to extract: KSEA 122053Z 21008KT 10SM FEW020 SCT250 17/08 A3044 RMK AO2 SLP313 T01720083 50005
For some reason Stack Overflow won't let me cut and paste the XML data here since it includes "<>" characters. Basically I am trying to extract the data between the raw_text ... /raw_text tags from an XML document that will always be formatted like the following: http://www.aviationweather.gov/adds/dataserver_current/httpparam?dataSource=metars&requestType=retrieve&format=xml&hoursBeforeNow=3&mostRecent=true&stationString=PHNL%20KSEA
However, the Station name, in this case "KSEA" will not always be the same. It will change based on user input into a search variable.
Thanks In advance
If I can assume that every string you want starts with KSEA, then the answer would be:
.*(KSEA.*?)KSEA.*
Using ? lets .* match as little as possible (non-greedy matching).
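Since the station name varies with user input, it is usually more robust to anchor on the raw_text tags themselves rather than on the station code. A minimal Python sketch (assuming the element really is named raw_text as described; for production, an XML parser would be safer than a regex):

```python
import re

xml = ("<METAR><raw_text>KSEA 122053Z 21008KT 10SM FEW020 SCT250 "
       "17/08 A3044 RMK AO2 SLP313 T01720083 50005</raw_text></METAR>")

# Non-greedy .*? stops at the first closing tag, regardless of the station
match = re.search(r'<raw_text>(.*?)</raw_text>', xml)
if match:
    print(match.group(1))  # KSEA 122053Z 21008KT ...
```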
I hope there is no post limit since I have posted more than once today. :-P
Now I have a table in Oracle SQL. I noticed there are some useless characters and want to delete them. The way I do it is to replace all of them. Below is my table and my query.
Here is my query:
SELECT
CASE WHEN WORD IN ('!', '"', '#','""') Then ''
ELSE WORD END
FROM TERM_FREQUENCY;
It is not giving me an error, but these special characters are not going away either... Any thoughts?
A little typo of yours: you used - instead of _:
SELECT
CASE WHEN WORD IN ('!', '"', '#','""') Then ''
ELSE WORD END
-- FROM TERM-FREQUENCY; --This is where the problem is.
FROM TERM_FREQUENCY; -- Because your table is named TERM _ FREQUENCY
You originally tagged your question with 'replace' but then didn't use that function in your code. You're comparing each whole word to those fixed strings, not checking whether it contains any of them.
You can either use nested replace calls to remove one character at a time:
select replace(replace(word, '!', null), '"', null) from ...
... which would be tedious and rely on you identifying every character you didn't want; or you could use a regular expression to keep only alphabetic characters, which I suspect is what you're really after:
select regexp_replace(word, '[^[:alpha:]]', null) from ...
Quick demo.
You might also want to use lower or upper to get everything into the same case, as you probably don't really want to count different capitalisation differently either.
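The POSIX class [:alpha:] used by Oracle's regexp_replace matches alphabetic characters; for a quick check outside the database, the same idea can be tried in Python (using [^A-Za-z] as an ASCII approximation of [^[:alpha:]]):

```python
import re

words = ['hello!', '"world"', '#foo', 'Bar"']
# Strip every character that is not a letter, mirroring
# regexp_replace(word, '[^[:alpha:]]', null) in Oracle
cleaned = [re.sub(r'[^A-Za-z]', '', w) for w in words]
print(cleaned)
# ['hello', 'world', 'foo', 'Bar']
```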
I am using following cts query for the search in MarkLogic
cts:element-word-query(xs:QName('c:l10n'),'\*\漢\*',('wildcarded','case-insensitive','whitespace-sensitive'))
It is not giving any result although there exists some data in the database with "\漢" words.
already tried:
It works fine with English characters like \r, \n or /r, /n.
Also, it gives the expected result if I use only \ or 漢, but it always shows 0 results whenever I use \ together with any Chinese character.
It is possible that there is a tokenization bug here, but it is hard to tell.
What you (should) have here is a phrase query for "\" (wildcard word), "\", "漢", "\", and (wildcard word), in that order. It is punctuation-sensitive. Do you have an example of some content you think should match?
What does the query plan show you? What are your index settings?