How to implement substring function? - botframework

I used the substring function in Composer, and the output should be the number extracted from the given value. Instead, the condition falls through to the false branch.
I followed the steps in the docs, but I am not getting the expected output. It would be very helpful if somebody could provide a solution.
I have also tried "= substring('${dialog.Str}', 4, 7)".
I want only the integer part, 1234, to be stored in a property.
Thanks.

Use it like this: ${substring(dialog.str, 4, 4)}. You will get the answer 1234.
The signature is substring(string, start index, length); the third argument is the number of characters to take, not the end index.
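To illustrate the (string, start index, length) convention outside Composer, here is a minimal Python sketch. The helper name and the sample value "abc-1234-xyz" are hypothetical; the original post does not show what dialog.Str actually holds.

```python
def substring(text: str, start: int, length: int) -> str:
    """Mimic a (string, start index, length) substring call with a slice."""
    return text[start:start + length]


sample = "abc-1234-xyz"  # hypothetical input, not taken from the original post

number = int(substring(sample, 4, 4))
print(number)  # 1234

# Passing 7 as the third argument takes seven characters, not "up to index 7".
print(substring(sample, 4, 7))  # 1234-xy
```

If the third argument is mistaken for an end index, the result contains extra characters and no longer parses as a number.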


Return the count of courses using satisfies contains XQuery

I have been struggling to return the count of courses in this XML file whose description contains "Cross-listed". The problem is that, because I am using for, the query iterates and gives me "1 1" instead of "2". When I try let instead, I get 13, which means it counts every course without applying the condition, even when I write return count($c["Cross-listed"]). What am I doing wrong, and how can I fix it? Thanks in advance.
for $c in doc("courses.xml")//Department/Course
where some $desc in $c/Description
satisfies contains($desc, "Cross-listed")
return count($c)
"The problem I encounter is because I am using for"
You are quite correct. You don't need to process items individually in order to count them.
You've made things much more difficult than they need to be. You want:
count(doc("courses.xml")//Department/Course[Description[contains(., "Cross-listed")]])
The key thing here is: you want a count, so call the count() function, and give it an argument which selects the set of things you want to include in the count.
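The same "select the set, then count it once" idea can be sketched in Python with the standard library. The XML below is a made-up stand-in for courses.xml, which is not shown in the question, so the element layout is an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the real courses.xml is not shown in the question.
SAMPLE = """
<Catalog>
  <Department>
    <Course><Description>Cross-listed as MATH 101</Description></Course>
    <Course><Description>Introductory course</Description></Course>
    <Course><Description>Advanced topics. Cross-listed with CS 200</Description></Course>
  </Department>
</Catalog>
""".strip()

root = ET.fromstring(SAMPLE)

# Build the filtered set first, then count it in one step,
# rather than counting inside a per-course loop.
cross_listed = [
    course
    for course in root.findall(".//Department/Course")
    if any(
        "Cross-listed" in "".join(desc.itertext())
        for desc in course.findall("Description")
    )
]
print(len(cross_listed))  # 2
```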

Accessing from the end of a split array in Hive

I need to split a tag that looks something like "B1/AHU/_1/RoomTemp" or "B1/AHU/_1/109/Temp", so it has a variable number of fields. I am interested in getting the last field, or sometimes the last but one. I was disappointed to find that negative indexes in Hive do not count from the right and let me select the last element of an array, as they do in Python.
select tag,split(tag,'[/]')[ -1] from sensor
I was more surprised when this did not work either:
select tag,split(tag,'[/]')[ size(split(tag,'[/]'))-1 ] from sensor
Both times giving me an error along the lines of this:
FAILED: SemanticException 1:27 Non-constant expressions for array indexes not supported.
Error encountered near token '1'
So, any ideas? I am fairly new to Hive. Regexes, maybe? Or is there some syntactic sugar I am not aware of?
This question is getting a lot of views (over a thousand now), so I think it needs a proper answer. In the event I solved it with this:
select tag,reverse(split(reverse(tag),'[/]')[0]) from sensor
which is not actually stated in the other suggested answers - I got the idea from a suggestion in the comments.
This:
reverses the string (so "abcd/efgh" is now "hgfe/dcba")
splits it on "/" into an array (so we have "hgfe" and "dcba")
extracts the first element (which is "hgfe")
then finally re-reverses (giving us the desired "efgh")
Also note that the second-to-last element can be retrieved by substituting 1 for the 0, and so on for the others.
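The reverse, split, index, reverse steps above can be traced in a short Python sketch. The function name is hypothetical, and Python is used here only to make each intermediate value visible; in Python itself a negative index would do the job directly.

```python
def field_from_end(tag: str, index_from_end: int = 0) -> str:
    """Return a "/"-separated field counted from the right (0 = last)."""
    reversed_tag = tag[::-1]              # "abcd/efgh" -> "hgfe/dcba"
    parts = reversed_tag.split("/")       # ["hgfe", "dcba"]
    return parts[index_from_end][::-1]    # "hgfe" -> "efgh"


print(field_from_end("abcd/efgh"))               # efgh
print(field_from_end("B1/AHU/_1/RoomTemp"))      # RoomTemp
print(field_from_end("B1/AHU/_1/109/Temp", 1))   # 109
```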
There is a great library of Hive UDFs here. One of them is LastIndexUDF(). It is pretty self-explanatory: it retrieves the last element of an array. There are instructions for building and using the jar on the main page. Hope this helps.
This seems to work for me; it returns the last element of the SPLIT array:
SELECT SPLIT(INPUT__FILE__NAME,'/')[SIZE(SPLIT(INPUT__FILE__NAME,'/')) -1 ] from test_table limit 10;
After reading the LanguageManual UDF page for a while, I found that the function substring_index meets your requirement exactly and doesn't need any additional calculations.
The manual says:
substring_index(string A, string delim, int count) returns the substring from string A before count occurrences of the delimiter delim (as of Hive 1.3.0). If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned. Substring_index performs a case-sensitive match when searching for delim. Example: substring_index('www.apache.org', '.', 2) = 'www.apache'.
Use cases:
SELECT SUBSTRING_INDEX('www.mysql.com', '.', 2);
--www.mysql
SELECT SUBSTRING_INDEX('www.mysql.com', '.', -1);
--com
See here for more information.
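To make the positive and negative count behaviour concrete, here is a small Python emulation of the semantics quoted from the manual. It is only a sketch for reasoning about results, not Hive's implementation; returning an empty string for a count of 0 or an empty delimiter is an assumption.

```python
def substring_index(text: str, delim: str, count: int) -> str:
    """Emulate the substring_index(A, delim, count) semantics quoted above."""
    if count == 0 or delim == "":
        return ""  # assumed behaviour for these edge cases
    parts = text.split(delim)
    if count > 0:
        # Everything to the left of the count-th delimiter, counting from the left.
        return delim.join(parts[:count])
    # Everything to the right of the count-th delimiter, counting from the right.
    return delim.join(parts[count:])


print(substring_index("www.apache.org", ".", 2))        # www.apache
print(substring_index("www.mysql.com", ".", -1))        # com
print(substring_index("B1/AHU/_1/109/Temp", "/", -1))   # Temp
```

For the tag in the question, a count of -1 yields the last field directly, with no array indexing involved.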

Ruby: find what was replaced with .sub

Given the code
thisString.sub! /piece\_[0-9]{1,4}.ts/, "piece_#{i}.ts"
where thisString could be anything from piece_1.ts to piece_9999.ts
How can I get the number that was replaced?
You can use a numbered capturing group:
thisString.sub! /piece\_([0-9]{1,4}).ts/, "piece_#{i}.ts"
Now $1 will give you the value as a string and you can do $1.to_i.
You can read more about how this works here.
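For comparison, the same capturing-group idea can be sketched in Python with the re module. The sample string and the value of i are hypothetical, and the dot before ts is escaped here so that it only matches a literal period.

```python
import re

PIECE_RE = re.compile(r"piece_([0-9]{1,4})\.ts")

this_string = "segment/piece_42.ts"  # hypothetical input
i = 7                                # hypothetical replacement number

# Capture the old number first, then perform the substitution.
match = PIECE_RE.search(this_string)
replaced_number = int(match.group(1)) if match else None
new_string = PIECE_RE.sub(f"piece_{i}.ts", this_string, count=1)

print(replaced_number)  # 42
print(new_string)       # segment/piece_7.ts
```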
As a stopgap, I ended up getting the number first, then doing the sub:
foundStartPiece = Integer(firstline[/piece\_[0-9]{1,4}.ts/][/[0-9]{1,4}/]) # Returns the number as an integer
Edit: After that, I did move to using @ndn's answer.

Python3 Make tie-breaking lambda sort more pythonic?

As an exercise in python lambdas (just so I can learn how to use them more properly) I gave myself an assignment to sort some strings based on something other than their natural string order.
I scraped the Apache archive for version number strings and then came up with a lambda to sort them based on numbers I extracted with regexes. It works, but I think it can be better; I just don't know how to improve it so it's more robust.
from lxml import html
import requests
import re
# Send GET request to page and parse it into a list of html links
jmeter_archive_url='https://archive.apache.org/dist/jmeter/binaries/'
jmeter_archive_get=requests.get(url=jmeter_archive_url)
page_tree=html.fromstring(jmeter_archive_get.text)
list_of_links=page_tree.xpath('//a[@href]/text()')
# Filter out all the non-md5s. There are a lot of links, and ultimately
# it's more data than needed for this exercise
jmeter_md5_list=list(filter(lambda x: x.endswith('.tgz.md5'), list_of_links))
# Here's where the 'magic' happens. We use two different regexes to rip the first
# and then the second number out of the string and turn them into integers. We
# then return them in the order we grabbed them, allowing us to tie break.
jmeter_md5_list.sort(key=lambda val: (int(re.search(r'(\d+)\.\d+', val).group(1)), int(re.search(r'\d+\.(\d+)', val).group(1))))
print(jmeter_md5_list)
This does have the desired effect. The output is:
['jakarta-jmeter-2.5.1.tgz.md5', 'apache-jmeter-2.6.tgz.md5', 'apache-jmeter-2.7.tgz.md5', 'apache-jmeter-2.8.tgz.md5', 'apache-jmeter-2.9.tgz.md5', 'apache-jmeter-2.10.tgz.md5', 'apache-jmeter-2.11.tgz.md5', 'apache-jmeter-2.12.tgz.md5', 'apache-jmeter-2.13.tgz.md5']
So we can see that the strings are sorted into an order that makes sense. Lowest version first and highest version last. Immediate problems that I see with my solution are two-fold.
First, we have to create two different regexes to get the numbers we want instead of just capturing groups 1 and 2. Mainly because I know there are no multiline lambdas, I don't know how to reuse a single regex object instead of creating a second.
Secondly, this only works as long as the version numbers are two numbers separated by a single period. The first element is 2.5.1, which is sorted into the correct place but the current method wouldn't know how to tie break for 2.5.2, or 2.5.3, or for any string with an arbitrary number of version points.
So it works, but there's got to be a better way to do it. How can I improve this?
This is not a full answer, but it will get you far along the road to one.
The return value of the key function can be a tuple, and tuples sort naturally. You want the output from the key function to be:
((2, 5, 1), 'jakarta-jmeter')
((2, 6), 'apache-jmeter')
etc.
Do note that this is a poor use case for a lambda regardless.
Originally, I came up with this:
jmeter_md5_list.sort(key=lambda val: list(map(int, re.compile(r'(\d+(?!$))').findall(val))))
However, based on Ignacio Vazquez-Abrams's answer, I made the following changes.
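def sortable_key_from_string(value):
    # Collect the digit runs, skipping a digit at the very end of the string (the "5" in "md5")
    version_tuple = tuple(map(int, re.compile(r'(\d+(?!$))').findall(value)))
    # Leading non-digit text, e.g. "apache-jmeter-"
    match = re.match(r'^(\D+)', value)
    version_name = ''
    if match:
        version_name = match.group(1)
    return (version_tuple, version_name)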
and this:
jmeter_md5_list.sort(key=sortable_key_from_string)
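As a self-contained variation on the same tuple-key idea, the sketch below uses a single compiled regex to grab the whole dotted version in one match, which handles any number of version components. The names VERSION_RE and version_key are hypothetical, and the 2.5.2 entry is invented to show the tie-break; it is not from the scraped list.

```python
import re

# One regex: the first dotted run of numbers, e.g. "2.5.1" or "2.10".
VERSION_RE = re.compile(r"\d+(?:\.\d+)*")


def version_key(name: str):
    """Return ((major, minor, ...), prefix) so that tuples sort naturally."""
    match = VERSION_RE.search(name)
    if match is None:
        return ((), name)
    version = tuple(int(part) for part in match.group(0).split("."))
    return (version, name[:match.start()])


names = [
    "apache-jmeter-2.10.tgz.md5",
    "apache-jmeter-2.6.tgz.md5",
    "apache-jmeter-2.5.2.tgz.md5",   # hypothetical entry, for the tie-break
    "jakarta-jmeter-2.5.1.tgz.md5",
]
names.sort(key=version_key)
print(names)
```

Because the version is compared as a tuple of integers, 2.10 sorts after 2.6, and 2.5.2 sorts after 2.5.1 without any extra regex.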

Excel - Search an exact match within a string

I'm currently struggling to find a formula that will solve my problem.
Here's the status quo:
In Sheet 1, column A, I have a set of strings, such as:
/search.action?gender=men&brand=10177&tag=10203&tag=10336
/search.action?gender=women&brand=11579&tag=10001&tag=10138
/search.action?gender=men&brand=12815&tag=10203&tag=10299
/search.action?gender=women&brand=1396&tag=10203&tag=10513
/search.action?gender=women&brand=11&tag=10001&tag=10073
/search.action?gender=women&brand=1396&tag=10203&tag=10336
/search.action?gender=women&brand=13
In Sheet 2, column A, I have a set of strings such as:
brand=10177
brand=12815
brand=13
brand=1396
brand=11579
Finally, column B of sheet 1 will hold my "filter", using the formula I'm struggling to find. The goal of the formula is to detect, for each string in sheet 1, whether one of the strings in sheet 2 is present as an exact match. Right now it also finds approximate matches. As you can see, row 5 shouldn't return anything, but with my current formula it does.
Here's the formula:
{=IFERROR(INDEX('Sheet 2'!$A$1:$A$5;MATCH(1;COUNTIF(A1;"*"&'Sheet 2'!$A$1:$A$5&"*");0));"")}
Any idea on the matter?
Please note that I don't want to use VBA, macros, but only a formula.
Thanks a lot for your help!
The following should solve your problem:
=VLOOKUP(MID(A2,FIND("&",A2)+1,FIND("&",A2,FIND("&",A2)+1)-FIND("&",A2)-1),Sheet2!A:A,1,FALSE)
Basically, I used the FIND function to identify the start and length of the string between the "&" signs, and passed that to VLOOKUP.
Note that this formula only looks at the text between the first two "&" signs.
For completeness, here is another solution based on this answer
=INDEX(Sheet2!$A$1:$A$5,MAX(IF(ISERROR(FIND(Sheet2!$A$1:$A$5,A1)),-1,1)*(ROW(Sheet2!$A$1:$A$5)-ROW(Sheet2!$A$1)+1)))
This is a bit more general and it doesn't matter how many search tags there are.
However, as it stands, it would match brand=13 in the second sheet with brand=1396 in the first sheet. To avoid that, you could append an ampersand to both the search strings and the searched string:
=INDEX(Sheet2!$A$1:$A$5,MAX(IF(ISERROR(FIND(Sheet2!$A$1:$A$5&"&",A1&"&")),-1,1)*(ROW(Sheet2!$A$1:$A$5)-ROW(Sheet2!$A$1)+1)))
This formula throws a #VALUE! error if there is no match. To avoid this, wrap it in IFERROR:
=IFERROR(INDEX(Sheet2!$A$1:$A$5,MAX(IF(ISERROR(FIND(Sheet2!$A$1:$A$5&"&",A1&"&")),-1,1)*(ROW(Sheet2!$A$1:$A$5)-ROW(Sheet2!$A$1)+1))),"")
All of these are array formulae.
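The "append an ampersand to both sides" trick is easier to see outside a formula. Here is a minimal Python sketch of the same idea, using the sample data from the question; the function name is hypothetical, and unlike the MAX-based formula it returns the first matching brand rather than the last.

```python
def find_exact_brand(url: str, brands: list) -> str:
    """Return the first brand that appears in url as a whole parameter, else ""."""
    padded_url = url + "&"  # so a trailing parameter also ends with "&"
    for brand in brands:
        # "brand=13&" cannot match inside "brand=1396&".
        if brand + "&" in padded_url:
            return brand
    return ""


brands = ["brand=10177", "brand=12815", "brand=13", "brand=1396", "brand=11579"]

print(find_exact_brand("/search.action?gender=women&brand=1396&tag=10203&tag=10513", brands))  # brand=1396
print(find_exact_brand("/search.action?gender=women&brand=13", brands))                        # brand=13
print(repr(find_exact_brand("/search.action?gender=women&brand=11&tag=10001&tag=10073", brands)))  # ''
```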
