Gremlin Post Filter Path - filter

I have a gremlin query that follows this pattern:
g.V().has('LOCATION', textContains('FLORIDA')).
repeat(bothE().otherV().simplePath()).emit().times(5).
has('LOCATION',textContains('VIRGINIA')).
path().by(valueMap('LOCATION')).dedup()
Output could look something like this:
FLORIDA-->ALABAMA-->TENNESSEE-->VIRGINIA
FLORIDA-->GEORGIA-->TENNESSEE-->VIRGINIA
FLORIDA-->GEORGIA-->SOUTH CAROLINA-->NORTH CAROLINA-->VIRGINIA
etc...
Is there a way to filter after the path step to get only routes that go through ALABAMA (for instance). ALABAMA might not always be the second hop either so it would need to be dynamic enough to look across the whole path regardless of where the state to include might fall. Another wrinkle is there could be multiple states to filter on, for instance something like show me paths that include ALABAMA or include SOUTH CAROLINA or etc. In the actual application of this query there are multiple properties fed into the valueMap() as well but just tried to simplify it here. This could be similar to this question:
filter the gremlin results
but I can't figure out how to get back to a valueMap() after the filter step without an error. I tried something like this but wasn't sure where to go from here:
g.V().has('LOCATION', textContains('FLORIDA')).
repeat(bothE().otherV().simplePath()).emit().times(5).
has('LOCATION',textContains('VIRGINIA')).
path().filter(unfold().has('LOCATION', textContains('ALABAMA'))).
by(valueMap('LOCATION')).dedup()

You can use a sack to track this. Here is an example from Practical Gremlin that should help. The query finds routes that go from Austin to Edinburgh with at least one stop in Manchester. Similar to your Alabama use case. If Manchester is encountered along the path, a sack is incremented by one. As simplePath is used a location will never be visited twice. If, by the time EDI is reached, the sack contains a 1 then we know we went via MAN.
g.withSack(0).V().
has('code','AUS').
repeat(out().simplePath().has('country',within('US','UK')).
choose(has('code','MAN'),sack(sum).by(constant(1)))).
until(has('code','EDI')).
where(sack().is(1)).
path().by('code').
limit(10)
For further reading http://www.kelvinlawrence.net/book/PracticalGremlin.html#via

Related

To build a flow using Power Automate to download linked csv report in gmail

I'm trying to create a flow using Power Automate (which I'm quite new to) that can get the link/URL in an email I receive daily, then download the .csv file that normally a click to the link would do, and then save the file to a given local folder.
An example of the email I get:
Screenshot of the email I get daily
I searched in Power Automate Community and found this insightful LINK post & answer almost solved it. However, after following the steps and built the flow, it kept failing at the Compose step.
Screenshot of the Flow & Error Message
The flow
Error message
Expression used:
substring(body('Html_to_text'),add(indexOf(body('Html_to_text'),'here'),5),sub(indexOf(body('Html_to_text'),'Name'),5))
Seems the expression couldn't really get the URL/Link? I'm not sure and searched but couldn't find any more posts that can help.
Please kindly share all insights on approaches or workarounds that you think may help me solve the problem and truly thanks!
PPPPPPPPisces
We need to breakdown the bits of the function here which needs 3 bits of info
substring(1 text to search, 2 starting position of the text you want, 3 length of text)
For example, if you were trying to return an unknown number from the text dog 4567 bird
Our function would have 3 parts.
body('Html_to_text'), this bit gets the text we are searching for
add(indexOf(body('Html_to_text'),'dog'),4), this bit finds the position in the text 4 characters after the start of the word dog (3 letters for dog + the space)
sub(sub(indexOf(body('Html_to_text'),'bird'),2)),add(indexOf(body('Html_to_text'),'dog'),4)), I've changed the structure of your code here because this part needs to return the length of the URL, not the ending position. So here, we take the position of the end of the URL (position of the word bird minus two spaces) and subtract it from the position of the start of the URL (position of the word dog + 4 spaces) to get the length.
In your HTML to text output, you need to check what the HTML looks like, and search for a word before the URL starts, and a word after the URL starts, and count the exact amount of spaces to reach the URL. You can then put those words and counts into your code.
More generally, when you have a complicated problem that you need to troubleshoot, you can break it down into steps. For example. Rather than putting that big mess of code into a single block, you can make each chunk of the code in its own compose, and then one final compose to bring them all together - that way when you run it you can see what information each bit is giving out, or where it is failing, and experiment from there to discover what is wrong.

XPath query on exist-db returns no hits

I'm running a clean exist-db 4.5.0 on MacOS. Just installed the "shakespeare" package for testing. When im running the following request via browser I get no hits. But the he5.xml is a valid TEI file and contains in the body one text element.
http://127.0.0.1:8080/exist/rest/apps/shakespeare/data/he5.xml?_query=//text
<exist:result xmlns:exist="http://exist.sourceforge.net/NS/exist" exist:hits="0" exist:start="1" exist:count="0" exist:compilation-time="0" exist:execution-time="0"/>
Using Basic Auth credentials (user: admin, password: EMPTY) in URL doesn't change anything.
http://admin:#127.0.0.1:8080/exist/rest/apps/shakespeare/data/he5.xml?_query=//text
Only the //* XPATH seems to work (or getting ignored?) because I'm getting the whole content of the file. Other requests don't work either (correctly). Like the //text() xquery:
<exist:result xmlns:exist="http://exist.sourceforge.net/NS/exist" exist:hits="10313" exist:start="1" exist:count="10" exist:compilation-time="1" exist:execution-time="1">
The Life of King Henry the Fifth William Shakespeare Craig A. Berry, Martin Mueller, and Clifford Wulfman
</exist:result>
This is just the first hit of text...
Tried this also on Ubuntu with exist-db 4.4.0... same "result".
I think you have made a mistake in your query. You state that you tried:
http://127.0.0.1:8080/exist/rest/apps/shakespeare/data/he5.xml?_query=//text
That query tries to find all elements named text. I think you probably wanted to just get all the text nodes, so you would instead need:
http://127.0.0.1:8080/exist/rest/apps/shakespeare/data/he5.xml?_query=//text()
Note the additional ().
Also I think your URL path looks suspect too, if you are trying to use the REST API then you should use the full database collection path, i.e.:
http://127.0.0.1:8080/exist/rest/db/apps/shakespeare/data/he5.xml?_query=//text()
Note the additional /db.

Solr query conundrum

I've recently swapped from using Lucene for Sitecore to Solr.
For the most part it has been smooth, but the way I was writing some queries (using Sitecore.ContentSearch.Linq) abstraction now don't seem to be compatible.
Specifically, I have a situation where I've got "global" content and "regional" content, like so:
Home (000)
X
Y
Z
Regions (ID: 111)
Region 1 (ID: 221)
A
B
Region 2 (ID: 222)
D
My code worked on Lucene, but now doesn't on Solr. It should find all "global" and a single region's content, excluding all other region's content. So as an example, if the user's current region was Region 1, I'd want the query to return content X, Y, Z, A, B.
Sitecore's Item Crawler has a field for each item in the index called "_path" which is a multivalued string field of IDs, so as an example, Region 1's _path field value would be [000, 111, 221 ].
When I write this using the Linq abstraction it comes out as below which doesn't return results.
-_path:(111) OR _path:(221)
But _path:(111) does return result. Mind blown.
When I use the Solr interface and wrap each side of the OR in extra brackets like below (which I'd consider redundant) it works! Mind blown v2.
(-_path:(111)) OR (_path:(221))
Firstly, what's the difference between those queries?
Secondly, my real problem is I can't add these extra brackets as I'm working in an abstraction Linq so the brackets will be "optimized" out.
Any advice would be awesome! Cheers.
The problem here is, lucene's negative queries don't work like you think they do. They only remove results from what has been found. -_path:111 doesn't find all documents which aren't in 111, it doesn't find anything at all. It only removes results. So you are finding all results with path "221", then removing any that also have path "111", which from your heirarchy, I assume is all of them. See my answer here for a bit more on that topic.
The OR makes it seem like it ought to work, but really -_path:(111) OR _path:(221) is the same as -_path:(111) _path:(221). The moral here is: Don't use Lucene's AND/OR/NOT syntax, if you can help it. Use +/-. +/- syntax actually expresses how the query operates, AND/OR/NOT doesn't. It attempts to shoehorn it into a different, SQL-like retrieval model and leads to some unexpected behavior like this.
So, what about: (-_path:(111)) OR (_path:(221))
Well, first, does it actually work? Or does it just get some results?
If it just gets some results, but just seems to get the same results as _path:221: The reason is -_path:111 gets no results, so your query is, in practice, something like: (nothing) OR (_path:221), which is equivalent to _path:221
If it really does get the results you expect (I'm guessing it probably does): Something is translating your query into something like: (*:* -_path:111) (_path:221). Solr does have some logic along these lines, though I'm not quite sure in this case. Essentially, it puts a match-all in front of any lonely negative queries it finds, allowing them to do what you were expecting. If the implicit *:* makes you nervous about performance, well, it should. But lucene is an inverted index, it does well with finding matches on a term quickly. Getting everything that doesn't match goes against the grain of that retrieval model, and will pretty much have to do a full scan of the index.

XPATH - cannot select grandparent node

I am trying to parse a live betting XML feed and need to grab each bet from within the code. In plain English I need to use the tag 'EventSelections' for my base query and 'loop' through these tags on the XML so I grab all that data and it creates and entity for each one which I can use on a CMS.
My problem is I want to go up two places in the tree to a grandparent node to gather that info. Each EventID refers to the unique name of a game and some games have more bets than others. It's important that I grab each bet AND the EventID associated with it, problem is, this ID is the grandparent each time. Example:
<Sportsbet Time="2013-08-03T08:38:01.6859354+09:30">
<Competition CompetitionID="18" CompetitionName="Baseball">
<Round RoundID="2549" RoundName="Major League Baseball">
<Event EventID="849849" EventName="Los Angeles Dodgers (H Ryu) At Chicago Cubs (T Wood)" Venue="" EventDate="2013-08-03T05:35:00" Group="MTCH">
<Market Type="Match Betting - BIR" EachWayPlaces="0">
<EventSelections BetSelectionID="75989549" EventSelectionName="Los Angeles Dodgers">
<Bet Odds="1.00" Line=""/>
</EventSelections>
<EventSelections BetSelectionID="75989551" EventSelectionName="Chicago Cubs">
<Bet Odds="17.00" Line=""/>
</EventSelections>
Does anyone know how I can grab the granparent tags as well?
Currently I am using:
//EventSelections (this is the context)
.//#BetSelectionID
.//#EventSelectionName
I have tried dozens of different ways to do this including the ../.. operator which won't work either. I'd be eternally grateful for any help on this. Thanks.
I think you just haven't gone far enough up the tree.
../* is a two-step location bath with abbreviations, expanded to parent::node()/child::* ... so in effect you are going up the tree with the first step, but back down the tree for the second step.
Therefore, ../* gives you your siblings (parent's children), ../../* gives you your aunts and uncles (grandparent's children), and ../../../* gives you your grandparent and its siblings (great-grandparent's children).
For attributes, ../#* is an abbreviation for parent::node()/attribute::* and attributes are attached to elements, they are not considered children. So you are going sideways, not down the tree in the second step.
Therefore, unlike above, ../#* gives you your parent's attributes, while ../../#* gives you your grandparent's attributes.
But using // in your situation is really inappropriate. // is an abbreviation for /descendent-or-self::node()/ which walks all the way down a tree to the leaves of the tree. It should be used only in rare occasions (and I cringe when I see it abused on SO questions).
So ..//..//..//#RoundID may work for you, but it is in effect addressing attributes all over the tree and not just an attribute of your great-grandparent, which is why it is finding the attribute of your grandparent. ../../#RoundID should be all you need to get the attribute of your grandparent.
If you torture a stylesheet long enough, it will eventually work for you, but it really is more robust and likely faster executing to address things properly.
You could go with ancestor::Event/#EventID, which does exactly you asked for: matches an ancestor element named Event and returns it's EventID attribute.

How to get the 'current observation data' from the NDFD (NOAA, NWS) REST service?

I'm trying to use the NDFD (National Digital Forecast Database) to get current temperature and relative humidity given a Lat and Long using their REST based service.
The issue at hand:
I can't match the 'current observation data' WITH the 'results' I get back from the REST-service.
The setup:
Location:
* Apple (1-infinite loop, Cupertino, California)
* Lat = 37.33; Lon = -122.03
If I issue the following REST-call:
http://www.weather.gov/forecasts/xml/sample_products/browser_interface/ndfdXMLclient.php?lat=37.33&lon=-122.03&product=time-series&begin=2009-06-21T17:12:35&end=2009-06-21T17:12:35&appt=appt&rh=rh&temp_r=temp_r&temp=temp
Note 1: I'm passing the begin and end time in UTC. They're the same because I'm
looking for just a single-point-in-time: the latest observed
temp and relative humidity.
AND, then compare it to what is the closet reporting stations (San Jose International Airport, CA - KSJC - 37.37N 121.93W) # http://www.weather.gov/xml/current_obs/KSJC.xml
** I can never get them to MATCH. **
Note 2: The nearest reporting station is return back from the REST call
as well, so I know I'm comparing Location apples to Location apples.
I've had two ideas:
1: I'm doing something wrong with how I'm passing in the begin/end times into the REST call...
2: You can't get 'current observed data' the way I'm trying to...
Lastly:
I've found a solution using outoftime's NOAA ruby lib , [it parses an observation stations YAML file to find the nearest one given Lat/Lng then goes directly to that station via its identifier i.e. http://www.weather.gov/xml/current_obs/KSJC.xml].... but it just feels like I may be missing something obvious here and would like to use the REST-based interface ;)
Any help or pointers would be appreciated!
Thanks!
It looks like the service you are calling isn't for current data. Judging by the URL and the XML results it seems to be for forecasts. You can also put in future dates to get future forecast data. It expects the dates to be in the -0700 time zone according to the response. I'm not sure which service you should be calling to get the data you want though.
I know that this is an old question, but this is what I'm using to get current weather conditions: http://forecast.weather.gov/MapClick.php?lat=43.09110&lon=-79.0162&unit=0&lg=english&FcstType=dwml
Found this api/link yesterday. Its still developmental (operation-mode="developmental"):
http://forecast.weather.gov/MapClick.php?lat=37.33&lon=-122.03&FcstType=dwml
If you want the "current" observation, you use the XML here:
http://w1.weather.gov/xml/current_obs/seek.php?state=or&Find=Find
e.g.,:
http://w1.weather.gov/xml/current_obs/KAST.xml
If you click on the link you'll get a rendered page. However, if you pull from it using normal rest methods or just wget, it delivers an xml file.

Resources