Extracting directory structure as metadata using entity recognition - google-search-appliance

I am attempting to extract values from within a url pattern and apply them as metadata by using regular expressions and entity recognition applied to the URL.
URL: https://example.com/folder1/folder2/folder3/folder4/page.html
Regex:
https:\/\/example\.com\/folder1\/folder2\/([^\/]*).*[^\/]*\/
This should extract folder3. It has been tested and works on regex101 and on reggyapp.com (which uses Google's RE2 engine, the same engine the GSA uses):
https://regex101.com/r/aF2jR0/2
However when uploading to the GSA as an entity recognition file it does not recognise it.
<?xml version="1.0"?>
<instances>
<instance>
<name>ignoredname</name>
<pattern>https:\/\/example\.com\/folder1\/folder2\/([^\/]*).*[^\/]*\/</pattern>
<store_regex_or_name> regex_tagged_as_first_group </store_regex_or_name>
</instance>
</instances>

By default, the GSA's entity recognition stores the text extracted by the pattern, so you can remove the following element from the XML:
<store_regex_or_name> regex_tagged_as_first_group </store_regex_or_name>
Try it without the store_regex_or_name element.

Got it working now, so I'm posting this in case anyone is interested:
https://example.com/folder1/folder2/(\w+)/.*
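The capture group in the working pattern can be sanity-checked outside the GSA. The following is a minimal sketch using java.util.regex, which is not the RE2 engine the GSA uses, so it only confirms the logic of the group and not the appliance's behaviour. The class and method names (FolderExtractor, extract) are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FolderExtractor {
    // Working pattern from the post above, copied as posted
    // (the dot in example.com is left unescaped, so it matches any character).
    private static final Pattern THIRD_FOLDER =
            Pattern.compile("https://example.com/folder1/folder2/(\\w+)/.*");

    /** Returns the folder captured by group 1, or null when the URL does not match. */
    static String extract(String url) {
        Matcher m = THIRD_FOLDER.matcher(url);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // URL from the question
        System.out.println(extract("https://example.com/folder1/folder2/folder3/folder4/page.html"));
    }
}
```

Note that \w+ only covers letters, digits and underscores, so a folder name containing a hyphen would not match with this pattern.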


Gatsby: How to display a list of images specified in a yaml file?

I use the current version of Gatsby (2.x) and want to use gatsby-image for rendering a gallery for products.
I have several YAML files for products. I can already create pages with the text content of these files but I also want to add a small gallery with images specified in the .yaml file.
An example YAML file looks like this:
product: "Some product"
description: "It is really awesome!"
screenshots:
- /img/product1/screenshot1.jpg
- /img/product1/screenshot2.jpg
- /img/product1/screenshot3.jpg
My problem now is that I can get the screenshots only as strings, and I have no idea how to pass them to gatsby-image for rendering.
I thought of creating a component that takes the file name and uses a query to get the image data - but it can't take any parameters because it can only use static queries.
I've also not found a way to pass results from the first graphql query to a second for the image data.
If you install gatsby-transformer-sharp and gatsby-plugin-sharp and use a correct path to your images, Gatsby will automatically pick those up and pipe them through sharp, hence you can query those images. You can have a look at one of my sites which also uses a YML file with image paths that I then use with gatsby-image: https://github.com/LekoArts/gatsby-starter-portfolio/blob/master/src/sites/sites.yaml

How to get the actual Hyperlink element inside the main document part using docx4j

So I have a case where I need to be able to work on the actual Hyperlink element inside the body of the docx, not just the target URL or the internal/externality of the link.
As a possible additional wrinkle this hyperlink wasn't present in the docx when it was opened but instead was added by the docx4j-xhtmlImporter.
I've iterated the list of relationships here: wordMLPackage.getMainDocumentPart().getRelationshipsPart().getRelationships().getRelationship()
And found the relationship ID of the hyperlink I want. I'm trying to use an XPath query: List<Object> results = wordMLPackage.getMainDocumentPart().getJAXBNodesViaXPath("//w:hyperlink[@r:id='rId11']", false);
But the list is empty. I also thought that it might need a refresh because I added the hyperlink at runtime so I tried with the refreshXMLFirst parameter set to true. On the off chance it wasn't a real node because it's an inner class of P, I also tried getJAXBAssociationsForXPath with the same parameters as above and that doesn't return anything.
Additionally, even XPath like "//w:hyperlink" fails to match anything.
I can see the hyperlinks in the XML if I unzip it after saving to a file, so I know the ID is right: <w:hyperlink r:id="rId11">
Is XPath the right way to find this? If it is, what am I doing wrong? If it's not, what should I be doing?
Thanks
XPathHyperlinkTest.java is a simple test case which works for me
You might be having problems because of JAXB, or possibly because of the specific way in which the binder is being set up in your case (do you start by opening an existing docx, or creating a new one?). Which docx4j version are you using?
Which JAXB implementation are you using? If it's the Sun/Oracle implementation (the reference implementation, or the one included in their JDK/JRE), that might be what is causing the problem, in which case you might try using MOXy instead.
An alternative to using XPath is to traverse the docx; see finders/ClassFinder.java
Try without namespace binding
List<Object> results = wordMLPackage.getMainDocumentPart().getJAXBNodesViaXPath("//*:hyperlink[@*:id='rId11']", false);

Why can't I search DocumentReferences for a particular subject using Spark server for FHIR DSTU-2 1.0.1?

I have a Patient resource at this url: http://localhost:49911/fhir/Patient/PHFId1
and DocumentReference resource with the following element:
<subject>
<reference value=" http://localhost:49911/fhir/Patient/PHFId1" />
</subject>
I want to be able to get a list of all DocumentReferences belonging to a certain patient but everything I have tried either returns no results, or else returns all Document References on the system. Some of the variations I have tried include:
fhir/Patient/PHFId1/DocumentReference (404 Not Found)
fhir/DocumentReference?subject:Patient=PHFId1 (no results)
fhir/DocumentReference?fhir/Patient/PHFId1 (no results)
fhir/DocumentReference?subject.reference=PHFId1 (no results)
What am I doing wrong? It must be a common use case to require a list of all documents relating to a Patient. Perhaps I have set up the linkage incorrectly by using the subject element?
Thanks in advance
The search syntax you've used is only correct in the second line, but other than that you're not doing anything wrong. This is a known issue in the Spark server (see https://github.com/furore-fhir/spark/issues/6).

How to make Windows Search work with custom HTML META tags?

Using Windows Search on Win2008.
I'm searching programmatically, using Microsoft.Search.Interop (CSearchQueryHelper, etc).
I have existing html files that include META tags in the header:
<meta name="fooo" content="baaa" />
I need to be able to search on these (query for "fooo:baaa") and also return the values in the result set. (I was able to do this with the old Indexing Service.)
I tried adding a "fooo" property to the Property System (PSRegisterPropertySchema). I can now use "fooo" in QuerySelectColumns without error, but the data never comes back. Also CSearchQueryHelper does not seem to recognize "fooo:" as a property constraint.
Searching for unqualified "baaa" returns the document. (baaa does not appear anywhere beyond the meta tag.)

JSTL/JSP: print a file name nicely

I use ${file.name} in a JSP file to display the name of a file to download.
The name contains the full file name, including the file extension, for example:
testfile.png
a-file.zip
testfile-test505454654.png
a-filenum5468.docx
other_file_with_a_name_very_very_very_larrrrrrrrrrrrrrrrrrrrrge.pdf
Files with very long names break my layout.
I think the way to go is to shorten the file name but keep the extension. Maybe:
testfile.png
a-file.zip
testfile-test505454....png *
a-filenum5468.docx
other_file_with_a_na....pdf *
How can I do this?
The href is not a problem because it uses the ID, ${file.id}.
If file is a POJO, you may add a getter-method to the POJO (something like String getShortName(){}) that returns a short version of the file name. And then use the method in your EL expression: ${file.shortName}.
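As a rough illustration of the getter approach, here is a minimal sketch. The class name FileInfo and the limit of 20 characters for the base name are assumptions, not taken from the question; the limit was chosen because it reproduces the "other_file_with_a_na....pdf" example.

```java
class FileInfo {
    // Assumed limit: maximum number of characters kept from the name before the extension.
    private static final int MAX_BASE_LENGTH = 20;

    private final String name;

    FileInfo(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }

    /** Available in EL as ${file.shortName}. */
    public String getShortName() {
        // Split at the last dot; a leading dot (index 0) is not treated as an extension.
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        String extension = dot > 0 ? name.substring(dot) : "";

        if (base.length() <= MAX_BASE_LENGTH) {
            return name; // short enough, leave untouched
        }
        // Keep the start of the base name, mark the cut with "...", then re-append the extension.
        return base.substring(0, MAX_BASE_LENGTH) + "..." + extension;
    }

    public static void main(String[] args) {
        System.out.println(new FileInfo("a-filenum5468.docx").getShortName());
        System.out.println(new FileInfo(
                "other_file_with_a_name_very_very_very_larrrrrrrrrrrrrrrrrrrrrge.pdf").getShortName());
    }
}
```

Short names are returned unchanged, so the JSP can use ${file.shortName} everywhere without a length check of its own.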
I would write and register a custom tag that would take care of shortening the output to a maximum length
<custom:short value="${file.name}" var="shortFileName" />
The tag would take care of shortening based on defaults or values you specify in the element, and of putting the result in a request attribute you can use anywhere after that declaration.
Since this requirement is likely to come up many times, you should go with the custom tag solution, as @Sotirios Delimanolis suggested.
A custom JSTL function (like the fn: functions) is another solution. A JSTL function performs better than a custom tag (simple or classic model).
Link: http://www.noppanit.com/how-to-create-a-custom-function-for-jstl/
