I am trying to retrieve keys and parent keys from some structured XML stored as binary XML in Oracle. I have tried creating an unstructured index and also an index with a structured component. The structured component works fine when doing a SELECT against XMLTABLE(), but I cannot retrieve values of the parent node using XMLTABLE. I am therefore trying the following XQuery to retrieve parent values, but it does not use the index at all. Does this style of query support XMLIndexes? I can't find anything in the docs that says either way.
SELECT y.*
FROM xml_data x, XMLTABLE(xmlnamespaces( DEFAULT 'namespace'),
'for $i in /foo/bar
return element r {
$i/someKey
,element parentKey { $i/../someKey }
}'
PASSING x.import_xml
COLUMNS
someKey VARCHAR2(100) PATH 'someKey'
,parentKey VARCHAR2(100) PATH 'parentKey'
) y
Thanks, Tom
I would like to query the jackrabbit repository on the versions I have stored.
My repository looks like the following:
The following XPath query works well: //element(*, nt:frozenNode)[jcr:contains(., '" + keyword + "') ]/rep:excerpt(.) From the Row object returned I can get the excerpt found in the 'de:content' property of the de:template nodes (for this to be full-text indexable I have my own Lucene configuration).
The problem, however, is how to know which element the excerpt was found for, since the query only returns the path found (/jcr:system/jcr:versionStorage/95/c8/3e/95c83efc-8441-4017-b3af-ae7be49f07e5/1.0/jcr:frozenNode/de:template) and the excerpt itself.
So I would like to know the identifier of the nt:versionHistory node, as stored in Jackrabbit.
I have a solution for this as well, by getting the parent nodes until the nt:versionHistory is reached and getting its identifier:
Row row = (Row) rows.next();
Node node = row.getNode();
Node frozenNode = node.getParent();
Node versionNumber = frozenNode.getParent();
String versionId = versionNumber.getIdentifier();
However, this takes too much time, and with lots of versions it's bad for performance.
Therefore, I wonder if it's possible to include this version id in the query, such that no parent nodes need to be fetched after the query is executed.
That's probably not possible using an XPath Query. But you could do it using SQL2 using a join:
SELECT n.*, excerpt(n), v.[jcr:uuid]
FROM [nt:frozenNode] AS n
INNER JOIN [nt:version] AS v ON ISDESCENDANTNODE(n,v)
WHERE contains(n.*,'Adobe')
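If changing the query is not an option, another way to avoid fetching parent nodes is to parse the path that each row already returns (for example via Row.getPath() in JCR 2.0). The following is a minimal sketch, assuming the path layout shown in the question, where the segment before jcr:frozenNode is the version name and the one before that is the version history node's name. The class and method names are hypothetical, and whether the history node's name matches the identifier you need depends on how the repository names its version storage nodes, so verify that before relying on it.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: not part of the JCR or Jackrabbit API.
final class VersionPathParser {

    // Returns { historyNodeName, versionName } taken from a frozen-node path.
    // Assumes the layout .../<history>/<version>/jcr:frozenNode/... from the question.
    static String[] historyAndVersion(String path) {
        List<String> segments = Arrays.asList(path.split("/"));
        int frozen = segments.indexOf("jcr:frozenNode");
        if (frozen < 2) {
            throw new IllegalArgumentException("Not a frozen-node path: " + path);
        }
        return new String[] { segments.get(frozen - 2), segments.get(frozen - 1) };
    }

    public static void main(String[] args) {
        String path = "/jcr:system/jcr:versionStorage/95/c8/3e/"
                + "95c83efc-8441-4017-b3af-ae7be49f07e5/1.0/jcr:frozenNode/de:template";
        String[] parts = historyAndVersion(path);
        System.out.println("history node: " + parts[0]);
        System.out.println("version name: " + parts[1]);
    }
}
```

This only touches the string, so it costs no repository round trips per row.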
I have a table where each row has a JSON structure like the following, which I'm trying to index in a PostgreSQL database. What is the best way to do it?
{
"name" : "Mr. Jones",
"wish_list": [
{"present_name": "Counting Crows",
"present_link": "www.amazon.com"},
{ "present_name": "Justin Bieber",
"present_link": "www.amazon.com"}
]
}
I'd like to put an index on each present_name within the wish_list array. The goal here is that I'd like to be able to find each row where the person wants a particular gift through an index.
I've been reading on how to create an index on a JSON which makes sense. The problem I'm having is creating an index on each element of an array within a JSON object.
The best guess I have is using something like the json_array_elements function and creating an index on each item returned through that.
Thanks for a push in the right direction!
Please check JSONB Indexing section in Postgres documentation.
For your case index config may be the following:
CREATE INDEX idx_gin_wishlist ON your_table USING gin ((jsonb_column -> 'wish_list'));
It will store copies of every key and value inside wish_list, but take care that your query actually hits the index. Use the @> (containment) operator:
SELECT jsonb_column->'wish_list'
FROM your_table
WHERE jsonb_column->'wish_list' @> '[{"present_link": "www.amazon.com", "present_name": "Counting Crows"}]';
I strongly suggest checking the existing answers:
How to query for array elements inside JSON type
Index for finding an element in a JSON array
I'm using the PHP API to query two Sphinx indexes as below:
$cl->Query("test","index1 index2");
I'm getting results from both successfully, but I can't tell which result came from which index. Is there a way to tell the difference, or do I need to run two separate queries?
Set a unique attribute on each source:
source1 {
sql_query = SELECT id, 1 as index_id, ....
sql_attr_uint = index_id
}
source2 {
sql_query = SELECT id, 2 as index_id, ....
sql_attr_uint = index_id
}
Results will contain an 'index_id' attribute.
It is almost the same if you are using RT indexes: define an rt_attr_uint and populate it appropriately when you inject data into the index.
The other way: presumably you've already arranged for the IDs in the two indexes to be non-overlapping (it won't work if the same IDs appear in both indexes), so you can look at the ID to deduce the source index.
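The ID-based approach can be sketched as a small range lookup. The question uses the PHP API, but the logic is the same in any client. The sketch below is in Java and assumes each index owns a contiguous block of document IDs. The class name, the range boundaries, and the index names in main are illustrative assumptions, not anything Sphinx provides.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical client-side helper: maps a document ID back to its source index.
final class SourceIndexResolver {

    // Key: first document ID of a range. Value: name of the index owning that range.
    private final TreeMap<Long, String> rangeStarts = new TreeMap<>();

    void addRange(long firstId, String indexName) {
        rangeStarts.put(firstId, indexName);
    }

    // Finds the range with the greatest start that is <= docId.
    String resolve(long docId) {
        Map.Entry<Long, String> entry = rangeStarts.floorEntry(docId);
        if (entry == null) {
            throw new IllegalArgumentException("No index owns document ID " + docId);
        }
        return entry.getValue();
    }

    public static void main(String[] args) {
        SourceIndexResolver resolver = new SourceIndexResolver();
        resolver.addRange(1L, "index1");         // assumed: index1 holds IDs below 1,000,000
        resolver.addRange(1_000_000L, "index2"); // assumed: index2 holds IDs from 1,000,000 up
        System.out.println(resolver.resolve(42L));
        System.out.println(resolver.resolve(1_000_123L));
    }
}
```

An explicit index_id attribute is still the more robust choice, since it keeps working if the ID ranges ever change.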
I have an external table in hive
CREATE EXTERNAL TABLE FOO (
TS string,
customerId string,
products array< struct <productCategory:string, productId:string> >
)
PARTITIONED BY (ds string)
ROW FORMAT SERDE 'some.serde'
WITH SERDEPROPERTIES ('error.ignore'='true')
LOCATION 'some_locations'
;
A record of the table may hold data such as:
1340321132000, 'some_company', [{"productCategory":"footwear","productId":"nik3756"},{"productCategory":"eyewear","productId":"oak2449"}]
Does anyone know if there is a way to simply extract all the productCategory values from this record and return them as an array of productCategories without using explode? Something like the following:
["footwear", "eyewear"]
Or do I need to write my own GenericUDF? If so, I do not know much Java (I'm a Ruby person), so can someone give me some hints? I read some instructions on UDFs from Apache Hive. However, I do not know which collection type is best for handling arrays, and which collection type to use for structs.
===
I have somewhat answered this question by writing a GenericUDF, but I ran into 2 other problems. It is in this SO Question
You can use a JSON SerDe or the built-in functions get_json_object and json_tuple.
With rcongiu's Hive-JSON SerDe the usage will be:
define table:
CREATE TABLE complex_json (
DocId string,
Orders array<struct<ItemId:int, OrderDate:string>>)
load sample json into it (it is important for this data to be one-lined):
{"DocId":"ABC","Orders":[{"ItemId":1111,"OrderDate":"11/11/2012"},{"ItemId":2222,"OrderDate":"12/12/2012"}]}
Then fetching orders ids is as easy as:
SELECT Orders.ItemId FROM complex_json LIMIT 100;
It will return the list of ids for you:
itemid
[1111,2222]
Proven to return correct results on my environment. Full listing:
add jar hdfs:///tmp/json-serde-1.3.6.jar;
CREATE TABLE complex_json (
DocId string,
Orders array<struct<ItemId:int, OrderDate:string>>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe';
LOAD DATA INPATH '/tmp/test.json' OVERWRITE INTO TABLE complex_json;
SELECT Orders.ItemId FROM complex_json LIMIT 100;
Read more here:
http://thornydev.blogspot.com/2013/07/querying-json-records-via-hive.html
One way would be to use either the inline or explode functions, like so:
SELECT
TS,
customerId,
pCat,
pId
FROM FOO
LATERAL VIEW inline(products) p AS pCat, pId
Otherwise you can write a UDF. Check out this post and this post for that, along with the following resources:
Matthew Rathbone's guide to writing generic UDFs
Mark Grover's how to guide
the baynote blog post on generic UDFs
If the size of the array is fixed (say, 2), please try:
products[0].productCategory,products[1].productCategory
But if not, a UDF should be the right solution. I guess that you could do it in JRuby. Good luck!
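For the UDF route, the core operation is projecting one field out of every struct in the array. The following is a plain-Java sketch of just that logic, with no Hive classes involved. It models the array as a List and each struct as a Map, which is an assumption made for illustration: a real GenericUDF receives its arguments through ObjectInspectors rather than as ready-made Java collections. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the projection a UDF would perform; not a Hive UDF.
final class StructFieldProjector {

    // Collects the named field from each struct, preserving array order.
    static List<String> project(List<Map<String, String>> structs, String field) {
        List<String> values = new ArrayList<>();
        for (Map<String, String> struct : structs) {
            values.add(struct.get(field)); // null when the struct lacks the field
        }
        return values;
    }

    // Builds one product struct shaped like the rows in the question.
    static Map<String, String> struct(String category, String id) {
        Map<String, String> s = new LinkedHashMap<>();
        s.put("productCategory", category);
        s.put("productId", id);
        return s;
    }

    public static void main(String[] args) {
        List<Map<String, String>> products = new ArrayList<>();
        products.add(struct("footwear", "nik3756"));
        products.add(struct("eyewear", "oak2449"));
        System.out.println(project(products, "productCategory"));
    }
}
```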
I am using BaseX as the backend to store XML files. The front end is in Java. I want to populate a combobox with the data of certain elements. The output of the XQuery is a string, and I am having trouble loading this string into a combobox. Below is the XML file:
<Cities>
<City><C>London</C></City>
<City><C>New Delhi</C></City>
<City><C>Mumbai</C></City>
<City><C>Moscow</C></City>
<City><C>Tokyo</C></City>
<City><C>Mumbai</C></City>
<City><C>Tokyo</C></City>
<City><C>Mumbai</C></City>
<City><C>Tokyo</C></City>
<City><C>Mumbai</C></City>
<City><C>New Delhi</C></City>
</Cities>
Using this XML file, I want to populate a combobox with all the distinct cities. I do this with the following XQuery:
for $x in distinct-values(doc("City")/Cities/City/C)
return $x
The output of this is a simple string -
`London New Delhi Mumbai Moscow Tokyo`
There are 5 cities resulting from the query.
How can I populate a combobox with this?
This might help:
element select {
distinct-values(doc("City")/Cities/City/C) ! element option { . }
}
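On the Java side, note that the space-separated string cannot be split safely, because "New Delhi" itself contains a space. One option is to have the query return one city per line, for example with string-join(distinct-values(doc("City")/Cities/City/C), "&#10;"), and split on line breaks. The sketch below assumes that newline-separated output and builds a Swing combobox model from it. The class and method names are hypothetical, and how the result string is fetched from BaseX is left out.

```java
import javax.swing.DefaultComboBoxModel;

// Hypothetical helper: turns a newline-separated query result into a combobox model.
final class CityComboModel {

    static DefaultComboBoxModel<String> fromQueryResult(String result) {
        DefaultComboBoxModel<String> model = new DefaultComboBoxModel<>();
        // \R matches any line break, so both \n and \r\n output are handled.
        for (String line : result.split("\\R")) {
            String city = line.trim();
            if (!city.isEmpty()) {
                model.addElement(city);
            }
        }
        return model;
    }

    public static void main(String[] args) {
        // Assumed query output: one distinct city per line.
        String result = "London\nNew Delhi\nMumbai\nMoscow\nTokyo";
        DefaultComboBoxModel<String> model = fromQueryResult(result);
        for (int i = 0; i < model.getSize(); i++) {
            System.out.println(model.getElementAt(i));
        }
    }
}
```

The model can then be handed to the combobox with new JComboBox<>(model) or comboBox.setModel(model).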