My XML document looks like this
When I run the XPath query //collected_objects, no nodes are selected. What am I doing wrong? I want to select the whole collected_objects node.
Because your XML document has an XML namespace defined (<oval_system_characteristics xmlns="http://oval.mitre.org/XMLSchema/oval-system-characteristics-5"), you need to include that namespace in your query!
How you do this depends on the system or programming language you're using. In .NET / C#, you could do something like this:
// create XmlDocument and load XML file
XmlDocument doc = new XmlDocument();
doc.Load(yourXmlFileNameHere);
// define XML namespace manager and a prefix for the XML namespace used
XmlNamespaceManager mgr = new XmlNamespaceManager(doc.NameTable);
mgr.AddNamespace("ns", "http://oval.mitre.org/XMLSchema/oval-system-characteristics-5");
// get list of nodes, based on XPath - using the XML namespace manager
XmlNodeList list = doc.SelectNodes("//ns:collected_objects", mgr);
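Since the approach depends on the language, here is a minimal sketch of the same idea in Python, using the standard library's xml.etree.ElementTree. The sample document is a hypothetical stand-in reduced to the relevant shape (the real OVAL file will contain much more), and the prefix "ns" is an arbitrary choice, just as in the C# code above.

```python
import xml.etree.ElementTree as ET

OVAL_NS = "http://oval.mitre.org/XMLSchema/oval-system-characteristics-5"

# Hypothetical stand-in for the real document; only the namespace and the
# collected_objects element are taken from the question.
SAMPLE = f"""<oval_system_characteristics xmlns="{OVAL_NS}">
  <collected_objects>
    <object id="obj1"/>
  </collected_objects>
</oval_system_characteristics>"""

root = ET.fromstring(SAMPLE)

# Without a namespace mapping the bare name does not match any element.
unqualified = root.findall(".//collected_objects")

# Binding a prefix of our choosing to the namespace URI makes the query match.
qualified = root.findall(".//ns:collected_objects", {"ns": OVAL_NS})

print(len(unqualified), len(qualified))
```

The prefix used in the query does not have to match anything in the document; only the namespace URI it is bound to matters.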
String mySQLString = "select * from document where documentTitle like '%test%' ";
SearchSQL sql = new SearchSQL(mySQLString);
IndependentObjectSet s = search.fetchObjects(sql, 10, null, true);
Document doc;
PageIterator iterator = s.pageIterator();
iterator.nextPage();
for (Object object : iterator.getCurrentPage()) {
doc = (Document) object;
Properties properties = doc.getProperties();
//I am trying to get an absolute or relative path here for every document.
// for eg: /objectstorename/foldername/filename like this.
}
I have tried searching the properties and class descriptions of the document, but I am not able to find the path.
To do it all in a single query (as you are trying to do in your code), you can create a join with the ReferentialContainmentRelationship table. The Head property of this table points to the document, the Tail property points to the folder the document is filed in, and the ContainmentName property is the name the document has in that folder. Use the following code to construct the document path:
SearchSQL searchSQL = new SearchSQL("SELECT R.ContainmentName, R.Tail, D.This FROM Document AS D WITH INCLUDESUBCLASSES INNER JOIN ReferentialContainmentRelationship AS R WITH INCLUDESUBCLASSES ON D.This = R.Head WHERE DocumentTitle like '%test%'");
SearchScope searchScope = new SearchScope(objectStore);
RepositoryRowSet objects = searchScope.fetchRows(searchSQL, null, null, null);
Iterator<RepositoryRow> iterator = objects.iterator();
while (iterator.hasNext()) {
RepositoryRow repositoryRow = iterator.next();
Properties properties = repositoryRow.getProperties();
Folder folder = (Folder) properties.get("Tail").getEngineObjectValue();
String containmentName = properties.get("ContainmentName").getStringValue();
System.out.println(folder.get_PathName() + "/" + containmentName);
}
Paths constructed this way can also be used to fetch the object from the object store. The query code can be optimized by using a property filter as the third argument of the fetchRows() method. I don't know how this behaves if the document is filed in multiple folders.
I suggest you explore the "Creating DynamicReferentialContainmentRelationship Objects" section of FileNet documentation:
https://www.ibm.com/support/knowledgecenter/SSNW2F_5.5.0/com.ibm.p8.ce.dev.ce.doc/containment_procedures.htm#containment_procedures__fldr_creating_a_drcr
A FileNet Document can be assigned to multiple Folders, so you can have several logical "Paths" for a given document.
In the end, you should get something like "Folder.get_PathName() + DynamicReferentialContainmentRelationship.get_Name()" to display the full pathname.
As described by samples in FileNet documentation, a relationship object (e.g. DynamicReferentialContainmentRelationship) controls the relation of document/folder:
myRelationshipObject.set_Head(myDocument);
myRelationshipObject.set_Tail(myFolder);
Also, keep in mind that a FileNet Document can also be an "unfiled" document, in which case there is no actual "pathname" or folder "relationship" to be retrieved.
tl;dr from FileNet Content Engine - Database Table for Physical path
Documents are stored among the directories at the leaf level using a hashing algorithm to evenly distribute files among these leaf directories.
How do I add a schema to an IXMLDOMDocument?
For example, I want to generate the XML:
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
<Relationship Id="rId1" Type="Frob" Target="Grob"/>
</Relationships>
I can construct the DOMDocument60 object (pseudo-code):
DOMDocument60 doc = new DOMDocument60();
IXMLDOMElement relationships = doc.appendChild(doc.createElement("Relationships"));
IXMLDOMElement relationship = relationships.appendChild(doc.createElement("Relationship"));
relationship.setAttribute("Id", "rId1");
relationship.setAttribute("Type", "Frob");
relationship.setAttribute("Target", "Grob");
Now comes the question of how to add the namespace.
How to add the namespace?
If I do the obvious solution, setting an attribute on the Relationships node called xmlns:
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
through something like:
relationships.setAttribute("xmlns",
"http://schemas.openxmlformats.org/package/2006/relationships");
When the document is saved, it causes the resulting xml to be wrong:
<Relationships xmlns="http://schemas.openxmlformats.org/package/2006/relationships">
<Relationship Id="rId1" Type="Frob" Target="Grob" xmlns=""/>
</Relationships>
It places empty xmlns attributes on every other element. In this little test document it only misapplies the xmlns to one element. In the real world there are dozens, or a few million other elements with an empty xmlns attribute.
namespaceURI property
I tried setting the namespaceURI property of the Relationships element:
relationships.namespaceURI := "http://schemas.openxmlformats.org/package/2006/relationships";
but the property is read-only.
schemas Property
The document does have a schemas property, which gets or sets an XMLSchemaCache object. But it requires an actual schema document. E.g. trying to just set a schema doesn't work:
schemas = new XMLSchemaCache60();
schemas.add('', 'http://schemas.openxmlformats.org/spreadsheetml/2006/main');
doc.schemas := schemas;
But that tries to actually load a schema from that URL, even though the value is only a namespace name and not the location of a schema document.
Perhaps I have to randomly try other things:
schemas = new XMLSchemaCache60();
schemas.add('http://schemas.openxmlformats.org/spreadsheetml/2006/main', null);
doc.schemas := schemas;
But that causes no xmlns to be emitted.
Rather than trying to build an XML document the correct way, I could always use a StringBuilder to build the XML manually, and then have it parsed into an XML Document object.
But I'd rather do it the right way.
The trick is to realize that W3C DOM Level 2 and 3 have a method createElementNS:
Creates an element with the specified namespace URI and qualified name.
Syntax
element = document.createElementNS(namespaceURI, qualifiedName);
However MSXML 6 only supports DOM Level 1.
Fortunately, MSXML offers its own method (a Microsoft extension to the W3C DOM) to create an element with a namespace: createNode:
Creates a node using the supplied type, name, and namespace.
HRESULT createNode(VARIANT Type, BSTR name, BSTR namespaceURI, out IXMLDOMNode node);
Thus my solution is that I have to change:
relationships: IXMLDOMElement = doc.createElement("Relationships");
into:
const NODE_ELEMENT: Integer = 1;
const ns: string = "http://schemas.openxmlformats.org/package/2006/relationships";
relationships: IXMLDOMElement = doc.createNode(NODE_ELEMENT, "Relationships", ns);
A sucky part is that every element must be created in that namespace:
function AddElementNS(IXMLDOMNode parentNode, String tagName, String namespaceURI): IXMLDOMElement;
{
doc: IXMLDOMDocument = parentNode as IXMLDOMDocument;
if (doc == null)
doc = parentNode.ownerDocument;
if (namespaceURI <> "")
Result = doc.createNode(NODE_ELEMENT, tagName, namespaceURI)
else
Result = doc.createElement(tagName);
parentNode.appendChild(Result);
}
relationships: IXMLDOMElement = AddElementNS(doc, "Relationships", ns);
relationship: IXMLDOMElement = AddElementNS(relationships, "Relationship", ns);
relationship.setAttribute("Id", "rId1");
relationship.setAttribute("Type", "Frob");
relationship.setAttribute("Target", "Grob");
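For comparison, here is a minimal sketch of the same rule (every element is created in the namespace, rather than setting an xmlns attribute afterwards) in Python's xml.etree.ElementTree. This is not MSXML; add_element_ns is a hypothetical helper that mirrors AddElementNS above, and registering the empty prefix is one way to ask the serializer to emit the namespace as the default one.

```python
import xml.etree.ElementTree as ET

NS = "http://schemas.openxmlformats.org/package/2006/relationships"

# Ask the serializer to write this namespace without a prefix.
# Note: this registration is process-wide.
ET.register_namespace("", NS)


def add_element_ns(parent, tag_name, namespace_uri):
    """Create tag_name in namespace_uri and append it to parent (None = new root)."""
    qualified = f"{{{namespace_uri}}}{tag_name}" if namespace_uri else tag_name
    if parent is None:
        return ET.Element(qualified)
    return ET.SubElement(parent, qualified)


relationships = add_element_ns(None, "Relationships", NS)
relationship = add_element_ns(relationships, "Relationship", NS)
relationship.set("Id", "rId1")
relationship.set("Type", "Frob")
relationship.set("Target", "Grob")

xml_text = ET.tostring(relationships, encoding="unicode")
print(xml_text)
```

Because both elements live in the same namespace, the serializer only needs one declaration on the root and has no reason to emit an empty xmlns on the child.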
Bonus Reading
Creating XML with namespaces with Javascript and MSXML
I have been working with LINQ to XML and have been stuck with an issue. I would really appreciate any help. I am new to LINQ to XML, but I found it easy to work with.
I have two different syndication feeds that I aggregate to one single syndication feed using Union. The final syndication feed contains 10 items.
I am trying to write the syndication feed to an XML file using XDocument and XElement. I have been able to do that successfully for the most part. However, some of the items in the feed do not have a description node. When I get to one of those items, I get an exception because the description is missing. How can I check whether an item has a description node before I start writing the XML file? If an item does not contain the description node, how can I populate it with a default value? Could you please suggest a solution? Thank you for your time!
SyndicationFeed combinedfeed = new SyndicationFeed(newFeed1.Items.Union(newFeed2.Items).OrderByDescending(u => u.PublishDate));
//save the filtered xml file to a folder
XDocument filteredxmlfile = new XDocument(
new XDeclaration("1.0", "utf-8", "yes"),
new XElement("channel",
from filteredlist in combinedfeed.Items
select new XElement("item",
new XElement("title", filteredlist.Title.Text),
new XElement("source", FormatContent(filteredlist.Links[0].Uri.ToString())[0]),
new XElement("url", FormatContent(filteredlist.Links[0].Uri.ToString())[1]),
new XElement("pubdate", filteredlist.PublishDate.ToString("r")),
new XElement("date",filteredlist.PublishDate.Date.ToShortDateString()),
// I get an exception here as the summary/ description node is not present for all the items in the syndication feed
new XElement("date",filteredlist.Summary.Text)
)));
string savexmlpath = Server.MapPath(ConfigurationManager.AppSettings["FilteredFolder"]) + "sorted.xml";
filteredxmlfile.Save(savexmlpath);
Just check for null:
new XElement("date", filteredlist.Summary != null ? filteredlist.Summary.Text : "default summary")
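The same guard (fall back to a default when an optional child is absent) can be sketched in Python with xml.etree.ElementTree. The feed below is hypothetical sample data, not the asker's feed, and "default summary" is simply the placeholder text from the answer above.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed: the second item has no description element.
FEED = """<channel>
  <item><title>First</title><description>Has a summary</description></item>
  <item><title>Second</title></item>
</channel>"""

channel = ET.fromstring(FEED)

summaries = []
for item in channel.iter("item"):
    description = item.find("description")
    # find() returns None when the child is missing, so test before reading .text
    text = description.text if description is not None else "default summary"
    summaries.append(text)

print(summaries)
```

ElementTree also has item.findtext("description", default="default summary"), which folds the check and the fallback into one call.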
I have a node that contains 5 child nodes. Three of them are RatePlan nodes. How can I select those RatePlan child nodes with LINQ?
Let me clarify something:
My XML looks like this:
<hotels>
<hotel id="1" name="hotel 1">
<telephone>123456789</telephone>
<fax>123</fax>
<address>hotels address</address>
<hotelRatePlan>10</hotelRatePlan>
<hotelRatePlan>11</hotelRatePlan>
<hotelRatePlan>12</hotelRatePlan>
</hotel>
<hotel id="2" name="hotel 2">
<telephone>123456789</telephone>
<fax>123</fax>
<address>hotels address</address>
<hotelRatePlan>100</hotelRatePlan>
<hotelRatePlan>110</hotelRatePlan>
<hotelRatePlan>120</hotelRatePlan>
</hotel>
<hotel id="3" name="hotel 3">
<telephone>123456789</telephone>
<fax>123</fax>
<address>hotels address</address>
<hotelRatePlan>10</hotelRatePlan>
<hotelRatePlan>11</hotelRatePlan>
<hotelRatePlan>12</hotelRatePlan>
</hotel>
</hotels>
I am using XmlDocument to read the XML file. After I read it, I make a selection with SelectNodes. When I get the first hotel's information, I want to select specific child nodes (hotelRatePlan). How can I do that?
Your question isn't particularly clear, but you might just want:
var ratePlans = node.Elements("hotelRatePlan");
That's assuming you're actually using LINQ to XML rather than XmlNode, XmlDocument etc. If you are using the "old" DOM API, you could use:
var ratePlans = node.ChildNodes
.OfType<XmlElement>()
.Where(e => e.LocalName == "hotelRatePlan");
... but I'd suggest moving to LINQ to XML if you can. It's simply a much nicer XML API.
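As a language-neutral illustration of "select only the children with a given name", here is a minimal Python sketch using xml.etree.ElementTree on a trimmed copy of the XML from the question. It is not a replacement for the C# above, only a small runnable demonstration of the selection.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the question's XML (first hotel only).
HOTELS = """<hotels>
  <hotel id="1" name="hotel 1">
    <telephone>123456789</telephone>
    <fax>123</fax>
    <address>hotels address</address>
    <hotelRatePlan>10</hotelRatePlan>
    <hotelRatePlan>11</hotelRatePlan>
    <hotelRatePlan>12</hotelRatePlan>
  </hotel>
</hotels>"""

root = ET.fromstring(HOTELS)
first_hotel = root.find("hotel")

# findall with a bare tag name returns only direct children with that name,
# so telephone, fax and address are skipped.
rate_plans = [plan.text for plan in first_hotel.findall("hotelRatePlan")]

print(rate_plans)
```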
If you are sure that you will only have three rate plans per hotel, then you can load a hotel into an object of type Hotel like so:
XDocument data = XDocument.Load(yourXMLFileNameHere);
// Namespace of the root element (XNamespace.None if the document declares none)
XNamespace ns = data.Root.Name.Namespace;
List<Hotel> hotels = (from item in data.Descendants(ns + "hotel")
// collect the values of the repeated hotelRatePlan children, in document order
let plans = item.Elements(ns + "hotelRatePlan").Select(p => p.Value).ToList()
select new Hotel
{
Id = Convert.ToInt32(item.Attribute("id").Value),
Name = item.Attribute("name").Value,
Telephone = item.Element(ns + "telephone").Value,
Fax = item.Element(ns + "fax").Value,
Address = item.Element(ns + "address").Value,
RatePlan1 = plans[0],
RatePlan2 = plans[1],
RatePlan3 = plans[2]
}).ToList();
And then you reference your first rate plan in the following way:
string ratePlan1=hotels[0].RatePlan1;
If the number of your rate plans will vary, you can merge them together into a string like so:
<hotelRatePlans>10 20 30</hotelRatePlans>
Then you change the way you extract your rate plans, and when you need the actual plans, you use the String.Split method to get the array of individual plans.
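A minimal sketch of that split step, written in Python rather than C# for brevity. The hotelRatePlans element is the hypothetical merged layout suggested above, not something present in the asker's XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical merged element, following the layout suggested above.
hotel = ET.fromstring(
    '<hotel id="1" name="hotel 1">'
    "<hotelRatePlans>10 20 30</hotelRatePlans>"
    "</hotel>"
)

merged = hotel.findtext("hotelRatePlans", default="")

# split() with no arguments splits on any run of whitespace.
rate_plans = merged.split()

print(rate_plans)
```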
I think you mean:
var ratePlans = node.ChildNodes.OfType<RatePlan>();
I am trying to get all links whose parent has the class name_of_box. I wrote the code below but got nothing. How do I do this? With CSS, I believe I can select them with .name_of_box a
var ls = htmldoc.DocumentNode.Elements("//div[@class='name_of_box']//a[@href]");
HtmlAgilityPack doesn't have the ability to directly query an attribute value. You have to loop over the list of anchor nodes. Here's one way:
var ls = new List<string>();
var nodes = htmldoc.DocumentNode.SelectNodes("//div[@class='name_of_box']//a");
nodes.ToList().ForEach(a => ls.Add(a.GetAttributeValue("href", "")));
But there is an experimental build you can look at, that will let you directly query an attribute.
This can easily be done with Fizzler, a .NET library that selects items from a node tree based on a CSS selector. The default implementation is based on HtmlAgilityPack and selects from HTML documents.
See:
var ls = htmldoc.DocumentNode.QuerySelectorAll(".name_of_box a");
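To show what the XPath form of that selector picks out, here is a minimal Python sketch using xml.etree.ElementTree. It assumes the markup is well-formed XHTML (ElementTree is an XML parser, not an HTML parser), and the sample page is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample page.
PAGE = """<html><body>
<div class="name_of_box"><p><a href="/one">One</a></p><a>no href</a></div>
<div class="other"><a href="/skip">Skip</a></div>
</body></html>"""

root = ET.fromstring(PAGE)

# Anchors at any depth below a div whose class attribute is exactly
# "name_of_box", keeping only those that carry an href attribute.
anchors = root.findall(".//div[@class='name_of_box']//a[@href]")
links = [a.get("href") for a in anchors]

print(links)
```

Unlike the CSS selector .name_of_box a, the predicate @class='name_of_box' compares the whole attribute value, so an element with class="name_of_box highlighted" would not match, and the div restriction also excludes non-div parents.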