HtmlAgilityPack: add a new element and copy all inline CSS attributes to it

Is there a way to copy the inline CSS attributes from one node to a new node that I will add in HtmlAgilityPack?

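As far as I can tell, you can't just assign one HtmlAttributeCollection to another, but you can copy the attributes one by one like this:
foreach (HtmlAttribute attr in oldNode.Attributes)
{
    // Copy by name and value so the two nodes do not share one HtmlAttribute instance
    newNode.Attributes.Add(attr.Name, attr.Value);
}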

Related

CKEditor 5: How to prevent cascading of linkHref attribute changes to all child nodes of a custom element structure?

The Setup
Currently, CKEditor 5 does not support image captions for inline images. As our CMS needs to use both block and inline images, I wrote a plugin that extends the schema and conversions with a custom <caption> element that works for both <imageInline> and <imageBlock> (and a custom, plaintext-only <alt> element that provides a nested editable for the image alternative text, but that's omitted here because it's not part of the issue).
It starts with a schema extension for <caption> to work for both image types:
schema.register('caption', {
    allowIn: ['imageBlock', 'imageInline'],
    allowContentOf: '$block',
    isLimit: true
});
Then, several conversions and up-/downcast helpers from the original image packages @ckeditor/ckeditor5-image/... (image/imageblockediting.js, image/imageinlineediting.js, image/converters.js, image/utils.js, imagecaption/imagecaptionediting.js) are adapted and/or overridden with higher-priority custom versions to have an editable <caption> element inside both <imageBlock> and <imageInline>.
Both image types can have a linkHref attribute, which is implemented by extended, higher-priority versions of upcastImageLink() and downcastImageLink() from @ckeditor/ckeditor5-link/src/linkimageediting.js.
Also, we need to back up and restore the <caption> element and all its child nodes when switching between block and inline images by listening to the imageStyleCommand, because the command itself obviously does not care about our custom <caption> structure (the caption should be visible all the time, so there is no need for a toggle as in the original caption package).
All in all, it's 400+ lines of code, so I won't post it here - you can get the idea by looking at the original code mentioned above.
Working Code
Now, for block images we have the same model/view structure and functionality as the original <imageBlock> version, and for <imageInline> the model looks like this (including the parent paragraph and some text; please ignore the <altContainer> structure):
Which gets converted to this view structure (again, ignore the <altcontainer> structure):
The Problem
When selecting an <imageInline> and adding a link, the linkHref attribute is set not only on the <imageInline> element itself (that's what we want), but also on all of the <caption>'s child nodes. So the model looks like this:
And the view:
As you can see, the linked text from the original caption gets lost, too.
So how do you prevent this?
I'm assuming this happens because <imageInline> is registered with isInline: true, since linking an <imageBlock> image does not show this issue. I've tried to fix this with a registerPostFixer() routine, but that can fix only parts of the problem.
I'm working around this problem now by cloning the whole <caption> structure before linking an image, and replacing the "buggy" version with the cloned one after the link command has finished.
There should be a more elegant way to tell the engine not to apply the linkHref attribute to the <caption> child nodes and to leave existing linkHref attributes inside as they are. The same issue occurs when removing the link from the image: existing linkHref attributes on the child nodes are removed as well.

Retrieve value from within a div tag in xpath

I am trying to retrieve the value of the data-appid attribute. I have tried using following-sibling, but it's not really a sibling per se, and I'm not sure how to go about retrieving it. Any pointers would be great.
<div class="section app" data-appid="532054761" data-updateid="10184169">
The sibling axis applies to elements, not attributes.
You can reference data-appid simply as an attribute of the div element. For example,
//div/@data-appid
will select 532054761
If you need to be more specific about which div element's data-appid you want, you can use a predicate to select that particular div element. For example:
//div[@data-updateid='10184169']/@data-appid
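As a rough way to try both expressions outside a browser, here is a minimal sketch using the JDK's built-in javax.xml.xpath API. The class name DataAppIdExample, the eval helper, and the <root> wrapper around the div are assumptions made for illustration; only the div and the two XPath expressions come from the question and answer.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original answer.
class DataAppIdExample {
    // Parses the XML string and returns the string value of the XPath result.
    // For an attribute node, the string value is the attribute's value.
    static String eval(String xml, String expression) throws Exception {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        // The div from the question, wrapped in an assumed <root> element.
        String xml = "<root><div class='section app' data-appid='532054761' "
                + "data-updateid='10184169'/></root>";

        // Attribute of any div
        System.out.println(eval(xml, "//div/@data-appid"));
        // Attribute of the div picked out by a predicate on another attribute
        System.out.println(eval(xml, "//div[@data-updateid='10184169']/@data-appid"));
    }
}
```

Both calls should print 532054761 for this input. If no div matches the predicate, evaluate returns an empty string.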
You need to use the getAttribute() method.
Try the following code:
String dataAppId = driver.findElement(By.xpath("//div[@class='section app']")).getAttribute("data-appid");
System.out.println(dataAppId);

xpath: find a node whose content has a provided string

I have some HTML like this:
<div> Make </div>
And I want to match it based on the fact that the content of the node contains the text "Make".
Put another way "Make" is a substring of the div node's content and I want to make such a match on this node using XPath.
The obvious solution would be
//div[contains(., 'Make')]
but this will find all divs that contain the string "Make" anywhere within their content, so not only will it find the example you've given in the question but also any ancestor div of that one, or any divs where that substring is buried deep in a descendant element.
If you only want cases where that string is directly inside the div with no other intervening elements then you'd have to use the slightly more complex
//div[text()[contains(., 'Make')]]
This is subtly different from
//div[contains(text(), 'Make')]
which would look only in the first text node child of the div, so it would find <div>Make<br/>Break</div> but not <div>Break<br/>Make</div>
If you want to allow for intervening elements other than div, then try
//div[contains(., 'Make')][not(.//div[contains(., 'Make')])]
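To see how these variants differ in practice, here is a small sketch that counts the matches of each expression with the JDK's javax.xml.xpath API (XPath 1.0). The sample markup, the class name ContainsTextExample, and the count helper are assumptions for illustration; the expressions are the ones discussed above.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original answer.
class ContainsTextExample {
    // Assumed sample: an outer div wrapping the two divs mentioned in the answer.
    static final String XML =
            "<root><div><div>Make<br/>Break</div><div>Break<br/>Make</div></div></root>";

    // Returns how many nodes the expression selects in the XML string.
    static int count(String xml, String expression) throws Exception {
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        // Whole string value: the outer div matches too
        System.out.println(count(XML, "//div[contains(., 'Make')]"));
        // Any direct text node child contains the string
        System.out.println(count(XML, "//div[text()[contains(., 'Make')]]"));
        // Only the first text node child is tested
        System.out.println(count(XML, "//div[contains(text(), 'Make')]"));
        // Innermost divs whose content contains the string
        System.out.println(count(XML,
                "//div[contains(., 'Make')][not(.//div[contains(., 'Make')])]"));
    }
}
```

With this sample the counts should be 3, 2, 1 and 2: the outer div only matches the first expression, and the div that starts with "Break" is missed when only the first text node is tested.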
It seems like this is what you are looking for: //div[contains(text(),'Make')]
If this does not work, you can try: //div[contains(.,'Make')]. This will find all divs that contain 'Make' anywhere in their text content, including the text of descendant elements.
To find that node anywhere in the document, you would need this:
//div[contains(text(), "Make")]

Ace code Editor with XML, hide specific xml attribute?

Sorry, a newbie question. Is it possible to hide a specific attribute throughout an XML doc?
I need a way to synchronize the contents of the editor with non-Ace objects elsewhere on the DOM (unfortunately a SWF file that loads the XML separately...). I thought to label each node throughout the doc, e.g. tag='1', so that if a node with a given tag is manipulated in Ace, I can just use the tag to figure out what exactly was manipulated (and vice versa, update Ace when the XML is manipulated outside of Ace).
It is best that people do not manipulate these tags, hence wanting to hide them from view.
Thanks :)
You can create folds to hide text, but I think for tracking changes it is better to use anchors, which keep their position relative to the text:
var a = ace.session.doc.createAnchor(row, col); // create
a.getPosition(); // current {row, column} of the anchor
a.detach(); // remove when not needed anymore

Checking the HTML structure with XPATH, any count of nodes

I want to check the structure of some html piece of markup, just checking the structure.
For example, I need to check that SOMEWHERE inside the list-item-canvas div there is an <img name='category-pic'> tag.
I write:
//div[@class='list-item-canvas'][1]/*/img[@name='category-pic']
That works if <img> is a grandchild of the div, i.e. a child of any ('*') child node, BUT if <img> is somewhere deep-deep in the structure AND I do not want to care about the hierarchy level, how should I write my XPath query? I would think that instead of '*' I might write '**', but I cannot.
Is it possible?
Use:
(//div[@class='list-item-canvas'])[1]//img[@name='category-pic']
This selects any img the string value of whose name attribute is 'category-pic' and that is a descendant of the first (in document order) div the string value of whose class attribute is 'list-item-canvas'.
Do note the parentheses surrounding the subexpression:
(//div[@class='list-item-canvas'])[1]
This is quite different from:
//div[@class='list-item-canvas'][1]
The latter selects every div element in the document whose class is 'list-item-canvas' and that is the first such div child of its parent, and there may be more than one such element.
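The difference can be checked with a short sketch using the JDK's javax.xml.xpath API. The sample markup (two list-item-canvas divs under different parents, one with a deeply nested img), the class name FirstDivExample, and the count helper are assumptions for illustration.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original answer.
class FirstDivExample {
    // Assumed sample: each div is the first matching div child of its own parent.
    static final String XML =
            "<root>"
            + "<section><div class='list-item-canvas'>"
            + "<p><span><img name='category-pic'/></span></p></div></section>"
            + "<section><div class='list-item-canvas'>"
            + "<p><img name='category-pic'/></p></div></section>"
            + "</root>";

    // Returns how many nodes the expression selects in the XML string.
    static int count(String xml, String expression) throws Exception {
        NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        return nodes.getLength();
    }

    public static void main(String[] args) throws Exception {
        // Fixed depth: the img in the first div is nested too deeply to match
        System.out.println(count(XML,
                "(//div[@class='list-item-canvas'])[1]/*/img[@name='category-pic']"));
        // Any depth below the first matching div in document order
        System.out.println(count(XML,
                "(//div[@class='list-item-canvas'])[1]//img[@name='category-pic']"));
        // Without parentheses, [1] is applied per parent, so both divs qualify
        System.out.println(count(XML,
                "//div[@class='list-item-canvas'][1]//img[@name='category-pic']"));
    }
}
```

For this sample the counts should be 0, 1 and 2, which shows both why // is needed for arbitrary depth and why the parentheses matter.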
Do this:
//div[@class='list-item-canvas'][1]//img[@name='category-pic']
The // before img lets you find any descendant of the div that is an img, instead of just children or grandchildren of the div.
Also are you sure you want the [1] there? It may not be doing what you think.
