Using HTML Agility Pack in Windows Phone 7 - html-agility-pack

How can I get the text of the p tags inside the body tag using LINQ with HtmlAgilityPack?
I have heard people say that HtmlAgilityPack doesn't support XPath, but I am not sure whether that is true.
I need to parse HTML code.

The simplest way to use HtmlAgilityPack:
HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html); // html is a string containing the HTML code
var paragraphTags = doc.DocumentNode.SelectNodes("//p"); // "//p" selects all p tags at any depth; returns null if there are none
if (paragraphTags != null)
{
    for (int i = 0; i < paragraphTags.Count; i++) // loop through the p tags
    {
        string text = paragraphTags[i].InnerHtml; // use InnerText instead to get the text without markup
        // text has your paragraph content. Use it here.
    }
}

Related

Can we set list-style-position in a PDF using iText 7? Is there any API available?

I am trying to find an equivalent method in iText 7 to set the style below on lists. Is there any API available in iText 7?
list-style-position: inside;
Yes, there is an API to set the analogue of the list-style-position CSS property directly in layout code. For that, use listElement.setProperty(Property.LIST_SYMBOL_POSITION, ListSymbolPosition.INSIDE);
Example code:
Document document = new Document(pdfDocument);
List defaultList = new List();
for (int i = 0; i < 3; i++) {
    defaultList
        .add("Am trying to find an equivalent method in Itext 7 to set the below style on lists. Is there any API available in Itext7");
}
List bulletPositionInsideList = new List();
for (int i = 0; i < 3; i++) {
    bulletPositionInsideList
        .add("Am trying to find an equivalent method in Itext 7 to set the below style on lists. Is there any API available in Itext7");
}
bulletPositionInsideList.setProperty(Property.LIST_SYMBOL_POSITION, ListSymbolPosition.INSIDE);
document.add(defaultList);
document.add(bulletPositionInsideList);
[Image: visual result of the default list vs. the one with list symbol position set to inside]

Kendo UI Editor: how to modify the user selection with a range object

Kendo UI 2015.2.805, Kendo UI Editor for JavaScript
I want to extend the Kendo UI editor by adding a custom tool that converts a user-selected block spanning two or more paragraphs into a block of single-spaced text. This can be done by locating all interior p tags and converting them into br tags, taking care not to change the first or last tag.
My problem is working with the range object.
Getting the range is easy:
var range = editor.getRange();
The range object has a start and end container, and a start and end offset (within that container). I can access the text (without markup):
console.log(range.toString());
Oddly, other examples I have seen, including working examples, show that
console.log(range);
will dump the text. However, that does not work in my project: I just get the word 'Range', which is the type of the object. This concerns me.
All I really need, however, is a start and end offset in the editor's markup (editor.value()); then I can locate the p's and change them to br's.
I've read the Telerik documentation and the referenced quirksmode site's explanation of HTML ranges, and while they are informative, nothing shows how to locate the range within the text (which seems pretty basic to me).
I suspect I'm overlooking something simple.
Given a range object how can I locate the start and end offset within the editor's content?
EDIT: After additional research it appears much more complex than I anticipated. It seems I must deal with the range and/or selection objects rather than directly with the editor content. Smarter minds than I came up with the range object for reasons I cannot fathom.
Here is what I have so far:
var range = letterEditor.editor.getRange();
var divSelection;
divSelection = range.cloneRange();
//cloning may be needless extra work...
//here manipulate the divSelection to how I want it.
//divSelection is a range, not sure how to manipulate it
var sel = letterEditor.editor.getSelection();
sel.removeAllRanges();
sel.addRange(divSelection);
EDIT 2:
Based on Tim Down's Solution I came up with this simple test:
var html;
var sel = letterEditor.editor.getSelection();
if (sel.rangeCount) {
var container = document.createElement("div");
for (var i = 0, len = sel.rangeCount; i < len; ++i) {
container.appendChild(sel.getRangeAt(i).cloneContents());
}
html = container.innerHTML;
}
html = html.replace(/<\/p><p>/g, "<br/>"); // a plain string pattern would replace only the first match
var range = letterEditor.editor.getRange();
range.deleteContents();
var div = document.createElement("div");
div.innerHTML = html;
var frag = document.createDocumentFragment(), child;
while ((child = div.firstChild)) {
frag.appendChild(child);
}
range.insertNode(frag);
The first part, getting the HTML of the selection, works fine. The second part also works, but the editor inserts tags around all lines, so the result is incorrect: there are extra lines, including fragments of the selection.
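A detail relevant to the replacement step in the test above: in JavaScript, replace with a plain string pattern substitutes only the first occurrence, so a selection spanning three or more paragraphs needs a regular expression with the g flag. A minimal sketch, using an illustrative sample string rather than real editor output:

```javascript
// Illustrative sample with two paragraph boundaries (not real editor output).
const selected = "<p>one</p><p>two</p><p>three</p>";

// A string pattern replaces only the first match.
const firstOnly = selected.replace("</p><p>", "<br/>");

// A regular expression with the g flag replaces every match.
const everyMatch = selected.replace(/<\/p><p>/g, "<br/>");

console.log(firstOnly);  // <p>one<br/>two</p><p>three</p>
console.log(everyMatch); // <p>one<br/>two<br/>three</p>
```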
The editor supports a view HTML popup, which shows the editor content as HTML and allows editing it. If I change the targeted p tags to br's there, I get the desired result. (The editor does support br as the default line feed instead of p, but I want p's most of the time.) That I can edit the HTML with the HTML viewer tool tells me this is possible; I just need to identify the selection start and end in the editor content, and then a simple textual replacement via regex on the editor value would do the trick.
Edit 3:
Poking around kendo.all.max.js I discovered that pressing shift+enter creates a br instead of a p tag for the line feed. I was going to extend it to do just that as a workaround for the single-space tool. I would still like a solution to this if anyone knows, but for now I will instruct users to shift-enter for single spaced blocks of text.
This will accomplish it. It uses Tim Down's code to get the selected HTML. The regular expressions could probably be made more efficient. The trick is passing split: false to insertHtml.
var sel = letterEditor.editor.getSelection();
if (sel.rangeCount) {
    var container = document.createElement("div");
    for (var i = 0, len = sel.rangeCount; i < len; ++i) {
        container.appendChild(sel.getRangeAt(i).cloneContents());
    }
    var block = container.innerHTML;
    block = block.replace(/<br class="k-br">/gi, "");
    block = block.replace(/<\/p><p>/gi, "<br/>");
    block = block.replace(/<\/p>|<p>/gi, "");
    letterEditor.editor.exec("insertHtml", { html: block, split: false });
}
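The three replacements above can also be tried without the editor. The sketch below wraps them in a pure function so the string handling can be checked on its own; the name singleSpace and the sample input are hypothetical, and the k-br class is taken from the answer's code rather than verified against the editor's output.

```javascript
// Hypothetical helper: collapses a multi-paragraph HTML fragment into one
// block whose lines are separated by <br/>, using the same three steps.
function singleSpace(html) {
  return html
    .replace(/<br class="k-br">/gi, "") // drop the placeholder breaks
    .replace(/<\/p><p>/gi, "<br/>")     // paragraph boundary -> line break
    .replace(/<\/?p>/gi, "");           // strip the remaining outer p tags
}

// Illustrative input, not captured from a real editor session.
const sample = '<p>line one</p><p>line two<br class="k-br"></p><p>line three</p>';
console.log(singleSpace(sample)); // line one<br/>line two<br/>line three
```

These patterns only match bare p tags, so paragraphs carrying attributes (for example a style or class) would pass through unchanged.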

Extract images from PDF with iTextSharp using JScript

I've seen a few posts on extracting images from PDF using iTextSharp, but all are VB/C# based.
A core part of these solutions is something like:
PdfDictionary res = (PdfDictionary)(PdfReader.GetPdfObject(dict.Get(PdfName.RESOURCES)));
PdfDictionary xobj = (PdfDictionary)(PdfReader.GetPdfObject(res.Get(PdfName.XOBJECT)));
if (xobj != null)
{
foreach (PdfName name in xobj.Keys)
I can create the res and xobj objects fine in JScript, but JScript does not support foreach loops. I have to do something like
for (var x = 0; x < xobj.Keys.Count; x++)
{
    var name = xobj.Keys(x)
    ...
}
But this is of course invalid.
Can someone explain how I can iterate over all the keys in xobj without using a foreach loop?

Parse PDF with ABCPDF

I want to parse a PDF document that I download, using ABCPDF, but I can't find any elements in the document, nor how to reach them and iterate over them. I want to extract some text.
var webClient = new WebClient();
var bytes = webClient.DownloadData("http://test.com/test.pdf");
var doc = new Doc();
doc.Read(bytes);
Use the Doc.GetText method to extract content from the current page, specifying the format in which content is to be returned.
doc.PageNumber = 1;
string pageContent = doc.GetText("Text");
The example above will return plain text in layout order. Specifying "SVG" or "SVG+" returns additional information along with the text, such as style and position.

Generate empty alt attribute in GWT

I am working on a big project in GWT, and I would like to add an empty alt attribute to image tags without rewriting every image in the project. Is there any way to do this in GWT?
Thanks
I'm not sure why you'd want to add empty alt tags, but this should add an empty alt attribute to every img tag on the page:
NodeList<Element> elems = Document.get().getElementsByTagName("img");
for (int i = 0; i < elems.getLength(); i++) {
    elems.getItem(i).setPropertyString("alt", "");
}
Alternatively, you could do ((ImageElement) elems.getItem(i)).setAlt("");
