Remove HTML formatting in Razor MVC 3 - asp.net-mvc-3

I am using MVC 3 and Razor View engine.
What I am trying to do
I am making a blog using MVC 3, and I want to remove all HTML formatting tags like <p>, <b>, <i>, etc.
For this I am using the following code (it does work):
@{
post.PostContent = post.PostContent.Replace("<p>", " ");
post.PostContent = post.PostContent.Replace("</p>", " ");
post.PostContent = post.PostContent.Replace("<b>", " ");
post.PostContent = post.PostContent.Replace("</b>", " ");
post.PostContent = post.PostContent.Replace("<i>", " ");
post.PostContent = post.PostContent.Replace("</i>", " ");
}
I feel that there definitely has to be a better way to do this. Can anyone please guide me on this?

Thanks Alex Yaroshevich,
Here is what I use now..
post.PostContent = Regex.Replace(post.PostContent, @"<[^>]*>", String.Empty);

The regular expression is slow. Use this instead; it's faster:
public static string StripHtmlTagByCharArray(string htmlString)
{
    char[] array = new char[htmlString.Length];
    int arrayIndex = 0;
    bool inside = false;
    for (int i = 0; i < htmlString.Length; i++)
    {
        char let = htmlString[i];
        if (let == '<')
        {
            inside = true;
            continue;
        }
        if (let == '>')
        {
            inside = false;
            continue;
        }
        if (!inside)
        {
            array[arrayIndex] = let;
            arrayIndex++;
        }
    }
    return new string(array, 0, arrayIndex);
}
You can take a look at http://www.dotnetperls.com/remove-html-tags

Just in case you want to use regex in .NET to strip the HTML tags, the following seems to work pretty well on the source code for this very page. It's better than some of the other answers on this page because it looks for actual HTML tags instead of blindly removing everything between < and >. Back in the BBS days, we typed <grin> a lot instead of :), so removing <grin> is not an option. :)
This solution only removes the tags. It does not remove the contents of those tags in situations where that might be important -- a script tag, for example. You'd see the script, but the script wouldn't execute because the script tag itself gets removed. Removing the contents of an HTML tag is VERY tricky, and practically requires that the HTML fragment be well formed...
Also note the RegexOptions.Singleline option. That's very important for any block of HTML, as there's nothing wrong with opening an HTML tag on one line and closing it on another.
string strRegex = @"</{0,1}(!DOCTYPE|a|abbr|acronym|address|applet|area|article|aside|audio|b|base|basefont|bdi|bdo|big|blockquote|body|br|button|canvas|caption|center|cite|code|col|colgroup|datalist|dd|del|details|dfn|dialog|dir|div|dl|dt|em|embed|fieldset|figcaption|figure|font|footer|form|frame|frameset|h1|h2|h3|h4|h5|h6|head|header|hr|html|i|iframe|img|input|ins|kbd|keygen|label|legend|li|link|main|map|mark|menu|menuitem|meta|meter|nav|noframes|noscript|object|ol|optgroup|option|output|p|param|pre|progress|q|rp|rt|ruby|s|samp|script|section|select|small|source|span|strike|strong|style|sub|summary|sup|table|tbody|td|textarea|tfoot|th|thead|time|title|tr|track|tt|u|ul|var|video|wbr){1}(\s*/{0,1}>|\s+.*?/{0,1}>)";
Regex myRegex = new Regex(strRegex, RegexOptions.Singleline);
string strTargetString = @"<p>Hello, World</p>";
string strReplace = @"";
return myRegex.Replace(strTargetString, strReplace);
I'm not saying this is the best answer. It's just an option and it worked great for me.
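For comparison, here is a minimal sketch of the same allowlist idea in Python rather than C#. The tag list is a deliberately short, hypothetical subset of the one above, and strip_known_tags is a made-up name; the only point is that text like <grin> survives because it is not a known tag.

```python
import re

# Hypothetical, shortened allowlist; the answer above uses the full HTML tag list.
KNOWN_TAGS = ("p", "b", "i", "br", "a", "div", "span", "script")

# Matches an opening, closing or self-closing tag whose name is on the allowlist.
# Attributes are matched with [^>]*, which also spans line breaks.
TAG_RE = re.compile(
    r"</?(?:" + "|".join(KNOWN_TAGS) + r")(?:\s[^>]*)?/?>",
    re.IGNORECASE,
)


def strip_known_tags(html):
    """Remove only allowlisted tags; leave any other <...> text untouched."""
    return TAG_RE.sub("", html)
```

With this sketch, strip_known_tags("<p>Hello, <grin> World</p>") returns "Hello, <grin> World". As with the regex above, only the tags go away; the contents of a script element would still be left in the text.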

Related

Reversing string not working using Processing (https://processing.org/)

I have an assignment for school where I need to do some things with text. One of them being reversing a string.
Now I've got a while-loop that kind of works, but I have some questions about it.
if(drawRev){
  int i = textBoxInput.length();
  while(i>0){
    textRev += textBoxInput.substring(i-1,i);
    i--;
    if(i==0){
      finalReversed = textRev;
      drawRev = false;
      drawReverse = true;
    }
  }
}
So first thing I'd like to ask is: Why does the while-loop not stop when i reaches 0?
The boolean drawRev is true when I click a button but I have to manually make it false if i==0.
I shouldn't have to do this right?
Second question I have is: How do I keep the reversed text to display it?
It does in fact reverse the text when I enter it, but it immediately turns into an empty string when it finishes.
I'm a beginning student and pretty new to programming in general, so keep it simple please!
If you'd like to see the whole code it's available here: http://pastebin.com/f1dW8b0Y
I've got it working.
I tried to make it too complex.
Thanks to deamentiaemundi.
This works:
if(drawRev){
  int i = textBoxInput.length();
  while(i>0){
    textRev += textBoxInput.substring(i-1,i);
    i--;
  }
}
Here's the working code for someone with a similar issue: http://pastebin.com/mQC9AwVD
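The same backward loop as a standalone function, sketched in Python for brevity (reverse_text is a hypothetical name, not something from the pastebin code). It assumes nothing about the surrounding sketch: the accumulator is created fresh on every call, and the loop condition alone ends the loop once i reaches 0, so no extra flag is needed inside it.

```python
def reverse_text(text):
    """Build the reversed string by walking the input from the last character to the first."""
    reversed_text = ""      # fresh accumulator on every call
    i = len(text)
    while i > 0:            # the loop stops by itself when i reaches 0
        reversed_text += text[i - 1]
        i -= 1
    return reversed_text
```

For example, reverse_text("test") returns "tset".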
Another way to reverse a string (JavaScript with jQuery):
$(document).ready( function(){
  var str = "test";
  var revstr = str.split("").reverse().join(""); //"test" to ['t','e','s','t'] to ['t','s','e','t'] to "tset"
  $(".test").text(revstr);
});
For reference: How do you reverse a string in place in JavaScript?

MVC3 Display Templates Truncate String

I have created a custom DisplayFor template that will be used mainly in my index view, so that records shown in lists don't look ugly when some values are very long. I have tried the following:
@model string
@{
    string text = Html.Encode(Model??"");
    if (text.Length >= 35)
    {
        text = text.Substring(0, 35)+"...";
    }
    @Html.DisplayFor(model=>text)
}
It works fine for strings of 35 characters or more, but it doesn't work if the string is shorter than that. I have tried adding an else statement, but that doesn't work either.
What is the correct way to do this?
Edit: Null string. In the source page file, between the two there is nothing.
Try this for the template:
@model string
@{
    string text = Html.Encode(Model ?? "");
    if (text.Length >= 35)
    {
        text = text.Substring(0, 35) + "...";
    }
}
@text
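One thing worth checking in either template is the order of operations: if the text is encoded first and cut afterwards, the cut can land in the middle of an entity such as &amp;. A minimal sketch of the truncate-then-encode order, written in Python rather than Razor (truncate_for_display and its limit parameter are hypothetical names), and using > instead of >= so that a string of exactly 35 characters is left alone:

```python
import html


def truncate_for_display(value, limit=35):
    """Cut the raw text first, then HTML-encode the result."""
    text = value or ""                  # treat None like an empty string
    if len(text) > limit:
        text = text[:limit] + "..."     # truncate before encoding
    return html.escape(text)            # encode once, at the very end
```

Here truncate_for_display(None) returns an empty string and short values come back unchanged apart from encoding.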

T4 FieldName in camelCase without Underscore?

I'm using T4 to generate some class definitions and find that I'm getting an underscore in front of my field names.
I have set
code.CamelCaseFields = true;
just to be safe (even though I understand that's the default) but still end up with _myField rather than myField.
How can I generate a field name without the '_' character?
Also, where is the documentation for T4? I'm finding plenty of resources such as
Code Generation and Text Templates and numerous blogs, but I have not found the class-by-class, property-by-property documentation.
You're probably talking about EF4 Self Tracking Entities. The CodeGenerationTools class is included via the <#@ include file="EF.Utility.CS.ttinclude"#> directive, which you can find at "[VSInstallDir]\Common7\IDE\Extensions\Microsoft\Entity Framework Tools\Templates\Includes\EF.Utility.CS.ttinclude".
The FieldName function is defined as such:
private string FieldName(string name)
{
    if (CamelCaseFields)
    {
        return "_" + CamelCase(name);
    }
    else
    {
        return "_" + name;
    }
}
The "_" is hardcoded in the function. Coding your own shouldn't be difficult. Note that the CodeGenerationTools class is specific to this ttinclude file and isn't a generic and embedded way to generate code in T4.
I've written the following method to make first character upper case, remove spaces/underscores and make next character upper case. See samples below. Feel free to use.
private string CodeName(string name)
{
    name = name.ToLowerInvariant();
    string result = string.Empty;
    bool upperCase = false;
    for (int i = 0; i < name.Length; i++)
    {
        if (name[i] == ' ' || name[i] == '_')
        {
            // Skip the separator and capitalise the next character
            upperCase = true;
        }
        else
        {
            if (i == 0 || upperCase)
            {
                result += name[i].ToString().ToUpperInvariant();
                upperCase = false;
            }
            else
            {
                result += name[i];
            }
        }
    }
    return result;
}
input/output samples:
first_name = FirstName,
id = Id,
status message = StatusMessage
This is good advice, but it doesn't tell you WHERE to put such a function.
Is there any guidance on decomposing the EF .tt files, or on stepping through the output generation to see how it builds the output?
I was able to use the above function successfully by plugging it into this function (EF 4.3):
public string Property(EdmProperty edmProperty)
It appears to be used to output lines like "public int fieldname { get; set; }". I changed the third argument (index {2}) of the format call so that the name is wrapped in the function, like this:
_typeMapper.GetTypeName(edmProperty.TypeUsage), //unchanged
UnderScoreToPascalCase(_code.Escape(edmProperty)), //wrapped "name"
_code.SpaceAfter(Accessibility.ForGetter(edmProperty)), // unchanged
This is not perfect; for example, it doesn't preserve existing upper-case letters, so customerIP outputs Customerip, which IMO is not very readable.
But it's better than what I WAS looking at, which was a nightmare: the database was an intermingled mess of camelCase, PascalCase and underscore separation.
Anyway, hope this helps someone.
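If keeping existing capitals matters (the customerIP case), one option is to upper-case only the first character of each part and leave the rest untouched. A minimal sketch of that idea in Python rather than C# (to_pascal_case is a hypothetical name, not part of the EF templates):

```python
import re


def to_pascal_case(name):
    """Split on spaces/underscores and capitalise only the first letter of each part."""
    parts = [p for p in re.split(r"[ _]+", name) if p]
    # Leave the remaining characters as they are, so "customerIP" keeps its "IP".
    return "".join(p[0].upper() + p[1:] for p in parts)
```

This gives FirstName for first_name, StatusMessage for status message and CustomerIP for customerIP. The trade-off is that all-caps column names such as FIRST_NAME come out as FIRSTNAME, so it only suits databases that don't use that style.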

How can I search for a text and fill/click on a link with Selenium?

Here's the deal:
Is there a way to search for an input whose name or type is not known precisely and fill it?
For example, I want to fill any input named email with my email address, but I may have inputs named email-123, emailemail, emails, etc. Is there a way to do something like *email*?
And how can I click on a link by checking for some text that could be on the link, above it, near it, in its class, etc.?
PS: I'm using Selenium IDE with Firefox.
You can use XPath to find it with something like //input[contains(@name,'email')]. If you have multiple instances like that on the page, it will be worth moving your test to your favourite programming language and then doing:
emailInstances = sel.get_xpath_count("//input[contains(@name,'email')]")
for i in range(int(emailInstances)):
    # XPath positions are 1-based; convert the index to a string before concatenating
    sel.type("//input[contains(@name,'email')][" + str(i + 1) + "]", "email@address.tld")
XPath works well and the solution above is good. If you are trying to test old versions of IE, you could also use JavaScript injection. I find it is very fast, although it can be a bit trickier to debug. I haven't actually checked that the code below works, but hopefully it gives you an idea of what you can do:
String javaScript = "_sl_enterEmailStr = function(parentObj,str) { "+
" var allTags = parentObj.getElementsByTagName('input'); "+
" for (var i = 0; i < allTags.length; ++i) { "+
" var tag = allTags[i]; "+
" if (tag.name && tag.type && tag.type === 'text' "+
" && tag.name.match(/email/)) { "+
" tag.value = str; "+
" } "+
" } "+
"}; "+
"_sl_enterEmailStr(this.browserbot.getCurrentWindow().document "+
" ,'myemail@mydomain.org'); ";
mySelenium.getEval(javaScript);
I find JavaScript injection with regular expressions allows me to do great things to dynamic input fields. Note you can use findElement() to be more specific about where you look for tags.
Regarding clicking a link and getting text, those are simple click() and getText() operations that can be done given the proper locator. I would check out the Selenium API; for example, here is the link to the Java one for 1.0b2.

Image tag not closing with HTMLAgilityPack

Using the HtmlAgilityPack to write out a new image node, it seems to remove the closing slash of the image tag, e.g. it should be <img ... /> but when you check the OuterHtml, it has <img ...>.
string strIMG = "<img src='" + imgPath + "' height='" + pubImg.Height + "px' width='" + pubImg.Width + "px' />";
HtmlNode newNode = HtmlNode.Create(strIMG);
This breaks xhtml.
Telling it to output XML as Micky suggests works, but if you have other reasons not to want XML, try this:
doc.OptionWriteEmptyNodes = true;
Edit 1: Here is how to fix an HTML Agility Pack document to correctly display image (img) tags:
if (HtmlNode.ElementsFlags.ContainsKey("img"))
{
    HtmlNode.ElementsFlags["img"] = HtmlElementFlag.Closed;
}
else
{
    HtmlNode.ElementsFlags.Add("img", HtmlElementFlag.Closed);
}
Replace "img" with any other tag to fix it as well (input, select, and option come up frequently). Repeat as needed. Keep in mind that this will produce <img></img> rather than <img />, because of the HAP bug preventing the "closed" and "empty" flags from being set simultaneously.
Source: Mike Bridge
Original answer:
Having just labored over solutions to this issue, and not finding any sufficient answers (doctype set properly, using Output as XML, Check Syntax, AutoCloseOnEnd, and Write Empty Node options), I was able to solve this with a dirty hack.
This will certainly not solve the issue outright for everyone, but for anyone returning their generated html/xml as a string (EG via a web service), the simple solution is to use fake tags that the agility pack doesn't know to break.
Once you have finished doing everything you need to do on your document, call the following method once for each tag giving you a headache (notable examples being option, input, and img). Immediately after, render your final string, do a simple replace for each tag prefixed with some string (in this case "fix_"), and return your string.
This is only marginally better in my opinion than the regex solution proposed in another question I cannot locate at the moment (something along the lines of )
private void fixHAPUnclosedTags(ref HtmlDocument doc, string tagName, bool hasInnerText = false)
{
    // SelectNodes returns null (not an empty collection) when nothing matches
    var tags = doc.DocumentNode.SelectNodes("//" + tagName);
    if (tags == null)
    {
        return;
    }
    foreach (var tag in tags)
    {
        // Stand-in element with a name the agility pack has no special handling for
        HtmlNode tagReplacement = HtmlNode.CreateNode("<fix_" + tagName + "></fix_" + tagName + ">");
        foreach (var attr in tag.Attributes)
        {
            tagReplacement.SetAttributeValue(attr.Name, attr.Value);
        }
        if (hasInnerText) // for option tags and other non-empty nodes, the next (text) node will be its inner HTML
        {
            tagReplacement.InnerHtml = tag.InnerHtml + tag.NextSibling.InnerHtml;
            tag.NextSibling.Remove();
        }
        tag.ParentNode.ReplaceChild(tagReplacement, tag);
    }
}
As a note, if I were a betting man, I would guess that Mike Bridge's answer above inadvertently identifies the source of this bug in the pack: something is causing the closed and empty flags to be mutually exclusive.
Additionally, after a bit more digging, I don't appear to be the only one who has taken this approach:
HtmlAgilityPack Drops Option End Tags
Furthermore, in cases where you ONLY need non-empty elements, there is a very simple fix listed in that same question, as well as the HAP codeplex discussion here: This essentially sets the empty flag option listed in Mike Bridge's answer above permanently everywhere.
There is an option to turn on XML output that makes this issue go away.
var htmlDoc = new HtmlDocument();
htmlDoc.OptionOutputAsXml = true;
htmlDoc.LoadHtml(rawHtml);
This seems to be a bug with HtmlAgilityPack. There are many ways to reproduce this, for example:
Debug.WriteLine(HtmlNode.CreateNode("<img id=\"bla\"></img>").OuterHtml);
Outputs malformed HTML. Using the suggested fixes in the other answers does nothing.
HtmlDocument doc = new HtmlDocument();
doc.OptionOutputAsXml = true;
HtmlNode node = doc.CreateElement("x");
node.InnerHtml = "<img id=\"bla\"></img>";
doc.DocumentNode.AppendChild(node);
Debug.WriteLine(doc.DocumentNode.OuterHtml);
Produces malformed XML / XHTML like <x><img id="bla"></x>
I have created an issue on CodePlex for this.

Resources