CKEditor: How to get only characters - ckeditor

I'm using the wordcount plugin for CKEditor. It displays the word count and the character count (ignoring spaces) perfectly.
How do I get only the characters (without spaces and line breaks)? Is there a default API for this in CKEditor or the wordcount plugin?
editor.getData() - returns complete text with HTML
editorContent.text().trim() - returns text(without HTML) but it doesn't ignore line-breaks and spaces.

No, there is no official API or plugin for that.
Instead of trimming editorContent.text() you could replace all whitespace characters using regex (e.g. /\s/g).
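
A minimal sketch of the regex approach suggested above. The helper name countNonWhitespaceChars is made up for illustration and is not part of CKEditor or the wordcount plugin; it assumes you already have the plain text (HTML stripped) as a string.

```javascript
// Hypothetical helper, not a CKEditor or wordcount API.
// Counts the characters in plain text, ignoring every whitespace character.
function countNonWhitespaceChars(text) {
  // In JavaScript, \s matches spaces, tabs, line breaks and
  // non-breaking spaces (\u00a0), so all of them are removed here.
  return text.replace(/\s/g, '').length;
}

// Example: pass in the text you already extracted from the editor.
console.log(countNonWhitespaceChars('Hello \n world\t!'));
```

Note that .length counts UTF-16 code units, so characters outside the Basic Multilingual Plane (for example most emoji) count as two.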

Related

Google Sheets - Find and replace specific characters in a cell

I need to find some specific characters in a cell and replace them with other characters.
So far I can do that by using :
=SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(A1,"★","•",0),"<b>","",0),"</b>","",0),"✔ ","",0)
However, this formula becomes very long if I need to replace a lot of characters. Is there any way to reduce the duplicated parts, especially when several characters need to be replaced with the same one? Ex: replace <b>, </b>, ✔ with "" as in the example above.
Demo sheet: https://docs.google.com/spreadsheets/d/1wX9mEykCMjeotTRTg_jSMcm9Mm7WM0kPTetRwGLzaYU/edit#gid=0
Google Sheets (but not Excel) has a handy formula, REGEXREPLACE, that will let you do what you need:
=SUBSTITUTE(REGEXREPLACE(A1,"<b>|</b>|✔",""),"★","•")
If you need to remove any more characters, just add them after the checkmark, separated by |.

Render non english characters in asciidoctor-pdf

I am trying to write documentation with asciidoctor-pdf and I need to use characters like : ă,â,î,ş,ţ. The pdf output is rendered but the mentioned characters are rendered empty. I am not sure how to handle the issue.
For example:
I wrote this code:
= Document Title
Doc Writer <doc@example.com>
:doctype: book
:source-highlighter: coderay
:listing-caption: Listing
// Uncomment next line to set page size (default is Letter)
//:pdf-page-size: A4
A simple http://asciidoc.org[AsciiDoc] document.
== Introducţie
A paragraph followed by a simple list with square bullets.
And the result was the word Introducţie rendered as Introduc ie and finally the error:
/usr/local/rvm/gems/ruby-2.2.2/gems/pdf-core-0.2.5/lib/pdf/core/pdf_object.rb:55: warning: regexp match /.../n against to UTF-8 string
Could this be a system encoding configuration problem?
Do I need to set a different encoding configuration in Ruby?
Thank you.
I think that if you want to be sure, you can always use the decimal entity reference form. For the Latin small letter t with cedilla it is: &#355;
Check this table for the complete list:
List of Unicode characters
In addition, if you want to use this special char in a title, there was an issue with it:
Section id with characters outside of Windows-1252 encoding causes warning
It seems to be fixed now, but I did not verify it.
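
If there are many such characters, typing the references by hand gets tedious. The following is a rough sketch of a pre-processing step that rewrites every non-ASCII character as a decimal entity reference. The function name toDecimalEntities is hypothetical and has nothing to do with asciidoctor-pdf itself; it blindly converts everything, including text inside listings, so treat it as an illustration rather than a finished tool.

```javascript
// Hypothetical helper: replace each non-ASCII character with its
// decimal entity reference (e.g. U+0163 becomes &#355;).
function toDecimalEntities(text) {
  // The "u" flag makes the regex work on whole code points,
  // so characters outside the BMP are converted as a single reference.
  return text.replace(/[^\x00-\x7F]/gu, (ch) => `&#${ch.codePointAt(0)};`);
}

// "\u0163" is the Latin small letter t with cedilla.
console.log(toDecimalEntities('== Introduc\u0163ie'));
```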
One possible way to write such special characters in titles is to declare them in the preamble of your AsciiDoc document, for example,
:t-cedil: &#355;
and to call it in the main text
== pass:normal[Test-{t-cedil}]
So your title will look like
Test-ţ

How to escape the back-tick (`) character in tiddlywiki?

I would like to use the back-tick in regular text (not in a code snippet) in TW5. Is this possible?
It is also possible to use the hex code (&#x60;) or the HTML code (&#96;) for the back-tick.
I wanted to display a back-tick in a code block, so I used:
<code>&#x60;</code>
Using the HTML code looks like:
<code>&#96;</code>
This has the advantage that you're not disabling any parsing rules.
In TiddlyWiki5 you can disable certain parsing rules using the \rules pragma
A pragma is a special component of WikiText that provides control over the way the remaining text is parsed.
http://tiddlywiki.com/#Pragma
So if you add
\rules except codeinline
at the very(!) beginning of your tiddler text, any following backtick symbol in the text is not interpreted as a special character.
This comes however at the cost that you cannot use this symbol as wikitext-directive anymore to achieve inline-code for programming snippets. Instead you would need to add the html code tag manually.

Error while rendering '&' in telerik html textbox

How can I handle an ampersand ("&") character in a Telerik HTML textbox?
While rendering, it's giving me an error. Also, does anybody know about any other character that may cause errors in an HTML textbox?
Ampersand is a special character in HTML that specifies the start of an escape sequence (so you can do something like &copy; to get a copyright symbol, etc.). If you want to display an ampersand you have to escape it. So if you replace all ampersands with &amp;, that should take care of the error.
However, if there were ampersands in your input that were already escaped - like maybe your data had &copy; - you wouldn't want to escape that ampersand. But if your data won't have any of these ampersands, a simple replace should be fine.
You also need to replace greater than and less than symbols (> and <) with &gt; and &lt; respectively.
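
The replacements described above could be sketched as follows. The function name escapeHtmlText is hypothetical and is not a Telerik API. It rests on one assumption: anything that already looks like an entity (&name;, &#123; or &#x1F;) is meant as an entity and is left alone. Whether the Telerik HTML textbox accepts every such entity is not verified here.

```javascript
// Hypothetical helper: escape &, < and > so they are treated as regular text.
function escapeHtmlText(input) {
  return input
    // Escape "&" unless it already starts something shaped like an entity
    // (named, decimal or hexadecimal), so "&copy;" is not double-escaped.
    .replace(/&(?!(?:[a-zA-Z][a-zA-Z0-9]*|#\d+|#[xX][0-9a-fA-F]+);)/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

console.log(escapeHtmlText('R&D <draft> &copy; 2020'));
```

If your data never contains pre-escaped entities, the lookahead can be dropped and a plain replace of every & is enough, as the answer says.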
Telerik talks about these limitations/issues on this page http://www.telerik.com/help/reporting/report-items-html-text-box.html
Also according to the HTML specification (and the general XML
specification as well) the "&", "<" and ">" characters are considered
special (markup delimiters), so they need to be encoded in order to be
treated as regular text. For example the "&" character can be escaped
with the "&amp;" entity. More information on the subject you can find in
this w3.org article.

Parsing out abnormal characters

I have to work with text that was previously copy/pasted from an excel document into a .txt file. There are a few characters that I assume mean something to excel but that show up as an unrecognised character (i.e. that '?' symbol in gedit, or one of those rectangles in some other text editors.). I wanted to parse those out somehow, but I'm unsure of how to do so. I know regular expressions can be helpful, but there really isn't a pattern that matches unrecognisable characters. How should I set about doing this?
You could maybe work with http://spreadsheet.rubyforge.org/ to read and parse the data.
I suppose you're getting these characters because the text file contains invalid Unicode characters; that means your '?'s and rectangles could actually be unrecognized multi-byte sequences.
If you want to handle the spreadsheet contents properly, I recommend that you first export the data to CSV using (Open|Libre)Office and choose UTF-8 as the file encoding.
https://en.wikipedia.org/wiki/Comma-separated_values
If you are not worried about multi byte sequences I find this regex to be handy:
line.gsub( /[^0-9a-zA-Z\-_]/, '*' )
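
The regex above replaces everything except letters, digits, hyphens and underscores, including spaces. A narrower variant is sketched below in JavaScript rather than Ruby. It assumes the text has already been decoded to a string and that the unrecognised characters are either control characters or the Unicode replacement character U+FFFD; the name stripUnprintable is made up for illustration.

```javascript
// Hypothetical helper: drop control characters and U+FFFD,
// but keep tabs, line feeds and carriage returns.
function stripUnprintable(line) {
  return line.replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F\uFFFD]/g, '');
}

console.log(JSON.stringify(stripUnprintable('a\u0000b\uFFFDc\td\n')));
```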
