Spacing issue between letters while converting Word to PDF on Windows - windows

I have a Word document (.docx) of Urdu text set in the Jameel Noori Nastaleeq font. Word shows it as a 10-page file, but after exporting to PDF it becomes an 11-page file because every letter is followed by extra space.
Can anyone please explain why this happens?
Edited:
Please download the file from
File

It has to do with Word's XML formatting. When text is pasted into Word (while the font is Jameel Noori Nastaleeq), Word places extra formatting between the words. That formatting displays fine in Word, but when the file is converted to PDF the extra space becomes visible. When the text is typed directly in Word, the formatting is applied to entire paragraphs rather than to individual words. That is why a typed document doesn't contain the extra spaces.

Related

Interpreting a text character copied from a website and its format

I'm curious as to how this works from a low-level point of view.
I understand that computers represent text characters using ASCII or Unicode.
For example, just now I copied a '€' character from a website to put in an email because the character is not on my keyboard.
How does Windows store this character? As a unique integer identifying it? When I paste this character into an email or Word document, it even preserves its text format.
How does the email editor or Word know how to reproduce what I copied with exactly the same format? And if the place I copied the character from used its own special character encoding, would it turn into the wrong character when I pasted it into an email?
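A rough sketch of the idea behind the question, in Ruby: a character such as '€' has one Unicode code point (U+20AC), and an encoding decides which bytes represent that number. The helper names `describe` and `misread_utf8_as_cp1252` are made up for this illustration. It is also an assumption here that the applications involved exchange Unicode text (Windows offers a UTF-16 clipboard format for this). The second helper shows what a wrong guess about the encoding looks like.

```ruby
# The euro sign, written as an escape so the example does not depend on
# the encoding of this source file.
euro = "\u20AC"

# Show the code point and the bytes produced by a few common encodings.
def describe(char)
  {
    code_point: format("U+%04X", char.ord),
    utf8:       char.encode("UTF-8").bytes,
    utf16le:    char.encode("UTF-16LE").bytes,
    cp1252:     char.encode("Windows-1252").bytes
  }
end

# Simulate a paste gone wrong: take the UTF-8 bytes and interpret them
# as if they were Windows-1252 text.
def misread_utf8_as_cp1252(char)
  raw = char.encode("UTF-8").bytes.pack("C*")
  raw.force_encoding("Windows-1252").encode("UTF-8")
end

p describe(euro)
puts misread_utf8_as_cp1252(euro)
```

The same code point yields three different byte sequences, so the receiving program only shows the right character if it knows (or correctly guesses) which encoding the bytes are in.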

iText: Word document has multiple columns

I found the following issues with iText PDF conversion:
- if a Word file has multiple columns, iText produces a PDF with just one column.
- if a Word file has multiple lines of text next to a picture, only the first line of text is displayed next to the picture. The other lines of text are displayed within the picture.
These seem to be bugs in iText. Is there a fix or a workaround for these issues?

Batch file to remove text between 2 characters

Is it possible to write a Windows batch file that can delete all text between 2 characters, including the characters themselves?
I am dynamically generating text files that include a piece of text in HTML format. I want to extract only the non-HTML part of the text; that is, I want to remove all HTML tags from it.
So I want a Windows batch file that takes a text file as input, removes all characters between < and > (inclusive), and creates an output file. Can you please help me with this?
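Not a batch file, but as a minimal sketch of the same operation in Ruby: a regular expression that matches a `<`, any run of characters other than `>`, and the closing `>`. The method names and the script name `strip_tags.rb` are placeholders. This assumes simple markup; a `>` inside an attribute value or an HTML comment would confuse it, and a real HTML parser is the safer choice for messy input.

```ruby
# Remove everything from "<" through the next ">" (inclusive).
# [^>] also matches newlines, so tags split across lines are removed too.
def strip_tags(text)
  text.gsub(/<[^>]*>/, "")
end

# Read one file, strip the tags, and write the result to another file.
def strip_tags_in_file(input_path, output_path)
  File.write(output_path, strip_tags(File.read(input_path)))
end

# Usage (paths are placeholders): ruby strip_tags.rb input.txt output.txt
if __FILE__ == $PROGRAM_NAME && ARGV.length == 2
  strip_tags_in_file(ARGV[0], ARGV[1])
end
```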

Inserting Vertical Space in Pandoc Markdown

Is it possible to insert extra vertical space using Pandoc-flavored Markdown? Something that would show up as a blank line in a Word document, a <br> in HTML, or \vspace in LaTeX, or anything equivalent?
My problem is that I don't want a title for my reference list, but this puts my references too close to the preceding paragraph in both Word and in LaTeX.
One way to do it is to insert a paragraph containing just a nonbreaking space.
You can use either of these forms in pandoc:
&nbsp;
\_ (where "_" signifies a space)
For PDF output, do the following (replace each s with a space): \s\s

How can I output .doc files with bolded and colored text

I need to output text to a .doc file. Currently I just write to a file as usual and put .doc at the end of the file name:
File.open('output_file.doc', 'a+') {|x| x.write(str)}
The issue is that I want to make some of the text red and bold. How can this be achieved? I am using Ruby, but I can easily switch to JRuby thanks to the amazingness that is rvm, so if there are Java libraries for this, that would be great as well.
The short answer: use .rtf and then convert to .doc using Word or OpenOffice. The following .rtf file writes "normal text red text more normal text." and makes "red text" red and bold:
{\rtf1\ansi\ansicpg1252\cocoartf1038\cocoasubrtf350
{\fonttbl\f0\fswiss\fcharset0 Helvetica;}
{\colortbl;\red255\green255\blue255;\red255\green0\blue0;}
\margl1440\margr1440\vieww13280\viewh10420\viewkind0
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\ql\qnatural\pardirnatural
\f0\fs24 \cf0 normal text
\b \cf2 red text
\b0 \cf0 more normal text.}
The long answer:
Strings are just plain text, so there is no command that can make them bold. This is true of files in general, not just of how Ruby works with files.
What text-editors do is use key strings within the file as commands to render the text in a certain way. For example, double asterisk surrounds bold text in the Stack Overflow editor. The file format of a file determines these rules.
.rtf is a basic file format that has the features you want and is easy to convert to .doc using MS Word or OpenOffice. The advantage of .rtf is that it is human-readable. So you can write an .rtf file with red text, rename it to .txt, open it in a text editor, and see what "decorations" the red font added. Play around with the parameters.
If you are curious, the complete .rtf specifications can be found here:
http://www.biblioscape.com/rtf15_spec.htm
What's all the garbage at the top? That is header stuff. Fortunately you don't need to add more header material to add more text.
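Since the question uses Ruby, here is a minimal sketch of generating such a file from code. It uses a much shorter header than the example above (just a font table and a color table), and the method names `rtf_escape` and `build_rtf` are invented for this illustration. It assumes plain ASCII text; characters outside that range would need RTF's own escape sequences, which this sketch does not handle.

```ruby
# Escape the three characters that have special meaning in RTF.
def rtf_escape(text)
  text.gsub(/[\\{}]/) { |ch| "\\#{ch}" }
end

# Build a tiny RTF document: plain text, then bold red text, then plain text.
# \cf1 selects the first color after the leading ";" in the color table,
# and the braces limit \b and \cf1 to the highlighted text.
def build_rtf(before, highlighted, after)
  header = '{\rtf1\ansi\deff0' \
           '{\fonttbl{\f0\fswiss Helvetica;}}' \
           '{\colortbl;\red255\green0\blue0;}'
  body = '\f0\fs24 ' + rtf_escape(before) +
         '{\b\cf1 ' + rtf_escape(highlighted) + '}' +
         rtf_escape(after)
  header + "\n" + body + "}\n"
end

rtf = build_rtf("normal text ", "red text", " more normal text.")
puts rtf
# To save it instead: File.write("output_file.rtf", rtf)
```

Single-quoted Ruby strings are used for the RTF control words so that the backslashes are written to the file literally.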
