How to create a concatenated SMS in PDU mode?

Hi, I am writing a PDU encoder and I'm confused about generating the user data. I found an example:
0041000C913619873721670000A0050003000301986F79B90D4AC3E7F53688FC66BFE5A0799A0E0AB7CB741668FC76CFCB637A995E9783C2E4343C3D4F8FD3EE33A8CC4ED359A079990C22BF41E5747DDE7E9341F4721BFE9683D2EE719A9C26D7DD74509D0E6287C56F791954A683C86FF65B5E06B5C36777181466A7E3F5B0AB4A0795DDE936284C06B5D3EE741B642FBBD3E1360B14AFA7E7
which sends the string "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.Ut enim ad minim veniam, qui" (the encoded text is the part of the PDU after the user data header 050003000301).
I have a PDU encoder written in .NET whose output matches other online encoders and engnick.blogspot.com/2011/09/gsm-7bit-part-of-pdu-packencoding.html
It produces this:
CCB7BCDC06A5E1F37A1B447EB3DF72D03C4D0785DB653A0B347EBBE7E531BD4CAFCB4161721A9E9EA7C769F7195466A7E92CD0BC4C0691DFA072BA3E6FBFC9207AB90D7FCB4169F7384D4E93EB6E3AA84E07B1C3E2B7BC0C2AD341E437FB2D2F83DAE1B33B0C0AB3D3F17AD855A583CAEE741B142683DA6977BA0DB297DDE9709B058AD7D3
When I put that after the same header, like this:
0041000C913619873721670000A0050003000301CCB7BCDC06A5E1F37A1B447EB3DF72D03C4D0785DB653A0B347EBBE7E531BD4CAFCB4161721A9E9EA7C769F7195466A7E92CD0BC4C0691DFA072BA3E6FBFC9207AB90D7FCB4169F7384D4E93EB6E3AA84E07B1C3E2B7BC0C2AD341E437FB2D2F83DAE1B33B0C0AB3D3F17AD855A583CAEE741B142683DA6977BA0DB297DDE9709B058AD7D3
it sends the wrong string. How can I generate PDU data that matches the example?

You have to send multiple messages using a UDH (User Data Header).
The ESM CLASS attribute has to have a value of "64" (0x40, the UDH indicator), or "67" for Unicode messages.
Also, the beginning of the message must contain the hex identifier of the message, like this:
05 00 03 CC 02 01 [ message 1 text ]
05 00 03 CC 02 02 [ message 2 text ]
http://www.activexperts.com/activsms/sms/multipart/
http://www.integrat.co.za/wiki/images/1/16/SMPP_v3_4_ESM_Class.pdf
http://en.wikipedia.org/wiki/Concatenated_SMS#PDU_Mode_SMS
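Note that the 6-octet UDH uses up part of the user data, so each part holds fewer characters than a single message in the same data coding: 153 instead of 160 for GSM 7-bit, and 67 instead of 70 for UCS-2.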
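For GSM 7-bit text there is one more detail, and it accounts for the mismatch in the question: after the UDH, the septets must start on a septet boundary, so fill bits are inserted first. A 6-octet UDH is 48 bits, so 1 fill bit is needed, and the packed text ends up shifted by one bit (CC B7 BC... becomes 98 6F 79...). Below is a minimal Python sketch of that packing. The function names are hypothetical, and it only accepts characters whose GSM 03.38 code equals their ASCII code (letters, digits, space and some punctuation); a real encoder needs the full alphabet table.

```python
import string

# Characters whose GSM 03.38 default-alphabet code equals the ASCII code.
# (Assumption for this sketch: '@', '$', '_' and others differ and are rejected.)
ASCII_COMPATIBLE = set(string.ascii_letters + string.digits + " !\"#%&'()*+,-./:;<=>?")


def concat_udh(ref: int, total: int, seq: int) -> bytes:
    """UDH for concatenation with an 8-bit reference: 05 00 03 ref total seq."""
    return bytes([0x05, 0x00, 0x03, ref & 0xFF, total, seq])


def fill_bits_for_udh(udh_octets: int) -> int:
    """Number of padding bits so that the text starts on a septet boundary."""
    return (-8 * udh_octets) % 7


def pack_septets(text: str, fill_bits: int = 0) -> bytes:
    """Pack 7-bit characters LSB first, preceded by fill_bits zero bits."""
    value = 0
    for i, ch in enumerate(text):
        if ch not in ASCII_COMPATIBLE:
            raise ValueError(f"character {ch!r} is not handled by this sketch")
        value |= ord(ch) << (7 * i + fill_bits)
    total_bits = 7 * len(text) + fill_bits
    return value.to_bytes((total_bits + 7) // 8, "little")


def user_data_length(udh_octets: int, text_length: int) -> int:
    """UDL for 7-bit data: septets taken by the UDH (with fill bits) plus the text."""
    return (8 * udh_octets + fill_bits_for_udh(udh_octets)) // 7 + text_length


def user_data(text: str, ref: int, total: int, seq: int) -> bytes:
    """UDH followed by the text packed with the required fill bits."""
    udh = concat_udh(ref, total, seq)
    return udh + pack_septets(text, fill_bits_for_udh(len(udh)))


if __name__ == "__main__":
    print(pack_septets("Lorem ip").hex().upper())     # plain packing, no UDH
    print(user_data("Lorem ip", 0, 3, 1).hex().upper())  # with UDH and 1 fill bit
```

With 153 characters per part, user_data_length(6, 153) gives 160, which is the A0 length octet in the example PDU.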


MigraDoc: How to apply vertical line spacing to a paragraph?

I am creating a PDF using MigraDoc.
Everything works fine except the setting of line spacing of a paragraph.
I want to have more vertical space between paragraph lines.
What I tried so far without any change in the resulting PDF:
string text = "Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.";
Paragraph para = CreateParagraph(text , "Helvetica", 7, "0.1mm", Colors.Black, ParagraphAlignment.Left);
// tried this:
para.Format.LineSpacing = MigraDoc.DocumentObjectModel.Unit.FromMillimeter(12);
// and tried that:
para.Format.LineSpacing = 12;
Can anyone point me in the right direction?
The meaning of LineSpacing depends on the value set for LineSpacingRule.
If LineSpacingRule is set to e.g. Single or Double then the value set for LineSpacing will be ignored.
Try AtLeast or Exactly for LineSpacingRule.

what is Naur Text-Processing

Can someone please explain to me in layman's terms what the Naur Text-Processing rules are? I'm having trouble understanding what the rules mean, such as line-by-line form and line breaks.
Imagine that you have a text, say
Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do
eiusmod tempor incididunt ut labore et dolore magna aliqua.\nUt enim
ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut
aliquip ex ea commodo consequat. Duis aute irure dolor in
reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla
pariatur. Excepteur sint occaecat cupidatat non proident, sunt in
culpa qui officia deserunt mollit anim id est laborum.
The text contains three kinds of characters:
Spaces ( )
New Line characters (\n)
Letters (all other characters: letters, digits, punctuations...)
You have to split the given text into lines in the most efficient way (you want to obtain as few lines as possible), but the split must meet these restrictions:
New Line character \n must start a new line
You can split text and start a new line on space only
Each line can contain at most MaxPos (given constant) characters.
In the sample above for MaxPos = 30 we can split as
Lorem ipsum dolor sit amet,
consectetur adipiscing elit,
sed do eiusmod tempor
incididunt ut labore et
dolore magna aliqua.\n <- \n New Line must break; we can't add "Ut" in the line
Ut enim ad minim veniam,
...
The following splits break the rules and are therefore invalid:
Lorem ipsum dolor sit amet, consectetur <- The line is too long, exceeds MaxPos = 30
...
Lorem ipsum dolor sit amet,
consectetur adipiscing elit,
sed do eiusmod tempor incidi <- wrong split: we can split on spaces only
dunt
...
Lorem ipsum dolor sit amet,
consectetur adipiscing elit,
sed do eiusmod tempor
incididunt ut labore et
dolore magna aliqua.\nUt enim <- \n (New Line) must start a new line
ad minim veniam, quis nostrud
...
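The three restrictions can be sketched as a greedy line filler in Python. This is only an illustration under stated assumptions, not Naur's original formulation: the function name wrap is hypothetical, the space at a break point is dropped, runs of spaces are collapsed, and a word longer than MaxPos is reported as an error.

```python
def wrap(text: str, max_pos: int) -> list[str]:
    """Split text into lines of at most max_pos characters.

    A newline character always starts a new line, and lines are only
    broken at spaces.
    """
    lines = []
    for paragraph in text.split("\n"):      # rule 1: "\n" forces a new line
        current = ""
        for word in paragraph.split():      # rule 2: break on spaces only
            if len(word) > max_pos:
                raise ValueError(f"word {word!r} is longer than {max_pos}")
            candidate = word if not current else current + " " + word
            if len(candidate) <= max_pos:   # rule 3: at most max_pos characters
                current = candidate
            else:
                lines.append(current)
                current = word
        lines.append(current)
    return lines


if __name__ == "__main__":
    sample = (
        "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do "
        "eiusmod tempor incididunt ut labore et dolore magna aliqua.\n"
        "Ut enim ad minim veniam,"
    )
    for line in wrap(sample, 30):
        print(line)
```

Because each line is filled as far as possible, this version keeps "dolore" on the fourth line, which is then exactly 30 characters long; putting it on the next line is also a valid split, it just does not fill the line.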

Text in columns (like in a table)

I would like to have one column with a label and a second column with a longer text inside with line breaks like in a table.
Label Text: Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed
            diam nonumy eirmod tempor invidunt ut labore et dolore magna
            aliquyam erat, sed diam voluptua. At vero eos et accusam et
            justo duo dolores et ea rebum. Stet clita kasd gubergren, no
            sea takimata sanctus est Lorem ipsum dolor sit amet. Lorem
            ipsum dolor sit amet, consetetur sadipscing elitr, sed diam.
I tried:
paste label.txt long.txt | column -s $'\t'
Thank you very much in advance!
Glad you have accepted an answer. Just for others who might want to have the
text re-wrapped to avoid over-long lines, this sort of text-processing is what nroff was invented for
over 40 years ago. It's now part of the groff package. Here's an example:
(echo -e '.na\n.nh'
cat label.txt
echo "'in \\w' $(<label.txt)'u"
cat long.txt ) |
nroff | sed '/^$/d'
Nroff commands begin with . or ' at start of line.
.na stops justification, .nh stops hyphenation, 'in sets the indent
to the width of the string (\w'...'), and the sed is to remove trailing blank lines.
You can set the line width with .ll, e.g. .ll 80 for 80 columns.
Long live nroff!
Label Text: Lorem ipsum dolor sit amet, consetetur sadipscing
            elitr, sed diam nonumy eirmod tempor invidunt ut
            labore et dolore magna aliquyam erat, sed diam
            voluptua. At vero eos et accusam et justo duo dolores
            et ea rebum. Stet clita kasd gubergren, no sea
            takimata sanctus est Lorem ipsum dolor sit amet.
            Lorem ipsum dolor sit amet, consetetur sadipscing
            elitr, sed diam.
The following bash script might help you:
padded-paste.sh:
#!/bin/bash
label=$1
text=$2
# get the number of lines in the text (reading from stdin avoids printing the file name)
nline=$(wc -l < "${text}")
# get the width of the label
padding=$(awk 'NR==1{ print length }' "${label}")
# create a temp directory
tmpdir=$(mktemp -dt "$(basename "$0").XXXXXXXXXX")
templabel=${tmpdir}/label.tmp
# print the first line of the label file to a temp file:
awk 'NR==1{ print }' "${label}" > "${templabel}"
# add blank padding to the temp label file:
for i in $(seq 2 ${nline}); do
    printf "%*s\n" "${padding}" "" >> "${templabel}"
done
# paste the padded label to the long text
paste -d' ' "${templabel}" "${text}"
# remove the temp directory
rm -r "${tmpdir}"
Based on the following inputs:
label.txt:
Label Text:
long.txt:
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy
eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam
voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet
clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit
amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam.
You can use it like:
sh padded-paste.sh label.txt long.txt
And it will output:
Label Text: Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy
            eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam
            voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet
            clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit
            amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam.
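If Python is available, the same label-plus-wrapped-text layout can be sketched with the standard textwrap module, which re-wraps the text and indents continuation lines in one step. The function name label_column and the default width are assumptions made for this illustration.

```python
import textwrap


def label_column(label: str, text: str, width: int = 72) -> str:
    """Put label in a first column and wrap text into a second column."""
    indent = " " * (len(label) + 1)      # label width plus one separating space
    return textwrap.fill(
        " ".join(text.split()),          # normalise existing line breaks and spaces
        width=width,                     # total line width, indent included
        initial_indent=label + " ",
        subsequent_indent=indent,
    )


if __name__ == "__main__":
    long_text = (
        "Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam "
        "nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat."
    )
    print(label_column("Label Text:", long_text, width=60))
```

Unlike the paste approach, this re-flows the text, so over-long input lines are shortened to the requested width.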

Nokogiri fails outputting XML with UTF-16 declaration (understanding and working around)

Summary
Attempting to read and serialize XML documents that have a UTF-16 encoding and declaration causes Nokogiri to produce garbage after a certain point.
Is this a bug, or is there a reasonable explanation for this?
What's the best way to avoid it?
Environment
C:\>nokogiri -v
# Nokogiri (1.5.5)
---
warnings: []
nokogiri: 1.5.5
ruby:
  version: 1.9.3
  platform: i386-mingw32
  description: ruby 1.9.3p194 (2012-04-20) [i386-mingw32]
  engine: ruby
libxml:
  binding: extension
  compiled: 2.7.7
  loaded: 2.7.7
Details
I have an XML file encoded with UTF-16(LE), and it also includes a PI XML Declaration at the top indicating that the encoding is UTF-16. Summarized, it looks like this:
<?xml version="1.0" encoding="UTF-16" ?>
<Foo>
<Bar><![CDATA[
Lorem ipsum dolor ...about 3900 more bytes of content here...
]]></Bar>
<Jim>Oh! Hello there.</Jim>
</Foo>
When I use Nokogiri to read this document, all seems well:
xml = File.open('Simplified.xml','rb:utf-16le',&:read)
p xml.encoding # #<Encoding:UTF-16LE>
p xml.valid_encoding? # true
doc1 = Nokogiri.XML(xml,&:noblanks)
xml1 = doc1.to_xml.encode('utf-8')
p xml1.encoding # #<Encoding:UTF-8>
p xml1.valid_encoding? # true
However, the output of serializing the document becomes munged after a certain point:
p xml1 # Correct contents of CDATA removed from the following output
#=> "<?xml version=\"1.0\" encoding=\"UTF-16\"?>\n<Foo>\n <Bar><![CDATA[\n...\n\t]]></Bar>\n <Jim>Oh! Hello there.\uFFFE\u3C00\u0000\u2F00\u0000\u4A00\u0000\u6900\u0000\u6D00\u0000\u3E00\u0000\u0A00\u0000\u3C00\u0000\u2F00\u0000\u4600\u0000\u6F00\u0000\u6F00\u0000\u3E00\u0000\u0A00\u0000"
(The limit seems to be related to the number of characters. I can add and remove a few words from the Lorem ipsum text with no change, but removing text below a certain point suddenly fixes the output.)
The Nokogiri document is not broken, however. I can independently serialize <Jim> successfully:
puts doc1.at('Jim').to_xml.encode('utf-8')
#=> <Jim>Oh! Hello there.</Jim>
The only workaround I've found is to remove the XML Declaration at the top of the document before parsing it. With this, all works as desired:
decl = '<?xml version="1.0" encoding="UTF-16" ?>'.encode('UTF-16LE')
doc2 = Nokogiri.XML(xml.sub(decl,''),&:noblanks)
puts doc2.to_xml.encode('utf-8')
#=> <?xml version="1.0"?>
#=> <Foo>
#=> <Bar><![CDATA[
#=> Lorem ipsum dolor...and more...
#=> ]]></Bar>
#=> <Jim>Oh! Hello there.</Jim>
#=> </Foo>
Full XML
Here's the full file to test for yourself:
<?xml version="1.0" encoding="UTF-16" ?>
<Foo>
<Bar><![CDATA[
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Etiam ac augue arcu, eget laoreet lorem. Quisque ac augue velit. Integer consectetur suscipit vehicula. Etiam et convallis enim. Etiam varius massa sit amet lacus rhoncus varius in non ante. Sed dictum, metus eu bibendum ornare, ligula dui commodo urna, ut dignissim felis dolor eget nisl. Proin sit amet nisi nunc. Vestibulum a urna sed dui dignissim blandit nec vel enim. Vivamus tincidunt nulla id dui hendrerit hendrerit.
Aliquam neque orci, luctus sit amet fringilla eu, varius vitae diam. Suspendisse varius rutrum lorem eget malesuada. Sed dapibus dapibus nisl, in cursus ante lacinia non. Aenean id sagittis ipsum. Suspendisse elit nunc, porta sit amet blandit ut, laoreet sed est. Nunc eget sem vitae nisl elementum ullamcorper ut sit amet urna. Sed ligula quam, fringilla in facilisis tincidunt, vehicula in nisi. Maecenas a augue in augue semper scelerisque sit amet ut arcu.
Praesent hendrerit, enim in elementum ornare, lorem nisi euismod dolor, sit amet ornare mi sem sodales lacus. Fusce et tempor mauris. In non quam nisl, non consequat diam. Duis sit amet massa ultrices massa cursus iaculis. Nunc ullamcorper malesuada sem dignissim semper. Fusce aliquet lacus quis nisi tincidunt sodales. Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque posuere commodo aliquet. Aliquam blandit vestibulum facilisis. Sed pellentesque viverra dignissim. Etiam est lacus, mollis eu pretium vitae, lacinia eleifend augue. Mauris vitae quam nisl. In venenatis nunc ac eros elementum cursus.
Sed a metus sit amet nunc euismod condimentum id non orci. Curabitur velit turpis, lacinia non eleifend sed, rhoncus id est. Fusce ut massa dolor, ut sodales odio. Donec aliquam convallis tellus, eu pharetra tortor iaculis non. Integer imperdiet feugiat ipsum a gravida. Mauris sapien ipsum, ultricies ac placerat ut, imperdiet eu justo. Quisque quis consectetur velit. Etiam facilisis sapien nec enim tincidunt pulvinar. Duis fermentum faucibus felis, sed consequat libero pretium at. Phasellus nibh purus, suscipit in vestibulum vel, blandit at leo. Suspendisse placerat elit sed enim bibendum vel hendrerit mauris pretium. Maecenas ut lacus eu nisi euismod pretium.
Aliquam feugiat felis id massa aliquam pharetra sed non eros. Morbi interdum molestie iaculis. Curabitur varius ante ac dui dapibus non laoreet risus blandit. Nunc sit amet magna lacus. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices posuere cubilia Curae; Phasellus egestas nunc sed turpis imperdiet a rhoncus massa aliquam. Nulla facilisi. Phasellus sit amet neque felis, nec vestibulum massa. Donec luctus fringilla dolor et gravida. Phasellus euismod lectus eget elit hendrerit non vehicula tellus venenatis. Phasellus sit amet ligula et purus dignissim feugiat at vitae libero. Proin ut tortor eros, quis laoreet lectus. Quisque nec urna mattis ante gravida fermentum eu at nibh. Phasellus sapien elit, tincidunt quis laoreet id, lobortis sed magna. Aliquam pulvinar erat eu sapien pretium bibendum. Maecenas eleifend, leo quis sodales tincidunt, leo felis tristique dolor, vitae ultrices neque felis ut metus.
Etiam dignissim egestas ipsum, eget tempor ipsum rutrum eu. Donec vehicula eleifend ullamcorper. Mauris justo nulla, varius a mattis a, cursus sit amet risus. Phasellus rutrum interdum blandit. Donec ut justo eros, ut auctor dolor. Suspendisse potenti. Cras ultricies, dui eget mattis bibendum, leo dui luctus purus, sit amet rhoncus libero metus eget purus. Pellentesque scelerisque ornare sapien faucibus tempor.
Suspendisse potenti. Proin fermentum bibendum dapibus. Pellentesque facilisis aliquam. Nam egestas tellus non mauris scelerisque feugiat pellentesque lacus dignissim. Quisque id nulla felis. Mauris justo mauris, posuere sed facilisis in, venenatis nec risus. Mauris eu dui sed tellus laoreet tempor a in turpis volutpat.
]]></Bar>
<Jim>Oh! Hello there.</Jim>
</Foo>
Rather than serialising the xml then calling encode on the string, you can specify the encoding to use in the options to to_xml; instead of
xml1 = doc1.to_xml.encode('utf-8')
use:
xml1 = doc1.to_xml(:encoding => 'utf-8')
This seems to clear up the problems.
As for what’s going on, I can only offer some observations.
Firstly, the encoding of the string produced by to_xml without specifying the encoding is UTF-16, which in Ruby is a “dummy encoding” (whatever that means):
xml1 = doc1.to_xml
p xml1.encoding
#=> #<Encoding:UTF-16 (dummy)>
The docs say this about dummy encodings:
A dummy encoding is an encoding for which character handling is not properly implemented. It is used for stateful encodings.
The other thing I noticed is that the values in the munged part of the output actually correspond to the codepoints that should appear.
Hello there.\uFFFE\u3C00\u0000\u2F00\u0000\u4A00\u0000\u6900...
3C is <, 2F is /, 4A is J, 69 is i etc, producing (if you ignore the zeros and extra BOM)
Hello there.</Ji...
If you write out the XML produced by Nokogiri (before encoding to UTF-8) and point a hex editor at it, the start looks like this:
0000000 ff fe 3c 00 3f 00 78 00 6d 00 6c 00 20 00 76 00
It starts with FF FE, i.e. a little endian BOM.
At the point the munging starts, it looks like this:
0001f20 3c 00 4a 00 69 00 6d 00 3e 00 4f 00 68 00 21 00
0001f30 20 00 48 00 65 00 6c 00 6c 00 6f 00 20 00 74 00
0001f40 68 00 65 00 72 00 65 00 2e 00 fe ff 00 3c 00 00
0001f50 00 2f 00 00 00 4a 00 00 00 69 00 00 00 6d 00 00
0001f60 00 3e 00 00 00 0a 00 00 00 3c 00 00 00 2f 00 00
fe ff is where the munged output starts (on the middle line). fe ff is also the big endian BOM, and the characters after it seem to be BE (you can see how the columns of zeros don’t line up before and after the fe ff). There are extra pairs of zero bytes in between the characters, though.
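The byte-order observation can be checked outside Ruby. The short Python sketch below takes the bytes from the hex dump, starting at the fe ff, and decodes them both ways; the variable names are only illustrative. Read as little endian they give exactly the \uFFFE\u3C00\u0000\u2F00... sequence from the munged output, and read as big endian they give the expected characters with a NUL between each.

```python
# Bytes copied from the hex dump, starting at the unexpected "fe ff".
tail = bytes.fromhex("feff003c0000002f0000004a0000")

# Interpreted as little endian (what the rest of the document uses):
as_le = tail.decode("utf-16-le")

# Interpreted as big endian (what the "fe ff" BOM announces):
as_be = tail.decode("utf-16-be")

# The start of the file, for comparison: a little endian BOM, then "<?".
head = bytes.fromhex("fffe3c003f00").decode("utf-16")

if __name__ == "__main__":
    print([f"U+{ord(c):04X}" for c in as_le])
    print(repr(as_be))
    print(repr(head))
```

Dropping the BOM and the NUL characters from the big endian reading leaves "</J", the start of the closing </Jim> tag.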

Ruby - Find the top 3 longest words in a string

I want to be able to get the 3 longest words from a string. Is there a neat way of doing this without getting into arrays etc?
>> str = 'Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod
tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam,
quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.'
>> str.split.map { |s| s.gsub(/\W/, '') }.sort_by(&:length)[-3..-1]
=> ["adipisicing", "consectetur", "exercitation"]
"some string with words that are of different length".split(/ /).sort_by(&:length).reverse[0..2]
Since Ruby 2.2, the Enumerable methods max_by, min_by, max and min take an optional argument that specifies how many elements to return.
str.scan(/[[:alnum:]]+/).max_by(3, &:size)
# => ["exercitation", "consectetur", "adipisicing"]
