How to read from a byte array to generate a binary file (VBScript)

Here is a code snippet for downloading a binary file using VBScript:
...
Dim fs,ts
varByteArray = http.ResponseBody
Set fs = CreateObject("Scripting.FileSystemObject")
Set ts = fs.CreateTextFile("filetowrite", True)
For lngCounter = 0 to UBound(varByteArray)
ts.Write Chr(255 And Ascb(Midb(varByteArray, lngCounter + 1, 1)))
Next
ts.Close
(full code can be found here)
I am wondering about:
Chr(255 And Ascb(...
From my understanding, Chr generates 2 bytes of UTF-8, not one (https://support.microsoft.com/en-us/kb/145745). But wouldn't a single byte be necessary for correct byte output in a newly generated binary file?
Why do you mask with 255 using an And operator against the code of a one-byte ANSI character? What purpose does this serve?

That code is not using "Option Explicit" so variable declarations are useless. It is using undeclared variables. Two declared and initialized variables are not used.
The "binary and" with 255 seems to serve no purpose.
I downloaded a 4 MB test file using 4 different methods:
Using Chrome, the regular way.
Using the ADO method in the script. It is very fast and byte-identical to the browser version (hex comparison).
Using the AscB method with the "binary and with 255". It is very, very, very slow but byte-identical to the browser version (hex comparison).
Using the AscB method without the "binary and with 255". It is very, very slow (a little faster than the masked variant) but byte-identical to the browser version (hex comparison).
Bottom line: that code works. It tries multiple methods to connect in order of preference and two methods to download in order of preference (it tries ADO first and only falls back to the AscB method if ADO fails). I like that code.
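To illustrate why the mask makes no difference, here is a minimal sketch in Ruby rather than VBScript. The `bytes` array is a made-up stand-in for the downloaded response body, and the temporary file is only there to show a byte-identical round trip; none of this comes from the original script.

```ruby
require "tempfile"

# Hypothetical stand-in for the response body: every possible byte value once.
bytes = (0..255).to_a

# A value that is already in 0..255 is unchanged by "And 255".
mask_is_noop = bytes.all? { |b| (b & 255) == b }

# Write the bytes in binary mode, then read them back to compare.
round_trip = nil
Tempfile.create("bytes") do |f|
  f.binmode
  f.write(bytes.pack("C*")) # "C*" packs each integer as one unsigned byte
  f.flush
  round_trip = File.binread(f.path).bytes
end

puts mask_is_noop        # true
puts round_trip == bytes # true
```

The mask would only matter if the value could exceed 255, which a single byte read with AscB cannot.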

Related

What is wrong with this PDF file?

I have to work with a PDF form created by a person unknown to me. Why did the program with which the form was created (Word + PDF export?) split the term "Stunde" into "S", "t" and "unde" in line 6909 of the decoded PDF? There is no visual break between the three parts.
/TT1 1 Tf
11.04 0 0 11.04 59.16 476.1203 Tm
(Datum)Tj
/C2_1 1 Tf
<0003>Tj
/TT1 1 Tf
(der)Tj
0.424 -1.315 Td
(Tätigkeit)Tj
-0.0022 Tc 0 11.04 -11.04 0 261.24 437.7203 Tm
[(Ve)-4.6<7267fc74>-4.2(ungssat)-4.2(z)]TJ
/C2_1 1 Tf
0 Tc <0003>Tj
/TT1 1 Tf
-0.0021 Tc 0.935 -1.315 Td
[<2880>-6.1(/)-7.2(S)0.8(t)-4.1(unde)-4.5(\))]TJ % <<< the important line
0 Tc 11.04 0 0 11.04 340.92 468.8003 Tm
(Anlass/Art)Tj
/C2_1 1 Tf
resulting in:
[rendered output image not preserved]
To get the source code above, I decoded the PDF file as described here. I have no know-how concerning the PDF file format.
Background: I had to replace the word "Stunde", it drove me crazy to find the place where "Stunde" was written (in parts) within the source code, since no free PDF editor seems to be able to work with horizontal text without problems.
Academic Bonus questions: Is it possible to set the sum over a column as default value for a form field? (Modifiable; changed every time the column is changed.) Why was I able to replace "Stunde" with "Einsatz" without making the PDF file corrupt due to now irregular offsets?

Why did the program with which the form was created (Word + PDF export?) split the term "Stunde" into "S", "t" and "unde" in line 6909 of the decoded PDF?
As @gettalong mentioned in his answer, in your case this has most likely been done to apply kerning.
If you start looking into the outputs of some other PDF producers, you'll see that this export from Word actually is very unobtrusive in regard to splitting words:
there are PDF producers that draw each character individually after explicitly setting the text matrix for it, and
there also are PDF producers that have the width information for the characters of the used fonts set to zero and use the numbers in TJ instructions to forward the current text matrix between characters accordingly.
And this doesn't cover all the variants to be found, not by far...
Thus,
I had to replace the word "Stunde", it drove me crazy to find the place where "Stunde" was written (in parts) within the source code
in your case replacing actually was a fairly trivial task...
Is it possible to set the sum over a column as default value for a form field? (Modifiable; changed every time the column is changed.)
If all the column values in question are stored in form fields, you can use JavaScript to recalculate sums after form changes. To have it serve as "default" only, you can use some other (hidden) field for a flag whether the field has already been touched. Beware, though: JavaScript is not supported by all PDF viewers. Furthermore, the JavaScript object model for PDF is not specified in an independent (like ISO) specification but in an Adobe one which can make interpretation of the specification biased.
Why was I able to replace "Stunde" with "Einsatz" without making the PDF file corrupt due to now irregular offsets?
As we don't know how exactly you applied the changes, this obviously is hard to tell.
Most likely, though, you did corrupt the PDF and the PDF viewers you opened it in merely repaired the corruption under the hood. There is a strong tendency in PDF viewers to do such under-the-hood repairs without informing the user; the result is that a large part of the PDFs in the wild are actually broken.

You don't see a visual break, but the standard distance between "S", "t" and "unde" has been changed nonetheless. PDF writers that support kerning, for example, do this so that the word appears nicer. This is why it is split that way.
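When searching a decoded content stream for a word that may have been split for kerning, it can help to strip the spacing numbers out of a TJ array and join the remaining pieces. The following is a rough Ruby sketch of that idea. The helper name `tj_literal_text` is hypothetical, and it makes simplifying assumptions: it only looks at literal strings in parentheses, ignores hex strings such as `<2880>`, and does not handle octal or other special escape sequences.

```ruby
# Join the literal (parenthesised) strings of a PDF TJ array operand,
# dropping the kerning adjustments between them.
def tj_literal_text(tj_operand)
  # Match "(...)", allowing backslash escapes such as "\)" inside the string.
  literals = tj_operand.scan(/\((?:\\.|[^\\()])*\)/)
  literals.map do |literal|
    inner = literal[1..-2]                         # strip the surrounding parentheses
    inner.gsub(/\\(.)/) { Regexp.last_match(1) }   # simplistic unescape: "\)" becomes ")"
  end.join
end

line = '[<2880>-6.1(/)-7.2(S)0.8(t)-4.1(unde)-4.5(\))]TJ'
puts tj_literal_text(line) # /Stunde)
```

Searching the joined text for "Stunde" then finds the line even though the word is stored as "S", "t" and "unde".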

Opening file with write throws "No implicit conversion of String into Integer"

It's been quite a while since I last wrote code in Ruby (Ruby 2 was new and wow it's 3 already), so I feel like an idiot.
I have a text file containing only the word:
hello
My ruby file contains the following code:
content = File.read("test_file_str.txt","w")
puts content
When I run it, I get:
`read': no implicit conversion of String into Integer (TypeError)
I've never had this happen before, but it has been quite a while since I wrote code, so clearly PEBKAC.
However, when I run this without ,"w" all is seemingly well. What am I doing wrong?
ruby 3.0.3p157 (2021-11-24 revision 3fb7d2cadc) [x64-mingw32]

As per the docs, the second argument for File.read is the number of bytes to be read from the given file, which must be an integer.
Opens the file, optionally seeks to the given offset, then returns length bytes (defaulting to the rest of the file). read ensures the file is closed before returning.
So, in your case the error happens because you're passing a string where an integer is expected. The docs for File.read don't state this per se, but the docs for File#read do:
Reads length bytes from the I/O stream.
length must be a non-negative integer or nil.
If you want to specify the mode, you can use the mode option for that:
File.read("filename", mode: "r") # "r" or any other
# or
File.new("filename", mode: "r").read(1)
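As a quick way to see the difference between the positional length argument and the mode keyword, here is a small sketch. The helper `read_variants` and the temporary file are made up for the demonstration; only `File.read` itself is from the answer above.

```ruby
require "tempfile"

# Collect the results of several File.read call styles for one path.
def read_variants(path)
  {
    whole: File.read(path),                 # rest of the file
    first_two: File.read(path, 2),          # positional length
    from_offset: File.read(path, 2, 1),     # length 2, starting at offset 1
    with_mode: File.read(path, mode: "r"),  # mode goes in as a keyword
    string_as_length: begin
      File.read(path, "w")                  # "w" lands in the length slot
      nil
    rescue TypeError => e
      e.message
    end
  }
end

result = nil
Tempfile.create("test_file_str") do |f|
  f.write("hello")
  f.flush
  result = read_variants(f.path)
end

result.each { |name, value| puts "#{name}: #{value.inspect}" }
```

The last entry carries the TypeError message from the question, since the string "w" is being interpreted as the length.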

Open Files for Reading Don't Accept Write Mode
In general, it doesn't make sense to open a filehandle for reading in write mode. So, you need to refactor your method to something like:
content = File.read("test_file_str.txt")
or perhaps:
content = File.new("test_file_str.txt", "r+").read
depending on exactly what you're trying to do.
See Also: File Permissions in IO.new
The documentation for File in Ruby 3.0.3 points you to IO.new for the available mode permissions. You might take a look there if you don't see exactly the options you're looking for.

Scripting Word from vbs

I'm trying to get Word to fill in cells in a table. The script works when run as a macro from within Word, but fails when saved as a .vbs file and double-clicked, or run with wscript. This is a part of it.
set obj = GetObject(,"Word.Application")
With obj
With .Selection
MsgBox .text
If (.Information(wdWithInTable) = True) Then
.Collapse Direction:=wdCollapseStart
tCols = .Tables(1).Columns.Count
tRow = .Information(wdStartOfRangeRowNumber)
tCol = .Information(wdStartOfRangeColumnNumber)
For I = 2 To 5
.Tables(1).Cell(tRow, I).Range.Text = "fred" & Str(I)
Next
' now make new row
For I = 1 To tCols - tCol + 1
.MoveRight unit:=wdCell
Next
End If
End With
End With
I have three problems. First, it won't compile unless I comment out the .Collapse and .MoveRight lines. Second, although the MsgBox .text displays the selected text, I get "out of range" errors if I try to access any .Information property.
I'm sure I'm missing something very simple: I usually write software for Macs, and I'd do this using AppleScript. This is my first attempt at getting anything done under Windows.

VBScript and VBA are different languages.
They are a bit similar, but not very. Moreover, VBScript is not like AppleScript; it doesn't let you easily interface with running programs.
The interfaces you'll get from VBScript can behave subtly differently in VBA and VBScript. However, I think you've got two problems here:
:= is invalid syntax in VBScript; you'll need to find an alternative way of calling the function. Try just using positional arguments.
You've no guarantee that this will open the expected file; there could be another instance of Word that it's interacting with instead.

Since your code is not running within the Word environment it would require a reference to the Word object library in order to use enumeration constants (those things that start with wd).
VBScript, however, cannot work with references, which means the only possibility is to use the long value equivalents of the enumerations. You'll find these in the Word Language References. Simplest to use is probably the Object Browser in Word's VBA Editor. (In Word: Alt+F11 to open the VBA Editor; F2 to start the Object Browser; type in the term in the "Search" box, click on the term, then look in the bottom bar.)
The code in the question uses, for example:
wdWithInTable
wdCollapseStart
wdStartOfRangeRowNumber
wdStartOfRangeColumnNumber
wdCell
The reason you get various kinds of errors depends on where these are used.
Also, VBScript can't use named parameters such as Unit:=. Any parameters must be passed in comma-delimited format, if there's more than one, in the order specified by the method or property. If there are optional parameters you don't want to use, these should be left "blank":
MethodName parameter, parameter, , , parameter

Ruby - read bytes from a file, convert to integer

I'm trying to read unsigned integers from a file (stored as consecutive bytes) and convert them to Integers. I've tried this:
file = File.new(filename,"r")
num = file.read(2).unpack("S") #read an unsigned short
puts num #value will be less than expected
What am I doing wrong here?

You're not reading enough bytes. As you say in the comment to tadman's answer, you get 202 instead of 3405691582.
Notice that the first byte of 0xCAFEBABE is 0xCA = 202.
If you really want all 8 bytes in a single number, then you need to read more than the unsigned short
try
num = file.read(8).unpack("L_")
The underscore is assuming that the native long is going to be 8 bytes, which definitely is not guaranteed.

How about looking in The Pickaxe? (Ruby 1.9, p. 44)
File.open("testfile") do |file|
  file.each_byte {|ch| print "#{ch.chr}:#{ch} " }
end
each_byte iterates over a file byte by byte.

There are a couple of libraries that help with parsing binary data in Ruby, by letting you declare the data format in a simple high-level declarative DSL and then figure out all the packing, unpacking, bit-twiddling, shifting and endian-conversions by themselves.
I have never used one of these, but here are two examples. (There are more, but I don't know them):
BitStruct
BinData

Ok, I got it to work:
num = file.read(8).unpack("N")
Thanks for all of your help.

What format are the numbers stored in the file? Is it in hex? Your code looks correct to me.

When dealing with binary data you need to be sure you're opening the file in binary mode if you're on Windows. This goes for both reading and writing.
open(filename, "rb") do |file|
num = file.read(2).unpack("S")
puts num
end
There may also be issues with "endian" encoding depending on the source platform. Big-endian machines, for instance PowerPC-based systems (old Macs, IBM Power servers, PS3 clusters) or Sun SPARC servers, store multi-byte integers in the opposite byte order from x86.
Can you post an example of how it's "less"? Usually there's an obvious pattern to the data.
For example, if you want 0x1234 but you get 0x3412 it's an endian problem.
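To make the byte-count and endian effects concrete, here is a small sketch that unpacks the same four bytes in several ways. The StringIO is a hypothetical stand-in for the real file (which should be opened with "rb"), and the assumption that the file starts with the bytes CA FE BA BE is taken from the value 3405691582 mentioned above.

```ruby
require "stringio"

# Hypothetical stand-in for the file: the four bytes CA FE BA BE.
io = StringIO.new("\xCA\xFE\xBA\xBE".b)

first_byte = io.read(1).unpack1("C") # one unsigned byte
io.rewind
be16 = io.read(2).unpack1("n")       # 16-bit unsigned, big-endian
io.rewind
le16 = io.read(2).unpack1("v")       # 16-bit unsigned, little-endian
io.rewind
be32 = io.read(4).unpack1("N")       # 32-bit unsigned, big-endian
io.rewind
le32 = io.read(4).unpack1("V")       # 32-bit unsigned, little-endian

puts first_byte # 202
puts be16       # 51966
puts le16       # 65226
puts be32       # 3405691582
puts le32       # 3199925962
```

The directives "n", "v", "N" and "V" fix both the width and the byte order, whereas "S" and "L" use the native byte order of the machine running the script.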

Visio to image command line conversion

At work we make pretty extensive use of Visio drawings as support for documentation. Unfortunately, vsd files don't play nicely with our wiki or documentation extraction tools like javadoc, doxygen or naturaldocs. While it is possible to convert Visio files to images manually, it's a hassle to keep the images current, and the image files are bound to get out of date. And let's face it: having generated files in revision control feels so wrong.
So I'm looking for a command line tool that can convert a vsd file to jpeg, png, gif or any image that can be converted to an image that a browser can display. Preferably it will run under unix, but windows only is also fine. I can handle the rest of the automation chain, cron job, image to image conversion and ssh, scp, multiple files, etc.
And that's why I'm turning to you: I can't find such a tool. I don't think I can even pay for such a tool. Is my Google-fu completely off? Can you help me?
I mean, it has got to be possible. There has to be a way to hook into Visio with COM and get it to save as image. I'm using Visio 2007 by the way.
Thanks in advance.

I slapped something together quickly using VB6, and you can download it at:
http://fournier.jonathan.googlepages.com/Vis2Img.exe
You just pass in the input visio file path, then the output file path (visio exports based on file extension) and optionally the page number to export.
Also here is the source code I used, if you want to mess with it or turn it into a VBScript or something, it should work, though you'd need to finish converting it to late-bound code.
hope that helps,
Jon
Dim TheCmd As String
Const visOpenRO = 2
Const visOpenMinimized = 16
Const visOpenHidden = 64
Const visOpenMacrosDisabled = 128
Const visOpenNoWorkspace = 256
Sub Main()
' interpret command line arguments - separated by spaces outside of double quotes
TheCmd = Command
Dim TheCmds() As String
If SplitCommandArg(TheCmds) Then
If UBound(TheCmds) > 1 Then
Dim PageNum As Long
If UBound(TheCmds) >= 3 Then
PageNum = Val(TheCmds(3))
Else
PageNum = 1
End If
' if the input or output file doesn't contain a file path, then assume the same
If InStr(1, TheCmds(1), "\") = 0 Then
TheCmds(1) = App.Path & "\" & TheCmds(1)
End If
If InStr(1, TheCmds(2), "\") = 0 Then
TheCmds(2) = App.Path & "\" & TheCmds(2)
End If
ConvertVisToImg TheCmds(1), TheCmds(2), PageNum
Else
' no good - need an in and out file
End If
End If
End Sub
Function ConvertVisToImg(ByVal InVisPath As String, ByVal OutImgPath As String, PageNum As Long) As Boolean
ConvertVisToImg = True
On Error GoTo PROC_ERR
' create a new visio instance
Dim VisApp As Visio.Application
Set VisApp = CreateObject("Visio.Application")
' open invispath
Dim ConvDoc As Visio.Document
Set ConvDoc = VisApp.Documents.OpenEx(InVisPath, visOpenRO + visOpenMinimized + visOpenHidden + visOpenMacrosDisabled + visOpenNoWorkspace)
' export to outimgpath
If Not ConvDoc.Pages(PageNum) Is Nothing Then
ConvDoc.Pages(PageNum).Export OutImgPath
Else
MsgBox "Invalid export page"
ConvertVisToImg = False
GoTo PROC_END
End If
' close it off
PROC_END:
On Error Resume Next
VisApp.Quit
Set VisApp = Nothing
Exit Function
PROC_ERR:
MsgBox Err.Description & vbCr & "Num:" & Err.Number
GoTo PROC_END
End Function
Function SplitCommandArg(ByRef Commands() As String) As Boolean
SplitCommandArg = True
'read through command and break it into an array delimited by space characters only when we're not inside double quotes
Dim InDblQts As Boolean
Dim CmdToSplit As String
CmdToSplit = TheCmd 'for debugging command line parser
'CmdToSplit = Command
Dim CharIdx As Integer
ReDim Commands(1 To 1)
For CharIdx = 1 To Len(CmdToSplit)
Dim CurrChar As String
CurrChar = Mid(CmdToSplit, CharIdx, 1)
If CurrChar = " " And Not InDblQts Then
'add another element to the commands array if InDblQts is false
If Commands(UBound(Commands)) <> "" Then ReDim Preserve Commands(LBound(Commands) To UBound(Commands) + 1)
ElseIf CurrChar = Chr(34) Then
'set InDblQts = true
If Not InDblQts Then InDblQts = True Else InDblQts = False
Else
Commands(UBound(Commands)) = Commands(UBound(Commands)) & CurrChar
End If
Next CharIdx
End Function

F# 2.0 script:
//Description:
// Generates images for all Visio diagrams in the folder where it is run, named according to page names
//Tools:
// Visio 2010 32bit is needed to open diagrams (I also installed VisioSDK32bit.exe on my Windows 7 64bit)
#r "C:/Program Files (x86)/Microsoft Visual Studio 10.0/Visual Studio Tools for Office/PIA/Office14/Microsoft.Office.Interop.Visio.dll"
open System
open System.IO
open Microsoft.Office.Interop.Visio
let visOpenRO = 2
let visOpenMinimized = 16
let visOpenHidden = 64
let visOpenMacrosDisabled = 128
let visOpenNoWorkspace = 256
let baseDir = Environment.CurrentDirectory;
let getAllDiagramFiles = Directory.GetFiles(baseDir,"*.vsd")
let drawImage fullPathToDiagramFile =
    let diagrammingApplication = new ApplicationClass()
    let flags = Convert.ToInt16(visOpenRO + visOpenMinimized + visOpenHidden + visOpenMacrosDisabled + visOpenNoWorkspace)
    let document = diagrammingApplication.Documents.OpenEx(fullPathToDiagramFile,flags)
    for page in document.Pages do
        let imagePath = Path.Combine(baseDir, page.Name + ".png")
        page.Export (imagePath)
    document.Close()
    diagrammingApplication.Quit()
let doItAll =
    Array.iter drawImage getAllDiagramFiles
doItAll

You can try "Visio to image" converter
http://soft.postpdm.com/visio2image.html
Tested with MS Visio 2007 and 2010

There has to be a way to hook into Visio with COM and get it to save as image.
Why not try writing something yourself, then, if you know how to use COM stuff? After all, if you can't find anything already made to do it, and you know you can figure out how to do it yourself, why not write something to do it yourself?
EDIT: Elaborating a bit on what I stated in my comment: writing a script of some sort does seem to be your best option in this situation, and Python, at least, would be quite useful for that, using the comtypes library found here: http://starship.python.net/crew/theller/comtypes/ Of course, as I said, if you prefer to use a different scripting language, then you could try using that; the thing is, I've only really used COM with VBA and Python at this point (As an aside, Microsoft tends to refer to "Automation" these days rather than specifically referencing COM, I believe.) The nice thing about Python is that it's an interpreted language, and thus you just need a version of the interpreter for the different OSes you're using, with versions for Windows, OSX, Linux, Unix, etc. On the other hand, I doubt you can use COM on non-Windows systems without some sort of hack, so you may very well have to parse the data in the source files directly (and even though Visio's default formats appear to use some form of XML, it's probably one of those proprietary formats Microsoft seems to love).
If you haven't used Python before, the Python documentation has a good tutorial to get people started: http://docs.python.org/3.1/tutorial/index.html
And, of course, you'll want the Python interpreter itself: http://python.org/download/releases/3.1/ (Note that you may have to manually add the Python directory to the PATH environment variable after installation.)
When you write the script, you could probably have the syntax for running the script be something like "python visioexport.py <source/original file[ with path]>[ <new file[ with path]>]" (assuming the script file is in your Python directory), with the new file defaulting to a file of the same name and in the same folder/directory as the original (albeit with a different extension; in fact, if you wish, you could set it up to export to multiple formats, with the format defaulting to that of whatever default extension you choose and being specified by an alternate extension if you specify one in the file name. As well, you could likely set it up so that if you only have the new file name after the source file, no path specified, it'll save with that new file name to the source file's directory. And, of course, if you don't specify a path for the source file, just a file name, you could set it up to get the file from the current directory).
On the topic of file formats: it seems to me that converting to SVG might be the best thing to do, as it would be more space-efficient and would better reflect the original images' status as vectored images. On the other hand, the conversion from a Visio format to SVG is not perfect (or, at least, it wasn't in Visio 2003; I can't find a source of info similar to this one for Visio 2007), and as seen here, you may have to modify the resultant XML file (though that could be done using the script, after the file is exported, via parts of the Python standard library). If you don't mind the additional file size of bitmaps, and you'd rather not have to include additional code to fix resultant SVG files, then you probably should just go with a bitmap format such as PNG.
