iText 7: Inconsistent permission flag in API

Before merging a list of files, I grab their assembly permission using PdfReader:
long PdfReader.getPermissions()
and check the result using PdfEncryptor:
static boolean PdfEncryptor.isAssemblyAllowed(int permissions)
As you can see, the first method returns a long, while the second expects an int. If I just cast the long to an int, PdfEncryptor.isAssemblyAllowed always returns true. But later in the code, when I perform the actual merging, an error is thrown saying I lack the necessary permissions.
Is this a bug, or am I missing something in how the permission flag should be used?
As a workaround, one can use the method
public boolean PdfReader.isOpenedWithFullPermission()
and refuse to merge if it returns false. But this might be overcautious.

Neither iText 5 nor iText 7 checks the individual permissions when operating on PDFs. When they do check permissions for some operation, they only call PdfReader.isOpenedWithFullPermission().
The reason for this may be that the individual permissions were originally designed to match specific GUI operations in Adobe Acrobat, which do not map directly onto specific iText API calls.
Thus, if you regularly have to deal with documents with restricted permissions, consider setting the UnethicalReading flag, checking the appropriate flags in your code, and rejecting documents only according to that check.
As an aside, your current check is based on isAssemblyAllowed. The flag is specified as:
Assemble the document (insert, rotate, or delete
pages and create document outline items or thumbnail images), even if bit 4 [Modify the contents of the document] is clear.
Thus, this does not really match the permission you are looking for, as it essentially refers to the target document (i.e. the merge result) while you are testing the source document. Testing isCopyAllowed would therefore be more appropriate.
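A minimal sketch of checking individual flags on the raw permission value, independent of iText. The mask values follow the bit positions in the PDF specification (bit 5 for copying, bit 11 for assembly, counting from 1 at the low-order bit); the class and method names are made up for illustration and are not part of any iText API.

```java
public class PermissionBits {
    // Bit positions from the PDF specification's user access permissions table.
    static final int COPY = 1 << 4;      // bit 5: copy or extract content
    static final int ASSEMBLY = 1 << 10; // bit 11: assemble the document

    // Takes the value as a long, so no narrowing cast is needed at the call site.
    static boolean isAllowed(long permissions, int flag) {
        return (permissions & flag) == flag;
    }

    public static void main(String[] args) {
        // Hypothetical permission value: every bit set except copy and assembly.
        long restricted = ~((long) (COPY | ASSEMBLY));
        System.out.println("copy allowed: " + isAllowed(restricted, COPY));
        System.out.println("assembly allowed: " + isAllowed(restricted, ASSEMBLY));
    }
}
```

A check like this only reports what the flags say; whether a given library call is then permitted still depends on how the document was opened.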

Do (document) bundle entries always have to be referenced or referencing?

The specification for FHIR documents seems to mandate that all bundle entries in the document resource be part of the reference graph rooted at the Composition entry. That is, they should be the source or the target of a reference relation that traces all the way up to the root entry.
Unfortunately I have not been able to locate all the relevant passages in the FHIR specification; one place where it is spelled out is in 3.3.1 Document Content, but it is not really clear whether this pertains to all bundles of type 'document' (i.e. even those that happen to be bundles with type code 'document' but are merely collections of machine-processable data without any aspirations to represent a FHIRy document).
The problem with the referencedness requirement lies in the fact that the HAPI validator employs linear search for checking the references. So, if we have to ship N bundle entries full of data to a payor, we have to include a list with N references (one for each data-bearing bundle entry). That leads to N reference searches with O(N) effort during validation, which makes the reference checking complexity effectively quadratic in the number of entries.
This easily brings even the most powerful computers to their knees. Current size constraints effectively cap the number of entries per file at roughly 25000, and the HAPI validator needs several hours to chew through that, even on the most powerful CPUs currently available. Without the references, validation would take less than a minute for the same file.
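The complexity argument above can be illustrated with a small sketch that resolves N references against N entries twice: once by linear scan, once through a hash index. This is not the HAPI validator's code; the class, the method names, and the urn:uuid-style identifiers are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ReferenceCheck {
    // Linear scan per reference: O(N) each, O(N^2) for N references.
    static int countResolvedLinear(List<String> entryUrls, List<String> references) {
        int resolved = 0;
        for (String ref : references) {
            for (String url : entryUrls) {
                if (url.equals(ref)) {
                    resolved++;
                    break;
                }
            }
        }
        return resolved;
    }

    // One pass to build an index, then expected O(1) per lookup: O(N) overall.
    static int countResolvedIndexed(List<String> entryUrls, List<String> references) {
        Set<String> index = new HashSet<>(entryUrls);
        int resolved = 0;
        for (String ref : references) {
            if (index.contains(ref)) {
                resolved++;
            }
        }
        return resolved;
    }

    // Hypothetical entry identifiers, one per bundle entry.
    static List<String> sampleUrls(int n) {
        List<String> urls = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            urls.add("urn:uuid:entry-" + i);
        }
        return urls;
    }

    public static void main(String[] args) {
        List<String> urls = sampleUrls(25000);
        System.out.println(countResolvedLinear(urls, urls) == countResolvedIndexed(urls, urls));
    }
}
```

Both methods return the same count; only the amount of work differs, which is why the list of references rather than the entries themselves dominates validation time.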
In our use case, data-bearing entries have no identity outside of the containing bundle file. Practically speaking they would need neither entry.fullUrl nor entry.resource.id, because their business identifiers are contained in included base64 blobs. However, presence or absence of these identifiers has no practical influence on the time needed for validation (fractions of a second even for a 1 GB file), so who cares. It's the list of references that kills the HAPI validator.
Perhaps it would be possible to fulfil the letter of the referencedness requirement by making all entries include a reference to the Composition. The HAPI validator doesn't care either way, so I don't know whether that would be valid or not. But even if it were FHIRly valid, it would be a monstrously silly workaround.
Is there a way to ditch the referencedness requirement? Perhaps by changing the bundle type to something like 'collection', or by using contained resources?
P.S.: for the moment we are using a workaround that cuts the time for validation from hours to less than a minute, but it's a hack, and we currently don't have the resources to fix the HAPI validator. What I'm mostly concerned about is the question how the specifications (profiles) need to be changed in order to avoid the problem I described.
(i.e. even those that happen to be bundles with type code 'document' but are merely collections of machine-processable data without any aspirations to represent a FHIRy document)
If it is not a document, and not intended to be one, do not use the 'document' Bundle type. If you do, you would be misrepresenting the data, which is what FHIR tries to avoid.
It seems like you want to send a collection of resources that are not necessarily related, so
Is there a way to ditch the referencedness requirement? Perhaps by changing the bundle type to something like 'collection'
Yes, I would use 'collection', or maybe a 'batch/transaction' depending on what I want to tell the receiver to do with the data.
The documents page says:
The document bundle SHALL include only:
The Composition resource, and any resources directly or indirectly (e.g. recursively) referenced from it
A Binary resource containing a stylesheet (as described below)
Provenance Resources that have a target of Composition or another resource included in the document
A document is an attested, human-readable, frozen set of content. If that's not what you need, then use a different Bundle type. However, if you do need the 'document' type, that doesn't mean that systems should necessarily validate all requirements at runtime.

Wrap "Open" VB6 function with custom function as drop in replacement

Question
Is there any way to write a custom function that uses the same pattern as the Open function? Including the fluff keywords like For and As?
Background
I am working on migrating an old VB6 project to use online data via an API. As a first step, I'd like to replace all instances of
Open SomeFilename For Binary Access Read As #39
with a custom OpenOnline function
OpenOnline SomeFilename For Binary Access Read As #39
But I do not know how to indicate those keywords are necessary when creating a function, or even if it's possible to do so.
Function openOnline(FileName As String) [For] (Optional Access As AccessType = Binary Access) [As] (Optional FileNumber As Integer) As Boolean
' Do the work of connecting to the online data equivalent of FileName with that access type
End Function
Qualifiers
I understand that these keywords are nonsensical in the context of an OpenOnline function. I also understand that I can use regular expressions to find and replace the syntax to remove keywords like "For" and "Read".
There are hundreds of thousands of instances of this Open function, the Put and Get functions, and a few other file-related functions. I realize that, long term, the correct solution is to change the mechanisms fundamentally to use online paradigms. That work is in progress and on schedule to be completed with about 4 months of effort at the rate things are going.
Bonus Question
Secondarily, is there any way for me to pass a "User Defined Type" variable to the new Put/Get replacements in a way that I can access their fields directly without knowing the type beforehand? (I understand that variants are only available for .cls classes or public user defined types in dlls, neither of which apply in this situation)
As for 1), you can get close, but you can't exactly replicate the VB Open statement. That means you won't get around some search-and-replace passes that swap the current Open statement lines for calls to your newly created function.
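One way such a search-and-replace pass could look is sketched below as a small standalone tool. The target call shape (OpenOnline with comma-separated arguments) is a hypothetical signature invented for this example, and the pattern deliberately covers only the simple form shown in the question; lines with Lock or Len clauses do not match and are returned unchanged.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OpenRewriter {
    // Matches: Open <file> For <mode> [Access <access>] As [#]<number>
    private static final Pattern OPEN = Pattern.compile(
        "^(\\s*)Open\\s+(.+?)\\s+For\\s+(\\w+)"
            + "(?:\\s+Access\\s+(Read\\s+Write|Read|Write))?"
            + "\\s+As\\s+#?(\\w+)\\s*$",
        Pattern.CASE_INSENSITIVE);

    // Rewrites one source line; anything that is not a simple Open statement is kept as-is.
    static String rewrite(String line) {
        Matcher m = OPEN.matcher(line);
        if (!m.matches()) {
            return line;
        }
        String access = m.group(4) == null ? "" : m.group(4);
        // Hypothetical target: OpenOnline file, "mode", "access", number
        return m.group(1) + "OpenOnline " + m.group(2)
            + ", \"" + m.group(3) + "\", \"" + access + "\", " + m.group(5);
    }

    public static void main(String[] args) {
        System.out.println(rewrite("Open SomeFilename For Binary Access Read As #39"));
    }
}
```

Keeping unmatched lines untouched makes it possible to run the pass repeatedly and review the remaining, more exotic Open statements by hand.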
For 2), can you illustrate that with an example? I'm trying to think of a situation where you know the UDT member's name in advance, but not its type.
That said, perhaps looking at VB's VarType function gives you an idea for solving that.

How does Windows interpret multiple VersionInfo Resources?

I am currently studying the VersionInfo Resource(s) for Windows.
It is kind of confusing that you can have multiple VS_VERSIONINFO/VS_FIXEDFILEINFO structures within a VS_VERSION_INFO Resource.
As far as I understand it, you can have multiple RT_VERSION->VS_VERSION_INFO resources with different language ids (as shown in the picture).
These 2 language ids (0 and 1031) actually each contain a different VS_VERSIONINFO/VS_FIXEDFILEINFO.
0 is a neutral language and seems to be prioritized over the actual local language id (which is 1031).
To me this seems to be kind of a mess and confusing.
How is it possible to have multiple VS_VERSIONINFO structures within a VS_VERSION_INFO resource, and what is the point? How does Windows interpret multiple resources and structures?
And how is it possible to get only one piece of buffer when you call GetFileVersionInfo?
It all makes little sense to me and I can't find much documentation about it.
You have to distinguish between the textual information and the bare VS_FIXEDFILEINFO block. The fixed block exists only once. The text information is language dependent.
"Windows" does not prefer a specific one ;) What the explorer does is a different thing. It just shows the resource information. But in fact this is just the string information and not the information from the fixed version info.
When you call GetFileVersionInfo you get all language blocks! VerQueryValue is used to access the separate blocks.
The installer and other routines inside windows only use the VS_FIXEDFILEINFO block. They don't care about any text blocks. And this block only exists once.
I assume that the explorer just shows the first text block and also doesn't prefer a specific one. Just use a text editor and exchange the blocks in the resource file. But maybe the resource compiler reorders them.
To access the separate parts:
- VerQueryValue with "\" gives you the fixed version info block VS_FIXEDFILEINFO
- VerQueryValue with "\VarFileInfo\Translation" gives you a list of translations
- with "\StringFileInfo\langId_charset\keyname" you get the specific string parts
You can find this information on MSDN.
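As a small illustration of the third query path, the sketch below builds the sub-block string for one language. It assumes the usual convention that the language id and code page are written as two four-digit hexadecimal numbers run together; the code page 1200 (Unicode) is an assumed example value, since the real pairs come from the "\VarFileInfo\Translation" list. The class and method names are made up.

```java
public class VersionQueryPath {
    // Builds "\StringFileInfo\<lang><codepage>\<key>" with lang and code page as 4-digit hex.
    static String stringValuePath(int langId, int codePage, String key) {
        return String.format("\\StringFileInfo\\%04x%04x\\%s", langId, codePage, key);
    }

    public static void main(String[] args) {
        // 1031 is the language id from the question; 1200 is an assumed code page.
        System.out.println(stringValuePath(1031, 1200, "FileVersion"));
    }
}
```

The resulting string is what would be passed as the sub-block argument to VerQueryValue for that language's string table.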

Is the ReplaceFile Windows API a convenience function only?

Is the ReplaceFile Windows API a convenience function only, or does it achieve anything beyond what could be coded using multiple calls to MoveFileEx?
I'm currently in the situation where I need to
write a temporary file and then
rename this temporary file to the original filename, possibly replacing the original file.
I thought about using MoveFileEx with MOVEFILE_REPLACE_EXISTING (since I don't need a backup or anything), but there is also the ReplaceFile API, which is mentioned under Alternatives to TxF.
This got me thinking: Does ReplaceFile actually do anything special, or is it just a convenience wrapper for MoveFile(Ex)?
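For reference, the two steps listed above (write a temporary file, then rename it over the original) can be sketched with the Java standard library. Files.move with REPLACE_EXISTING stands in here for MoveFileEx with MOVEFILE_REPLACE_EXISTING; that mapping is an assumption of this sketch, and the class and method names are made up. Nothing in it tries to carry over attributes of the file being replaced.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class TempThenRename {
    // Writes the new contents to a temp file in the same directory, then renames it over the target.
    static void replaceContents(Path target, String newContents) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        Path temp = Files.createTempFile(dir, "replace-", ".tmp");
        try {
            Files.write(temp, newContents.getBytes(StandardCharsets.UTF_8));
            Files.move(temp, target, StandardCopyOption.REPLACE_EXISTING);
        } finally {
            // No-op after a successful move; cleans up the temp file if a step failed.
            Files.deleteIfExists(temp);
        }
    }

    // Small self-check: replaces "old" with "new" in a scratch directory and returns what is read back.
    static String demo() {
        try {
            Path dir = Files.createTempDirectory("replace-demo");
            Path target = dir.resolve("data.txt");
            Files.write(target, "old".getBytes(StandardCharsets.UTF_8));
            replaceContents(target, "new");
            return new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Creating the temp file in the target's own directory keeps the rename on a single volume, which is what makes a plain rename possible in the first place.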
I think the key to this can be found in this line from the documentation (my emphasis):
The replacement file assumes the name of the replaced file and its identity.
When you use MoveFileEx, the replacement file has a different identity. Its creation date is not preserved, the creator is not preserved, any ACLs are not preserved and so on. Using ReplaceFile allows you to make it look as though you opened the file, and modified its contents.
The documentation says it like this:
Another advantage is that ReplaceFile not only copies the new file data, but also preserves the following attributes of the original file:
Creation time
Short file name
Object identifier
DACLs
Security resource attributes
Encryption
Compression
Named streams not already in the replacement file
For example, if the replacement file is encrypted, but the replaced file is not encrypted, the resulting file is not encrypted.
Any app that wants to update a file by writing to a temp and doing the rename/rename/delete dance (handling all the various failure scenarios correctly), would have to change each time a new non-data attribute was added to the system. Rather than forcing all apps to change, they put in an API that is supposed to do this for you.
So you could "just do it yourself", but why? Do you correctly cover all the failure scenarios? Yes, MS may have a bug, but why try to reinvent the wheel?
NB, I have a number of issues with the programming model (better to do a "CreateUsingTemplate") but it's better than nothing.

How might one cope with the ambiguous value produced by GetDllDirectory?

GetDllDirectory produces an ambiguous value. When the string this call produces is empty, it means one of the following:
nobody has called SetDllDirectory
somebody passed NULL to SetDllDirectory
somebody passed an empty string to SetDllDirectory
The first two cases are equivalent for my purposes, but the third case is a problem. If I want to write save/restore code (call GetDllDirectory to save the "old" value, SetDllDirectory to set a "new" value temporarily, and later SetDllDirectory again to restore the "old" value), I run the risk of reversing some other programmer's intent.
If the other programmer intended for the current working directory to be in the DLL search order (in other words, one of the first two bullets is true), and I pass an empty string to SetDllDirectory, I will be taking the current working directory out of the DLL search order, reversing the other programmer's intent.
Can anyone suggest an approach to eliminate or work around this ambiguity?
P.S. I know having the current working directory in the DLL search order could be interpreted as a security hole. Nevertheless, it is the default behavior, and my code is not in a position to undo that; my code needs to be compatible with the expectations of all potential callers, many of which are large and old and beyond my control.
No fix for this. Between a rock and a hard place, you ought to assume that NULL was passed. There is already a way to enable safe searching with a registry setting.
