Copying file security permissions - Windows

I'm copying a file from folder A to folder B and then trying to copy the file permissions. Here are the basic steps I'm using:
1. CopyFile(source, target)
2. GetNamedSecurityInfo(source, GROUP_SECURITY_INFORMATION | DACL_SECURITY_INFORMATION)
3. Print the source SD using ConvertSecurityDescriptorToStringSecurityDescriptor
4. SetNamedSecurityInfo(target, GROUP_SECURITY_INFORMATION | DACL_SECURITY_INFORMATION)
5. GetNamedSecurityInfo(target, GROUP_SECURITY_INFORMATION | DACL_SECURITY_INFORMATION)
6. Print the target SD using ConvertSecurityDescriptorToStringSecurityDescriptor
At #3 I get this SD:
G:S-1-5-21-1454471165-1482476501-839522115-513D:AI(A;ID;0x1200a9;;;BU)(A;ID;0x1301bf;;;PU)(A;ID;FA;;;BA)(A;ID;FA;;;SY)(A;ID;FA;;;S-1-5-21-1454471165-1482476501-839522115-1004)
At #6 I get
G:S-1-5-21-1454471165-1482476501-839522115-513D:AI(A;ID;0x1301bf;;;PU)(A;ID;FA;;;BA)(A;ID;FA;;;SY)
The call to SetNamedSecurityInfo returns ERROR_SUCCESS, yet the source and target files end up with different SDs. Why is that? What am I doing wrong here?
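For reference, a minimal sketch of that sequence (pywin32 is used here only for brevity; the actual code may be C/C++ calling the same Win32 APIs, and the paths are placeholders):
# Minimal sketch of the copy-and-clone-ACL sequence (pywin32 for brevity;
# paths are placeholders).
import win32file
import win32security

SRC = r"C:\folderA\file.txt"   # placeholder source path
DST = r"C:\folderB\file.txt"   # placeholder target path
INFO = (win32security.GROUP_SECURITY_INFORMATION |
        win32security.DACL_SECURITY_INFORMATION)

# Step 1: copy the file contents.
win32file.CopyFile(SRC, DST, 0)

# Steps 2-3: read the source SD and print it in SDDL form.
src_sd = win32security.GetNamedSecurityInfo(SRC, win32security.SE_FILE_OBJECT, INFO)
print(win32security.ConvertSecurityDescriptorToStringSecurityDescriptor(
    src_sd, win32security.SDDL_REVISION_1, INFO))

# Step 4: apply the source group and DACL to the target.
win32security.SetNamedSecurityInfo(
    DST, win32security.SE_FILE_OBJECT, INFO,
    None, src_sd.GetSecurityDescriptorGroup(),
    src_sd.GetSecurityDescriptorDacl(), None)

# Steps 5-6: read the target SD back and print it for comparison.
dst_sd = win32security.GetNamedSecurityInfo(DST, win32security.SE_FILE_OBJECT, INFO)
print(win32security.ConvertSecurityDescriptorToStringSecurityDescriptor(
    dst_sd, win32security.SDDL_REVISION_1, INFO))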

SHFileOperation can copy files together with their security attributes, but from your other question I see you're concerned that this won't work within a service. Maybe the following newsgroup discussions will provide some useful information for you:
Copy NTFS files with security
How to copy a disk file or directory with ALL attributes?
Copying files with security attributes

Robocopy, from the server tools kit (http://www.microsoft.com/downloads/details.aspx?familyid=9d467a69-57ff-4ae7-96ee-b18c4790cffd&displaylang=en), will copy all NTFS settings and ACLs; it's also more robust and reliable than copy/xcopy.
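For example, something like the following should copy a single file along with its security information (paths are placeholders; /SEC is shorthand for /COPY:DATS, and /COPYALL additionally carries over owner and auditing info):
robocopy C:\folderA C:\folderB myfile.txt /SEC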

What is the file ORA_DUMMY_FILE.f in Oracle?

Oracle version: 12.2.0.1
As you know, these are the Unix processes for the parallel servers in Oracle:
ora_p000_ora12c
ora_p001_ora12c
....
ora_p???_ora12c
They can also be seen with the view gv$px_process.
The spid for each parallel server can be obtained from there.
Then I look for the open files associated with the parallel server here:
ls -l /proc/<spid>/fd
And for several parallel servers I'm finding around 500-10000 file descriptors like this one:
991 -> /u01/app/oracle/admin/ora12c/dpdump/676185682F2D4EA0E0530100007FFF5E/ORA_DUMMY_FILE.f (deleted)
I've been closing them using the following (actually I've created a small script to do it, because there are thousands of them; a simplified sketch follows after these commands):
gdb -p <spid>
gdb> p close(<fd_id>)
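A simplified sketch of that cleanup idea (not the exact script; it just finds the leaked descriptors under /proc and closes them via the same gdb call):
# Simplified sketch: close every fd of a given parallel-server process that
# still points at a deleted ORA_DUMMY_FILE.f. The spid is passed as argument.
import os
import subprocess
import sys

spid = sys.argv[1]
fd_dir = "/proc/{}/fd".format(spid)

for fd in os.listdir(fd_dir):
    try:
        target = os.readlink(os.path.join(fd_dir, fd))
    except OSError:
        continue
    if "ORA_DUMMY_FILE.f" in target and "(deleted)" in target:
        # Attach to the process and close the descriptor, as in the gdb commands above.
        subprocess.run(["gdb", "-p", spid, "--batch",
                        "-ex", "p close({})".format(fd)])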
But after some hours the file descriptors start being created again (hundreds every day).
If they are not closed, eventually the Linux limit is reached and any parallel query throws an error like this:
ORA-12801: error signaled in parallel query server P001
ORA-01116: error in opening database file 132
ORA-01110: data file 132: '/u02/oradata/ora12c/pdbname/tablespacenaname_ts_1.dbf'
ORA-27077: too many files open
Does anyone have any idea of how and why these file descriptors are being created, and how to avoid it?
Edited: Added some more information that could be useful.
I've verified that when a new PDB is created, a DATA_PUMP_DIR directory is created in it (select * from all_directories) pointing to:
/u01/app/oracle/admin/ora12c/dpdump/<xxxxxxxxxxxxx>
The Linux directory is also created.
Also, one file descriptor is created pointing to ORA_DUMMY_FILE.f in the new dpdump subdirectory, like the ones described initially:
lsof | grep "ORA_DUMMY_FILE.f (deleted)"
/u01/app/oracle/admin/ora12c/dpdump/<xxxxxxxxxxxxx>/ORA_DUMMY_FILE.f (deleted)
This may be OK; the problem I face is the continuous growth of file descriptors pointing to ORA_DUMMY_FILE.f, which eventually hits the Linux limits.

NiFi - FetchSFTP - subfolders

I am using NiFi to transfer files between FTP locations.
I have to transfer files from an SFTP location to an FTP directory.
I have the below folder structure in the remote sftp location.
/rootfolder/
/subfolder1
/subfolder2
/subfolder3
I need to download the respective files from each subfolder to a local directory that has a similar structure.
My workflow includes
ListSFTP -> FetchSFTP (3) -> PutFTP
In ListSFTP
Remote path: /rootfolder
In FetchSFTP1
Remote path: /rootfolder/subfolder1
In FetchSFTP2
Remote path: /rootfolder/subfolder2
In FetchSFTP3
Remote path: /rootfolder/subfolder3
But this does not seem to work. Can someone help me with how I can transfer files from remote SFTP sub-folder(s)?
Thanks,
Aadil
You should be able to set ListSFTP to perform a recursive search; then, coming out of ListSFTP, each flow file will have attributes for "path" and "filename".
Let's say you had one file under each directory in your example; you should get three flow files like the following:
ff 1
path = /rootfolder/subfolder1
filename = file1
ff 2
path = /rootfolder/subfolder2
filename = file2
ff 3
path = /rootfolder/subfolder3
filename = file3
You should only need one FetchSFTP processor with Remote Filename set to ${path}/${filename}.
If you have the same structure on your destination system, just set PutFTP's Remote Path to ${path}.
If you have a slightly different structure, use UpdateAttribute to modify "path" right before PutFTP.

Writing Spark dataframe as parquet to S3 without creating a _temporary folder

Using pyspark I'm reading a dataframe from parquet files on Amazon S3 like this:
dataS3 = sql.read.parquet("s3a://" + s3_bucket_in)
This works without problems. But then I try to write the data
dataS3.write.parquet("s3a://" + s3_bucket_out)
I get the following exception:
py4j.protocol.Py4JJavaError: An error occurred while calling o39.parquet.
: java.lang.IllegalArgumentException: java.net.URISyntaxException:
Relative path in absolute URI: s3a://<s3_bucket_out>_temporary
It seems to me that Spark is trying to create a _temporary folder first, before writing into the given bucket. Can this be prevented somehow, so that Spark writes directly to the given output bucket?
You can't eliminate the _temporary directory, as it's used to keep the intermediate work of a query hidden until it's complete.
But that's OK, as this isn't the problem. The problem is that the output committer gets a bit confused trying to write to the root directory (it can't delete it, you see).
You need to write to a subdirectory under the bucket, with a full prefix, e.g. s3a://mybucket/work/out.
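In the snippet from the question that would look something like this (the work/out prefix is just an example):
# Write under a prefix inside the bucket instead of to the bucket root.
dataS3.write.parquet("s3a://" + s3_bucket_out + "/work/out")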
I should add that trying to commit data to S3A is not reliable, precisely because of the way it mimics rename() by doing something like ls -rlf src | xargs -p8 -I% "cp % dst/% && rm %". Because ls has delayed consistency on S3, it can miss newly created files, and so not copy them.
See: Improving Apache Spark for the details.
Right now, you can only reliably commit to s3a by writing to HDFS and then copying. EMR s3 works around this by using DynamoDB to offer a consistent listing.
I had the same issue when writing to the root of an S3 bucket:
df.save("s3://bucketname")
I resolved it by adding a / after the bucket name:
df.save("s3://bucketname/")

file's date changes after zip in and out again, according to XCOPY

So, here's the problem: I have files which are regular files, and they are put into a ZIP file (see below for details on the ZIP). Then I unzip them (see below for details on the tool used), and the files are restored. The date of the file is restored, as is standard in the ZIP/UNZIP tools used. When querying using DIR, or in Windows Explorer, the files involved have the same date as they had before being handled by the ZIP/UNZIP process.
So, all OK.
But then I'm using the XCOPY /D command to further manipulate different copies of those files on the disk ... and XCOPY says one file is newer than the other one. Given the fact that the date, hour and minutes are the same, the difference would have to be in something smaller, like seconds?
All involved disks have NTFS file system.
Example:
C:\my>dir C:\windows\Background_mycomputer.cmd C:\my\directory\Background_mycomputer.cmd
Volume in drive C is mycomputerC
Volume Serial Number is 1234-5678
Directory of C:\windows
31/12/2014 19:50 51 Background_mycomputer.cmd
1 File(s) 51 bytes
Directory of C:\my\directory
31/12/2014 19:50 51 Background_mycomputer.cmd
1 File(s) 51 bytes
0 Dir(s) 33.655.316.480 bytes free
C:\my>xcopy C:\windows\Background_mycomputer.cmd C:\my\directory\Background_mycomputer.cmd /D
Overwrite C:\my\directory\Background_mycomputer.cmd (Yes/No/All)? y
C:\windows\Background_mycomputer.cmd
1 File(s) copied
C:\my>xcopy C:\my\directory\Background_mycomputer.cmd C:\windows\Background_mycomputer.cmd /D
0 File(s) copied
C:\my>xcopy C:\windows\Background_mycomputer.cmd C:\my\directory\Background_mycomputer.cmd /D
0 File(s) copied
C:\my>unzip -v
UnZip 6.00 of 20 April 2009, by Info-ZIP. Maintained by C. Spieler. Send
bug reports using http://www.info-zip.org/zip-bug.html; see README for details.
Latest sources and executables are at ftp://ftp.info-zip.org/pub/infozip/ ;
see ftp://ftp.info-zip.org/pub/infozip/UnZip.html for other sites.
Compiled with Microsoft C 13.10 (Visual C++ 7.1) for
Windows 9x / Windows NT/2K/XP/2K3 (32-bit) on Apr 20 2009.
UnZip special compilation options:
ASM_CRC
COPYRIGHT_CLEAN (PKZIP 0.9x unreducing method not supported)
NTSD_EAS
SET_DIR_ATTRIB
TIMESTAMP
UNIXBACKUP
USE_EF_UT_TIME
USE_UNSHRINK (PKZIP/Zip 1.x unshrinking method supported)
USE_DEFLATE64 (PKZIP 4.x Deflate64(tm) supported)
UNICODE_SUPPORT [wide-chars] (handle UTF-8 paths)
MBCS-support (multibyte character support, MB_CUR_MAX = 1)
LARGE_FILE_SUPPORT (large files over 2 GiB supported)
ZIP64_SUPPORT (archives using Zip64 for large files supported)
USE_BZIP2 (PKZIP 4.6+, using bzip2 lib version 1.0.5, 10-Dec-2007)
VMS_TEXT_CONV
[decryption, version 2.11 of 05 Jan 2007]
UnZip and ZipInfo environment options:
UNZIP: [none]
UNZIPOPT: [none]
ZIPINFO: [none]
ZIPINFOOPT: [none]
C:\my>ver
Microsoft Windows [Version 6.1.7601]
C:\my>zip -?
Copyright (c) 1990-2006 Info-ZIP - Type 'zip "-L"' for software license.
Zip 2.32 (June 19th 2006). Usage:
zip [-options] [-b path] [-t mmddyyyy] [-n suffixes] [zipfile list] [-xi list]
The default action is to add or replace zipfile entries from list, which
can include the special name - to compress standard input.
If zipfile and list are omitted, zip compresses stdin to stdout.
-f freshen: only changed files -u update: only changed or new files
-d delete entries in zipfile -m move into zipfile (delete files)
-r recurse into directories -j junk (don't record) directory names
-0 store only -l convert LF to CR LF (-ll CR LF to LF)
-1 compress faster -9 compress better
-q quiet operation -v verbose operation/print version info
-c add one-line comments -z add zipfile comment
-# read names from stdin -o make zipfile as old as latest entry
-x exclude the following names -i include only the following names
-F fix zipfile (-FF try harder) -D do not add directory entries
-A adjust self-extracting exe -J junk zipfile prefix (unzipsfx)
-T test zipfile integrity -X eXclude eXtra file attributes
-! use privileges (if granted) to obtain all aspects of WinNT security
-R PKZIP recursion (see manual)
-$ include volume label -S include system and hidden files
-e encrypt -n don't compress these suffixes
C:\my>
Question: I do not want XCOPY to make updates where I know they are invalid because the time format is doing something wrong. How do I prevent that?
From how I see it, there are different things involved: XCOPY, very specific ZIP and UNZIP tools, and the NTFS file system. Which one is doing something wrong?
I must stress that apart from ZIP and UNZIP, there are no other changes done to the files, like changing one file and then making a change to another one less than 60 seconds later.
At the moment of the test, the time shown was NOT the current time, and not close to it either. No file is being adjusted to the current time; the times refer to the last changes of the file in question, which may be any time in the past. In this case, it's one day later, but it can be anything.
I noticed the peculiar behavior Raymond Chen describes when writing a PowerShell script (GitHub link) to freshen a zip archive using the System.IO.Compression and System.IO.Compression.FileSystem libraries.
Interestingly, Zip archives can store multiple copies of the same file with identical metadata (name, relative path, modification dates). Extracting the second copy of the file will fail in Windows Explorer because the file already exists.
When trying to prevent re-zipping a file that was already archived, I checked the relative path and date, and noticed that there was a discrepancy of up to two seconds in the LastWriteTime. This workaround compensates for the loss of precision:
$AlreadyArchivedFile = ($WriteArchive.Entries | Where-Object { # zip will store multiple copies of the exact same file - prevent this by checking if already archived.
    (($_.FullName -eq $RelativePath) -and ($_.Length -eq $File.Length)) -and
    ([math]::Abs(($_.LastWriteTime.UtcDateTime - $File.LastWriteTimeUtc).TotalSeconds) -le 2) # ZipFileExtensions timestamps are only precise within 2 seconds.
})
Also, the IsDaylightSavingTime flag is not stored in the Zip archive. As a result I was surprised when extracted files became an hour newer than the original archived file. I tried this several times and saw the extracted file's timestamp incremented by an hour every time it was compressed and extracted.
Here's a very ugly workaround that decreases the archived file time by one hour to make the original source file and extracted file timestamps consistent:
If ($File.LastWriteTime.IsDaylightSavingTime() -and $ArchivedFile) { # HACK: fix for buggy date - adds an hour inside archive when the zipped file was created during PDT (files created during PST are not affected).
    $entry = $WriteArchive.GetEntry($RelativePath)
    $entry.LastWriteTime = ($File.LastWriteTime.ToLocalTime() - (New-TimeSpan -Hours 1))
}
There's probably a better way to handle this. Unfortunately I'm not aware of any way to store a Daylight Savings indicator for a file in a .Zip archive, and that information is lost.

Talend - Get all files (in several directories) from FTP

There are many FTP components to extract files. What should I use if I have a root directory, with some sub-directories and several files in all of them, and I want to extract all the files?
For example:
rootDirectory
- file1.txt
- file2.txt
- file3.txt
- subDirectory1
  - file4.txt
  - file5.txt
- subDirectory2
  - file6.txt
- subDirectory2
  - file7.txt
  - file8.txt
How can I get files 1 to 8, just by giving the component the path to the rootDirectory?
I've not used the FTP components yet, but typically you'd use a tFileList connected to a tFileCopy to move files around. So in your case I would expect you to use a tFTPFileList connected to your FTP server with a filemask of "*.txt", and then connect that to a tFTPGet. Set this component to the local directory of your choice and a remote directory of "/", and then use ((String)globalMap.get("tFTPFileList_1_CURRENT_FILEPATH")) in your Filemask.
This approach seems to be the one described in the Talend documentation, although viewing it might require logging in (free account sign-up, and probably worth doing if you're using Talend much at all).
It's probably equally fair to say that unless you're planning on doing something more complicated with the data than just grabbing it, most FTP tools should comfortably be able to GET everything from an FTP server, and Talend might not be the best approach here.
