Possible to extract an embedded image to a file?

Given an SSRS report definition file with an embedded image in it, I'm wondering whether it's possible to extract that image XML and recreate the original image file.
For example, inside the rdlc file you might see XML like this:
<EmbeddedImage Name="tick">
<MIMEType>image/bmp</MIMEType>
<ImageData>Qk1mAwAAAAAAADYAAAAoAAAAEAAAABEAAAABABgA ... <<REST OF IMAGE HERE>>
</ImageData>
</EmbeddedImage>
Is it possible to take the ImageData and transform it in some way to re-create the original image bitmap byte stream?
(This might be useful in cases such as when you've lost the original image file on which the embedded image was based.)

Two approaches are detailed in this blog post:
Copy the encoded image from one report to another if you need to reuse it there.
Export a copy of the report to Excel and copy the image from the spreadsheet.
Or if you need access to the image more directly, I found this utility that will parse the XML and load and export the images. Looks like source code is available.

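I have created a small PowerShell script to solve this problem:
$ErrorActionPreference = 'Stop'
Get-ChildItem -Filter '*.rdl' | ForEach-Object {
    $reportFile = $_
    Write-Host $reportFile.Name
    # Parse the report definition as XML
    $report = [xml](Get-Content -Path $reportFile.FullName -Raw)
    $report.Report.EmbeddedImages.EmbeddedImage | ForEach-Object {
        $imagexml = $_
        # Use the MIME subtype (e.g. "bmp" from "image/bmp") as the file extension
        $imageextension = $imagexml.MIMEType.Split('/')[1]
        $filename = $imagexml.Name + '.' + $imageextension
        Write-Host '->' $filename
        # ImageData is Base64 text; decode it back to the original bytes
        $imageContent = [System.Convert]::FromBase64String($imagexml.ImageData)
        # WriteAllBytes also works in PowerShell 6+, where Set-Content -Encoding Byte is no longer available
        [System.IO.File]::WriteAllBytes((Join-Path $PWD.Path $filename), $imageContent)
    }
}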
https://gist.github.com/Fabian-Schmidt/71746e8e1dbdf9db9278
This script extracts all images from all reports in the current folder.

Open the XML (in Notepad++ or any other editor)
Look for the <ImageData></ImageData> tags
Copy the Base64-encoded string between the tags
Find a utility that converts Base64-encoded strings to files. I used this website and downloaded the image
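The steps above can also be sketched in code. The following is a minimal Python sketch (Python is swapped in here for illustration; the answers in this thread use PowerShell). The function names `local_name` and `extract_embedded_images` are hypothetical, and the sketch assumes the report follows the EmbeddedImage / MIMEType / ImageData layout shown in the question. It matches elements by local name so it should not depend on which report-definition namespace the file declares.

```python
import base64
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an XML namespace wrapper such as '{uri}EmbeddedImage'."""
    return tag.rsplit("}", 1)[-1]


def extract_embedded_images(report_xml):
    """Return {file_name: image_bytes} for every EmbeddedImage element.

    Hypothetical helper: the file name is built from the Name attribute
    plus the MIME subtype (e.g. 'tick' + 'image/bmp' gives 'tick.bmp').
    """
    images = {}
    root = ET.fromstring(report_xml)
    for element in root.iter():
        if local_name(element.tag) != "EmbeddedImage":
            continue
        # Collect the child elements (MIMEType, ImageData) by local name
        fields = {local_name(child.tag): (child.text or "") for child in element}
        extension = fields.get("MIMEType", "image/bin").split("/")[-1].strip()
        # b64decode skips whitespace and line breaks inside the payload by default
        data = base64.b64decode(fields.get("ImageData", ""))
        images["{}.{}".format(element.get("Name"), extension)] = data
    return images
```

To write the files, read the report with `open(path, encoding="utf-8").read()`, pass the text to `extract_embedded_images`, and write each value with `open(name, "wb")`.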

I just needed to do this and realised that it is possible to cut and paste the embedded image, even though it is not possible to copy and paste.

Related

Script that searches for Word documents (.rtf) and changes the resolution of the images

I need a script that searches a folder for Word documents (.rtf) and changes the resolution of the images. The problem is that I have a lot of .rtf files that take up a lot of space because they contain high-resolution images. If I change the resolution of the images, the file size drops by about 97%. Please help me.
Thank you.
Unfortunately, there is no programmatic way to do "Select Image > Picture Format > Compress Pictures". It might be worth setting up an AutoHotKey script to run through your files.
If your RTF files were originally created in Word, they likely contain two copies of each image (the original file and a huge uncompressed version). You can change this behavior by setting ExportPictureWithMetafile=0 in the registry, then re-saving each file. This can be done with a script, for example:
# Set registry key (use the correct version number; mine is 16.0)
Try { Get-ItemProperty HKCU:\SOFTWARE\Microsoft\Office\16.0\Word\Options\ -Name ExportPictureWithMetafile -ea Stop }
Catch { New-ItemProperty HKCU:\SOFTWARE\Microsoft\Office\16.0\Word\Options\ -Name ExportPictureWithMetafile -Value "0" | Out-Null }
# Get the list of files
$folder = Get-ChildItem "c:\temp\*.rtf" -File
# Set up save-as-filetype (the Version may need adjusting to the interop assembly installed on your machine)
$WdTypes = Add-Type -AssemblyName 'Microsoft.Office.Interop.Word, Version=14.0.0.0, Culture=neutral, PublicKeyToken=71e9bce111e9429c' -Passthru
$RtfFormat = [Microsoft.Office.Interop.Word.WdSaveFormat]::wdFormatRTF
# Start Word
$word = New-Object -ComObject word.application
$word.Visible = $False
ForEach ($rtf in $folder) {
    $doc = $word.documents.open($rtf.FullName)
    # save as new name temporarily (otherwise Word skips the shrink process)
    $TempName = ($rtf.FullName).replace('.rtf','-temp.rtf')
    $doc.saveas($TempName, $RtfFormat)
    # check for success, then delete original file
    if (Test-Path $TempName) { Remove-Item $rtf.FullName }
    # re-save to original name
    $doc.saveas($rtf.FullName, $RtfFormat)
    # check for success again, then clean up temp file
    if (Test-Path $rtf.FullName) { Remove-Item $TempName }
    # close the document (it was just saved, so no extra save is needed)
    $doc.close()
}
$word.quit()
I made some default Word files with a 2 MB image, saved them as RTF (without the registry change), and saw that the RTF files were a ridiculous 19 MB! I ran the script above, and it shrank them down to 5 MB.

Show all files in folder and subfolders and list names and encoding

I want to see all the files in a folder and its subfolders and list their encodings.
I know that you can use git ls-files to see the files and file * to get each name plus its encoding.
But I need help doing both at the same time.
The reason is that we have a problem with encoding and need to see which files are encoded in which way. So I guess a PS script would work fine as well.
I think the best way to solve this with PowerShell is to first get your files, including those in subfolders, with the following script:
$folder = Get-ChildItem -Path "YourPath" -Recurse -File
Then, in a foreach ($file in $folder) loop, use one of the following scripts to get the encoding (which is straightforward):
https://www.powershellgallery.com/packages/PSTemplatizer/1.0.20/Content/Functions%5CGet-FileEncoding.ps1
https://vertigion.com/2015/02/04/powershell-get-fileencoding/
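As a rough illustration of what such a loop does, here is a minimal Python sketch (Python is swapped in for illustration and is not the linked scripts' implementation). It walks a folder recursively and guesses each file's encoding from its byte order mark (BOM) only. The names `guess_encoding`, `list_encodings`, and `NO_BOM` are hypothetical. A file without a BOM is reported as unknown, because telling UTF-8 from a legacy code page then requires content heuristics that this sketch does not attempt.

```python
import codecs
from pathlib import Path

NO_BOM = "no BOM (unknown)"

# UTF-32 must be checked before UTF-16: the UTF-32 LE BOM starts with the UTF-16 LE BOM
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def guess_encoding(path):
    """Guess an encoding from the first four bytes of the file (BOM only)."""
    with open(path, "rb") as handle:
        head = handle.read(4)
    for bom, name in BOMS:
        if head.startswith(bom):
            return name
    return NO_BOM


def list_encodings(root):
    """Map each file under root (relative path, '/' separators) to its guessed encoding."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): guess_encoding(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }
```

Printing the result line by line, for example `for name, enc in list_encodings(".").items(): print(name, enc)`, gives one "name encoding" pair per file.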

Downloading and opening a series of image urls

What I am trying to do is download 2 images from URL's and open them after download. Here's what I have:
@echo off
set files='https://cdn.suwalls.com/wallpapers/cars/mclaren-f1-gtr-42852-400x250.jpg','http://www.dubmagazine.com/home/media/k2/galleries/9012/GTR_0006_EM-2014-12-21_04_GTR_007.jpg'
powershell "(%files%)|foreach{$fileName='%TEMP%'+(Split-Path -Path $_ -Leaf);(new-object System.Net.WebClient).DownloadFile($_,$fileName);Invoke-Item $fileName;}"
I'm getting: 'Cannot find drive. A drive with the name 'https' cannot be found.'
It's the Split-Path command that is having problems, but I can't seem to find a solution.
You could get away with basic string manipulation but, if the option is available, I would opt for using anything else that is data aware. In your case you could use the [uri] type accelerator to help with these. I would also just opt for pure PowerShell instead of splitting between batch and PS.
$urls = 'https://cdn.suwalls.com/wallpapers/cars/mclaren-f1-gtr-42852-400x250.jpg',
        'http://www.dubmagazine.com/home/media/k2/galleries/9012/GTR_0006_EM-2014-12-21_04_GTR_007.jpg'
$urls | ForEach-Object {
    $uri = [uri]$_
    Invoke-WebRequest $_ -OutFile ([io.path]::combine($env:TEMP, $uri.Segments[-1]))
}
Segments[-1] gets you the last portion of the URL, which is a proper file name in your case. Combine() builds the target destination path for you. Feel free to add your Invoke-Item logic, of course.
This also lacks error handling for when the URL cannot be accessed, so be aware of that possibility. The code above was meant to be brief and give direction.
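For comparison, the same idea (parse the URL with a URL-aware type instead of a path function, then join the last segment onto the temp folder) can be sketched in Python, with the error handling mentioned above included. Python is swapped in here for illustration; `target_path_for` and `download` are hypothetical names, and the timeout value is an arbitrary assumption.

```python
import shutil
import tempfile
import urllib.request
from pathlib import Path, PurePosixPath
from urllib.parse import unquote, urlsplit


def target_path_for(url, directory=None):
    """Build a local path from the last path segment of the URL."""
    # urlsplit separates the path from scheme, host, and query string
    name = PurePosixPath(unquote(urlsplit(url).path)).name
    if not name:
        raise ValueError("URL has no file name component: " + url)
    return Path(directory or tempfile.gettempdir()) / name


def download(url, directory=None):
    """Download url into directory; return the saved path, or None on failure."""
    target = target_path_for(url, directory)
    try:
        # timeout=30 is an assumed value, not taken from the original answer
        with urllib.request.urlopen(url, timeout=30) as response, open(target, "wb") as out:
            shutil.copyfileobj(response, out)
    except OSError as error:  # URLError and HTTPError are subclasses of OSError
        print("Could not download {}: {}".format(url, error))
        return None
    return target
```

Opening the downloaded file afterwards is platform specific (on Windows, `os.startfile(path)` is the closest counterpart to Invoke-Item).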

Files bulk renaming - match a predefined text file

Good day,
I am trying to rename/organize files based on the match/lookup found in the text file.
I have a couple of hundred Cyrillic(Russian) named media files in a folder like this:
файл 35.avi
файл34.avi
файл2 4.avi
файл14.avi
*note that some files have spaces
The text file, with the desired names, looks like this:
файл 35.avi| 4. файл 35.avi
файл34.avi| 3. файл34.avi
файл2 4.avi| 1. файл2 4.avi
файл14.avi| 2. файл14.avi
The reason it looks that way (with | as a separator) is because I tried using "Bulk Renaming Utility" which uses pipe | as a separator for "Rename Pairs" function. So essentially, the filename to the right of pipe | is the final product. Unfortunately, that function does not work with Cyrillic(Russian) or other non standard characters.
I found a PowerShell script HERE which appears to be almost what I need, except that it does not match file names before renaming.
Similarly, I found this Python script HERE which does what I need, but it's for Ubuntu. Unfortunately, I am on Windows 7 and not sure whether it applies to me.
Any recommendations?
Thank you very much for your time!
You could read the text file into a hashtable, where the key is the old name (the value on the left hand side of the |), and the value is the new name:
$RenameTable = @{}
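Get-Content textfile.txt | ForEach-Object {
    # Split on the pipe and trim the space that follows it in the sample file
    $OldName, $NewName = $_.Split('|') | ForEach-Object { $_.Trim() }
    $RenameTable[$OldName] = $NewName
}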
Then rename the files based on what is in the hashtable:
Get-ChildItem .\folder\with\avi\files | Rename-Item -NewName {
    if ($RenameTable.ContainsKey($_.Name)) {
        $RenameTable[$_.Name]
    } else {
        $_.Name
    }
}
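Since the question also mentions a Python script, here is a minimal Python sketch of the same lookup-table approach (an illustration only, not the script linked in the question). The names `load_rename_pairs` and `rename_files` are hypothetical. It assumes the pairs file uses the old|new layout shown above and trims whitespace around both names.

```python
from pathlib import Path


def load_rename_pairs(text):
    """Parse 'old name|new name' lines into a dict, trimming surrounding whitespace."""
    pairs = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # skip blank or malformed lines
        old, new = line.split("|", 1)
        pairs[old.strip()] = new.strip()
    return pairs


def rename_files(folder, pairs):
    """Rename only files whose current name is a key in pairs; return the renames done."""
    renamed = []
    # sorted() materialises the listing first, so renaming does not disturb the iteration
    for path in sorted(Path(folder).iterdir()):
        new_name = pairs.get(path.name)
        if path.is_file() and new_name and new_name != path.name:
            path.rename(path.with_name(new_name))
            renamed.append((path.name, new_name))
    return renamed
```

Read the pairs file with an explicit encoding, for example `Path("textfile.txt").read_text(encoding="utf-8")`, so that the Cyrillic names survive. Files that are not listed are left untouched. An existing file with the target name is not checked for here.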

Merge .txt files in one .doc, adding file names and page breaks

I have a bunch of .txt files with various names in a folder, and I need to merge them into a single file that can be read in Office Word or LibreOffice Writer.
The tricky part is, the pasted files should be organized by creation date, have a title put before the content and a page break at the end, like this
Title of older file
File content
Page break
Title of newer file
File content
Page break
I could do this with Java, but it seems a little overkill. It would be nice if this could be done using Windows PowerShell or Unix bash. Added newlines should be Windows-style, though.
Full disclaimer: I know something about Bash, little about PowerShell and almost nothing about .doc/.odf formats.
Merging TXTs into one DOCX and adding page breaks (PowerShell, requires MS Word):
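[Ref]$rSaveFormat = "Microsoft.Office.Interop.Word.WdSaveFormat" -as [Type]
$oWord = New-Object -ComObject Word.Application
$oWord.Visible = $false
$sPath = "<path to dir with txt files>"
# Only .txt files, oldest first (the question asks for creation-date order)
$cInFiles = Get-ChildItem $sPath -Filter *.txt | Sort-Object CreationTime
$sOutFile = $sPath + "\outfile.docx"
$iWordPageBreak = 7   # wdPageBreak
$iNewLineChar = 11    # vertical tab, which Word treats as a line break
$oDoc = $oWord.Documents.Add()
$oWordSel = $oWord.Selection
foreach ($sInFile in $cInFiles) {
    # Get-Content returns an array of lines; join them with carriage returns (paragraph marks in Word)
    $sInFileTxt = (Get-Content $sInFile.FullName) -join "`r"
    # Title: the file name without its extension
    $oWordSel.TypeText($sInFile.BaseName)
    $oWordSel.TypeText([Char]$iNewLineChar)
    $oWordSel.TypeText($sInFileTxt)
    $oWordSel.InsertBreak($iWordPageBreak)
}
$oDoc.SaveAs($sOutFile, $rSaveFormat::wdFormatDocumentDefault)
$oDoc.Close()
$oWord.Quit()
$oWord = $null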
For explanations see this blog post on TechNet.
Edit: without Word you should probably use the ODT format and edit content.xml directly. Example in Python. Though personally I would simply concatenate the TXT files. Unless you have a million of them, it's faster and easier to add page breaks manually than to actually edit XML.
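The plain concatenation route can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the linked example: `merge_text_files` and `PAGE_BREAK` are hypothetical names, and the form feed character is used as a page separator on the assumption that the word processor honours it when opening plain text (worth checking in your Word or Writer version). Files are ordered by `st_ctime`, which is the creation time on Windows but the last metadata change on most Unix systems.

```python
from pathlib import Path

PAGE_BREAK = "\f"  # form feed; assumed to be rendered as a page break by the word processor


def merge_text_files(folder, out_file, encoding="utf-8"):
    """Concatenate *.txt files, oldest first, each preceded by its title line."""
    out_path = Path(out_file).resolve()
    files = [p for p in Path(folder).glob("*.txt") if p.resolve() != out_path]
    # st_ctime: creation time on Windows, metadata change time on most Unix systems
    files.sort(key=lambda p: (p.stat().st_ctime, p.name))
    parts = []
    for path in files:
        lines = [path.stem] + path.read_text(encoding=encoding).splitlines()
        # Windows-style newlines, as requested in the question
        parts.append("\r\n".join(lines) + "\r\n" + PAGE_BREAK)
    # Write bytes so that no extra newline translation happens on Windows
    out_path.write_bytes("".join(parts).encode(encoding))
    return [p.name for p in files]
```

The return value lists the merged files in the order they were written, which makes it easy to check the ordering before opening the result.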
