Ruby Imap / mail excluding specific files - ruby

My code gets all attachments from an email and checks the file extension before downloading it. So far I have excluded every file that is a picture, so .PNG and .JPG files.
mail.attachments.each do |attachment|
  next unless File.extname(attachment.filename).match(ACCEPTED_FILE_EXTENSIONS)
  File.open("../files/#{attachment.filename}", 'wb') do |file|
    file.write(attachment.body.decoded)
  end
end
The problem with this is that if I want to download .JPG and .PNG files, then every picture in the email will be downloaded. So if someone sends me an email with a custom-made footer containing an ad for some toothbrush, the picture of the toothbrush will also be downloaded, even though I don't want it. I only want the files the sender explicitly added with "attach a file", if that makes sense.
So my question is: is there a way to exclude these "random" pictures?
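One possible approach, sketched below, is to skip parts that are marked as inline, since footer and signature images are usually embedded with an inline content disposition. This assumes the attachments come from the mail gem, whose parts respond to inline?. The helper name explicit_attachments and the value of ACCEPTED_FILE_EXTENSIONS are hypothetical, because the original constant is not shown. Some mail clients also mark genuine attachments as inline, so treat this as a heuristic rather than a guarantee.

```ruby
# Hypothetical whitelist; the value of the original constant is not shown.
ACCEPTED_FILE_EXTENSIONS = /\A\.(pdf|png|jpe?g)\z/i

# Returns the attachments that look like files the sender explicitly attached.
# Assumes each attachment responds to #filename and #inline?
# (as parts from the mail gem do).
def explicit_attachments(attachments)
  attachments.reject(&:inline?).select do |attachment|
    File.extname(attachment.filename.to_s).match?(ACCEPTED_FILE_EXTENSIONS)
  end
end

# Usage with the loop from the question (not executed here):
#   explicit_attachments(mail.attachments).each do |attachment|
#     File.binwrite("../files/#{attachment.filename}", attachment.body.decoded)
#   end
```

The extension check is kept from the question, and the inline check is an extra filter placed in front of it.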

Related

Original filetype of ashx file

I'm building a PHP application that uses data from a web service. I add an image to a desktop application which then saves it to the web. The web service provides image URLs using the .ashx file extension. If I put one of these in an <img src="file.ashx?pictureId=abc123">, it displays as an image.
I want to store these images. I know they'll generally be .jpg files and can run file_get_contents on this and save it as such. However, if one was a .png, for example, I'd still be saving it with a .jpg extension, so it's an assumption I don't wish to make.
I've had a look at the raw string of characters of the file and cannot see any identifying features to tell me that it's a .jpg, apart from perhaps the clue that it was created in Photoshop. Nowhere does it say what kind of file it was originally, either extension or original filename.
Is there a way of finding the original filetype of a file served from an .ashx URL?
The question doesn't make any sense. Maybe the .ashx script generates the image on the fly out of nothing and there is no "original".
The correct question is: how to find the type of the image retrieved from the .ashx URL?
Save the image into a (temporary) file, then use getimagesize() to find its type (GIF, JPEG, PNG, etc.) and choose the correct extension for its final file name.

save html file to memorystream using vbscript

Long story short, I need to be able to save .msg files to .pdf using vbscript. As I can't do this directly using Outlook, I need an intermediate step of saving the .msg to .html (using Outlook) and then save the .html to .pdf (using Word). Is there a way to save the html file into memory somehow instead of actually having to create the .html file only to delete it later (as it's only an intermediate step)? I'm looking at memorystream...am I on the right track?

Images not displaying in word File after linking images in the file

I have a word file.
I have inserted the images as links to keep the size of the doc file down. I keep all the images in a folder and link to them from the doc file. This works fine.
But when I send the doc file and the image files to my friend, the images cannot be seen. This is obviously because the paths have changed.
I don't want to host the images online. If I did, the images would only show while I'm online.
If I'm offline, I can't view the images in the doc file.
How do I overcome this problem?
Why not just get your friend to edit the links so that each one points to the location of the image on their machine?

How to download pdf file in ruby without .pdf in the link

I need to download a PDF from a website that does not provide a link ending in .pdf, using Ruby. Manually, when I click on the link to download the PDF, it takes me to a new page, and the dialog box to save/open the file appears after some time.
Please help me in downloading the file.
The link
You can do this:
require 'open-uri'
File.open('my_file_name.pdf', 'wb') do |file|
  # Use URI.open: on Ruby 3.0+ plain Kernel#open no longer opens URLs
  file.write URI.open('http://someurl.com/2013-1-2/somefile/download').read
end
I have been doing this for my projects and it works.
If you just need a simple Ruby script to do it, I'd just shell out to wget, like this: exec 'wget "http://path.to.the.file/and/some/params"'
At that point, though, you might as well run wget directly.
The other way is to make a GET request to the page where you know the PDF is:
require 'net/http'
# Net::HTTP.get takes the host name without the scheme, then the path
source = Net::HTTP.get("the.website.com", "/and/some/params")
There are a number of other HTTP clients that you could use, but as long as you make a GET request to the endpoint that serves the PDF, it should give you the raw data. Then you can just save it under a .pdf name, and you'll have the PDF.
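Saving the response under a .pdf name only helps if the body really is a PDF and not, say, an HTML error page. As a minimal sketch, the helper below checks for the "%PDF-" signature that PDF files start with before writing anything to disk. The names pdf? and save_pdf are hypothetical, and Net::HTTP.get does not follow redirects, so a site that redirects before serving the file would need extra handling.

```ruby
require 'net/http'
require 'uri'

# PDF files begin with this signature.
PDF_SIGNATURE = '%PDF-'.b

# True when the given bytes start with the PDF signature.
def pdf?(bytes)
  bytes.to_s.b.start_with?(PDF_SIGNATURE)
end

# Fetches url and writes the body to path only if it looks like a PDF.
# Returns true when the file was written, false otherwise.
def save_pdf(url, path)
  body = Net::HTTP.get(URI(url))
  return false unless pdf?(body)

  File.binwrite(path, body)
  true
end
```

Calling save_pdf with the download URL and a target file name returns false instead of leaving a broken .pdf file behind when the server answers with something else.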
In your case, I ran the following commands to get the pdf
wget http://www.lawcommission.gov.np/en/documents/prevailing-laws/constitution/func-download/129/chk,d8c4644b0f086a04d8d363cb86fb1647/no_html,1/
mv index.html thefile.pdf
Then open the PDF. Note that these are Linux commands. If you want to get the file with a Ruby script, you could use something like what I mentioned above.
Update:
There is an added complication that was not initially stated, which is that the URL of the PDF changes every time the PDF is updated. To make this work, you probably want to do some web scraping. I suggest Nokogiri. That way you can parse the page where the download link is and then perform a GET request on the desired URL. Furthermore, the server that hosts the PDF is misconfigured and breaks Chrome within a few seconds of opening the page.
How to solve this problem: I went to the site and refreshed it, then broke the connection to the server (press the X where the refresh button would otherwise be). Then I right-clicked next to the download link and selected "Inspect element", and browsed the DOM to find something that identifies the link unambiguously (like an id). Thankfully, I found <strong id="telecharger"> Download</strong>. This means that you can use something like page.css('strong#telecharger')[0].parent['href'], where page is the parsed document. This should give you a URL. Then you can perform a GET request as described above. I don't have time to write the script for you (too much work to do), but this should be enough to solve the problem.

Cannot delete particular file

I am developing gallery viewer app. App will fetch the image file from Isolated Storage and will show in an image control. Most of the things are already setup and working fine.
Images are stored in folders, which act as albums, and the user can delete a whole album. I tested with many folders of assorted images, and delete works fine. But when a folder contains one particular image file, named "XXXX.jpg", that file doesn't get deleted, even though all the images from the folder are shown in the image control. Instead, an exception is thrown: "ArgumentUnhandledException". I tried renaming the file, but that made no difference.
Also, for testing purposes I am transferring folders (with images in them) using "Windows Phone Device Manager". I know it is not official, but it makes testing easy. The peculiar thing is that "Windows Phone Device Manager" cannot delete that particular file either, although I put the file into the app's isolated storage with that same tool. So I think there is some problem with the file itself.
How can I delete the file? And if I cannot, how can I know beforehand that some files cannot be handled properly and should not be put in isolated storage? Here is that file. The file is inside a zip file. I think the file itself is needed, rather than just an upload of the image to an image hosting site. Please take a look. Please also disregard the content of the image; it is just an arbitrary file that happens not to work, and I want to know why.
I found that the file was marked read-only, which caused the problem when a delete was attempted. Removing the read-only attribute solves the problem.
