I have an image that I'm trying to embed in the markdown text of the document. When I click on the "Insert image" icon and navigate to my image, it inserts an ENORMOUS block of text. I copy/pasted into a text file and it's over 287,000 characters long. It makes the notebook obnoxious to work with. My questions are:
Is there a way to redact this somehow?
Why on earth would the path be so long?
I've searched for workarounds (like this one, but I can't find any that work for me.
Related
I am trying to do exactly what is described in the following thread:
AppleScript/Automator: renaming PDF with extracted text content of this PDF
So I am using the Chino22's version and there are two issues with it:
First, instead of the contents of the pdf, theFileContentsText gets some metadata stuff.
Second, althought the script runs to the end, I get the following error for the last step:
error "The variable thisFile is not defined." number -2753 from "thisFile"
So, how do I get the text contents instead, and how do I define thisFile to the current pdf that is being processed in the loop?
Thanks in advance!
I would not expect the linked script to work.
Except for document metadata, extracting text content from PDF is notoriously difficult and unreliable, and not a road you want to go down if you can possibly avoid it. Adobe’s PDF file format is designed for printing, not for data processing. PDF files contain blocks of Postscript-like page drawing instructions, typically compressed, and while it’s possible for PDFs also to include the original plain text for accessibility use, most PDF generators do not do this so the only way to get the original text is by reconstructing it from those low-level drawing instructions—not a trivial job.
AppleScript’s read command only reads that raw file data; it does not parse it into drawing instructions, never mind translating those drawing instructions back into plain text. Change a PDF file’s extension to .txt and open it in a plain text editor, and you’ll see what I mean. Nasty.
If you need to work with the PDF’s original content (text, images, whatever), your best solution is to get those files before they were converted into a PDF.
If you must extract content from a PDF file, use an existing tool that knows how to do it.
For instance, if you’re lucky enough to have PDFs that contain XFDF (XML form) or accessibility data, there are 3rd-party apps and libraries to extract that content in readable form. I can’t think offhand of any that are AppleScriptable (Adobe Acrobat has only minimal AS support) so you’ll probably need to find one you can run from command line (do shell script in AS).
Or, if the PDFs have a consistent visual structure, a 3rd-party library such as Python’s PDFMiner (which I’ve used in the past) can identify blocks of characters by position and convert those back into strings with varying degrees of reliability (it has to convert font glyphs back into Unicode characters, guess at which characters are close enough to constitute a word, and where to insert space and return characters between those words). You’ll have to write some Python code to extract the bits you want, so look for tutorials to get started (or pay someone to write it for you).
But again, if you can possibly avoid having to extract text from PDF, you should. You will save yourself a lot of trouble.
(Only understandable if you're using Chrome)
Whenever I try to press Ctrl+U on an image page, nothing happens. So, I manually typed in the search bar "view-source:http://assets.silverdroid.ga/assets/discord/stop2.gif" and there, I found some very strange unicode characters, none of them readable.
Can someone explain to me what this is?
Linked screenshot of "image code".
That's the image data of the GIF. You can see that it starts with "GIF89a", which is a GIF format
Basically what you are seeing is the binary data of the image displayed as characters, that's why most of them are "weird" or non-readable.
I've looked at several posts, including this one, but I can't seem to get it to work. I've done as what one user commented under the chosen answer. Here's what I literally typed in the Markdown cell in iPython:
[blahblah](/Users/MyName/Python/testfile.png)
But it doesn't work. I also tried putting qutation marks around the file path and it doesn't work either. I'm certain that that is the correct path.
How can I upload an image? What should I literally type?
Image links in Markdown begin with a !, see https://daringfireball.net/projects/markdown/syntax#img. What you typed is only a link. Try this:
![blahblah](/Users/MyName/Python/testfile.png)
In case of spaces in your image path, escape them with \:
![blahblah](/Users/My\ Name/Python\ iNotebook/testfile.png)
While in class I like to take handwritten notes, afterwards I scan them and then type them up (helps me remember them and also makes them easily searchable). The main issue is I have is I use A LOT of drawings and complex math and converting the math formulas into latex (or word) is very time consuming and the drawings require that I keep the PDF and the text document. What I would like to do is take the basic text that I have typed myself (no OCR) and add a text layer to the PDF's that way the PDF's will be searchable and I can save a lot of time by not converting the math or drawings.
I've looked into Preview, PDFpenPro, acrobat, a couple of linux programs but so far I haven't really found anything that will do this.
Any idea of how I could do this or a program to use?
I also scan my notes. Sometimes I go back and add some text to them using this technique:
Open up the scanned pdf in Preview, then click on the "Edit" button in the top right corner, then the "Text tools" button on the left side (its a little box with Aa in it). From there you can drag open a text box and type into it.
Now the secret trick is that if you save it here as it is and try to open it in your ipad using PDFExpert or some other program then the text might not be there. So here's how to go through that slight hiccup: After you've annotated your notes how you want instead of just saving it as a pdf, use the Print option: File->Print or Command+P. Now click the PDF button on the left to "Save it as a pdf". Now that its printed you can open it and search it in any program that reads pdfs. Attached is an example.
One other thing, it seems like maybe you want to write over your existing handwritten text with typed text? I'm not sure if this is the best way. But if that's what I was trying to do I would:
Scan my notes
Read through them, typing them up as you said
Open the scanned notes in Photoshop or some other program
Draw a giant White Fill White Stroke rectangle over the handwritten text
Save it as a pdf
Do the technique above and copy and paste the typed text from step 2.
I hope this helps. And I wish you luck, I'm still working out the kinks myself for scanned notes but the possibilities have me pretty excited!
EDIT: I just checked out PDFpenPro, which I highly recommend because you don't have to go through that printing trick, you can just save the pdf document after annotating and other programs will recognize the annotations.
I'm looking for a nice way to generate either a Keynote file from XML or a Powerpoint file that I can then import to Keynote. Basically, I'm looking for a simple human-writable markup format (for easy scripting) that can be exported into slides.
I volunteer with a local nonprofit, where anything remotely technical falls to me. On a fairly regular basis, I'm sent information for events and produce a nice looking printed program in Word, though much of the same material also goes into slides in Keynote. (Keynote is used rather than PowerPoint so that Keynote Remote can be used.)
Anyway, there's a large volume of text I work with that I'm sent via email, and it has to go in both a Keynote presentation and a Word document, and requires all sorts of odd manual formatting to not break pages or slides at odd times, also requiring a good deal of manual restyling, since I'm not going to allow something I do to come out looking like something sloppy from the 1990s.
My hope is to write up a Ruby script that I can feed the source text to, and it'll go do all the processing for me, at least for Powerpoint or Keynote. I've normally had fantastic luck finding a gem for just about any format or service I've wanted to work with, but I haven't found anything that works with Powerpoint or Keynote.
My next thought was to have the Ruby code generate appropriate XML since both Office and I Work allegedly open the Office XML format, but I couldn't find any actual friendly documentation for human-writable XML code.
Is it wishful thinking to want to be able to do something like the following?
<SLIDE FORMAT="Title & Bullets">
<SLIDE_TITLE>
Lorem Ipsum
</SLIDE_TITLE>
<PARAGRAPH>
[etc.]
All I can find as far as converter scripts is all related to charts and tables and such which is of zero use here), usually revolves around opening or converting FROM Powerpoint or Keynote rather than creating, and furthermore generally seems to be for Windows using OLE or VBScript. This needs to run on the Macs they have there, so no Visual Studio stuff, Windows related scripting, etc will work. I don't HAVE to do it in Ruby, but that's what I'd be most comfortable with on the Mac end of things.
So is there documentation out there on a marginally friendly XML format for Powerpoint or Keynote, or even better, a Ruby gem for either?
If all you need to do is title + bullet point slides, you simply need to create an ascii text file. Each line of text will become the title of a new slide. But if the first character in a line of text is a tab, the line will become a first level bullet point on the same slide as the previous title. If two tabs, it indents the text to a second level bullet point and so on.
This becomes the title on slide one
This becomes the title on slide two
<tab>This is a bullet point, first level
<tab><tab>And this is a bullet point, second level
<tab>Back to first level bullet point
And another new slide
Once you have the text file, you can do File Open in PPT and force files of type to all files . and select your .TXT file. Or you can use Insert Slide From File to bring the .TXT file into an existing presentation.
There's a limit to the number of slides you can create at one go like this; 100 perhaps?
Note also that VBA disappeared in Mac Ofice 2008 but is back in Mac Office 2011, so if you can find examples of VB/VBA code that do what you want, you can use them on Mac, so long as it doesn't have to happen in Office 2008.