Extract Hyperlinks From Rich Text Clipboard Contents Or Text Selection On The Mac - macos

I would like to be able to get a list of all the hyperlinked URLs in any formatted text that I select on the Mac (formatted text such as a web page or a word processor document).
Preferably, I'd like to use AppleScript or Automator to extract this list of hyperlinks from the text (so that I can then use AppleScript to perform further processing on these URLs).
Note that I am talking about hyperlinks being extracted from formatted text, not just extracting URLs from text containing plaintext URLs.
This hyperlink extraction from formatted text seems like it should be a simple programming task, but I have been struggling to find a way to do this in either Applescript or Automator.
Automator can be set to accept rich text input from a text selection, or can input rich text from the clipboard, but I cannot find any way to access this rich text as a string within Automator or Applescript, such that I can then extract the hyperlinked URLs from the string of rich text data.
Once I get access to the rich text data as a string, there will be no problem in extracting the URLs.
Any suggestions on how I might solve this issue are gratefully received.

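AppleScript itself cannot unpack the rich text data, so you'll have to use a helper tool one way or another. You can call textutil through do shell script to expose the embedded links. The perl step below decodes hex-encoded RTF data of the form «data RTF …» (the form in which osascript prints RTF clipboard contents) back into raw RTF, and textutil then converts that RTF to HTML, in which each link appears as an ordinary href attribute: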
perl -ne 'print chr foreach unpack("C*",pack("H*",substr($_,11,-3)))' |
textutil -stdin -stdout -convert html -format rtf
Then, you'll have to extract the URLs. I would suggest using the Automator action 'Extract Data' to do this. If you set the whole thing up as an Automator Workflow, you could invoke it from Applescript. Or, if you save it as a Service, you can just run the whole thing from the Service.
Let me know if you have questions. You can see variations on this technique here.
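As an alternative to the Extract Data action, the URL extraction step could also be done in the shell, right after textutil. This is only a rough sketch: extract_links is a made-up function name, and it assumes the HTML writes each link as href="..." with double quotes, so it is a pattern match and not a real HTML parser.

```shell
#!/bin/sh
# extract_links (hypothetical helper): read HTML on stdin and print the value
# of every double-quoted href attribute, one URL per line.
extract_links() {
  # grep -o prints each match on its own line, even with several per input line.
  # The first sed expression strips the href="..." wrapper, the second turns
  # the HTML entity &amp; back into a plain ampersand.
  grep -oE 'href="[^"]*"' |
    sed -E -e 's/^href="(.*)"$/\1/' -e 's/&amp;/\&/g'
}

# Possible use on a Mac (not run here; pbpaste -Prefer rtf is assumed to
# return the clipboard's RTF flavor):
#   pbpaste -Prefer rtf | textutil -stdin -stdout -convert html -format rtf | extract_links
```

From AppleScript, the whole pipeline could be wrapped in a single do shell script call, and the result split into a list with paragraphs of.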
Update:
If you want to turn this into a Service, it is difficult to coerce Automator's built-in input into RTF. An effective workaround is to ignore the input and do a
keystroke "c" using command down
to copy the selected text to the clipboard, and then run the workflow from there.

Related

AppleScript does not display dialogs that contain Unicode characters

I use AppleScript's script editor. When I try to display \u... formatted characters in a dialog box as readable text, I don't succeed. What's the problem?
Here's what I tried:
set theTextItems to (do shell script "printf \"\\u82f9\\u679c\"")
display dialog theTextItems as text
The output of the dialog box that pops up is:
\u82f9\u679c
It sounds like the OP is trying to parse JSON-encoded data using plain AppleScript. Don’t. Use the JSON Helper app (available in the App Store) or NSJSONSerialization via the AppleScript-ObjC bridge. Those will process any character escapes for you.

Automator adding unwanted line breaks

I'm using Automator to create an HTML page and everything works great but I'm running into one small problem. The user is asked for information at the beginning that is then set into variables. The page is created by grabbing some code using Get Specified Text and copying it to the clipboard, getting one of the variables and then putting them both into a text document. This process is then repeated several times, eventually creating an HTML file. I'm running into issues because Automator is creating line breaks (maybe carriage returns?) in between each bit of specified text and each variable. So, what I want to look like this:
<code grabbed using "Get Specified Text" followed by a Variable. And now some more text and another Variable.>
ends up looking like this:
<code grabbed using "Get Specified Text" followed by a
Variable
. And now some more text and another
Variable
.>
This is breaking my page in a few parts. Is there a way to prevent these line breaks?
The items passed along from action to action are in a list, so it appears that setting the TextEdit contents separates the individual items by a newline, which is the normal paragraph delimiter.
Many of the text actions assume TextEdit and/or rich text and don’t use variables (or get along with other plain text actions), so a Run AppleScript action can be placed before such an action to convert or concatenate the items (Mojave).
Automator (or TextEdit for that matter) isn’t really a very good tool for HTML editing. You might take a look at BBEdit (the light version is free), which also has excellent AppleScript support.
EDIT:
Use the following in a Run AppleScript action to combine the text using a specified delimiter (this example uses an empty string):
on run {input, parameters}
    set separator to "" -- text to separate the items with
    set tempTID to AppleScript's text item delimiters
    set AppleScript's text item delimiters to separator
    set output to input as text
    set AppleScript's text item delimiters to tempTID
    return output
end run
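If you would rather not use AppleScript for this step, a Run Shell Script action could probably do the same join. The following is only a sketch: join_items is a made-up name, and it assumes the action is set to pass its input to stdin, so that each item arrives on its own line.

```shell
#!/bin/sh
# join_items (hypothetical helper): read one item per line on stdin and print
# them all on a single line, separated by the string given as $1 (may be empty).
join_items() {
  # Print the separator before every item except the first, and never print
  # a newline, so no line breaks end up between the items.
  # Note: awk interprets backslash escapes in the -v value, so this sketch
  # is meant for simple separators such as "" or ", ".
  awk -v sep="$1" 'NR > 1 { printf "%s", sep } { printf "%s", $0 }'
}

# Inside a Run Shell Script action this might simply be:
#   join_items ""
```

As with the AppleScript version, an empty separator glues the specified text and the variable values together with nothing in between.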

Support "styled text" in a scriptable Mac application (Cocoa Scripting)

My app supports being scripted with Applescript.
I am trying to make styled text content, stored in NSAttributedString objects, available to an Applescript user.
I thought I could simply deliver styled text with the NSAttributedString class, just like I deliver plain text with the NSString class, but that does not work - Cocoa Scripting then reports that it cannot convert or coerce the data.
I wonder if I'm missing something or if this is just plain impossible with the standard classes supported by Cocoa Scripting?
AppleScript does know the "styled text" type, as seen in this example:
set stxt to "foo" as styled text
So, if AppleScript knows this type by default, shouldn't the Cocoa Scripting engine support it as well somehow?
As always there are many choices for solving an AS problem.
In my scriptable text editor (Ted), I implemented the Text Suite, which is based on rich text (NSTextStorage, a subclass of NSMutableAttributedString). I wanted to be able to script tabs in my paragraphs, so I added a style record, which contains all the paragraph style information. This lets me write scripts like this:
tell application "Ted"
    set doc1 to make new document at beginning with properties {name:"Document One"}
    tell doc1
        set p1 to make new paragraph at end with data "Paragraph One" with properties {size:24, color:maraschino}
        set p2 to make new paragraph at end with data "Paragraph Two" with properties {style:style of paragraph 1}
        set color of paragraph 1 to blue
    end tell
    set doc2 to make new document at beginning with properties {name:"Document Two"}
    copy p1 to beginning of doc2
    properties of paragraph 1 of doc2
end tell
Since p1 is rich text, the second document ends up with both the text and formatting of the first paragraph of the first document.
You can also ask for the properties of a piece of text, where I have implemented the usual Text Suite properties, as well as a "style" property for paragraph style (backed by NSParagraphStyle, since I wanted to be able to script the tab stops):
properties of paragraph 1 of doc2
Result:
{height:60.0, italic:false, size:24, style:{paragraph spacing after:0.0, head indent:0.0, line break mode:0, alignment:4, line spacing:0.0, minimum line height:0.0, first line head indent:0.0, paragraph spacing before:0.0, tabs:{"28L", "56L", "84L", "112L", "140L", "168L", "196L", "224L", "252L", "280L", "308L", "336L"}, tail indent:0.0, maximum line height:0.0, line height multiple:0.0, default tab interval:0.0}, color:blue, width:164.109375, font:"Helvetica", bold:false, class:attribute run}
This works well for passing rich text within my application, but may not be as useful for passing styled text to other applications, which may be what you wanted to do. I think adding a "style" property (of type record) is probably the best way to convey style info for use in other scriptable apps. Then in the second app, the scripter can make use of any properties in the style record that the second app understands.
It looks like there is no implicit support for styled text in AppleScript, and there is no common interchange record type for passing styled text either.
AppleScript was developed in the pre-OS X days, when styled text was often represented by a combination of plain text (in System or MacRoman encoding) and a styl resource. With Unicode came an alternative style format, ustl. These are still used with the Carbon Pasteboard API (PasteboardCreate etc.) today. Yet none of them seem to have made their way into common use with AppleScript.
The fact that AppleScript knows of a styled text type has no special meaning. Even its class is just text.
Update
I just found that Matt Neuburg's book "AppleScript: The Definitive Guide" mentions styled text and gives an example that does indeed show a record containing both the plain text (class ktxt) and the style data (class ksty), with data of type styl, just as I had expected above. He also points out that most applications don't use that format, though.
So, it appears using a record with style resource data is indeed the intended way, only that hardly anyone knows about it. Go figure.

AppleScript to take text and turn it into pasteable HTML

We work with Bugzilla. Whenever you need to look up a ticket, you just need to know the bug id (an integer) and append it to this URL:
http://<bugzilla_server>/bugzilla/show_bug.cgi?id=<bug_id>
Suppose I have a bug link which looks like this: 777. If I select and copy it, the link is preserved on the pasteboard, so when I paste it into Mail it will correctly preserve the link and its attributes.
What I am looking for is to simply type '777', select it, run an AppleScript on it, and have it replaced with a link like the one above. Can anyone help me out?
The following AppleScript will take the contents of the clipboard and replace it with the URL prepended:
set the clipboard to "http://bugzilla_server/bugzilla/show_bug.cgi?id=" & (the clipboard)
You can compile that to an AppleScript scpt and make it available in a Scripts folder or compile it to a launchable app:
osacompile -e 'set the clipboard to "http://bugzilla_server/bugzilla/show_bug.cgi?id=" & (the clipboard)' -o replacebug.scpt # or -o replacebug.app
If your primary use case for this is in composing mail in Mail.app, this may not be the most user-friendly approach, though. If you are using Snow Leopard (10.6), a simpler solution is to take advantage of the new Text Substitution feature. Open the System Preferences -> Language & Text preference panel, select the Text tab, and click + to add a new substitution, perhaps:
Replace: (b)
With: http://bugzilla_server/bugzilla/show_bug.cgi?id=
Then, in Mail.app, start a New Message and, with the cursor clicked within the text body, do a Control click of the mouse to bring up the contextual menu. From it, select Substitutions -> Text Replacement. From now on, as you are typing in the text body of the email when you type:
(b)777
the (b) will automatically change to the URL text you saved:
http://bugzilla_server/bugzilla/show_bug.cgi?id=777
This will also work in other Cocoa text-enabled applications like Safari.
EDIT:
When talking about composing URL links in email, there are at least three different formats of email, each with a different solution. Since you don't say which kind you are using, I'll cover all three:
Plain text format - There's no way to "hide" the URL in the composed email although some email readers might present a clickable link for a plain-text URL.
HTML-formatted email - Apple's Mail.app does not support composing email in this format although it will display it. Using some other mail writer client or your own program, it's easy enough to compose a link using a standard HTML anchor <a href=...> tag.
Rich Text Format email - AFAIK, this is the only way to compose a URL link with Mail.app. Unfortunately, there does not appear to be an easy way to directly create an RTF hyperlink using AppleScript commands. Based on a suggestion here, this is a way to do it by creating a modifiable RTF template via the clipboard.
In TextEdit.app, create a new Document window.
Insert the text you want to appear in the email, i.e. 777.
Select the text (⌘A) then add a link (⌘K). Enter the full URL also with 777 into the "Link destination" field; click OK.
Modify the text format as desired with Format menu commands.
Save the file (⇧⌘S) as temp.rtf with File Format -> Rich Text Format.
Close the document window.
Open a document window (⌘O) selecting file temp.rtf and selecting Ignore rich text commands.
Insert the following before the first line in the file:
#!/bin/sh
sed -e "s/777/$(pbpaste -Prefer txt)/g" <<EOF | pbcopy -Prefer rtf
Append EOF as a separate line at the end of the file.
It should now look something like this:
#!/bin/sh
sed -e "s/777/$(pbpaste -Prefer txt)/g" <<EOF | pbcopy -Prefer rtf
{\rtf1\ansi\ansicpg1252\cocoartf1038\cocoasubrtf250
{\fonttbl\f0\fswiss\fcharset0 Helvetica;}
{\colortbl;\red255\green255\blue255;}
\margl1440\margr1440\vieww9000\viewh8400\viewkind0
\pard\tx720\tx1440\tx2160\tx2880\tx3600\tx4320\tx5040\tx5760\tx6480\tx7200\tx7920\tx8640\ql\qnatural\pardirnatural
{\field{\*\fldinst{HYPERLINK "http://bugzilla_server/bugzilla/show_bug.cgi?id=777"}}{\fldrslt
\f0\fs24 \cf0 777}}}
EOF
Save this as a Plain Text file and execute directly as a shell script or call it via the AppleScript do shell script command.
This kind of solution will work with most other applications that support Rich Text format.
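If you'd rather not maintain a saved template file, the RTF could also be generated on the fly. This is a minimal sketch, not a tested replacement for the template above: rtf_link and bug_link are made-up helper names, the RTF is stripped down to the bare hyperlink field (no font or color tables), and whether a given application accepts such a minimal document is an assumption.

```shell
#!/bin/sh
# rtf_link TEXT URL (hypothetical helper): print a minimal RTF document whose
# only content is TEXT, hyperlinked to URL. Assumes TEXT and URL contain no
# characters that are special in RTF (backslashes or braces).
rtf_link() {
  # In the printf format, each \\ produces one literal backslash.
  printf '{\\rtf1\\ansi{\\field{\\*\\fldinst{HYPERLINK "%s"}}{\\fldrslt %s}}}\n' "$2" "$1"
}

# bug_link ID (hypothetical helper): link a bug number to its Bugzilla page,
# using the same placeholder server name as the template above.
bug_link() {
  rtf_link "$1" "http://bugzilla_server/bugzilla/show_bug.cgi?id=$1"
}

# Possible use on a Mac (not run here), mirroring the template script:
#   bug_link "$(pbpaste -Prefer txt)" | pbcopy -Prefer rtf
```

The template approach keeps whatever fonts and formatting you set up in TextEdit, while this version leaves the styling to the receiving application.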
Not sure exactly which function you're looking for, but this will take a number from your clipboard, turn it into a link, and put the link on the clipboard as a standard href URL that will work in plain or rich text, like:
<a href="http://<bugzilla_server>/bugzilla/show_bug.cgi?id=777">Bug number 777 link</a>
Change <bugzilla_server> to your working URL.
set bug_number to the clipboard
set the_text to "<a href=\"http://<bugzilla_server>/bugzilla/show_bug.cgi?id=" & bug_number & "\">Bug number " & bug_number & " link</a>"
set the clipboard to the_text

Using AppleScript to hide Keynote Text Fields in A Slide

I am no AppleScript Jedi, I've only done a few simple things, but I haven't been able to figure this one out and could use some help:
My wife uses slides for her Art History courses and would like to use the same slides for exams (sans identifying names). Rather than create a new presentation, I'd like a tool that iterates through the slides and hides the text fields.
Looking through the Keynote dictionary didn't give me any clues as to how to approach this, any ideas?
AFAIK, with Applescript you can only access the title and the body boxes of the slides. If the text you wish to remove is consistently in either of these boxes the simplest solution would be to loop through the slides replacing that text and then saving a copy of the document.
tell application "Keynote"
    open "/Path/To/Document"
    repeat with currentSlide in slides of first slideshow
        set title of currentSlide to " "
        set body of currentSlide to " "
    end repeat
    save first slideshow in "/Path/To/Document without answers"
end tell
If the text is in a container created with the textbox tool, I don't think you can solve it with Applescript, but Keynote uses an XML based file format, so you could try doing it by editing the XML with your scripting language of choice. The XML schema is documented in the iWork Programming Guide.
