I’m attempting to create an AppleScript that will grab the URL from a selected hyperlink.
For some backstory: the system that my company has in place doesn’t play well with generating reports, so I created a script into which I can paste a list of URLs, at which point Safari will go through each page and select all the data, copy it, and parse out what I need.
However, each page that I’m parsing has a link on it that says, for example, “Edit”. If I paste it into, say, Pages, the hyperlink is preserved. It would GREATLY speed up my flow if I could somehow get the URL contained in that hyperlink.
Any ideas?
Drew, I suspect you got no answer because it's a little difficult to discern what you want. Here is a script that grabs the raw text of a web page, finds the first href hyperlink named "Edit", and returns the target URL it links to. It uses curl to pull the content and offset to find the link name. You might have to adjust the tag identifiers surrounding the link name you're searching for.
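property baseURL : "http://www.mycoolsite.index.html"
property linkName : "Edit"
-- Fetch the raw HTML of the page
set rawHTML to do shell script "curl " & quoted form of baseURL
-- Find the end of the anchor tag that wraps the link name
set theOffset to offset of ("\">" & linkName & "</a>") in rawHTML
if theOffset is 0 then error "No link named \"" & linkName & "\" was found"
-- Keep everything before the closing quote of the href value
set rawHTML to text 1 thru (theOffset - 1) of rawHTML
set otid to AppleScript's text item delimiters
set AppleScript's text item delimiters to "http://"
-- The last item is the URL without its scheme, so put the scheme back
set targetURL to "http://" & (text item -1 of (text items of rawHTML))
set AppleScript's text item delimiters to otid
return targetURL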
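If the pages are already open in Safari, another option is to ask the page itself for the link. This is only a sketch, assuming the link's visible text is exactly "Edit" and that "Allow JavaScript from Apple Events" is enabled in Safari's Develop menu; the variable name editURL is just for illustration.

```applescript
tell application "Safari"
	-- Run a small JavaScript snippet in the front document and return the
	-- href of the first link whose visible text is "Edit" ("" if none)
	set editURL to do JavaScript "
		(function () {
			var links = document.links;
			for (var i = 0; i < links.length; i++) {
				if (links[i].textContent.trim() === 'Edit') {
					return links[i].href;
				}
			}
			return '';
		})();" in document 1
end tell
return editURL
```

Because the browser resolves href itself, this should also give a full URL for relative links, which the curl approach would miss.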
I've been experimenting with the Pages object model and am trying to manipulate it with AppleScript. I'm able to use this command to add new pages to the document.
make new page
The problem I'm facing is that it always creates the new page at the beginning of the document instead of at the end of the document.
The dictionary says that the entire syntax is
set theResult to make new type class ¬
at location specifier ¬
with data anything ¬
with properties record
and I've tried to use BOTTOM and END as values for location specifier, and they've been unsuccessful. What am I missing, please, to create a new page or a new section at the bottom of the document?
Just think in natural language.
You want to make a new page at the end of the sections in the front document
tell application "Pages"
    tell front document
        set newPage to make new page at the end of sections
    end tell
end tell
I am very new to writing code. I've been looking at every way I can find of finding a string in a text document and then returning part of the string on the following line, ideally with the end goal of putting this extracted string into an Excel file, but I'm nowhere near that step yet. I've been playing around with a lot of different options and I cannot for the life of me get it to work. I feel like I'm close and it's killing me because I just can't figure out where I'm going wrong here.
Goal: to extract the name of the person who posted the job from the text below without knowing the person's name. I know the string "Job posted by" will immediately precede the name I'm looking for and I know " · " will immediately follow the name. Nowhere else in the text document does either of these surrounding strings appear.
I'm running OS X El Capitan
file name for this example is ExtractedTextOutput.txt
file location for this example is "/Users/RaquelBianca/Desktop/ExtractTextOutput2.txt"
My attempt so far is the following (my issue is that it appears to return the entire text document rather than just the name I'm looking for):
set theFile to ("/Users/RaquelBianca/Desktop/ExtractTextOutput2.txt")
set theFileContents to read theFile
set output to {}
set od to AppleScript's text item delimiters
set AppleScript's text item delimiters to {"
"}
set all_lines to every text item of theFileContents
repeat with the_line in all_lines
    if "Job posted by" is not in the_line then
        set output to output & the_line
    else
        set AppleScript's text item delimiters to {"Job posted by"}
        set latter_part to last text item of the_line
        set AppleScript's text item delimiters to {" "}
        set last_word to last text item of latter_part
        set output to output & ("$ " & last_word as string)
    end if
end repeat
set AppleScript's text item delimiters to {"
"}
set output to output as string
set AppleScript's text item delimiters to od
return output
Any and all help and ideas are enormously appreciated.
sample text in the file:
9/2/2016 Application Security Engineer Job at Datadog in Greater New York City Area | LinkedIn
60
Home Profile
Job description
My Network Jobs
Search for people, jobs, companies, and more... Interests
Advanced
Business Services
Go to Lynda.c
Application Security Engineer
Datadog
Greater New York City Area
Posted 15 days ago 93 views
1 alum works here
Apply on company website
We’re on a mission to bring sanity to cloud operations and we need you to build resilient and secure applications on our platform. What you will do
Perform code and design reviews, contribute code that improves security throughout Datadog's products Educate your fellow engineers about security in code and infrastructure
Monitor production applications for anomalous activity
Prioritize and track application security issues across the company
Help improve our security policies and processes
Job posted by
Ryan Elberg · 2nd
Head of Tech Talent Acquisition at Datadog Greater New York City Area
Send Inmail
I had some difficulty determining exactly what your second separator is. Your text example shows '·', but when I checked what comes just after 'Elberg' and before '2nd', I found 4 characters: code 32 (space), code 194 (¬), code 183 (∑), code 32 (space).
In the script below, I have used code 194. It works when I cut and paste your text example into a file. Here is the script:
set theFile to ("/Users/RaquelBianca/Desktop/ExtractTextOutput2.txt")
-- your separator seems to be code 32 (space), code 194 (¬), code 183 (∑), code 32 (space)
set Separator to ASCII character 194 -- is it correct ?
set theFileContents to read theFile
set myAuthor to ""
set od to AppleScript's text item delimiters -- remember the current delimiters
set AppleScript's text item delimiters to {"Job posted by "}
if (count of text items of theFileContents) is 2 then
    set Part2 to text item 2 of theFileContents -- this part starts just after "Job posted by "
    set AppleScript's text item delimiters to {Separator}
    set myAuthor to text item 1 of Part2
end if
set AppleScript's text item delimiters to od -- restore them
log "result=//" & myAuthor & "//" -- show the result in variable myAuthor
Note: if the text does not contain "Job posted by ", then myAuthor is an empty string.
You had the right idea to use AppleScript's text item delimiters, but the way you tried to extract the name was giving you trouble. First, though, I'll go through some things you can do to improve your script:
set all_lines to every text item of theFileContents
repeat with the_line in all_lines
if "Job posted by" is not in the_line then
set output to output & the_line
else
…
end repeat
There's no need to break the file contents into lines; AppleScript can operate on entire paragraphs or more, if desired.
Removing these unnecessary steps (and adding new ones to make it work on the entire file) shrinks the script considerably:
set theFile to ("/Users/RaquelBianca/Desktop/ExtractTextOutput2.txt")
set theFileContents to read theFile
set output to {}
set od to AppleScript's text item delimiters
if "Job posted by" is in theFileContents then
    set AppleScript's text item delimiters to {"Job posted by"}
    set latter_part to last text item of theFileContents
    set AppleScript's text item delimiters to {" "}
    set last_word to last text item of latter_part
    set output to output & ("$ " & last_word as string)
else
    display alert "Poster of job listing not found"
    set output to theFileContents
end if
set AppleScript's text item delimiters to od
return output
This right here is what's giving you wrong output:
set last_word to last text item of latter_part
set output to output & ("$ " & last_word as string)
This is incorrect. It's not the last word you want; that's the last word of the file! To extract the poster of the job listing, change it to the following:
repeat with theWord in latterPart
if the first character in theWord is "¬" then exit repeat
set output to output & theWord
end repeat
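The dot (·) that separates the name from the other text shows up as "¬∑" when run through the script. That happens because the file stores "·" as the two UTF-8 bytes 194 and 183, and a plain read decodes them with the system's primary encoding (typically Mac OS Roman), where those bytes are "¬" and "∑". So, we look for "¬" instead.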
Some last code fixes:
Some of your variable names use the_snake_case, while others use theCamelCase. It's generally a good idea to use one convention or another, so I fixed that, too.
I assumed you wanted that dollar sign in the output for whatever reason, so I kept it in. If you don't want it, just replace set output to "$ " with set output to "".
So, your final, working script looks like this:
set theFile to "/Users/RaquelBianca/Desktop/ExtractTextOutput2.txt"
set theFileContents to read theFile as text
set output to "$ "
set od to AppleScript's text item delimiters
if "Job posted by" is in theFileContents then
    set AppleScript's text item delimiters to {"Job posted by"}
    set latterPart to last text item of theFileContents
    set AppleScript's text item delimiters to {" "}
    repeat with theWord in latterPart
        if the first character in theWord is "¬" then exit repeat
        set output to output & theWord
    end repeat
else
    display alert "Poster of job listing not found"
    set output to theFileContents
end if
set AppleScript's text item delimiters to od
return output
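As an alternative to matching on "¬", the file can be read as UTF-8 so the dot stays a single "·" character. A minimal sketch, assuming the file really is UTF-8 encoded and that the separator is exactly space, middle dot, space; textBetween is a hypothetical helper name, not something built in.

```applescript
-- Hypothetical helper: returns the text between two markers,
-- or "" if either marker is missing
on textBetween(sourceText, startMarker, endMarker)
	set od to AppleScript's text item delimiters
	set foundText to ""
	try
		set AppleScript's text item delimiters to startMarker
		if (count of text items of sourceText) > 1 then
			set afterStart to text item 2 of sourceText
			set AppleScript's text item delimiters to endMarker
			if (count of text items of afterStart) > 1 then
				set foundText to text item 1 of afterStart
			end if
		end if
	end try
	-- Restore the delimiters even if something went wrong above
	set AppleScript's text item delimiters to od
	return foundText
end textBetween

set theFile to POSIX file "/Users/RaquelBianca/Desktop/ExtractTextOutput2.txt"
-- Decode the file as UTF-8 so "·" is not split into "¬∑"
set theFileContents to read theFile as «class utf8»
set posterName to textBetween(theFileContents, "Job posted by", " · ")
-- The name sits on the line after "Job posted by", so drop the line break
if posterName is not "" then set posterName to last paragraph of posterName
return posterName
```

With this approach an empty result means one of the two markers was not found, so the caller can decide whether to show an alert.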
I am looking for some help getting an AppleScript set up. I have been trying to copy and paste from different examples on the web, to no avail. I am setting up a journal / diary for a family member and need to have a text file that contains the following information.
The AppleScript will display a dialogue box asking for three things:
The name of an event
The date of the event
A description for the event
Each of those would be stored as a separate variable.
Then the script would ask for a selection of files from the Finder, nothing nested, just a selection of 15 - 30 files all contained in the same folder.
Finally a new TextEdit document would be created
The beginning of the document would have the (3) variables mixed in with some default text.
The middle of the file would be filled in with a repeat loop based on the number of files selected from the finder. Their file paths would be mixed in with additional default text.
The last section would be default text only, no variables required.
I am sure my description is way more complicated than the script will probably be. Would anyone be able to provide this script for me? It would be most appreciated.
Here is a rough idea what the final thing would look like. The bold areas are the variables.
The activity of the day was scuba diving.
The date you went scuba diving was January 1, 2016.
This is a description of your event. The day was quite beautiful and the water was perfect. You were able to see a wide variety of fishes!
These are the locations of the files from this event.
The first file is /events/scuba/scuba1.txt
These are the locations of the files from this event.
The first file is /events/scuba/scuba2.txt
These are the locations of the files from this event.
The first file is /events/scuba/scuba3.txt
This was a summary of your scuba diving activity. These memories will last a lifetime!
I appreciate the help with this. And if the family member in question was able to provide a thanks, know that they would as well.
You can do that like this:
set evName to text returned of (display dialog "The name of an event" default answer "")
set evdate to text returned of (display dialog "The date of the event" default answer "")
set evDesc to text returned of (display dialog "A description for the event" default answer "")
set theText to "The activity of the day was " & evName & return & "The date you went " & evName & " was " & evdate & return & evDesc & return & return
set x to choose file with multiple selections allowed
set def1 to "These are the locations of the files from this event."
set def2 to "The first file is "
repeat with i in x
    set theText to theText & def1 & return & def2 & (POSIX path of i) & return
end repeat
set theText to theText & return & "This was a summary of your " & evName & " activity. These memories will last a lifetime!"
tell application "TextEdit"
    make new document with properties {text:theText}
    activate
end tell
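Since the question asks for a text file, the same theText could also be written straight to disk instead of (or as well as) opening it in TextEdit. A rough sketch, assuming a file named journal.txt on the Desktop (a made-up name) and UTF-8 output; theText is stubbed here so the snippet runs on its own.

```applescript
-- Stand-in for the text built by the script above
set theText to "The activity of the day was scuba diving." & linefeed

-- Hypothetical destination: journal.txt on the current user's Desktop
set outPath to (path to desktop as text) & "journal.txt"
set fileRef to open for access file outPath with write permission
try
	set eof of fileRef to 0 -- empty the file if it already exists
	write theText to fileRef as «class utf8»
	close access fileRef
on error errMsg number errNum
	close access fileRef -- always release the file handle
	error errMsg number errNum
end try
```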
May I suggest an alternative solution using Evernote?
You could create a "template" note using a table to fill in the activity, date, and description. Any time you want a new journal entry, just select the template, and go to Note > Copy to Notebook.
Then you can attach and/or import the text of the files.
This would also allow you to add images and other attachments, and search much easier. And of course it is easy to share.
Let me know if you'd like more details.
I am trying to write a script to save a Word document in the single-file webpage (.mht) format. I am up to the part where I write the actual "save" command, and I'm stuck there. This is what I am trying to do:
# the_file is a variable which has been set here
tell application "Microsoft Word"
activate
open the_file
save the_file as [type]
end tell
The open part works just fine. But I don't know what to put in for the save type. Perhaps more importantly, I don't know where I can find a list of the available types. Can anyone help?
EDIT: A commenter suggested the Word dictionary; I found the following there but don't know how to interpret it [I'm an AS noob].
[file format format document97/format document97/format template97/format template97/format text/format text line breaks/format dostext/format dostext line breaks/format rtf/format Unicode text/format Unicode text/format HTML/format web archive/format stationery/format xml/format document/format documentME/format template/format templateME/format PDF/format flat document/format flat documentME/format flat template/format flat templateME/format custom dictionary/format exclude dictionary/format documentAuto/format templateAuto] : The format in which the document is saved.
Try format web archive. Of all the formats listed, that one looks the most likely.
1- You must pass a document to the save command, not a file path.
For better control, use the open command with the file name property; it returns the document object.
A plain open the_file returns nothing, so you would have to use front document, which is unreliable (for example, if another document opens afterwards).
2- Word does not change the extension when you save from AppleScript, so the script must replace the extension itself.
Also, I recommend the save as command instead of save, because it has more options.
Answer updated: use format HTML instead of format web archive.
set the_file to (choose file)
tell application "Microsoft Word"
    set thisDoc to open file name (the_file as string)
    set tName to my removeExtension(name of thisDoc)
    -- save in the same directory
    save as thisDoc file format format HTML file name (tName & ".htm") with HTML display only output
    close thisDoc saving no
end tell
on removeExtension(t)
    if t does not contain "." then return t
    set tid to text item delimiters
    set text item delimiters to "."
    set t to (text items 1 thru -2 of t) as string
    set text item delimiters to tid
    return t
end removeExtension
If you don't want HTML display only output, use without HTML display only output
I need to use rb-appscript to create a new Pages document that contains bulleted and numbered lists. Looking into this, I see that paragraphs have a property called list_style, but I'm not familiar enough with rb-appscript or AppleScript to figure out how to set that property. I have read the documentation generated by ASDictionary, but I know too little AppleScript to understand it.
Any help with either understanding how to use the information presented in the documentation, or writing a list using rb-appscript in pages would be much appreciated.
Edit: I'm not stuck on Pages; TextEdit is also a viable option.
rb-appscript:
require 'rubygems'
require 'appscript'; include Appscript
lst=["a", "b"]
doc = app('Pages').documents[0]
doc.selection.get.paragraph_style.set("Body Bullet")
doc.selection.set(lst.join("\n"))
AppleScript:
set lst to {"a", "b"}
set text item delimiters to linefeed
tell application "Pages" to tell document 1
    set paragraph style of (get selection) to "Body Bullet"
    set selection to (lst as text)
end tell
The current crop of Apple applications is weird to script. I don't use rb-appscript, but here is working code for AppleScript that you should be able to alter to taste and port:
property dummyList : {"Tyler Durden", "Marla Singer", "Robert Paulson"}
tell application "Pages"
    set theDocument to make new document
    tell theDocument
        set bulletListStyle to ""
        set lastListStyle to (count list styles)
        repeat with thisListStyle from 1 to lastListStyle
            set theListStyle to item thisListStyle of list styles
            if name of theListStyle is "Bullet" then
                set bulletListStyle to theListStyle
            end if
        end repeat
        repeat with thisItem from 1 to (count dummyList)
            set body text to body text & item thisItem of dummyList & return
        end repeat
        set paraCount to count paragraphs of theDocument
        repeat with thisPara from 1 to paraCount
            select paragraph thisPara
            set theSelection to selection
            set paragraph style of theSelection to "Body Bullet"
        end repeat
    end tell
end tell
What this does, essentially, is place each list item in its own paragraph (that is what a list item is for all intents and purposes: an indented paragraph with a bullet), select each paragraph in turn, then apply the list paragraph style to the selection. The paragraph object just returns the text of the given paragraph and does not hold any state in and of itself, for some reason. This isn't the best way to handle this scenario, but at least all the components are there to get you what you need.