I want to write an AppleScript that, when it sees JSON of a very specific format, copies to the clipboard each individual line with its key, as well as each individual value without its key.
For example
{
"stam": "value1",
"stam1": "value2",
"stam2": "value3"
}
I want the lines to be copied to the clipboard the following way (I mostly want them to land in my clipboard history, each line as its own entry):
"stam": "value1",
value1
"stam1": "value2",
value2
"stam2": "value3",
value3
Is it possible to do it?
As there don't seem to be any takers for an AppleScript version after 2 days, I'll offer a shell method, which you are welcome to ignore if it's not for you.
So, if you copy your JSON into your Clipboard and run the following in the Terminal, it should do what you ask:
pbpaste | awk -F':' '/:/ {gsub(/ /,""); print; gsub(/"/, "", $2); print $2}'
That pastes the Clipboard into awk (which is included with every macOS) and tells it to treat the colon (:) as the field separator. It then looks for and processes only lines containing colons, as follows: it removes all spaces and prints the line; then it removes all double quotes (") from the second field on the line and prints the result on a new line.
So, if you make a script called $HOME/go containing the following, it will write the result back onto your Clipboard:
#!/bin/bash
pbpaste | awk -F':' '/:/{gsub(/ /,"");print; gsub(/"/, "", $2); print $2}' | pbcopy
You can then make it executable (only necessary one time) with:
chmod +x $HOME/go
Then you can execute it from AppleScript with:
do shell script "/Users/YOU/go"
Or, if you insist on AppleScript, you can do something ugly like this:
set lns to paragraphs of (the clipboard as text)
set res to ""
repeat with ln in lns
    if (ln contains ":") then
        set res to (res & ln & "\n") as string
        set AppleScript's text item delimiters to ":"
        set fields to every text item of ln
        set AppleScript's text item delimiters to ""
        set val to (item 2 of fields)
        set res to (res & val & "\n") as string
    end if
end repeat
display dialog res
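If you want the result to go back onto the clipboard rather than into a dialog, you could presumably replace the final display dialog res with set the clipboard to res (a standard AppleScript/StandardAdditions command); whether each line then shows up as a separate entry in your clipboard history depends entirely on the clipboard manager you use.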
I have a file with the following contents:
a
b
s
start
text
more text
end
even more text
end
I want to print the content between start and the first end that follows it (start is always unique). I also want to print between which lines that text was found; in this example, between lines 4 and 7.
I was trying with grep and cat, but I couldn't do much.
I tried:
var=$(cat $path)
echo "$var" | grep -o -P '(?<=start).*(?=end)'
But it didn't print anything; without the grep, it prints the whole file.
The output in this example should be:
The content is between lines 4 and 7.
start
text
more text
end
With a shell variable passed to awk you can print text by range. Put your shell variable into awk's start variable and we should be good. (Also change $0 ~ start to $0 ~ "^"start"$" in case you want an exact match for the start value on a line; see the variant after the sample output below.)
awk -v start="$your_shell_start_var" '
$0 ~ start,$0 ~ /^end$/{
  print
  if($0 ~ start){ startLine=FNR }
  if($0 ~ /^end$/){
    print "The content is between lines " startLine " and " FNR
    exit
  }
}' Input_file
Sample output on OP's samples:
start
text
more text
end
The content is between lines 4 and 7
Simple explanation: print the lines in the range from start to end; within that range, check whether the current line contains the end string and, if so, exit. We do NOT need to read the complete Input_file, since OP only needs to print the very first block of lines.
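For reference, here is a sketch of the exact-match variant mentioned above (same Input_file assumed):
awk -v start="$your_shell_start_var" '
$0 ~ "^"start"$",$0 ~ /^end$/{
  print
  if($0 ~ "^"start"$"){ startLine=FNR }
  if($0 ~ /^end$/){
    print "The content is between lines " startLine " and " FNR
    exit
  }
}' Input_file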
Sample data:
$ cat -n strings.dat
1 a
2 b
3 s
4 start
5 text
6 more text
7 end of more text
8 end
9 even more text
10 end
One awk solution using a range (similar to RavinderSingh13's post) that prints out OP's textual message at the end:
startstring="start" # define start of search block
awk -v ss="${startstring}" ' # pass start of search block in as awk variable "ss"
# search for a range of lines between "ss" and "end":
$0==ss,/^end$/ { if ($0==ss && x==0 ) x=FNR # if this is the first line of the range make note of the line number
print # print the current line of the range
if ($0=="end") # if this is the last line of the range then print our textual message re: start/finish line numbers
printf "\nThe content is between lines %d and %d.\n",x,FNR
}
' strings.dat
NOTE: the $0==ss and /^end$/ tests assume no leading/trailing white space in the data file, otherwise these tests will fail and there will be no range match (see the sketch after the examples below for one way to relax this).
With startstring="start" this generates:
start
text
more text
end of more text
end
The content is between lines 4 and 8.
With startstring="more text" this generates:
more text
end of more text
end
The content is between lines 6 and 8.
With startstring="even more text" this generates:
even more text
end
The content is between lines 9 and 10.
With startstring="water" this generates:
--no output--
NOTE: If OP uses startstring="end" the results are not as expected; while it would be possible to add more code to address this scenario, I'm going to skip it for the time being.
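Regarding the white space assumption in the earlier NOTE, one way to relax it would be to trim each line before the tests; a rough sketch (note that the lines are then also printed with their leading/trailing white space removed):
awk -v ss="${startstring}" '
{ gsub(/^[[:space:]]+|[[:space:]]+$/, "") }          # trim leading/trailing white space before testing
$0==ss,/^end$/ { if ($0==ss && x==0 ) x=FNR          # note the first line number of the range
                 print                               # print the current line of the range
                 if ($0=="end")                      # on the last line of the range print the line-number message
                     printf "\nThe content is between lines %d and %d.\n",x,FNR
               }
' strings.dat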
I am wondering if there is a way to get paragraphs of text (the source file would be a pyx file) by number, as sed does with lines:
sed -n ${i}p
At this moment I'd be interested to use awk with:
awk '/custom-pyx-tag\(/,/\)custom-pyx-tag/'
but I can't find documentation or examples about that.
I'm also trying to trim "\r\n" with gsub(/\r\n/,"; ") in the same awk command, but it doesn't work, and I can't really figure out why.
Any hint would be much appreciated, thanks.
EDIT:
This is just one example and not my exact need, but I need to know how to do it for a multipurpose project.
Let's take the case that I have exported the ID3 tags of a huge collection of audio files and stored them in a pyx-like format, so in the end I will have a nice big file with this pattern repeating for each file in the collection:
audio-genre(
blablabla
)audio-genre
audio-artist(
bla.blabla
)audio-artist
audio-album(
bla-bla-bla
)audio-album
audio-track-num(
0x
)audio-track-num
audio-track-title(
bla.bla-bla
)audio-track-title
audio-lyrics(
blablablablabla
bla.bla.bla.bla
blah-blah-blah
blabla-blabla
)audio-lyrics
...
Now if I want to extract the artist of the 1234th audio file I can use:
awk '/audio-artist\(/, /)audio-artist/' | sed '/audio-artist/d' | sed -n 1234p
so, being one line, it can be obtained with sed, but I don't know how to get an entire paragraph given its index; for example, if I want to get the lyrics of the 6543rd file, how could I do it?
In the end it is just a question of whether there is a command equivalent to
sed -n ${num}p
but to be used for paragraphs
awk -v indx=1234 '
BEGIN {
  RS=""
}
{
  split($0,arr,"audio-artist")
  for (i=2;i<=length(arr);i=i+2) {
    gsub("[()]","",arr[i])
    arts[cnt+=1]=arr[i]
  }
}
END {
  print arts[indx]
}' audioartist
One liner:
awk -v indx=1234 'BEGIN {RS=""} NR==1 { split($0,arr,"audio-artist");for (i=2;i<=length(arr);i=i+2) { gsub("[()]","",arr[i]);arts[cnt+=1]=arr[i] } } END { print arts[indx] }' audioartist
Using awk on the file called audioartist, we consume the whole file as one record by setting the record separator (RS) to "". We then split the file into an array arr based on the separator audio-artist. We loop through the array arr starting from 2, in steps of 2, till the end of the array, and strip out the opening and closing brackets, creating another array called arts with an incrementing count as the index and the stripped artist as the value. At the end we print the arts entry specified by the passed indx variable (in this case 1234).
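Since the split is done on the tag name rather than on individual lines, the same idea should carry over to multi-line blocks such as the lyrics; a sketch (the index and the lyr array name are just placeholders, and it assumes the tag text never occurs inside a value):
awk -v indx=6543 'BEGIN {RS=""} NR==1 { split($0,arr,"audio-lyrics");for (i=2;i<=length(arr);i=i+2) { gsub("[()]","",arr[i]);lyr[cnt+=1]=arr[i] } } END { print lyr[indx] }' audioartist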
In bash:
1) For a given groupname of interest, and
2) a list of keys of interest, for which we want a table of values, for this groupname,
3) read in a set of files, like those in /usr/share/applications (see simplified example below),
4) and produce a delimited table, with one line per file, and one field for each given key.
EXAMPLE
inputs
We want only the values of the Name and Exec keys, from only [Desktop Entry] groups, and from one or more files, like these:
[Desktop Entry]
Name=Root
Comment=Opens
Exec=e2
..
[Desktop Entry]
Comment=Close
Name=Root2
output
Two lines, one per input file, each in a delimited <Name>,<Exec> format, ready for import into a database:
Root,e2
Root2,
Each input file is:
One or more blocks of lines delimited by a [some-groupname].
Below each [.*] is one or more standard, unsorted key=value pairs.
Not every block contains the same set of keys.
[Forgive me if I am asking for a solution to an old problem, but I can't seem to find a good, quick bash way to do this. Yes, I could code it up with some while and read loops, etc., but surely it's been done before.]
Similar to this Q, but a more general answer is wanted.
If awk is an option, would you please try the following:
awk -v RS="[" -v FS="\n" '{             # split the file into records on "[" and each record into fields on "\n"
  name = ""; exec = ""                  # reset variables
  if ($1 == "Desktop Entry]") {         # if the groupname matches
    for (i=2; i<=NF; i++) {             # loop over the fields (lines) of "key=value" pairs
      if (sub(/^Name=/, "", $i)) name = $i        # the field (line) starts with "Name="
      else if (sub(/^Exec=/, "", $i)) exec = $i   # the field (line) starts with "Exec="
    }
    print name "," exec
  }
}' file
You can feed multiple files as file1 file2 file3, dir/file* or whatever.
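For example, a condensed version of the same script could be pointed at the desktop entries in the location mentioned in the question, with the result redirected to a file (apps.csv is just an example name):
awk -v RS="[" -v FS="\n" '$1 == "Desktop Entry]" {
  name = ""; exec = ""
  for (i=2; i<=NF; i++) {
    if (sub(/^Name=/, "", $i)) name = $i
    else if (sub(/^Exec=/, "", $i)) exec = $i
  }
  print name "," exec
}' /usr/share/applications/*.desktop > apps.csv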
I have many text files containing annotations. The original text is marked with lines containing the words:
START OF TEXT OF PASSAGE 1
END OF TEXT OF PASSAGE 1
Obviously I can search each document for the phrase START OF TEXT and delete everything up to it. Then search for END OF TEXT and start selecting text for deletion until I get to the next START OF TEXT.
I have come up with this design so far:
#!/bin/bash
a="START OF PROJECT"
b="END OF PROJECT"
while read -r line; do
  if [[ $line == *"$a"* ]]; then
    # keep reading (from the same input) until we hit the end marker
    while read -r line; do
      if [[ $line != *"$b"* ]]; then
        echo "$line" >> output.txt
      else
        break
      fi
    done
  fi
done
Perhaps there is an easier way using sed, awk, grep and pipes?
'for every document' 'loop through it doing this' ('find the original text between START and END' | >> output.txt)
Unfortunately I am poor at bash and ignorant of sed/awk.
The reason for this is that I am assembling a huge text document that is a concatenation of thousands of marked up documents – each of which contains some annotated passages.
In Python:
import re
with open('in.txt') as f, open('out.txt', 'w') as output:
    output.write('\n'.join(re.findall(r'START OF TEXT(.*?)END OF TEXT', f.read(), re.S)))
This reads the input, searches for all matches that begin and end with the necessary markers (the re.S flag lets .*? match across line breaks), captures the text of interest in a group, joins all those groups on a linefeed, and writes that to the result file.
Pretty easy to do with awk. You would create a script (I'll call it yank.awk) containing this:
#!/usr/bin/awk -f
/START OF PROJECT/ { capture = 1; next }
/END OF PROJECT/ { capture = 0 }
capture == 1 { print }
and then run it like so (after making it executable with chmod +x yank.awk):
./yank.awk in.txt > output.txt
Could also do with sed and grep:
sed -ne '/START OF PROJECT/,/END OF PROJECT/p' in.txt | grep -vE '(START|END) OF PROJECT' > output.txt
(Another Python solution)
You can have itertools.groupby group lines together based on a boolean value - just use a global flag to keep track of whether you are in a block or not, and then use groupby to group the lines that are in or out of blocks. Then just discard the ones that are not blocks:
sample_lines = """
lskdjflsdkjf
sldkjfsdlkjf
START OF TEXT
Asdlkfjlsdkfj
Bsldkjf
Clsdkjf
END OF TEXT
sldkfjlsdkjf
sdlkjfdklsjf
sdlkfjdlskjf
START OF TEXT
Dsdlkfjlsdkfj
Esldkjf
Flsdkjf
END OF TEXT
sldkfjlsdkjf
sdlkjfdklsjf
sdlkfjdlskjf
""".splitlines()
from itertools import groupby

in_block = False

def is_in_block(line):
    global in_block
    if line.startswith("END OF TEXT"):
        in_block = False
    ret = in_block
    if line.startswith("START OF TEXT"):
        in_block = True
    return ret

for lines_are_text, lines in groupby(sample_lines, key=is_in_block):
    if lines_are_text:
        print(list(lines))
gives:
['Asdlkfjlsdkfj', 'Bsldkjf', 'Clsdkjf']
['Dsdlkfjlsdkfj', 'Esldkjf', 'Flsdkjf']
See that first group has the lines that start with A, B, and C, and the second group is made up of those lines starting with D, E, and F.
It sounds like the specific solution you need is:
awk '/END OF TEXT OF PASSAGE/{f=0} f; /START OF TEXT OF PASSAGE/{f=1}' file
See https://stackoverflow.com/a/18409469/1745001 for other ways to select text from files.
Use Perl's Flip-Flop Operator to Print Text Between Markers
Given a corpus like:
START OF TEXT OF PASSAGE 1
foo
END OF TEXT OF PASSAGE 1
START OF TEXT OF PASSAGE 2
bar
END OF TEXT OF PASSAGE 2
you can use the Perl flip-flop operator to process within a range of lines. For example, from the shell prompt:
$ perl -ne 'if (/^START OF TEXT/ ... /^END OF TEXT/) {
next if /^(?:START|END)/;
print;
}' /tmp/corpus
foo
bar
Basically, this short Perl script loops through your input. When it finds your start and end tags, it throws away the tags themselves and prints everything else in between.
Usage Notes
The line breaks between passages in the corpus are for readability. It doesn't matter if your real corpus has no line breaks between passages, so long as the text markers always start at the beginning of the line as shown in your original post. If that assumption doesn't hold true, then you will need to adjust the regular expressions used to identify the start and end of your passages.
You can pass multiple files to the Perl script. Again, it makes no practical difference, as long as you don't exceed your shell's command-line length limit.
If you want the final output to go to somewhere other than standard output, just use shell redirection. For example:
perl -ne 'if (/^START OF TEXT/ ... /^END OF TEXT/) {
next if /^(?:START|END)/;
print;
}' /tmp/file1 /tmp/file2 /tmp/file3 > /tmp/output
You can use sed as follows:
sed -n '/^START OF TEXT/,/^END OF TEXT/{/^\(START\|END\) OF TEXT/!p}' infile
or, with extended regular expressions (-r):
sed -rn '/^START OF TEXT/,/^END OF TEXT/{/^(START|END) OF TEXT/!p}' infile
-n prevents sed from printing as a default. The rest works as follows:
/^START OF TEXT/,/^END OF TEXT/ { # For lines between these two matches
/^\(START\|END\) OF TEXT/!p # If the line does NOT match, print it
}
This works with GNU sed and might require some tweaking to run with other seds.
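For what it's worth, here is a sketch of a more portable variant that avoids the GNU-only alternation by using two delete commands instead (not tested against every sed):
sed -n '/^START OF TEXT/,/^END OF TEXT/{/^START OF TEXT/d; /^END OF TEXT/d; p;}' infile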
This should be a simple fix but I cannot wrap my head around it at the moment.
I have a comma-delimited file called my_course that contains a list of courses with some information about them.
I need to get user input about the last two fields and change them accordingly.
Each line is constructed like:
CourseNumber,CourseTitle,CreditHours,Status,Grade
Example file:
CSC3210,COMPUTER ORG & PROGRAMMING,3,0,N/A
CSC2010,INTRO TO COMPUTER SCIENCE,3,0,N/A
CSC1010,COMPUTERS & APPLICATIONS,3,0,N/A
I get the user input for 3 things: Course Number, Status (0 or 1), and Grade (A,B,C,N/A)
So far I have tried matching the line containing the course number and changing the last two fields. I haven't been able to figure out how to modify the last two fields using sed, so I'm using this horrible jumble of awk and sed:
temporary=$(awk -v status=$status -v grade=$grade '
BEGIN { FS="," }; $(NF)=""; $(NF-1)="";
/'$cNum'/ {printf $0","status","grade;}' my_course)
sed -i "s/CSC$cNum.*/$temporary/g" my_course
The issue that I'm running into here is that the number of fields in the course title can range from 1 to 4, so I can't just easily print the first n fields. I've tried removing the last two fields and appending the new values for status and grade, but that isn't working for me.
Note: I have already done checks to ensure that the user inputs valid data.
Use a simple awk script:
BEGIN {
  FS=","
  OFS=FS
}
$0 ~ course {
  $(NF-1) = status
  $NF = grade
}
{ print }
and on the command line, pass values for the three awk variables course, status, and grade.
in action:
$ cat input
CSC3210,COMPUTER ORG & PROGRAMMING,3,0,N/A
CSC2010,INTRO TO COMPUTER SCIENCE,3,0,N/A
CSC1010,COMPUTERS & APPLICATIONS,3,0,N/A
$ awk -vcourse="CSC3210" -vstatus="1" -vgrade="A" -f grades.awk input
CSC3210,COMPUTER ORG & PROGRAMMING,3,1,A
CSC2010,INTRO TO COMPUTER SCIENCE,3,0,N/A
CSC1010,COMPUTERS & APPLICATIONS,3,0,N/A
$ awk -vcourse="CSC1010" -vstatus="1" -vgrade="B" -f grades.awk input
CSC3210,COMPUTER ORG & PROGRAMMING,3,0,N/A
CSC2010,INTRO TO COMPUTER SCIENCE,3,0,N/A
CSC1010,COMPUTERS & APPLICATIONS,3,1,B
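To update my_course in place with the user's input from the question, one option (a sketch; the temporary file name is just a placeholder) is to write to a temporary file and move it back over the original:
awk -v course="CSC$cNum" -v status="$status" -v grade="$grade" -f grades.awk my_course > my_course.tmp && mv my_course.tmp my_course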
It doesn't matter how many commas you have in the course name, as long as you only touch the last two fields:
sed -i "/CSC$cNum/ s/.,[^,]*\$/$status,$grade/" my_course
The trick is to use $ in the pattern to match the end of the line; it is written as \$ so that the shell does not expand it inside the double quotes.
And don't bother building the "temporary" line: apply the substitution only to the line that matches the course number.