OSX: dtrace printf(), write_nocancel() output based on dtruss script

I need to see my program's printf output in sync with the dtrace output.
I'd like to build my own version of the dtrace command that produces output equivalent to the "sudo dtruss -t write_nocancel ls" command.
This is the correct dtruss command and its output:
sudo dtruss -t write_nocancel ls
Chap1 Chap10 Chap11 Chap12 Chap2 Chap3 Chap4 Chap5 Chap6 Chap7 Chap8 Chap9 README
SYSCALL(args) = return
write_nocancel(0x1, "Chap1\tChap10\tChap11\tChap12\tChap2\tChap3\tChap4\tChap5\tChap6\tChap7\tChap8\tChap9\tREADME\n\0", 0x52) = 82 0
Based on the dtruss script source code, I tried this dtrace command, but it failed:
sudo dtrace -q \
-n '*:*:write_nocancel:entry {self->arg0=arg0; self->arg1 =arg1; \
self->arg2 =arg2; self->code=0; } ' \
-n '*:*:write_nocancel:return { \
printf("return %s(0x%X, \"%S\", 0x%X) = %d %d", \
probefunc,self->arg0, arg0 == -1 ? "" : stringof(copyin(self->arg1,arg0)),self->arg2,(int)arg0, \
(int)errno); }' \
-c ls 2>&1
Chap1
Chap10
Chap11
Chap12
Chap2
Chap3
Chap4
Chap5
Chap6
Chap7
Chap8
Chap9
README
dtrace: error on enabled probe ID 3 (ID 209288: fbt:mach_kernel:write_nocancel:return): invalid address (0xffffff80218dfc40) in action #3 at DIF offset 92
dtrace: error on enabled probe ID 4 (ID 958: syscall::write_nocancel:return): invalid address (0xffffff80218dfc40) in action #3 at DIF offset 92
dtrace: error on enabled probe ID 3 (ID 209288: fbt:mach_kernel:write_nocancel:return): invalid address (0xffffff801a7c0010) in action #3 at DIF offset 92
Do any dtrace experts out there have a clue on how to fix this?

Found the answer (the issue was the two separate -n options):
sudo dtrace -q -n \
'syscall::write_nocancel:entry{self->start = 1; \
self->vstart = 1; self->arg0 = arg0; \
self->arg1 = arg1; self->arg2 = arg2;} \
*:*:write_nocancel:return /self->start/ \
{ printf("return %s(0x%X, \"%S\", 0x%X) = %d %d" \
,probefunc,self->arg0, \
arg0 == -1 ? "" : stringof(copyin(self->arg1,arg0)),\
self->arg2,(int)arg0, (int)errno); }' \
-c ls 2>&1

Related

dtrace errors out with incorrect type for thread_t on macOS 12.2.1

The following dtrace invocation used to work on macOS 11.6.2, but after upgrading to macOS 12.2.1 it stopped working:
sudo --non-interactive dtrace -l -b 16m -x mangled -x disallow_dsym -x strip -x evaltime=preinit -x preallocate=256m -n 'oneshot$target:a.out::entry { hdr_vm_addr = ((thread_t)curthread)->task->mach_header_vm_address; }' -w -p 4557
dtrace: invalid probe specifier oneshot$target:a.out::entry { hdr_vm_addr = ((thread_t)curthread)->task->mach_header_vm_address; }: in action list: task is not a member of struct thread
How do I get task from thread_t now?
It looks like it may be t_task. Here's how you can view the type of a built-in variable such as curthread:
$ sudo dtrace -n 'BEGIN { print(*curthread); exit(0); }'
dtrace: description 'BEGIN ' matched 1 probe
CPU ID FUNCTION:NAME
4 1 :BEGIN struct thread {
    union {
        queue_chain_t runq_links = {
            struct queue_entry *next = 0
            struct queue_entry *prev = 0
        }
    ...
    struct task *t_task = 0xffffff8bd07859f0
    ...
}
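If t_task is indeed the renamed member, the original invocation should only need that one substitution. An untested sketch, with everything else copied verbatim from the question:
# untested: same command as above, with ->task-> replaced by ->t_task->
sudo --non-interactive dtrace -l -b 16m -x mangled -x disallow_dsym -x strip -x evaltime=preinit -x preallocate=256m -n 'oneshot$target:a.out::entry { hdr_vm_addr = ((thread_t)curthread)->t_task->mach_header_vm_address; }' -w -p 4557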

Using bash functions in snakemake

I am trying to download some files with snakemake. The files (http://snpeff.sourceforge.net/SnpSift.html#dbNSFP) I would like to download are on a Google site/drive, and my usual wget approach does not work. I found a bash function that does the job (https://www.zachpfeffer.com/single-post/wget-a-Google-Drive-file):
function gdrive_download () {
  CONFIRM=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate "https://docs.google.com/uc?export=download&id=$1" -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')
  wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$CONFIRM&id=$1" -O $2
  rm -rf /tmp/cookies.txt
}
gdrive_download 120aPYqveqPx6jtssMEnLoqY0kCgVdR2fgMpb8FhFNHo test.txt
I have tested this function with my ids in a plain bash script and was able to download all the files. To add a bit to the complexity, I must use a workplace template, and incorporate the function into it.
rule dl:
    params:
        url = 'ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_{genome}/{afile}'
    output:
        'data/{genome}/{afile}'
    params:
        id1 = '0B7Ms5xMSFMYlOTV5RllpRjNHU2s',
        f1 = 'dbNSFP.txt.gz'
    shell:
        """CONFIRM=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate "https://docs.google.com/uc?export=download&id={{params.id1}}" -O- | sed -rn "s/.*confirm=([0-9A-Za-z_]+).*/\1\n/p") && wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$CONFIRM&id={{params.id1}}" -O {{params.f1}} && rm -rf /tmp/cookies.txt"""
        #'wget -c {params.url} -O {output}'

rule checksum:
    input:
        i = 'data/{genome}/{afile}'
    output:
        o = temp('tmp/{genome}/{afile}.md5')
    shell:
        'md5sum {input} > {output}'

rule file_size:
    input:
        i = 'data/{genome}/{afile}'
    output:
        o = temp('tmp/{genome}/{afile}.size')
    shell:
        'du -csh --apparent-size {input} > {output}'

rule file_info:
    """md5 checksum and file size"""
    input:
        md5 = 'tmp/{genome}/{afile}.md5',
        s = 'tmp/{genome}/{afile}.size'
    output:
        o = temp('tmp/{genome}/info/{afile}.csv')
    run:
        with open(input.md5) as f:
            md5, fp = f.readline().strip().split()
        with open(input.s) as f:
            size = f.readline().split()[0]
        with open(output.o, 'w') as fout:
            print('filepath,size,md5', file=fout)
            print(f"{fp},{size},{md5}", file=fout)

rule manifest:
    input:
        expand('tmp/{genome}/info/{suffix}.csv', genome=('GRCh37','GRCh38'), suffix=('dbNSFP.txt.gz', 'dbNSFP.txt.gz.tbi'))
        #expand('tmp/{genome}/info/SnpSift{suffix}.csv', genome=('GRCh37','GRCh38'), suffix=('dbNSFP.txt.gz', 'dbNSFP.txt.gz.tbi'))
    output:
        o = 'MANIFEST.csv'
    run:
        pd.concat([pd.read_csv(afile) for afile in input]).to_csv(output.o, index=False)
There are four downloadable files for which I have ids (I only show one in params); however, I don't know how to call the bash function as written by ZPfeffer for all of the ids I have with snakemake. Additionally, when I run this script there are several errors, the most pressing being
sed: -e expression #1, char 31: unterminated `s' command
I am far from a snakemake expert; any assistance on how to modify my script to a) call the function with 4 different ids, b) remove the sed error, and c) verify whether this is the correct url format (currently url = 'https://docs.google.com/uc?export/{afile}') will be greatly appreciated.
You would want to use a raw string literal so that Snakemake doesn't escape special characters, such as the backslash in the sed command. For example (notice the r in front of the shell command):
rule foo:
    shell:
        r"sed d\s\"
You could use --printshellcmds or -p to see how exactly shell: commands get resolved by snakemake.
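Building on that, a rough, untested sketch of how one rule could cover all four files: gdrive_ids below is a hypothetical dict mapping (genome, filename) to the Google Drive id, and the shell command uses single braces around {params.id} and {output} so Snakemake substitutes them (the double braces in the original rule are escaped to literal braces, so wget would receive a literal {params.id1} instead of the id):
# hypothetical lookup table: (genome, filename) -> Google Drive file id
gdrive_ids = {
    ('GRCh37', 'dbNSFP.txt.gz'): '<drive-id>',
    # ... one entry per downloadable file (four in total)
}

rule dl:
    output:
        'data/{genome}/{afile}'
    params:
        id = lambda wc: gdrive_ids[(wc.genome, wc.afile)]
    shell:
        r"""
        CONFIRM=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate "https://docs.google.com/uc?export=download&id={params.id}" -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')
        wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$CONFIRM&id={params.id}" -O {output}
        rm -f /tmp/cookies.txt
        """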
Here is how I "solved" it:
import pandas as pd

rule dl:
    output:
        'data/{genome}/{afile}'
    shell:
        "sh download_snpsift.sh"

rule checksum:
    input:
        i = 'data/{genome}/{afile}'
    output:
        o = temp('tmp/{genome}/{afile}.md5')
    shell:
        'md5sum {input} > {output}'

rule file_size:
    input:
        i = 'data/{genome}/{afile}'
    output:
        o = temp('tmp/{genome}/{afile}.size')
    shell:
        'du -csh --apparent-size {input} > {output}'

rule file_info:
    """md5 checksum and file size"""
    input:
        md5 = 'tmp/{genome}/{afile}.md5',
        s = 'tmp/{genome}/{afile}.size'
    output:
        o = temp('tmp/{genome}/info/{afile}.csv')
    run:
        with open(input.md5) as f:
            md5, fp = f.readline().strip().split()
        with open(input.s) as f:
            size = f.readline().split()[0]
        with open(output.o, 'w') as fout:
            print('filepath,size,md5', file=fout)
            print(f"{fp},{size},{md5}", file=fout)

rule manifest:
    input:
        expand('tmp/{genome}/info/{suffix}.csv', genome=('GRCh37','GRCh38'), suffix=('dbNSFP.txt.gz', 'dbNSFP.txt.gz.tbi'))
    output:
        o = 'MANIFEST.csv'
    run:
        pd.concat([pd.read_csv(afile) for afile in input]).to_csv(output.o, index=False)
And here is the bash script.
function gdrive_download () {
CONFIRM=$(wget --quiet --save-cookies /tmp/cookies.txt --keep-session-cookies --no-check-certificate "https://docs.google.com/uc?export=download&id=$1" -O- | sed -rn 's/.*confirm=([0-9A-Za-z_]+).*/\1\n/p')
wget --load-cookies /tmp/cookies.txt "https://docs.google.com/uc?export=download&confirm=$CONFIRM&id=$1" -O $2
rm -rf /tmp/cookies.txt
}
gdrive_download 0B7Ms5xMSFMYlSTY5dDJjcHVRZ3M data/GRCh37/dbNSFP.txt.gz
gdrive_download 0B7Ms5xMSFMYlOTV5RllpRjNHU2s data/GRCh37/dbNSFP.txt.gz.tbi
gdrive_download 0B7Ms5xMSFMYlbTZodjlGUDZnTGc data/GRCh38/dbNSFP.txt.gz
gdrive_download 0B7Ms5xMSFMYlNVBJdFA5cFZRYkE data/GRCh38/dbNSFP.txt.gz.tbi

GhostScript auto pagenumbering

I want to export one certain page from a pdf document to an image and automatically fill the page number in the file name. When I run the following code:
gs \
-sDEVICE=jpeg \
-o outfile-%03d.jpeg \
-dFirstPage=12 \
-dLastPage=12 \
wo.pdf
I get: outfile-001.jpeg instead of outfile-012.jpeg.
I've written a bash script for the job:
function extract_nth_page(){
    printf -v j "outfile-%05g.png" $1
    echo $j
    gs -q -dNOPAUSE -sDEVICE=png16m -r600 -dFirstPage=$1 -dLastPage=$1 -sOutputFile=$j $2 -c quit
    return 0
}
# Extracts page number 42 from myFile.pdf to outfile-00042.png
extract_nth_page 42 myFile.pdf

How to run a script with correct permissions from custom udev rule?

I set up a custom udev rule triggered when my bluetooth mouse connects:
ACTION=="add" \
, ATTRS{idProduct}=="XXX" \
, ATTRS{idVendor}=="XXX" \
, ENV{DISPLAY}=":0" \
, ENV{XAUTHORITY}="/run/user/1000/gdm/Xauthority" \
, RUN+="/home/XXX/.scripts/mouse_connected.sh"
udevadm test succeeds, everything seems fine.
Then I created the script that should be triggered:
#!/bin/bash
if (xinput list | grep Maus | grep -o id=[0-9]* | grep -o [0-9]*$);
then xinput set-button-map "$(xinput list | grep Maus | grep -o id=[0-9]* | grep -o [0-9]*$)" 1 1 3 && logger "success";
else logger "fail" && sleep 3;
fi
When I manually run the script in a terminal using "bash '/home/XXX/.scripts/mouse_connected.sh'", it works as expected. But when it gets successfully triggered via my custom udev rule, it fails. Why?

Multiple Processes - Python

I am looking to run multiple instances of a command line script at the same time. I am new to this concept of "multi-threading", so I am a bit at a loss as to why I am seeing what I am seeing.
I have tried to execute the sub-processing in two different ways:
1 - Using multiple calls of Popen without a communicate until the end:
command = 'raster2pgsql -I -C -e -s 26911 %s -t 100x100 -F p839.%s_image_sum_sum1 | psql -U david -d projects -h pg3' % (workspace + '\\r_sumsum1{}'.format(i), str(i))
process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
command = 'raster2pgsql -I -C -e -s 26911 %s -t 100x100 -F p839.%s_image_sum_sum2 | psql -U david -d projects -h pg3' % (workspace + '\\r_sumsum2{}'.format(i), str(i))
process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
command = 'raster2pgsql -I -C -e -s 26911 %s -t 100x100 -F p839.%s_image_sum_sum3 | psql -U david -d projects -h pg3' % (workspace + '\\r_sumsum3{}'.format(i), str(i))
process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
(stdoutdata, stderrdata) = process.communicate()
This starts each of the command-line items, but only the last entry completes, leaving the other two hanging.
2 - Attempting to implement an example from "Python threading multiple bash subprocesses?", but nothing happens except for a printout of the commands (the program hangs with no command-line processes running, as observed in Windows Task Manager):
import threading
import Queue
import commands
import time

workspace = r'F:\Processing\SM'
image = 't08r_e'
image_name = (image.split('.'))[0]
i = 0
process_image_tif = workspace + '\\{}{}.tif'.format((image.split('r'))[0], str(i))

# thread class to run a command
class ExampleThread(threading.Thread):
    def __init__(self, cmd, queue):
        threading.Thread.__init__(self)
        self.cmd = cmd
        self.queue = queue

    def run(self):
        # execute the command, queue the result
        (status, output) = commands.getstatusoutput(self.cmd)
        self.queue.put((self.cmd, output, status))

# queue where results are placed
result_queue = Queue.Queue()

# define the commands to be run in parallel, run them
cmds = ['raster2pgsql -I -C -e -s 26911 %s -t 100x100 -F p839.%s_image_sum_sum1 | psql -U david -d projects -h pg3' % (workspace + '\\r_sumsum1{}'.format(i), str(i)),
        'raster2pgsql -I -C -e -s 26911 %s -t 100x100 -F p839.%s_image_sum_sum2 | psql -U david -d projects -h pg3' % (workspace + '\\r_sumsum2{}'.format(i), str(i)),
        'raster2pgsql -I -C -e -s 26911 %s -t 100x100 -F p839.%s_image_sum_sum3 | psql -U david -d projects -h pg3' % (workspace + '\\r_sumsum3{}'.format(i), str(i)),
        ]

for cmd in cmds:
    thread = ExampleThread(cmd, result_queue)
    thread.start()

# print results as we get them
while threading.active_count() > 1 or not result_queue.empty():
    while not result_queue.empty():
        (cmd, output, status) = result_queue.get()
        print(cmd)
        print(output)
How can I run all of these commands at the same time and get a result at the end? I am running on Windows, Python 2.7.
My first try didn't work because of the repeated definitions of stdout and stderr. Removing these definitions gives the expected behavior.
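For reference, a minimal, untested sketch of the first approach that keeps the pipes but stores one handle per process, then calls communicate() on each, so every child is waited on and its output is drained:
import subprocess

# cmds is the same list of 'raster2pgsql ... | psql ...' pipelines built above
processes = [subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
             for cmd in cmds]

# communicate() waits for each child and reads its stdout/stderr, so no
# process is left blocked on a full pipe buffer
for cmd, proc in zip(cmds, processes):
    stdoutdata, stderrdata = proc.communicate()
    print(cmd)
    print(stdoutdata)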
