I am trying to print the dynamic debug messages produced by any module. I follow the instructions in the kernel documentation on dynamic debug, but I enable the messages per module instead of per file. I use the following command:
cut -f 2 -d "[" /sys/kernel/debug/dynamic_debug/control | cut -f 1 -d "]" | xargs -i echo 'module {} +p' > /sys/kernel/debug/dynamic_debug/control
The above command extracts the module names from the control file and enables printing of their dynamic debug messages, for example those issued by netdev_dbg().
I took this command from page 7 of a paper (here is the link) that explains dynamic debug, and the example does print dynamic debug messages on their system. On my system, however, it does not print anything!
I have verified that the two cut commands extract the module names correctly, but after the xargs echo step no messages are printed. I tried it on two different laptops.
I use Ubuntu 16.04LTS.
1. OS: Linux / Ubuntu x86/x64
2. Task:
Write a Bash shell script that downloads the URLs in a (large) CSV file as fast as possible (in parallel) and names each output file after a column value.
2.1 Example Input:
A CSV file containing lines like:
2.2 Example outputs:
Files in a folder, outputs, containing files like:
3. My Try:
I tried mainly in two styles.
1. Using the download tool's inner support
Take aria2c as an example: it supports the -i option to import a file of URLs to download, and (I think) it processes them in parallel for maximum speed. It does have a --force-sequential option to force downloading in the order of the lines, but I could not find a way to make the naming part happen.
2. Splitting first
Split the file into several smaller files and run a script like the following on each of them:
while IFS=, read -r serino url
do
aria2c -c "$url" --dir=outputs --out="$serino.jpg"
done < "$INPUT"
However, this restarts aria2c for every line, which seems to cost time and lower the speed.
One can run the script several times in Bash to get 'shell-level' parallelism, but that does not seem to be the best way.
Any suggestions?
Thank you,
aria2c supports so-called option lines in input files. From man aria2c:
-i, --input-file=<FILE>
Downloads the URIs listed in FILE. You can specify multiple sources for a single entity by putting multiple URIs on a single line separated by the TAB character. Additionally, options can be specified after each URI line. Option lines must start with one or more white space characters (SPACE or TAB) and must only contain one option per line.
and later on
These options have exactly same meaning of the ones in the command-line options, but it just applies to the URIs it belongs to. Please note that for options in input file -- prefix must be stripped.
You can convert your csv file into an aria2c input file:
sed -E 's/([^,]*),(.*)/\2\n out=\1/' file.csv | aria2c -i -
This will convert your file into the following format and run aria2c on it.
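As a sketch of what that conversion produces, the sed expression can be run on its own, without aria2c. The two CSV lines below are made-up sample data, and the `\n` in the replacement relies on GNU sed:

```shell
# Hypothetical sample input: serial number, comma, URL.
csv=$(mktemp)
printf '%s\n' '001,http://example.com/a.jpg' '002,http://example.com/b.jpg' > "$csv"

# Same substitution as above: the URL first, then an indented "out=" option line.
converted=$(sed -E 's/([^,]*),(.*)/\2\n out=\1/' "$csv")
printf '%s\n' "$converted"
rm -f "$csv"
```

Each URL ends up on its own line, followed by a line that starts with a space and carries the out= option for that URL.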
However this won't create files 001.jpg, 002.jpg, … but 001, 002, … since that's what you specified. Either specify file names with extensions or guess the extensions from the URLs.
If the extension is always jpg you can use
sed -E 's/([^,]*),(.*)/\2\n out=\1.jpg/' file.csv | aria2c -i -
To extract extensions from the URLs use
sed -E 's/([^,]*),(.*)(\..*)/\2\3\n out=\1\3/' file.csv | aria2c -i -
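Warning: This works if and only if every URL ends with an extension. For instance, in the line 001,domain.tld/abc the last dot belongs to the host name, so the pattern treats .tld/abc as the extension and produces the unusable output name 001.tld/abc. A line whose URL contains no dot at all is not converted, causing aria2c to fail on the unconverted "URL".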
Using all standard utilities you can do this to download in parallel:
tr '\n' ',' < file.csv |
xargs -P 0 -d , -n 2 bash -c 'curl -s "$2" -o "$1.jpg"' -
The -P 0 option tells xargs to run as many processes as possible at the same time; use -P N instead to cap the number of parallel downloads at N.
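To see how the pipeline pairs up the fields before pointing it at real URLs, curl can be replaced by echo. This is only a sketch: the CSV lines are invented, and `-P 4` is an arbitrary cap chosen here in place of `-P 0`, which may be gentler on the machine and the server for a large file. Note that `-d` is a GNU xargs option.

```shell
# Hypothetical two-line CSV; nothing is downloaded, each job just prints its plan.
plan=$(printf '%s\n' '001,http://example.com/a' '002,http://example.com/b' |
  tr '\n' ',' |
  xargs -P 4 -d , -n 2 bash -c 'echo "would save $2 as $1.jpg"' - |
  sort)   # parallel jobs may finish in any order, so sort for stable output
printf '%s\n' "$plan"
```

Once the printed plan looks right, the echo can be swapped back for the curl call.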
I work in SEO and sometimes I have to manage lists of domains to be considered for certain actions in our campaigns. On my iMac, I have 2 lists, one provided for consideration - unfiltered.txt - and another that has listed the domains I've already analyzed - used.txt. The one provided for consideration, the new one (unfiltered.txt), looks like this:
... etc
List of domains that needs to be used as a filter, to be eliminated (used.txt) - looks like this.
... etc
Is there a way to use my OS X terminal to remove from unfiltered.txt all the lines found in used.txt? I found a software solution that only partially solves the problem: besides the domains listed in used.txt, it also eliminates domains that merely contain one of them. That gives me a broader filter than I want and eliminates domains that I still need.
For example, if my unfiltered.txt contains a domain named fogland.org.uk it will be automatically eliminated if in my used.txt file I have a domain named gland.org.uk.
The files are pretty big (close to 100k lines). I have a pretty good configuration (SSD, 7th-gen i7, 16 GB RAM), but I would rather not let it run for hours just for this operation.
... hope it makes sense.
You can do that with awk. You pass both files to awk. Whilst parsing the first file, where the current record number across all files is the same as the record number in the current file, you make a note of each domain you have seen. Then, when parsing the second file, you only print records that correspond to ones you have not seen in the first file:
awk 'FNR==NR{seen[$0]++;next} !seen[$0]' used.txt unfiltered.txt
Sample Output for your input data
awk is included and delivered as part of macOS - no need to install anything.
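A small sketch with made-up data shows that the comparison is on whole lines, so fogland.org.uk survives even though gland.org.uk is in used.txt. The temporary directory and the sample domains are assumptions for illustration:

```shell
# Hypothetical sample files in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' gland.org.uk example.com > "$workdir/used.txt"
printf '%s\n' fogland.org.uk gland.org.uk example.com new-site.net > "$workdir/unfiltered.txt"

# First file fills the lookup table, second file is printed minus the known lines.
remaining=$(awk 'FNR==NR{seen[$0]++;next} !seen[$0]' "$workdir/used.txt" "$workdir/unfiltered.txt")
printf '%s\n' "$remaining"
rm -rf "$workdir"
```

Because the lookup is a single pass over each file with a hash table, it should stay fast for files of around 100k lines.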
I have always used
grep -v -F -f expunge.txt filewith.txt > filewithout.txt
to do this. When "expunge.txt" is too large, you can do it in stages, cutting it into manageable chunks and filtering one after another:
cp filewith.txt original.txt
and loop as required:
grep -v -F -f chunkNNN.txt filewith.txt > filewithout.txt
mv filewithout.txt filewith.txt
You could even do this in a pipe:
grep -v -F -f chunk01.txt original.txt |\
grep -v -F -f chunk02.txt |\
grep -v -F -f chunk03.txt \
> purged.txt
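One caveat worth checking against the question: with -F alone, grep matches substrings, so a pattern gland.org.uk also removes fogland.org.uk. Adding -x restricts matches to whole lines. A sketch with invented sample data that compares the two:

```shell
# Hypothetical sample files in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' gland.org.uk > "$dir/expunge.txt"
printf '%s\n' fogland.org.uk gland.org.uk other.com > "$dir/filewith.txt"

# Substring matching: fogland.org.uk is dropped as well.
substring_result=$(grep -v -F -f "$dir/expunge.txt" "$dir/filewith.txt")
# Whole-line matching: only the exact domain is dropped.
wholeline_result=$(grep -v -x -F -f "$dir/expunge.txt" "$dir/filewith.txt")

printf 'substring:\n%s\nwhole line:\n%s\n' "$substring_result" "$wholeline_result"
rm -rf "$dir"
```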
You can use comm. I haven't got a mac here to check but I expect it will be installed by default. Note that both files must be sorted. Then try:
comm -2 -3 unfiltered.txt used.txt
Check the man page for further details.
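Since the sorting requirement is easy to overlook, here is a sketch that sorts copies of both files first. The sample domains and the scratch directory are assumptions, and LC_ALL=C is set so that sort and comm agree on the ordering:

```shell
# Hypothetical sample files in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' new-site.net gland.org.uk fogland.org.uk > "$dir/unfiltered.txt"
printf '%s\n' gland.org.uk example.com > "$dir/used.txt"

# comm needs both inputs sorted with the same collation.
LC_ALL=C sort "$dir/unfiltered.txt" > "$dir/unfiltered.sorted"
LC_ALL=C sort "$dir/used.txt" > "$dir/used.sorted"

# -2 -3 suppresses "only in used" and "in both", leaving lines unique to unfiltered.
only_unfiltered=$(LC_ALL=C comm -2 -3 "$dir/unfiltered.sorted" "$dir/used.sorted")
printf '%s\n' "$only_unfiltered"
rm -rf "$dir"
```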
You can use comm and process substitution to do everything in one line:
comm -23 <(sort unfiltered.txt) <(sort used.txt) > used_new.txt
P.S. tested on my Mac running OSX 10.11.6 (El Capitan)
I've attempted numerous times and tried different methods but cannot seem to get this to work. I am trying to run a python script and grep the output to see if it is contained in a file and if it is not I want to append it to said file.
$./scan_network.py 22 | if ! grep -q - ./results.log; then - >> results.log; fi
I understand that macOS grep does not understand - as stdout and that then - >> would not work because it would not pick up stdout either. I am not sure what to do.
As stated before the primary goal is to check the output of the script against a file and if the IP address is not found in the file, it needs to be appended.
results.log is currently an empty file. Output of scan_network.py on would be for now. When I go to run it on another network the output would be numerous addresses in a range example being 10.234.x.y where x and y would be any number between 0 and 255.
One simple solution is to merge the log file and the output of the program into a new log file:
sort -u <(./scan_network.py 22) results.log > newresults.log
The -u flag causes duplicate lines to be removed from the output, so you will get only one of each line.
That has the side effect of reordering the lines (so that they are sorted alphabetically). It is possible to preserve order if necessary, but it gets more complicated.
With a reasonably modern GNU sort, you can use a "version number" sort, which will do a reasonable job of keeping IP numbers in logical order; you can use the -V flag to do that. Or you can sort the octets individually with sort -u -t. -k1,1n -k2,2n -k3,3n -k4,4n .... Or you can just live with lexicographic ordering. Do not just use -n for standard numeric sorting, because it only examines the leading number of each line (for an IP address, at most the first two octets, read as one decimal number), and that has an unfortunate interaction with the -u option, since two lines which compare equal are considered duplicates. As numeric sort only considers the numeric prefix, there will be many false duplicates.
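The difference between the per-octet keys and a plain -n can be sketched on a few invented addresses from the 10.234.x.y range mentioned in the question:

```shell
# Hypothetical scan results, including one real duplicate.
ips=$(printf '%s\n' 10.234.10.2 10.234.2.1 10.234.2.1 10.234.1.200)

# Per-octet numeric keys: logical order, only the true duplicate is removed.
by_octet=$(printf '%s\n' "$ips" | LC_ALL=C sort -u -t. -k1,1n -k2,2n -k3,3n -k4,4n)
printf '%s\n' "$by_octet"

# Plain -n compares only the numeric prefix, so -u treats all of these as equal.
naive_count=$(printf '%s\n' "$ips" | LC_ALL=C sort -u -n | wc -l | tr -d ' ')
printf 'lines left with sort -u -n: %s\n' "$naive_count"
```

With the per-octet keys, three distinct addresses remain; with -u -n, the sample collapses to a single line.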
If you don't mind sorting and rewriting your log file, rici's helpful answer works well (note that simply using -V for true per-component numerical IP-address sorting is unfortunately not an option on macOS [1]).
Here's an alternative that only appends to the existing log file on demand, in-place, without reordering existing lines:
grep -f results.log -xFv <(./scan_network.py 22) >> results.log
Note: This assumes that ./scan_network.py's output is line-based; pipe to tr to transform to line-based output, if necessary.
-f treats each line in the specified file as a separate search term, where a match of any term is considered an overall match.
-x matches lines in full
-F performs literal matching (doesn't interpret search terms as regular expressions)
-v only outputs lines that do not match
The net effect is that only lines output by ./scan_network.py ... that aren't already present in results.log are appended to results.log.
Note, however, that performance will likely suffer the larger results.log becomes, so rici's approach may be preferable in the long run, particularly, if the log file keeps growing and/or you want the log sorted by IP addresses anyway.
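The append-only behavior can be tried out without the real scanner. In this sketch the two functions are hypothetical stand-ins for two runs of ./scan_network.py, the addresses are invented, and the scanner output is piped into grep, which is equivalent to the process substitution used above:

```shell
log=$(mktemp)

# Stand-ins for two runs of the scanner; they share one address.
scan() { printf '%s\n' 10.234.0.5 10.234.0.9; }
scan_again() { printf '%s\n' 10.234.0.9 10.234.1.1; }

# grep exits with status 1 when it selects no lines, hence "|| true".
scan | grep -f "$log" -xFv >> "$log" || true
scan_again | grep -f "$log" -xFv >> "$log" || true

result=$(cat "$log")
printf '%s\n' "$result"
rm -f "$log"
```

The shared address is logged once, and existing lines keep their original order.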
As for what you've tried:
Both GNU and BSD/macOS grep optionally accept - as a placeholder for stdin to accept the input from, but note that this operand is never needed, because grep reads input from stdin by default.
By contrast, only GNU grep accepts - as the option-argument to -f, i.e., the file containing the search terms to apply.
BSD/macOS requires either an explicit filename, a process substitution (as above), or, in a pinch, /dev/stdin to refer to stdin.
The logic of your search must be reversed: as in the command above, the existing log file contents must serve as the search terms (passed to -f), and the ./scan_network.py ... output must serve as the input in order to determine which lines are not (-v) already in the log file.
Using - to represent stdin or stdout, depending on context, is a mere convention that only works as a command argument, so your attempt to refer to stdout output with if ...; then - >> results.log cannot work, because - is invariably interpreted as a command name.
If you use grep -q, stdout output is by definition suppressed, so there's nothing to pass on (even if you used a pipe).
[1] macOS's (OS X's) sort does not support -V for per-component version-number sorting (which can be applied to IP addresses too). Even though the macOS sort is a GNU sort, it is an ancient one - v5.93 as of macOS 10.12 - that predates support for -V.
Assuming that your script returns a single line of text, you can store the output in a variable and then grep for that string. For example:
# save output to a shell variable
str=$(./scan_network.py 22)
# log file from the question
logfile=results.log
# don't call grep twice for the same pattern (-x: match whole lines only)
grep=$(grep -xF "$str" "$logfile")
# append the new output if grep found nothing
if [[ -z "$grep" ]]; then
echo "$str" >> "$logfile"
fi
I am trying to prepend a message to the output of rsstail, this is what I have right now:
rsstail -o -i 15 --initial 0 http://feeds.bbci.co.uk/news/world/europe/rss.xml | awk -v time=$( date +\[%H:%M:%S_%d/%m/%Y\] ) '{print time,$0}' | tee someFile.txt
which should give me the following:
[23:46:49_23/10/2014] Title: someTitle
After the command I have a | while read line do ... end which never gets called because the above command does not output a single thing. What am I doing wrong?
PS: I am using the python version of rsstail, since the other one kept on crashing (https://github.com/gvalkov/rsstail.py)
As requested in the comments the command:
rsstail -o -i 15 --initial 0 http://feeds.bbci.co.uk/news/world/europe/rss.xml
Will give back a message like the following when a new article is found
Title: Sweden calls off search for sub
It seems that my rsstail is different from yours, but mine supports the option
-Z x add heading 'x'
so that
rsstail -Z"$( date +\[%H:%M:%S_%d/%m/%Y\] ) " ...
does the job without awk. On the other hand, you do seem to have a buffering problem. Is it possible to ask rsstail to stop after a given number of titles?
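Independently of buffering, note that in the original pipeline $( date ... ) is expanded once, when the command line is parsed, so awk would stamp every line with the same time. A sketch that takes a fresh timestamp per line, with a hypothetical feed function standing in for the rsstail command:

```shell
# Stand-in for the rsstail command; the titles are invented.
feed() { printf 'Title: %s\n' 'first item' 'second item'; }

stamped=$(feed | while IFS= read -r line; do
  # date runs once per input line, so each line gets its own timestamp
  printf '[%s] %s\n' "$(date +%H:%M:%S_%d/%m/%Y)" "$line"
done)
printf '%s\n' "$stamped"
```

If the Python rsstail buffers its output when writing to a pipe, running it with PYTHONUNBUFFERED=1 in its environment may help; that is an assumption about the cause, not something verified here.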
For work, I occasionally need to monitor the output logs of services I create. These logs are short lived, and contain a lot of information that I don't necessarily need. Up until this point I've been watching them using:
grep <tag> * | less
where <tag> is either INFO, DEBUG, WARN, or ERROR. There are about 10x as many warns as there are errors, and 10x as many debugs as warns, and so forth. It makes it difficult to catch one ERROR in a sea of relevant DEBUG messages. I would like a way to, for instance, make all 'WARN' messages appear on the left-hand side of the terminal, and all the 'ERROR' messages appear on the right-hand side.
I have tried using tmux and screen, but it doesn't seem to be working on my dev machine.
Try this:
vim -O <(grep 'ERR' "$FILE") <(grep 'WARN' "$FILE")
Just use sed to indent the desired lines. Or, use colors. For example, to make ERRORS red, you could do:
$ r=$( printf '\033[1;31m' ) # escape sequence may change depending on the display
$ g=$( printf '\033[1;32m' )
$ echo $g # Switch the output color to green, which serves as the normal color here
$ sed "/ERROR/ { s/^/$r/; s/$/$g/; }" *
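The indentation variant mentioned first could look like the following sketch, which pushes ERROR lines 40 columns to the right. The sample log lines and the width are arbitrary choices:

```shell
pad=$(printf '%40s' '')   # a string of 40 spaces

# Only lines containing ERROR get the padding prepended.
indented=$(printf '%s\n' 'WARN low disk space' 'ERROR write failed' |
  sed "/ERROR/ s/^/$pad/")
printf '%s\n' "$indented"
```

In a wide terminal this gives a rough two-column effect without needing tmux or screen.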
If these are live logs, how about running these two commands in separate terminals:
tail -f * | grep ERROR
tail -f * | grep WARN
To automate this you could start it in a tmux session. I tend to do this with a tmux script similar to what I described here.
In your case the script file could contain something like this:
send-keys "tail -f * | grep ERROR\n"
split-window -h
send-keys "tail -f * | grep WARN\n"
Then run like this:
tmux new -d \; source-file monitor.tmux; tmux attach
You could do this using screen. Simply split the screen vertically and run tail -f LOGFILE | grep KEYWORD on each pane.
As a shortcut, you can use the following rc file:
split -v
screen bash -c "tail -f /var/log/syslog | grep ERR"
focus
screen bash -c "tail -f /var/log/syslog | grep WARN"
then launch your screen instance using:
screen -c monitor_log_screen.rc
You can of course extend this concept much further by making more splits and using commands like tail -f and watch to get live updates of different output.
Do also explore other screen features, such as multiple windows (with monitoring) and hardstatus, and you can come up with quite a comprehensive "monitoring console".