How to save bash output in text/excel file? - shell

I want to save a particular part of a command's output to a text/Excel file through bash.
I tried the commands below; they work fine, but they capture every step of the execution, while I only need the final passed/failed result.
$ bash eg.sh Behatscripts.txt > output.xls
$ bash eg.sh Behatscripts.txt > output.txt
Below is the console output in my case. This whole thing is written into the .txt/.xls file, but I only need the last part, which is:
1 scenario (1 passed)
3 steps (3 passed)
The full console output is:
Executing the Script : eg.feature
----------------------------------------
@javascript
Feature: home page Validation
In order to check the home page of our site
As a website/normal user
I should be able to find some of the links/texts on the home page
Scenario: Validate the links in the header and footer # features\cap_english_home.feature:8
Given I am on the homepage # FeatureContext::iAmOnHomepage()
When I visit "/en" # FeatureContext::assertVisit()
Then I should see the following <links> # FeatureContext::iShouldSeeTheFollowingLinks()
| links |
| Dutch |
1 scenario (1 passed)
3 steps (3 passed)
0m14.744s
Please suggest a way to save only the last part of the console output. Thanks in advance.
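One way to do this, sketched on the assumption that the summary is always the last three lines Behat prints (the scenario count, the step count, and the timing), is to filter the output before redirecting it:
$ bash eg.sh Behatscripts.txt | tail -n 3 > output.txt
If you want only the two count lines without the timing, match their shape instead:
$ bash eg.sh Behatscripts.txt | grep -E '^[0-9]+ (scenario|step)' > output.txt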

Related

How to invert the result of a diff?

I'm trying to create a logging script. Right now I need a way to get the lines that differ between two files. I'm thinking of storing a copy of the sent log and sending the difference.
Example:
fullLog.txt
logged 1
logged 2
logged 3
cachedLog.txt
logged 1
And I want to get
logged 2
logged 3
as a variable, and then run cp fullLog.txt cachedLog.txt.
The issue is that diff fullLog.txt cachedLog.txt | sed 's/^[<>] //g' only prints
logged 1
How can I "invert" the result to get what I wanted?
Solved: comm -3 fullLog.txt cachedLog.txt
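A minimal sketch of the full workflow around that solution. Note that comm expects sorted input; it behaves here because cachedLog.txt is a prefix of the append-only fullLog.txt. Using -23 instead of -3 prints only the lines unique to the first file, without the tab indentation comm adds to its second column:
newLines=$(comm -23 fullLog.txt cachedLog.txt)   # lines only in fullLog.txt
echo "$newLines"                                 # logged 2, logged 3
cp fullLog.txt cachedLog.txt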

Can a snakemake input rule be defined with different paths/wildcards

I want to know if one can define an input rule that depends on different wildcards.
To elaborate, I am running this Snakemake pipeline on different fastq files using qsub, which submits each job to a different node:
1. fastqc on the original fastq - no downstream dependency on other jobs
2. adapter/quality trimming to generate a trimmed fastq
3. fastqc_after on the trimmed fastq (output from step 2) - no downstream dependency
4. star-rsem pipeline on the trimmed fastq (output from step 2 above)
5. rsem and tximport (output from step 4)
6. Run multiqc
MultiQC - https://multiqc.info/ - runs on the results folder, which holds the results from fastqc, star, rsem, etc. However, because each job runs on a different node, sometimes step 3 (fastqc and/or fastqc_after) is still running on its node while the other steps (2, 4 and 5) have finished, or vice versa.
Currently, I can create a MultiQC rule that waits on the results from steps 2, 4 and 5, because they are linked to each other by input/output rules.
I have attached my pipeline as a png to this post. Any suggestions would help.
What I need: I want to create a "collating" step where MultiQC waits until all steps (1 to 5) finish. In other words, using my attached png as a guide, I want to define multiple inputs for the MultiQC rule so that it also waits on the results from fastqc.
Note: Based on comments I received from 'colin' and 'bli' after my original post, I have shared the code for the different rules here.
Step 1 - fastqc
rule fastqc:
    input: "raw_fastq/{sample}.fastq"
    output: "results/fastqc/{sample}_fastqc.zip"
    log: "results/logs/fq_before/{sample}.fastqc.log"
    params: ...
    shell: ...
Step 2 - bbduk
rule bbduk:
    input: R1 = "raw_fastq/{sample}.fastq"
    output: R1 = "results/bbduk/{sample}_trimmed.fastq",
    params: ...
    log: "results/logs/bbduk/{sample}.bbduk.log"
    priority: 95
    shell: ...
Step 3 - fastqc_after
rule fastqc_after:
    input: "results/bbduk/{sample}_trimmed.fastq"
    output: "results/bbduk/{sample}_trimmed_fastqc.zip"
    log: "results/logs/fq_after/{sample}_trimmed.fastqc.log"
    priority: 70
    params: ...
    shell: ...
Step 4 - star_align
rule star_align:
    input: R1 = "results/bbduk/{sample}_trimmed.fastq"
    output:
        out_1 = "results/bam/{sample}_Aligned.toTranscriptome.out.bam",
        out_2 = "results/bam/{sample}_ReadsPerGene.out.tab"
    params: ...
    log: "results/logs/star/{sample}.star.log"
    priority: 90
    shell: ...
Step 5 - rsem_norm
rule rsem_norm:
    input:
        bam = "results/bam/{sample}_Aligned.toTranscriptome.out.bam"
    output:
        genes = "results/quant/{sample}.genes.results"
    params: ...
    threads: 16
    priority: 85
    shell: ...
Step 6 - rsem_model
rule rsem_model:
    input: "results/quant/{sample}.genes.results"
    output: "results/quant/{sample}_diagnostic.pdf"
    params: ...
    shell: ...
Step 7 - tximport_rsem
rule tximport_rsem:
    input: expand("results/quant/{sample}_diagnostic.pdf", sample=samples)
    output: "results/rsem_tximport/RSEM_GeneLevel_Summarization.csv"
    shell: ...
Step 8 - multiqc
rule multiqc:
    input: expand("results/quant/{sample}.genes.results", sample=samples)
    output: "results/multiqc/project_QS_STAR_RSEM_trial.html"
    log: "results/log/multiqc"
    shell: ...
If you want rule multiqc to happen only after fastqc has completed, you can add the output of fastqc to the input of multiqc:
rule multiqc:
    input:
        expand("results/quant/{sample}.genes.results", sample=samples),
        expand("results/fastqc/{sample}_fastqc.zip", sample=samples)
    output: "results/multiqc/project_QS_STAR_RSEM_trial.html"
    log: "results/log/multiqc"
    shell: ...
Or, if you need to be able to refer to the output of rsem_norm in your shell section:
rule multiqc:
    input:
        rsem_out = expand("results/quant/{sample}.genes.results", sample=samples),
        fastqc_out = expand("results/fastqc/{sample}_fastqc.zip", sample=samples)
    output: "results/multiqc/project_QS_STAR_RSEM_trial.html"
    log: "results/log/multiqc"
    shell: "... {input.rsem_out} ..."
In one of your comments, you wrote:
MultiQC needs directory as input - I give it the 'results' directory in my shell command.
If I understand correctly, this means that results/quant/{sample}.genes.results are directories, and not plain files. If this is the case, you should make sure no downstream rule writes files inside those directories. Otherwise, the directories will be considered as having been updated after the output of multiqc, and multiqc will be re-run every time you run the pipeline.
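For concreteness, a hypothetical version of the command hidden behind shell: ... in the multiqc rule, matching the quoted comment about pointing MultiQC at the results directory (-n and -o are MultiQC's flags for the report filename and output directory; the exact command here is an assumption, not from the original post):
multiqc results/ -n project_QS_STAR_RSEM_trial.html -o results/multiqc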

How to read every line from a txt file and print starting from the line which starts with "Created_Date" in shell scripting [duplicate]

This question already has answers here:
How to get the part of a file after the first line that matches a regular expression
My input file looks like this:
5G_Fixed_Wireless_Dashboard_TestScedule||||||||||||||||^M
Report Run Date||08/07/2018|||||||||||||||||||||^M
Requesting User Company||NEW|||||||||||||||||||||^M
Report Criteria|||||||||||||||||||||||^M
" Service Job Updated from Date:
Service Job Updated to Date:
Service Job Created from Date: 08/06/2018
Service Job Created to Date:
Service Job Status:
Resolution Code:"|||||||||||||||||||||||^M
Created Date|Job Status|Schedule Date|Job
Number|Service Job Type|Verizon Customer Order
Number|Verizon Location Code|Service|Installation
Duration|Part Number
I want to print everything starting from the Created Date line. The result file should look like this:
Created Date|Job Status|Schedule Date|Job
Number|Service Job Type|Verizon Customer Order
Number|Verizon Location Code|Service|Installation
Duration|Part Number
I have tried the following lines after being pointed to some related questions, but my requirement is to write the result back to the same file.
FILELIST=$(find $MFROUTDIR -maxdepth 1 -name "XXXXXX_5G_Order_*.txt")
for nextFile in $FILELIST; do
    sed -n -e '/Created Date/,$p' "$nextFile"
done
With the code above, the output is printed to the console. Could you please suggest a way to write it into the same file?
This can be easily done with a simple awk command:
awk '/^Created Date/{p=1} p' file
Created Date|Job Status|Schedule Date|Job
Number|Service Job Type|Verizon Customer Order
Number|Verizon Location Code|Service|Installation
Duration|Part Number
We set a flag p to 1 when we encounter a line that starts with Created Date; from then on, the bare pattern p is true, so awk's default action prints each line.
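Since the requirement is to write the result back into the same file, one way (a sketch combining this with the find loop from the question; GNU awk users could use -i inplace instead) is to write to a temporary file and move it over the original, because awk cannot safely overwrite the file it is still reading:
for nextFile in $(find $MFROUTDIR -maxdepth 1 -name "XXXXXX_5G_Order_*.txt"); do
    awk '/^Created Date/{p=1} p' "$nextFile" > "$nextFile.tmp" && mv "$nextFile.tmp" "$nextFile"
done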
References:
Effective AWK Programming
Awk Tutorial

shell script display grep results

I need some help with displaying how many times two strings are found on the same line. Let's say I want to search the file 'test.txt', which contains names and IPs. I want to enter a name as a parameter when running the script; the script will then search the file for that name and check whether there is also an IP address on that line. I have tried using the 'grep' command, but I don't know how to display the results in a good way. I want it like this:
Name: John Doe IP: xxx.xxx.xx.x count: 3
The count is how many times this line was found. This is what my grep script looks like right now:
#!/bin/bash
echo "Searching $1 for the Name '$2'"
result=$(grep "$2" $1 | grep -E "(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)")
echo $result
I will run the script like 'sh search test.txt John'.
I'm having trouble displaying the information I get from the grep command; maybe there's a better way to do this?
EDIT:
Okay, I will try to explain a little better. Let's say I want to search a .log file; the script should search that file for a string the user enters as a parameter. For example, if the user enters 'sh search test.log logged in', the script will search for the string "logged in" within the file 'test.log'. If the script finds that string on the same line as an IP address, the IP address is printed, along with how many times that line was found.
I simply don't know how to do it. I'm new to shell scripting and was hoping I could use grep along with regular expressions for this. I will keep trying and will update this question with an answer if I figure it out.
I don't have said file on my computer, but it looks something like this:
Apr 25 11:33:21 Admin CRON[2792]: pam_unix(cron:session): session opened for user 192.168.1.2 by (uid=0)
Apr 25 12:39:01 Admin CRON[2792]: pam_unix(cron:session): session closed for user 192.168.1.2
Apr 27 07:42:07 John CRON[2792]: pam_unix(cron:session): session opened for user 192.168.2.22 by (uid=0)
Apr 27 14:23:11 John CRON[2792]: pam_unix(cron:session): session closed for user 192.168.2.22
Apr 29 10:20:18 Admin CRON[2792]: pam_unix(cron:session): session opened for user 192.168.1.2 by (uid=0)
Apr 29 12:15:04 Admin CRON[2792]: pam_unix(cron:session): session closed for user 192.168.1.2
Here is a simple Awk script which does what you request, based on the log snippet you posted.
awk -v user="$2" '$4 == user { i[$11]++ }
END { for (a in i) printf ("Name: %s IP: %s count: %i\n", user, a, i[a]) }' "$1"
If the fourth whitespace-separated field in the log file matches the requested user name (which was passed to the shell script as its second parameter), add one to the count for the IP address (from field 11).
At the end, loop through the IP addresses that were seen and print a summary line for each. (The user name is obviously whatever was passed in, but this matches your expected output.)
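For example, run against the log snippet above, invoked the way the question describes:
$ sh search test.log John
Name: John IP: 192.168.2.22 count: 2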
This is a very basic Awk script; if you think you want to learn more, I urge you to consult a simple introduction, rather than follow up here.
If you want a simpler grep-only solution, something like this provides the information in a different format:
grep "$2" "$1" |
grep -o -E '(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)' |
sort | uniq -c | sort -rn
The trick here is the -o option to the second grep, which extracts just the IP address from the matching line. It is, however, less precise than the Awk script; for example, a user named "sess" would match every input line in the log. You can improve on that slightly by using grep -w in the first grep (though that still won't help against users named "pam"), but Awk really gives you a lot more control.
My original answer is below this line, partly because it's tangentially useful, partly because it is required in order to understand the pesky comment thread below.
The following
result=$(command)
echo $result
is wrong. You need the second line to be
echo "$result"
but in addition, the detour over echo is superfluous; the simple way to write it is just
command
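A quick sketch of why the quotes matter: without them, the shell word-splits the captured text, so embedded newlines collapse into single spaces.
result=$(printf 'line one\nline two')
echo $result      # prints: line one line two
echo "$result"    # prints both lines intact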

Checking output from "command" should contain unexpected crash with NilClass

In an effort to use Cucumber for a command-line script, I've installed the aruba gem as per the instructions provided. It's in my Gemfile, I can verify that the correct version is installed, and I've included
require 'aruba/cucumber'
in 'features/env.rb'.
In order to ensure it works, I wrote the following scenario:
@announce
Scenario: Testing cucumber/aruba
  Given a blank slate
  Then the output from "ls -la" should contain "drw"
I expected this check to fail.
It does fail, but it fails for the wrong reasons:
@announce
Scenario: Testing cucumber/aruba
  Given a blank slate
  Then the output from "ls -la" should contain "drw"
You have a nil object when you didn't expect it!
You might have expected an instance of Array.
The error occurred while evaluating nil.[] (NoMethodError)
features/dataloader.feature:9:in `Then the output from "ls -la" should contain "drw"'
Anyone have any ideas why this isn't working? This seems to be very basic aruba behavior.
You are missing a 'When' step - the aruba "output should contain" step requires the command to have already been run (it does not run the command itself; it only looks up its output).
@announce
Scenario: Testing cucumber/aruba
  Given a blank slate
  When I run `ls -la`
  Then the output from "ls -la" should contain "drw"
This produces, on my machine:
@announce
Scenario: Testing cucumber/aruba # features/test_aruba.feature:8
When I run `ls -la` # aruba-0.4.11/lib/aruba/cucumber.rb:56
$ cd /Users/d.chetlin/dev/mine/ladder/tmp/aruba
$ ls -la
total 0
drwx------ 2 d.chetlin staff 68 Feb 15 23:38 .
drwx------ 7 d.chetlin staff 238 Feb 15 23:38 ..
Then the output from "ls -la" should contain "drw" # aruba-0.4.11/lib/aruba/cucumber.rb:86
1 scenario (1 passed)
2 steps (2 passed)
0m0.465s
