Hello, I am having a problem checking whether two variables are equal. I have the following script:
Output=$(sudo defaults read /System/Library/User\ Template/English.lproj/Library/Preferences/com.apple.SetupAssistant | grep -o "DidSeeCloudSetup = 1")
Output2=$(sudo defaults read /System/Library/User\ Template/English.lproj/Library/Preferences/com.apple.SetupAssistant | grep -o "LastSeenCloudProductVersion")
Check="DidSeeCloudSetup = 1"
Check2="LastSeenCloudProductVersion"
echo "$Output"
echo "$Check"
if [ "$Output" = "$Check" ]
then
echo "OK"
else
echo "FALSE"
fi
Even though they both contain the same thing it always comes out false... any ideas why?
There is a special character (hex: 10) between $ and Check in your if clause:
00000000 69 66 20 5b 20 22 24 4f 75 74 70 75 74 22 20 3d |if [ "$Output" =|
00000010 20 22 24 10 43 68 65 63 6b 22 20 5d 0a | "$.Check" ].|
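To find and remove a stray byte like this without reading hex dumps by eye, something along these lines might help. It is only a sketch: the variable names (line, ctrl_count, clean) are made up for the demo, and the sample line is rebuilt here with printf rather than read from your actual script.

```shell
# Rebuild the problematic line with a stray 0x10 (octal 020) byte after the "$".
line=$(printf 'if [ "$Output" = "$\020Check" ]')

# Count the bytes that are neither printable nor tab/newline.
ctrl_count=$(printf '%s' "$line" | tr -d '[:print:]\t\n' | wc -c | tr -d ' ')

# Delete control characters, keeping tab (011) and newline (012).
clean=$(printf '%s' "$line" | tr -d '\000-\010\013-\037\177')

printf 'control bytes found: %s\n' "$ctrl_count"
printf '%s\n' "$clean"
```

The same tr -d filter can be run over the whole script file and redirected to a new file, after which the comparison should behave as expected.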
I've run into strange behaviour with bash string substitution.
I expected the same substitution on $r1 and $var to yield exactly the same result, since both strings seem to have the same value.
But that is not the case, and I can't work out what I'm missing.
Maybe it's because of the glob? I just don't know. I'm not an IT person, so it may be something that is obvious to you.
(There is a Repl.it link at the bottom.)
mkdir -p T21805
touch T21805/T21805_SI-GA-D8-BH25N7DSXY_S1_L001_R1_001.fastq.gz
r1=T21805/*R1*
echo $r1;
echo ${r1%%_S1*z}
var=T21805/T21805_SI-GA-D8-BH25N7DSXY_S1_L001_R1_001.fastq.gz
echo ${var%%_S1*z}
echo $r1| hexdump -C
echo $var | hexdump -C
output :
echo $r1
T21805/T21805_SI-GA-D8-BH25N7DSXY_S1_L001_R1_001.fastq.gz
echo ${r1%%_S1*z}
T21805/T21805_SI-GA-D8-BH25N7DSXY_S1_L001_R1_001.fastq.gz
echo ${var%%_S1*z}
T21805/T21805_SI-GA-D8-BH25N7DSXY
echo $r1| hexdump -C
00000000 54 32 31 38 30 35 2f 54 32 31 38 30 35 5f 53 49 |T21805/T21805_SI|
00000010 2d 47 41 2d 44 38 2d 42 48 32 35 4e 37 44 53 58 |-GA-D8-BH25N7DSX|
00000020 59 5f 53 31 5f 4c 30 30 31 5f 52 31 5f 30 30 31 |Y_S1_L001_R1_001|
00000030 2e 66 61 73 74 71 2e 67 7a 0a |.fastq.gz.|
0000003a
echo $var | hexdump -C
00000000 54 32 31 38 30 35 2f 54 32 31 38 30 35 5f 53 49 |T21805/T21805_SI|
00000010 2d 47 41 2d 44 38 2d 42 48 32 35 4e 37 44 53 58 |-GA-D8-BH25N7DSX|
00000020 59 5f 53 31 5f 4c 30 30 31 5f 52 31 5f 30 30 31 |Y_S1_L001_R1_001|
00000030 2e 66 61 73 74 71 2e 67 7a 0a |.fastq.gz.|
0000003a
Repl.it
I am interested in understanding why this is not working; I can already get my desired output using sed, for example.
Glob expansion doesn't happen at assignment time.
$ mkdir -p T21805
$ touch T21805/T21805_SI-GA-D8-BH25N7DSXY_S1_L001_R1_001.fastq.gz
$ touch T21805/T21805_SI-GA-D8-BH25N7DSXY_S1_L001_R1_002.fastq.gz
$ r1=T21805/*R1*
$ printf '%s\n' "$r1"
T21805/*R1*
$ printf '%s\n' $r1
T21805/T21805_SI-GA-D8-BH25N7DSXY_S1_L001_R1_001.fastq.gz
T21805/T21805_SI-GA-D8-BH25N7DSXY_S1_L001_R1_002.fastq.gz
It happens after the unquoted r1 has been expanded. When you write ${r1%%_S1*z}, the value of r1 doesn't contain the string S1; only after ${r1} expands is there an S1 you could match against.
If you set an array, the assignment rules are different. The glob expands before the assignment, and so you can do your filtering on each element of the array.
$ r1=( T21805/*R1* )
$ printf '%s\n' "${r1[@]}"
T21805/T21805_SI-GA-D8-BH25N7DSXY_S1_L001_R1_001.fastq.gz
T21805/T21805_SI-GA-D8-BH25N7DSXY_S1_L001_R1_002.fastq.gz
$ printf '%s\n' "${r1[@]%%_S1*z}"
T21805/T21805_SI-GA-D8-BH25N7DSXY
T21805/T21805_SI-GA-D8-BH25N7DSXY
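If you would rather not use an array, a plain for loop is another place where the glob is expanded before your code sees the value, so each iteration gets a real file name to trim. A minimal sketch, assuming a throwaway directory created with mktemp; the names workdir, pattern, first_prefix and count are just for illustration:

```shell
# Scratch directory with two files shaped like the ones in the question.
workdir=$(mktemp -d)
touch "$workdir/T21805_SI-GA-D8-BH25N7DSXY_S1_L001_R1_001.fastq.gz" \
      "$workdir/T21805_SI-GA-D8-BH25N7DSXY_S1_L001_R1_002.fastq.gz"

# A scalar assignment stores the pattern literally, so there is nothing to strip.
pattern=$workdir/*R1*
stripped_pattern=${pattern%%_S1*z}

# In a for loop the glob is expanded, so each $path is an actual file name.
first_prefix=
count=0
for path in "$workdir"/*R1*; do
    [ -e "$path" ] || continue        # skip the literal pattern if nothing matched
    name=${path##*/}                  # drop the directory part
    prefix=${name%%_S1*z}             # now the suffix pattern has something to match
    if [ -z "$first_prefix" ]; then first_prefix=$prefix; fi
    count=$((count + 1))
    printf '%s -> %s\n' "$name" "$prefix"
done

rm -rf "$workdir"
```

Here stripped_pattern comes out identical to pattern, which is the behaviour seen in the question, while the loop yields the trimmed prefix for every matching file.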
I ran it after set -xv to see the contents of r1.
$ r1=T21805/*R1*
+ r1='T21805/*R1*'
$ var=T21805/T21805_SI-GA-D8-BH25N7DSXY_S1_L001_R1_001.fastq.gz
+ var=T21805/T21805_SI-GA-D8-BH25N7DSXY_S1_L001_R1_001.fastq.gz
The r1 in ${r1%%_S1*z} is T21805/*R1*.
r1 does not include _S1*z.
I'm trying to do a simple tcsh script to look for a folder, then navigate to it if it exists. The statement evaluates properly, but if it evaluates false, I get an error "then: then/endif not found". If it evaluates true, no problem. Where am I going wrong?
#!/bin/tcsh
set icmanagedir = ""
set workspace = `find -maxdepth 1 -name "*$user*" | sort -r | head -n1`
if ($icmanagedir != "" && $workspace != "") then
setenv WORKSPACE_DIR `readlink -f $workspace`
echo "Navigating to workspace" $WORKSPACE_DIR
cd $WORKSPACE_DIR
endif
($icmanagedir is initialized elsewhere, but I get the error regardless of which variable is empty)
The problem is that tcsh needs to have every line end in a newline, including the last line; it uses the newline as the "line termination character", and if it's missing it errors out.
You can use a hex editor/viewer to check if the file ends with a newline:
$ hexdump -C x.tcsh
00000000 69 66 20 28 22 78 22 20 3d 20 22 78 22 29 20 74 |if ("x" = "x") t|
00000010 68 65 6e 0a 09 65 63 68 6f 20 78 0a 65 6e 64 69 |hen..echo x.endi|
00000020 66 |f|
Here the last character is f (0x66), not a newline. A correct file has 0x0a as the last character (represented by a .):
$ hexdump -C x.tcsh
00000000 69 66 20 28 22 78 22 20 3d 20 22 78 22 29 20 74 |if ("x" = "x") t|
00000010 68 65 6e 0a 09 65 63 68 6f 20 78 0a 65 6e 64 69 |hen..echo x.endi|
00000020 66 0a |f.|
Ending the last line in a file with a newline is a common UNIX idiom, and some shell tools expect this. See What's the point in adding a new line to the end of a file? for some more info on this.
Most UNIX editors (such as Vim, Nano, and Emacs) do this by default. Some editors and IDEs don't, but almost all of them have a setting to enable it.
The best solution is to enable this setting in your editor. If you can't do this then adding a blank line at the end also solves your problem.
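If you want to check or repair a file from the command line instead, a small helper like the one below could do it. This is a sketch, not a polished tool: ensure_trailing_newline is a made-up function name, and the sample file is a scratch file created with mktemp.

```shell
# Append a newline to a file only if it is non-empty and its last byte is not one.
ensure_trailing_newline() {
    # $(tail -c 1 FILE) is empty when the last byte is a newline,
    # because command substitution strips trailing newlines.
    if [ -s "$1" ] && [ -n "$(tail -c 1 "$1")" ]; then
        printf '\n' >> "$1"
    fi
}

# Scratch file that ends in "endif" with no newline, like the broken example.
script=$(mktemp)
printf 'if ("x" == "x") then\n\techo x\nendif' > "$script"

ensure_trailing_newline "$script"
size_after_first=$(wc -c < "$script" | tr -d ' ')

# Running it again should change nothing.
ensure_trailing_newline "$script"
size_after_second=$(wc -c < "$script" | tr -d ' ')

# Show the last byte in hex.
last_byte=$(tail -c 1 "$script" | od -An -tx1 | tr -d ' \n')
printf 'last byte: %s\n' "$last_byte"

rm -f "$script"
```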
I've spent an embarrassingly long time trying to understand why the second conditional in the "foo" script below fails but the first one succeeds.
Please note:
The current directory contains two files: bar and foo.
All three strings $s1, $s2 and $s3 are equal according to hexdump.
Thanks in advance for any help.
Session: (Running on a Centos7 host):
>ls
bar foo
>cat foo
#!/bin/bash
s1="bar foo"
s2="bar foo"
s3=`ls`
echo -n $s1 | hexdump -C
echo -n $s2 | hexdump -C
echo -n $s3 | hexdump -C
if [ "$s1" = "$s2" ]; then # True
echo s1 = s2
fi
if [ "$s1" = "$s3" ]; then # NOT true! Why?
echo s1 = s3
fi
>foo
00000000 62 61 72 20 66 6f 6f |bar foo|
00000007
00000000 62 61 72 20 66 6f 6f |bar foo|
00000007
00000000 62 61 72 20 66 6f 6f |bar foo|
00000007
s1 = s2
>
Quote the variables when echoing.
echo -n "$s3" | hexdump -C
You'll see a newline between the file names, since ls behaves as if given -1 (one name per line) when its output is not a terminal.
Your demo would be more convincing with echo -n "$s1" etc. That would show that there's a newline in the middle of s3 where there's a space in s1 and s2. The echo without the double quotes mangles the newline into a space (and generally each sequence of one or more white space characters in the string into a single space).
Given:
#!/bin/bash
s1="bar foo"
s2="bar foo"
s3=`ls`
echo -n "$s1" | hexdump -C
echo -n "$s2" | hexdump -C
echo -n "$s3" | hexdump -C
if [ "$s1" = "$s2" ]; then # True
echo s1 = s2
fi
if [ "$s1" = "$s3" ]; then # NOT true because s3 contains a newline!
echo s1 = s3
fi
I get:
$ sh foo
00000000 2d 6e 20 62 61 72 20 66 6f 6f 0a |-n bar foo.|
0000000b
00000000 2d 6e 20 62 61 72 20 66 6f 6f 0a |-n bar foo.|
0000000b
00000000 2d 6e 20 62 61 72 0a 66 6f 6f 0a |-n bar.foo.|
0000000b
s1 = s2
$ bash foo
00000000 62 61 72 20 66 6f 6f |bar foo|
00000007
00000000 62 61 72 20 66 6f 6f |bar foo|
00000007
00000000 62 61 72 0a 66 6f 6f |bar.foo|
00000007
s1 = s2
$
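The effect can also be reproduced without ls or hexdump. In the sketch below, s3 is filled from printf as a stand-in for the ls output (an assumption, so that it runs in any directory); unquoted, quoted and joined are illustrative names.

```shell
s1="bar foo"
s3=$(printf 'bar\nfoo\n')        # stand-in for $(ls) in a directory holding bar and foo

# Unquoted expansion is word-split, and echo rejoins the words with single spaces.
unquoted=$(echo $s3)
# Quoted expansion keeps the embedded newline.
quoted=$(echo "$s3")

if [ "$s1" = "$unquoted" ]; then echo "unquoted: looks equal to s1"; fi
if [ "$s1" != "$quoted" ]; then echo "quoted: differs from s1"; fi

# If a space-separated list is really what you want, convert explicitly.
joined=$(printf '%s' "$s3" | tr '\n' ' ')
if [ "$s1" = "$joined" ]; then echo "joined: equal to s1"; fi
```

This is why the unquoted hexdumps in the question all looked identical even though the test on the quoted values failed.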
I want to filter a directory full of log files by the 8th column, according to the user's input (e.g. 1218738496), and write the result to a text file. I have a working solution, but I am looking for one with better performance, as the total file size may exceed 1 GB.
Problem 1:
Format inconsistencies in some lines.
Problem 2:
If the line's 8th column matches the input, the lines below it (that do not contain INSERT) should be output to file as well.
Sample data
ACTION,INSTALLATION_ID,LOG_TIMESTAMP_SECONDS,LOG_TIMESTAMP_FRACTIONS,LOG_TIMESTAMP,THREAD_ID,SEQUENCE_NUMBER,LOG_LEVEL_TYPE
INSERT,SLT_TEST_1,2015/06/02 14:07:26.860 (Asia/Colombo),1127192896,0,DEBUG3
0010: 69 6c 65 40 10 92 0f 0e 67 b9 72 aa 5d e1 03 63
]",,default,false
INSERT,SLT_TEST_1,2015/06/02 14:07:13.305 (Asia/Colombo),1127192896,1,DEBUG1
INSERT,SLT_TEST_1,2015/06/02 14:07:26.860 (Asia/Colombo),1218738496,14,DEBUG3
<v s=""MONTHLY_PEAK_DWNLOAD""/>
</a><a n=""thresholdScheme""><o t=""PM_UsageMonitorConfigThreshold"">
INSERT,SLT_TEST_1,2015/06/02 14:07:26.860 (Asia/Colombo),1218738496,15,DEBUG3
0010: 69 6c 65 40 10 92 0f 0e 67 b9 72 aa 5d e1 03 63
]",,default,false
INSERT,SLT_TEST_1,2015/06/02 14:07:26.860 (Asia/Colombo),1218738496,17,DEBUG3
Desired output
INSERT,SLT_TEST_1,2015/06/02 14:07:26.860 (Asia/Colombo),1218738496,14,DEBUG3
<v s=""MONTHLY_PEAK_DWNLOAD""/>
</a><a n=""thresholdScheme""><o t=""PM_UsageMonitorConfigThreshold"">
INSERT,SLT_TEST_1,2015/06/02 14:07:26.860 (Asia/Colombo),1218738496,15,DEBUG3
0010: 69 6c 65 40 10 92 0f 0e 67 b9 72 aa 5d e1 03 63
]",,default,false
INSERT,SLT_TEST_1,2015/06/02 14:07:26.860 (Asia/Colombo),1218738496,17,DEBUG3
My current working script
for file in $(ls -rt $directory)
do
echo "Reading file : " $file
# || [[ -n "$line" ]] <-- prevent last line being ignored if doesn't end with newline
while IFS= read -r line || [[ -n "$line" ]]
do
# if line contains INSERT
if [[ $line == *"INSERT"* ]]
then
# Break it to access the thread ID
breakdown=(${line//,/ })
threadID=${breakdown[4]}
if [[ $threadID == "$inputThreadID" ]]
then
seqID=${breakdown[5]}
echo $line >> ./output_unsorted.txt
fi
else
# The "too long lines" check if they belong to the ID log we want
if [ "$threadID" == "$inputThreadID" ] && [[ $line != *"ACTION,INSTALLATION_ID"* ]]
then
if [ "$lastSeqID" != "$seqID" ]
then
echo $line >> ./output_unsorted.txt
else
echo $line >> ./output_unsorted.txt
fi
fi
fi
done < "$directory/$file"
done
Using awk
This produces the output that you ask for:
$ awk -F, '/INSERT/{f=0} $4==1218738496{f=1} f' file
INSERT,SLT_TEST_1,2015/06/02 14:07:26.860 (Asia/Colombo),1218738496,14,DEBUG3
<v s=""MONTHLY_PEAK_DWNLOAD""/>
</a><a n=""thresholdScheme""><o t=""PM_UsageMonitorConfigThreshold"">
INSERT,SLT_TEST_1,2015/06/02 14:07:26.860 (Asia/Colombo),1218738496,15,DEBUG3
0010: 69 6c 65 40 10 92 0f 0e 67 b9 72 aa 5d e1 03 63
]",,default,false
INSERT,SLT_TEST_1,2015/06/02 14:07:26.860 (Asia/Colombo),1218738496,17,DEBUG3
How it works:
-F,
Set the input field separator to a comma.
/INSERT/{f=0}
If the line contains INSERT, we set flag f to zero (false).
$4==1218738496{f=1}
If the fourth field is your selected number, then we set the flag f to one (true).
f
If f is true, print the line.
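Since the number comes from user input, you would probably pass it in with -v rather than hard-coding it. A minimal sketch of that, assuming a small scratch log built from lines in the question; thread_id, log and result are illustrative names:

```shell
# Scratch log with a matching record, its continuation line, and non-matching records.
log=$(mktemp)
cat > "$log" <<'EOF'
INSERT,SLT_TEST_1,2015/06/02 14:07:13.305 (Asia/Colombo),1127192896,1,DEBUG1
INSERT,SLT_TEST_1,2015/06/02 14:07:26.860 (Asia/Colombo),1218738496,14,DEBUG3
<v s=""MONTHLY_PEAK_DWNLOAD""/>
INSERT,SLT_TEST_1,2015/06/02 14:07:26.860 (Asia/Colombo),1127192896,2,DEBUG3
]",,default,false
EOF

thread_id=1218738496             # would come from the user in the real script

# Same program as above, with the id supplied through an awk variable.
result=$(awk -F, -v id="$thread_id" '/INSERT/{f=0} $4==id{f=1} f' "$log")
line_count=$(printf '%s\n' "$result" | wc -l | tr -d ' ')
first_line=$(printf '%s\n' "$result" | head -n 1)

printf '%s\n' "$result"
rm -f "$log"
```

awk accepts several file names in one call, so the whole directory can be handed to a single awk process (for example "$directory"/*) instead of looping over the files in the shell, which is likely where most of the time goes.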
Using bash
This uses very similar logic and produces the same output but uses bash:
#!/bin/bash
f=
while IFS= read -r line
do
[[ $line == *"INSERT"* ]] && f=
IFS=, read -r a b c d rest <<<"$line"
[ "$d" = 1218738496 ] && f=1
[ "$f" ] && echo "$line"
done <file
This is the script I've constructed
It takes a list of files according to the extension supplied as an argument.
It then removes everything before the pattern 00000000: in those files.
The pattern 00000000: is preceded by the string <pre>, it then removes those five first characters.
The script then removes the last three lines of the file.
The script then outputs only the hexdump data of the file.
The script runs xxd to convert the hexdump to a file.jpg.
if [[ $# -eq 0 ]] ; then
echo 'Run script as ./hexconv ext'
exit 0
fi
for file in *.$1
do
filename=$(basename $file)
extension="${filename##*.}"
filename="${filename%.*}"
sed -n '/00000000:/,$p' $file | sed '1s/^.....//' | head -n -3 | awk '{print $2" "$3" "$4" "$5" "$6" "$7" "$8" "$9" "$10" "$11" "$12" "$13" "$14" "$15" "$16" "$17}' | xxd -p -r > $filename.jpg
done
It works as I want it to, but I suspect there are things that could be improved. Alas, I am a novice with awk and sed.
Excerpt from file
<th>response-head:</th>
<td>HTTP/1.1 200 OK
Date: Sun, 15 Dec 2013 04:27:04 GMT
Server: PWS/8.0.18
X-Px: ms h0-s34.p6-lhr ( h0-s35.p6-lhr), ht-d h0-s35.p6-lhr.cdngp.net
Etag: "4556354-9fbf8-4e40387aadfc0"
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0, max-age=0
Accept-Ranges: bytes
Content-Length: 654328
Content-Type: image/jpeg
Last-Modified: Thu, 15 Aug 2013 21:55:19 GMT
Pragma: no-cache
</td>
</tr>
</table>
<hr/>
<pre>00000000: ff d8 ff e0 00 10 4a 46 49 46 00 01 01 01 00 48 ......JFIF.....H
00000010: 00 48 00 00 ff e1 00 18 45 78 69 66 00 00 49 49 .H......Exif..II
00000020: 2a 00 08 00 00 00 00 00 00 00 00 00 00 00 ff ed *...............
00000030: 00 48 50 68 74 73 68 70 20 33 2e 30 00 .HPhotoshop 3.0.
00000040: 38 42 49 4d 04 04 00 00 00 00 00 1c 01 5a 00 8BIM..........Z.
00000050: 03 1b 25 47 1c 02 00 00 02 00 02 00 38 42 49 4d ..%G........8BIM
00000060: 04 25 00 00 00 00 00 10 fc e1 89 c8 b7 c9 78 .%.............x
00000070: 34 62 34 07 58 77 eb ff e1 03 a5 68 74 74 70 /4b4.Xw.....http
00000080: 3a 6e 73 2e 61 64 62 65 2e 63 6d ://ns.adobe.com/
00000090: 78 61 70 31 2e 30 00 3c 78 70 61 63 6b xap/1.0/.<?xpack
000000a0: 65 74 20 62 65 67 69 6e 3d 22 ef bb bf 22 20 69 et begin="..." i
000000b0: 64 3d 22 57 35 4d 30 4d 70 43 65 68 69 48 7a 72 d="W5M0MpCehiHzr
000000c0: 65 53 7a 4e 54 63 7a 6b 63 39 64 22 3e 20 3c eSzNTczkc9d"?> <
000000d0: 78 3a 78 6d 70 6d 65 74 61 20 78 6d 6c 6e 73 3a x:xmpmeta xmlns:
000000e0: 78 3d 22 61 64 62 65 3a 6e 73 3a 6d 65 74 61 x="adobe:ns:meta
000000f0: 22 20 78 3a 78 6d 70 74 6b 3d 22 41 64 62 /" x:xmptk="Adob
00000100: 65 20 58 4d 50 20 43 72 65 20 35 2e 30 2d 63 e XMP Core 5.0-c
00000110: 30 36 31 20 36 34 2e 31 34 30 39 34 39 2c 20 32 061 64.140949, 2
00000120: 30 31 30 31 32 30 37 2d 31 30 3a 35 37 3a 010/12/07-10:57:
Although @CodeGnome is right and this might belong to Code Review SE, here you go anyway:
Slightly more efficient to combine the multiple sed commands into one, for example:
sed -n -e 's/^<pre>//' -e '/00000000:/,$p'
I decided to retract this part, as I'm not all that sure it's any better or clearer. Your version is fine, except that s/^<pre>// is better than s/^.....//.
Use exit 1 when checking the number of arguments, to signal an error.
What is for file in *. there? Iterate for all files ending with a dot? Typo?
Unless you're 100% sure the filenames will never contain spaces, you should quote them, but don't quote where you don't need to, for example:
filename=$(basename "$file") # need to quote
extension=${filename##*.} # no need,
filename=${filename%.*} # no need
sed ... "$file" # need to quote
... | xxd > "$filename".jpg # need to quote
The last awk could be shorter and less error prone as a loop:
... | awk '{printf $2; for (i=3; i<=17; ++i) printf " " $i; print ""}'
It seems you want to learn. You might be interested in this other answer too: What are the rules to write robust shell scripts?
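To see what the awk loop suggested above does to a single line, you can try it on the first row of the excerpt. This is only a quick check; line and hex are names made up for the demo.

```shell
# First row of the hexdump from the question, with the <pre> prefix already removed.
line='00000000: ff d8 ff e0 00 10 4a 46 49 46 00 01 01 01 00 48 ......JFIF.....H'

# Print fields 2..17 (the sixteen hex bytes), dropping the offset and the ASCII column.
hex=$(printf '%s\n' "$line" |
      awk '{ printf "%s", $2; for (i = 3; i <= 17; ++i) printf " %s", $i; print "" }')

printf '%s\n' "$hex"
```

One caveat worth checking on your own files: a row with fewer than sixteen bytes shifts the ASCII column into the field range, so the loop may pick up text that is not hex data.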
The error message should be sent to stderr, should not hard-code the name of the script in case you rename it later, and should exit with a nonzero value.
if (( ! $# )); then
echo >&2 "Run script as '$0' \$extension"
exit 1
fi
If you're going to put the then on the same line as the if, then you should put the do on the same line as the for, too, for consistency:
for file in *.$1; do
Using file for the full name and filename for the basename is a confusing choice of variable names. I would use basename for the variable, to match the operation. And you need to quote the parameter expansion:
basename=$(basename "$file")
But you don't need to quote the right hand side of an assignment:
extension=${basename##*.}
The part of a filename without the extension is sometimes called the root (in vi and csh :-modifiers, you get it with :r)... using that name would be less confusing than changing an existing variable and reusing it:
root=${basename%.*}
As far as the actual pipeline, I would reorder it to put the head before the awk, since the sed and the head are all about what lines to print out and should be grouped together before the awk which modifies those selected lines. I would also use a loop and printf to make the awk a little more wieldy:
sed -n '/0\{8\}:/,$p' "$file" |
head -n -3 |
awk '{ printf "%s", $2; for (f=3;f<=17;++f) { printf " %s", $f }; print "" }' |
xxd -p -r > "$root.jpg"
done