How to find files containing empty strings

One of our suppliers has a buggy shop floor system (long story short). While they fix whatever is wrong on their end, I need to segregate files they send: they are not empty but have a long empty string. Typically a good file will look like this in vi
<insert_list><test_event_insert endTime="2012-09-10T05:28:45" startTime="2012-09-10T05:27:41" operator="8176967"><process_step name="FVT" revision="NO DATA"></process_step><location1 name="CT" type="REGION"><location2 name="ONTREP1" type="TESTER"><location3 name="LineA" type="LINE"></location3></location2></location1><unit ...
"CT~DCA~FVT~8176967~ONTREP1~4~P~1100~DECA1MR0-01~XED1B1033A4675~20120910~052846.XML" [noeol][dos] 3L, 2170C
A bad file will look like this:
^@^@^@^@^@^@^@^@^@^@^@^@^@...
"CT~DCA~FVT~8176967~ONTREP1~2~P~1100~DECA1MR0-01~XED1B1045B6072~20120904~043209.XML" [noeol] 1L, 2170C
The caret/at sign combo is vi's interpretation of that string, I guess, but it is in fact an empty string. Using -z seems to work on a single file:
X=CT~DCA~FVT~8176967~ONTREP1~2~P~1100~DECA1MR0-01~XED1B1045B6072~20120904~043209.XML
if [ ! -z $X ]
then
echo "$X empty"
else
echo "$X not empty"
fi
CT~DCA~FVT~8176967~ONTREP1~2~P~1100~DECA1MR0-01~XED1B1045B6072~20120904~043209.XML empty
But the same code tells me that all 900 files on my EMC mass filer are empty, which is not true.
export OUT=/path/to/device
declare -a myArray
for f in "$OUT"/*ONTREP1*; do myArray+=( "${f#$OUT/}" ); done
for i in "${myArray[@]}"; do if [ ! -z $i ] ; then echo "$i empty"; else echo "$i not empty"; fi; done
NB: Pattern "ONTREP1" is to narrow down the faulty files to one shop floor computer name.
What am I missing?

You are missing that test -z string tests whether a string is empty (as opposed to test -s file, which tests whether a file exists and is non-empty). Furthermore, the ^@ characters in vim indicate NUL bytes (bytes with the value 0). It looks like these are binary data files, or maybe corrupted ones, but they are certainly not empty. An empty file in vim displays as all tildes (~) in the leftmost column :-)
Try running the file filename command on the good and bad files; the latter probably says "data" due to the NUL bytes.
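One way to act on this, sketched under the assumption that a bad file is non-empty and consists of nothing but NUL bytes. The helper name is_all_nul is made up for illustration and is not part of the original answer.

```shell
# Hypothetical helper: succeeds when the file is non-empty
# and every byte in it is NUL (value 0).
is_all_nul() {
    # -s is true only for an existing file larger than 0 bytes.
    [ -s "$1" ] || return 1
    # Delete the NUL bytes and count whatever is left over.
    remaining=$(tr -d '\000' < "$1" | wc -c)
    [ "$((remaining))" -eq 0 ]
}

# Example use: report suspect files in the current directory.
# If nothing matches, the pattern stays literal and fails the -s test.
for f in ./*ONTREP1*; do
    if is_all_nul "$f"; then
        echo "$f contains only NUL bytes"
    fi
done
```

This tests file content, whereas -z in the question only tested the file name as a string.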

Related

Script which will move non-ASCII files

I need help. I should write a script which will move all non-ASCII files from one directory to another. I got this code, but I don't know why it is not working.
#!/bin/bash
for file in "/home/osboxes/Parkhom"/*
do
if [ -eq "$( echo "$(file $file)" | grep -nP '[\x80-\xFF]' )" ];
then
if test -e "$1"; then
mv $file $1
fi
fi
done
exit 0

It's not clear which one you are after, but:
• To test if the variable $file contains a non-ASCII character, you can do:
if [[ $file == *[^[:ascii:]]* ]]; then
• To test if the file $file contains a non-ASCII character, you can do:
if grep -qP '[^[:ascii:]]' "$file"; then
So for example your code would look like:
for file in "/some/path"/*; do
if grep -qP '[^[:ascii:]]' "$file"; then
test -d "$1" && mv "$file" "$1"
fi
done

The first problem is that your first if statement has an invalid test clause. The -eq operator of [ needs to take one argument before and one after; your before argument is gone or empty.
The second problem is that I think the echo is redundant.
The third problem is that the file command always has ASCII output but you're checking for binary output, which you'll never see.
Using file is pretty smart for this application, although there are two ways you can go on this. file says a variety of things, and what you're interested in are data and ASCII, but not all files that don't identify as data are ASCII, and not all files that don't identify as ASCII are data. You might be better off going with the original idea of using grep, unless you need to support Unicode files. Your grep is a bit strange to me, so I don't know what your environment is, but I might try this:
#!/bin/bash
for file in "/home/osboxes/Parkhom"/*
do
# LC_ALL=C makes grep match raw bytes 0x80-0xFF rather than characters
if LC_ALL=C grep -qP '[\x80-\xFF]' "$file"; then
[ -e "$1" ] && mv "$file" "$1"
fi
done
The -q option means be quiet: only return a return code, don't show the matches. (It might be -s in your grep.) The return code is tested directly by the if statement (no need to use [ or test). The && in the next line is just a quick way of saying that if the left-hand side is true, then execute the right-hand side. You could also write this as an if statement if you find that clearer. [ is a synonym for test. Personally, if $1 is a directory and doesn't change, I'd check it once at the beginning of the script instead of on each file; it would be faster.
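Putting those suggestions together, here is a minimal sketch that checks the destination once up front and quotes every expansion. It swaps grep -P for tr and wc so that it does not depend on PCRE support in grep. The function names has_non_ascii and move_non_ascii, and their source and destination arguments, are assumptions for illustration.

```shell
# Hypothetical helper: succeeds when the file holds at least one byte
# outside the 7-bit ASCII range (that is, a byte in 0x80-0xFF).
has_non_ascii() {
    # In the C locale tr works on single bytes: delete every ASCII byte
    # (octal 000-177) and count what remains.
    left=$(LC_ALL=C tr -d '\000-\177' < "$1" | wc -c)
    [ "$((left))" -gt 0 ]
}

# Move every regular file with non-ASCII content from directory $1 to $2.
move_non_ascii() {
    # Check the destination once instead of once per file.
    [ -d "$2" ] || { echo "not a directory: $2" >&2; return 1; }
    for file in "$1"/*; do
        [ -f "$file" ] || continue
        if has_non_ascii "$file"; then
            mv -- "$file" "$2"
        fi
    done
}
```

In the script from the question this could be called as move_non_ascii /home/osboxes/Parkhom "$1".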

If you mean you want to know if something is not a plain text file then you can use the file command which returns information about the type of a file.
[[ ! $( file -b "$file" ) =~ (^| )text($| ) ]]
The -b simply tells it not to bother returning the filename.
The returned value will be something like:
ASCII text
HTML document text
POSIX shell script text executable
PNG image data, 21 x 34, 8-bit/color RGBA, non-interlaced
gzip compressed data, from Unix, last modified: Mon Oct 31 14:29:59 2016
The regular expression will check whether the returned file information includes the word "text" that is included for all plain text file types.
You can instead filter for specific file types like "ASCII text" if that is all you need.

Unexpected end of file in while loop in bash

I am trying to write a bash script that will do the following:
Take a directory or file as input (will always begin with /mnt/user/)
Search other mount points for same file or directory (will always begin with /mnt/diskx)
Return value
So, for example, the input will be "/mnt/user/my_files/file.txt". It will check whether "/mnt/disk1/my_files/file.txt" exists and will look on each disk in turn (disk2, disk3, etc.) until it finds the file or reaches disk20.
This is what I have so far:
#/user/bin/bash
var=$1
i=0
while [ -e $check_var = echo $var | sed 's:/mnt/user:/mnt/disk$i+1:']
do
final=$check_var
done
It's incomplete yes, but I am not that proficient in bash so I'm doing a little at a time. I'm sure my command won't work properly yet either but right now I am getting an "unexpected end of file" and I can't figure out why.

There are many issues here:
If this is the actual code you're getting "unexpected end of file" on, you should save the file in Unix format, not DOS format.
The shebang should be #!/usr/bin/bash or #!/bin/bash depending on your system
You have to assign check_var before running [ .. ] on it.
You have to use $(..) to expand a command
Variables like $i are not expanded in single quotes
sed can't add numbers
i is never incremented
the loop logic is inverted, it should loop until it matches and not while it matches.
You'd want to assign final after -- not in -- the loop.
Consider doing it in even smaller pieces, it's easier to debug e.g. the single statement sed 's:/mnt/user:/mnt/disk$i+1:' than your entire while loop.
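As a sketch of that single statement tried in isolation, with the quoting, command substitution and arithmetic issues from the list addressed. The sample path is the one from the question, and nothing here touches the filesystem.

```shell
var=/mnt/user/my_files/file.txt
i=0

# Double quotes let the shell expand $((i+1)) before sed sees the script,
# and $(...) captures sed's output in a variable.
check_var=$(echo "$var" | sed "s:/mnt/user:/mnt/disk$((i+1)):")
echo "$check_var"

# The same substitution without sed, using parameter expansion.
check_var2="/mnt/disk$((i+1))/${var#/mnt/user/}"
echo "$check_var2"
```

Both echo lines print /mnt/disk1/my_files/file.txt, which makes it easy to confirm the substitution before wrapping it in a loop.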
Here's a more canonical way of doing it:
#!/bin/bash
var="${1#/mnt/user/}"
for file in /mnt/disk{1..20}/"$var"
do
[[ -e "$file" ]] && final="$file" && break
done
if [[ $final ]]
then
echo "It exists at $final"
else
echo "It doesn't exist anywhere"
fi

the test -s for the "if" statement

I have a question and I would be grateful for an answer if somebody knows one.
OK, to the point. One of my scripts has the following expression, and it is not clear to me from the man page what effect it should produce:
if ! [[ -s "$the_file_to_check" ]] ; then echo "file is zero sized and not exist | is exist and zero sized | not zero sized and not exist | not zero sized or not exist | not exist (obviously zero sized)" ; fi
There is a separate check for existence (the -a operator), so why add another one?
And how does this logic work anyway?
I am a little bit lost in the definition.
P.S.
I need a check for emptiness but not existence. Thanks, everyone.

-s checks not only if the file exists, but also if it contains any data (that is, if it has a size greater than 0 bytes). This is more than the -a option (which is, in fact, a synonym of -e) does, which only tests if the file exists.
touch foo
[[ -a foo ]] && echo "File foo exists"
[[ -s foo ]] || echo "File foo exists, but is size 0 bytes"
(I have wondered what the rationale for -a is, since I'm not aware that it does anything different from -e.)

This if command simply prints the string file is zero sized and not exist | is exist and zero sized | not exist (obviously zero sized) if the file does not exist or if the file is zero sized.
If you want your if command to check only for the zero sized file and not file existence then you can do something like this :
if [[ $(du -h "$the_file_to_check" |cut -f 1) == "0" ]] ;then
echo "file is zero sized" ;
fi
But this if statement will print an error if the file does not exist. Make sure you execute it only when the file is present.
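If the goal is to report emptiness separately from existence, one option is to combine -e and -s instead of parsing du output. A minimal sketch, where the function name file_state is made up for illustration:

```shell
# Hypothetical helper: prints "missing", "empty" or "non-empty" for a path.
file_state() {
    if [ ! -e "$1" ]; then
        echo "missing"          # nothing exists at this path
    elif [ -s "$1" ]; then
        echo "non-empty"        # exists and is larger than 0 bytes
    else
        echo "empty"            # exists and is exactly 0 bytes
    fi
}
```

A check for emptiness alone would then be [ "$(file_state "$the_file_to_check")" = empty ], which does not print an error when the file is absent.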

Search for specific extension files (shellscript)

I am new to shell scripting, I have this script:
#!/bin/bash
path_file_conf=/fullpath/directory/*.conf
if [ -e "$path_file_conf" ];then
echo "Found file"
else
echo "No found file"
fi
The result is always "No found file" even if I have .conf files inside the /fullpath/directory/ folder.
May I know what part of the code is wrong?
Thanks in advance!

I would try something like this:
for filename in /fullpath/directory/*.conf
do
if [ -e "$filename" ] # If finds match...
then
echo "Found file"
echo
else
echo "No found file"
fi
done
I haven't tested so I'm not certain it works, but it will at least give you the overall strategy.

The expression:
path_file_conf=/fullpath/directory/*.conf
May match multiple path names. The assignment itself does not expand the wildcard, so $path_file_conf holds the literal pattern; when it is later expanded unquoted, it may become, for example:
/fullpath/directory/foo1.conf /fullpath/directory/foo2.conf
The conditional:
if [ -e "$path_file_conf" ]; then
Checks for the existence of a single file. Inside double quotes the pattern is not expanded at all, so the test looks for a file literally named *.conf, and the condition fails even though matching files exist.
You could check this way. If the path doesn't expand, it will fail and exit. If it finds at least one good path, it will succeed and exit.
for pf in $path_file_conf ; do
if [ -e "$pf" ] ; then
echo "Found"
break
else
echo "Not found"
fi
done

The line causing trouble is:
path_file_conf=/full/path/directory/*.conf
The shell does not do wild-card expansion in a variable assignment, and the quoted "$path_file_conf" in the test is not expanded either, so (except in the unusual circumstance of having a file called *.conf with an asterisk) the -e test fails. Bash has an option to generate an error when a wild card fails to match (shopt -s failglob); I would never use it.
You can use:
path_file_conf=( /full/path/directory/*.conf )
This gives you an array with the names of the files as the elements of the array. However, if there are no files that match, it gives you the name as written as the only element of the array.
From there, you can check each file in turn:
for conf_file in "${path_file_conf[@]}"
do
if [ -e "$conf_file" ]
then echo "Found file $conf_file"
else echo "No such file as $conf_file"
fi
done
You can determine the number of names with ${#path_file_conf[@]}, but remember that 1 could indicate a real file or a non-existent file.
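To sidestep that ambiguity, here is a sketch that loops over the pattern and counts only the names that really exist. The function name count_conf is an assumption for illustration.

```shell
# Hypothetical helper: prints how many existing *.conf entries are in $1.
count_conf() {
    n=0
    for f in "$1"/*.conf; do
        # An unmatched pattern stays literal and fails the -e test,
        # so it is never counted.
        [ -e "$f" ] && n=$((n + 1))
    done
    echo "$n"
}

if [ "$(count_conf /fullpath/directory)" -gt 0 ]; then
    echo "Found file"
else
    echo "No found file"
fi
```

Because the count is taken after the -e test, a result of 0 always means that no matching file exists.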

String contains in Bash that is a directory path

I am writing an SVN script that will export only changed files. In doing so I only want to export the files if they don't contain a specific file.
So, to start out I am modifying the script found here.
I found a way to check if a string contains using the functionality found here.
Now, when I try to run the following:
filename=`echo "$line" |sed "s|$repository||g"`
if [ ! -d $target_directory$filename ] && [[!"$filename" =~ *myfile* ]] ; then
fi
However I keep getting errors stating:
/home/home/myfile: "no such file or directory"
It appears that BASH is treating $filename as a literal. How do I get it so that it reads it as a string and not a path?
Thanks for your help!

You have some syntax issues (a shell script linter can weed those out):
You need a space after "[[", otherwise it'll be interpreted as a command (giving an error similar to what you posted).
You need a space after the "!", otherwise it'll be considered part of the operand.
You also need something in the then clause, but since you managed to run it, I'll assume you just left it out.
You combined two different answers from the substring thing you posted, [[ $foo == *bar* ]] and [[ $foo =~ .*bar.* ]]. The first uses a glob, the second uses a regex. Just use [[ ! $filename == *myfile* ]]
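The difference between the two operators, as a minimal bash sketch. The sample path and the variable names glob_result and regex_result are invented for illustration.

```shell
filename=/trunk/src/myfile.txt

# == with an unquoted right-hand side performs a glob (pattern) match.
if [[ $filename == *myfile* ]]; then glob_result=match; else glob_result="no match"; fi

# =~ performs an extended regex match; it is unanchored, so no .* is needed.
if [[ $filename =~ myfile ]]; then regex_result=match; else regex_result="no match"; fi

# Negated glob test, with the spaces that [[ and ! require.
if [[ ! $filename == *myfile* ]]; then
    echo "export $filename"
else
    echo "skip $filename"
fi
```

With this sample path both tests match, so the last statement prints the skip line.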
