Folder listing with gsutil with condition - bash

I have this: gsutil ls -d gs://mystorage/*123*,
which gives me all files matching the pattern "123".
I wonder whether I could do this with a condition like >123 and <127, to grab all files whose names contain 124, 125, or 126.

Other than *, gsutil supports a few more wildcards, such as ? (any single character) and [...] (any one character from a set or range).
You can use these wildcards to match the names of your files, but keep in mind that you are working with strings and characters rather than numbers, so the solution is not very straightforward. Here is a guide using regexp that better explains how to work with digits in a general way.
For your specific question, you would end up with something like:
gsutil ls -d gs://mystorage/*12[456]*
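The same character-class idea can be tried locally, since shell globs also read [456] (or the range [4-6]) as "one character from this set". This is only a sketch on a scratch directory with made-up file names (report_123.txt and so on); it does not talk to Cloud Storage.

```shell
#!/usr/bin/env bash
# Local analogue of the gsutil pattern: a bracket expression matches ONE character.
# The scratch directory and the report_NNN.txt names are made up for illustration.
tmp=$(mktemp -d)
for n in 123 124 125 126 127; do
    touch "$tmp/report_$n.txt"
done

# "12" followed by one of 4, 5, 6 selects 124..126 but not 123 or 127.
matches=$(cd "$tmp" && printf '%s\n' *12[4-6]*)
echo "$matches"

rm -r "$tmp"
```

For ranges that cross a digit boundary (say 98 to 103) a single bracket expression is not enough, and you would list several patterns instead.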

Related

Bash glob, how to OR over strings of non unit length?

I have in a directory a bunch of files. Each file's basename ends with a two digit number and a letter, such as file_01A.txt, file_03B.txt, file_13A.txt.
In a terminal using bash (I assume, working on a mac osx) I use
ls *01*[AB]*.txt
returns all files such as 01A and 01B. This makes sense to me.
ls *02*[AB]*.txt
returns similarly all files such as 02A and 02B.
Now I want to return all files 01A, 01B, 02A, 02B. Hence I want something like:
ls *(01 or 02)*[AB]*.txt
Attempt 1: I tried with | but that throws an error.
Attempt 2: ls *[01,02]*[AB]*.tex but that gives the 03 files too, since I assume it is interpreting the 01 and 02 as individual matches.
Attempt 3: ls *["01","02"]*[AB]*.tex is the same again.
It's not hard to articulate a single wildcard which matches your requirement.
ls *0[12]*[AB]*.tex
In the general case, use multiple wildcards if you can't articulate a single one. Notice that the shell expands them in the order you write them, and if they both match some files, there will be duplicates in the expansion.
ls *01*[AB]*.tex *02*[AB]*.tex
You seem to be confused about what the metacharacters mean. * matches any string, ? matches any single character, and [abc] matches any one character which is listed between the square brackets. [!abc] matches a single character which is not a, b, or c. Bash also supports an extension called brace expansion, where foo{bar,quux} is basically an abbreviation of foobar fooquux. Your attempt could thus be rearticulated as
ls *{01,02}*[AB]*.tex
though the repeated prefix 0 is obviously redundant, and would better be left outside the braces, and then you might as well switch back to straight square brackets.
There is also a separate extended globbing syntax which allows for more elaborate wildcards. See the reference manual for details.
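To see the difference between the bracket expression, brace expansion, and the over-matching attempt side by side, here is a small sketch. It assumes bash and uses a scratch directory with file names modelled on the question (with the .tex extension used in the answer).

```shell
#!/usr/bin/env bash
# Scratch files named like the question's examples (illustrative only).
tmp=$(mktemp -d)
cd "$tmp" || exit 1
touch file_01A.tex file_01B.tex file_02A.tex file_03B.tex file_13A.tex

# One bracket expression: "0" followed by one of 1 or 2.
bracket=$(printf '%s\n' *0[12]*[AB]*.tex)

# Brace expansion: bash first rewrites this into two separate globs,
# *01*[AB]*.tex and *02*[AB]*.tex, and then expands each in turn.
brace=$(printf '%s\n' *{01,02}*[AB]*.tex)

# Why attempt 2 over-matches: [01,02] is a SET of single characters
# (0, 1, 2 and the comma), so 03 and 13 match as well.
set_of_chars=$(printf '%s\n' *[01,02]*[AB]*.tex)

echo "bracket: $bracket"
echo "brace:   $brace"
echo "set:     $set_of_chars"

cd / && rm -r "$tmp"
```

The first two patterns list only the 01 and 02 files, while the third lists all five.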

Python: Search for String in Filenames

I have been trying to search the root file system for a certain string (in the middle of the filename or wherever). I read about grep and I have tried this code here:
grep -rnw /home/pi/music -e "Maroon"
and something strange happens, there are three filenames with Maroon in them (same capitalization and spacing), but only two show up in the terminal. Any ideas why that is? Are there any other, easier ways to do this?
I would also like to say that I saw this StackOverflow post here, but I could not get it to work. I believe that was focusing on specific filenames, while I would like to do a general search.
All help is very much appreciated!
grep reads through the files on your disk, and searches for the word "Maroon".
What I think you want (when searching for file names) is:
find /home/pi/music -iname "*maroon*"
This will display all files that are named *maroon* (case insensitive). If you want case sensitive, take a look at -name.
man find
Will list all options for find.
The correct (or rather, the more common) way to search for files matching a certain pattern is to use the find command, like this:
find /home/pi/music -type f -iname "*maroon*" -ls
-type limits the search to a particular type, in this case f for regular files (so it will ignore directories, pipes, sockets, etc.)
-iname does a case-insensitive name search
-ls lists the files found.
grep is used to search within files for matching content.
You want to use find to search for filenames
find /home/pi/music -maxdepth 1 -name \*Maroon\*
This will find files whose names contain the string. You need to quote or escape the wildcards, so the shell doesn't expand them before find sees them. -maxdepth 1 restricts the search to the top level of that directory; leave it out to search subdirectories as well.
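The difference between the two tools is easy to reproduce on a scratch tree. The sketch below uses made-up file names and contents; only find and grep themselves come from the answers above.

```shell
#!/usr/bin/env bash
# Scratch tree; file names and contents are made up for illustration.
tmp=$(mktemp -d)
mkdir "$tmp/albums"
echo "track list"      > "$tmp/Maroon 5 - Sugar.mp3"
echo "track list"      > "$tmp/albums/maroon5_live.mp3"
echo "I like Maroon 5" > "$tmp/notes.txt"

# find matches on the file NAME (case-insensitively with -iname).
by_name=$(find "$tmp" -type f -iname '*maroon*')

# grep -rl matches on file CONTENT and prints the names of files containing it.
by_content=$(grep -rl 'Maroon' "$tmp")

echo "matched by name:"
echo "$by_name"
echo "matched by content:"
echo "$by_content"

rm -r "$tmp"
```

Here find reports the two .mp3 files, while grep reports only notes.txt, which is why grep can appear to "miss" files whose names contain the word.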

Using bash to list files with a certain combination of characters

So I have a directory with ~50 files, and each contain different things. I often find myself not remembering which files contain what. (This is not a problem with the naming -- it is sort of like having a list of programs and not remembering which files contain conditionals).
Anyways, so far, I've been using
cat * | grep "desiredString"
for a string that I know is in there. However, this just gives me the lines which contain the desired string. This is usually enough, but I'd like it to give me the file names instead, if at all possible.
How could I go about doing this?
It sounds like you want grep -l, which will list the files that contain a particular string. You can also just pass the filename arguments directly to grep and skip cat.
grep -l "desiredString" *
In the directory containing the files among which you want to search:
grep -rn "desiredString" .
This lists every line matching "desiredString" in the current directory and its subdirectories, with the file name, the line number, and the matching line.
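A quick way to compare the two forms is to run them on a few throwaway files. In this sketch the scripts a.sh, b.sh and c.sh are hypothetical stand-ins for the asker's files, and "if " plays the role of "desiredString".

```shell
#!/usr/bin/env bash
# Three throwaway scripts; only a.sh and c.sh contain a conditional.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
printf 'if x; then\n  echo yes\nfi\n' > a.sh
printf 'echo hello\n' > b.sh
printf 'for i in 1 2; do\n  if y; then echo $i; fi\ndone\n' > c.sh

# -l prints each matching file name once, without the matching lines.
with_names=$(grep -l 'if ' *)

# -n prints file name, line number and the matching line itself.
with_lines=$(grep -n 'if ' *)

echo "$with_names"
echo "$with_lines"

cd / && rm -r "$tmp"
```

The first command prints just a.sh and c.sh; the second adds the line number and text of each match.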

How to delete files like 'Incoming11781rKD'

I have a programme that generates files like "Incoming11781Arp". The name always starts with Incoming, followed by 5 digits and then 3 characters that can be upper-case letters, lower-case letters, digits, or _ in any combination, like Incoming11781_pi or Incoming11781rKD.
How can I delete them using a script run from a cron job please? I've tried -
#!/bin/bash
file=~/Mail/Incoming******
rm "$file";
but it failed saying that there was no matching file or directory.
You mustn't double-quote the variable reference for pathname expansion to occur - if you do, the wildcard characters are treated as literals.
Thus:
rm $file
Caveat: ~/Mail/Incoming****** doesn't work the way you think it does and will potentially match more files than intended, as it is equivalent to ~/Mail/Incoming*, meaning that any file that starts with Incoming will match.
To match only Incoming followed by a fixed number of characters, use one ? per character: ~/Mail/Incoming?????? matches exactly 6 characters, as @Jidder suggests in a comment. Names like the ones in the question (5 digits plus 3 more characters) need 8, i.e. ~/Mail/Incoming????????.
Note that you could make your glob (pattern) even more specific:
file=~/Mail/Incoming[0-9][0-9][0-9][0-9][0-9][[:alnum:]_][[:alnum:]_][[:alnum:]_]
See the bash manual for a description of pathname expansion and pattern syntax: http://www.gnu.org/software/bash/manual/bashref.html#index-pathname-expansion.
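Put together, a script along these lines would delete only the intended files. This is a sketch, not a drop-in cron job: it runs against a scratch directory standing in for ~/Mail, the file names are invented, and it uses [[:alnum:]_] for the last three positions so that digits are allowed there, as the question describes.

```shell
#!/usr/bin/env bash
# Scratch directory standing in for ~/Mail; file names are illustrative.
tmp=$(mktemp -d)
touch "$tmp/Incoming11781rKD" "$tmp/Incoming11781_pi" \
      "$tmp/IncomingNotes" "$tmp/Incoming123456"

# With nullglob, a glob that matches nothing expands to nothing
# instead of being passed to rm as a literal string.
shopt -s nullglob

# The glob stays OUTSIDE the quotes so the shell expands it; an array keeps
# each matched name as one word even if it contains spaces.
files=( "$tmp"/Incoming[0-9][0-9][0-9][0-9][0-9][[:alnum:]_][[:alnum:]_][[:alnum:]_] )

if (( ${#files[@]} > 0 )); then
    rm -- "${files[@]}"
fi

remaining=$(ls "$tmp")
echo "$remaining"

rm -r "$tmp"
```

Only the two names with five digits followed by three more characters are removed; IncomingNotes and Incoming123456 are left alone.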
You can achieve the same effect with the find command...
$ directory=~/Mail/
$ file_pattern='Incoming*'
$ find "${directory}" -name "${file_pattern}" -delete
The first two lines define the directory and the file pattern separately. The tilde must stay unquoted, otherwise it is not expanded to your home directory. The find command then deletes every matching file in that directory and its subdirectories.
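The find variant can be made as strict as the glob. The following sketch assumes a scratch directory in place of ~/Mail and invented file names; -maxdepth 1 and -type f are additions of mine to keep find out of subdirectories and away from directories.

```shell
#!/usr/bin/env bash
# Scratch directory standing in for ~/Mail; names are illustrative.
tmp=$(mktemp -d)
mkdir "$tmp/archive"
touch "$tmp/Incoming11781rKD" "$tmp/Incoming11781_pi" \
      "$tmp/Inbox" "$tmp/archive/Incoming00000abc"

# The pattern is quoted so that find, not the shell, interprets the wildcards.
# Five digits, then exactly three more characters of any kind.
find "$tmp" -maxdepth 1 -type f \
     -name 'Incoming[0-9][0-9][0-9][0-9][0-9]???' -delete

left=$(find "$tmp" -type f)
echo "$left"

rm -r "$tmp"
```

Inbox and the file under archive/ survive: the first does not match the pattern, the second sits below the depth limit.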

How to rename files keeping a variable part of the original file name

I'm trying to make a script that will go into a directory and run my own application with each file matching a regular expression, specifically Test[0-9]*.txt.
My input filenames look like this TestXX.txt. Now, I could just use cut and chop off the Test and .txt, but how would I do this if XX wasn't predefined to be two digits? What would I do if I had Test1.txt, ..., Test10.txt? In other words, How would I get the [0-9]* part?
Just so you know, I want to be able to make a OutputXX.txt :)
EDIT:
I have files with filename Test[0-9]*.txt and I want to manipulate the string into Output[0-9]*.txt
Would something like this help?
#!/bin/bash
for f in Test*.txt ;
do
process < "$f" > "${f/Test/Output}"
done
Bash Shell Parameter Expansion
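As a self-contained illustration of the ${f/Test/Output} substitution, here is a sketch in which tr stands in for the asker's own process program (an assumption, since that program is not shown) and the Test files are created in a scratch directory.

```shell
#!/usr/bin/env bash
# "process" in the answer is the asker's own program; tr stands in for it here.
tmp=$(mktemp -d)
cd "$tmp" || exit 1
for n in 1 2 10; do
    echo "data $n" > "Test$n.txt"
done

for f in Test*.txt; do
    out=${f/Test/Output}     # replace the first "Test" in the name with "Output"
    tr '[:lower:]' '[:upper:]' < "$f" > "$out"
done

count=$(ls Output*.txt | wc -l | tr -d ' ')
sample=$(cat Output10.txt)
echo "$count output files; Output10.txt contains: $sample"

cd / && rm -r "$tmp"
```

Because the substitution only touches the literal prefix, the numeric part is carried over unchanged however many digits it has.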
A good tutorial on regexes in bash is here. Summarizing, you need something like:
if [[ $filenamein =~ ^Test([0-9]*)\.txt$ ]]; then
filenameout="Output${BASH_REMATCH[1]}.txt"
and so on. The key is that, when you perform the =~ regex match, the sub-matches for the parenthesized groups in the RE are stored in the array BASH_REMATCH (the [0] entry is the whole match, [1] the first parenthesized group, etc.). Note the spaces inside [[ ]], the unquoted regex (a quoted one is matched as a literal string), and the absence of spaces around = in the assignment.
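A complete loop built on this idea might look like the sketch below. It assumes bash, keeps the regex in a variable (a common way to avoid quoting surprises), and uses [0-9]+ instead of [0-9]* so that at least one digit is required; the three input names are just examples.

```shell
#!/usr/bin/env bash
# The regex lives in a variable and is used UNQUOTED on the right of =~;
# quoting it there would make bash match it as a literal string.
re='^Test([0-9]+)\.txt$'

results=()
for filenamein in Test1.txt Test10.txt Testing.txt; do
    if [[ $filenamein =~ $re ]]; then
        # BASH_REMATCH[0] is the whole match, [1] the first group (the digits).
        results+=( "Output${BASH_REMATCH[1]}.txt" )
    else
        results+=( "skip:$filenamein" )
    fi
done

printf '%s\n' "${results[@]}"
```

Test1.txt and Test10.txt are mapped to Output1.txt and Output10.txt, while Testing.txt is rejected because it has no digits after Test.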
You need to use rounded brackets around the part you want to keep.
i.e. "Test([0-9]*).txt"
The syntax for replacing these bracketed groups varies between programs, but you'll probably find you can use \1 , something like this:
s/Test\([0-9]*\)\.txt/Output\1.txt/
If you're using a unix shell, then 'sed' might be your best bet for performing the transformation.
http://www.grymoire.com/Unix/Sed.html#uh-4
Hope that helps
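With sed's default (basic) regular expressions, groups are written \( \) and \1 refers back to the first group. A minimal sketch that only transforms the names as text, using two example file names:

```shell
#!/usr/bin/env bash
# Feed example names to sed; \([0-9]*\) captures the digits, \1 reuses them.
new=$(printf '%s\n' Test1.txt Test10.txt |
      sed 's/^Test\([0-9]*\)\.txt$/Output\1.txt/')
echo "$new"
```

This prints the new names only; to rename or create files you would still use the result in a loop with mv or a redirection.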
for file in Test[0-9]*.txt;
do
num=${file//[^0-9]/}
process "$file" > "Output${num}.txt"
done

Resources