I currently have a commonly used grep command:
grep -r -w <SEARCH> --include \*.c --include \*.cpp --include \*.h
I want to have this as an alias (say, grepc), but be able to change SEARCH in the middle. Is it possible to have it as a variable, and if so, how would I define and call it?
No need to pass an argument for this use case:
alias xyz="grep -r --include='*.'{c,h,cpp} -w"
xyz 'whatever'
When arguments need to be passed, use a function instead. See this Q&A on Unix Stack Exchange for a discussion of aliases versus functions.
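For the original grepc use case, where the search term sits in the middle of the command, a function does the job (the function name and the optional directory argument are just illustrative choices):

```shell
# grepc PATTERN [DIR]: recursive whole-word search in C/C++ sources.
# DIR defaults to the current directory.
grepc() {
    grep -r -w "$1" --include '*.c' --include '*.cpp' --include '*.h' "${2:-.}"
}
```

Call it like `grepc some_function` or `grepc some_function src/`.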
I am trying to upload many thousands of files to Google Cloud Storage, with the following command:
gsutil -m cp *.json gs://mybucket/mydir
But I get this error:
-bash: Argument list too long
What is the best way to handle this? I can obviously write a bash script to iterate over different numbers:
gsutil -m cp 92*.json gs://mybucket/mydir
gsutil -m cp 93*.json gs://mybucket/mydir
gsutil -m cp ...*.json gs://mybucket/mydir
But the problem is that I don't know in advance what my filenames are going to be, so writing that command isn't trivial.
Is there either a way to handle this with gsutil natively (I don't think so, from the documentation), or a way to handle this in bash where I can list say 10,000 files at a time, then pipe them to the gsutil command?
Eric's answer should work, but another option would be to rely on gsutil's built-in wildcarding, by quoting the wildcard expression:
gsutil -m cp "*.json" gs://mybucket/mydir
To explain more: The "Argument list too long" error is coming from the shell, which has a limited size buffer for expanded wildcards. By quoting the wildcard you prevent the shell from expanding the wildcard and instead the shell passes that literal string to gsutil. gsutil then expands the wildcard in a streaming fashion, i.e., expanding it while performing the operations, so it never needs to buffer an unbounded amount of expanded text. As a result you can use gsutil wildcards over arbitrarily large expressions. The same is true when using gsutil wildcards over object names, so for example this would work:
gsutil -m cp "gs://my-bucket1/*" gs://my-bucket2
even if there are a billion objects at the top-level of gs://my-bucket1.
If your filenames are safe from newlines you could use gsutil cp's ability to read from stdin like
find . -maxdepth 1 -type f -name '*.json' | gsutil -m cp -I gs://mybucket/mydir
or, if you're not sure whether your names are safe and your find and xargs support it, you could do
find . -maxdepth 1 -type f -name '*.json' -print0 | xargs -0 -I {} gsutil -m cp {} gs://mybucket/mydir
Here's a way you could do it, using xargs to limit the number of files passed to gsutil at once. Null bytes are used to prevent problems with spaces or newlines in the filenames.
printf '%s\0' *.json | xargs -0 sh -c 'copy_all () {
    gsutil -m cp "$@" gs://mybucket/mydir
}
copy_all "$@"' sh
Here we define a function which places the file arguments in the right spot in the gsutil command (the trailing sh fills in $0 so no filename is lost). xargs invokes this shell the minimum number of times needed to process all arguments, passing as many filename arguments as possible each time.
Alternatively you can define the function separately and export it, so the child bash invocations can see it (this is bash-specific):
copy_all () {
    gsutil -m cp "$@" gs://mybucket/mydir
}
export -f copy_all
printf '%s\0' *.json | xargs -0 bash -c 'copy_all "$@"' bash
I wanted to write a short script with the following structure:
find the right folders
cd into them
replace an item
So my problem is that I get the right folders from find, but I don't know how to perform the action for every line find gives me. I tried it with a for loop like this:
for item in $(find command)
do magic for item
done
but the problem is that this command will print the relative pathnames, and if there is a space within my path it will split the path at this point.
I hope you understood my problem and can give me a hint.
You can run commands with the -exec option of find directly:
find . -name some_name -exec your_command {} \;
One way to do it is:
find command -print0 |
while IFS= read -r -d '' item ; do
... "$item" ...
done
-print0 and read ... -d '' cause the NUL character to be used to separate paths, and ensure that the code works for all paths, including ones that contain spaces and newlines. Setting IFS to empty and using the -r option to read prevents the paths from being modified by read.
Note that the while loop runs in a subshell, so variables set within it will not be visible after the loop completes. If that is a problem, one way to solve it is to use process substitution instead of a pipe:
while IFS= ...
...
done < <(find command -print0)
Another option, if you have got Bash 4.2 or later, is to use the lastpipe option (shopt -s lastpipe) to cause the last command in pipelines to be run in the current shell.
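A minimal sketch of the lastpipe variant; note that it only takes effect when job control is off, which is the default in non-interactive scripts (the mktemp setup is just for demonstration):

```shell
#!/bin/bash
shopt -s lastpipe           # bash 4.2+; has no effect while job control is on

dir=$(mktemp -d)
touch "$dir/a" "$dir/b" "$dir/c"

count=0
find "$dir" -maxdepth 1 -type f -print0 | while IFS= read -r -d '' f; do
    count=$((count + 1))
done

# Without lastpipe the loop would run in a subshell,
# and count would still be 0 at this point.
echo "Found $count files"
```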
If the pattern you want to find is simple enough and you have bash 4 you may not need find. In that case, you could use globstar instead for recursive globbing:
#!/bin/bash
shopt -s globstar
for directory in **/*pattern*/; do
(
cd "$directory"
do stuff
)
done
The parentheses make each operation happen in a subshell. That may have performance cost, but usually doesn't, and means you don't have to remember to cd back each time.
If globstar isn't an option (because your find instructions are not a simple pattern, or because you don't have a shell that supports it) you can use find in a similar way:
find . -whatever -exec bash -c 'cd "$1" && do stuff' _ {} \;
You could use + instead of ; to pass multiple arguments to bash each time, but doing one directory per shell (which is what ; would do) has similar benefits and costs to using the subshell expression above.
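For completeness, here is what the + form could look like with a loop inside the inline script, so each bash invocation handles a whole batch of directories (the mktemp setup and pwd are placeholders for the real layout and work):

```shell
root=$(mktemp -d)
mkdir -p "$root/alpha-pattern" "$root/pattern-beta" "$root/unrelated"

# Each bash instance loops over the batch of directories find handed it;
# replace pwd with the real work.
out=$(find "$root" -type d -name '*pattern*' -exec bash -c '
    for dir do
        ( cd "$dir" && pwd )
    done
' bash {} +)
echo "$out"
```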
Here's how one might list all files matching a pattern in bash:
ls *.jar
How to list the complement of a pattern? i.e. all files not matching *.jar?
Use egrep-style extended pattern matching.
ls !(*.jar)
This is available starting with bash-2.02-alpha1.
Must first be enabled with
shopt -s extglob
As of bash-4.1-alpha there is a config option to enable this by default.
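A quick demonstration in a scratch directory (the filenames are arbitrary):

```shell
shopt -s extglob
cd "$(mktemp -d)"
touch app.jar lib.jar notes.txt readme.md

# !(*.jar) matches every name that does NOT match *.jar
out=$(ls !(*.jar))
echo "$out"     # notes.txt and readme.md
```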
ls | grep -v '\.jar$'
for instance.
Little known bash expansion rule:
ls !(*.jar)
With an appropriate version of find, you could do something like this, but it's a little overkill:
find . -maxdepth 1 ! -name '*.jar'
find finds files. The . argument tells it to start searching from ., i.e., the current directory. -maxdepth 1 tells it to search only one level deep, i.e., the current directory. ! -name '*.jar' matches all files whose names don't match the shell pattern *.jar.
Like I said, it's a little overkill for this application, but if you remove the -maxdepth 1, you can then recursively search for all non-jar files or what have you easily.
POSIX defines non-matching bracket expressions, so we can let the shell expand the file names for us.
ls *[!j][!a][!r]
This has some quirks, though: it never matches names shorter than three characters, and it also hides unrelated files such as foo.tar (because its second-to-last character is a). But at least it is compatible with just about any Unix shell.
If your ls supports it (man ls), use the --hide=<PATTERN> option; quote the pattern so the shell doesn't expand it first. In your case:
$> ls --hide='*.jar'
No need to parse the output of ls (which is a bad idea anyway), and it scales to hiding multiple types of files. At some point I needed to see what non-source, non-object, non-libtool-generated files were in a (cluttered) directory:
$> ls src --hide=*.{lo,c,h,o}
Worked like a charm.
Another approach can be using ls -I flag (Ignore-pattern).
ls -I '*.jar'
And if you want to exclude more than one file extension, separate them with a pipe |, like ls test/!(*.jar|*.bar). Let's try it:
$ mkdir test
$ touch test/1.jar test/1.bar test/1.foo
$ ls test/!(*.jar|*.bar)
test/1.foo
Looking at the other answers you might need to shopt -s extglob first.
One solution would be ls -1 | grep -v '\.jar$'
Some mentioned variants of this form:
ls -d *.[!j][!a][!r]
But that seems to work only in bash, while this seems to work in both bash and zsh:
ls -d *.[^j][^a][^r]
ls -I "*.jar"
-I, --ignore=PATTERN
do not list implied entries matching shell PATTERN
It works without having to run anything beforehand.
It also works inside watch's quotes: watch -d 'ls -I "*.gz"', unlike watch 'ls !(*.jar)', which produces: sh: 1: Syntax error: "(" unexpected
Note: for some reason CentOS requires quoting the pattern after -I, while Ubuntu does not.
I can't retrieve the way to define a shell alias (in bash) like this one :
alias suppr='/usr/bin/find . -name "*~" | xargs rm -f'
but with "*~" as a parameter of the alias.
I would like to use it like : suppr ".bak" or suppr "*.svn" etc...
(it's just a dummy example here)
Use a function:
suppr() {
    find . -name "$1" -print0 | xargs -0 rm -f
}
Call it with the pattern quoted, so the shell doesn't expand it before find sees it: suppr '*.bak'. The -print0/-0 pair keeps filenames with spaces or newlines from breaking xargs.
In general, functions are more flexible and safer to use than aliases. In fact, many people argue that functions should always be used instead of aliases.
Why not save your command as a script and put the script somewhere in your PATH? You can name the script file anything you like.
I could have sworn you could do the following:
ls *.{java, cpp}
but that does not seem to work. I know this answer is probably on the site somewhere but I couldn't find it via search.
For instance, if I want to be able to use the globbing with a find command, I would want to do something like
find . -name "*.{java,cpp}" | xargs grep -n 'TODO'
Is this possible without resorting to using the -o binary operator?
It is likely that you are seeing an error message such as this:
ls: cannot access *.{java: No such file or directory
ls: cannot access ,cpp}: No such file or directory
If that's the case, it's because of the space after the comma. Leave it out:
ls *.{java,cpp}
For future reference, it is more helpful to post error messages than to say "it's not working" (please don't take this personally; it's meant for everyone. I even do it myself too often).
ls *.{java,cpp} works just fine for me in bash...:
$ ls *.{java,cpp}
a.cpp ope.cpp sc.cpp weso.cpp
helo.java qtt.cpp srcs.cpp
Are you sure it's not working for you...?
find is different, but
find -E . -regex '.*\.(java|cpp)'
should do what you want (on some systems you may not need the -E, or you may need a -regextype option there instead; see man find on your specific system to find out).
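As a sketch, on GNU findutils the equivalent spelling would be (the mktemp setup is only for demonstration):

```shell
dir=$(mktemp -d)
touch "$dir/Main.java" "$dir/util.cpp" "$dir/notes.txt"

# BSD/macOS spelling: find -E "$dir" -regex '.*\.(java|cpp)'
# GNU spelling:
found=$(find "$dir" -regextype posix-extended -regex '.*\.(java|cpp)')
echo "$found"
```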
But this does work in Bash:
$ ls
a.h a.s main.cpp main.s
$ ls *.{cpp,h}
a.h main.cpp
Are you sure you're in Bash? If you are, maybe an alias is causing the issue: try /bin/ls *.{java,cpp} to make sure you don't call the aliased ls.
Or, just take out the spaces in your list inside the {} -- the space will cause an error because Bash will see *.{java, as one argument to ls, and it will see cpp} as a second argument.
For your particular example, this may also do what you want
grep -rn TODO . --include '*.java' --include '*.cpp'
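The reason the quoted form fails with find is that brace expansion is done by the shell before the command runs, so find never sees separate patterns. Left unquoted, that same expansion can generate the multiple --include options for you; a small sketch (the mktemp setup is only for demonstration):

```shell
dir=$(mktemp -d)
printf '// TODO: fix\n'   > "$dir/a.java"
printf '// TODO: later\n' > "$dir/b.cpp"
printf 'TODO in text\n'   > "$dir/c.txt"

# The shell expands --include='*.'{java,cpp} into
# --include=*.java --include=*.cpp before grep ever runs.
out=$(grep -rn 'TODO' "$dir" --include='*.'{java,cpp})
echo "$out"
```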