Loop in bash for reading repositories (folders) - bash

I have made this script which:
Clones all repositories from Bitbucket into the folder "temporaryprojects". To clone the repos, the script uses my "repolinks.csv", which is exactly what the name says: links to repos, saved as a text file :)
After every repo is cloned, the script searches for all .ttf files in every folder (repo) in "temporaryprojects" and saves the result (the paths to every ttf file) as TTF-Projects-PATHS.
The OTFINFO part reads the paths to the .ttf files (TTF-Projects-PATHS) and gives me specific info about those ttf files (family, subfamily, author etc.), which it saves as "TTF-Projects-INFO".
#!/bin/bash
cd /Users/krzysztofpaszta/temporaryprojects
for repo in $(cat /Users/krzysztofpaszta/repolinks.csv); do
    git clone "$repo"
    echo Repo cloned to /Users/krzysztofpaszta/temporaryprojects
done
# array with the repo names (repo_links, for example) + repo loop
echo Fetching the paths of all TTF files on this machine from the Boombit projects
find "/Users/krzysztofpaszta/temporaryprojects/" -name "*.ttf" > /Users/krzysztofpaszta/TTF-Projects-PATHS.csv
echo File paths fetched
while read in; do
    otfinfo --info filename="${fullfile##*/}" >> /Users/krzysztofpaszta/TTF-Projects-INFO.csv "$in"
done < /Users/krzysztofpaszta/TTF-Projects-PATHS.csv
echo Font file info fetched
rm -vrf /Users/krzysztofpaszta/temporaryprojects/*
echo Repo deleted
echo Repo deleted
Everything is working great, but now I am struggling with how to modify this script. As you can see, right now all the repos are downloaded and then all the info about them is saved in one file named TTF-Projects-INFO. What I need it to do is to search every repo individually and save the results as $repo.csv (the name of that one repo as a csv), and to keep doing that up to the last repo in "repolinks.csv".
I modified the script like that:
#!/bin/bash
cd /Users/krzysztofpaszta/temporaryprojects
for repo in $(cat /Users/krzysztofpaszta/repolinks.csv); do
    git clone "$repo"
    find "/Users/krzysztofpaszta/temporaryprojects/$repo" -name "*.ttf" > /Users/krzysztofpaszta/$repo.csv
    echo Repo cloned to /Users/krzysztofpaszta/temporaryprojects
    while read in; do
        otfinfo --info filename="${fullfile##*/}" >> /Users/krzysztofpaszta/TTF-Projects-INFO.csv "$in"
    done < /Users/krzysztofpaszta/TTF-Projects-PATHS.csv
But unfortunately it does essentially the same thing: it clones every repo into "temporaryprojects", then searches for all the .ttf file paths, then collects the info and saves it all into one file. I think it is just a pretty simple loop in bash, but I have no idea how to do it properly. Could someone give me a hint? I have been trying to modify the script, but no luck so far.
SAMPLES OF CSVs:
TTF-Projects-PATHS.csv:
/Users/krzysztofpaszta/temporaryprojects/project1/Fonts/SwallowFallsMixAllCyryl.ttf
/Users/krzysztofpaszta/temporaryprojects/project2/Fonts/KOMIKAZE.ttf
/Users/krzysztofpaszta/temporaryprojects/project2/Graphics/Fonts/Arial Unicode.ttf
/Users/krzysztofpaszta/temporaryprojects/project3/fonts/SwallowFallsMixAll.ttf
TTF-Projects-INFO.csv:
/Users/krzysztofpaszta/temporaryprojects/project1/Fonts/LiberationSans.ttf:Family: Liberation Sans
/Users/krzysztofpaszta/temporaryprojects/project1/Fonts/LiberationSans.ttf:Subfamily: Regular
/Users/krzysztofpaszta/temporaryprojects/project1/Fonts/LiberationSans.ttf:Full name: Liberation Sans
/Users/krzysztofpaszta/temporaryprojects/project1/Fonts/LiberationSans.ttf:PostScript name: LiberationSans
etc... project2, project3 with the same info about fonts.
repolinks.csv:
https://bitbucket.org/organisation/project1-settings https://bitbucket.org/organisation/crave-man https://bitbucket.org/organisation/adverts https://bitbucket.org/organisation/pipeline https://bitbucket.org/organisation/data https://bitbucket.org/organisation/async https://bitbucket.org/organisation/audio
etc..

If you want to use $repo.csv instead of the same file for all repos, change the output file in the appending redirection:
# old
>> /Users/krzysztofpaszta/TTF-Projects-INFO.csv
# new
>> /Users/krzysztofpaszta/"$repo".csv
The name of the directory is not the whole URL, just the last part, which you can extract in bash using parameter expansion:
dir=${repo##*/} # Remove everything up to the last slash.
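For example, with one of the URLs from repolinks.csv:
repo=https://bitbucket.org/organisation/project1-settings
dir=${repo##*/}
echo "$dir"    # prints: project1-settings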
Proper indenting helps understanding of the flow:
#!/bin/bash
cd /Users/krzysztofpaszta/temporaryprojects
for repo in $(cat /Users/krzysztofpaszta/repolinks.csv); do
    git clone "$repo"
    dir=${repo##*/}
    echo Repo cloned to /Users/krzysztofpaszta/temporaryprojects
    find /Users/krzysztofpaszta/temporaryprojects/"$dir" -name "*.ttf" > /Users/krzysztofpaszta/TTF-list-"$dir".csv
    while read in; do
        otfinfo --info filename="${fullfile##*/}" "$in" >> /Users/krzysztofpaszta/"$dir".csv
    done < /Users/krzysztofpaszta/TTF-list-"$dir".csv
done
I'm not sure what $fullfile is, so I left it as it was.

Related

How to stage specific lines to git

I want to be able to stage specific lines of code that match a pattern (MARKETING_VERSION in my case).
I've got an awk command that shows me the lines matching the pattern MARKETING_VERSION, but I don't know how to stage those lines to git.
awk '/MARKETING_VERSION/{print NR}' exampleFile.txt
The result in the terminal is:
1191
1245
How can I use this result to stage those specific lines of that file in git?
I know you can use git add -p, but I want to use this in a shell script, so I need a non-interactive version.
TIA
What git add -p <file> does is, very roughly, this:
tmpfile=$(mktemp)
tf2=$(mktemp)
tf3=$(mktemp)
git diff <file> > $tmpfile
while [ -s $tmpfile ]; do
    # extract first diff hunk from $tmpfile to $tf2 and the rest to $tf3
    # show you $tf2, ask if you want to include this hunk
    # (with options to edit the hunk, etc); repeat until ready
    # if you say to *add* the hunk, run: git apply --cached $tf2
    cat < $tf3 > $tmpfile
done
rm -f $tmpfile $tf2 $tf3
That is, git add -p uses git apply --cached (a specialized sub-variant of git apply --index that ignores the working tree copy of the file). The key takeaway you need, from the above, is this: There are three versions of the file!
The first one (completely ignored here) is frozen for all time and is in the HEAD commit.
The second one is in Git's index aka staging area. That's used by git diff above as the "old version".
The third one is in your working tree. That's used by git diff above as the "new version".
The patches that Git lets you take or skip are simply the result of comparing the "old" (index) and "new" (working tree) version. If you take some patch, Git updates the in-index copy by applying the patch.
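To see the three versions concretely (standard Git syntax; path/to/file is a placeholder):
git show HEAD:path/to/file     # version 1: frozen in the HEAD commit
git show :0:path/to/file       # version 2: the index / staging-area copy
cat path/to/file               # version 3: the working tree copy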
Hence, if there are some set of lines in the working tree version (say, lines 100 through 110 inclusive) that you'd like to use to replace some other set of lines (say, lines 90 through 92 inclusive) in the index version, the way to construct that is:
extract the index version;
scrape out lines 1-89 from the index version; concatenate lines 100-110 from the working tree version; concatenate lines 93-end from the index version, all into a temporary file;
replace the index copy with the temporary file.
To read the index version, use git show or git cat-file -p with the name of the index version of the file. If the file's name is path/to/file, the index version's name is :path/to/file (short for :0:path/to/file: we want the copy in slot zero; there must not be a copy in slots 1, 2, or 3 so that there is a copy in slot 0; you can simply attempt to read it from slot zero, and if that fails, assume the file either isn't in the index, or is conflicted).
Reading the working tree file (some select subset of lines) is left as an exercise, as is the concatenation part, and any error checking you wish to include.
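For instance, a minimal sketch of that extract-and-splice, using the hypothetical line numbers from the example (no error checking; $path stands for path/to/file):
path=path/to/file
tf=$(mktemp)
git show ":0:$path" | sed -n '1,89p'  >  "$tf"    # lines 1-89 of the index copy
sed -n '100,110p' "$path"             >> "$tf"    # lines 100-110 of the working tree copy
git show ":0:$path" | sed -n '93,$p'  >> "$tf"    # lines 93-end of the index copy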
Assuming the final resulting file is in a temporary file named $tf (as a shell variable), to update the index copy, you must first make sure an appropriate blob hash ID exists:
hash=$(git hash-object -w -t blob --path="$path" -- "$tf")
for instance (this assumes you want to run the usual .gitattributes filters, if any, and know that the path is $path). Then, if that goes well, use that hash ID with git update-index:
git update-index --cacheinfo "$mode,$hash,$path"
where $mode is either 100644 or 100755 as appropriate for the file. If you don't want to change the mode, you can read the previous mode with git ls-files --cached or similar. Otherwise, provided core.fileMode is true, read the mode from the working tree copy of the file to match the behavior of git add: convert "has any executable bit set" to 100755 and "has no executable bit set" to 100644. (When core.fileMode is false, which you can read with git config --get --type bool core.filemode, git add uses the existing mode for this add-patch case.)
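In shell form, that mode selection might look like this sketch (assuming $path and $hash from the steps above; git ls-files --stage prints the mode as its first field, and --type bool needs a reasonably recent Git):
if [ "$(git config --get --type bool core.filemode)" = "false" ]; then
    mode=$(git ls-files --stage -- "$path" | awk '{print $1}')    # keep the existing mode
elif [ -x "$path" ]; then
    mode=100755
else
    mode=100644
fi
git update-index --cacheinfo "$mode,$hash,$path"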

"for loop" to run rsync on relevant folders only

I have a folder containing around 650 folders (source), and I have generated a list of the relevant folders I want (final.txt).
I am trying to use a "for loop" to copy only the relevant sub-folders to a new location (target).
I keep getting the original content of the "source" copied to the "target".
I run:
for var in `cat final.txt` ; do rsync -ah $var source/ target/ ; done
I tried different syntax but can't seem to get what I need.
What am I doing wrong?
I expect only the folders whose names are in the final.txt list to be copied to the target (all "names" in the file are a single word, matching some of the folder names exactly).
OK, after messing around, I should have run this (it works):
for var in `cat final.txt` ; do rsync -ah source/$var target/ ; done
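(The original command passed $var and source/ as two separate source arguments, so rsync copied the whole contents of source/ every time.) For the record, a slightly more defensive variant of the working loop, assuming final.txt has one folder name per line; read -r and the quoting keep unusual names from being split:
while IFS= read -r var; do
    rsync -ah "source/$var" target/
done < final.txt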

SVN checkout files that have been committed within given time period

I'm writing a deployment bash script that will publish recent changes in source control to a different machine. I'm new to svn from the command line (have used it in development for years) and new to bash scripting.
I need a way to check out only files that have been modified recently. Something like this:
svn checkout svn://server/repo/project/trunk -mtime -1d4h
The idea being this would only check out files that have been committed within the last 28 hours.
You can't check out "changes"; you only get some state of some part of the repository (i.e. a "revision"), which will include all files existing in that node (subdirectory) at that revision (added or modified in any revision up to and including that revision).
The date format specification in Subversion doesn't allow a "relative date-time", only absolute values.
Scripts which export or save (outside the repository) all files changed in a revision or revision range exist and can be found on the Net ("Subversion Command Line Script to export changed files" is a good bash sample).
Consequences of the above notes:
You must supply a correct Subversion-style date or revision number as the starting point.
A relative date with a free-form specification can easily be constructed with the bash date command (-d "28 hours ago") and stored in a variable, which can be used as a parameter for electrictoolbox's script; see the sketch below.
Deploying files from the export directory to their final destination is the last part (heavily environment-specific, no suggestions here now).
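A hedged sketch of that date construction, assuming GNU date (the {DATE} revision syntax is standard Subversion):
SINCE=$(date -d "28 hours ago" "+%Y-%m-%d %H:%M")
svn log -r "{$SINCE}:HEAD" svn://server/repo/project/trunk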
I found something simpler to suit my needs, based on this:
How can I keep the original file [commit] timestamp on Subversion?
I check out or update a working copy of the project using --config-option config:miscellany:use-commit-times=yes, which sets the timestamps on the filesystem to the commit times. I then use a standard find command. For example:
#!/bin/bash
function listfn {
    while read -r file; do
        # skip anything inside the .svn metadata directories
        if [[ ! $file =~ \.svn ]]; then
            echo "$file"
        fi
    done
}
svn checkout --config-option config:miscellany:use-commit-times=yes svn://server/repo/project/trunk project-fordeploy
find project-fordeploy -mtime -1d4h | listfn
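One caveat: the -mtime -1d4h unit syntax is BSD find (macOS included); GNU find doesn't accept it. A rough GNU equivalent of "the last 28 hours" counts minutes instead:
find project-fordeploy -mmin -1680 | listfn    # 28 * 60 = 1680 minutes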

Adding a status (file integrity)check to a cbr cbz converting bash script

First post, so hi! Let me start by saying I'm a total noob regarding programming. I understand very basic stuff, but when it comes to checking exit codes (or whatever the adequate term is), I'm at a loss. Apparently my search-fu is really weak in this area; I guess it's a question of terminology.
Thanks in advance for taking the time to read this/answer my question!
Description: I found a script that converts/repacks .cbr files to .cbz files. These files are basically your average rar and zip files, renamed to another extension, as they are used by (comic)book applications such as ComicRack, QComicBook and whatnot. Surprisingly enough there are no cbr -> cbz converters out there. The advantage of .cbz, besides escaping the proprietary rar file format, is that one can store the metadata from Comic Vine with e.g. comictagger.
Issue: Sometimes the repackaging of the files doesn't end well, which would hopefully be alleviated by an integrity check & another go. I modified said script slightly to use p7zip, as it can both pack/unpack 7z, zip files and some others, i.e. great for options. p7zip can test an archive with:
7z t comicfile.cbz tmpworkingdir
I guess it's a matter of using if & else here(?) to check the integrity and then give it another go if there are any errors.
Question/tl;dr: What would be the "best"/adequate approach to add a file integrity check to the script below?
#!/bin/bash
#Source: http://comicrack.cyolito.com/forum/13-scripts/30013-cbr3cbz-rar-to-zip-conversion-for-linux
echo "Converting CBRs to CBZs"
# Set the "field separator" to something other than spaces/newlines" so that spaces
# in the file names don't mess things up. I'm using the pipe symbol ("|") as it is very
# unlikely to appear in a file name.
IFS="|"
# Set working directory where to create the temp dir. The user you are using must have permission
# to write into this directory.
# For performance reasons I'm using ram disk (/dev/shm/) in Ubuntu server.
WORKDIR="/dev/shm/"
# Set name for the temp dir. This directory will be created under WORDDIR
TEMPDIR="cbr2cbz"
# The script should be invoked as "cbr2cbz {directory}", where "{directory}" is the
# top-level directory to be searched. Just to be paranoid, if no directory is specified,
# then default to the current working directory ("."). Let's put the name of the
# directory into a shell variable called SOURCEDIR.
# Note: "$1" = "The first command line argument"
if test -z "$1"; then
SOURCEDIR=`pwd`
else
SOURCEDIR="$1"
fi
echo "Working from directory $SOURCEDIR"
# We need an empty directory to work in, so we'll create a temp directory here
cd "$WORKDIR"
mkdir "$TEMPDIR"
# and step into it
cd "$TEMPDIR"
# Now, execute a loop, based on a "find" command in the specified directory. The
# "-printf "$p|" will cause the file names to be separated by the pipe symbol, rather than
# the default newline. Note the backtics ("`") (the key above the tab key on US
# keyboards).
for CBRFILE in `find "$SOURCEDIR" -name "*.cbr" -printf "%p|"`; do
# Now for the actual work. First, extract the base file name (without the extension)
# using the "basename" command. Warning: more backtics.
BASENAME=`basename $CBRFILE ".cbr"`
# And the directory path for that file, so we know where to put the finished ".cbz"
# file.
DIRNAME=`dirname $CBRFILE`
# Now, build the "new" file name,
NEWNAME="$BASENAME.cbz"
# We use RAR file's name to create folder for unpacked files
echo "Processing $CBRFILE"
mkdir "$BASENAME"
# and unpack the rar file into it
7z x "$CBRFILE" -O"$BASENAME"
cd "$BASENAME"
# Lets ensure the permissions allow us to pack everything
sudo chmod 777 -R ./*
# Put all the extracted files into new ".cbz" file
7z a -tzip -mx=9 "$NEWNAME" *
# And move it to the directory where we found the original ".cbr" file
mv "$NEWNAME" $DIRNAME/"$NEWNAME"
# Finally, "cd" back to the original working directory, and delete the temp directory
# created earlier.
cd ..
rm -r "$BASENAME"
# Delete the RAR file also
rm "$CBRFILE"
done
# At the end we cleanup by removing the temp folder from ram disk
cd ..
echo "Conversion Done"
rm -r "$TEMPDIR"
[edit 2] I removed unrar as a dependency and use p7zip instead, as it can extract rar files.
You will need two checks:
7z t will test the integrity of the archive
You should also test the integrity of all the image files in the archive. You can use tools like ImageMagick for this.
A simple test would be identify file, but that might read only the header. I'd use: convert file -resize 5x5 png:- > /dev/null
This scales the image down to 5x5 pixels, converts it to PNG and then pipes the result to /dev/null (discarding it). For the scaling, the whole image has to be read. If this command fails with an error, something is wrong with the image file.
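Putting both checks together, a minimal sketch of the check-and-retry idea, meant to slot in after the "7z a" step and before the mv in the script above (verify_archive and the single-retry policy are my own hypothetical additions, not part of the original script):
# hypothetical helper: archive-level test plus a full decode of each image
verify_archive() {
    7z t "$1" > /dev/null || return 1
    for img in *.jpg *.jpeg *.png *.gif; do
        [ -f "$img" ] || continue
        convert "$img" -resize 5x5 png:- > /dev/null || return 1
    done
    return 0
}

if ! verify_archive "$NEWNAME"; then
    echo "Integrity check failed for $NEWNAME, repacking once"
    rm -f "$NEWNAME"
    7z a -tzip -mx=9 "$NEWNAME" *
    verify_archive "$NEWNAME" || echo "Still failing: $NEWNAME" >&2
fi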

Split large repo into multiple subrepos and preserve history (Mercurial)

We have a large base of code that contains several shared projects, solution files, etc in one directory in SVN. We're migrating to Mercurial. I would like to take this opportunity to reorganize our code into several repositories to make cloning for branching have less overhead. I've already successfully converted our repo from SVN to Mercurial while preserving history. My question: how do I break all the different projects into separate repositories while preserving their history?
Here is an example of what our single repository (OurPlatform) currently looks like:
/OurPlatform
---- Core
---- Core.Tests
---- Database
---- Database.Tests
---- CMS
---- CMS.Tests
---- Product1.Domain
---- Product1.Stresstester
---- Product1.Web
---- Product1.Web.Tests
---- Product2.Domain
---- Product2.Stresstester
---- Product2.Web
---- Product2.Web.Tests
==== Product1.sln
==== Product2.sln
All of those are folders containing VS projects, except for the solution files. Product1.sln and Product2.sln both reference all of the other projects. Ideally, I'd like to take each of those folders and turn them into separate Hg repos, and also add new repos for each solution (they would act as parent repos). Then, if someone was going to work on Product1, they would clone the Product1 repo, which contains Product1.sln and subrepo references to ReferenceAssemblies, Core, Core.Tests, Database, Database.Tests, CMS, and CMS.Tests.
So, it's easy to do this by just hg init'ing in the project directories. But can it be done while preserving history? Or is there a better way to arrange this?
EDIT::::
Thanks to Ry4an's answer, I was able to accomplish my goal. I wanted to share how I did it here for others.
Since we had a lot of separate projects, I wrote a small bash script to automate creating the filemaps and to generate the final bat script that actually does the conversion. What wasn't completely apparent from the answer is that the convert command needs to be run once for each filemap to produce a separate repository for each project. This script should be placed in the directory above an svn working copy that you have previously converted. I used the working copy since its file structure best matched what I wanted the final new hg repos to be.
#!/bin/bash
# this requires you to be in: /path/to/svn/working/copy/, and issue: ../filemaplister.sh ./
for filename in *
do
    extension=${filename##*.} # or: echo $filename | awk -F . '{print $NF}'
    if [ "$extension" == "sln" -o "$extension" == "suo" -o "$extension" == "vsmdi" ]; then
        base=${filename%.*}
        echo "#$base.filemap" >> "$base.filemap"
        echo "include $filename" >> "$base.filemap"
        echo "C:\Applications\TortoiseHgPortable\hg.exe convert --filemap $base.filemap ../hg-datesort-converted ../hg-separated/$base > $base.convert.output.txt" >> "MASTERGO.convert.bat"
    else
        echo "#$filename.filemap" >> "$filename.filemap"
        echo "include $filename" >> "$filename.filemap"
        echo "rename $filename ." >> "$filename.filemap"
        echo "C:\Applications\TortoiseHgPortable\hg.exe convert --filemap $filename.filemap ../hg-datesort-converted ../hg-separated/$filename > $filename.convert.output.txt" >> "MASTERGO.convert.bat"
    fi
done
mv *.filemap ../hg-conversion-filemaps/
mv *.convert.bat ../hg-conversion-filemaps/
This script looks at every file in an svn working copy and, depending on the type, either creates a new filemap file or appends to an existing one. The if is really just to catch misc Visual Studio files and place them into a separate repo. This is meant to be run in bash (cygwin in my case), but the actual convert command is run through the version of hg shipped with TortoiseHg, due to forking/process issues on Windows (gah, I know...).
So you run the MASTERGO.convert.bat file, which looks at your converted hg repo, and creates separate repos using the supplied filemap. After it is complete, there is a folder called hg-separated that contains a folder/repo for each project, as well as a folder/repo for each solution. You then have to manually clone all the projects into a solution repo, and add the clones to the .hgsub file. After committing, an .hgsubstate file is created and you're set to go!
With the example given above, my .hgsub file looks like this for "Product1":
Product1.Domain = /absolute/path/to/Product1.Domain
Product1.Stresstester = /absolute/path/to/Product1.Stresstester
Product1.Web = /absolute/path/to/Product1.Web
Product1.Web.Tests = /absolute/path/to/Product1.Web.Tests
Once I transfer these repos to a central server, I'll be manually changing the paths to be urls.
Also, there is no analog to the initial OurPlatform svn repo, since everything is separated now.
Thanks again!
This can absolutely be done. You'll want to use the hg convert command. Here's the process I'd use:
convert everything to a single hg repository using hg convert with a source type of svn and a dest type of hg (it sounds like you've already done this step)
create a collection of filemap files for use with hg convert's --filemap option
run hg convert with source type hg and dest type hg, with the source being the Mercurial repo created in step one, and do it once for each of the filemaps you created in step two (a sketch of both runs follows this list).
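A minimal sketch of those convert runs, with illustrative repository names:
# step 1: SVN -> one big Mercurial repository (skip if already done)
hg convert --source-type svn --dest-type hg svn://server/OurPlatform OurPlatform-hg

# step 3: hg -> hg, once per filemap, producing one broken-out repo per project
hg convert --filemap Core.filemap OurPlatform-hg Core-hg
hg convert --filemap Core.Tests.filemap OurPlatform-hg Core.Tests-hg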
The filemap syntax is shown in the hg help convert output, but here's the gist:
The filemap is a file that allows filtering and remapping of files and
directories. Comment lines start with '#'. Each line can contain one of
the following directives:
include path/to/file
exclude path/to/file
rename from/file to/file
So in your example your filemaps would look like this:
# this is Core.filemap
include Core
rename Core .
Note that if you have an include, the exclusion of everything else is implied. Also note that the rename line ends in a dot and moves everything up one level.
# this is Core.Tests
include Core.Tests
rename Core.Tests .
and so on.
Once you've created the broken-out repositories for each of the new repos, you can delete the has-everything initial repo created in step one and start setting up your subrepo configuration in .hgsub files.
