I have a private GitHub repo (which I can't share here) cloned locally. I want to split a subfolder in this repo into a new subtree repo. I'm following these instructions Using Git subtrees for repository separation (under Splitting code into its own repository).
My specific command is:
> git subtree split -P .\plugins\rg-feed-client -b rg-feed-client
however it fails with exactly 24 "assertion failed" error messages that look like this:
1/ 26 (0)2/ 26 (1)assertion failed: [ plugins/rg-feed-client = .\plugins\rg-fee
3/ 26 (2)assertion failed: [ plugins/rg-feed-client = .\plugins\rg-feed-client ]
...
26/ 26 (25)assertion failed: [ plugins/rg-feed-client = .\plugins\rg-feed-client ]
If I try any other subfolder, the exact same happens. I have no idea what may be wrong here... HELP!
My repo has 2 remotes: origin, and a remote for an existing subtree that I added to my repo.
This was probably due to the backslashes in --prefix (I was running Windows back then.)
git subtree split -P can't gracefully handle a prefix written as a Windows-style path such as .\plugins\rg-feed-client. Use the following command instead:
git subtree split --prefix=plugins/rg-feed-client -b rg-feed-client
A few points to remember:
Don't prefix the path with ./, i.e. use plugins/rg-feed-client instead of ./plugins/rg-feed-client.
Don't add a trailing / after the path, i.e. not plugins/rg-feed-client/.
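The points above can be sketched as a small shell helper. This is only an illustration: normalize_prefix is a hypothetical name, not something git provides, and it does nothing more than rewrite the string before it is handed to --prefix.

```shell
# Hypothetical helper (not part of git): rewrite a path typed Windows-style
# into the plain relative form used with --prefix above.
# The body runs in a subshell, so the variable p does not leak out.
normalize_prefix() (
    # Turn every backslash into a forward slash.
    p=$(printf '%s' "$1" | tr '\\' '/')
    # Drop any leading "./".
    while [ "${p#./}" != "$p" ]; do p=${p#./}; done
    # Drop any trailing "/".
    while [ "${p%/}" != "$p" ]; do p=${p%/}; done
    printf '%s\n' "$p"
)
```

It could then be used as git subtree split --prefix="$(normalize_prefix '.\plugins\rg-feed-client')" -b rg-feed-client.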
So I have this little bash script to output a csv file that shows all my commit history from a specific month.
function make-git-report() {
    if [ "$1" != "" ]
    then
        local month=$1
    else
        local month=$(date +%m)
    fi
    local year=$(date +%Y)
    local nextYear=$year
    local nextMonth=$((month+1))
    if [ "$nextMonth" = "13" ]
    then
        local nextMonth="01"
        local nextYear=$((year+1))
    fi
    local start="$year-$month-01"
    local end="$nextYear-$nextMonth-01"
    rm -f log.csv
    git --no-pager log \
        --author="Evert" \
        --since="$start" \
        --before="$end" \
        --branches --remotes --tags --no-decorate --no-merges \
        --pretty=format:'§"%ch";"%an";"%s";' --stat | \
        grep -v \| | tr -s "\n\n" | tr "\n" '"' | tr "§" "\n" > templog.csv
    echo "\nDate;Author;Message;Changes" >> templog.csv
    tac templog.csv > log.csv
    rm -f templog.csv
}
But I just realized that if a branch is deleted during that month, and it was only merged using a squash merge, then a lot of commits will not show up in my csv file.
I understand that git reflog will somehow still contain that missing data, but I'm not sure how to merge that information into the output of git log while gracefully avoiding duplicate entries and other unwanted results that I can't think of right now.
Can anybody give me a little hint, a push in the right direction, on how to solve this?
You can't use the reflog to reliably get information about deleted branches:
the reflog is only updated when your local clone actually sees a commit
e.g. if you take 2 days off, your local repo will have no trace of what happened during those two days
two clones of the same repo will not have the same reflog
the reflog will also contain information about rebased/cherry-picked commits (and could add duplicate information to the log you are trying to build)
the reflog for HEAD will stay around, but the reflog for a specific branch or remote branch gets deleted when that branch is deleted
etc ...
I guess you are describing a workflow based around a well-known central git service (GitHub? GitLab? Azure DevOps? ...) where your branches get merged via pull requests.
Each of these services keeps a ref to the branch of each of its pull requests:
a generic way to see them is to run git ls-remote and look at the refs that are listed (you should see a list of refs/pull/xxx/... or refs/merge-requests/xxx/... or ...)
and read the documentation for said service
You can fetch these pull requests :
# with this refspec : all pull requests will be stored under origin/pull/*,
# as if they were remote branches
git fetch origin +refs/pull/*:refs/remotes/origin/pull/*
# or you may choose to store them in a different set of refs :
git fetch origin +refs/pull/*:refs/pull/*
You may also use the API of said service to check the state of a pull request.
Combining the information above, you may choose, pull request by pull request, whether or not to add pull/xyz/head to the list of refs in your git log command.
Note that if your branches are squash merged, the "merge commits" will not actually be merge commits, so you will need some way other than --no-merges to exclude them from your log (for example: print the commit SHAs and skip the ones that are known to be the result of a merge request).
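As a starting point for the "pull request by pull request" choice, here is a minimal sketch that extracts the pull request heads from git ls-remote output. It assumes the GitHub-style layout refs/pull/<number>/head; other services use other names (refs/merge-requests/... for example), so the pattern would need adjusting. list_pr_heads is a name made up for this example.

```shell
# Hypothetical helper: reads `git ls-remote` output ("<sha><TAB><ref>" lines)
# on stdin and prints "<pull request number> <head sha>" for every ref that
# matches the GitHub-style pattern refs/pull/<number>/head.
list_pr_heads() {
    awk '$2 ~ /^refs\/pull\/[0-9]+\/head$/ {
        split($2, parts, "/")   # parts[3] is the pull request number
        print parts[3], $1
    }'
}
```

Typical use would be git ls-remote origin | list_pr_heads, after which each number can be checked against the service's API before its pull/<number>/head ref is added to the git log command.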
I would like to get current branch or tag by reading only the content in .git folder.
I have read many solutions and they all depend on executing git status, git branch, git describe, or something similar and then parse the output. But what if we can't be sure that there is a git binary to call? We can't rely on that.
For a branch, it looks fairly straightforward: cat .git/HEAD. For tags, it gets a little more complicated. I use git-flow to create my feature branches and my tags. When I switch to a tag I get:
$ git checkout tags/v0.11.2
Note: checking out 'tags/v0.11.2'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:
git checkout -b <new-branch-name>
HEAD is now at 86ce70a... Merge branch 'hotfix/0.11.2'
And now the content at .git/HEAD is just a hash: 86ce70a29fdb2c0bdb0b683d00ab61607d8531de.
If I want to see the content of the object related to that hash I do:
$ zlib-flate -uncompress < .git/objects/86/ce70a29fdb2c0bdb0b683d00ab61607d8531de
commit 309tree 9d01f72a058e705a7bc6f9ffc5489096edd2e85a
parent 8c767a0d7538f735c5a537ed14f7f96eb8ae05f8
parent 67d98e0149c72856ddb07ff42197071a4c35fa87
author ####################################### 1520980212 -0600
committer #################################### 1520980212 -0600
Merge branch 'hotfix/0.11.2'
The last line is the message I put in the commit, but it doesn't mean I can get the tag version from there as the message is different on every commit.
I also tried to find any file containing that hash within the .git folder running:
$ grep -ilr `cat .git/HEAD` .git/
.git/gitk.cache
.git/HEAD
.git/FETCH_HEAD
.git/logs/HEAD
.git/logs/refs/heads/master
.git/logs/refs/remotes/origin/master
But none of the files had anything that pointed me to the tag name.
I'm running out of ideas. Any light would be really appreciated.
We can actually address your use case (knowing which tag is currently checked-out in prod, without using the git binary nor by re-implementing it) if we assume that
you have local access to the git repository (and to git commands) before deploying to prod, and that the prod environment (which does not contain git) contains the same repo with the same .git folder.
Indeed, it suffices to run git pack-refs beforehand, keep the repo and the .git folder in the prod environment and run something like
grep -B1 -e "\^$(cat .git/HEAD)" .git/packed-refs | head -n1 | cut -d ' ' -f 2
to get the ambient tag.
FYI Here is a demo performed on a GitHub repo:
$ git clone https://github.com/coq/coq.git
Cloning into 'coq'...
remote: Counting objects: 230444, done.
remote: Compressing objects: 100% (8/8), done.
remote: Total 230444 (delta 1), reused 2 (delta 0), pack-reused 230436
Receiving objects: 100% (230444/230444), 104.35 MiB | 1.23 MiB/s, done.
Resolving deltas: 100% (189775/189775), done.
$ cd coq
$ git pack-refs
$ cat .git/packed-refs
# pack-refs with: peeled fully-peeled
72d6d5e87759b62dcd9974c87bf59496d27e10b0 refs/remotes/origin/master
6aecb9a1fe3f9b027dfd702931298bc61d40b6d3 refs/remotes/origin/v8.0
f7cdf553d983a79fe0fbb08403f6a55230016074 refs/remotes/origin/v8.1
be16dcb34d1d09aa9c850997b3ef3b0cc0e7a864 refs/remotes/origin/v8.2
04a6362feb6cbfaa00be4d001dee2b390d0ff21c refs/remotes/origin/v8.3
7f2240ff25f232e8a27100a619881d0742ab7976 refs/remotes/origin/v8.4
df8c706c2fbdc30b4e2c514b97282e621cd5c9a3 refs/remotes/origin/v8.5
0106ca8db489cd9a202a0c8c3715504d5d1dc86c refs/remotes/origin/v8.6
88c5a874d4767c2e61c885cd8f51d4600e90086a refs/remotes/origin/v8.7
f8b9d25ea4c1b229c56c97ef92cb24ee5f70f53b refs/remotes/origin/v8.8
c6857bddd1bdda169dcb5c93e9f680b5379165a7 refs/tags/V7-0
^e3de2b7791fe5c0798c69e390b3fe58ef6574d01
ca2de40f47aed2204dbf75ddf1b7683654682938 refs/tags/V7-0beta
^36183770e5b6cd724ae18a423c8c2e10bcb5f574
9cb91decf1bb9b8b5c7b7e08a347b75f42d46189 refs/tags/V7-0beta2
^82be21875deb1df309f036137f8b1e29411f50fb
7b34b15de5541ff72d191a18cfc7270eff25b4f5 refs/tags/V7-0beta3-ocaml3-01
^31bf213dea6062c7f27140d76528db59ab287538
6c40cca52d505589f1cb87e8008074cce74467eb refs/tags/V7-0beta4
^45d447c4394c838f3ead2d44a3c7e9965365bc3d
03b24eb11846ac23678782dae41bbbba14c40f2a refs/tags/V7-1
^8e6d0c79f6d5b53441edefc2c1b179d37bee483d
cf335d602ab4d58a36d4d5cca3f03951c6da442a refs/tags/V7-2
^43b06dafec4ceec4a4fd21bda3345f90e9eb76c6
41382705c233b06397c7652a4191fd61fcdcd748 refs/tags/V7-3
^d81ffa972da947a5e99fc4a694768d80382a1d26
1f29769c278aced5153eb2df8d3e7d10bdb95917 refs/tags/V7-3-1
^7c408eb83a36ec055d2a69c783e6c23565841a0d
e0028558217335ea68edd859e939320f5f7a8602 refs/tags/V7-4
^6f6e35a49761ef5b825a08c735999a1ad431994b
f6375d6a93140797817b6ad0868534ca9f98a200 refs/tags/V8-0
^6bdd52469e6bfa16aa9fb118cf7ecbf70825172b
57f2c33e94dad2baf0efe0115949ca21ba4aab7c refs/tags/V8-0beta
^1c4da62a877abbe570dc0618a4212cbf1e6d0166
5a5ab6279dbcbed573c0bb0dae6fc08f8ac6827b refs/tags/V8-0cdrom
^74b16389d20003978bd216f410e1d4ffac60002a
a97e6b42f251b91df24f7cb89598c5e267186fd9 refs/tags/V8-0pl1
^6724ff26374f34c954a95d515c749b68bb39ea6c
71d0080ff394fa7d7898109f9c1dfb43ca596c8d refs/tags/V8-0pl2
^2cb5ddf3462b346dbef715a7d5cd69913accf5a5
b04a60c01c8d1221677b7882dd218922a2bc2fea refs/tags/V8-0pl3
^3e4ce7f60451a648b430b354127c232861f50331
8d3a6b922a37dc006afa91e2fe2ed20c9d448456 refs/tags/V8.0-APP
^d204fb7a7cbdbc17559b8e8e437092a0e91bc926
5d12a600774b236d841be6c9f9e1625d0c638ade refs/tags/V8.0pl4
^9cec31ed26778fd1b786ae7962ad10a8810f02dc
bbba7006e341c0254d44778790e874c301ff368d refs/tags/V8.1
^919e894f7be4bed5be27afc30dcb09ae5eb0f429
e14d28196f6da95c0cdd235e11b780e209319e5f refs/tags/V8.1-APP
^a92a08d28f4ac3d9e14c7a18cecf01079c214774
c730c5de35040942b1c06ba93b100ebd1ca725a1 refs/tags/V8.1beta
^4160197d55eb664a5d906c4d85e2c0341e725327
62af7ca2b960da3a0824c85f7d4348608562e068 refs/tags/V8.1gamma
^46a97e6b74efc6bc814140d196c3c14e37b86497
206a042d14fd5348f86800b783dda9e8c684b2fb refs/tags/V8.1pl1
^6b29e2e184041c257cc419c3b80e71bb6806b517
5b16d061d93e7ade027990f99be47a0938954c23 refs/tags/V8.1pl2
^b54738571f96d4d168090820248eb9b213ba7ee4
d730b7ec5436f5849559621e79145b28f0f98e43 refs/tags/V8.1pl3
^259b6943bb9356d78adb97eb57835429bcb0b136
7976ddcc38a2d8f4c47f8952a2d93745932b0ce9 refs/tags/V8.1pl4
^657da139cb9a3e298384dcf6687457869410b308
a181741597e9ef3a5ce7c87cee46b708ccfab7ea refs/tags/V8.1pl5
^4bb87a72c7f906f8ca1a1e9850321a7fc7e86ec5
fd36c77f1eb70597a19137be89819aad2830cb29 refs/tags/V8.1pl6
^6b4b897e4bea40c1aa01a6b7b5d774667656c951
e3111a5f34d17e817dffc772ea911492aff0876e refs/tags/V8.1pre-beta
^39e218aea065e8502e017d5cd055717586287b49
5c868756d2f74c0a9ccb64d24c6d349b49738f05 refs/tags/V8.2
^23173e8c40c63eb5d0975b96a83cf8dae6d76759
1a355cdf6e34fe62b9cea8afe85f022b7503c2b2 refs/tags/V8.2-1
^3f34ae7aacbb5010382b82387a95a055d6bf9756
19bb470a8d3abce24fb2917f9996c14e9cad5e6d refs/tags/V8.2alpha
^286b99ebc0735eb68b3793036a84e6c7d42a9b3c
0849bb25451bbf6e9fce5b5a1400eac4b76b6502 refs/tags/V8.2beta
^fe36cf9ba43eb15221e3e74bd3dac5c3b79a5bf0
f12a923b29d6df611e2fdecbc31fbc3d1c2da06e refs/tags/V8.2beta2
^35b3e53106007bd9459f196d3b6ad05983557a7d
f985acb0bc8b048d832b9790f0120584efdfda69 refs/tags/V8.2beta3
^f059735ff7f27a548ce5f505b7b20b8cfcc1d3e3
9936999f647bf018dffd3a8c7d8d60cb583fd805 refs/tags/V8.2beta4
^bbf8a42aae2dfbd89b3a19a22970e2a734a68ca0
fd708216478666979f80dac3c1e213e1cab30c71 refs/tags/V8.2pl1
^2ae63e7171cb052034fb10b08c2b9ca124408e7b
c685466562e116a9fd09f0e2ea20d2d5612d90de refs/tags/V8.2pl2
^c0639c58819b6ce7869521ede5c2510ea72627e7
a5e77b347f7d1b17f912a73a2bd4de00166e19ed refs/tags/V8.2pl3
^0f56e3eb52bf294143387dfeff5bb0b2b00d353e
92502cc84d6fe38e202f72deb43761657d1da70b refs/tags/V8.2rc1
^8d70d3bded72ce12bd64d991f5003e3f839e8d45
6c7ddfff6aa08a64b4c22af6e24a1130894f0e30 refs/tags/V8.2rc2
^d9bc69a778977aecd4407849f323f5692165f3c8
c1a4081a73c80b73e87f30e07f81a46355e173a4 refs/tags/V8.3
^828c278f69b3aef75cfcf0493641546c94aa4133
f3836c951894fb4654aaca3f635593a28e7c2712 refs/tags/V8.3-beta0
^f7e80f56a6ebdd87a1a8ac5d75a0c4ba0943ec56
bb7677e9e1570b5981f256e770b9e1c0a7f2cdde refs/tags/V8.3-rc1
^3471d0b44ae2e7f932153adba1ce830016610b6d
39d4b181cb9834df3037c1759486341450c673a0 refs/tags/V8.3pl1
^dc3e9e8802f968db2d6b80760a6f955d8fe9b824
37bc52ae4e4eea6e4488dbb2e2b1ad98acc9b8f5 refs/tags/V8.3pl2
^9341752f9bb20db7e36a5470c97b59b40bb9ae53
586ac0022cdc4f353f510fb2921387ef00c51ef2 refs/tags/V8.3pl3
^d6e1259142f3bb9fcf652cd255418552dbd7b9f2
d67128226a73234a89cad78711424429ad5f7455 refs/tags/V8.3pl4
^b0421db33d6632828b2088b82bdaec832c5aaebc
51526fe09288caed67a5d7c7d705945d2bcc89a6 refs/tags/V8.3pl5
^3bf75b1ff7c00467b8269e2f789eb1968e54d63c
08af22b7de836a5fef0f9947a5f0894d371742de refs/tags/V8.4
^3366f276c63b17a3d78865e12f6d94595f87bb18
808ecb68d2885de84d0e0e2f298a4d42c49a08fa refs/tags/V8.4beta
^a9a068d694699bb2c0c7ce229acfb95ff168e38a
bc905f82d46d94f0e2c78cf4a4ce02f64cf2c534 refs/tags/V8.4beta2
^53191e6e1cd4f8b614219ac99fa7f9bae5c3850b
275706e5fc6319545ee9e7020bc9f7ecb7681848 refs/tags/V8.4pl1
^46b4aac455472f03dd63d01d40d4891c44db0e8c
16b09a075e8c0e0479331986141060c4829e0613 refs/tags/V8.4pl2
^9680d833f1cebf2a3e3082e76494bd36f7b9ef1a
985f884b4a59a75522d5421138ab0b88f128a7ae refs/tags/V8.4pl3
72b423c9497033b3b4ca4e023899f204ceaac9e9 refs/tags/V8.4pl4
0ec5646cbcc725dbe1121e24258fc060223e4d51 refs/tags/V8.4pl5
b705cf029b9db7003c8324366c049c49c21dd5c6 refs/tags/V8.4pl6
2cc42290a2f9af4f6e4c9daaa8415feda784a7c4 refs/tags/V8.4rc1
^e06fd439e3a6193b6efc0234e96c222e66211096
5e23fb90b39dfa014ae5c4fb46eb713cca09dbff refs/tags/V8.5
eaa3d0b15adf4eb11ffb00ab087746a5b15c4d5d refs/tags/V8.5beta1
94afd8996251c30d2188a75934487009538e1303 refs/tags/V8.5beta2
0fd6ad21121c7c179375b9a50c3135abab1781b2 refs/tags/V8.5beta3
d5cbd7b881dcc8b3599b3330e342f0aa55ef467f refs/tags/V8.5pl1
e1661dc9a43b34526437e9bc3029e6320e09f899 refs/tags/V8.5pl2
2290dbb9c95b63e693ced647731623e64297f5c8 refs/tags/V8.5pl3
04394d4f17bff1739930ddca5d31cb9bb031078b refs/tags/V8.5rc1
0d1438851ba3a0b9f76847abc42f3bf8ad26c4cb refs/tags/V8.6
b095c4a1d754d4a003d1324cb15b58666b313221 refs/tags/V8.6.1
bdcf5b040b975a179fe9b2889fea0d38ae4689df refs/tags/V8.6beta1
d24aaa4d0e45dc3ec31c5f576516b01ded403dd8 refs/tags/V8.6rc1
15edfc8f92477457bcefe525ce1cea160e4c6560 refs/tags/V8.7+alpha
169afbbc9c560ea3d2fa63a421a639cf59e4cfb5 refs/tags/V8.7+beta1
^bf128a420614ced228c4eb0fcfd901994c2efb65
4b98c97ceecd547a4191b854b58a3c553341bcf3 refs/tags/V8.7+beta2
^9704cd12804dd036637460da803773f67d6031d1
56f98b99f34eb657c3288c6a8839cfc6133c5e9f refs/tags/V8.7.0
^78e3385221c5c6d024b33107517f5674b3d341c2
8c6816def18031126edd99c89bd0257244299276 refs/tags/V8.7.1
^391bb5e196901a3a9426295125b8d1c700ab6992
714c1769144139ca2187cb4f5362f9056218d188 refs/tags/V8.7.2
^2881a184ef4e8a3275ddf34c07d740db42e0c5d3
71f5c4efd6d62c5283f76c263b6c2d6a6b7e64ae refs/tags/V8.8+alpha
^307f08d2ad2aca5d48441394342af4615810d0c7
fc22a4181b178532eabd1f33b7120374d17cbcd6 refs/tags/V8.8+beta1
^8dee3cd515600d50ae95188d44aad8dcb161b0ea
0ec67923c45fb09acc5be96cb19b3e1b603e5b25 refs/tags/V8.8.0
^6a929e8b94fc95f81699668cea95bc4b91ec67ca
1f48326c7edf7f6e7062633494d25b254a6db82c refs/tags/last-coqide-for-8.4pl3
$ git checkout V8.8.0
Note: checking out 'V8.8.0'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:
git checkout -b <new-branch-name>
HEAD is now at 6a929e8b9... Backport PR #7277: Mention sphinxcontrib-bibtex in INSTALL.doc
$ cat .git/HEAD
6a929e8b94fc95f81699668cea95bc4b91ec67ca
$ grep -B1 -e 6a929e8b94fc95f81699668cea95bc4b91ec67ca .git/packed-refs
0ec67923c45fb09acc5be96cb19b3e1b603e5b25 refs/tags/V8.8.0
^6a929e8b94fc95f81699668cea95bc4b91ec67ca
$ grep -B1 -e "\^$(cat .git/HEAD)" .git/packed-refs | head -n1 | cut -d ' ' -f 2
refs/tags/V8.8.0
To elaborate a bit on this solution: it relies on the git pack-refs command, which gathers the SHA1 corresponding to the refs (tags and remote branches) of the repo and writes a file .git/packed-refs that records this information.
It can be noted that the problem of identifying the two SHA1s that can be associated with an annotated tag (namely, the SHA1 of the tag itself and the SHA1 of the underlying commit) is easily solved by this packed-refs file, without needing to use zlib-flate or the like.
(But of course, the prototype shell command proposed in this answer may need to be adapted, because it only works if the checked-out commit is the target of an annotated tag, cf. the ^ character involved here.)
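One possible adaptation, as a sketch only: the function below scans packed-refs for both kinds of tag, matching the SHA1 either on the ref line itself (lightweight tag) or on the following ^ line (annotated tag). tags_for_commit is a made-up name, and the sketch assumes the file layout shown in the demo above; tags that exist only as loose files under .git/refs/tags/ are not covered.

```shell
# Hypothetical helper: tags_for_commit <sha> [packed-refs file]
# Prints every refs/tags/* entry that points at <sha>, whether the sha is on
# the ref line (lightweight tag) or on the "^" line below it (annotated tag).
tags_for_commit() {
    awk -v sha="$1" '
        /^#/ { next }                        # skip the header comment
        /^\^/ {                              # peeled sha of the previous ref
            if (substr($0, 2) == sha && ref ~ /^refs\/tags\//) print ref
            next
        }
        {
            ref = $2
            # Appending "" forces a string comparison in awk.
            if (($1 "") == (sha "") && ref ~ /^refs\/tags\//) print ref
        }
    ' "${2:-.git/packed-refs}"
}
```

With a detached HEAD, tags_for_commit "$(cat .git/HEAD)" would then list the matching tag refs; several lines are printed if more than one tag points at the same commit.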
You might as well assume (or require) that Git is installed, because the closest you will come to being able to produce the output of git describe is by copying the algorithm from git describe. You'll also want to have access to at least git rev-parse and git reflog if you try the idea I suggest here.
Note that even git describe might not produce the tag name that you used to run git checkout. If you have two tags that point to the same commit, it's not possible, after-the-fact, to go from revision hash ID to tag name, since checking out either one gets you the same state—except for reflog traces:
You can look at the HEAD reflog (if it exists), because git checkout writes a message there along with the hash IDs. For a git checkout command that checks out a tag, the message is checkout: moving from <name-or-ID> to <tag-name>. So if HEAD is detached, and the reflog entry for HEAD@{0} matches moving from .* to (.*)$, and the string captured at the end is a valid tag name, and running git rev-parse $tag^{commit} produces the correct hash ID, then the last checkout was no doubt given that tag name. (Adjust the regexp syntax as necessary depending on which RE grammar you use.)
You could open and read .git/logs/HEAD directly to avoid using the git binary, but I think the reflog format has changed at least once, and there's no guarantee that it will be stable in the future. It's probably better to use tools that do make some guarantee.
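For completeness, reading the file directly could look like the sketch below. It assumes the message format quoted above ("checkout: moving from ... to ...") and, as already said, nothing guarantees that this format stays stable. last_checkout_target is a hypothetical name, and the result still has to be checked against the known tags before it is trusted.

```shell
# Hypothetical helper: last_checkout_target [reflog file]
# Prints the name recorded by the most recent reflog entry if that entry is a
# checkout; prints nothing otherwise. Only the last line of the file is read.
last_checkout_target() {
    tail -n 1 "${1:-.git/logs/HEAD}" |
        sed -n 's/^.*checkout: moving from .* to \(.*\)$/\1/p'
}
```

An empty result means the last HEAD movement was something other than a checkout (a commit, a reset, ...), in which case this approach gives no answer.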
I want to use Cloudera's MapReduceIndexerTool to understand how morphlines work. I created a basic morphline that just reads lines from the input file and I tried to run that tool using that command:
hadoop jar /opt/cloudera/parcels/CDH/lib/solr/contrib/mr/search-mr-*-job.jar org.apache.solr.hadoop.MapReduceIndexerTool \
--morphline-file morphline.conf \
--output-dir hdfs:///hostname/dir/ \
--dry-run true
Hadoop is installed on the same machine where I run this command.
The error I'm getting is the following:
net.sourceforge.argparse4j.inf.ArgumentParserException: Cannot write parent of file: hdfs:/hostname/dir
at org.apache.solr.hadoop.PathArgumentType.verifyCanWriteParent(PathArgumentType.java:200)
The /dir directory has 777 permissions on it, so it is definitely allowed to write into it. I don't know what I should do to allow it to write into that output directory.
I'm new to HDFS and I don't know how I should approach this problem. Logs don't offer me any info about that.
What I tried until now (with no result):
created a hierarchy of 2 directories (/dir/dir2) and put 777 permissions on both of them
changed the output-dir schema from hdfs:///... to hdfs://... because all the examples in the --help menu are built that way, but this leads to an invalid schema error
Thank you.
It states 'cannot write parent of file'. And the parent in your case is /. Take a look into the source:
private void verifyCanWriteParent(ArgumentParser parser, Path file) throws ArgumentParserException, IOException {
Path parent = file.getParent();
if (parent == null || !fs.exists(parent) || !fs.getFileStatus(parent).getPermission().getUserAction().implies(FsAction.WRITE)) {
throw new ArgumentParserException("Cannot write parent of file: " + file, parser);
}
}
The message prints file, which in your case is hdfs:/hostname/dir, so file.getParent() will be /.
Additionally you can try the permissions with hadoop fs command, for example you can try to create a zero length file in the path:
hadoop fs -touchz /test-file
I solved that problem after days of working on it.
The problem is with that line --output-dir hdfs:///hostname/dir/.
First of all, there should not be 3 slashes at the beginning, as I had put during my repeated attempts to make this work; there are only 2 (as in any valid HDFS URI). I had actually put 3 slashes because otherwise the tool throws an invalid schema exception! You can easily see in this code that the schema check is done before the verifyCanWriteParent check.
I had obtained the hostname by simply running the hostname command on the CentOS machine that I was running the tool on. This was the main issue. I analyzed the /etc/hosts file and saw that there are 2 hostnames for the same local IP. I took the second one and it worked. (I also appended the port to the hostname, so the final format is the following: --output-dir hdfs://correct_hostname:8020/path/to/file/from/hdfs)
This error is very confusing because everywhere you look for the namenode hostname, you will see the same thing that the hostname command returns. Moreover, the errors are not structured in a way that you can diagnose the problem and take a logical path to solve it.
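To make the slash count concrete, here is a small sketch that splits such a URI with plain string operations. It only illustrates generic URI syntax (the authority sits between the second and third slash) and does not involve Hadoop at all; split_hdfs_uri is a made-up name.

```shell
# Hypothetical helper: split_hdfs_uri <uri>
# Prints "authority=<host[:port]> path=<path>" for a URI starting with hdfs://
# The body runs in a subshell, so its variables do not leak out.
split_hdfs_uri() (
    rest=${1#hdfs://}        # everything after the scheme and two slashes
    authority=${rest%%/*}    # text before the next slash, possibly empty
    case $rest in
        */*) path=/${rest#*/} ;;
        *)   path= ;;
    esac
    printf 'authority=%s path=%s\n' "$authority" "$path"
)
```

With hdfs:///hostname/dir/ the authority comes out empty and hostname is read as the first path component, whereas hdfs://correct_hostname:8020/path/to/file/from/hdfs yields the host and port as the authority.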
Additional information regarding this tool and debugging it
If you want to see the actual code that runs behind it, check the cloudera version that you are running and select the same branch on the official repository. The master is not up to date.
If you just want to run this tool to play with the morphline (by using the --dry-run option) without connecting to Solr, you can't. You have to specify a Zookeeper endpoint and a Solr collection or a Solr config directory, which takes additional work to research. This is something that could be improved in this tool.
You don't need to run the tool with -u hdfs, it works with a regular user.
I want to run rsync to update a distant tree. I'd like my command to recursively create missing leaf folders, e.g.:
Before:
Source
A/A/C_file
A/B/C_file
A/C/C_file
B/A/C_file
B/B/C_file
B/C/C_file
distant
A/A/C_file
A/C/C_file
B/A/C_file
B/C/C_file
After the Rsync command "rsync -atvrz source/dir/ distant/dir " :
Distant :
A/A/C_file
A/B/C_file
A/C/C_file
B/A/C_file
B/B/C_file
B/C/C_file
The --relative solution doesn't work for me because it creates the new path inside the distant tree: "distant/dir/source/dir"
It seems to work once the ownership of the files is harmonized.
So a chown -R user:user /distant solved the issue.
The command git diff "??/??/15 - 12:34" "??/??/?? - 03:21" throws an error. It seems that : is the culprit.
The client that handles git for me didn't throw an error with the colon in the commit name, but Git Bash for Windows won't let me access the commit using the command line options. I've tried ':' or ":" or \: and none of those options worked.
How would someone use a colon on the command line or how would someone escape the colon character?
* EDIT *
Here's a copy of the output from git log --oneline
9c34cd9 git merge
f1195c7 09/04/2015 - 15:05
db38edb 09/03/15 - 17:28
c20dea6 09/02/15 - 19:43
e33cd9c 08/28/15 - 00:12
48692a9 08/26/15 - 16:02
8072375 08/25/15 - 19:58
c6babf3 08/25/15 - 12:12
ff6afbf 08/14/15 - 19:43
a0ccc60 08/08/15 - 13:43
9b446ae 08/04/15 - 16:11
34a7dfe 08/02/15 - 21:09
f6005ba 07/31/15 - 16:12
18dc958 07/31/15 - 16:11
3d4c7fb 07/31/15 - 13:48
c6c9ef9 07/25/15 - 22:42
9fd46df 07/25/15 - 15:23
78fa4ed 07/20/15 - 12:27
af399b7 07/16/15 - 17:00
33fbd24 07/14/15 - 17:46
458bb5e 07/14/15 - 12:32
418a92d 07-13-15 - EOD
72b1408 07/13/15 - 17:43
a6bc32f Merge https://github.com/halcyonsystems/amelia
ec27a81 new file: assets/css/main.css
new file: assets/im
2ff9bc3 Initial commit
From the help page (git help diff):
NAME
git-diff - Show changes between commits, commit and working tree, etc
SYNOPSIS
git diff [options] [<commit>] [--] [<path>...]
git diff [options] --cached [<commit>] [--] [<path>...]
git diff [options] <commit> <commit> [--] [<path>...]
git diff [options] <blob> <blob>
git diff [options] [--no-index] [--] <path> <path>
So you can specify two different path options. If those are timestamps you have to get them into the right syntax. See git help revisions:
<refname>@{<date>}, e.g. master@{yesterday}, HEAD@{5 minutes ago}
A ref followed by the suffix @ with a date specification enclosed in a brace pair (e.g. {yesterday}, {1 month 2 weeks 3 days 1 hour 1 second ago} or {1979-02-26 18:30:00}) specifies the value of the ref at a prior point in time. This suffix may only be used immediately following a ref name and the ref must have an existing log ($GIT_DIR/logs/<ref>). Note that this looks up the state of your local ref at a given time; e.g., what was in your local master branch last week. If you want to look at commits made during certain times, see --since and --until.
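This selects the state of a ref (for example a branch) at that time; to look at a single file in that state, append :<path>, as in master@{yesterday}:<path>.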
If you want to really diff all the changes in your full repo between two times, see this question and this question
Update based on the update of the question and clarifications in the comments:
As the dates appear in the first line of the commit message, you first have to search for the matching date (=commit message) in your repository to determine the unique checksum that identifies the respective commit just as Wumpus explained in the comments:
git log --oneline --grep='07/25/15 - 22:42'
This should work in your case. In the general case where the date or the search string cannot be found in the first line of the commit message use:
git log --grep='07/25/15 - 22:42'
If you have multiple branches and don't know on which branch the respective commit is to be found, add the --all switch.
On the output you will find the checksum, e.g. 3d4c7fb. This is the unique identifier which you can then feed into git diff. Note that the full checksum is actually a bit longer, but abbreviations are fine as long as they are unambiguous. Usually the first four to six digits are sufficient, depending on the amount of commits done in the past.
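The two steps (look up the checksum, then diff) can be chained, as in the sketch below. The function names are made up for this example. It assumes the search text identifies the commit you mean, since only the newest match is used, and it passes --fixed-strings so that the text is matched literally rather than as a regular expression.

```shell
# Hypothetical helper: commit_by_message <text>
# Prints the full checksum of the newest commit, on any branch, whose commit
# message contains <text> literally.
commit_by_message() {
    git log --all --fixed-strings --grep="$1" --format=%H -n 1
}

# Hypothetical helper: diff_by_message <older text> <newer text> [git diff args]
# Resolves both messages to checksums and diffs the two commits.
diff_by_message() (
    old=$(commit_by_message "$1")
    new=$(commit_by_message "$2")
    if [ -z "$old" ] || [ -z "$new" ]; then
        echo "no commit found for one of the messages" >&2
        exit 1
    fi
    shift 2
    git diff "$old" "$new" "$@"
)
```

For example, diff_by_message '07/25/15 - 15:23' '07/25/15 - 22:42' --stat. Because the date strings are only ever passed as quoted arguments to --grep, the colon needs no escaping.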
As Wumpus already said: This is awful. Do not add the commit date to the log message. It is redundant and therefore meaningless: git already maintains two dates for each commit, the author date and the commit date. For the first commit of a changeset these two are identical. During operations that integrate a commit onto a different branch (generating a new commit checksum), the commit date represents the timestamp of the operation while the author date stays unchanged. As I explained above, you can refer to the state of a ref at a certain timestamp with <refname>@{<timestamp>}. See git help revisions to learn about the details of what you can do with the syntax. It's neat and quite flexible.