I'm trying to build a report of all the objects in all the projects we have in Cloud Storage across our org. I'm using this repo from Google Professional Services, as it does exactly what we want: https://github.com/GoogleCloudPlatform/professional-services/tree/main/tools/gcs2bq
We want to use containers instead of just running the Go code in a Cloud Function, mainly for portability.
Locally everything is fine and the program behaves as expected, but when I try it on Cloud Run things get tricky. From what I understand, the Go program needs to listen on a port, which I added at the beginning of main so the container can be deployed, which it is:
// Determine port for HTTP service.
port := os.Getenv("PORT")
if port == "" {
    port = "8080"
    log.Printf("defaulting to port %s", port)
}

// Start HTTP server (note: ListenAndServe blocks until the server exits).
log.Printf("listening on port %s", port)
if err := http.ListenAndServe(":"+port, nil); err != nil {
    log.Fatal(err)
}
But as you can see in the repo, the entry point is run.sh, which sets environment variables and then calls the gcs2bq binary built from main.go. The binary successfully completes its task, which is collecting the sizes of all the objects. But after that, run.sh doesn't "resume" and never reaches the part that uploads the data to a BigQuery table, which works locally.
Here is the part of the run.sh file where I have the problem. Note: I don't get any errors from executing ./gcs2bq. Note 2: every environment variable has a correct value.
./gcs2bq $GCS2BQ_FLAGS || error "Export failed!" 2  # <- doesn't get past this line
gsutil mb -p "${GCS2BQ_PROJECT}" -c standard -l "${GCS2BQ_LOCATION}" -b on "gs://${GCS2BQ_BUCKET}" || echo "Info: Storage bucket already exists: ${GCS2BQ_BUCKET}"
gsutil cp "${GCS2BQ_FILE}" "gs://${GCS2BQ_BUCKET}/${GCS2BQ_FILENAME}" || error "Failed copying ${GCS2BQ_FILE} to gs://${GCS2BQ_BUCKET}/${GCS2BQ_FILENAME}!" 3
bq mk --project_id="${GCS2BQ_PROJECT}" --location="${GCS2BQ_LOCATION}" "${GCS2BQ_DATASET}" || echo "Info: BigQuery dataset already exists: ${GCS2BQ_DATASET}"
bq load --project_id="${GCS2BQ_PROJECT}" --location="${GCS2BQ_LOCATION}" --schema bigquery.schema --source_format=AVRO --use_avro_logical_types --replace=true "${GCS2BQ_DATASET}.${GCS2BQ_TABLE}" "gs://${GCS2BQ_BUCKET}/${GCS2BQ_FILENAME}" || \
  error "Failed to load gs://${GCS2BQ_BUCKET}/${GCS2BQ_FILENAME} to BigQuery table ${GCS2BQ_DATASET}.${GCS2BQ_TABLE}!" 4
gsutil rm "gs://${GCS2BQ_BUCKET}/${GCS2BQ_FILENAME}" || error "Failed deleting gs://${GCS2BQ_BUCKET}/${GCS2BQ_FILENAME}!" 5
rm -f "${GCS2BQ_FILE}"
I'm kinda new to containers and Cloud Run, and even after reading projects and documentation, I'm not sure what I'm doing wrong. Is it normal that the .sh is "stuck" when calling the Go binary? I can provide more details/explanation if needed.
Okay, so for anyone who encounters a similar situation, this is how I made it work for me.
A Cloud Run container isn't supposed to stop, so there is no exit: the process just keeps running and serving requests from main.
That means that when run.sh called the executable, the Go program kept running (the HTTP server blocks forever) and never exited, so the script never completed the remaining steps. The solution here is to "recode" everything that comes after that call in Go, directly in main.go.
That makes run.sh useless, so I used another .go file that listens for HTTP requests and then calls the code that gathers the data and sends it to BigQuery.
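A minimal sketch of that shape, assuming a hypothetical collectAndExport() that wraps the tool's original export logic and writes the Avro file locally; the handler then stages the file in Cloud Storage and loads it into BigQuery with the official Go client libraries. The project, bucket, dataset and table names are placeholders, and the sketch assumes the staging bucket and the dataset already exist (the original run.sh creates them with gsutil mb and bq mk):

package main

import (
    "context"
    "fmt"
    "io"
    "log"
    "net/http"
    "os"

    "cloud.google.com/go/bigquery"
    "cloud.google.com/go/storage"
)

// collectAndExport is a placeholder for the original gcs2bq export logic:
// it should scan the buckets and write the results to a local Avro file,
// returning the file's path.
func collectAndExport(ctx context.Context) (string, error) {
    // ... original export code from main.go ...
    return "/tmp/gcs.avro", nil
}

func exportHandler(w http.ResponseWriter, r *http.Request) {
    ctx := r.Context()

    avroPath, err := collectAndExport(ctx)
    if err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }

    // Stage the Avro file in a bucket (replaces "gsutil cp").
    sc, err := storage.NewClient(ctx)
    if err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }
    defer sc.Close()

    f, err := os.Open(avroPath)
    if err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }
    defer f.Close()

    wc := sc.Bucket("my-staging-bucket").Object("gcs.avro").NewWriter(ctx)
    if _, err := io.Copy(wc, f); err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }
    if err := wc.Close(); err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }

    // Load the staged file into BigQuery (replaces "bq load"). BigQuery can
    // infer the schema from Avro; the original run.sh also passes an explicit
    // schema file.
    bq, err := bigquery.NewClient(ctx, "my-project")
    if err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }
    defer bq.Close()

    gcsRef := bigquery.NewGCSReference("gs://my-staging-bucket/gcs.avro")
    gcsRef.SourceFormat = bigquery.Avro
    loader := bq.Dataset("gcs2bq").Table("objects").LoaderFrom(gcsRef)
    loader.WriteDisposition = bigquery.WriteTruncate // same effect as --replace=true

    job, err := loader.Run(ctx)
    if err != nil {
        http.Error(w, err.Error(), http.StatusInternalServerError)
        return
    }
    status, err := job.Wait(ctx)
    if err != nil || status.Err() != nil {
        http.Error(w, "BigQuery load failed", http.StatusInternalServerError)
        return
    }

    fmt.Fprintln(w, "export and load finished")
}

func main() {
    http.HandleFunc("/", exportHandler)

    port := os.Getenv("PORT")
    if port == "" {
        port = "8080"
    }
    log.Printf("listening on port %s", port)
    log.Fatal(http.ListenAndServe(":"+port, nil))
}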
I'm working with the Let's Encrypt dns-01 challenge, which entails dynamically creating a TXT record in Google Cloud DNS with specific content so LE can verify ownership when generating a wildcard certificate (which is why I can't use http-01). The problem is that sometimes LE tells me to create a TXT record that starts with a "-", for example -E_DFDFHJKF1783FSHDJ. I cannot get the gcloud CLI to accept this data no matter what I do.
Example:
gcloud dns record-sets transaction start --zone=myzone
gcloud dns record-sets transaction add "-E_ASDFSDF" --ttl=30 --zone=myzone --name=test --type=TXT
gcloud dns record-sets transaction remove "-A_DSFKHSDF" --ttl=30 --zone=myzone --name=test2 --type=TXT
If you run those commands and inspect the resulting transaction.yaml, you can see whether it contains the right string. If it worked correctly, you should see something like:
- kind: dns#resourceRecordSet
  name: test.
  rrdatas:
  - '"ASDFASDF"'
  ttl: 30
  type: TXT
I am executing this via Node's child_process, but I have the same issue even if I run the command directly from bash, so Node isn't really the issue at the moment. I've tried echoing the value in. I've tried setting an environment variable and using that in the string.
No matter what I do I get an error like the following:
ERROR: (gcloud.dns.record-sets.transaction.add) unrecognized arguments: -E_ASDFSDF
It turns out some characters need to be escaped in the CLI. I can confirm that the following works:
gcloud dns --project=myprojectid record-sets transaction add "\-test123" --name=test.mydomain.com. --ttl=300 --type=TXT --zone=myzoneid
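If you need to do the same thing from code rather than an interactive shell, the key point is that the argument gcloud receives must contain the literal backslash before the dash. A minimal sketch of shelling out with that escaping (shown in Go; the zone, record name and record value are placeholders, and the same idea applies to Node's child_process by putting the backslash in the argument string):

// (assumes "fmt", "log" and "os/exec" are imported)
// Prefix the record data with a literal backslash so gcloud does not treat
// the value starting with "-" as a flag, mirroring the shell workaround above.
recordData := `\-E_ASDFSDF`
cmd := exec.Command("gcloud", "dns", "record-sets", "transaction", "add",
    recordData,
    "--ttl=30",
    "--zone=myzone",
    "--name=test.mydomain.com.",
    "--type=TXT",
)
out, err := cmd.CombinedOutput()
fmt.Println(string(out))
if err != nil {
    log.Fatalf("gcloud failed: %v", err)
}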
I am using the tarantool/tarantool:2.6.0 Docker image (the latest at the moment) and writing Lua scripts for the project. I'm trying to find out how to see the results of calling the print() function. It's quite difficult to debug my code without print() working.
In the Tarantool console, print() has no effect either.
Using simple print()
The docs say that print() writes to stdout, but I don't see any results when I watch the container's logs with docker logs -f <CONTAINER_NAME>.
I also tried setting the container's log driver to local. Then one print made it into the container's logs, but only once...
The container's /var/log directory is always empty.
Using box.session.push()
Using box.session.push() works fine in the console, but when I use it in a Lua script:
-- app.lua
function log(s)
    box.session.push(s)
end

-- No effect
log('hello')

function say_something(s)
    log(s)
end

box.schema.func.create('say_something')
box.schema.user.grant('guest', 'execute', 'function', 'say_something')
And then I call say_something() from the Node.js connector like this:
const TarantoolConnection = require('tarantool-driver');
const conn = new TarantoolConnection(connectionData);
const res = await conn.call('update_links', 'hello');
I get an error.
Any suggestions?
Thanks!
I suppose you've missed io.flush() after the print call.
After adding io.flush() after each print call, the messages start showing up in the logs (docker logs -f <CONTAINER_NAME>).
I'd also recommend using the log module for this purpose. It writes to stderr without buffering.
Regarding the error in the connector: I think the Node.js connector simply doesn't support pushes.
I want to use Cloudera's MapReduceIndexerTool to understand how morphlines work. I created a basic morphline that just reads lines from the input file, and I tried to run the tool with this command:
hadoop jar /opt/cloudera/parcels/CDH/lib/solr/contrib/mr/search-mr-*-job.jar org.apache.solr.hadoop.MapReduceIndexerTool \
--morphline-file morphline.conf \
--output-dir hdfs:///hostname/dir/ \
--dry-run true
Hadoop is installed on the same machine where I run this command.
The error I'm getting is the following:
net.sourceforge.argparse4j.inf.ArgumentParserException: Cannot write parent of file: hdfs:/hostname/dir
at org.apache.solr.hadoop.PathArgumentType.verifyCanWriteParent(PathArgumentType.java:200)
The /dir directory has 777 permissions on it, so it is definitely allowed to write into it. I don't know what I should do to allow it to write into that output directory.
I'm new to HDFS and I don't know how I should approach this problem. Logs don't offer me any info about that.
What I tried until now (with no result):
created a hierarchy of 2 directories (/dir/dir2) and put 777 permissions on both of them
changed the output-dir schema from hdfs:///... to hdfs://... because all the examples in the --help menu are built that way, but this leads to an invalid schema error
Thank you.
It says 'cannot write parent of file', and the parent in your case is /. Take a look at the source:
private void verifyCanWriteParent(ArgumentParser parser, Path file) throws ArgumentParserException, IOException {
  Path parent = file.getParent();
  if (parent == null || !fs.exists(parent) || !fs.getFileStatus(parent).getPermission().getUserAction().implies(FsAction.WRITE)) {
    throw new ArgumentParserException("Cannot write parent of file: " + file, parser);
  }
}
The file printed in the message is, in your case, hdfs:/hostname/dir, so file.getParent() will be /.
Additionally, you can test the permissions with the hadoop fs command; for example, you can try to create a zero-length file in the path:
hadoop fs -touchz /test-file
I solved that problem after days of working on it.
The problem is with that line --output-dir hdfs:///hostname/dir/.
First of all, there should not be 3 slashes at the beginning, as I had put in my repeated attempts to make this work; there are only 2 (as in any valid HDFS URI). I had actually put 3 slashes because otherwise the tool throws an invalid schema exception! You can easily see in the code that the schema check is done before the verifyCanWriteParent check.
I had tried to get the hostname by simply running the hostname command on the CentOS machine I was running the tool on. This was the main issue. When I analyzed the /etc/hosts file, I saw that there are 2 hostnames for the same local IP. I took the second one and it worked. (I also attached the port to the hostname, so the final format is: --output-dir hdfs://correct_hostname:8020/path/to/file/from/hdfs)
This error is very confusing because everywhere you look for the namenode hostname, you will see the same thing that the hostname command returns. Moreover, the errors are not structured in a way that you can diagnose the problem and take a logical path to solve it.
Additional information regarding this tool and debugging it
If you want to see the actual code that runs behind it, check the cloudera version that you are running and select the same branch on the official repository. The master is not up to date.
If you want to run this tool just to play with the morphline (using the --dry-run option) without connecting to Solr, you can't. You have to specify a ZooKeeper endpoint and a Solr collection or a Solr config directory, which involves additional research. This is something that could be improved in this tool.
You don't need to run the tool with -u hdfs, it works with a regular user.
I'm using the following code segment to get the XML definition of a virtual machine running on the Xen hypervisor. The code executes the command virsh dumpxml Ubuntu14, which returns the XML of the VM named Ubuntu14:
virshCmd := exec.Command("virsh", "dumpxml", "Ubuntu14")
var virshCmdOutput bytes.Buffer
var stderr bytes.Buffer
virshCmd.Stdout = &virshCmdOutput
virshCmd.Stderr = &stderr
err := virshCmd.Run()
if err != nil {
    fmt.Println(err)
    fmt.Println(stderr.String())
}
fmt.Println(virshCmdOutput.String())
This code always goes into the error condition for the given domain name and I get the following output.
exit status 1
error: failed to get domain 'Ubuntu14'
error: Domain not found: no domain with matching name 'Ubuntu14'
But if I run the standalone command virsh dumpxml Ubuntu14, I get the correct XML definition.
I would appreciate it if someone could give me some hints about what I'm doing wrong. My host machine is Ubuntu 16.04 and the Go version is go1.6.2 linux/amd64.
I expect you are running virsh as a different user in these two scenarios, and since you don't provide any URI, it is connecting to a different libvirtd instance. If you run virsh as non-root, it will usually connect to qemu:///session, but if you run virsh as root, it will usually connect to qemu:///system. VMs registered against one URI will not be visible when connecting to the other URI.
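If you want to keep shelling out to virsh for now, you can rule this out by passing the connection URI explicitly instead of relying on the per-user default. A sketch based on the question's snippet (qemu:///system is an assumption; use whichever URI the VM is actually registered against):

// Same as the original snippet, but with an explicit connection URI so the
// result does not depend on which user the program runs as.
virshCmd := exec.Command("virsh", "--connect", "qemu:///system", "dumpxml", "Ubuntu14")
var virshCmdOutput, stderr bytes.Buffer
virshCmd.Stdout = &virshCmdOutput
virshCmd.Stderr = &stderr
if err := virshCmd.Run(); err != nil {
    fmt.Println(err)
    fmt.Println(stderr.String())
}
fmt.Println(virshCmdOutput.String())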
BTW, if you're using Go, you'd be much better off using the native Go library bindings for libvirt instead of exec'ing virsh. Your "virsh dumpxml" invocation is pretty much equivalent to this:
import (
"github.com/libvirt/libvirt-go"
)
conn, err := libvirt.NewConnect("qemu:///system")
dom, err := conn.LookupDomainByName("Ubuntu14")
xml, err := dom.GetXMLDesc(0)
(obviously do error handling too)
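For reference, a sketch of the same calls with the error handling filled in (still assuming the qemu:///system URI; "fmt" and "log" need to be imported alongside the libvirt binding):

conn, err := libvirt.NewConnect("qemu:///system")
if err != nil {
    log.Fatalf("failed to connect to libvirt: %v", err)
}
defer conn.Close()

dom, err := conn.LookupDomainByName("Ubuntu14")
if err != nil {
    log.Fatalf("domain lookup failed: %v", err)
}
defer dom.Free()

xml, err := dom.GetXMLDesc(0)
if err != nil {
    log.Fatalf("failed to get domain XML: %v", err)
}
fmt.Println(xml)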
Using a Raspberry Pi running Debian, I have a script that parses the results of an iwlist scan and sends them via UDP to a Pure Data patch. This runs fine in GUI mode, but now I'm trying to automate the whole process in another script with the following:
pd-extended -nogui /home/pi/patch.pd & /home/pi/libOSC/scan.sh && fg
But when I run this new script, the UDP data appears to reach Pure Data only once; the scanning continues, but Pd does not receive any further packets. Any help with this would be appreciated.
What happens when you run /home/pi/libOSC/scan.sh? Does it send the results only once? Then maybe you need to do it differently, like calling that script from within Pd using the 'shell' or 'popen' objects, for instance. Or implement a polling command over UDP that returns the values.
What does your scan.sh script look like?
you probably want to make it something like:
pdhost=localhost
pdport=9999

do_scan() {
    ## some code here that does the scan and prints the result to stdout
}

do_scan | while read -r line
do
    echo "${line};" | pdsend ${pdhost} ${pdport}
done
rather than the following:
do_scan | pdsend ${pdhost} ${pdport}