Error creating DB in bicgcouch with database_dir moved - bigcouch

I have a cluster setup and I've moved the data dir from /opt/bigcouch/var/lib to /bigcouch
I changed the following lines in /opt/bigcouch/etc/default.ini
database_dir = /bigcouch
view_index_dir = /bigcouch
I'm having a issue where if I try to create a new DB it returns JSON saying the DB was created but the actual DB is not created
I have the owner of the dir set to bigcouch
If I create the DB
curl -X PUT localhost:5984/testDB
and then
curl localhost:5986/dbs/_all_docs
I get zero records back

If you edited the .ini file by hand you'll need to restart bigcouch. I suggest using _config instead in general.

Related

How to list azure Databricks workspaces along with properties like workspaceId?

My objective is to create a csv file that lists all azure databricks workspaces and in particular has the workspace id.
I have been able to retrieve all details as json using the CLI:
az rest -m get --header "Accept=application/json" -u 'https://management.azure.com/subscriptions/{subscriptionId}/providers/Microsoft.Databricks/workspaces?api-version=2018-04-01' > workspaces.json
How can I retrieve the same information using azure resource graph?
If you prefer to work with the workspace list api that returns json, here is one approach for post processing the data (in my case I ran this from a jupyter notebook):
import json
pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', None)
# json from https://learn.microsoft.com/en-us/rest/api/databricks/workspaces/list-by-subscription?tabs=HTTP&tryIt=true&source=docs#code-try-0
# E.g.
# az rest -m get --header "Accept=application/json" -u 'https://management.azure.com/subscriptions/{subscriptionId}/providers/Microsoft.Databricks/workspaces?api-version=2018-04-01' > workspaces.json
pdf = pd.read_json('./workspaces.json')
# flatten the nested json
pdf_flat = pd.json_normalize(json.loads(pdf.to_json(orient="records")))
# drop columns with name '*.type'
pdf_flat.drop(pdf_flat.columns[pdf_flat.columns.str.endswith('.type')], axis=1, inplace=True)
# drop rows without a workspaceId
pdf_flat = pdf_flat[ ~pdf_flat['value.properties.workspaceId'].isna() ]
# drop unwanted columns
pdf_flat.drop(columns=[
'value.properties.parameters.enableFedRampCertification.value',
'value.properties.parameters.enableNoPublicIp.value',
'value.properties.parameters.natGatewayName.value',
'value.properties.parameters.prepareEncryption.value',
'value.properties.parameters.publicIpName.value',
'value.properties.parameters.relayNamespaceName.value',
'value.properties.parameters.requireInfrastructureEncryption.value',
'value.properties.parameters.resourceTags.value.databricks-environment',
'value.properties.parameters.storageAccountName.value',
'value.properties.parameters.storageAccountSkuName.value',
'value.properties.parameters.vnetAddressPrefix.value',
], inplace=True)
pdf_flat
I was able to retrieve the information I needed by:
Searching for databricks resources in the Azure portal:
From there I could click Open Query to use the Azure Resource Graph Explorer and write a query to extract the information I need:
I ended up using the following query:
// Run query to see results.
where type == "microsoft.databricks/workspaces"
| project id,properties.workspaceId,name,tenantId,type,resourceGroup,location,subscriptionId,kind,tags

gcloud cli failing to add record when contents start with dash

I'm working with the LetsEncrypt dns-01 challenge system which entails dynamically creating a TXT record in Google Cloud DNS with specific content, so LE can assert proof of ownership for generating a wildcard certificate (so I can't use http-01). The problem is sometimes LE tells me to create a TXT record that starts with a "-", for example -E_DFDFHJKF1783FSHDJ. I cannot get the gcloud cli to properly accept this data no matter what I do.
Example:
gcloud dns record-sets transaction start --zone=myzone
gcloud dns record-sets transaction add "-E_ASDFSDF" --ttl=30 --zone=myzone --name=test --type=TXT
gcloud dns record-sets transaction remove "-A_DSFKHSDF" --ttl=30 --zone=myzone --name=test2 --type=TXT
If you run those commands and inspect the resulting transaction.yaml you can see whether it properly contains the right string. If it did it correct, you should see something like:
- kind: dns#resourceRecordSet
name: test.
rrdatas:
- '"ASDFASDF"'
ttl: 30
type: TXT
I am executing this via Node's child_process, but I have the issue even if I execute it directly from bash, so Node isn't really meaningful issue at the moment. I've tried echoing the value in. I've tried setting an environment variable and using that in the string.
No matter what I do I get an error like the following:
ERROR: (gcloud.dns.record-sets.transaction.add) unrecognized arguments: -E_ASDFSDF
It turns out some characters need to be escaped in the CLI. I can confirm that the following works:
gcloud dns --project=myprojectid record-sets transaction add "\-test123" --name=test.mydomain.com. --ttl=300 --type=TXT --zone=myzoneid

MapReduceIndexerTool output dir error "Cannot write parent of file"

I want to use Cloudera's MapReduceIndexerTool to understand how morphlines work. I created a basic morphline that just reads lines from the input file and I tried to run that tool using that command:
hadoop jar /opt/cloudera/parcels/CDH/lib/solr/contrib/mr/search-mr-*-job.jar org.apache.solr.hadoop.MapReduceIndexerTool \
--morphline-file morphline.conf \
--output-dir hdfs:///hostname/dir/ \
--dry-run true
Hadoop is installed on the same machine where I run this command.
The error I'm getting is the following:
net.sourceforge.argparse4j.inf.ArgumentParserException: Cannot write parent of file: hdfs:/hostname/dir
at org.apache.solr.hadoop.PathArgumentType.verifyCanWriteParent(PathArgumentType.java:200)
The /dir directory has 777 permissions on it, so it is definitely allowed to write into it. I don't know what I should do to allow it to write into that output directory.
I'm new to HDFS and I don't know how I should approach this problem. Logs don't offer me any info about that.
What I tried until now (with no result):
created a hierarchy of 2 directories (/dir/dir2) and put 777 permissions on both of them
changed the output-dir schema from hdfs:///... to hdfs://... because all the examples in the --help menu are built that way, but this leads to an invalid schema error
Thank you.
It states 'cannot write parent of file'. And the parent in your case is /. Take a look into the source:
private void verifyCanWriteParent(ArgumentParser parser, Path file) throws ArgumentParserException, IOException {
Path parent = file.getParent();
if (parent == null || !fs.exists(parent) || !fs.getFileStatus(parent).getPermission().getUserAction().implies(FsAction.WRITE)) {
throw new ArgumentParserException("Cannot write parent of file: " + file, parser);
}
}
In the message printed is file, in your case hdfs:/hostname/dir, so file.getParent() will be /.
Additionally you can try the permissions with hadoop fs command, for example you can try to create a zero length file in the path:
hadoop fs -touchz /test-file
I solved that problem after days of working on it.
The problem is with that line --output-dir hdfs:///hostname/dir/.
First of all, there are not 3 slashes at the beginning as I put in my continuous trying to make this work, there are only 2 (as in any valid HDFS URI). Actually I put 3 slashes because otherwise, the tool throws an invalid schema exception! You can easily see in this code that the schema check is done before the verifyCanWriteParent check.
I tried to get the hostname by simply running the hostname command on the Cent OS machine that I was running the tool on. This was the main issue. I analyzed the /etc/hosts file and I saw that there are 2 hostnames for the same local IP. I took the second one and it worked. (I also attached the port to the hostname, so the final format is the following: --output-dir hdfs://correct_hostname:8020/path/to/file/from/hdfs
This error is very confusing because everywhere you look for the namenode hostname, you will see the same thing that the hostname command returns. Moreover, the errors are not structured in a way that you can diagnose the problem and take a logical path to solve it.
Additional information regarding this tool and debugging it
If you want to see the actual code that runs behind it, check the cloudera version that you are running and select the same branch on the official repository. The master is not up to date.
If you want to just run this tool to play with the morphline (by using the --dry-run option) without connecting to Solr and playing with it, you can't. You have to specify a Zookeeper endpoint and a Solr collection or a solr config directory, which involves additional work to research on. This is something that can be improved to this tool.
You don't need to run the tool with -u hdfs, it works with a regular user.

How to do BULK INSERT in Oracle Database

I am trying to do a bulk insert into tables from a CSV file using Oracle11. My problem is that the database is on a remote machine which I can sqlpl to using this:
sqlpl username#oracle.machineName
Unfortunately the sqlldr has trouble connecting using the following command:
sqlldr userid=userName/PW#machinename control=BULK_LOAD_CSV_DATA.ctl log=sqlldr.log
Error is:
Message 2100 not found; No message file for product=RDBMS, facility=ULMessage 2100 not found; No message file for product=RDBMS, facility=UL
Now having given up on this approach I tried writing a basic sql script, but I am unsure of the proper Oracle keyword for BULK. I know this works in MySql but I get:
unknown command beginning "BULK INSER..."
When running the script:
BULK INSERT <TABLE_NAME>
FROM 'CSVFILE.csv'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
)
GO
I don't care which one works! Either one will do, I just need a little help.
Sorry I am a dumb dumb! I forgot to add oracle/bin to my path!
If you have found this post, add the bin directory to your path (linux) using the following commands:
export ORACLE_HOME=/path/to/oracle/client
export PATH=$PATH:$ORACLE_HOME/bin
Sorry if I wasted anyone's time ....

Greenplum loading data from table to file using external table

I ran the below create script and it created the table:-
Create writable external table FLTR (like dbname.FLTR)
LOCATION ('gpfdist://172.90.38.190:8081/fltr.out')
FORMAT 'CSV' (DELIMITER ',' NULL '')
DISTRIBUTED BY (fltr_key);
But when I tried inserting into the file like insert into fltr.out select * from dbname.fltr
I got the below error, cannot find server connection.
Please help me out
I think your gpfdist is probably not running try:
gpfdist -p 8081 -l ~/gpfdist.log -d ~/ &
on 172.90.38.190.
This will start gpfidist using your home directory as the data directory.
When I do that my inserts work and create a file ~/fltr.out

Resources