How to use the gpload utility? - Greenplum

I have the YAML file below:
---
VERSION: 1.0.0.1
DATABASE: xxx
USER: xxx
HOST: xxx
PORT: 5432
GPLOAD:
  INPUT:
    - SOURCE:
        LOCAL_HOSTNAME:
          - 192.168.0.21
        PORT: 8081
        FILE:
          - /home/root/test_input.txt
    - COLUMNS:
        - age: int4
        - name: varchar
        - surname: varchar
    - FORMAT: text
    - DELIMITER: '|'
    - ERROR_LIMIT: 2
    - LOG_ERRORS: true
  OUTPUT:
    - TABLE: sf_dfs.test_gpload
    - MODE: INSERT
  PRELOAD:
    - REUSE_TABLES: true
But I receive an error: error when connecting to gpfdist http://192.168.0.21:8081//home/root/test_input.txt, quit after 11 tries (seg0 slice1 192.168.0.23:6000 pid=2021845)
encountered while running INSERT INTO
Does somebody have experience with this program?

Looks like it is a port issue. If the database is up, then please rerun the job with a different port. Ensure that the firewall is not blocking this port.
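A quick way to check whether the gpfdist port is reachable is a plain TCP/HTTP probe from a segment host; this is only a sketch, with the host and port values taken from the YAML above:
# From a segment host (e.g. 192.168.0.23), check that gpfdist on the load host answers.
# Any HTTP response (even an error) means the port is open; a timeout suggests a firewall problem.
nc -vz 192.168.0.21 8081
curl -v http://192.168.0.21:8081/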

A couple of questions:
Are you running gpload as root? root generally does not have access permissions to the database. It needs to be run as gpadmin or a superuser.
The input file is in /home/root. If you are running as gpadmin, can gpadmin access this file? Permissions on the file?
Finally, does the target table exist in the database (sf_dfs.test_gpload)? Was it created and distributed across all segments? The error would seem to indicate the table is not there.
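Those three points can be checked quickly with something along these lines, run on the Greenplum master as gpadmin (the database name is the xxx placeholder from the YAML, so substitute your own):
whoami                                   # should be gpadmin or another superuser, not root
ls -l /home/root/test_input.txt          # can gpadmin read the input file?
psql -d <your_database> -c '\dt sf_dfs.test_gpload'   # does the target table exist?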

Related

How to connect Spring Boot application to a Postgres database that's in a Docker container

I'm working on a Spring Boot application that has to use a database. I can run the database locally and connect to it without problems, but now I'm trying to run just the database in a Docker container and connect to it from my Spring application, and I can't get it to work.
This is the docker-compose file that is used to start the docker container with the postgres image
version: '3.8'
services:
  db:
    container_name: postgres-container
    image: postgres:14.1-alpine
    restart: always
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=password
      - POSTGRES_DB=shop
    ports:
      - '5432:5432'
    volumes:
      - db:/var/lib/postgresql/data
volumes:
  db:
    driver: local
This is the application.yml file
spring:
  datasource:
    url: jdbc:postgresql://localhost:5432/shop
    username: postgres
    password: password
This is the V1.0__init.sql file used to initiate a table in the database
CREATE TABLE IF NOT EXISTS igredient(
    id SERIAL PRIMARY KEY NOT NULL,
    name VARCHAR(50) NOT NULL,
    price INTEGER NOT NULL,
    is_healthy BOOL NOT NULL,
    type VARCHAR(10)
);
I tried connecting to the database with DBeaver in the following way (can't upload an image):
Host: localhost
Port: 5432
Database: shop
Authentication: Database Native
Username: postgres
Password: password
Advanced:
Local Client: PostgreSQL 14
When I test the connection I get "FATAL: database "shop" does not exist", and I don't understand what I'm missing. I have been googling for too long and I don't have a clue what else to do.
I tried running the Spring application but I get this error:
org.postgresql.util.PSQLException: FATAL: database "shop" does not exist
and this:
Message : FATAL: database "shop" does not exist
Also, the DB Browser in IntelliJ is giving me this error in the V1.0__init.sql file (I guess because I'm not connected to a database):
Invalid or incomplete statement
expected one of the following:
BITMAP CONTEXT DIRECTORY EDITIONABLE EDITIONING FORCE FUNCTION INDEX MATERIALIZED MULTIVALUE NO NONEDITIONABLE OR PACKAGE PROCEDURE PUBLIC ROLE SEQUENCE SYNONYM TRIGGER TYPE UNIQUE USER VIEW
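A quick way to verify whether the shop database actually exists inside the container is to run psql through docker exec (the container name postgres-container comes from the compose file above). Note that the postgres image only applies POSTGRES_DB when the data volume is initialized for the first time, so a pre-existing db volume can produce exactly this error:
docker exec -it postgres-container psql -U postgres -c '\l'
# If shop is missing and the db volume predates POSTGRES_DB=shop, recreating the volume
# lets the init run again (this deletes the data stored in the volume):
docker compose down -v && docker compose up -d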

"No logs found" in grafana

I installed Loki, Grafana and Promtail, and all three are running. On http://localhost:9080/targets Ready is True, but the logs are not displayed in Grafana, and the Explore section shows "No logs found".
promtail-local-config-yaml:
server:
  http_listen_port: 9080
  grpc_listen_port: 0
positions:
  filename: /tmp/positions.yaml
clients:
  - url: http://localhost:3100/loki/api/v1/push
scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          job: varlogs
          host: ward_workstation
          agent: promtail
          __path__: D:/LOGs/*log
loki-local-config.yaml:
auth_enabled: false
server:
  http_listen_port: 3100
  grpc_listen_port: 9096
common:
  path_prefix: /tmp/loki
  storage:
    filesystem:
      chunks_directory: /tmp/loki/chunks
      rules_directory: /tmp/loki/rules
  replication_factor: 1
  ring:
    instance_addr: 127.0.0.1
    kvstore:
      store: inmemory
schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h
ruler:
  alertmanager_url: http://localhost:9093
How can I solve this problem?
Perhaps you are using Loki on Windows?
In your Promtail varlogs job, the path "D:/LOGs/*log" is wrong: you cannot access a Windows file from your Docker container directly.
You should mount your Windows directory into the container like this:
promtail:
  image: grafana/promtail:2.5.0
  volumes:
    - D:/LOGs:/var/log
  command: -config.file=/etc/promtail/config.yml
  networks:
    - loki
Then everything will be OK.
Note that the config inside your Promtail container has to reference the container-side path, so you can adjust both to make a match.
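For example, with the volume mount above, the __path__ in the scrape config would point at the container-side directory; this is a sketch based on the question's config, with only __path__ changed:
scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          job: varlogs
          host: ward_workstation
          agent: promtail
          __path__: /var/log/*log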
Here's some general advice on how to debug Loki, as per the question's title:
(1) Check the Promtail logs.
If you discover errors such as "error sending batch", you need to fix your Promtail configuration.
level=warn ts=2022-10-12T16:26:20.667560426Z caller=client.go:369 component=client host=monitor:3100 msg="error sending batch, will retry" status=-1 error="Post \"http://loki:3100/loki/api/v1/push\": dial tcp: lookup *Loki* on 10.96.0.10:53: no such host"
(2) Open the Promtail config page and check whether Promtail has read your given configuration: http://localhost:3101/config
(3) Open the Promtail targets page http://localhost:3101/targets and check
whether your service is listed as Ready, and
whether the log file contains the wanted contents and is readable by Promtail. If you're using Docker or Kubernetes, I would log into the Promtail container and try to read the log file manually.
Regarding the questioner's specific problem:
The questioner said that the services are shown as Ready on the targets page, so I recommend checking (1) the Promtail configuration and (3b) access to the log files (as Frank).
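For the check in (3), when Promtail runs in Docker, something like this can confirm that the container actually sees the mounted log files (the container name promtail and the file name are assumptions; adjust to your setup):
docker exec -it promtail ls -l /var/log
docker exec -it promtail head /var/log/example.log   # example.log is a placeholder file name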

gluster_volume module in Ansible

I request your help on the following issue.
I am writing a highly available LAMP app on Ubuntu 14.04 with Ansible (in my home lab). All the tasks get executed up to the GlusterFS installation; however, creating the GlusterFS volume has been a challenge for me for a week. If I use the command module, the GlusterFS volume gets created:
- name: Creating the Gluster Volume
  command: sudo gluster volume create var-www replica 2 transport tcp server01-private:/data/glusterfs/var-www/brick01/brick server02-private:/data/glusterfs/var-www/brick02/brick
But if I use the gluster_volume module, I get this error:
- name: Creating the Gluster Volume
  gluster_volume:
    state: present
    name: var-www
    bricks: /server01-private:/data/glusterfs/var-www/brick01/brick,/server02-private:/data/glusterfs/var-www/brick02/brick
    replicas: 2
    transport: tcp
    cluster:
      - server01-private
      - server02-private
    force: yes
  run_once: true
The error is
"msg": "error running gluster (/usr/sbin/gluster --mode=script volume add-brick var-www replica 2 server01-private:/server01-private:/data/glusterfs/var-www/brick01/brick server01-private:/server02-private:/data/glusterfs/var-www/brick02/brick server02-private:/server01-private:/data/glusterfs/var-www/brick01/brick server02-private:/server02-private:/data/glusterfs/var-www/brick02/brick force) command (rc=1): internet address 'server01-private:/server01-private' does not conform to standards\ninternet address 'server01-private:/server02-private' does not conform to standards\ninternet address 'server02-private:/server01-private' does not conform to standards\ninternet address 'server02-private:/server02-private' does not conform to standards\nvolume add-brick: failed: Host server01-private:/server01-private is not in 'Peer in Cluster' state\n"
}
May I know the mistake I am committing?
The bricks: declaration of the Ansible gluster_volume module requires only the path of the brick. The nodes participating in the volume are identified by cluster:.
The <hostname>:<brickpath> format is required for the gluster command line; however, when you use the Ansible module, it is not required.
So your task should be something like:
- name: Creating the Gluster Volume
  gluster_volume:
    name: 'var-www'
    bricks: '/data/glusterfs/var-www/brick01/brick,/data/glusterfs/var-www/brick02/brick'
    replicas: '2'
    cluster:
      - 'server01-private'
      - 'server02-private'
    transport: 'tcp'
    state: 'present'
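Once the play has run, the volume can be verified from either node with the gluster CLI (volume name taken from the task above):
gluster volume info var-www
gluster volume status var-www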

gpload utility: for bulk data loading - source Hadoop into Greenplum

We have small Hadoop and Greenplum clusters. Using a gpload merge statement, we want to load Hadoop data into Greenplum.
Please advise on the process.
Questions:
Do I need to install the gpload utility on the Hadoop-based Linux VM?
Do I then have to schedule the merge-based gpload script at a regular interval?
Is it possible to ingest the Hadoop file by running gpload on the Greenplum VMs only?
Input gpload.yml
VERSION: 1.0.0.1
DATABASE: test
USER: gpadmin
HOST: gpdbhostname
PORT: 5432
GPLOAD:
  INPUT:
    - SOURCE:
        LOCAL_HOSTNAME:
          - gpdbhostname
        PORT: 8080
        FILE:
          - /home/gpadmin/demo/input_table.txt
    - COLUMNS:
        - id: bigint
        - time: timestamp
    - FORMAT: text
    - DELIMITER: ';'
    - NULL_AS: ''
  OUTPUT:
    - TABLE: output_table
    - MODE: merge
    - MATCH_COLUMNS:
        - id
    - UPDATE_COLUMNS:
        - time
In this case, what would my gpload.yml be if I wanted to write a source HDFS CSV file into a regular Greenplum table via a gpload merge script?
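Regarding the scheduling part of the question, a gpload control file is normally run by pointing the utility at it with -f, and that invocation can be put in cron; this is only a sketch, and the file paths are assumptions:
gpload -f /home/gpadmin/demo/gpload.yml -l /home/gpadmin/demo/gpload.log
# example crontab entry to run the merge every hour (paths assumed):
0 * * * * gpload -f /home/gpadmin/demo/gpload.yml -l /home/gpadmin/demo/gpload.log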

kubernetes, kubeconfig file structure

I set up a Kubernetes cluster a while ago using kube-up (I guess; I am not totally sure, as it is really a while ago), and very recently I set up another Kubernetes cluster using CoreOS and its tools. Both generated kubeconfig files, and those files are working perfectly for each cluster respectively. However, there are some differences between them, which is the reason for this post. I want to understand those differences properly.
Here are the two files -
1.> One generated earlier (most likely using kube-up)
apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: CERTIFICATE_AUTH_DATA
    server: https://our.kube.server.1
  name: aws_kubernetes
contexts:
- context:
    cluster: aws_kubernetes
    user: aws_kubernetes
  name: aws_kubernetes
current-context: aws_kubernetes
kind: Config
preferences: {}
users:
- name: aws_kubernetes
  user:
    client-certificate-data: SECRET_CERTIFICATE
    client-key-data: SECRET_CLIENT_KEY
    token: SECRET_TOKEN
- name: aws_kubernetes-basic-auth
  user:
    password: PASSWORD
    username: USERNAME
2.> A second one, generated later with the CoreOS tools
apiVersion: v1
kind: Config
clusters:
- cluster:
    certificate-authority: path/to/ca.pem
    server: https://our.kube-server.2
  name: kube-aws-cluster-cluster
contexts:
- context:
    cluster: kube-aws-cluster-cluster
    namespace: default
    user: kube-aws-cluster-admin
  name: kube-aws-cluster-context
users:
- name: kube-aws-cluster-admin
  user:
    client-certificate: path/to/admin.pem
    client-key: path/to/admin-key.pem
current-context: kube-aws-cluster-context
As you can see, there are differences in the names of the keys and their values between these two versions; e.g. certificate-authority-data vs certificate-authority, and also one being a string while the other is a relative path to a .pem file.
I was wondering:
1.> Are the names of the keys interchangeable, e.g. can certificate-authority-data be certificate-authority, or vice versa?
2.> Are the types of values predefined? What I mean is, if I copy the content of the .pem file and paste it against, say, certificate-authority, will kubectl be able to authorize?
It would be great if I could get an idea about this. I am sorry if there is any confusion in my question; if so, please ask me and I will try to make it as clear as possible.
Thanks in advance
------------------ EDIT ----------------
I made some experiments and I understand that they are not interchangeable. I now have a different, more straightforward question:
Which of these two is the standard or latest version of the kubeconfig file format?
The *-data fields inline the content of the referenced files, base64-encoded. That allows the kubeconfig file to be self-contained, and able to be moved/copied/distributed without also carrying along referenced files on disk. Either format is valid, depending on your use case.
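To move from the file-reference form to the inline form, the certificate content is base64-encoded and placed in the corresponding *-data key. A sketch, using the file names and cluster name from the second kubeconfig above:
# base64-encode the CA certificate (single line) and use the output as certificate-authority-data
base64 path/to/ca.pem | tr -d '\n'
# or let kubectl embed the file contents for you
kubectl config set-cluster kube-aws-cluster-cluster --server=https://our.kube-server.2 --certificate-authority=path/to/ca.pem --embed-certs=true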
