Minio data transfer

I have a broken minio cluster and don't have access to the control plane. I still have access to the filesystem directory with data and buckets, access to .minio-sys where the broken minio config and other cluster data are located. How can I migrate all my data/buckets with all the files in them to a new minio cluster?

If you were running a single MinIO instance, this is simple: just copy the directory containing .minio.sys onto another system and start MinIO again, pointing it at the new directory.
If you were running multiple instances (i.e. distributed MinIO), copy each directory containing .minio.sys onto a new disk (each such directory is a "disk" from MinIO's point of view) and start MinIO on the new disks.
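For the single-instance case, here is a minimal sketch of the copy step in Python; the old and new data directory paths are placeholders for wherever your data actually lives:

```python
import shutil

# Placeholder paths: the old MinIO data directory (containing .minio.sys and the
# bucket folders) and the directory the new MinIO instance will serve.
OLD_DATA_DIR = "/old/minio/data"
NEW_DATA_DIR = "/new/minio/data"

# Copy everything, including .minio.sys, preserving file metadata.
shutil.copytree(OLD_DATA_DIR, NEW_DATA_DIR, dirs_exist_ok=True)

print(f"Copied. Start MinIO against the new path, e.g.: minio server {NEW_DATA_DIR}")
```

For distributed MinIO you would repeat this per drive, keeping the drive count and order the same as in the original deployment.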

Related

How to move the data and log location in ElasticSearch

I have an ES cluster set up with 3 master and 2 data nodes, and it is running properly. I want to change the data and log location of one of the data nodes from the local disk to external disks.
In my current YAML file:
path.data: /opt/elasticsearch/data
path.logs: /opt/logs/elasticsearch
Now I have added 2 external disks to my server to store data/logs and would like to change the location to the new drives.
I have added the new disks. What is the correct process to point the ES data/logs to the new disks?
The data on this node can be deleted as this is a dev env.
Could I just stop ES on this server,
delete the contents of the current data and log folders,
mount the new drives to the same mount points and restart the cluster?
Thanks
You can just change the settings in the YAML file and restart the Elasticsearch service; that should work for you. There is no automatic reload when you change any YAML configuration.
Steps:
Change the paths in the YAML file
Restart the service
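As a rough illustration of the first step, here is a sketch that rewrites the two path settings in elasticsearch.yml with PyYAML; the config path and the new mount points are assumptions for this example, and you can just as easily edit the file by hand:

```python
import yaml  # pip install pyyaml

CONFIG_PATH = "/etc/elasticsearch/elasticsearch.yml"  # assumed default location

# New locations on the freshly mounted external disks (placeholders).
NEW_DATA_PATH = "/mnt/data1/elasticsearch/data"
NEW_LOGS_PATH = "/mnt/data2/elasticsearch/logs"

with open(CONFIG_PATH) as f:
    config = yaml.safe_load(f) or {}

config["path.data"] = NEW_DATA_PATH
config["path.logs"] = NEW_LOGS_PATH

with open(CONFIG_PATH, "w") as f:
    yaml.safe_dump(config, f, default_flow_style=False)

# After this, restart the service, e.g. with `systemctl restart elasticsearch`.
```

Note that rewriting the file this way drops any comments it contained; the point is only to show which two keys change.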

How to move/copy/read files from Azure Blob into Databricks, transform the file and send it to a target blob container

I am a beginner in Azure Databricks. I wanted to know how we can copy and read files from an Azure Blob source container into Databricks, transform them as needed and send them back to a target container in Blob storage.
Can someone provide Python code here?
It's not recommended to copy files to DBFS. I would suggest you mount the Blob storage account; then you can read/write files to the storage account.
You can mount a Blob storage container or a folder inside a container to the Databricks File System (DBFS). The mount is a pointer to a Blob storage container, so the data is never synced locally.
Reference: https://learn.microsoft.com/en-us/azure/databricks/data/data-sources/azure/azure-storage
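A minimal sketch of what that can look like in a Databricks notebook, assuming placeholder names for the storage account, containers, secret scope, file and column (dbutils and spark are only available inside Databricks):

```python
# Mount the source and target containers (account, container and scope names are placeholders).
storage_account = "mystorageaccount"
account_key = dbutils.secrets.get(scope="my-scope", key="storage-account-key")

for container, mount_point in [("source", "/mnt/source"), ("target", "/mnt/target")]:
    dbutils.fs.mount(
        source=f"wasbs://{container}@{storage_account}.blob.core.windows.net",
        mount_point=mount_point,
        extra_configs={
            f"fs.azure.account.key.{storage_account}.blob.core.windows.net": account_key
        },
    )

# Read, transform and write back; the file name, header option and filter column are just examples.
df = spark.read.option("header", "true").csv("/mnt/source/input.csv")
transformed = df.filter(df["amount"] > 0)
transformed.write.mode("overwrite").parquet("/mnt/target/output")
```

If the mounts are only temporary, you can remove them afterwards with dbutils.fs.unmount("/mnt/source") and dbutils.fs.unmount("/mnt/target").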
Do not store any production data in the default DBFS folders.
Reference: Azure Databricks - best practices.

How to configure elasticsearch snapshots using persistent volumes as the "shared file system repository" in Kubernetes (on GCP)?

I have registered the snapshot repository and have been able to create snapshots of the cluster for a pod. I have used a mounted persistent volume as the "shared file system repository" as the backup storage.
However in a production cluster with multiple nodes, it is required that the shared file system is mounted for all the data and master nodes.
Hence I would have to mount the persistent volume for the data nodes and the master nodes.
But the persistent volumes I am using don't support the "ReadWriteMany" access mode, so I can't mount one volume on all the nodes and hence am unable to register the snapshot repository. Is there a way to use persistent volumes as the backup snapshot storage for a production Elasticsearch cluster in Google Kubernetes Engine?
Reading this, I guess that you are using a cluster you created yourself and not GKE, since on GKE you cannot install agents on the master nodes and the workers get recreated whenever there is a node pool update. Please make this clear, since it can be misleading.
There are multiple volume types that allow multiple readers, such as cephfs, glusterfs and nfs. You can take a look at the different volume types in the Kubernetes documentation.
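As an illustration, an RWX claim for the snapshot repository could be created like this via the official kubernetes Python client; the storage class name "nfs-client" and the size are assumptions that depend on which RWX-capable provisioner you actually install:

```python
from kubernetes import client, config  # pip install kubernetes

config.load_kube_config()  # use load_incluster_config() when running inside the cluster

# A ReadWriteMany claim for the ES snapshot repository; it only binds if the
# backing storage class (NFS, CephFS, Filestore, ...) actually supports RWX.
pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="es-snapshot-repo"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteMany"],
        storage_class_name="nfs-client",  # placeholder storage class
        resources=client.V1ResourceRequirements(requests={"storage": "50Gi"}),
    ),
)

client.CoreV1Api().create_namespaced_persistent_volume_claim(
    namespace="default", body=pvc
)
```

On GCP specifically, Cloud Filestore is a managed option for getting an NFS-backed ReadWriteMany volume.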

Replacing a minio node in a 4-node cluster

One of the nodes in our 4-node minio cluster is having issues and will be terminated by our cloud provider in a couple of weeks. I have done prior testing with minio and know that it will continue to function with 3 nodes, but I will be bootstrapping a new node a few minutes after I terminate the old one, and our container orchestrator should drop a new minio container on that node and into the minio cluster; I'm not concerned about that part.
What I would like to know is: how can I kickstart minio into rebalancing after the new node is online? In the past, when I tested this scenario, the new minio container did not pull much if any data from the other nodes. Is that because we're still at (n/2) + 1 nodes?
Hypothetically, what would it take for me to see data being transferred between minio containers-- another (different) node being replaced after the new one is online?
At what point would I see data loss?
If it matters, this minio registry just holds container images from an internal docker registry-- the amount of data it holds is relatively small and static, and writes only happen when I push an image to the registry.
FWIW, minio does not resync automatically. You need to run "mc admin heal" against the deployment. Even that appears to only get the added minio container's disk into the online state, so it is available for subsequent uploads.
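For completeness, a small sketch of scripting that heal from Python via the mc client; the alias "myminio" is a placeholder for whatever alias you configured with mc alias set:

```python
import subprocess

# Placeholder alias; set it up first with `mc alias set myminio http://host:9000 KEY SECRET`.
ALIAS = "myminio"

# Recursively heal all buckets and objects on the deployment behind the alias.
result = subprocess.run(
    ["mc", "admin", "heal", "-r", ALIAS],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout)
```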

How to increase or add storage (up to 500 GB) on a new AWS instance?

I want to create a new instance in the AWS cloud. The standard space is 8 GB, which is not enough for my purpose.
I have my application in the /var/www directory, where files from users get uploaded. I need the additional space for these files. After a user uploads a file, I will move it to my S3 storage.
How can I increase the storage of a new instance, or add a new EBS volume attached as /dev/sdb and mounted at /var?
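For reference, creating and attaching an extra EBS volume can be scripted with boto3; the region, availability zone and instance ID below are placeholders, and creating a filesystem plus mounting it at /var still happens on the instance itself:

```python
import boto3  # pip install boto3

ec2 = boto3.client("ec2", region_name="eu-central-1")  # placeholder region

# Create a 500 GB gp3 volume in the same AZ as the instance (placeholder AZ).
volume = ec2.create_volume(
    AvailabilityZone="eu-central-1a",
    Size=500,
    VolumeType="gp3",
)

# Wait until the volume is available, then attach it as /dev/sdb (placeholder instance ID).
ec2.get_waiter("volume_available").wait(VolumeIds=[volume["VolumeId"]])
ec2.attach_volume(
    VolumeId=volume["VolumeId"],
    InstanceId="i-0123456789abcdef0",
    Device="/dev/sdb",
)

# On the instance, format the new device and mount it at /var (or move /var onto it).
```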
