What will be the impact of changing the ownership of a VOB in ClearCase? Does it impact the metadata? Is there anything else that needs to be changed so that users won't face any problems? How does it affect trigger types, hyperlinks and attribute types?
Note: the group is not going to be changed, but I am not sure whether the existing group will be affected after changing the ownership.
You would use the command cleartool protectvob -chown ...
It doesn't have side effects on metadata (pvob, trtypes, attribute types, ...), unless you are running it on MultiSite VOBs.
You can use -chown by itself or in combination with -chgrp.
-cho/wn user
Specifies a new VOB owner, who becomes the owner of all the VOB's storage pools and all of the data containers in them.
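For example, a minimal sketch (the VOB storage path and the new owner account below are hypothetical placeholders for your own values):
cleartool protectvob -chown newvobadm /vobstore/myvob.vbs
cleartool asks for confirmation before changing the protections; adding -force suppresses the prompt.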
Note that on MultiSite:
If you run protectvob -chown or protectvob -chgrp on a VOB replica that preserves identities, you must follow these steps to prevent metadata divergence among replicas in the VOB family:
Stop synchronization among identities-preserving replicas in the family. Make sure that all update packets have been imported.
Run protectvob -chown or protectvob -chgrp on all identities-preserving replicas in the family. You must use the same options and arguments in each command.
Restart synchronization.
(A VOB's protection with respect to privileged access by remote users is never replicated. It must be set on each replica by a privileged user logged on to the VOB server host.)
I would really appreciate help on the correct course of action. The setup is 3 ELK nodes which have all roles.
No shard replication is done. Node 3 experienced a failure on the disk which contains the data folder. An old copy (about a month old) of that folder exists, and I know that simply copying that data back in would not be sufficient.
My question is: what is the correct course of action at this point that would return the stack to normal operation mode:
1. install a new disk and just launch the node? By a strike of luck, that was our least important data.
2. install the new disk and copy the old data and see if it can recover that data?
Also, as a variant of option 1, could I launch an experimental node on which the old data folder is mounted, restore whatever data is recoverable, and re-index it remotely into the original cluster?
Another option is to try to use the bin/elasticsearch-shard tool to see if you can repair part of the data.
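For example, a sketch of invoking it against a single shard (the index name and shard id are hypothetical, and the node must be stopped before running the tool):
bin/elasticsearch-shard remove-corrupted-data --index my-least-important-index --shard-id 0
Keep in mind that the tool works by truncating the corrupted portions, so whatever it cannot salvage is lost for good.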
I just set up an NVMeoF/RDMA environment to play around with. I have a target node whose NVMe SSD is accessed by several client nodes. However, when I delete a file, say test, on one client node, the other nodes cannot see this operation and can still read the content of test as normal. I know that RDMA bypasses the kernel, so I guess this is because of the cache? I then tried to clean up the cache using these commands:
sudo sync; echo 3 | sudo tee /proc/sys/vm/drop_caches
sudo sync; echo 1 | sudo tee /proc/sys/vm/drop_caches
sudo sync; echo 2 | sudo tee /proc/sys/vm/drop_caches
Unfortunately, other nodes still keep this file.
So actually I have two questions:
Is this really due to the cache? How does it work?
What is the correct way to clean up the cache so that other nodes can see the deletion without re-mount?
Any help will be greatly appreciated!
Relatively short answer:
Like Boris said, you don't want to do that (distributed consistency on storage is a hard problem), and you need something else to do what you want. Flushing caches may not work because you've got multiple distinct views of the system plus multiple caching behaviors.
Longer answer:
As Boris mentioned, NVMeoF is a block protocol. This means that at a broad level (with some hand-waving) all it can do is read and write blocks at a particular address. In practice, we usually have layers above the NVMe/NVMeoF communication layer like file systems that handle this abstraction.
I can't tell right now if you're using a file system or if you are reading/writing the device directly, but in either case you are at least partially correct that the page cache may be getting in the way, even with RDMA.
Now, if you are using local file systems on your client nodes, you quickly get inconsistent views. The filesystem (and consequently overall operating system and its view of the state of the page cache and block storage) has no idea anyone else wrote anything. So even if you write and sync on one client, you may have to bypass the page cache on another (e.g. use O_DIRECT reads, which have their own sets of complexities) and make sure you target something that eventually refers to the same block addresses that were written on the NVMe target from your other client.
In theory, this will let you read data written by another client if everything lines up correctly; in practice, though, this can cause confusion, especially if a file system or application on one client writes something at a location and another client attempts to read or write that location unknowingly. Now you have a consistency problem.
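As a concrete illustration of the cache-bypassing read mentioned above, a direct read with dd might look like this (the device name is an assumption for your setup):
# read the first 4 KiB of the namespace with O_DIRECT, bypassing the page cache
sudo dd if=/dev/nvme0n1 iflag=direct bs=4096 count=1 2>/dev/null | hexdump -C | head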
NVMeoF (with RDMA or any other transport) is a block level storage protocol and not a file level storage protocol. Thus, there is no guarantee to the atomicity of file operations across nodes in NVMeoF systems. Even if one node deletes a file, there is no guarantee that:
The delete operation was actually translated to block erase operations and sent to the storage server;
Even if the storage server deleted the blocks, there is no guarantee that other clients that have cached this data will not continue to read it. Moreover, another client can overwrite the deleted file.
Overall, I think that to have any guarantees at the file level, you should consider a distributed file system rather than NVMeoF.
What is the correct way to clean up the cache so that other nodes can see the deletion without re-mount?
There is no good way to do it. Flushing the cache on all nodes and only then reading may work, but it depends on the file system.
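If you still want to experiment with that, a rough sketch would be to flush on every client before re-reading (the host names are placeholders):
# flush dirty pages and drop the page cache, dentries and inodes on each client
for host in client1 client2 client3; do
    ssh "$host" 'sudo sync && echo 3 | sudo tee /proc/sys/vm/drop_caches'
done
Even then, a local file system that keeps metadata in memory can hand back stale directory entries, which is why this is not a reliable fix.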
I'm currently reading about hadoop and I came across this which has puzzled me (please bear in mind that I am a complete novice when it comes to hadoop) -
Use the Hadoop get command to copy a file from HDFS to your local file system:
$ hadoop hdfs dfs -get file_name /user/login_user_name
What is a local file system? I understand that a HDFS partitions a file into different blocks throughout the cluster (but I know there's more to it than that). My understanding of the above command is that I can copy a file from the cluster to my personal (i.e. local) computer? Or is that completely wrong? I'm just not entirely sure what is meant by a local file system.
Local FS means your Linux FS or Windows FS, which is not part of the DFS.
Your understanding is correct: using -get you are copying a file from HDFS to the local FS. Note that you cannot use both hadoop and hdfs together; the command should be like one of the following:
hdfs dfs -get file_name local_path
hadoop fs -get file_name local_path
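For example, with hypothetical paths:
hdfs dfs -get /user/kumar/sample.txt /home/kumar/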
Just as you can divide a file system into different drives, you can create the Hadoop file system as a separate file system on top of the Linux file system.
Your local file system is the file system on which you have installed Hadoop. Your machine acts as "local" in this case when you copy a file from your machine to Hadoop.
You might want to look at: HDFS vs LFS
THINK of a cluster node (server) as having to fulfill 2 needs:
the need to store its own operating system, application and user data-related files; and
the need to store its portion of sharded or "distributed" cluster data files.
In each cluster data node, then, there need to be 2 independent file systems:
the LOCAL ("non-distributed") file system:
stores the OS and all OS-related ancillary ("helper") files;
stores the binary files which form the applications which run on the server;
stores additional data files, but these exist as simple files which are NOT sharded/replicated/distributed in the server's "cluster data" disks;
typically comprised of numerous partitions - entire formatted portions of a single disk or multiple disks;
typically also running LVM in order to ensure "expandability" of these partitions containing critical OS-related code which cannot be permitted to saturate or the server will suffer catastrophic (unrecoverable) failure.
AND
the DISTRIBUTED file system:
stores only the sharded, replicated portions of what are actually massive data files "distributed" across all the other data drives of all the other data nodes in the cluster
typically comprised of at least 3 identical disks, all "raw" - unformatted, with NO RAID of any kind and NO LVM of any kind, because the cluster software (installed on the "local" file system) is actually responsible for its OWN replication and fault-tolerance, so that RAID and LVM would actually be REDUNDANT, and therefore cause unwanted LATENCY in the entire cluster performance.
LOCAL <==> OS and application and data and user-related files specific or "local" to the specific server's operation itself;
DISTRIBUTED <==> sharded/replicated data; capable of being concurrently processed by all the resources in all the servers in the cluster.
A file can START in a server's LOCAL file system where it is a single little "mortal" file - unsharded, unreplicated, undistributed; if you were to delete this one copy, the file is gone gone GONE...
... but if you first MOVE that file to the cluster's DISTRIBUTED file system, where it becomes sharded, replicated and distributed across at least 3 different drives likely on 3 different servers which all participate in the cluster, so that now if a copy of this file on one of these drives were to be DELETED, the cluster itself would still contain 2 MORE copies of the same file (or shard); and WHERE in the local system your little mortal file could only be processed by the one server and its resources (CPUs + RAM)...
... once that file is moved to the CLUSTER, now it's sharded into myriad smaller pieces across at least 3 different servers (and quite possibly many many more), and that file can have its little shards all processed concurrently by ALL the resources (CPUs & RAM) of ALL the servers participating in the cluster.
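A minimal sketch of that move, with hypothetical paths:
# copy the file from the LOCAL file system into the DISTRIBUTED file system (HDFS);
# use -moveFromLocal instead of -put if you also want the local copy removed
hdfs dfs -put /home/user/mortal.txt /data/
# show how many replicas the cluster keeps of the file's blocks
hdfs dfs -stat %r /data/mortal.txt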
And that is the difference between the LOCAL file system and the DISTRIBUTED file system operating on each server, and that is the secret to the power of cluster computing :-) !...
Hope this offers a clearer picture of the difference between these two often-confusing concepts!
-Mark from North Aurora
In Hadoop, I have enabled authorization. I have set a few ACLs for a directory.
When I execute the getfacl command from the hadoop bin, I can see the mask value in the output.
hadoop fs -getfacl /Kumar
# file: /Kumar
# owner: Kumar
# group: Hadoop
user::rwx
user:Babu:rwx
group::r-x
mask::rwx
other::r-x
If I run the same query using WebHDFS, the mask value is not shown.
http://localhost:50070/webhdfs/v1/Kumar?op=GETACLSTATUS
{
  "AclStatus": {
    "entries": [
      "user:Babu:rwx",
      "group::r-x"
    ],
    "group": "Hadoop",
    "owner": "Kumar",
    "permission": "775",
    "stickyBit": false
  }
}
What is the reason the mask value is not shown in the WebHDFS response for the GETACLSTATUS command?
Help me figure it out.
HDFS implements the POSIX ACL model. The linked documentation explains that the mask entry is persisted into the group permission bits of the classic POSIX permission model. This is done to support the requirements of POSIX ACLs and also support backwards-compatibility with existing tools like chmod, which are unaware of the extended ACL entries. Quoting that document:
In minimal ACLs, the group class permissions are identical to the
owning group permissions. In extended ACLs, the group class may
contain entries for additional users or groups. This results in a
problem: some of these additional entries may contain permissions that
are not contained in the owning group entry, so the owning group entry
permissions may differ from the group class permissions.
This problem is solved by the virtue of the mask entry. With minimal
ACLs, the group class permissions map to the owning group entry
permissions. With extended ACLs, the group class permissions map to
the mask entry permissions, whereas the owning group entry still
defines the owning group permissions.
...
When an application changes any of the owner, group, or other class
permissions (e.g., via the chmod command), the corresponding ACL entry
changes as well. Likewise, when an application changes the permissions
of an ACL entry that maps to one of the user classes, the permissions
of the class change.
This is relevant to your question, because it means the mask is not in fact persisted as an extended ACL entry. Instead, it's in the permission bits. When querying WebHDFS, you've made a "raw" API call to retrieve information about the ACL. When running getfacl, you've run an application that layers additional display logic on top of that API call. getfacl is aware that for a file with an ACL, the group permission bits are interpreted as the mask, and so it displays accordingly.
This is not specific to WebHDFS. If an application were to call getAclStatus through the NameNode's RPC protocol, then it would see the equivalent of the WebHDFS response. Also, if you were to use the getfacl command on a webhdfs:// URI, then the command would still display the mask, because the application knows to apply that logic regardless of the FileSystem implementation.
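A quick way to see this interplay for yourself, with hypothetical directory and user names:
# adding a named-user entry gives the directory an extended ACL, and therefore a mask
hdfs dfs -setfacl -m user:Babu:rwx /Kumar
# with an extended ACL present, chmod's group digit rewrites the mask, not the owning group entry
hdfs dfs -chmod 750 /Kumar
# getfacl now reports mask::r-x, while WebHDFS reports the same information as the 5 in "permission": "750"
hdfs dfs -getfacl /Kumar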
I use a hadoop (version 1.2.0) cluster of 16 nodes, one with a public IP (the master) and 15 connected through a private network (the slaves).
Is it possible to use a remote server (in addition to these 16 nodes) for storing the output of the mappers? The problem is that the nodes are running out of disk space during the map phase and I cannot compress map output any more.
I know that mapred.local.dir in mapred-site.xml is used to set a comma-separated list of dirs where the tmp files are stored. Ideally, I would like to have one local dir (the default one) and one directory on the remote server. When the local disk fills up, I would then like to use the remote disk.
I am not very sure about this, but as per the link (http://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml) it says:
The local directory is a directory where MapReduce stores intermediate data files.
May be a comma-separated list of directories on different devices in
order to spread disk i/o. Directories that do not exist are ignored.
Also, there are some other properties which you should check out. These might be of help (a configuration sketch follows below):
mapreduce.tasktracker.local.dir.minspacestart: If the space in mapreduce.cluster.local.dir drops under this, do not ask for more tasks. Value in bytes
mapreduce.tasktracker.local.dir.minspacekill: If the space in mapreduce.cluster.local.dir drops under this, do not ask more tasks until all the current ones have finished and cleaned up. Also, to save the rest of the tasks we have running, kill one of them, to clean up some space. Start with the reduce tasks, then go with the ones that have finished the least. Value in bytes.
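A sketch of how those could be set in mapred-site.xml (the property names are taken from the linked mapred-default.xml; on Hadoop 1.x I believe the older names mapred.local.dir.minspacestart and mapred.local.dir.minspacekill apply, and the byte values below are arbitrary placeholders, not recommendations):
<property>
  <name>mapreduce.tasktracker.local.dir.minspacestart</name>
  <!-- do not accept new tasks when less than ~1 GB is free -->
  <value>1073741824</value>
</property>
<property>
  <name>mapreduce.tasktracker.local.dir.minspacekill</name>
  <!-- below ~512 MB, stop scheduling new tasks and start killing running ones to reclaim space -->
  <value>536870912</value>
</property>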
The solution was to use the iSCSI technology. A technician helped us out to achieve that, so unfortunately I am not able to provide more details on that.
We mounted the remote disk at a local path (/mnt/disk) on each slave node, and created a tmp directory there with rwx privileges for all users.
Then, we changed the $HADOOP_HOME/conf/mapred-site.xml file and added the property:
<property>
  <name>mapred.local.dir</name>
  <value>/mnt/disk/tmp</value>
</property>
Initially, we had two comma-separated values for that property, with the first being the default value, but it still didn't work as expected (we still got some "No space left on device" errors), so we left only one value there.
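For illustration, the two-value form described above would have looked roughly like this (the first path is the stock Hadoop 1.x default and is an assumption here):
<property>
  <name>mapred.local.dir</name>
  <value>${hadoop.tmp.dir}/mapred/local,/mnt/disk/tmp</value>
</property>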