I am using single-node Hadoop, release-3.3.1-RC3. In the Hadoop web UI, under Utilities -> Browse the file system, it should be possible to view the contents of a file (head and tail) directly in the browser. Instead I get the error Couldn't preview the file. NetworkError: Failed to execute 'send' on 'XMLHttpRequest': Failed to load 'http://desktop-ht79hb0.:9864/webhdfs/v1/user/lemit/output/part-r-00000?op=OPEN&namenoderpcaddress=localhost:9000&length=32768&offset=0&_=1674670084685'. However, with hdfs dfs in the console I can view the contents of the file.
What I tried: followed the URL from the error message (didn't help), changed the port (didn't help), changed the /etc/hosts file (didn't help).
What I expected: the ability to view the file using webhdfs in a browser
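For context, the preview request in that error is sent by the browser directly to the DataNode address embedded in the URL (desktop-ht79hb0. here), so that hostname has to resolve from the machine where the browser runs. As a rough sketch of the kind of /etc/hosts mapping involved, assuming a single-node setup where the DataNode runs on the same machine (the loopback address is an assumption):
127.0.0.1   desktop-ht79hb0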
Related
I am using Hadoop 3.3.4. In this version I have configured everything like in previous versions, but now, when I try to use the Web UI, it does not allow me to upload or download data. It always returns the following error: Couldn't preview the file. NetworkError: Failed to execute 'send' on 'XMLHttpRequest': Failed to load 'http://spo:9864/webhdfs/v1/output/part-r-00000?op=OPEN&namenoderpcaddress=localhost:9000&offset=0&_=1667470954252'.
However, I can download and upload data from the command line using the hdfs command. I can also use the other options of the Web UI and see the directory structure I have in HDFS.
I have tried modifying core-site.xml and setting the option fs.defaultFS to the name of my machine, like hdfs:name:9000, as suggested in Why can't DataNode download file?. However, it is not working.
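For reference, a minimal core-site.xml sketch of that kind of change, assuming a single-node setup; name is a placeholder for the machine's hostname, as described above:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://name:9000</value>
  </property>
</configuration>
One thing worth double-checking: fs.defaultFS takes a full hdfs://host:port URI, so hdfs:name:9000 as written above would not be a valid value.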
I have installed Hadoop and able to access localhost Hadoop interface. When I try to upload files the interface gives me the error "Failed to retrieve data from /webhdfs/v1/?op=LISTSTATUS: Server Error".
I typically recommend not using the web interface to upload files.
If you want to "properly" upload data to HDFS, use hadoop fs -put in the terminal.
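A quick sketch of that terminal workflow; the local file name and the HDFS destination directory below are hypothetical placeholders:
# create the destination directory in HDFS if it does not exist yet
hadoop fs -mkdir -p /user/myuser/data
# upload a local file into that directory
hadoop fs -put localfile.csv /user/myuser/data/
# confirm the file arrived
hadoop fs -ls /user/myuser/data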
I have a pipeline in NiFi of the form ListHDFS -> MoveHDFS. Attempting to run the pipeline, we see the following error log:
13:29:21 HST DEBUG 01631000-d439-1c41-9715-e0601d3b971c
ListHDFS[id=01631000-d439-1c41-9715-e0601d3b971c] Returning CLUSTER State: StandardStateMap[version=43, values={emitted.timestamp=1525468790000, listing.timestamp=1525468790000}]
13:29:21 HST DEBUG 01631000-d439-1c41-9715-e0601d3b971c
ListHDFS[id=01631000-d439-1c41-9715-e0601d3b971c] Found new-style state stored, latesting timestamp emitted = 1525468790000, latest listed = 1525468790000
13:29:21 HST DEBUG 01631000-d439-1c41-9715-e0601d3b971c
ListHDFS[id=01631000-d439-1c41-9715-e0601d3b971c] Fetching listing for /hdfs/path/to/dir
13:29:21 HST ERROR 01631000-d439-1c41-9715-e0601d3b971c
ListHDFS[id=01631000-d439-1c41-9715-e0601d3b971c] Failed to perform listing of HDFS due to File /hdfs/path/to/dir does not exist: java.io.FileNotFoundException: File /hdfs/path/to/dir does not exist
Changing the ListHDFS path to /tmp seems to run OK, which makes me think the problem is with my permissions on the directory I'm trying to list. However, changing the NiFi user to a user that can access that directory (e.g. one that can run hadoop fs -ls /hdfs/path/to/dir) by setting the bootstrap.properties value run.as=myuser and restarting (see https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#bootstrap_properties) still produces the same problem for the directory. The literal directory string being used that is not working is:
"/etl/ucera_internal/datagov_example/raw-ingest-tracking/version-1/ingest"
Does anyone know what is happening here? Thanks.
Note: the Hadoop cluster I am accessing does not have Kerberos enabled (it is a secured MapR Hadoop cluster).
Update: It appears that the MapR Hadoop implementation is different enough that it requires special steps for NiFi to work properly with it (see https://community.mapr.com/thread/10484 and http://hariology.com/integrating-mapr-fs-and-apache-nifi/). I may not get a chance to work on this for some time to confirm whether that fixes it (as certain requirements have changed), so I am leaving the links here for others who may hit this problem in the meantime.
Could you make sure you have entered the correct path, and that the directory actually exists in HDFS?
It seems the ListHDFS processor is not able to find the directory you have configured in the Directory property, and the logs are not showing any permission-denied issues.
If the logs did show permission denied, then you could either change the NiFi run-as user in bootstrap.conf (NiFi needs a restart for that change to take effect), or change the permissions on the directory so that NiFi has access.
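As a minimal sketch of both checks, assuming shell access to the cluster and using the directory string from the question; the nifi user name and the conf/bootstrap.conf location are assumptions:
# verify the directory exists exactly as configured in the Directory property
hadoop fs -ls -d "/etl/ucera_internal/datagov_example/raw-ingest-tracking/version-1/ingest"
# inspect owner, group and permissions on the path
hadoop fs -ls "/etl/ucera_internal/datagov_example/raw-ingest-tracking/version-1"
# either grant the NiFi user access on the directory ...
hadoop fs -chmod -R g+rx "/etl/ucera_internal/datagov_example/raw-ingest-tracking/version-1/ingest"
# ... or run NiFi as a user that already has access (conf/bootstrap.conf), then restart NiFi
run.as=nifi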
I am new to Pentaho and Spoon and I am trying to process a file from a local Hadoop node with a "Hadoop file input" item in Spoon (Pentaho). The problem is that every URI I have tried so far seems to be incorrect. I don't know how to really connect to the HDFS from Pentaho.
To make it clear, the correct URI is:
hdfs://localhost:9001/user/data/prueba_concepto/ListadoProductos_2017_02_13-15_59_con_id.csv
I know it's the correct one because I tested it via command-line and it perfectly works:
hdfs dfs -ls hdfs://localhost:9001/user/data/prueba_concepto/ListadoProductos_2017_02_13-15_59_con_id.csv
So, setting the environment field to "static", here are some of the URIs I have tried in Spoon:
hdfs://localhost:9001/user/data/prueba_concepto/ListadoProductos_2017_02_13-15_59_con_id.csv
hdfs://localhost:8020/user/data/prueba_concepto/ListadoProductos_2017_02_13-15_59_con_id.csv
hdfs://localhost:9001
hdfs://localhost:9001/user/data/prueba_concepto/
hdfs://localhost:9001/user/data/prueba_concepto
hdfs:///
I even tried the solution Garci García gives here: Pentaho Hadoop File Input
which is to set the port to 8020 and use the following URI:
hdfs://catalin:#localhost:8020/user/data/prueba_concepto/ListadoProductos_2017_02_13-15_59_con_id.csv
And then changed it back to 9001 and tried the same technique:
hdfs://catalin:#localhost:9001/user/data/prueba_concepto/ListadoProductos_2017_02_13-15_59_con_id.csv
But still nothing worked for me... every time I press the Mostrar Fichero(s)... button (Show file(s)), an error pops up saying that the file cannot be found.
I added a "Hadoop File Input" image here.
Thank you.
Okay, so I actually solved this.
I had to add a new Hadoop Cluster from the tab "View" -> Right click on Hadoop Cluster -> New
There I had to input my HDFS Hadoop configuration:
Storage: HDFS
Hostname: localhost
Port: 9001 (by default is 8020)
Username: catalin
Password: (no password)
After that, if you hit the "Test" button, some of the tests will fail. I solved the second one by copying all the configuration properties I had in my LOCAL Hadoop configuration file ($LOCAL_HADOOP_HOME/etc/hadoop/core-site.xml) into Spoon's Hadoop configuration file:
data-integration/plugins/pentaho-big-data-plugin/hadoop-configurations/hdp25/core-site.xml
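For illustration only, a minimal sketch of what the copied properties might look like in this setup; the one property I would expect to matter here is fs.defaultFS, and the value simply mirrors the host and port used elsewhere in this question:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9001</value>
  </property>
</configuration>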
After that, I had to modify the data-integration/plugins/pentaho-big-data-plugin/plugin.properties and set the property "active.hadoop.configuration" to hdp25:
active.hadoop.configuration=hdp25
Restart Spoon and you're good to go.
I am running Hadoop on Ubuntu and HDFS itself appears to work: a command like hadoop fs -ls / returns a result, and the NameNode web interface at http://localhost:50070/dfshealth.jsp comes up.
However, when I click on the "Browse the filesystem" link on this page I get a 404 error. The error message displayed in the browser reads:
Problem accessing /browseDirectory.jsp. Reason:
/browseDirectory.jsp
The URL in the browser bar at this point is http://0.0.0.0:50070/browseDirectory.jsp?namenodeInfoPort=50070&dir=/.