Browsing files of Hadoop

I am running Hadoop on Ubuntu and HDFS itself appears to work: a command line like hadoop fs -ls / returns a result, and the namenode web interface at http://localhost:50070/dfshealth.jsp comes up.
However, when I click the "Browse the filesystem" link on this page I get a 404 error. The error message displayed in the browser reads:
Problem accessing /browseDirectory.jsp. Reason:
/browseDirectory.jsp
The URL in the browser bar at this point is http://0.0.0.0:50070/browseDirectory.jsp?namenodeInfoPort=50070&dir=/.
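For reference, a rough sketch of the same checks run from the shell (curl is assumed to be available; the URLs are the ones quoted above):
hadoop fs -ls /                                   # HDFS itself responds
curl -s -o /dev/null -w "%{http_code}\n" "http://localhost:50070/dfshealth.jsp"
# expect 200 for the namenode UI above; the failing link below comes back 404
curl -s -o /dev/null -w "%{http_code}\n" "http://0.0.0.0:50070/browseDirectory.jsp?namenodeInfoPort=50070&dir=/"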

Related

Error in web ui hadoop related to webhdfs

I am using single-node Hadoop release-3.3.1-RC3. In the Hadoop web UI, under Utilities -> Browse the file system, it should be possible to view the contents of a file (beginning and end) directly in the browser. Instead I get the error Couldn't preview the file. NetworkError: Failed to execute 'send' on 'XMLHttpRequest': Failed to load 'http://desktop-ht79hb0.:9864/webhdfs/v1/user/lemit/output/part-r-00000?op=OPEN&namenoderpcaddress=localhost:9000&length=32768&offset=0&_=1674670084685'. With hdfs dfs in the console, however, I can view the contents of the file.
What I tried: following the link from the error (didn't help), changing the port (didn't help), editing the /etc/hosts file (didn't help).
What I expected: to be able to view the file through WebHDFS in the browser.
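For comparison, a minimal sketch of the two access paths the question contrasts (the file path is taken from the error URL; 9870 is the default namenode HTTP port in Hadoop 3.x, so treating it as this setup's port is an assumption):
# Works for the asker: read the file through the regular HDFS client
hdfs dfs -cat /user/lemit/output/part-r-00000 | head
# What the browser preview does under the hood: a WebHDFS OPEN request,
# which the namenode redirects (-L) to a datanode such as desktop-ht79hb0:9864
curl -L "http://localhost:9870/webhdfs/v1/user/lemit/output/part-r-00000?op=OPEN&offset=0&length=32768"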

Hadoop namenode cannot allow downloading and uploading data from Web UI

I am using Hadoop 3.3.4. In this version I have configured everything as in previous versions, but now, when I try to use the Web UI, it does not let me upload or download data. It always returns the following error: Couldn't preview the file. NetworkError: Failed to execute 'send' on 'XMLHttpRequest': Failed to load 'http://spo:9864/webhdfs/v1/output/part-r-00000?op=OPEN&namenoderpcaddress=localhost:9000&offset=0&_=1667470954252'.
However, I can upload and download data from the command line using the hdfs command. I can use the other options of the Web UI and see the structure I have in the HDFS system.
I have tried modifying core-site.xml and setting fs.defaultFS to the name of my machine, like hdfs://name:9000, as suggested in Why can't DataNode download file?. However, it is not working.
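A sketch of the kind of change being described (the hostname spo is only an assumption taken from the error URL; the XML fragment lives in core-site.xml):
# core-site.xml would carry something along these lines:
#   <property>
#     <name>fs.defaultFS</name>
#     <value>hdfs://spo:9000</value>
#   </property>
# Check what the running configuration actually resolves fs.defaultFS to:
hdfs getconf -confKey fs.defaultFS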

Failed to retrieve data from /webhdfs/v1/?op=LISTSTATUS: Server Error on macOS Monterey

I have installed Hadoop and am able to access the Hadoop interface on localhost. When I try to upload files, the interface gives me the error "Failed to retrieve data from /webhdfs/v1/?op=LISTSTATUS: Server Error".
I typically recommend not using the web interface to upload files.
If you want to upload data to HDFS "properly", use hadoop fs -put in the terminal.
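A minimal sketch of that terminal upload (the local file name and the HDFS target directory are made up for illustration):
hadoop fs -mkdir -p /user/$(whoami)/uploads       # create a target directory if it doesn't exist
hadoop fs -put ./report.csv /user/$(whoami)/uploads/
hadoop fs -ls /user/$(whoami)/uploads             # confirm the file landed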

Connection refused error for Hadoop

When I start my system and open Hadoop, it gives the error "Connection refused".
When I format my namenode using hadoop namenode -format, I'm able to access my Hadoop directory using hadoop dfs -ls /.
But every time I have to format my namenode.
You can't just turn off your computer and expect Hadoop to pick up where it left off when the system comes back on.
You need to actually run stop-dfs.sh before shutting down to prevent corruption in the namenode and datanode directories.
If you do get "connection refused", check both the namenode and datanode logs to see why the daemons aren't starting; otherwise it's a network problem.
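A hedged sketch of the shutdown/startup and log check described above (script and log locations assume a standard $HADOOP_HOME layout):
$HADOOP_HOME/sbin/stop-dfs.sh        # run before powering the machine off
$HADOOP_HOME/sbin/start-dfs.sh       # run after the machine comes back up
jps                                  # NameNode and DataNode should both be listed
tail -n 100 $HADOOP_HOME/logs/hadoop-*-namenode-*.log   # inspect why a daemon failed to start
tail -n 100 $HADOOP_HOME/logs/hadoop-*-datanode-*.log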

Pentaho's "Hadoop File Input" (Spoon) always displays error when trying to read a file from HDFS

I am new to Pentaho and Spoon and I am trying to process a file from a local Hadoop node with a "Hadoop file input" item in Spoon (Pentaho). The problem is that every URI I have tried so far seems to be incorrect. I don't know how to really connect to the HDFS from Pentaho.
To make it clear, the correct URI is:
hdfs://localhost:9001/user/data/prueba_concepto/ListadoProductos_2017_02_13-15_59_con_id.csv
I know it's the correct one because I tested it via the command line and it works perfectly:
hdfs dfs -ls hdfs://localhost:9001/user/data/prueba_concepto/ListadoProductos_2017_02_13-15_59_con_id.csv
So, setting the environment field to "static", here are some of the URIs I have tried in Spoon:
hdfs://localhost:9001/user/data/prueba_concepto/ListadoProductos_2017_02_13-15_59_con_id.csv
hdfs://localhost:8020/user/data/prueba_concepto/ListadoProductos_2017_02_13-15_59_con_id.csv
hdfs://localhost:9001
hdfs://localhost:9001/user/data/prueba_concepto/
hdfs://localhost:9001/user/data/prueba_concepto
hdfs:///
I even tried the solution Garci García gives here: Pentaho Hadoop File Input
which is setting the port to 8020 and using the following URI:
hdfs://catalin:#localhost:8020/user/data/prueba_concepto/ListadoProductos_2017_02_13-15_59_con_id.csv
And then changed it back to 9001 and tried the same technique:
hdfs://catalin:#localhost:9001/user/data/prueba_concepto/ListadoProductos_2017_02_13-15_59_con_id.csv
But still nothing worked for me ... every time I press the Mostrar Fichero(s)... button (Show file(s)), an error pops up saying that the file cannot be found.
I added a "Hadoop File Input" image here.
Thank you.
Okay, so I actually solved this.
I had to add a new Hadoop Cluster from the tab "View" -> Right click on Hadoop Cluster -> New
There I had to input my HDFS Hadoop configuration:
Storage: HDFS
Hostname: localhost
Port: 9001 (the default is 8020)
Username: catalin
Password: (no password)
After that, if you hit the "Test" button, some of the tests will fail. I solved the second one by copying all the configuration properties I had in my LOCAL Hadoop configuration file ($LOCAL_HADOOP_HOME/etc/hadoop/core-site.xml) into Spoon's Hadoop configuration file:
data-integration/plugins/pentaho-big-data-plugin/hadoop-configurations/hdp25/core-site.xml
After that, I had to modify data-integration/plugins/pentaho-big-data-plugin/plugin.properties and set the property "active.hadoop.configuration" to hdp25:
active.hadoop.configuration=hdp25
Restart Spoon and you're good to go.
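For reference, a condensed sketch of the two file edits from the steps above (the cp replaces Spoon's file wholesale rather than merging individual properties, the sed call is just one way to make the edit, and the paths assume a default data-integration install in the working directory):
cp "$LOCAL_HADOOP_HOME/etc/hadoop/core-site.xml" \
   data-integration/plugins/pentaho-big-data-plugin/hadoop-configurations/hdp25/core-site.xml
sed -i 's/^active\.hadoop\.configuration=.*/active.hadoop.configuration=hdp25/' \
   data-integration/plugins/pentaho-big-data-plugin/plugin.properties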
