I'm working through Jeffrey Breen's R-Hadoop tutorial (October 2012).
At the moment I am trying to populate HDFS and then run the commands Jeffrey published in his tutorial in RStudio. Unfortunately I have run into some trouble:
UPDATE: I have now moved the data folder to:
/home/cloudera/data/hadoop/wordcount (and the same for the airline data)
Now when I run populate.hdfs.sh I get the following output:
[cloudera@localhost ~]$ /home/cloudera/TutorialBreen/bin/populate.hdfs.sh
mkdir: cannot create directory /user/cloudera: File exists
mkdir: cannot create directory /user/cloudera/wordcount: File exists
mkdir: cannot create directory /user/cloudera/wordcount/data: File exists
mkdir: cannot create directory /user/cloudera/airline: File exists
mkdir: cannot create directory /user/cloudera/airline/data: File exists
put: Target /user/cloudera/airline/data/20040325.csv already exists
And then I tried the commands in RStudio as shown in the tutorial but I get errors at the end. Can someone show me what I did wrong?
> if (LOCAL)
+ {
+ rmr.options.set(backend = 'local')
+ hdfs.data.root = 'data/local/airline'
+ hdfs.data = file.path(hdfs.data.root, '20040325-jfk-lax.csv')
+ hdfs.out.root = 'out/airline'
+ hdfs.out = file.path(hdfs.out.root, 'out')
+ if (!file.exists(hdfs.out))
+ dir.create(hdfs.out.root, recursive=T)
+ } else {
+ rmr.options.set(backend = 'hadoop')
+ hdfs.data.root = 'airline'
+ hdfs.data = file.path(hdfs.data.root, 'data')
+ hdfs.out.root = hdfs.data.root
+ hdfs.out = file.path(hdfs.out.root, 'out')
+ }
> asa.csvtextinputformat = make.input.format( format = function(con, nrecs) {
+ line = readLines(con, nrecs)
+ values = unlist( strsplit(line, "\\,") )
+ if (!is.null(values)) {
+ names(values) = c('Year','Month','DayofMonth','DayOfWeek','DepTime','CRSDepTime',
+ 'ArrTime','CRSArrTime','UniqueCarrier','FlightNum','TailNum',
+ 'ActualElapsedTime','CRSElapsedTime','AirTime','ArrDelay',
+ 'DepDelay','Origin','Dest','Distance','TaxiIn','TaxiOut',
+ 'Cancelled','CancellationCode','Diverted','CarrierDelay',
+ 'WeatherDelay','NASDelay','SecurityDelay','LateAircraftDelay')
+ return( keyval(NULL, values) )
+ }
+ }, mode='text' )
> mapper.year.market.enroute_time = function(key, val) {
+ if ( !identical(as.character(val['Year']), 'Year')
+ & identical(as.numeric(val['Cancelled']), 0)
+ & identical(as.numeric(val['Diverted']), 0) ) {
+ if (val['Origin'] < val['Dest'])
+ market = paste(val['Origin'], val['Dest'], sep='-')
+ else
+ market = paste(val['Dest'], val['Origin'], sep='-')
+ output.key = c(val['Year'], market)
+ output.val = c(val['CRSElapsedTime'], val['ActualElapsedTime'], val['AirTime'])
+ return( keyval(output.key, output.val) )
+ }
+ }
> reducer.year.market.enroute_time = function(key, val.list) {
+ if ( require(plyr) )
+ val.df = ldply(val.list, as.numeric)
+ else { # this is as close as my deficient *apply skills can come w/o plyr
+ val.list = lapply(val.list, as.numeric)
+ val.df = data.frame( do.call(rbind, val.list) )
+ }
+ colnames(val.df) = c('crs', 'actual','air')
+ output.key = key
+ output.val = c( nrow(val.df), mean(val.df$crs, na.rm=T),
+ mean(val.df$actual, na.rm=T),
+ mean(val.df$air, na.rm=T) )
+ return( keyval(output.key, output.val) )
+ }
> mr.year.market.enroute_time = function (input, output) {
+ mapreduce(input = input,
+ output = output,
+ input.format = asa.csvtextinputformat,
+ output.format='csv', # note to self: 'csv' for data, 'text' for bug
+ map = mapper.year.market.enroute_time,
+ reduce = reducer.year.market.enroute_time,
+ backend.parameters = list(
+ hadoop = list(D = "mapred.reduce.tasks=2")
+ ),
+ verbose=T)
+ }
> out = mr.year.market.enroute_time(hdfs.data, hdfs.out)
Error in file(f, if (format$mode == "text") "r" else "rb") :
cannot open the connection
In addition: Warning message:
In file(f, if (format$mode == "text") "r" else "rb") :
cannot open file 'data/local/airline/20040325-jfk-lax.csv': No such file or directory
> if (LOCAL)
+ {
+ results.df = as.data.frame( from.dfs(out, structured=T) )
+ colnames(results.df) = c('year', 'market', 'flights', 'scheduled', 'actual', 'in.air')
+ print(head(results.df))
+ }
Error in to.dfs.path(input) : object 'out' not found
Thank you so much!
First of all, it looks like the command:
/usr/bin/hadoop fs -mkdir /user/cloudera/wordcount/data
is being split across multiple lines. Make sure you're entering it on a single line, exactly as shown.
Also, it is saying that the local directory data/hadoop/wordcount does not exist. Verify that you're running this command from the correct directory and that your local data is where you expect it to be.
I am trying to use the following code to generate a valid URL for accessing a blob in my Azure storage account. The Azure account name and key are stored in .env files. For some reason, the URL doesn't work; I get a Signature did not match error.
# version 2018-11-09 and later, https://learn.microsoft.com/en-us/rest/api/storageservices/create-service-sas#version-2018-11-09-and-later
signed_permissions = "r"
signed_start = "#{(start_time - 5.minutes).iso8601}"
signed_expiry = "#{(start_time + 10.minutes).iso8601}"
canonicalized_resource = "/blob/#{Config.azure_storage_account_name}/media/#{medium.tinyurl}"
signed_identifier = ""
signed_ip = ""
signed_protocol = "https"
signed_version = "2018-11-09"
signed_resource = "b"
signed_snapshottime = ""
rscc = ""
rscd = ""
rsce = ""
rscl = ""
rsct = ""
string_to_sign = signed_permissions + "\n" +
signed_start + "\n" +
signed_expiry + "\n" +
canonicalized_resource + "\n" +
signed_identifier + "\n" +
signed_ip + "\n" +
signed_protocol + "\n" +
signed_version + "\n" +
signed_resource + "\n" +
signed_snapshottime + "\n" +
rscc + "\n" +
rscd + "\n" +
rsce + "\n" +
rscl + "\n" +
rsct
sig = OpenSSL::HMAC.digest('sha256', Base64.strict_decode64(Config.azure_storage_account_key), string_to_sign.encode(Encoding::UTF_8))
sig = Base64.strict_encode64(sig)
@result = "#{medium.storageurl}?sp=#{signed_permissions}&st=#{signed_start}&se=#{signed_expiry}&spr=#{signed_protocol}&sv=#{signed_version}&sr=#{signed_resource}&sig=#{sig}"
PS: This is in Rails and medium is a record pulled from the DB that contains information about the blob in Azure.
Turns out the issue was clock skew. The signed_start and signed_expiry offsets I was using were too tight. When I relaxed them to -30/+20, I could reliably create SAS tokens using the snippet I posted.
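For illustration, here is the same signing scheme with the wider window, sketched in Python rather than Ruby. The function name build_blob_sas_query, its parameters, and the reading of -30/+20 as minutes are assumptions made for this sketch. The field order simply mirrors the string_to_sign in the question. Unlike the snippet above, it also percent-encodes the signature, since Base64 output can contain + and /.

```python
import base64
import hashlib
import hmac
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode


def iso8601_utc(moment):
    """Format a timezone-aware datetime as e.g. 2024-01-01T12:00:00Z."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_blob_sas_query(account_name, account_key_b64, blob_path, now,
                         start_skew=timedelta(minutes=30),
                         lifetime=timedelta(minutes=20)):
    """Return a read-only, HTTPS-only SAS query string for one blob.

    blob_path is "container/blob-name" (hypothetical parameter). The field
    order mirrors the question's string_to_sign for version 2018-11-09.
    """
    fields = {
        "sp": "r",
        "st": iso8601_utc(now - start_skew),  # backdated to absorb clock skew
        "se": iso8601_utc(now + lifetime),
        "spr": "https",
        "sv": "2018-11-09",
        "sr": "b",
    }
    canonicalized_resource = f"/blob/{account_name}/{blob_path}"
    string_to_sign = "\n".join([
        fields["sp"], fields["st"], fields["se"], canonicalized_resource,
        "",                  # signed identifier
        "",                  # signed IP
        fields["spr"], fields["sv"], fields["sr"],
        "",                  # snapshot time
        "", "", "", "", "",  # rscc, rscd, rsce, rscl, rsct
    ])
    digest = hmac.new(base64.b64decode(account_key_b64),
                      string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    fields["sig"] = base64.b64encode(digest).decode("ascii")
    # urlencode percent-escapes ':', '+', '/' and '=' in the values.
    return urlencode(fields)
```

Appending the returned string after a ? on the blob URL gives the full link. Passing now explicitly (for example datetime.now(timezone.utc)) keeps the function easy to test with a fixed clock.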
Hello, I'm trying to read tables related through ManyToOne. I get the expected result when I execute the query in Navicat,
but when I try to display the data in the Angular front end it fails: I get only the main table's columns.
This is the query:
//like this
@Query(value = "SELECT\n" +
"\tnotification.idnotif,\n" +
"\tnotification.message,\n" +
"\tnotification.\"state\",\n" +
"\tnotification.title,\n" +
"\tnotification.\"customData\",\n" +
"\tnotification.\"date\",\n" +
"\tnotification.receiver,\n" +
"\tnotification.sender,\n" +
"\tnotification.\"type\",\n" +
"\thospital.\"name\",\n" +
"\thospital.\"siretNumber\",\n" +
"\tusers.firstname,\n" +
"\tusers.\"isActive\" \n" +
"FROM\n" +
"\tnotification\n" +
"\tINNER JOIN hospital ON notification.receiver = :reciver\n" +
"\tINNER JOIN users ON notification.sender = :sender",nativeQuery = true)
List<Notification> findNotificationCustomQuery(@Param("reciver") Long reciver,@Param("sender") Long sender);
What can I do to resolve this problem?
You are doing an inner join in a native query, so each result row carries columns from several tables, not just Notification. Change the return type from Notification to Object[], as below.
@Query(value = "SELECT\n" +
"\tnotification.idnotif,\n" +
"\tnotification.message,\n" +
"\tnotification.\"state\",\n" +
"\tnotification.title,\n" +
"\tnotification.\"customData\",\n" +
"\tnotification.\"date\",\n" +
"\tnotification.receiver,\n" +
"\tnotification.sender,\n" +
"\tnotification.\"type\",\n" +
"\thospital.\"name\",\n" +
"\thospital.\"siretNumber\",\n" +
"\tusers.firstname,\n" +
"\tusers.\"isActive\" \n" +
"FROM\n" +
"\tnotification\n" +
"\tINNER JOIN hospital ON notification.receiver = :reciver\n" +
"\tINNER JOIN users ON notification.sender = :sender",nativeQuery = true)
List<Object[]> findNotificationCustomQuery(@Param("reciver") Long reciver,@Param("sender") Long sender);
Then you have to loop the result as below and get the attributes.
for(Object[] obj : result){
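String idnotif = String.valueOf(obj[0]); // first selected column (notification.idnotif)
//Get the other columns the same way, by their position in the SELECT list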
}
Hoping someone has a bash script handy that will hit a mongodb and get the collection stats something like the below that I can use in a shell script?
var collectionNames = db.getCollectionNames(), stats = [];
collectionNames.forEach(function (n) { stats.push(db[n].stats()); });
stats = stats.sort(function(a, b) { return b['size'] - a['size']; });
for (var c in stats) { print(stats[c]['ns'] + ": " + stats[c]['size'] + " (" + stats[c]['storageSize'] + ")"); }
UPDATE
one other question --- looking to prefix the line with a datestamp
"db.getCollectionNames().forEach(function (n) { var s = db[n].stats(); print('date +'%D %r %Z'''namespace=' + s['ns'] +',count=' + s['count']+',avgObjSize=' + s['avgObjSize']+',storageSize=' + s['storageSize']) })"
but my date code doesn't seem to be working :(
mongo $DB_NAME --quiet --eval "db.getCollectionNames().forEach(function (n) { var s = db[n].stats(); print(s['ns'] + ',' + s['size'] + ',' + s['storageSize']) })" | sort --field-separator=, --key=2,2nr
It prints CSV, which you can manipulate with a couple of standard tools. The sort call orders the lines by the second field (size), largest first.
Update:
Just add avgObjSize, totalIndexSize and any other keys you need. Edit your main question with an output example so we can sort by whatever column you want.
Update 2:
db.getCollectionNames().forEach(function (n) { var s = db[n].stats(); printjson({'namespace': s['ns'], 'size': s['size'], 'storage': s['storageSize']}) })
db.getCollectionNames().forEach(function (n) { var s = db[n].stats(); print('size=' + s['size'] +',avgObjSize=' + s['avgObjSize']) })
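As a rough sketch of the post-processing step, the CSV lines can also be sorted and date-stamped outside the mongo shell, for example in Python. The function name format_collection_stats and the output layout are assumptions for illustration. The input is assumed to be the three-column namespace,size,storageSize output produced by the one-liner above.

```python
import csv
import io
from datetime import datetime


def format_collection_stats(csv_text, stamp):
    """Sort 'namespace,size,storageSize' rows by size (largest first)
    and prefix each row with a timestamp."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    # float() tolerates sizes printed either as integers or as decimals.
    rows.sort(key=lambda row: float(row[1]), reverse=True)
    prefix = stamp.strftime("%Y-%m-%d %H:%M:%S")
    return [f"{prefix} namespace={ns},size={size},storageSize={storage}"
            for ns, size, storage in rows]
```

Feeding it the text captured from the mongo command together with datetime.now() yields one stamped line per collection, which sidesteps the quoting problems of calling date from inside the --eval string.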
This code is not working: I keep seeing a loading indicator at the top right, although the editor itself works. Could someone help with the actual connection between the Ace editor JavaScript file and the C++ Wt code?
//Start.
editor1 = new Wt::WText(wt_root);
editor1->setText("Testing for the highlight.");
editor1->setInline(false);
//LOAD THE REQUIRED ACE EDITOR SCRIPT FILE.
Wt::WApplication::instance()->require(std::string("AceFiles/ace.js"));
//CONFIG FOR THE EDITOR THAT WILL SUPPORT TEXT.
editor = new Wt::WContainerWidget(wt_root);
editor->resize(500, 500);
range = new Wt::WContainerWidget(wt_root);
//editor_ref IS THE JAVASCRIPT REFERENCE TO THE EDITOR WIDGET.
std::string editor_ref = editor->jsRef();
std::string range_ref = range->jsRef();
std::string command =
editor_ref + ".editor = ace.edit(" + editor_ref + ");" +
range_ref + ".range = ace.require('ace/range')." + range_ref + ";" +
editor_ref + ".editor.setTheme(\"ace / theme / github\");" +
editor_ref + ".editor.getSession().setMode(\"ace/mode/assembly_x86\");" +
editor_ref + ".editor.session.addMarker(new Range(1, 0, 15, 0), \"fullLine\");";
editor->doJavaScript(command);
//CONFIG. FOR THE JSIGNAL USED.
//THIS IS THE CONNECTION BETWEEN THE C++ CODE AND THE JAVASCRIPT.
jsignal = new Wt::JSignal<std::string>(editor, "textChanged");
jsignal->connect(this, &Ui_AceEditor::textChanged);
//CONFIG FOR THE BUTTON.
b = new Wt::WPushButton("Save", wt_root);
command = "function(object, event) {" +
jsignal->createCall(editor_ref + ".editor.getValue()") +
";}";
b->clicked().connect(command);
I have a click function that does a jQuery/Ajax $.post to get data from a webservice when a span is clicked. When there is a Firebug break point set on the click function, everything works as expected (some new table tr's are appended to a table). When there is no break point set, nothing happens when you click the span. Firebug doesn't show any errors. I assume from other stackoverflow questions that this is a timing problem, but I don't know what to do about it. I have tried changing from a $.post to a $.ajax and setting async to false, but that didn't fix it. Here's the code for the click handler:
$('.rating_config').click(function(event){
event.preventDefault();
event.stopPropagation();
var that = $(this);
// calculate the name of the module based on the classes of the parent <tr>
var mytrclasses = $(this).parents('tr').attr('class');
var modulestart = mytrclasses.indexOf('module-');
var start = mytrclasses.indexOf('-', modulestart) + 1;
var stop = mytrclasses.indexOf(' ', start);
var mymodule = mytrclasses.substring(start, stop);
mymodule = mymodule.replace(/ /g, '+');
mymodule = mymodule.replace(/_/g, '+');
mymodule = encodeURI(mymodule);
// calculate the name of the property based on the classes of the parent <tr>
var propertystart = mytrclasses.indexOf('property-');
var propstart = mytrclasses.indexOf('-', propertystart) + 1;
var propstop = mytrclasses.indexOf(' ', propstart);
var myproperty = mytrclasses.substring(propstart, propstop);
myproperty = myproperty.replace(/ /g, '+');
myproperty = myproperty.replace(/_/g, '+');
myproperty = encodeURI(myproperty);
var parentspanid = $(this).attr('id');
// Remove the comparison rows if they are already present, otherwise generate them
if ($('.comparison_' + parentspanid).length != 0) {
$('.comparison_' + parentspanid).remove();
} else {
$.post('http://localhost/LearnPHP/webservice.php?user=user-0&q=comparison&level=property&module=' + mymodule + '&version_id=1.0&property=' + myproperty + '&format=xml', function(data) {
var data = $.xml2json(data);
for (var propnum in data.configuration.modules.module.properties.property) {
var prop = data.configuration.modules.module.properties.property[propnum];
console.log(JSON.stringify(prop));
prop.mod_or_config = 'config';
var item_id = mymodule + '?' + prop.property_name + '?' + prop.version_id + '?' + prop.value;
item_id = convertId(item_id);
prop.id = item_id;
//alert('prop.conformity = ' + prop.conformity);
// genRow(row, module, comparison, comparison_parentspanid)
var rowstring = genRow(prop, mymodule, true, parentspanid);
console.log('back from genRow. rowstring = ' + rowstring);
$(that).closest('tr').after(rowstring);
//$('tr#node-' + data[row].id + ' span#rating' + row.id).css('background', '-moz-linear-gradient(left, #ff0000 0%, #ff0000 ' + data[row].conformity + '%, #00ff00 ' + 100 - data[row].conformity + '%, #00ff00 100%');
var conformity_color = getConformityColor(prop.conformity);
$('tr#comparison_module_' + mymodule + '_setting_' + prop.id + ' span#module_' + mymodule + '_rating' + prop.id).css({'background':'-moz-linear-gradient(left, ' + conformity_color + ' 0%, ' + conformity_color + ' ' + prop.conformity + '%, #fffff0 ' + prop.conformity + '%, #fffff0 100%)'});
//$('tr#comparison-' + data[row].id + ' span#rating' + data[row].id).css('background','-webkit-linear-gradient(left, #00ff00 0%, #00ff00 ' + data[row].conformity + '%, #ff0000 ' + (100 - (data[row].conformity + 2)) + '%, #ff0000 100%)');
}
});
// Hide the Fix by mod column
hideFixedByModCol();
$('tr.comparison_' + parentspanid).each(function(i){
if (i % 2 == 0) {
$(that).addClass('comparison_even');
} else {
$(that).addClass('comparison_odd');
}
});
}
});
Any help would be greatly appreciated!
I suspect your data is coming back malformed. Wrap the code from the breakpoint onward in try {} catch (e) {} to see the error generated. It would also be a good idea to add error handling to your ajax request.