I'm trying to port https://github.com/markkraay/mnist-from-scratch to Rust as an introduction to ML and the Rust programming language.
I've decided to use nalgebra instead of rewriting a matrix library. However, I'm running into an error stating "function or associated item not found in `Matrix<f64, Dynamic, Dynamic, VecStorage<f64, Dynamic, Dynamic>>`" when attempting to call new_random() on a DMatrix, and I can't see how to fix it.
For context, this is my code:
pub fn new(input: usize, hidden: usize, output: usize, learning_rate: usize) -> NeuralNetwork {
let hidden_weights = na::DMatrix::<f64>::new_random(hidden, input);
let output_weights = na::DMatrix::<f64>::new_random(output, hidden);
NeuralNetwork {
input,
hidden,
output,
learning_rate,
hidden_weights,
output_weights
}
}
I've tried removing <f64> so that it is instead
na::DMatrix::new_random(hidden, input);
but there is no difference.
To use new_random, you have to enable nalgebra's rand feature in Cargo.toml:
[dependencies]
nalgebra = { version = "0.31.4", features = ["rand"] }
After that, your code should work as you posted it.
If you have cargo-edit installed, or Cargo 1.62 or later (where cargo add is built in), you can also run:
cargo add nalgebra --features rand
I'm having issues building projects with Rust. My lib.rs only has this:
use near_sdk::borsh::{self, BorshDeserialize, BorshSerialize};
use near_sdk::{env, near_bindgen};
near_sdk::setup_alloc!();
#[near_bindgen]
#[derive(Default, BorshDeserialize, BorshSerialize)]
pub struct Counter {
// See more data types at https://doc.rust-lang.org/book/ch03-02-data-types.html
val: i8, // i8 is signed. unsigned integers are also available: u8, u16, u32, u64, u128
}
#[near_bindgen]
impl Counter {
pub fn get_num(&self) -> i8 {
return self.val;
}
pub fn increment(&mut self) {
self.val += 1;
let log_message = format!("Increased number to {}", self.val);
env::log(log_message.as_bytes());
after_counter_change();
}
pub fn decrement(&mut self) {
self.val -= 1;
let log_message = format!("Decreased number to {}", self.val);
env::log(log_message.as_bytes());
after_counter_change();
}
pub fn reset(&mut self) {
self.val = 0;
// Another way to log is to cast a string into bytes, hence "b" below:
env::log(b"Reset counter to zero");
}
}
fn after_counter_change() {
// show helpful warning that i8 (8-bit signed integer) will overflow above 127 or below -128
env::log("Make sure you don't overflow, my friend.".as_bytes());
}
When I run RUSTFLAGS='-C link-arg=-s' cargo build --target wasm32-unknown-unknown --release I get this error:
error: linker `cc` not found
|
= note: No such file or directory (os error 2)
error: could not compile `proc-macro2` due to previous error
If I add proc-macro2 to the dependencies, the error changes to "could not compile" for some other dependency, and it keeps naming new ones as I add each of them to Cargo.toml:
[package]
name = "contract"
version = "0.1.0"
authors = ["Near Inc <hello#nearprotocol.com>"]
edition = "2021"
[lib]
crate-type = ["cdylib", "rlib"]
[dependencies]
near-sdk = "=4.0.0-pre.4"
[profile.release]
codegen-units=1
opt-level = "z"
lto = true
debug = false
panic = "abort"
overflow-checks = true
Any help?
No, you don't need to add those dependencies. The build is missing a C linker, so you just need to install:
sudo apt install build-essential
EDIT (Adding more info):
The Linux Rust installer doesn't check for a compiler toolchain, but seems to assume that you've already got a C linker installed! The best solution is to install the tried-and-true gcc toolchain.
Source: How do I fix the Rust error "linker 'cc' not found" for Debian on Windows 10? (credits to https://stackoverflow.com/users/4498831/boiethios)
It turned out something was wrong with my WSL Ubuntu: its cargo version was different from the Windows one. I was able to build the project on Windows.
I'm a Rust newbie. I started one week ago, but this language is already very exciting. I'm rewriting a Node.js project in Rust to get better performance, and so far it's just crazy how much faster it is.
I'm currently writing a proc_macro_derive macro (using the "syn" crate) to generate methods on some specific structs. I'm almost done, but I can't find how to generate an enum variant. I will try to explain myself.
Here is my code generation (using quote!):
quote! {
// The generated impl.
impl #name /*#ty_generics #where_clause*/ {
pub fn from_config(config: &IndicatorConfig) -> Result<Self, Error> {
let mut #name_lower = #name::default()?;
for (k, v) in config.opts.iter() {
println!("{:?} {:?}", k, v);
match (k.as_str(), v) {
("label", Values::String(val)) => {
#name_lower.label = val.clone();
}
("agg_time", Values::String(val)) => {
#name_lower.agg_time = Some(val.clone());
}
#(
(#fields_name_str, Values::Unteger(val)) => {
#name_lower.#fields_name = val.clone();
}
)*
(&_, _) => {}
}
}
#name_lower.init()?;
Ok(#name_lower)
}
}
};
As we can see, I'm generating much of my code here:
(#fields_name_str, Values::Unteger(val)) => {
#name_lower.#fields_name = val.clone();
}
But I didn't find a way to generate an "enum variant for the matching" (I don't know what it is called, I hope you will understand):
Values::String(val)
OR
Values::Unteger(val)
...
I'm writing a function which will create the variant match according to the parameter type found inside the struct:
fn create_variant_match(ty: &str) -> PatTupleStruct {
let variant = match ty {
"u32" => Ident::new("Unteger", Span::call_site()),
...
_ => unimplemented!(),
};
}
Currently I'm creating an Ident, but I want to create the "enum variant match" -> Values::Unteger(val).
I read the docs of the syn crate and spent hours trying to find a way, but it's a bit complex for my current level, so I hope someone can explain how to do that.
I found a simple way of doing that: just parse a string (which I can format beforehand) using the syn parser.
I didn't think of it before because I was trying to construct the Expr by hand (a bit stupid ^^).
syn::parse_str::<Expr>("Values::Unteger(val)")
which returns a Result containing the Expr needed.
I have the mindset of keeping my Strings immutable, as a single source of truth.
As I take the same mindset into Rust, I find I have to do a lot of cloning.
Since the Strings do not change, all the cloning is unnecessary.
Below there is an example of this and link to the relevant playground.
Borrowing does not seem like an option, as I would have to deal with references and their lifetimes. My next thought is to use something like Rc or Cow. But wrapping all the Strings with something like Rc feels unnatural. In my limited experience of Rust, I have never seen ownership/memory management structs such as Rc and Cow exposed in an API. I am curious how a more experienced Rust developer would handle such a problem.
Is it actually natural in Rust to expose ownership/memory management structs like Rc and Cow? Should I be using slices?
use std::collections::HashSet;
#[derive(Debug)]
enum Check {
Known(String),
Duplicate(String),
Missing(String),
Unknown(String)
}
fn main() {
let known_values: HashSet<_> = [
"a".to_string(),
"b".to_string(),
"c".to_string()]
.iter().cloned().collect();
let provided_values = vec![
"a".to_string(),
"b".to_string(),
"z".to_string(),
"b".to_string()
];
let mut found = HashSet::new();
let mut check_values: Vec<_> = provided_values.iter().cloned()
.map(|v| {
if known_values.contains(&v) {
if found.contains(&v) {
Check::Duplicate(v)
} else {
found.insert(v.clone());
Check::Known(v)
}
} else {
Check::Unknown(v)
}
}).collect();
let missing = known_values.difference(&found);
check_values = missing
.cloned()
.fold(check_values, |mut cv, m| {
cv.push(Check::Missing(m));
cv
});
println!("check_values: {:#?}", check_values);
}
From the discussion in the comments of my question, all the cloning of immutable Strings in the example is correct. The cloning is necessary because Rust manages memory through ownership, rather than through shared references as in other languages.
At best, without using Rc, I can see some reduction in the cloning by using move semantics on provided_values.
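As a rough sketch of that move-semantics idea (the function name check and the sorted order of the Missing entries are my own choices for illustration, not part of the original code): consuming provided_values with into_iter() lets each String move into its Check variant, and found can hold &str borrowed from known_values, so the only clones left are for the Missing entries.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
enum Check {
    Known(String),
    Duplicate(String),
    Missing(String),
    Unknown(String),
}

// Hypothetical helper: takes `provided` by value so its Strings can be moved.
fn check(known: &HashSet<String>, provided: Vec<String>) -> Vec<Check> {
    // `found` only borrows from `known`; no owned copies are stored.
    let mut found: HashSet<&str> = HashSet::new();
    let mut checks: Vec<Check> = provided
        .into_iter() // moves each String out of the Vec, no clone
        .map(|v| match known.get(&v) {
            Some(k) => {
                // `insert` returns true only the first time a value is seen.
                if found.insert(k.as_str()) {
                    Check::Known(v)
                } else {
                    Check::Duplicate(v)
                }
            }
            None => Check::Unknown(v),
        })
        .collect();

    // The remaining clones: known values that were never provided.
    let mut missing: Vec<&String> = known
        .iter()
        .filter(|k| !found.contains(k.as_str()))
        .collect();
    missing.sort(); // HashSet order is unspecified; sort for stable output
    checks.extend(missing.into_iter().cloned().map(Check::Missing));
    checks
}

fn main() {
    let known: HashSet<String> = ["a", "b", "c"].iter().map(|s| s.to_string()).collect();
    let provided: Vec<String> = ["a", "b", "z", "b"].iter().map(|s| s.to_string()).collect();
    println!("check_values: {:#?}", check(&known, provided));
}
```

The trade-off is that provided_values is no longer usable afterwards, which is fine here because it is not read again.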
Update: Some interesting reading
https://www.reddit.com/r/rust/comments/5xjl95/rc_or_cloning/
https://medium.com/swlh/ownership-managing-memory-in-rust-ce7bf3f5c9d5
How to create a Rust struct with string members?
Cow would not work in my example, as it involves borrowing a reference. Rc is what I would have to use. In my example everything has to be converted to Rc, but I can see the potential for hiding all of this away through encapsulation.
use std::collections::HashSet;
use std::rc::Rc;
#[derive(Debug)]
enum Check {
Known(Rc<String>),
Duplicate(Rc<String>),
Missing(Rc<String>),
Unknown(Rc<String>)
}
fn main() {
let known_values: HashSet<_> = [
Rc::new("a".to_string()),
Rc::new("b".to_string()),
Rc::new("c".to_string())]
.iter().cloned().collect();
let provided_values = vec![
Rc::new("a".to_string()),
Rc::new("b".to_string()),
Rc::new("z".to_string()),
Rc::new("b".to_string())
];
let mut found = HashSet::new();
let mut check_values: Vec<_> = provided_values.iter().cloned()
.map(|v| {
if known_values.contains(&v) {
if found.contains(&v) {
Check::Duplicate(v)
} else {
found.insert(v.clone());
Check::Known(v)
}
} else {
Check::Unknown(v)
}
}).collect();
let missing = known_values.difference(&found);
check_values = missing
.cloned()
.fold(check_values, |mut cv, m| {
cv.push(Check::Missing(m));
cv
});
println!("check_values: {:#?}", check_values);
}
Playground
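One way the encapsulation mentioned above might look, as a sketch only: a hypothetical Name newtype (not from the original code) wraps Rc<str>, so callers see an immutable string type with a cheap clone() and never touch Rc directly.

```rust
use std::collections::HashSet;
use std::fmt;
use std::rc::Rc;

/// Hypothetical wrapper: an immutable string whose clones share one allocation.
#[derive(Clone, PartialEq, Eq, Hash)]
struct Name(Rc<str>);

impl Name {
    fn new(s: &str) -> Self {
        Name(Rc::from(s)) // copies the text once, into the shared allocation
    }

    fn as_str(&self) -> &str {
        &self.0
    }

    /// Number of `Name` handles currently sharing this string.
    fn handles(&self) -> usize {
        Rc::strong_count(&self.0)
    }
}

impl fmt::Debug for Name {
    // Print like a plain string so the Rc stays an implementation detail.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:?}", self.as_str())
    }
}

fn main() {
    let a = Name::new("a");
    let mut found = HashSet::new();
    found.insert(a.clone()); // bumps a reference count; the text is not copied
    println!("{:?} is shared by {} handles", a, a.handles());
    println!("found contains it: {}", found.contains(&a));
}
```

Rc is single-threaded; if the values had to cross threads, the same wrapper could hold an Arc<str> instead.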
Given the following simplified program:
#[macro_use] extern crate log;
extern crate ansi_term;
extern crate fern;
extern crate time;
extern crate threadpool;
extern crate id3;
mod logging;
use std::process::{exit, };
use ansi_term::Colour::{Yellow, Green};
use threadpool::ThreadPool;
use std::sync::mpsc::channel;
use std::path::{Path};
use id3::Tag;
fn main() {
logging::setup_logging();
let n_jobs = 2;
let files = vec!(
"/tmp/The Dynamics - Version Excursions/01-13- Move on Up.mp3",
"/tmp/The Dynamics - Version Excursions/01-09- Whole Lotta Love.mp3",
"/tmp/The Dynamics - Version Excursions/01-10- Feel Like Making Love.mp3"
);
let pool = ThreadPool::new(n_jobs);
let (tx, rx) = channel();
let mut counter = 0;
for file_ in files {
let file_ = Path::new(file_);
counter = counter + 1;
let tx = tx.clone();
pool.execute(move || {
debug!("sending {} from thread", Yellow.paint(counter.to_string()));
let tag = Tag::read_from_path(file_).unwrap();
let a_name = tag.artist().unwrap();
debug!("recursed file from: {} {}",
Green.paint(a_name), file_.display());
tx.send(".").unwrap();
// TODO amb: not working..
// tx.send(a_name).unwrap();
});
}
for value in rx.iter().take(counter) {
debug!("receiving {} from thread", Green.paint(value));
}
exit(0);
}
Everything works as expected, unless the one commented line (tx.send(a_name).unwrap();) is put back in. In that case I get the following error:
error: `tag` does not live long enough
let a_name = tag.artist().unwrap();
^~~
note: reference must be valid for the static lifetime...
note: ...but borrowed value is only valid for the block suffix following statement 1 at 39:58
let tag = Tag::read_from_path(file_).unwrap();
let a_name = tag.artist().unwrap();
debug!("recursed file from: {} {}",
Green.paint(a_name), file_.display());
...
Generally I understand what the compiler tells me, but I don't see a problem, since the variable tag is defined inside the closure block. The only problem I can guess is that the variable tx is cloned outside and therefore can collide with the lifetime of tag.
My goal is to put all the current logic in the thread-closure inside of the thread, since this is the "processing" I want to spread to multiple threads. How can I accomplish this, but still send some value to the longer existing tx?
I'm using the following Rust version:
$ rustc --version
rustc 1.9.0 (e4e8b6668 2016-05-18)
$ cargo --version
cargo 0.10.0-nightly (10ddd7d 2016-04-08)
a_name is a &str borrowed from tag. Its lifetime is therefore bounded by tag. Sending non-'static references down a channel to another thread is unsafe: the reference points at something on the sending thread's stack, which might not even exist anymore once the receiver tries to access it.
In your case you should promote a_name to an owned value of type String, which will be moved to the receiver thread.
tx.send(a_name.to_owned()).unwrap();
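A minimal, self-contained sketch of the same fix, under some assumptions: FakeTag is a hypothetical stand-in for id3's Tag, collect_artists is an invented helper name, and std::thread::spawn replaces the thread pool so that only the standard library is needed.

```rust
use std::sync::mpsc::channel;
use std::thread;

// Hypothetical stand-in for id3's `Tag`: it owns the data `artist` borrows from.
struct FakeTag {
    artist: String,
}

impl FakeTag {
    fn artist(&self) -> Option<&str> {
        Some(self.artist.as_str())
    }
}

// Spawns one thread per name and gathers the artist names they send back.
fn collect_artists(names: Vec<String>) -> Vec<String> {
    let (tx, rx) = channel();
    let mut handles = Vec::new();
    for name in names {
        let tx = tx.clone();
        handles.push(thread::spawn(move || {
            let tag = FakeTag { artist: name };
            let a_name: &str = tag.artist().unwrap();
            // `tx.send(a_name)` is rejected: `a_name` borrows from `tag`,
            // which is dropped when this closure returns.
            tx.send(a_name.to_owned()).unwrap(); // the owned String is moved
        }));
    }
    drop(tx); // drop the original sender so `rx.iter()` can terminate
    for handle in handles {
        handle.join().unwrap();
    }
    let mut artists: Vec<String> = rx.iter().collect();
    artists.sort(); // arrival order depends on thread scheduling
    artists
}

fn main() {
    let artists = collect_artists(vec!["The Dynamics".to_string(), "Abba".to_string()]);
    println!("{:?}", artists);
}
```

The copy made by to_owned() is what lets the value outlive tag; the receiver then owns its String independently of the worker thread.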
I notice that Rust's test has a benchmark mode that will measure execution time in ns/iter, but I could not find a way to measure memory usage.
How would I implement such a benchmark? Let us assume for the moment that I only care about heap memory (though stack usage would certainly also be interesting).
Edit: I found this issue which asks for the exact same thing.
You can use the jemalloc allocator to print the allocation statistics. For example,
Cargo.toml:
[package]
name = "stackoverflow-30869007"
version = "0.1.0"
edition = "2018"
[dependencies]
jemallocator = "0.5"
jemalloc-sys = {version = "0.5", features = ["stats"]}
libc = "0.2"
src/main.rs:
use libc::{c_char, c_void};
use std::ptr::{null, null_mut};
#[global_allocator]
static ALLOC: jemallocator::Jemalloc = jemallocator::Jemalloc;
extern "C" fn write_cb(_: *mut c_void, message: *const c_char) {
print!("{}", String::from_utf8_lossy(unsafe {
// c_char is not i8 on every target, so pass the pointer through uncast
std::ffi::CStr::from_ptr(message).to_bytes()
}));
}
fn mem_print() {
unsafe { jemalloc_sys::malloc_stats_print(Some(write_cb), null_mut(), null()) }
}
fn main() {
mem_print();
let _heap = Vec::<u8>::with_capacity(1024 * 128);
mem_print();
}
In a single-threaded program that should allow you to get a good measurement of how much memory a structure takes. Just print the statistics before the structure is created and after and calculate the difference.
(The "total:" of "allocated" in particular.)
You can also use Valgrind (Massif) to get the heap profile. It works just like with any other C program. Make sure you have debug symbols enabled in the executable (e.g. using debug build or custom Cargo configuration). You can use, say, http://massiftool.sourceforge.net/ to analyse the generated heap profile.
(I verified this to work on Debian Jessie, in a different setting your mileage may vary).
(In order to use Rust with Valgrind you'll probably have to switch back to the system allocator).
P.S. There is now also a better DHAT.
jemalloc can be told to dump a memory profile. You can probably do this with the Rust FFI but I haven't investigated this route.
As far as measuring data structure sizes is concerned, this can be done fairly easily through the use of traits and a small compiler plugin. Nicholas Nethercote in his article Measuring data structure sizes: Firefox (C++) vs. Servo (Rust) demonstrates how it works in Servo; it boils down to adding #[derive(HeapSizeOf)] (or occasionally a manual implementation) to each type you care about. This is a good way of allowing precise checking of where memory is going, too; it is, however, comparatively intrusive as it requires changes to be made in the first place, where something like jemalloc’s print_stats() doesn’t. Still, for good and precise measurements, it’s a sound approach.
Currently, the only way to get allocation information is the alloc::heap::stats_print(); method (behind #![feature(alloc)]), which calls jemalloc's print_stats().
I'll update this answer with further information once I have learned what the output means.
(Note that I'm not going to accept this answer, so if someone comes up with a better solution...)
There is now the jemalloc-ctl crate, which provides a convenient, safe, typed API. Add it to your Cargo.toml:
[dependencies]
jemalloc-ctl = "0.3"
jemallocator = "0.3"
Then configure jemalloc to be the global allocator and use the methods from the jemalloc_ctl::stats module:
jemalloc_ctl::stats::allocated
jemalloc_ctl::stats::resident
Here is the official example:
use std::thread;
use std::time::Duration;
use jemalloc_ctl::{stats, epoch};
#[global_allocator]
static ALLOC: jemallocator::Jemalloc = jemallocator::Jemalloc;
fn main() {
loop {
// many statistics are cached and only updated when the epoch is advanced.
epoch::advance().unwrap();
let allocated = stats::allocated::read().unwrap();
let resident = stats::resident::read().unwrap();
println!("{} bytes allocated/{} bytes resident", allocated, resident);
thread::sleep(Duration::from_secs(10));
}
}
There's a neat little solution someone put together here: https://github.com/discordance/trallocator/blob/master/src/lib.rs
use std::alloc::{GlobalAlloc, Layout};
use std::sync::atomic::{AtomicU64, Ordering};
pub struct Trallocator<A: GlobalAlloc>(pub A, AtomicU64);
unsafe impl<A: GlobalAlloc> GlobalAlloc for Trallocator<A> {
unsafe fn alloc(&self, l: Layout) -> *mut u8 {
self.1.fetch_add(l.size() as u64, Ordering::SeqCst);
self.0.alloc(l)
}
unsafe fn dealloc(&self, ptr: *mut u8, l: Layout) {
self.0.dealloc(ptr, l);
self.1.fetch_sub(l.size() as u64, Ordering::SeqCst);
}
}
impl<A: GlobalAlloc> Trallocator<A> {
pub const fn new(a: A) -> Self {
Trallocator(a, AtomicU64::new(0))
}
pub fn reset(&self) {
self.1.store(0, Ordering::SeqCst);
}
pub fn get(&self) -> u64 {
self.1.load(Ordering::SeqCst)
}
}
Usage: (from: https://www.reddit.com/r/rust/comments/8z83wc/comment/e2h4dp9)
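// needed for Trallocator struct (as written, anyway) on older nightly toolchains;
// AtomicU64 (Rust 1.34) and trait bounds on const fn (Rust 1.61) are stable now,
// so the feature line below can be dropped on a current compiler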
#![feature(integer_atomics, const_fn_trait_bound)]
use std::alloc::System;
#[global_allocator]
static GLOBAL: Trallocator<System> = Trallocator::new(System);
fn main() {
GLOBAL.reset();
println!("memory used: {} bytes", GLOBAL.get());
{
let mut vec = vec![1, 2, 3, 4];
for i in 5..20 {
vec.push(i);
println!("memory used: {} bytes", GLOBAL.get());
}
for v in vec {
println!("{}", v);
}
}
// For some reason this does not print zero =/
println!("memory used: {} bytes", GLOBAL.get());
}
I've just started using it, and it seems to work well! Straight-forward, realtime, requires no external packages, and doesn't require changing your base memory allocator.
It's also nice that, because it's intercepting the allocate/deallocate calls, you should be able to add custom logic if desired (eg. if memory usage goes above X, print the stack-trace to see what's triggering the allocations) -- although I haven't tried this yet.
I also haven't yet tested to see how much overhead this approach adds. If someone does a test for this, let me know!
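As a sketch of the "custom logic" idea on stable Rust, here is a variant that wraps System directly and also records a peak. The names Counting, LIVE, PEAK, live_bytes, peak_bytes and measure are all invented for this example. Measuring a before/after difference also sidesteps the non-zero final reading in the usage above, which is probably caused by allocations that outlive the block (for instance the buffer that println! sets up for stdout after reset() was called).

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical names throughout; only the GlobalAlloc trait itself is std API.
struct Counting;

static LIVE: AtomicUsize = AtomicUsize::new(0); // bytes currently allocated
static PEAK: AtomicUsize = AtomicUsize::new(0); // highest value LIVE has reached

unsafe impl GlobalAlloc for Counting {
    unsafe fn alloc(&self, l: Layout) -> *mut u8 {
        let p = unsafe { System.alloc(l) };
        if !p.is_null() {
            let now = LIVE.fetch_add(l.size(), Ordering::SeqCst) + l.size();
            PEAK.fetch_max(now, Ordering::SeqCst);
        }
        p
    }

    unsafe fn dealloc(&self, p: *mut u8, l: Layout) {
        unsafe { System.dealloc(p, l) };
        LIVE.fetch_sub(l.size(), Ordering::SeqCst);
    }
}

#[global_allocator]
static GLOBAL: Counting = Counting;

fn live_bytes() -> usize {
    LIVE.load(Ordering::SeqCst)
}

fn peak_bytes() -> usize {
    PEAK.load(Ordering::SeqCst)
}

// Runs `f` and reports how many more bytes are live afterwards.
fn measure<T>(f: impl FnOnce() -> T) -> (T, usize) {
    let before = live_bytes();
    // black_box keeps the optimizer from removing an otherwise unused allocation.
    let value = std::hint::black_box(f());
    let after = live_bytes();
    (value, after.saturating_sub(before))
}

fn main() {
    let (heap, bytes) = measure(|| Vec::<u8>::with_capacity(1024 * 128));
    println!("Vec holds {} bytes (capacity {})", bytes, heap.capacity());
    drop(heap);
    println!("peak so far: {} bytes", peak_bytes());
}
```

The counters are process-wide, so in a multi-threaded program the difference also includes whatever other threads allocate or free in the meantime.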