Unable to read from NEAR's rocksDB - nearprotocol

I'm trying to iterate through NEAR's RocksDB,
I've downloaded the small backup from s3 and using the code below to iterate through col33 (transactions)
But it doesn't print anything as RocksDB would be empty (but it is not obviously),
could you please point me out what I'm doing wrong?
Thanks
use std::env;
use rocksdb::{ColumnFamilyDescriptor, DB, IteratorMode, Options};
fn col_name(col: i32) -> String {
format!("col{}", col)
}
fn main() {
println!("Hello, RocksDB!");
let args: Vec<String> = env::args().collect();
let path = if args.len() > 1 {
args.get(1).unwrap().clone()
} else {
String::from("./data")
};
println!("data dir={}", &path);
let opts = Options::default();
let mut cfs:Vec<ColumnFamilyDescriptor> = Vec::new();
for col in 33..34 {
cfs.push(
rocksdb::ColumnFamilyDescriptor::new(col_name(col),opts.clone()));
}
let db = DB::open_cf_descriptors_read_only(
&opts,&path, cfs, false,
).unwrap();
let iter = db.iterator(IteratorMode::Start);
for (key, value) in iter {
println!("Saw {:?} {:?}", key, value);
let k = String::from_utf8(key.to_vec()).unwrap();
let v = String::from_utf8(value.to_vec()).unwrap();
println!("Saw {:?} {:?}", k, v);
}
let _ = DB::destroy(&Options::default(), &path);
}

I've found what was wrong,
as Asad Awadia mentioned, I'm using iterator over default column family here.
I've used iterator_cf instead and got some data:
let cf_handle = db.cf_handle("col33").unwrap();
let iter = db.iterator_cf(cf_handle, IteratorMode::Start);

Related

How to Filter out a Vec of Strings From a Vec of Structs Rust

I have a working solution to filtering out an input vec of strings compared to a vector of a struct. However, my code seems complicated and I tried simplify the code using a iter::filter(https://doc.rust-lang.org/stable/std/iter/struct.Filter.html). This caused issues because the iterator gave back values that were references and could not be directly used. It seems like my understanding of the iter and what can be done in a structs vector needs refreshing. Below is the simplified filtering code that works:
#[derive(Debug)]
pub struct Widget {
name: String,
pin: u16,
}
impl Widget{
pub fn new(widget_name: String, widget_pin: String) -> Widget {
let widget_pin_u16 = widget_pin.parse::<u16>().expect("Unable to parse");
let nw = Widget {
name: widget_name,
pin: widget_pin_u16
};
return nw
}
}
pub struct WidgetHolder {
widgets: Vec<Widget>,
widget_holder_name: String
}
impl WidgetHolder {
fn add_widgets(&mut self, valid_widgets_found: Vec<String>) {
let mut widgets_to_add: Vec<String> = Vec::new();
for widget in valid_widgets_found {
// The string musy be compared to pin field, so we're converting
let widget_offset = widget
.clone()
.parse::<u16>()
.expect("Unable to parse widget base into int.");
// If it doesnt exist in our widgetHolder widgets vector, then lets add it.
let mut widget_exists = false;
for existing_widget in &self.widgets {
if widget_offset == existing_widget.pin {
widget_exists = true;
break;
}
}
if !widget_exists {
widgets_to_add.push(widget.clone());
}
}
if widgets_to_add.is_empty() {
return;
}
for widget in widgets_to_add {
let loaded_widget = Widget::new(self.widget_holder_name.clone(), widget);
self.widgets.push(loaded_widget);
}
}
}
pub fn main() {
let init_vec = Vec::new();
let mut wh = WidgetHolder {
widgets: init_vec,
widget_holder_name: "MyWidget".to_string()
};
let vec1 = vec!["1".to_string(), "2".to_string(), "3".to_string()];
wh.add_widgets(vec1);
println!("{:?}", wh.widgets);
let vec2 = vec!["2".to_string(), "3".to_string(), "4".to_string()];
wh.add_widgets(vec2);
println!("{:?}", wh.widgets);
}
Is there a way I can clean up this code without having to use so many data structures and loops? The filter api looks clean but does it work with a vector inside of a struct that I am trying to mutate(append to it)?
EDIT
After trying to get a stack trace, I actually got the filter to work...
fn add_widgets(&mut self, valid_widgets_found: Vec<String>) {
let widgets_to_add: Vec<String> = valid_widgets_found.into_iter()
.filter(|widget_pin| {
let widget_offset = widget_pin.clone().parse::<u16>().expect("Unable to parse widget base into int.");
let mut widget_exists = false;
for existing_widget in &self.widgets {
if widget_offset == existing_widget.pin {
widget_exists = true;
break;
}
}
!widget_exists
})
.collect();
if widgets_to_add.is_empty() {
return;
}
for widget in widgets_to_add {
let loaded_widget = Widget::new(self.widget_holder_name.clone(), widget);
self.widgets.push(loaded_widget);
}
}
I figured out the answer. Seemed like a syntax error when I initially tried it. For anyone who's looking for a filter example in the future:
fn add_widgets(&mut self, valid_widgets_found: Vec<String>) {
let widgets_to_add: Vec<String> = valid_widgets_found.into_iter()
.filter(|widget_pin| {
let widget_offset = widget_pin.clone().parse::<u16>().expect("Unable to parse widget base into int.");
let mut widget_exists = false;
for existing_widget in &self.widgets {
if widget_offset == existing_widget.pin {
widget_exists = true;
break;
}
}
!widget_exists
})
.collect();
if widgets_to_add.is_empty() {
return;
}
for widget in widgets_to_add {
let loaded_widget = Widget::new(self.widget_holder_name.clone(), widget);
self.widgets.push(loaded_widget);
}
}

Adding writer function to tunnel program

How do I create a writer function for the tunnel program below? The code below is a sample program to create windows tunnel interface. I want to write a function that writes (or sends) packets to another server IP address. Github link for full code and its dependencies given below.
https://github.com/nulldotblack/wintun/blob/main/examples/basic.rs
use log::*;
use std::sync::{
atomic::{AtomicBool, Ordering},
Arc,
};
static RUNNING: AtomicBool = AtomicBool::new(true);
fn main() {
env_logger::init();
let wintun = unsafe { wintun::load_from_path("examples/wintun/bin/amd64/wintun.dll") }
.expect("Failed to load wintun dll");
let version = wintun::get_running_driver_version(&wintun);
info!("Using wintun version: {:?}", version);
let adapter = match wintun::Adapter::open(&wintun, "Demo") {
Ok(a) => a,
Err(_) => wintun::Adapter::create(&wintun, "Example", "Demo", None)
.expect("Failed to create wintun adapter!"),
};
let version = wintun::get_running_driver_version(&wintun).unwrap();
info!("Using wintun version: {:?}", version);
let session = Arc::new(adapter.start_session(wintun::MAX_RING_CAPACITY).unwrap());
let reader_session = session.clone();
let reader = std::thread::spawn(move || {
while RUNNING.load(Ordering::Relaxed) {
match reader_session.receive_blocking() {
Ok(packet) => {
let bytes = packet.bytes();
println!(
"Read packet size {} bytes. Header data: {:?}",
bytes.len(),
&bytes[0..(20.min(bytes.len()))]
);
}
Err(_) => println!("Got error while reading packet"),
}
}
});
println!("Press enter to stop session");
let mut line = String::new();
let _ = std::io::stdin().read_line(&mut line);
println!("Shutting down session");
RUNNING.store(false, Ordering::Relaxed);
session.shutdown();
let _ = reader.join();
println!("Shutdown complete");
}

How can I speed up the process of parsing a 40Gb txt file and inserting a row, per line, into an sqlite db

Background
I'm currently trying to parse a .txt com zone file. It is structured like so
blahblah.com xx xx examplens.com
Currently my code is structured like so
extern crate regex;
use regex::Regex;
use rusqlite::{params, Connection, Result};
#[derive(Debug)]
struct Domain {
id: i32,
domain: String,
}
use std::io::stdin;
fn main() -> std::io::Result<()> {
let mut reader = my_reader::BufReader::open("data/com_practice.txt")?;
let mut buffer = String::new();
let conn: Connection = Connection::open("/Users/alex/Projects/domain_randomizer/data/domains.db").unwrap();
while let Some(line) = reader.read_line(&mut buffer) {
let regexed_i_think = rexeginton_the_domain_only(line?.trim());
println!("{:?}", regexed_i_think);
sqliting(regexed_i_think, &conn).unwrap();
}
let mut stmt = conn.prepare("SELECT id, domain FROM domains").unwrap();
let domain_iter = stmt.query_map(params![], |row| {
Ok(Domain {
id: row.get(0)?,
domain: row.get(1)?,
})
}).unwrap();
for domain in domain_iter {
println!("Found person {:?}", domain.unwrap());
}
Ok(())
}
pub fn sqliting(domain_name: &str, conn: &Connection) -> Result<()> {
let yeah = Domain {
id: 0,
domain: domain_name.to_string()
};
conn.execute(
"INSERT INTO domains (domain) VALUES (?1)",
params![yeah.domain],
)?;
Ok(())
}
mod my_reader {
use std::{
fs::File,
io::{self, prelude::*},
};
pub struct BufReader {
reader: io::BufReader<File>,
}
impl BufReader {
pub fn open(path: impl AsRef<std::path::Path>) -> io::Result<Self> {
let file = File::open(path)?;
let reader = io::BufReader::new(file);
Ok(Self { reader })
}
pub fn read_line<'buf>(
&mut self,
buffer: &'buf mut String,
) -> Option<io::Result<&'buf mut String>> {
buffer.clear();
self.reader
.read_line(buffer)
.map(|u| if u == 0 { None } else { Some(buffer) })
.transpose()
}
}
}
pub fn rexeginton_the_domain_only(full_line: &str) -> &str {
let regex_str = Regex::new(r"(?m).*?.com").unwrap();
let final_str = regex_str.captures(full_line).unwrap().get(0).unwrap().as_str();
return final_str;
}
Issue
So I am Parsing a single domain at a time, each time making an Insert. As I've gathered, Inserting would be far more efficient if I was making thousands of Inserts in a single transaction. However, I'm not quite sure what an efficient approach is to refactoring my parsing around this.
Question
How should I reorient my parsing process around my Insertion process? Also how can I actively gauge the speed and efficiency of the process in the first place so I can compare and contrast in an articulate manner?

Iterative filter function that modifies tree like structure

I have a structure like
struct Node {
pub id: String,
pub dis: String,
pub parent: Option<NodeRefNodeRefWeak>,
pub children: Vec<NodeRef>,
}
pub type NodeRef = Rc<RefCell<Node>>;
pub type NodeRefNodeRefWeak = Weak<RefCell<Node>>;
I also have a start of a function that can iterate this structure to
pull out a match on a node id but it has issues.
What I would like is for this function to return the parent node of the whole tree with ONLY the branches that have a match somewhere on the branch.
Children past the search node can be removed.
Ie a function that filters all other nodes out of the tree.
For example with my rust playground link I would like it to return
level0_node_#1 (level0_node_#1)
level1_node_4 (level1_node_4)
level2_node_4_3 (level2_node_4_3)
level3_node_4_3_2 (level3_node_4_3_2)
However, using the recursive approach as below causes real issue with already borrowed errors when trying to remove branches etc.
Is there a way to achieve this filter function?
I have a test in the Rust playground.
fn tree_filter_node_objects<F>(node: &NodeRef, f: F) -> Vec<NodeRef>
where F: Fn(&str) -> bool + Copy {
let mut filtered_nodes: Vec<NodeRef> = vec![];
let mut borrow = node.borrow_mut();
if f(&borrow.id) {
filtered_nodes.push(node.clone());
}
for n in borrow.children.iter() {
let children_filtered = tree_filter_node_objects(n, f);
for c in children_filtered.iter() {
filtered_nodes.push(c.clone());
}
}
filtered_nodes
}
In the end I used this iterative approach.
pub fn tree_filter_node_dfs<F>(root: &NodeRef, f: F) -> Vec<NodeRef>
where F: Fn(&BmosHaystackObject) -> bool + Copy {
let mut filtered_nodes: Vec<NodeRef> = vec![];
let mut cur_node: Option<NodeRef> = Some(root.clone());
let mut last_visited_child: Option<NodeRef> = None;
let mut next_child: Option<NodeRef>;
let mut run_visit: bool = true;
while cur_node.is_some() {
if run_visit {
let n = cur_node.as_ref().unwrap();
if f(&n.borrow().object) {
}
}
if last_visited_child.is_none() {
let children = cur_node.as_ref().unwrap().borrow().children.clone();
if children.len() > 0 {
next_child = Some(children[0].clone());
}
else {
next_child = None;
}
}
else {
next_child = tree_filter_node_get_next_sibling(last_visited_child.as_ref().unwrap());
}
if next_child.is_some() {
last_visited_child = None;
cur_node = next_child;
run_visit = true;
}
else {
last_visited_child = cur_node;
cur_node = tree_node_parent_node(&last_visited_child.clone().unwrap().clone());
run_visit = false;
}
}
filtered_nodes
}

Select from a list of sockets using futures

I am trying out the yet-unstable async-await syntax in nightly Rust 1.38 with futures-preview = "0.3.0-alpha.16" and runtime = "0.3.0-alpha.6". It feels really cool, but the docs are (yet) scarce and I got stuck.
To go a bit beyond the basic examples I would like to create an app that:
Accepts TCP connections on a given port;
Broadcasts all the data received from any connection to all active connections.
Existing docs and examples got me this far:
#![feature(async_await)]
#![feature(async_closure)]
use futures::{
prelude::*,
select,
future::select_all,
io::{ReadHalf, WriteHalf, Read},
};
use runtime::net::{TcpListener, TcpStream};
use std::io;
async fn read_stream(mut reader: ReadHalf<TcpStream>) -> (ReadHalf<TcpStream>, io::Result<Box<[u8]>>) {
let mut buffer: Vec<u8> = vec![0; 1024];
match reader.read(&mut buffer).await {
Ok(len) => {
buffer.truncate(len);
(reader, Ok(buffer.into_boxed_slice()))
},
Err(err) => (reader, Err(err)),
}
}
#[runtime::main]
async fn main() -> std::io::Result<()> {
let mut listener = TcpListener::bind("127.0.0.1:8080")?;
println!("Listening on {}", listener.local_addr()?);
let mut incoming = listener.incoming().fuse();
let mut writers: Vec<WriteHalf<TcpStream>> = vec![];
let mut reads = vec![];
loop {
select! {
maybe_stream = incoming.select_next_some() => {
let (mut reader, writer) = maybe_stream?.split();
writers.push(writer);
reads.push(read_stream(reader).fuse());
},
maybe_read = select_all(reads.iter()) => {
match maybe_read {
(reader, Ok(data)) => {
for writer in writers {
writer.write_all(data).await.ok(); // Ignore errors here
}
reads.push(read_stream(reader).fuse());
},
(reader, Err(err)) => {
let reader_addr = reader.peer_addr().unwrap();
writers.retain(|writer| writer.peer_addr().unwrap() != reader_addr);
},
}
}
}
}
}
This fails with:
error: recursion limit reached while expanding the macro `$crate::dispatch`
--> src/main.rs:36:9
|
36 | / select! {
37 | | maybe_stream = incoming.select_next_some() => {
38 | | let (mut reader, writer) = maybe_stream?.split();
39 | | writers.push(writer);
... |
55 | | }
56 | | }
| |_________^
|
= help: consider adding a `#![recursion_limit="128"]` attribute to your crate
= note: this error originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)
This is very confusing. Maybe I am using select_all() in a wrong way? Any help in making it work is appreciated!
For completeness, my Cargo.toml:
[package]
name = "async-test"
version = "0.1.0"
authors = ["xxx"]
edition = "2018"
[dependencies]
runtime = "0.3.0-alpha.6"
futures-preview = { version = "=0.3.0-alpha.16", features = ["async-await", "nightly"] }
In case someone is following, I hacked it together finally. This code works:
#![feature(async_await)]
#![feature(async_closure)]
#![recursion_limit="128"]
use futures::{
prelude::*,
select,
stream,
io::ReadHalf,
channel::{
oneshot,
mpsc::{unbounded, UnboundedSender},
}
};
use runtime::net::{TcpListener, TcpStream};
use std::{
io,
net::SocketAddr,
collections::HashMap,
};
async fn read_stream(
addr: SocketAddr,
drop: oneshot::Receiver<()>,
mut reader: ReadHalf<TcpStream>,
sender: UnboundedSender<(SocketAddr, io::Result<Box<[u8]>>)>
) {
let mut drop = drop.fuse();
loop {
let mut buffer: Vec<u8> = vec![0; 1024];
select! {
result = reader.read(&mut buffer).fuse() => {
match result {
Ok(len) => {
buffer.truncate(len);
sender.unbounded_send((addr, Ok(buffer.into_boxed_slice())))
.expect("Channel error");
if len == 0 {
return;
}
},
Err(err) => {
sender.unbounded_send((addr, Err(err))).expect("Channel error");
return;
}
}
},
_ = drop => {
return;
},
}
}
}
enum Event {
Connection(io::Result<TcpStream>),
Message(SocketAddr, io::Result<Box<[u8]>>),
}
#[runtime::main]
async fn main() -> std::io::Result<()> {
let mut listener = TcpListener::bind("127.0.0.1:8080")?;
eprintln!("Listening on {}", listener.local_addr()?);
let mut writers = HashMap::new();
let (sender, receiver) = unbounded();
let connections = listener.incoming().map(|maybe_stream| Event::Connection(maybe_stream));
let messages = receiver.map(|(addr, maybe_message)| Event::Message(addr, maybe_message));
let mut events = stream::select(connections, messages);
loop {
match events.next().await {
Some(Event::Connection(Ok(stream))) => {
let addr = stream.peer_addr().unwrap();
eprintln!("New connection from {}", addr);
let (reader, writer) = stream.split();
let (drop_sender, drop_receiver) = oneshot::channel();
writers.insert(addr, (writer, drop_sender));
runtime::spawn(read_stream(addr, drop_receiver, reader, sender.clone()));
},
Some(Event::Message(addr, Ok(message))) => {
if message.len() == 0 {
eprintln!("Connection closed by client: {}", addr);
writers.remove(&addr);
continue;
}
eprintln!("Received {} bytes from {}", message.len(), addr);
if &*message == b"quit\n" {
eprintln!("Dropping client {}", addr);
writers.remove(&addr);
continue;
}
for (&other_addr, (writer, _)) in &mut writers {
if addr != other_addr {
writer.write_all(&message).await.ok(); // Ignore errors
}
}
},
Some(Event::Message(addr, Err(err))) => {
eprintln!("Error reading from {}: {}", addr, err);
writers.remove(&addr);
},
_ => panic!("Event error"),
}
}
}
I use a channel and spawn a reading task for each client. Special care had to be taken to ensure that readers get dropped with writers: this is why oneshot future is used. When oneshot::Sender is dropped, the oneshot::Receiver future resolves to canceled state, which is a notification mechanism for a reading task to know it is time to halt. To demonstrate that it works, we drop a client as soon as we get "quit" message.
Sadly, there is a (seemingly useless) warning regarding an unused JoinHandle from the runtime::spawn call, and I don't really know how to eliminate it.

Resources