I see a huge discrepancy between the time it takes to process (read and parse) a CSV file inside a unit test and the time exactly the same code takes to read and parse the same file from the same location in the app, whether in the simulator or on a device.
The difference is around 8x. Everything runs on the main thread.
Edit: The unit test is the much faster one. I log the time around the method call in both places.
More findings:
What I actually found, and perhaps this is important, is that the function being called creates a number of threads and then waits for all of them to complete. It splits an array of 180,000 rows into chunks and processes each chunk asynchronously (using DispatchQueue.global().async and DispatchGroup).
In the app, performance degrades depending on the number of threads, while in a unit test it stays fairly similar.
func processCSV(fileName: String, columnToGet:[String]?, block: (([String:String]) ->())? = nil) throws -> ([[String:String]]) {
    var userReport = [[String:String]?]()
    var wmsEntries = 0
    do {
        let data = try String(contentsOfFile: fileName, encoding: .utf8)
        var myStrings = data.components(separatedBy: .newlines)
        let headerRow = myStrings.first?.components(separatedBy: ",")
        let headerCount = headerRow?.count ?? 0
        var headerColumnMap = [String:Int]()
        try columnToGet?.forEach({ column in
            guard let index = headerRow?.firstIndex(where: {$0.compare(column, options: .caseInsensitive) == .orderedSame}) else {
                throw NSError(domain: "Unexpected or invalid header in csv file.", code: NSFileReadCorruptFileError, userInfo:nil )
            }
            headerColumnMap[column] = index
        })
        myStrings = Array(myStrings.dropFirst()).filter({!$0.isEmpty})
        wmsEntries = myStrings.count
        userReport = [[String:String]?](repeating: nil, count: wmsEntries)
        let dispatchGroup = DispatchGroup()
        func insert(_ record:[Substring], at:Int) {
            var entry = [String:String]()
            headerColumnMap.forEach({ key, value in
                entry[key] = record[value].trimmingCharacters(in: .whitespacesAndNewlines)
            })
            DispatchQueue.global().async {
                block?(entry)
            }
            userReport[at] = entry
        }
        let chunkSize = max(1000, myStrings.count / 9)
        for (chunkIndex, chunk) in myStrings.chunked(into: chunkSize).enumerated() {
            dispatchGroup.enter()
            DispatchQueue.global().async {
                for (counter, str) in chunk.enumerated() {
                    let data = self.parse(line: str)
                    let insertIndex = chunkIndex * chunkSize + counter
                    guard data.count == headerCount, data.count > columnToGet?.count ?? 0 else {
                        DDLogError("Error in file, mismatched number of values, on line \(myStrings[chunkIndex * chunkSize + counter])")
                        continue
                    }
                    insert(data, at: insertIndex)
                }
                dispatchGroup.leave()
            }
        }
        dispatchGroup.wait()
    } catch {
        print(error)
    }
    let filtered = userReport.filter({$0 != nil}) as! [[String:String]]
    self.numberLinesWithError = wmsEntries - filtered.count
    return filtered
}
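For comparison, here is a minimal Rust sketch of the same chunk-and-wait pattern. It is not a port of processCSV: `parse_line` and `parse_in_chunks` are hypothetical names, and the comma split only stands in for the real parser. The point it illustrates is that each worker writes into its own disjoint slice of the result vector, whereas the Swift code above writes into the shared userReport array from several queues at once, which may be worth ruling out when comparing timings.

```rust
use std::thread;

/// Hypothetical stand-in for the per-row parser: splits on commas and trims each field.
fn parse_line(line: &str) -> Vec<String> {
    line.split(',').map(|field| field.trim().to_string()).collect()
}

/// Parses `rows` in parallel, one scoped thread per chunk.
/// Each thread receives its own mutable slice of `results`, so no locking is needed.
fn parse_in_chunks(rows: &[String], chunk_size: usize) -> Vec<Option<Vec<String>>> {
    let chunk_size = chunk_size.max(1); // chunks() panics on a size of zero
    let mut results: Vec<Option<Vec<String>>> = vec![None; rows.len()];
    thread::scope(|scope| {
        // Pair every input chunk with the output slice covering the same indices.
        for (input, output) in rows.chunks(chunk_size).zip(results.chunks_mut(chunk_size)) {
            scope.spawn(move || {
                for (row, slot) in input.iter().zip(output.iter_mut()) {
                    *slot = Some(parse_line(row));
                }
            });
        }
    }); // scope() returns only after every spawned thread has finished
    results
}
```

Calling `parse_in_chunks(&rows, 1000)` returns one entry per input row, in the original order, after all chunks are done.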
How do I create a writer function for the tunnel program below? The code below is a sample program that creates a Windows tunnel interface. I want to write a function that writes (or sends) packets to another server's IP address. The GitHub link for the full code and its dependencies is given below.
https://github.com/nulldotblack/wintun/blob/main/examples/basic.rs
use log::*;
use std::sync::{
    atomic::{AtomicBool, Ordering},
    Arc,
};
static RUNNING: AtomicBool = AtomicBool::new(true);
fn main() {
    env_logger::init();
    let wintun = unsafe { wintun::load_from_path("examples/wintun/bin/amd64/wintun.dll") }
        .expect("Failed to load wintun dll");
    let version = wintun::get_running_driver_version(&wintun);
    info!("Using wintun version: {:?}", version);
    let adapter = match wintun::Adapter::open(&wintun, "Demo") {
        Ok(a) => a,
        Err(_) => wintun::Adapter::create(&wintun, "Example", "Demo", None)
            .expect("Failed to create wintun adapter!"),
    };
    let version = wintun::get_running_driver_version(&wintun).unwrap();
    info!("Using wintun version: {:?}", version);
    let session = Arc::new(adapter.start_session(wintun::MAX_RING_CAPACITY).unwrap());
    let reader_session = session.clone();
    let reader = std::thread::spawn(move || {
        while RUNNING.load(Ordering::Relaxed) {
            match reader_session.receive_blocking() {
                Ok(packet) => {
                    let bytes = packet.bytes();
                    println!(
                        "Read packet size {} bytes. Header data: {:?}",
                        bytes.len(),
                        &bytes[0..(20.min(bytes.len()))]
                    );
                }
                Err(_) => println!("Got error while reading packet"),
            }
        }
    });
    println!("Press enter to stop session");
    let mut line = String::new();
    let _ = std::io::stdin().read_line(&mut line);
    println!("Shutting down session");
    RUNNING.store(false, Ordering::Relaxed);
    session.shutdown();
    let _ = reader.join();
    println!("Shutdown complete");
}
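A minimal sketch of one possible writer, assuming that "sending packets to another server" means forwarding the bytes of each captured packet to that server as a UDP datagram. `PacketWriter`, `connect`, and `write` are hypothetical names, only the standard library is used, and nothing here comes from the wintun crate itself.

```rust
use std::io;
use std::net::{Ipv4Addr, Ipv6Addr, SocketAddr, UdpSocket};

/// Hypothetical helper: forwards raw packet bytes to one fixed server over UDP.
struct PacketWriter {
    socket: UdpSocket,
    server: SocketAddr,
}

impl PacketWriter {
    /// Binds an ephemeral local port of the same address family as `server`.
    fn connect(server: SocketAddr) -> io::Result<Self> {
        let local: SocketAddr = if server.is_ipv4() {
            (Ipv4Addr::UNSPECIFIED, 0).into()
        } else {
            (Ipv6Addr::UNSPECIFIED, 0).into()
        };
        let socket = UdpSocket::bind(local)?;
        Ok(Self { socket, server })
    }

    /// Sends one packet as a single datagram and returns the number of bytes sent.
    fn write(&self, packet: &[u8]) -> io::Result<usize> {
        self.socket.send_to(packet, self.server)
    }
}
```

Inside the reader loop above, something like `writer.write(packet.bytes())` would forward each packet as it is read. Injecting packets back into the tunnel is a different operation; I believe the wintun crate's Session offers allocate_send_packet and send_packet for that, but please check the crate docs before relying on it.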
I am trying out the still-unstable async-await syntax in nightly Rust 1.38 with futures-preview = "0.3.0-alpha.16" and runtime = "0.3.0-alpha.6". It feels really cool, but the docs are still scarce and I got stuck.
To go a bit beyond the basic examples I would like to create an app that:
Accepts TCP connections on a given port;
Broadcasts all the data received from any connection to all active connections.
Existing docs and examples got me this far:
#![feature(async_await)]
#![feature(async_closure)]
use futures::{
    prelude::*,
    select,
    future::select_all,
    io::{ReadHalf, WriteHalf, Read},
};
use runtime::net::{TcpListener, TcpStream};
use std::io;
async fn read_stream(mut reader: ReadHalf<TcpStream>) -> (ReadHalf<TcpStream>, io::Result<Box<[u8]>>) {
    let mut buffer: Vec<u8> = vec![0; 1024];
    match reader.read(&mut buffer).await {
        Ok(len) => {
            buffer.truncate(len);
            (reader, Ok(buffer.into_boxed_slice()))
        },
        Err(err) => (reader, Err(err)),
    }
}
#[runtime::main]
async fn main() -> std::io::Result<()> {
    let mut listener = TcpListener::bind("127.0.0.1:8080")?;
    println!("Listening on {}", listener.local_addr()?);
    let mut incoming = listener.incoming().fuse();
    let mut writers: Vec<WriteHalf<TcpStream>> = vec![];
    let mut reads = vec![];
    loop {
        select! {
            maybe_stream = incoming.select_next_some() => {
                let (mut reader, writer) = maybe_stream?.split();
                writers.push(writer);
                reads.push(read_stream(reader).fuse());
            },
            maybe_read = select_all(reads.iter()) => {
                match maybe_read {
                    (reader, Ok(data)) => {
                        for writer in writers {
                            writer.write_all(data).await.ok(); // Ignore errors here
                        }
                        reads.push(read_stream(reader).fuse());
                    },
                    (reader, Err(err)) => {
                        let reader_addr = reader.peer_addr().unwrap();
                        writers.retain(|writer| writer.peer_addr().unwrap() != reader_addr);
                    },
                }
            }
        }
    }
}
This fails with:
error: recursion limit reached while expanding the macro `$crate::dispatch`
  --> src/main.rs:36:9
   |
36 | /         select! {
37 | |             maybe_stream = incoming.select_next_some() => {
38 | |                 let (mut reader, writer) = maybe_stream?.split();
39 | |                 writers.push(writer);
...  |
55 | |         }
56 | |     }
   | |_________^
   |
   = help: consider adding a `#![recursion_limit="128"]` attribute to your crate
   = note: this error originates in a macro outside of the current crate (in Nightly builds, run with -Z external-macro-backtrace for more info)
This is very confusing. Maybe I am using select_all() the wrong way? Any help in making it work is appreciated!
For completeness, my Cargo.toml:
[package]
name = "async-test"
version = "0.1.0"
authors = ["xxx"]
edition = "2018"
[dependencies]
runtime = "0.3.0-alpha.6"
futures-preview = { version = "=0.3.0-alpha.16", features = ["async-await", "nightly"] }
In case someone is following, I hacked it together finally. This code works:
#![feature(async_await)]
#![feature(async_closure)]
#![recursion_limit="128"]
use futures::{
    prelude::*,
    select,
    stream,
    io::ReadHalf,
    channel::{
        oneshot,
        mpsc::{unbounded, UnboundedSender},
    }
};
use runtime::net::{TcpListener, TcpStream};
use std::{
    io,
    net::SocketAddr,
    collections::HashMap,
};
async fn read_stream(
    addr: SocketAddr,
    drop: oneshot::Receiver<()>,
    mut reader: ReadHalf<TcpStream>,
    sender: UnboundedSender<(SocketAddr, io::Result<Box<[u8]>>)>
) {
    let mut drop = drop.fuse();
    loop {
        let mut buffer: Vec<u8> = vec![0; 1024];
        select! {
            result = reader.read(&mut buffer).fuse() => {
                match result {
                    Ok(len) => {
                        buffer.truncate(len);
                        sender.unbounded_send((addr, Ok(buffer.into_boxed_slice())))
                            .expect("Channel error");
                        if len == 0 {
                            return;
                        }
                    },
                    Err(err) => {
                        sender.unbounded_send((addr, Err(err))).expect("Channel error");
                        return;
                    }
                }
            },
            _ = drop => {
                return;
            },
        }
    }
}
enum Event {
    Connection(io::Result<TcpStream>),
    Message(SocketAddr, io::Result<Box<[u8]>>),
}
#[runtime::main]
async fn main() -> std::io::Result<()> {
    let mut listener = TcpListener::bind("127.0.0.1:8080")?;
    eprintln!("Listening on {}", listener.local_addr()?);
    let mut writers = HashMap::new();
    let (sender, receiver) = unbounded();
    let connections = listener.incoming().map(|maybe_stream| Event::Connection(maybe_stream));
    let messages = receiver.map(|(addr, maybe_message)| Event::Message(addr, maybe_message));
    let mut events = stream::select(connections, messages);
    loop {
        match events.next().await {
            Some(Event::Connection(Ok(stream))) => {
                let addr = stream.peer_addr().unwrap();
                eprintln!("New connection from {}", addr);
                let (reader, writer) = stream.split();
                let (drop_sender, drop_receiver) = oneshot::channel();
                writers.insert(addr, (writer, drop_sender));
                runtime::spawn(read_stream(addr, drop_receiver, reader, sender.clone()));
            },
            Some(Event::Message(addr, Ok(message))) => {
                if message.len() == 0 {
                    eprintln!("Connection closed by client: {}", addr);
                    writers.remove(&addr);
                    continue;
                }
                eprintln!("Received {} bytes from {}", message.len(), addr);
                if &*message == b"quit\n" {
                    eprintln!("Dropping client {}", addr);
                    writers.remove(&addr);
                    continue;
                }
                for (&other_addr, (writer, _)) in &mut writers {
                    if addr != other_addr {
                        writer.write_all(&message).await.ok(); // Ignore errors
                    }
                }
            },
            Some(Event::Message(addr, Err(err))) => {
                eprintln!("Error reading from {}: {}", addr, err);
                writers.remove(&addr);
            },
            _ => panic!("Event error"),
        }
    }
}
I use a channel and spawn a reading task for each client. Special care had to be taken to ensure that readers get dropped together with their writers: this is why a oneshot channel is used. When the oneshot::Sender is dropped, the oneshot::Receiver future resolves to a canceled state, which serves as the notification that tells a reading task it is time to halt. To demonstrate that it works, we drop a client as soon as we get a "quit" message.
Sadly, there is a (seemingly useless) warning regarding an unused JoinHandle from the runtime::spawn call, and I don't really know how to eliminate it.
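The drop-as-notification idea can be shown in isolation. The sketch below is an analogy, not the code above: it swaps the futures oneshot channel for the standard library's std::sync::mpsc channel and a plain thread, and `run_until_dropped` is a hypothetical name. The worker never receives a value; it stops because the paired Sender has been dropped.

```rust
use std::sync::mpsc::{Receiver, TryRecvError};
use std::thread;
use std::time::Duration;

/// Simulates a reading task: keeps working until the paired Sender is dropped,
/// then returns how many work iterations it completed.
fn run_until_dropped(drop_signal: Receiver<()>) -> u32 {
    let mut iterations = 0;
    loop {
        match drop_signal.try_recv() {
            // The sender is gone: this is the signal to halt.
            Err(TryRecvError::Disconnected) => return iterations,
            // Sender still alive (with or without a message): keep working.
            Ok(()) | Err(TryRecvError::Empty) => {
                iterations += 1;
                thread::sleep(Duration::from_millis(1)); // stand-in for real work
            }
        }
    }
}
```

Keeping the Sender next to the writer, as the HashMap in the code above does with drop_sender, means that removing the map entry is enough to stop the matching reader.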
I tried the chat example with WebSocket in Play Framework 2.6.x. It works fine. Now, for the real application, I need to create multiple chat rooms based on user requests, and users should be able to access different chat rooms by an id or something similar. I think it might involve creating a new flow for each room. The related code is here:
private val (chatSink, chatSource) = {
  val source = MergeHub.source[WSMessage]
    .log("source")
    .map { msg =>
      try {
        val json = Json.parse(msg)
        inputSanitizer.sanText((json \ "msg").as[String])
      } catch {
        case e: Exception => println(">>" + msg)
          "Malfunction client"
      }
    }
    .recoverWithRetries(-1, { case _: Exception ⇒ Source.empty })
  val sink = BroadcastHub.sink[WSMessage]
  source.toMat(sink)(Keep.both).run()
}
private val userFlow: Flow[WSMessage, WSMessage, _] = {
  Flow.fromSinkAndSource(chatSink, chatSource)
}
But I really don't know how to create a new flow with an id and access it later. Can anyone help me with this?
I finally figured it out. I am posting the solution here in case anyone has a similar problem.
My solution is to use the AsyncCacheApi to store Flows in cache with keys. Generate a new Flow when necessary instead of creating just one Sink and Source:
val chatRoom = cache.get[Flow[WSMessage, WSMessage, _]](s"id=$id")
chatRoom.map{room=>
  val flow = if(room.nonEmpty) room.get else createNewFlow
  cache.set(s"id=$id", flow)
  Right(flow)
}
def createNewFlow: Flow[WSMessage, WSMessage, _] = {
  val (chatSink, chatSource) = {
    val source = MergeHub.source[WSMessage]
      .map { msg =>
        try {
          inputSanitizer.sanitize(msg)
        } catch {
          case e: Exception => println(">>" + msg)
            "Malfunction client"
        }
      }
      .recoverWithRetries(-1, { case _: Exception ⇒ Source.empty })
    val sink = BroadcastHub.sink[WSMessage]
    source.toMat(sink)(Keep.both).run()
  }
  Flow.fromSinkAndSource(chatSink, chatSource)
}
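The underlying idea, one shared room per id that is created on first request, can be sketched independently of Play. The Rust sketch below is only an illustration under assumptions: `Room`, `Rooms`, and `get_or_create` are hypothetical names, and `Room` stands in for the Flow built by createNewFlow. It does the lookup and the insertion under one lock, so two users asking for a brand-new id at the same moment end up in the same room; with a separate get followed by set, that is not obviously guaranteed.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for one chat room's flow (its sink and source pair).
#[derive(Debug)]
struct Room {
    id: String,
}

/// Registry handing out one shared room per id, created on first request.
#[derive(Default)]
struct Rooms {
    by_id: Mutex<HashMap<String, Arc<Room>>>,
}

impl Rooms {
    /// Returns the existing room for `id`, or creates and stores a new one.
    fn get_or_create(&self, id: &str) -> Arc<Room> {
        let mut rooms = self.by_id.lock().unwrap();
        // Lookup and insertion share one critical section, so concurrent
        // callers with the same new id cannot each create their own room.
        let room = rooms
            .entry(id.to_string())
            .or_insert_with(|| Arc::new(Room { id: id.to_string() }));
        Arc::clone(room)
    }
}
```

On the Play side, AsyncCacheApi also has a getOrElseUpdate method that may be worth a look for the same purpose.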
Triggered by another question (which has been subsequently edited away though), I wanted to try out how easy it would be to chain calls to Scala 2.10's Try construct (cf. this presentation), using for-comprehensions.
The idea is to have a list of tokens and match them against a sequence of patterns, then return the first error or the successfully matched pattern. I arrived at the following pretty awkward version, and I wonder if this can be made simpler and nicer:
import util.Try
trait Token
case class Ident (s: String) extends Token
case class Keyword(s: String) extends Token
case class Punct (s: String) extends Token
case object NoToken extends Token
case class FunctionDef(id: Ident)
case class Expect[A](expectation: String)(pattern: PartialFunction[Token, A]) {
  def unapply(tup: (Try[_], Token)) = Some(tup._1.map { _ =>
    pattern.lift(tup._2).getOrElse(throw new Exception(expectation))
  })
}
Now construct the expectations for Keyword("void") :: Ident(id) :: Punct("(") :: Punct(")") :: tail
val hasVoid = Expect("function def starts with void") { case Keyword("void") => }
val hasIdent = Expect("expected name of the function") { case id: Ident => id }
val hasOpen = Expect("expected opening parenthesis" ) { case Punct("(") => }
val hasClosed = Expect("expected closing parenthesis" ) { case Punct(")") => }
Construct a full test case:
def test(tokens: List[Token]) = {
  val iter = tokens.iterator
  def next(p: Try[_]) = Some(p -> (if (iter.hasNext) iter.next else NoToken))
  def first() = next(Try())
  val sq = for {
    hasVoid (vd) <- first()
    hasIdent (id) <- next(vd)
    hasOpen (op) <- next(id)
    hasClosed(cl) <- next(op)
  } yield cl.flatMap(_ => id).map(FunctionDef(_))
  sq.head
}
The following verifies the test method:
// the following fail with successive errors
test(Nil)
test(Keyword("hallo") :: Nil)
test(Keyword("void" ) :: Nil)
test(Keyword("void" ) :: Ident("name") :: Nil)
test(Keyword("void" ) :: Ident("name") :: Punct("(") :: Nil)
// this completes
test(Keyword("void" ) :: Ident("name") :: Punct("(") :: Punct(")") :: Nil)
Now, the additional flatMap and map in the yield seem especially horrible, as does the need to call head on the result of the for comprehension.
Any ideas? Is Try very badly suited for for comprehensions? Shouldn't either Either or Try be "fixed" to allow for this type of threading (e.g. allow Try as a direct result type of unapply)?
The trick seems to be to not create Try instances in the inner structure, but instead let that throw exceptions and construct one outer Try.
First, let's get rid of the Try[Unit]'s:
case class Expect(expectation: String)(pattern: PartialFunction[Token, Unit]) {
  def unapply(token: Token) =
    pattern.isDefinedAt(token) || (throw new Exception(expectation))
}
case class Extract[A](expectation: String)(pattern: PartialFunction[Token, A]) {
  def unapply(token: Token) = Some(
    pattern.lift(token).getOrElse(throw new Exception(expectation))
  )
}
Then the checks become:
val hasVoid = Expect ("function def starts with void") { case Keyword("void") => }
val getIdent = Extract("expected name of the function") { case id: Ident => id }
val hasOpen = Expect ("expected opening parenthesis" ) { case Punct("(") => }
val hasClosed = Expect ("expected closing parenthesis" ) { case Punct(")") => }
And the test method:
def test(tokens: List[Token]) = Try {
  val iter = tokens.iterator
  def next() = Some(if (iter.hasNext) iter.next else NoToken)
  (for {
    hasVoid() <- next()
    getIdent(id) <- next()
    hasOpen() <- next()
    hasClosed() <- next()
  } yield FunctionDef(id)).head // can we get rid of the `head`?
}
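As a point of comparison only, the same "first error or the successfully matched pattern" threading can be written with a Result type and early return. The sketch below uses Rust's `?` operator rather than Scala's Try; `expect`, `is_punct`, and `parse_function_def` are hypothetical names, and the error messages are the ones from the checks above.

```rust
#[derive(Debug, PartialEq)]
enum Token {
    Ident(String),
    Keyword(String),
    Punct(String),
}

#[derive(Debug, PartialEq)]
struct FunctionDef {
    id: String,
}

/// Takes the next token and applies `pattern` to it; a missing or
/// non-matching token yields `expectation` as the error.
fn expect<A>(
    tokens: &mut impl Iterator<Item = Token>,
    expectation: &str,
    pattern: impl Fn(Token) -> Option<A>,
) -> Result<A, String> {
    tokens.next().and_then(pattern).ok_or_else(|| expectation.to_string())
}

/// Builds a pattern that accepts exactly one punctuation token.
fn is_punct(expected: &'static str) -> impl Fn(Token) -> Option<()> {
    move |token| match token {
        Token::Punct(p) if p == expected => Some(()),
        _ => None,
    }
}

/// Matches `void <ident> ( )`; each `?` returns the first failed expectation.
fn parse_function_def(tokens: Vec<Token>) -> Result<FunctionDef, String> {
    let mut iter = tokens.into_iter();
    expect(&mut iter, "function def starts with void", |t| match t {
        Token::Keyword(k) if k == "void" => Some(()),
        _ => None,
    })?;
    let id = expect(&mut iter, "expected name of the function", |t| match t {
        Token::Ident(name) => Some(name),
        _ => None,
    })?;
    expect(&mut iter, "expected opening parenthesis", is_punct("("))?;
    expect(&mut iter, "expected closing parenthesis", is_punct(")"))?;
    Ok(FunctionDef { id })
}
```

The shape mirrors the final Scala version: the individual checks know nothing about the result wrapper, and the threading of the first error lives in one outer construct.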