How to efficiently serve a file with actix-web - performance

In actix-web, it is possible to serve a file by returning in a handler:
HttpResponse::Ok().streaming(file)
But here, file must implement the Stream<Item = Result<Bytes, E>> trait. The File type from the crate async_std does not implement it, so I created a wrapper that implements it:
struct FileStreamer {
file: File,
}
impl Stream for FileStreamer {
type Item = Result<Bytes, std::io::Error>;
fn poll_next(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
let mut buf = [0; 1024];
self.file.read(&mut buf).poll_unpin(cx).map(|r| {
r.map(|n| {
if n == 0 {
None
} else {
Some(Bytes::copy_from_slice(&buf[0..n]))
}
})
.transpose()
})
}
}
It works, but there is a problem: every call to read creates a new Bytes instance, which is a dynamically allocated buffer.
Is this the most efficient way to serve a file in actix-web?
It also seems to me that choosing the right buffer size is critical here: a small buffer causes repeated syscalls, while an overly large buffer makes each allocation slower and may never be filled completely.
Am I right to consider recurring dynamic allocation a performance issue?
PS: The file in question is not static; it is subject to modification and deletion, so I need control over the reading process.

I feel you are overthinking the problem of streaming a file. actix-web can send a file based on its path, using NamedFile from the actix_files crate.
This example, from the actix-web documentation, takes a dynamic path from the URL:
use actix_files::NamedFile;
use actix_web::{HttpRequest, Result};
use std::path::PathBuf;
async fn index(req: HttpRequest) -> Result<NamedFile> {
let path: PathBuf = req.match_info().query("filename").parse().unwrap();
Ok(NamedFile::open(path)?)
}
#[actix_web::main]
async fn main() -> std::io::Result<()> {
use actix_web::{web, App, HttpServer};
HttpServer::new(|| App::new().route("/{filename:.*}", web::get().to(index)))
.bind("127.0.0.1:8080")?
.run()
.await
}


Does CoroutineScope(SupervisorJob()) run in the Main scope?

I was working through this codelab:
https://developer.android.com/codelabs/android-room-with-a-view-kotlin#13
and have a question about the following code:
class WordsApplication : Application() {
// No need to cancel this scope as it'll be torn down with the process
val applicationScope = CoroutineScope(SupervisorJob())
// Using by lazy so the database and the repository are only created when they're needed
// rather than when the application starts
val database by lazy { WordRoomDatabase.getDatabase(this, applicationScope) }
val repository by lazy { WordRepository(database.wordDao()) }
}
private class WordDatabaseCallback(
private val scope: CoroutineScope
) : RoomDatabase.Callback() {
override fun onCreate(db: SupportSQLiteDatabase) {
super.onCreate(db)
INSTANCE?.let { database ->
scope.launch {
var wordDao = database.wordDao()
// Delete all content here.
wordDao.deleteAll()
// Add sample words.
var word = Word("Hello")
wordDao.insert(word)
word = Word("World!")
wordDao.insert(word)
// TODO: Add your own words!
word = Word("TODO!")
wordDao.insert(word)
}
}
}
}
This is the code I found. As you can see, it calls scope.launch(...) directly.
My question is:
Aren't all Room operations supposed to run in a non-UI scope? Could someone help me understand this? Thanks so much!
Does CoroutineScope(SupervisorJob()) run in the Main scope?
No. By default CoroutineScope() uses Dispatchers.Default, as can be found in the documentation:
CoroutineScope() uses Dispatchers.Default for its coroutines.
Aren't all Room operations supposed to run in a non-UI scope?
I'm not very familiar with Room specifically, but generally speaking it depends on whether the operation is suspending or blocking. You can run suspend functions from any dispatcher/thread. The deleteAll() and insert() functions in the example are marked as suspend, so you can run them from both UI and non-UI threads.

Having a singleton pattern in Go Wire injection

I have a piece of code that loads configuration from a file and parses it into a struct. I use this configuration quite often, so I pass it around in method parameters. Now that my method parameters are increasing, I am looking at dependency injection and have settled on wire.
I have created a provider to load the configuration and an injector to provide the config struct. However, each time I call the injector my file is read again. I want the file to be read once and the injection provided as many times as required without any additional loading.
Here is my provider:
// ProvideConfig config provider ...
func ProvideConfig() *config.FileConfig {
var cfp string
flag.StringVar(&cfp, "config", "config.json", "absolute path")
flag.Parse()
return config.Loadconfig(cfp)
}
Injector:
// GetConfig injector ...
func GetConfig() ConfigResource {
wire.Build(ProvideConfig, NewConfigResource)
return ConfigResource{}
}
Now when I call:
injection.GetConfig()
I see that ProvideConfig is always called. I could add a check in ProvideConfig to determine whether the config is already loaded, but I am not sure if there is a better way, something like a single-instance loader built into wire. I tried looking in the docs but could not find anything relevant.
As far as I'm aware, there's no built-in way in wire to specify that a provider is a singleton / should only be called once.
This is accomplished in the usual way in Go, by using sync.Once. Your provider function can be a closure that does the expensive operation only once using sync.Once.Do. This is idiomatic in Go, and doesn't require any special provision from every library that wants to provide "single" loading.
Here's an example without wire:
type Value struct {
id int
msg string
}
type ValueProvider func() *Value
// consumer takes a function that provides a new *Value and consumes
// the *Value provided by it.
func consumer(vp ValueProvider) {
v := vp()
fmt.Printf("Consuming %+v\n", *v)
}
// MakeSingleLoader returns a ValueProvider that creates a value once using an
// expensive operation, and then keeps returning the same value.
func MakeSingleLoader() ValueProvider {
var v *Value
var once sync.Once
return func() *Value {
once.Do(func() {
v = ExpensiveOperation()
})
return v
}
}
// ExpensiveOperation emulates an expensive operation that can take a while
// to run.
func ExpensiveOperation() *Value {
return &Value{id: 1, msg: "hello"}
}
func main() {
sl := MakeSingleLoader()
consumer(sl)
consumer(sl)
consumer(sl)
}
If you're OK with the "singleton" value being a global, this code can be simplified a bit. As written, it calls ExpensiveOperation only once and keeps the value cached in a local variable that is inaccessible outside MakeSingleLoader.
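Applied to the provider from the question, the simplified global variant could look like the sketch below. It is only an illustration under assumptions: FileConfig and loadConfig are hypothetical stand-ins for config.FileConfig and config.Loadconfig, the path is hard-coded instead of coming from a flag, and loadCount exists only to make the single load visible.

```go
package main

import (
	"fmt"
	"sync"
)

// FileConfig is a hypothetical stand-in for config.FileConfig.
type FileConfig struct {
	Path string
}

var (
	cfgOnce   sync.Once
	cfg       *FileConfig
	loadCount int // counts real loads, for demonstration only
)

// loadConfig is a hypothetical stand-in for config.Loadconfig.
func loadConfig(path string) *FileConfig {
	loadCount++
	return &FileConfig{Path: path}
}

// ProvideConfig keeps the signature of an ordinary provider, so it can
// still be listed in wire.Build. The file is loaded on the first call only;
// later calls return the cached pointer.
func ProvideConfig() *FileConfig {
	cfgOnce.Do(func() {
		cfg = loadConfig("config.json")
	})
	return cfg
}

func main() {
	a := ProvideConfig()
	b := ProvideConfig()
	fmt.Println(a == b, loadCount) // true 1
}
```

If the flag handling from the original provider is kept, it would probably belong inside the once.Do callback as well, since defining the same flag twice on the default flag set panics.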

golang API interface, what am I missing?

I want to create an interface to make it easy to add new storage backends.
package main
// Storage is an interface to describe storage backends
type Storage interface {
New() (newStorage Storage)
}
// File is a type of storage that satisfies the interface Storage
type File struct {
}
// New returns a new File
func (File) New() (newFile Storage) {
newFile = File{}
return newFile
}
// S3 is a type of storage that satisfies the interface Storage
type S3 struct {
}
// New returns a new S3
func (S3) New() (newS3 S3) {
newS3 = S3{}
return newS3
}
func main() {
// List of backends to choose from
myStorage := make(map[string]Storage) // a nil map would panic on assignment
myStorage["file"] = File{}
myStorage["s3"] = S3{}
// Using one of the backends on demand
myStorage["file"].New()
myStorage["s3"].New()
}
But it does not seem possible to define and satisfy a method that returns a value which itself satisfies the interface.
File.New() returns a value of type Storage, which satisfies Storage.
S3.New() returns a value of type S3.
S3 should satisfy the Storage interface as well, but I get this:
./main.go:32: cannot use S3 literal (type S3) as type Storage in assignment:
S3 does not implement Storage (wrong type for New method)
have New() S3
want New() Storage
What am I doing wrong?
I hope I am missing something basic here.
This code does not make much sense. Either you are implementing a factory pattern that is tied to a struct of the very type the factory is supposed to produce, or you are reinventing the wheel by reimplementing the built-in new function and tying it to a value that is still nil at the time you would use it.
You can either get rid of the helper function and simply use
s := new(S3)
f := new(File)
Or you could use a plain factory function, such as:
// Do NOT tie your factory to your type
func New() S3 {
return S3{}
}
Or, which seems to better suit your use case, create a factory interface, implement it and have its New() function return a Storage instance:
type StorageFactory interface {
New() Storage
}
type S3Factory struct{}
func (f *S3Factory) New() Storage {
return S3{}
}
There are various ways of registering your factory. You could use a global variable and init:
import _ "example.com/foo/storage/s3" // imported only for its init side effect
type FactoryGetter func() StorageFactory
type FactoryRegistry map[string]FactoryGetter
// Registry will be updated by an init function in the storage provider packages
var Registry FactoryRegistry
func init() {
Registry = make(map[string]FactoryGetter)
}
// For the sake of brevity, a const. Make it a flag, for example
const storageProvider = "s3"
func main() {
f := Registry[storageProvider]()
s := f.New()
s.List()
}
And somewhere in the S3 package:
func init() {
// New has a pointer receiver, so register a *S3Factory
Registry["s3"] = func() StorageFactory { return &S3Factory{} }
}
You could even think of having the factories take parameters.
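A parameterized registry might look roughly like the following sketch. Everything in it is an assumption made for illustration: the Params map, the Register and NewStorage helpers, the Name method, and the "bucket" key are hypothetical and not part of the code above.

```go
package main

import (
	"errors"
	"fmt"
)

// Storage is reduced to a single method to keep the sketch short.
type Storage interface {
	Name() string
}

// Params carries backend-specific settings (hypothetical shape).
type Params map[string]string

// Factory builds a Storage from parameters.
type Factory func(p Params) (Storage, error)

// registry is initialized at declaration, so it is ready before init runs.
var registry = map[string]Factory{}

// Register makes a backend available under the given name.
func Register(name string, f Factory) { registry[name] = f }

// NewStorage looks up a backend by name and builds it with the parameters.
func NewStorage(name string, p Params) (Storage, error) {
	f, ok := registry[name]
	if !ok {
		return nil, fmt.Errorf("unknown storage backend %q", name)
	}
	return f(p)
}

// S3 is a toy backend that only remembers its bucket.
type S3 struct{ bucket string }

func (s S3) Name() string { return "s3:" + s.bucket }

func init() {
	Register("s3", func(p Params) (Storage, error) {
		b := p["bucket"]
		if b == "" {
			return nil, errors.New("s3: missing bucket parameter")
		}
		return S3{bucket: b}, nil
	})
}

func main() {
	s, err := NewStorage("s3", Params{"bucket": "photos"})
	fmt.Println(s.Name(), err) // s3:photos <nil>
}
```

Returning an error from the factory lets each backend validate its own parameters, so the caller does not need to know which keys a given backend requires.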
I like what you're doing here and I've actually worked on projects that involved very similar design challenges, so I hope my suggestions can help you out some.
In order to satisfy the interface, you'd need to update your code from...
// New returns a new S3
func (S3) New() (newS3 S3) {
newS3 = S3{}
return newS3
}
to this
// New returns a new S3
func (S3) New() (newS3 Storage) {
newS3 = S3{}
return newS3
}
This means you will receive an instance of Storage back, so to speak. If you want to then access anything from S3 without having to use type assertion, it would be best to expose that S3 function/method in the interface.
So let's say you want a way to List your objects in your S3 client. A good approach to supporting this would be to update Storage interface to include List, and update S3 so it has its own implementation of List:
// Storage is an interface to describe storage backends
type Storage interface {
New() (newStorage Storage)
List() ([]entry, error) // or however you would prefer to trigger List
}
...
// List returns the entries stored in S3
func (S3) List() ([]entry, error) {
// initialize "entry" slice
// do work, looping through pages or something
// return entry slice and error if one exists
}
When it comes time to add support for Google Cloud Storage, Rackspace Cloud Files, Backblaze B2, or any other object storage provider, each of them will also need to implement List() ([]entry, error), which is good! Once you've used this List function in the way you need, adding more clients/providers will be more like developing plugins than writing or architecting code (since your design is complete by that point).
The real key with satisfying interfaces is to have the signature match exactly and think of interfaces as a list of common functions/methods that you'd want every storage provider type to handle in order to meet your goals.
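One way to catch a signature mismatch early is a compile-time assertion, sketched below. As a simplification, List returns []string here instead of the entry type mentioned above, and the file names are made up.

```go
package main

import "fmt"

// Storage lists the methods every backend must provide.
type Storage interface {
	New() Storage
	List() ([]string, error)
}

// S3 is a toy backend.
type S3 struct{}

// New returns Storage, not S3, so the signature matches the interface exactly.
func (S3) New() Storage { return S3{} }

// List returns made-up entries for demonstration.
func (S3) List() ([]string, error) { return []string{"a.txt", "b.txt"}, nil }

// Compile-time check: the build fails here if S3 stops satisfying Storage,
// for example if New were changed to return S3 again.
var _ Storage = S3{}

func main() {
	backends := map[string]Storage{"s3": S3{}}
	s := backends["s3"].New()
	entries, err := s.List()
	fmt.Println(entries, err) // [a.txt b.txt] <nil>
}
```

The blank-identifier assignment costs nothing at run time and reports the mismatch next to the type itself, rather than at the first place the type is used as a Storage.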
If you have any questions or if anything I've written is unclear, please comment and I'll be happy to clarify or adjust my post :)

Why does Rust forbid implementations of existing types?

How is this...
impl String {
fn foo(&self) {
//...
}
}
...any different to this?
fn foo(s: &String) {
//...
}
Then again, it is possible to extend the type implementation if you define a trait in your crate. Why?
There are several arguments, as indicated by the following source, for why one cannot add implementations to existing types that are defined outside one's own crate.
A local impl can be broken by future implementations. For example, consider: "you've locally defined Add on Vec<T> as a concat operator, ... , and then ... after years of debate ... some mathy operation [is] to be performed instead. If you delete your impl and upgrade, your code will be ... broken" [2].
The readability of the code would also be affected, that is, it could make the "value of that reading far more transient" [3].
There is also a security concern. Consider the following scenario, which would be technically possible if this were allowed: "an attacker [could] find a change in an impl in [some] library, a call site in an application they wish to backdoor, and send a "refactoring" pull request that "accidentally" replaces the new impl with the old impl so as to create a vulnerability, but their pull can reference the old code from the library. And they can embed the hostile impl into a macro in yet another crate" [4].
Suppose local impls were allowed and the local impl were the preferred implementation. This "would violate the coherence property [that is currently being maintained]" [5]. The point can be clarified further through what is called the "HashTable" problem [5]:
mod foo {
impl Hash for i32 { ... }
fn f(mut table: HashMap<i32, &'static str>) {
table.insert(0, "hello");
::bar::f(&table);
}
}
mod bar {
impl Hash for i32 { ... }
fn f(table: &HashMap<i32, &'static str>) {
assert_eq!(table.get(&0), Some(&"hello"));
}
}

struggling with asynchronous patterns using NSURLSession

I'm using Xcode 7 and Swift 2, but my question isn't necessarily code specific; I'll gladly take help of any variety.
In my app I have a list of favorites. Due to API TOS I can't store any data, so I just keep a stub I can use to lookup when the user opens the app. I also have to look up each favorite one by one as there is no batch method. Right now I have something like this:
self.api.loadFavorite(id, completion: { (event, errorMessage) -> Void in
if errorMessage == "" {
if let rc = self.refreshControl {
dispatch_async(dispatch_get_main_queue()) { () -> Void in
rc.endRefreshing()
}
}
dispatch_async(dispatch_get_main_queue()) { () -> Void in
self.viewData.append(event)
self.viewData.sortInPlace({ $0.eventDate.compare($1.eventDate) == NSComparisonResult.OrderedDescending })
self.tableView.reloadData()
}
} else {
// some more error handling here
}
})
In api.loadFavorite I'm making a typical urlSession.dataTaskWithURL call, which is itself asynchronous.
You can see what happens here: the results are loaded one by one, and the view refreshes after each one. This works, but it's not optimal; for long lists you get a noticeable "flickering" as the view sorts and refreshes.
I want to be able to get all the results and then refresh just once. I tried putting a dispatch group around the api.loadFavorite calls, but the async calls in dataTaskWithURL don't seem to be bound by that group. I also tried putting the dispatch group around just the dataTaskWithURL call, but didn't have any better luck. The dispatch_group_notify always fires before all the data tasks are done.
Am I going about this all wrong? (Probably.) I considered switching to synchronous calls on a background thread, since the API only allows one connection per client anyway, but that just feels like the wrong approach.
I'd love to know how to group async calls that make other async calls so that I can get a single notification to update my UI.
For the record, I've read just about every dispatch group thread I could find here and haven't been able to make any of them work. Most examples on the web are very simple: a series of prints in a dispatch group with a sleep to prove the case.
Thanks in advance.
If you want to invoke your method loadFavorite asynchronously in a loop for all favorite ids - which executes them in parallel - you can achieve this with a new method as shown below:
func loadFavorites(ids:[Int], completion: ([Event], ErrorType?) -> ()) {
var count = ids.count
var events = [Event]()
if count == 0 {
dispatch_async(dispatch_get_global_queue(0, 0)) {
completion(events, nil)
}
return
}
let sync_queue = dispatch_queue_create("sync_queue", dispatch_queue_attr_make_with_qos_class(DISPATCH_QUEUE_SERIAL, QOS_CLASS_USER_INITIATED, 0))
for i in ids {
self.api.loadFavorite(i) { (event, message) in
dispatch_async(sync_queue) {
if message == "" {
events.append(event)
if --count == 0 {
dispatch_async(dispatch_get_global_queue(0, 0)) {
completion(events, nil)
}
}
}
else {
// handle error
}
}
}
}
}
Note:
- Use a sync queue in order to synchronise access to the shared array events and the counter!
- Use a global dispatch queue to invoke the completion handler!
Then call it like below:
self.loadFavorites(favourites) { (events, error) in
if (error == nil) {
events.sortInPlace({ $0.eventDate.compare($1.eventDate) == NSComparisonResult.OrderedDescending })
dispatch_async(dispatch_get_main_queue()) { () -> Void in
self.viewData = events
self.tableView.reloadData()
}
}
if let rc = self.refreshControl {
dispatch_async(dispatch_get_main_queue()) { () -> Void in
rc.endRefreshing()
}
}
}
Note also that you need a different approach if you want to ensure that your calls to loadFavorite run sequentially.
If you need to support cancellation (well, who does not require this?), you might try to cancel the NSURLSession's tasks. However, in this case I would recommend using a third-party library that already supports cancellation of network tasks.
Alternatively, to greatly simplify asynchronous problems like these, build your network task and any other asynchronous task around a general utility class, frequently called Future or Promise. A future represents an eventual result and is quite lightweight. Futures are also "composable": you can define "continuations" that get invoked when the future completes, which in turn return yet another future to which you can add more continuations, and so forth. See the wiki article on Futures and Promises.
There are a couple of implementations in Swift and Objective-C. Ideally, these should also support cancellation. Unfortunately, I don't know of any Swift library implementing Futures or Promises that supports cancellation at this time, except my own library, which is not yet open source.
Another library that helps solve common and also very complex asynchronous patterns is ReactiveCocoa, though it has a very steep learning curve and adds quite a lot of code to your project.
This is what finally worked for me. Easy once I figured it out. My problem was trying to take ObjC examples and rework them for Swift.
func migrateFavorites(completion:(error: Bool) -> Void) {
let migrationGroup = dispatch_group_create()
let queue = dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0)
// A lot of other code in there, fetching some core data etc
dispatch_group_enter(migrationGroup)
self.api.loadFavorite(id, completion: { (event, errorMessage) -> Void in
if errorMessage == "" {
if let rc = self.refreshControl {
dispatch_async(dispatch_get_main_queue()) { () -> Void in
rc.endRefreshing()
}
}
dispatch_async(dispatch_get_main_queue()) { () -> Void in
self.viewData.append(event)
self.viewData.sortInPlace({ $0.eventDate.compare($1.eventDate) == NSComparisonResult.OrderedDescending })
self.tableView.reloadData()
}
} else {
// some more error handling here
}
dispatch_group_leave(migrationGroup)
})
dispatch_group_notify(migrationGroup, queue) { () -> Void in
NSLog("Migration Queue Complete")
dispatch_async(dispatch_get_main_queue()) { () -> Void in
completion(error: migrationError)
}
}
}
The key was:
- ENTER the group just before the async call
- LEAVE the group as the last line in the completion handler
As I mentioned, all this is wrapped up in a function, so I put the function's completion handler inside the dispatch_group_notify. I call this function, and the completion handler only gets invoked when all the async tasks are complete. Back on my main thread, I check for the error and refresh the UI.
Hopefully this helps someone with the same problem.
