A talk on Generators & Bluebird.js coroutine()

Intro

Generators can be scary and confusing, and they can require a lot of setup to really get the most out of. Most people just want to yield an expression so that certain async actions occur in a specific order; from what I have seen, this is most common when dealing with Mongo reads and writes using promises.

In reality, the patterns I will go over can be applied to any promise-based or non-promise-based setup where you have multiple async operations that you may or may not have to wait on.

Bluebird.js

If you are not familiar with Bluebird.js, here is the skinny. Bluebird.js is a library that fills a single purpose : better, faster promises and support structures. The Bluebird team has done a great job of making Promises fast and accessible in a variety of environments, which makes it a great tool for any Node or front end project that needs a usable, spec-compliant Promise implementation.

You can read their own "Why Bluebird?" section here : http://bluebirdjs.com/docs/why-bluebird.html

You can also see the benchmarks here : http://bluebirdjs.com/docs/benchmarks.html

Being able to rely on promises and generators being available is key to a lot of the work I do these days. They help control the flow of things like multiple atomic operations occurring inside of a single controller action, which makes the action non-atomic as a whole. This kind of thing can become a nightmare when dealing with in-order writes or reads from, say, Mongo.

Generators

If you haven't heard of generators, or your knowledge of them is lacking a bit, here is a quick description : generators are functions that can be executed and then exited, with their state maintained, and then reentered at another time.

Many people ask why I don't just use async/await. Without creating a debate or an entire article, it is mostly because the core of async/await is generators. I prefer to use the common denominator.

In addition to that, Bluebird's benchmarks show it to be faster than native promises and than other libs that provide more limited functionality sets. Refer to the benchmarks link above for more info on how those benchmarks are created.

Creating & Using Generators

Finally, let's look at some code! The generator syntax is very simple and should look pretty familiar aside from a single difference.
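As a minimal sketch of that syntax (the name countToThree is made up for illustration), a generator declaration looks like this:

```javascript
// A generator function: note the * directly after the `function` keyword.
function* countToThree() {
  yield 1;
  yield 2;
  yield 3;
}

// Calling the function does not run its body yet.
// It hands back a generator object that we can step through later.
const counter = countToThree();

console.log(typeof counter.next); // "function"
```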

The difference is the * that follows the function keyword. This is what defines the function as a generator, which allows us to exit and reenter the function using a .next() method. In reality, a generator is an Iterator with some additional functionality beyond what the built-in iterable types offer, and that functionality allows for better flow control.

Here is an example of using a generator as an Iterator type :
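The sketch below assumes a made-up generator called fruitNames. It steps through the generator by hand with .next(), and then lets the language drive the iteration with spread and for...of.

```javascript
function* fruitNames() {
  yield 'apple';
  yield 'banana';
  yield 'cherry';
}

// Step through the generator manually.
const fruits = fruitNames();
const first = fruits.next();    // { value: 'apple', done: false }
const second = fruits.next();   // { value: 'banana', done: false }
const third = fruits.next();    // { value: 'cherry', done: false }
const finished = fruits.next(); // { value: undefined, done: true }

// Or let the language call .next() for us.
const collected = [...fruitNames()]; // ['apple', 'banana', 'cherry']

for (const name of fruitNames()) {
  console.log(name);
}
```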

About The Yield Keyword

With a generator used purely as an Iterator, we aren't really doing anything that a built-in iterable such as an Array can't do. However, it is important to observe that the yield keyword gives you flow control that you don't have with the built-in iterables. The yield keyword acts like return in that it hands a value back to the caller, but it differs in an important way.

yield preserves the generator's memory state, which is what allows for iteration that you control. A paused generator always sits at the yield it last reached, just after handing out the value that follows the keyword.

Let us take a look at what is happening behind the scenes, which will give us a better understanding of how to interface with generators as an Iterator type.

Since a generator is technically an Iterator type, it exposes the normal Iterator methods, including .next(). This is how you reenter a generator that has been stepped out of using the yield keyword. When you reenter the function, execution continues from the yield where it paused and runs until the next yield, a return, or the end of the function.
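To make the pause and resume behavior visible, here is a small sketch that records the order in which things happen. The names trace and twoSteps are made up for illustration.

```javascript
const trace = [];

function* twoSteps() {
  trace.push('body started');
  yield 'first';
  trace.push('resumed after first yield');
  yield 'second';
  trace.push('resumed after second yield');
}

const steps = twoSteps();
trace.push('generator created');  // the body has not run yet

steps.next();                     // runs the body up to the first yield
trace.push('caller has control'); // the generator is paused here

steps.next();                     // resumes and runs up to the second yield
const lastStep = steps.next();    // resumes and runs to the end of the function

console.log(trace);
// [ 'generator created',
//   'body started',
//   'caller has control',
//   'resumed after first yield',
//   'resumed after second yield' ]
console.log(lastStep); // { value: undefined, done: true }
```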

About next()

We have seen the next() method continue execution until the next yield in our generator. next() can do a little more than just tell our generator to continue; it also lets us pass a value into the generator. That value becomes the result of the yield expression the generator was paused at, so we can use it right away or store it toward a cumulative value that the generator returns.

Here is an example of the cumulative return of some numbers that we pass into the generator mid-execution, between yields.
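This is a minimal sketch of that idea; runningTotal is a made-up name, and ending the loop when no value is passed in is just one convention I picked for the example.

```javascript
function* runningTotal() {
  let total = 0;
  while (true) {
    // The value passed to next() becomes the result of this yield expression.
    const amount = yield total;
    if (amount === undefined) {
      break; // nothing was passed in, so finish up
    }
    total += amount;
  }
  return total;
}

const totals = runningTotal();

// The first next() only runs the body up to the first yield.
// Any argument given to this first call would be discarded.
const primed = totals.next();      // { value: 0, done: false }
const afterFive = totals.next(5);  // { value: 5, done: false }
const afterTen = totals.next(10);  // { value: 15, done: false }
const finalTotal = totals.next();  // { value: 15, done: true }

console.log(finalTotal);
```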


This method of execution is much more controllable when dealing with potentially unknown amounts of processing or long-running processes such as job queues, stitching DB query chains, or loop processing.

It is also worth noting how the data is returned from the .next() method. It is an object literal with two fields: value and done. The value field is whatever you yielded (or, on the final step, returned) in the generator. The done field is false while the generator is paused at a yield and true once the generator has run to completion, denoting that the execution loop or sequence has finished.

About throw()  

Sometimes you need to be able to cancel the execution of the generator based on the yielded value that an iteration returned. The throw() method on the generator interface throws an error inside the generator at the yield where it is paused. If that yield is wrapped in a try/catch, the error is caught within the context of the generator, where you can handle it or re-throw it; otherwise the error bubbles up to the parent execution scope that invoked the generator.

Let's take a look at what this might look like:
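The sketch below uses two made-up generators, guardedWorker and unguardedWorker, to show both outcomes: an error caught inside the generator, and an error that bubbles up to the caller.

```javascript
const errorLog = [];

function* guardedWorker() {
  try {
    yield 'working';
    yield 'still working';
  } catch (err) {
    // The error passed to throw() lands here, inside the generator.
    errorLog.push('caught inside generator: ' + err.message);
    // Use `throw err;` here instead if the caller should see it too.
  }
  return 'cleaned up';
}

const worker = guardedWorker();
worker.next(); // { value: 'working', done: false }

// The error is raised at the paused yield, caught, and the generator runs on to its return.
const afterThrow = worker.throw(new Error('cancelled')); // { value: 'cleaned up', done: true }

// With no try/catch in the generator, the error bubbles up to the caller.
function* unguardedWorker() {
  yield 'working';
}

const unguarded = unguardedWorker();
unguarded.next();

let bubbledMessage = null;
try {
  unguarded.throw(new Error('cancelled'));
} catch (err) {
  bubbledMessage = err.message; // 'cancelled'
}

console.log(errorLog, afterThrow, bubbledMessage);
```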

You can pass the error you want thrown into throw() and it will be raised inside of the generator. This pattern can help you avoid scope hell when deciding how your generators exit before they are technically complete.

About return()

Every once in a while you will need to get the current value of a generator when it is in a completed state or, more likely, you will need to end a generator's execution sequence early without throwing an error. The return() method does exactly that: it ends the generator's execution without throwing an error.

return() gives back the same kind of object as next(). If you pass a value to return(), the value property of that object will be the value you passed in, and done will be true. This can be useful when creating reusable components, or when using a factory pattern that can return a generator.

Take a look at the following example :
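Here is a minimal sketch with a made-up, never-ending generator called ticketNumbers. The finally block is only there to show that cleanup code still runs when the generator is ended early.

```javascript
let cleanedUp = false;

function* ticketNumbers() {
  let next = 1;
  try {
    while (true) {
      yield next++;
    }
  } finally {
    // return() behaves like a `return` at the paused yield, so this runs.
    cleanedUp = true;
  }
}

const tickets = ticketNumbers();
tickets.next(); // { value: 1, done: false }
tickets.next(); // { value: 2, done: false }

// End the generator early, without an error, and choose the final value.
const ended = tickets.return('closed'); // { value: 'closed', done: true }

// The generator stays finished from here on.
const afterEnd = tickets.next(); // { value: undefined, done: true }

console.log(ended, afterEnd, cleanedUp);
```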

Bluebird Coroutine

There is sometimes confusion around what a coroutine is and what it means within different contexts. Coroutine was one of the names that generators went by around the ECMA spec for a little while, so if you google "javascript coroutines" you will find a lot of examples that look a lot like the ones in this article, because they are really just generators.

Coroutines are Promises

Bluebird.coroutine() does a few things that are different. First off, Bluebird.coroutine() wraps your generator function and returns a new function; invoking that function returns a Promise that is resolved when the generator reaches the state of done : true. This means that you can suspend execution while entire Promise-producing functions finish, which pairs nicely with the ability to wrap nearly anything using Bluebird.promisify(), which ensures that a callback-based method returns a promise instead.

This enables you to use promise patterns when waiting for a generator to complete. Observe the following example :

It is important to note that when you wrap the generator in Bluebird.coroutine(), whatever you yield needs to be a promise; the value you return can be a promise or a plain value, and it becomes the resolved value of the coroutine's Promise. This is because the coroutine is iterating your generator for you, waiting for each yielded promise to resolve and watching for the done state. It is also worth mentioning that when the yielded Promise delays its resolution by 1 second ( 1000 ms ), the results will only print after that second has passed. That is how you know that your generator's yield is actually being waited on.
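To make "iterating your generator for you" concrete, here is a rough sketch of what a coroutine runner does, written with native promises only. This is not Bluebird's implementation; runCoroutine, wait, and addLater are names made up for illustration, and unlike Bluebird this sketch also accepts plain (non-promise) yielded values.

```javascript
// A simplified coroutine runner, for illustration only.
function runCoroutine(generatorFunction) {
  return function (...args) {
    const generator = generatorFunction.apply(this, args);

    return new Promise((resolve, reject) => {
      // Advance the generator with next(value) or throw(error).
      function step(method, input) {
        let result;
        try {
          result = generator[method](input);
        } catch (err) {
          reject(err); // the generator threw and did not catch it
          return;
        }

        if (result.done) {
          resolve(result.value); // done : true settles the outer promise
          return;
        }

        // Wait on the yielded value, then feed the outcome back into the generator.
        Promise.resolve(result.value).then(
          (value) => step('next', value),
          (err) => step('throw', err)
        );
      }

      step('next', undefined);
    });
  };
}

// Hypothetical async helper: resolves with `value` after `ms` milliseconds.
const wait = (ms, value) => new Promise((resolve) => setTimeout(resolve, ms, value));

const addLater = runCoroutine(function* (a, b) {
  const left = yield wait(10, a);
  const right = yield wait(10, b);
  return left + right;
});

const sumPromise = addLater(2, 3);
sumPromise.then((sum) => console.log(sum)); // 5
```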

Note that you can also change the order in which things get resolved just by changing what gets yielded, or by moving the statement altogether.

Coroutines & Bluebird.all()

Much of the time you have a single operation that multiple other operations depend on for the data they need to complete their processing. Coroutines help a lot with this issue when combined with Bluebird.all(). The .all() method allows you to await many promises at once and then get their results all at the same time; this, paired with the fact that your coroutine-wrapped generators now return a promise instance, makes controlling multiple async operations a snap.

I personally use the following pattern all over the place for things like dealing with multiple MongoDB calls, file reads, and subsequent API calls for data stitching. Here is an example of the generator multi-async pattern :

Taking it further

There is a lot more that you can do with this pattern outside of DB calls and simple processing methods and modules. You can take your implementation a bit further by pulling in and using clustering with Node. Clustering is basically how you do something much closer to true multi-threading with Node.

To do something like this, you can wrap the signals and data that a child process sends in a generator function. That allows you to yield from, say, a function processing a network request while it waits for the worker process to complete. The master process can then field other requests coming into the event loop while the yielded call is pending.

This is a subject that will require another article, and it is one that I choose to write. But it is worth thinking about, and at least knowing about, when using things like generators, promises, and Bluebird.

Conclusion

Included is what that logic flow looks like without the coroutine and yield. If nothing else, it is easy to see that the coroutine version is much more readable code that will be easier to deal with while you're working on it and in the future when someone else has to deal with it. I use the pattern(s) shown here for all kinds of implementations, from APIs to heavy processing using clustering.
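As a sketch of that comparison, here is a flow where one lookup feeds two dependent lookups, written with plain promise chains and no coroutine. It uses native promises so it runs without any dependencies; later, getUser, getOrders, getPreferences, and loadProfileWithChains are made-up names, and the data is invented for the example.

```javascript
// Hypothetical async helper: resolves with `value` after `ms` milliseconds.
const later = (ms, value) => new Promise((resolve) => setTimeout(resolve, ms, value));

// Hypothetical lookups standing in for DB or API calls.
const getUser = (id) => later(20, { id: id, name: 'Ada' });
const getOrders = (userId) =>
  later(30, [{ userId: userId, total: 40 }, { userId: userId, total: 2 }]);
const getPreferences = (userId) => later(10, { userId: userId, theme: 'dark' });

function loadProfileWithChains(id) {
  return getUser(id).then((user) =>
    // `user` is needed again further down, so the second step has to be
    // nested inside this callback to keep it in scope.
    Promise.all([getOrders(user.id), getPreferences(user.id)]).then(
      ([orders, preferences]) => ({
        name: user.name,
        theme: preferences.theme,
        orderTotal: orders.reduce((sum, order) => sum + order.total, 0)
      })
    )
  );
}

const chainedProfile = loadProfileWithChains(7);

chainedProfile.then((profile) => console.log(profile));
// { name: 'Ada', theme: 'dark', orderTotal: 42 }
```

With only two steps the nesting is manageable, but every extra dependent step adds another level or another variable hoisted out of the chain, which is exactly what the yield-based version avoids.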
