uv_link_t - libuv pipeline

Sun Aug 14 2016 20:00:00 GMT-0400 (EDT)

Preface

Writing servers and clients in C can be non-trivial. Even with the help of powerful (and awesome dinosaur) libraries such as libuv, it still takes a lot of effort and boilerplate code to create real-world applications.

Some of this boilerplate code comes from the use of widespread protocols like TLS (SSL) and HTTP. While there are popular implementations available as Open Source libraries (OpenSSL, http-parser), they either provide a very abstract interface (like http-parser), or an API that transfers the responsibility for networking to the library itself (like SSL_set_fd() in OpenSSL and Amazon's s2n). Such an abstract nature makes them easier to embed, but adaptor code inevitably tends to appear in each particular application.

Precursor - StreamBase

libuv is hardly an exception, and the TLS implementations in node.js and bud are vivid evidence of this. However, in contrast to bud, the node.js TLS code is built on an abstraction called StreamBase. By separating libuv-specific adaptor code into a generic C++ class, we have created a foundation for a simpler and reusable implementation of any other protocol! See, for example, the recent node_http_parser.cc, which uses only a small part of the power available through StreamBase, but has nevertheless provided a 10-20% performance improvement since its inception.

This implementation has some major drawbacks, preventing its wider adoption outside of the node.js core:

  • C++ headers: lots of virtual classes, complex API, non-trivial inheritance scheme
  • High internal dependence on the node.js core itself

Because of these issues (and my own limitations) StreamBase has defied all attempts to make it public.

Heavily inspired by the success of StreamBase in the node.js core, the uv_link_t library was created. It has a lot in common with StreamBase, but it is:

  • Implemented in C: self-documented structures, C-cast based inheritance, etc
  • Standalone library

The API is based on uv_stream_t and shouldn't come as a big surprise to its users, since uv_link_t is intended to be used together with libuv.

Here is a visual explanation of how uv_link_t works:

uv_link_source_t

Examples

Before we take a peek at the APIs, let's discuss what can be done with uv_link_t. Technically, any stream-based protocol (i.e. anything that uses uv_stream_t) can be implemented on top of it. Multiple protocols can be chained together (that's why it is called uv_link_t!), provided that there is an implementation for each of them:

TCP <-> TLS <-> HTTP <-> WebSocket.

This chaining works in a pretty transparent way, and every segment of it can be observed without disturbing the data flow and operation of the other links.

Existing protocols:

  • uv_ssl_t - TLS, based on OpenSSL's API
  • uv_http_t - low-level HTTP/1.1 implementation, possibly incomplete

Small demo-project:

Note that all these projects, including uv_link_t itself, are supposed to be built with gypkg, which is a subject for a future blog post.

API

The backbone of the API is a uv_link_t structure:

#include "uv_link_t.h"

static uv_link_methods_t methods = {
  /* To be discussed below */
};

void _() {
  uv_link_t link;

  uv_link_init(&link, &methods);

  /* ... some operations */
  uv_link_close(&link, close_cb);
}
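The methods table and the "C-cast based inheritance" mentioned earlier can be illustrated without the library itself. The following is a minimal sketch of the general pattern, not uv_link_t's actual definitions: toy_link_t, toy_methods_t, upper_link_t and every function below are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical names: toy_link_t, toy_methods_t, upper_link_t.
 * They model the pattern only and are not part of uv_link_t. */
typedef struct toy_link_s toy_link_t;

typedef struct {
  /* Transform `len` bytes in place; return 0 on success. */
  int (*process)(toy_link_t* link, char* data, size_t len);
} toy_methods_t;

struct toy_link_s {
  const toy_methods_t* methods;
};

/* "Subclass": the base struct is the first member, so a pointer to
 * upper_link_t can be cast to toy_link_t* and back. */
typedef struct {
  toy_link_t base;
  size_t processed;  /* bytes seen so far */
} upper_link_t;

static int upper_process(toy_link_t* link, char* data, size_t len) {
  upper_link_t* self = (upper_link_t*) link;
  size_t i;

  for (i = 0; i < len; i++)
    if (data[i] >= 'a' && data[i] <= 'z')
      data[i] = (char) (data[i] - 'a' + 'A');
  self->processed += len;
  return 0;
}

static const toy_methods_t upper_methods = { upper_process };

static void upper_link_init(upper_link_t* link) {
  link->base.methods = &upper_methods;
  link->processed = 0;
}

/* Generic code only sees the base type and dispatches via the table. */
static int toy_link_process(toy_link_t* link, char* data, size_t len) {
  return link->methods->process(link, data, len);
}

/* Usage: upper-cases `data` in place and returns the processed byte count. */
static size_t toy_demo(char* data) {
  upper_link_t link;
  int err;

  upper_link_init(&link);
  err = toy_link_process((toy_link_t*) &link, data, strlen(data));
  assert(err == 0);
  (void) err;
  return link.processed;
}
```

Generic code never needs to know about upper_link_t: it works with the base pointer and the function table, which is the same idea that lets unrelated protocol implementations share one interface.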

In most cases the first link should be a uv_link_source_t. It consumes an instance of uv_stream_t, and propagates reads and writes from the whole chain of links connected to it.

uv_link_source_t source;

uv_stream_t* to_be_consumed;
uv_link_source_init(&source, to_be_consumed);

As mentioned before, links can be chained together:

uv_link_t a;
uv_link_t b;

/* Initialize `a` and `b` */
uv_link_chain(/* from */ &a, /* to */ &b);

This uv_link_chain call means that the data emitted by a will be passed as an input to b, and the output of b will be written to a.
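To make the direction of the data flow concrete, here is a small self-contained model of two chained links. It is only a sketch of the idea: chain_link_t, chain_links, chain_emit and chain_write are hypothetical names that do not exist in uv_link_t, and the "data" is a plain C string recorded in a per-link log.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy types; they only model the direction of data flow. */
typedef struct chain_link_s chain_link_t;

struct chain_link_s {
  chain_link_t* parent;  /* link closer to the source (`a` for `b`) */
  chain_link_t* child;   /* link closer to the user (`b` for `a`) */
  char log[64];          /* everything this link has seen */
};

static void chain_link_init(chain_link_t* link) {
  link->parent = NULL;
  link->child = NULL;
  link->log[0] = '\0';
}

static void chain_links(chain_link_t* from, chain_link_t* to) {
  assert(from != NULL && to != NULL);
  from->child = to;
  to->parent = from;
}

/* Append "<tag><data>;" to the link's log, dropping it if it does not fit. */
static void chain_log(chain_link_t* link, const char* tag, const char* data) {
  size_t used = strlen(link->log);
  size_t need = strlen(tag) + strlen(data) + 1;  /* +1 for ';' */

  if (used + need + 1 > sizeof(link->log))
    return;
  strcat(link->log, tag);
  strcat(link->log, data);
  strcat(link->log, ";");
}

/* Data emitted by a link is passed as input to the link chained after it. */
static void chain_emit(chain_link_t* link, const char* data) {
  chain_log(link, "r:", data);
  if (link->child != NULL)
    chain_emit(link->child, data);
}

/* Output written to a link travels back towards the source. */
static void chain_write(chain_link_t* link, const char* data) {
  chain_log(link, "w:", data);
  if (link->parent != NULL)
    chain_write(link->parent, data);
}

/* Usage: returns 1 if both links saw the read first and the write second. */
static int chain_demo(void) {
  chain_link_t a;
  chain_link_t b;

  chain_link_init(&a);
  chain_link_init(&b);
  chain_links(/* from */ &a, /* to */ &b);

  chain_emit(&a, "ping");   /* a -> b */
  chain_write(&b, "pong");  /* b -> a */

  return strcmp(a.log, "r:ping;w:pong;") == 0 &&
         strcmp(b.log, "r:ping;w:pong;") == 0;
}
```

A real link would transform the data on the way through (decrypting, parsing and so on) instead of just logging it, but the routing is the same: reads move away from the source, writes move towards it.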

Speaking of input/output, the API is pretty similar to libuv's:

int uv_link_write(uv_link_t* link, const uv_buf_t bufs[],
                  unsigned int nbufs, uv_stream_t* send_handle,
                  uv_link_write_cb cb, void* arg);

int uv_link_read_start(uv_link_t* link);
int uv_link_read_stop(uv_link_t* link);

void fn(uv_link_t* link) {
  /* Callbacks are plain fields that are set directly on the link */
  link->alloc_cb = /* something */;
  link->read_cb = /* something */;
}
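For readers who have not used libuv's read API, the alloc/read callback pair works roughly as follows: the producer first asks the consumer for a buffer, fills it, and then reports how many bytes were written. The sketch below models that handshake with hypothetical toy types (toy_buf_t, reader_t, toy_deliver); the real uv_link_t callback signatures may differ, so please treat it as an illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy types that mirror the shape of libuv-style callbacks. */
typedef struct { char* base; size_t len; } toy_buf_t;
typedef struct reader_s reader_t;
typedef void (*toy_alloc_cb)(reader_t* r, size_t suggested, toy_buf_t* buf);
typedef void (*toy_read_cb)(reader_t* r, long nread, const toy_buf_t* buf);

struct reader_s {
  toy_alloc_cb alloc_cb;
  toy_read_cb read_cb;
  char storage[32];    /* memory handed out by on_alloc */
  char received[32];   /* data collected by on_read */
  size_t received_len;
};

/* Consumer side: provide memory for the next chunk. */
static void on_alloc(reader_t* r, size_t suggested, toy_buf_t* buf) {
  (void) suggested;
  buf->base = r->storage;
  buf->len = sizeof(r->storage);
}

/* Consumer side: copy `nread` bytes out of the buffer. */
static void on_read(reader_t* r, long nread, const toy_buf_t* buf) {
  if (nread <= 0)
    return;
  if (r->received_len + (size_t) nread > sizeof(r->received))
    return;
  memcpy(r->received + r->received_len, buf->base, (size_t) nread);
  r->received_len += (size_t) nread;
}

/* Producer side: ask for a buffer, fill it, report the byte count. */
static void toy_deliver(reader_t* r, const char* data, size_t len) {
  toy_buf_t buf;

  r->alloc_cb(r, len, &buf);
  if (len > buf.len)
    len = buf.len;
  memcpy(buf.base, data, len);
  r->read_cb(r, (long) len, &buf);
}

/* Usage: delivers two chunks and copies the collected bytes into `out`. */
static size_t reader_demo(char* out, size_t out_len) {
  reader_t r;

  r.alloc_cb = on_alloc;
  r.read_cb = on_read;
  r.received_len = 0;

  toy_deliver(&r, "abc", 3);
  toy_deliver(&r, "def", 3);

  assert(r.received_len < out_len);
  memcpy(out, r.received, r.received_len);
  out[r.received_len] = '\0';
  return r.received_len;
}
```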

Please check the API docs for further information on particular methods and structures (like uv_link_source_t and uv_link_observer_t).

There is also an Implementation guide for implementing custom types of uv_link_t.

Error reporting

With multiple independent implementations of the uv_link_t interface, a natural question arises: how does uv_link_t handle conflicting error codes?

The answer is that all error codes returned by uv_link_... methods are actually prefixed with the index of the particular link in a chain. Thus, even if there are several similar links in a chain, it is possible to get the pointer to the uv_link_t instance that has emitted the error:

int uv_link_errno(uv_link_t** link, int err);
const char* uv_link_strerror(uv_link_t* link, int err);
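One possible way to "prefix" an error code with a link index is to reserve a fixed numeric range per link. The sketch below shows that idea only; ERR_RANGE, err_encode and err_decode are hypothetical, and the actual encoding used by uv_link_t may well be different.

```c
#include <assert.h>

/* Assumption: local error codes are negative and greater than -ERR_RANGE.
 * ERR_RANGE and both helpers are hypothetical, not uv_link_t's scheme. */
#define ERR_RANGE 0x10000

/* Tag a local (negative) error with the zero-based index of the link. */
static int err_encode(int index, int local_err) {
  assert(index >= 0);
  assert(local_err < 0 && local_err > -ERR_RANGE);
  return local_err - index * ERR_RANGE;
}

/* Recover the link index and the local error from an encoded value. */
static int err_decode(int encoded, int* index) {
  int positive = -encoded;

  *index = positive / ERR_RANGE;
  return -(positive % ERR_RANGE);
}
```

With such a scheme, two links that both fail with the same local code still produce distinct values, and the decoded index tells the caller which link in the chain to ask for a human-readable message.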

Foreword: gypkg

gypkg is the recommended way to embed uv_link_t in a C project. There are not too many source files to put into a Makefile or some other build file, but the convenience that gypkg provides pays off very quickly!

Installation (node.js v6 is required):

npm install -g gypkg

Init

mkdir project
cd project
gypkg init
vim project.gyp
{
  "variables": {
    "gypkg_deps": [
      "git://github.com/libuv/libuv.git@^1.9.0 => uv.gyp:libuv",
      "git://github.com/indutny/uv_link_t@^1.0.0 [gpg] => uv_link_t.gyp:uv_link_t",
    ],
  },

  # Some other GYP things
}

Building

gypkg build
ls -la out/Release