Running node.js + DTrace on FreeBSD
Preface #
Tracing node.js activity and detecting performance problems and bottlenecks has always been an important topic for many people in the community. Though, various ways to do this were available, including: systemtap, ETW and perfctr on Windows. The most complete tracing support was done by Joyent guys for the DTrace tool which works best on their Illumos fork, called SmartOS.
Fortunately, since 9.0 version, FreeBSD maintainers have started fixing and tweaking their DTrace implementation too (which is actually a backport from Solaris). Considering that FreeBSD is much easier to install and is much more usable as a primary OS for developers, being able to do flamegraphs for node.js on it is something that I highly desired at the time.
What was broken #
Sadly, it wasn't working out-of-the-box. After the installation of FreeBSD in a
VirtualBox has finished, I immediately tried to build and dtrace node with a
jstack()
utility (see my previous blog post on that topic), but it did
not work out. After some struggling it became obvious that /dev/dtrace/helper
has pretty narrow permissions and the node, running under non-root user, wasn't
able to register itself within the system's DTrace module.
Calling sudo chmod 666 /dev/dtrace/helper
improved the situation, but not that
much: version 0.11 of node.js was crashing, and version v0.10 was still not
registering it's ustack helper and DTrace provider (see USDT docs). This
problem was a bit tougher than the helper permissions, and took awhile to fix.
How node interacts with DTrace #
First of all, a bit of information about how node.js compiles it's DTrace helpers and how they are used by the system. There were some slides on that topic (how DTrace USDT interacts with kernel), but at that moment I figured it out by reading the implementation's source.
There are two .d
files in the node.js source tree: src/node_provider.d
and
src/v8ustack.d
. The former declares a node
DTrace USDT provider and the
latter exports an ustack helper Both of these files are compiled with
dtrace -G -s <file.d> -o <out.o>
, which in fact does the following thing:
- Compile all D-language chunks into DIFs (DTrace Intermediate Format
- Encode them in the DOF (DTrace Object File) format
- Put them into the special
._SUNW_dof
ELF section - Link it to the internally-stored
drti.o
, providingdtrace_dof_init
anddtrace_dof_fini
helper functions.
These functions are called on the executable's initialization and
deinitialization (surprisingly!), each making an ioctl()
syscall on the
/dev/dtrace/helper
. Specifically, dtrace_dof_init()
loads and verifies the
._SUNW_dof
section of an ELF-file and registers it with-in the kernel.
How I fixed it #
I decided to investigate the node.js v0.11 crashes. It was crashing on a
this line, so it was either an ELF symbol problem or a DOF string
problem. Initially, I found that the DOF did not contain the STRTAB section
that dtri.o
was searching for, but it turned out to be a slightly bigger
problem. Since node.js has two separate .d
files, it has two DOFs for
each of them in it's ._SUNW_dof
section, but the drti.o
was loading only
one! After all I came up with a following patch:
commit c46786f47483e7fc218727aa52cf6b2278a45053
Author: Fedor Indutny <fedor.indutny@gmail.com>
Date: Mon Feb 17 01:16:13 2014 +0400
dtrace: proper symbol fixup and import in drti
Application may contain multiple DOFs, merged into one ._SUNW_dof ELF
section. Process all of them and fix symbols only in those ones that
actaully define a provider. Use proper strtab for resolving symbols.
diff --git a/cddl/contrib/opensolaris/lib/libdtrace/common/drti.c b/cddl/contrib/opensolaris/lib/libdtrace/common/drti.c
index 3b4a38c..e47cfb4d 100644
--- a/cddl/contrib/opensolaris/lib/libdtrace/common/drti.c
+++ b/cddl/contrib/opensolaris/lib/libdtrace/common/drti.c
@@ -20,6 +20,7 @@
*/
/*
* Copyright 2008 Sun Microsystems, Inc. All rights reserved.
+ * Copyright 2013 Voxer Inc. All rights reserved.
* Use is subject to license terms.
*/
@@ -144,7 +145,8 @@ dtrace_dof_init(void)
Lmid_t lmid;
#else
u_long lmid = 0;
- dof_sec_t *sec;
+ dof_sec_t *sec, *secstart, *dofstrtab, *dofprobes;
+ dof_provider_t *dofprovider;
size_t i;
#endif
int fd;
@@ -152,12 +154,13 @@ dtrace_dof_init(void)
#if !defined(sun)
Elf *e;
Elf_Scn *scn = NULL;
- Elf_Data *symtabdata = NULL, *dynsymdata = NULL;
+ Elf_Data *symtabdata = NULL, *dynsymdata = NULL, *dofdata = NULL;
+ dof_hdr_t *dof_next = NULL;
GElf_Shdr shdr;
int efd, nprobes;
char *s;
+ char *dofstrtabraw;
size_t shstridx, symtabidx = 0, dynsymidx = 0;
- unsigned char *dofstrtab = NULL;
unsigned char *buf;
int fixedprobes = 0;
#endif
@@ -209,7 +212,9 @@ dtrace_dof_init(void)
} else if (shdr.sh_type == SHT_PROGBITS) {
s = elf_strptr(e, shstridx, shdr.sh_name);
if (s && strcmp(s, ".SUNW_dof") == 0) {
- dof = elf_getdata(scn, NULL)->d_buf;
+ dofdata = elf_getdata(scn, NULL);
+ dof = dofdata->d_buf;
+ break;
}
}
}
@@ -219,6 +224,9 @@ dtrace_dof_init(void)
close(efd);
return;
}
+
+ while ((char *) dof < (char *) dofdata->d_buf + dofdata->d_size) {
+ dof_next = (void *) ((char *) dof + dof->dofh_filesz);
#endif
if (dof->dofh_ident[DOF_ID_MAG0] != DOF_MAG_MAG0 ||
@@ -290,34 +298,49 @@ dtrace_dof_init(void)
* We are assuming the number of probes is less than the number of
* symbols (libc can have 4k symbols, for example).
*/
- sec = (dof_sec_t *)(dof + 1);
+ secstart = sec = (dof_sec_t *)(dof + 1);
buf = (char *)dof;
for (i = 0; i < dof->dofh_secnum; i++, sec++) {
- if (sec->dofs_type == DOF_SECT_STRTAB)
- dofstrtab = (unsigned char *)(buf + sec->dofs_offset);
- else if (sec->dofs_type == DOF_SECT_PROBES && dofstrtab)
+ if (sec->dofs_type != DOF_SECT_PROVIDER)
+ continue;
+
+ dofprovider = (void *) (buf + sec->dofs_offset);
+ dofstrtab = secstart + dofprovider->dofpv_strtab;
+ dofprobes = secstart + dofprovider->dofpv_probes;
+
+ if (dofstrtab->dofs_type != DOF_SECT_STRTAB) {
+ fprintf(stderr, "WARNING: expected STRTAB section, but got %d\n",
+ dofstrtab->dofs_type);
break;
-
- }
- nprobes = sec->dofs_size / sec->dofs_entsize;
- fixsymbol(e, symtabdata, symtabidx, nprobes, buf, sec, &fixedprobes,
- dofstrtab);
- if (fixedprobes != nprobes) {
- /*
- * If we haven't fixed all the probes using the
- * symtab section, look inside the dynsym
- * section.
- */
- fixsymbol(e, dynsymdata, dynsymidx, nprobes, buf, sec,
- &fixedprobes, dofstrtab);
- }
- if (fixedprobes != nprobes) {
- fprintf(stderr, "WARNING: number of probes "
- "fixed does not match the number of "
- "defined probes (%d != %d, "
- "respectively)\n", fixedprobes, nprobes);
- fprintf(stderr, "WARNING: some probes might "
- "not fire or your program might crash\n");
+ }
+ if (dofprobes->dofs_type != DOF_SECT_PROBES) {
+ fprintf(stderr, "WARNING: expected PROBES section, but got %d\n",
+ dofprobes->dofs_type);
+ break;
+ }
+
+ dprintf(1, "found provider %p\n", dofprovider);
+ dofstrtabraw = (char *)(buf + dofstrtab->dofs_offset);
+ nprobes = dofprobes->dofs_size / dofprobes->dofs_entsize;
+ fixsymbol(e, symtabdata, symtabidx, nprobes, buf, dofprobes, &fixedprobes,
+ dofstrtabraw);
+ if (fixedprobes != nprobes) {
+ /*
+ * If we haven't fixed all the probes using the
+ * symtab section, look inside the dynsym
+ * section.
+ */
+ fixsymbol(e, dynsymdata, dynsymidx, nprobes, buf, dofprobes,
+ &fixedprobes, dofstrtabraw);
+ }
+ if (fixedprobes != nprobes) {
+ fprintf(stderr, "WARNING: number of probes "
+ "fixed does not match the number of "
+ "defined probes (%d != %d, "
+ "respectively)\n", fixedprobes, nprobes);
+ fprintf(stderr, "WARNING: some probes might "
+ "not fire or your program might crash\n");
+ }
}
#endif
if ((gen = ioctl(fd, DTRACEHIOC_ADDDOF, &dh)) == -1)
@@ -330,7 +353,12 @@ dtrace_dof_init(void)
}
(void) close(fd);
+
#if !defined(sun)
+ /* End of while loop */
+ dof = dof_next;
+ }
+
elf_end(e);
(void) close(efd);
#endif
Although, node.js v0.11 has stopped crashing after applying it to the kernel
source code and rebuilding libdtrace
, it still wasn't registering an ustack
helper and a provider (sudo dtrace -l
did not contain any
node<pid>:::
probes).
While reading FreeBSD's source code further, I found an environment
variable DTRACE_DOF_INIT_DEBUG
that helped me to take a deeper look into
what was happening for both node.js v0.10 and v0.11. After setting it to
DTRACE_DOF_INIT_DEBUG=1
node.js has started printing following things to the
stderr:
dtrace DOF node: DTrace ioctl failed for DOF at 0x804c00000:
Argument list too long
This was totally uninformative, and I started grepping through a DTrace kernel
module with a hope to find some clues to this errors. Argument list too long
is a verbose description of the E2BIG
errno, and luckily the first place
where it is used was the place that I needed to fix. Basically, for the security
purpose kernel limits the size of the DOF that could be loaded in it's memory.
This limit is set to the 128 KB by default, and the node.js now has
significantly bigger ustack helper (7 MB for v0.11). Instead of just raising it
to a higher value, I decided to export sysctl
variable to make it configurable
without rebuilding the kernel. Running node again after this tweaks gave me:
dtrace DOF node: DTrace ioctl failed for DOF at 0x804c00000:
Invalid argument
This failure was even more vague, since it meant that the EINVAL
was returned
somewhere, and there was tons of places where it could have happened. After
inserting tons of debug prints in all possible places in kernel, I have isolated
it down to this place. Indeed, both of node DOFs contained a lot of
actions and the default limit (16 * 1024) was way to small for it. Exporting
another sysctl variable has solved all problems and running node.js has finally
printed this:
dtrace DOF node: DTrace ioctl succeeded for DOF at 0x8052c2c2c
dtrace DOF node: DTrace ioctl succeeded for DOF at 0x804c00000
dtrace DOF node: found provider 0x8052c3000
Just to confirm it, I checked the dtrace -l
output and (yikes!) it was there
too:
48986 node909 node ... gc-done
48987 node909 node ... gc-start
48988 node909 node ... http-client-request
48989 node909 node ... http-client-response
48990 node909 node ... http-server-request
48991 node909 node ... http-server-response
48992 node909 node ... net-server-connection
48993 node909 node ... net-socket-read
48994 node909 node ... net-socket-write
48995 node909 node ... net-stream-end
How to apply all this patches #
I have came up with this instruction for fixing your FreeBSD installation to make node.js DTrace helpers work. Just a brief in-line description:
- Apply these patches to
/usr/src
- Rebuild and install kernel:
sudo make buildkernel && sudo make installkernel
- Reboot
sudo shutdown -r
- Raise sysctl limits:
sudo sysctl -w kern.dtrace.helper_actions_max=16000
sudo sysctl -w kern.dtrace.dof_maxsize=8000000
- Clone node.js:
git clone git://github.com/joyent/node && cd node && git checkout v0.10
- Configure it:
./configure --prefix=... --with-dtrace
- Build and install it:
gmake -j24 && gmake install
- Make DTrace device accessible to non-root users:
sudo chmod 666 /dev/dtrace/helper
- Verify that node.js DTrace probes are inserted:
DTRACE_DOF_INIT_DEBUG=1 /path/to/node
.
Thanks for reading this, and please let me know if any of these patches don't work for you!
- Previous: SMIs and Doubles
- Next: Bud - a TLS "swiss knife"