01-Aug-2008 21:18

dtrace progress 20080801


The blog may have been quiet for a couple of weeks, but only because I was on holiday.

On holiday, with a 32-bit Linux laptop - so I managed to clean up and fix a number of 32-bit issues with dtrace. It now works quite well on Ubuntu 8.

I have just put out a new release - and fixed the accidentally broken 64-bit dtrace userland binary.

I have been working on getting the D functions stack() and ustack() to work. For 32-bit kernels, ustack() seems to work - but I have a lot of work to do to get the Psymtab.c code to look up the symbol entries and print them symbolically. I hit issues with libelf.so not seeming to work properly for me, so I need to investigate that.

The kernel stack is probably wrong, and is a pain because of -fomit-frame-pointer use in the Ubuntu kernel - the goal is to match what a kernel stack trace can do, along with symbols. (dtrace reads /proc/kallsyms, so we should be able to see something.)
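As a rough illustration of what printing a stack symbolically from /proc/kallsyms involves, here is a minimal sketch in Python (not the C code used in the port). It assumes the usual "address type name" line layout; the addresses in SAMPLE and the function names parse_kallsyms() and resolve() are made up for the example.

```python
import bisect

def parse_kallsyms(text):
    """Parse kallsyms-style lines: '<hex address> <type> <name> [module]'."""
    symbols = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        symbols.append((int(fields[0], 16), fields[2]))
    symbols.sort()
    return symbols

def resolve(symbols, addr):
    """Return 'name+0xoffset' for the nearest symbol at or below addr."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return hex(addr)  # below the first known symbol: leave it numeric
    base, name = symbols[i]
    return "%s+0x%x" % (name, addr - base)

# Hypothetical addresses, in the same layout as /proc/kallsyms.
SAMPLE = """\
c0100000 T _text
c0101000 T do_fork
c0102400 t helper_fn
"""

symbols = parse_kallsyms(SAMPLE)
for pc in (0xc0101010, 0xc0102400):
    print(resolve(symbols, pc))
```

On a live system the text would come from reading /proc/kallsyms itself; the lookup is the same nearest-symbol-below search.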

The Sun code walks the stack looking at signals and interrupts - and I ripped out that code to get it to work at all. I will need to spend more effort here.

I need to get the 64-bit kernel and user stack dumps working, as that will give a huge amount of functionality to probe the system. If I can get the symtabs to work - that will be a great milestone.

Unfortunately, Unix has gotten itself in a mess with the ELF file format: the simple libelf library has difficulty handling 32 or 64 bit files given that the source process may be 32 or 64 bit itself, and so libraries such as <gelf.h> seem to exist to try and hide the word size issue.
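The word size issue shows up in the first few bytes of every ELF file: the class byte in e_ident says whether the rest of the file uses 32-bit or 64-bit structures. A minimal sketch in Python (not libelf or gelf, just the header check; the function names are invented for the example):

```python
ELF_MAGIC = b"\x7fELF"
EI_CLASS = 4                    # index of the class byte within e_ident
ELFCLASS32, ELFCLASS64 = 1, 2   # values defined by the ELF specification

def elf_class(header):
    """Return 32 or 64 given the leading bytes of an ELF file."""
    if len(header) <= EI_CLASS or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    cls = header[EI_CLASS]
    if cls == ELFCLASS32:
        return 32
    if cls == ELFCLASS64:
        return 64
    raise ValueError("unknown ELF class %d" % cls)

def elf_class_of_file(path):
    """Read just enough of the file to classify it."""
    with open(path, "rb") as f:
        return elf_class(f.read(16))

# A hand-built 64-bit e_ident, used here instead of a real file.
print(elf_class(b"\x7fELF\x02\x01\x01" + bytes(9)))
```

A tool that has to handle both kinds of target, whatever its own word size, needs to branch on this value before interpreting any later structure.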

Given that Solaris is a pure ELF system, and that much of the symtab lookup on a Linux system is embedded in gdb (along with stack tracing), we end up with a bit of an emulation on top of an emulation in dtrace - but so far, no big issues (other than concern over the multitudes of libelf.so variants in the wild).

dtrace is still not ready for a prime time production system - it may work for you, it may not, but hopefully, over time, the port will become more robust, and work in most areas.

I hit some issues with accessing /proc/PID/mem where we can't access certain areas of memory - and I think this may be a bug in Linux kernels, but I need to work out if it's me at fault or Linux.


Posted by Paul Fox | Permalink

13-Jul-2008 16:59

dtrace progress 20080713


dtrace quality is improving, but more things need to be done. I realised much of libproc.a wasn't being linked in due to the order of link libraries. This was fixed, but it left me with a torrent of new compilation fixes to make. libproc.a now links in, but it uses much of the Solaris /procfs code, so this needs work.

I first noticed this in trying to get the:

ustack();

function to do something sane.

Fixed an issue with syscall tracing being too raw - the return of a syscall ideally needs to be -1 (arg0) and arg1 can be used to access the raw return value. This avoids differences among the platforms.
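One reading of that arg0/arg1 split, sketched in Python rather than the driver's C: Linux system calls report failure inside the kernel as a negative errno in the range -4095 to -1, so a return probe can present -1 in arg0 and keep the raw value in arg1. The function name and the extra errno field are assumptions made for this example, not the port's actual interface.

```python
MAX_ERRNO = 4095  # Linux treats raw returns in [-4095, -1] as -errno

def normalise_return(raw):
    """Map a raw (signed) kernel return value to (arg0, arg1, errno)."""
    if -MAX_ERRNO <= raw < 0:
        # Failure: arg0 is the portable -1, arg1 keeps the raw value.
        return (-1, raw, -raw)
    # Success: both arguments carry the return value unchanged.
    return (raw, raw, 0)

print(normalise_return(-2))   # a failed call, raw value -ENOENT
print(normalise_return(3))    # a successful call returning 3
```

A script that tests arg0 == -1 then behaves the same way on every platform, while arg1 is still there for anyone who wants the untouched value.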

Trying to work through my own bug list and tests, so I can start looking at fasttrap.

A goal of fasttrap will be to put some dtrace probes into CRiSP (why? because I can!), which will let me better understand dtrace and userland probings. There may be a lot of things that could be done here, but I need to understand how it all fits together first.

There are still reliability issues with the port, but syscall tracing works - quite well, though I haven't validated every syscall. (We still know that tracing read may cause issues.)


Posted by Paul Fox | Permalink

05-Jul-2008 09:27

Dtrace for Linux - Sun has spilled the beans !


Bryan Cantrill at Sun has talked about the dtrace project. References here and here.

I guess I need to watch my inbox now :-)

A lot of debate surrounds GPL vs CDDL and why dtrace cannot be done. I don't believe that's the case - and I am not asserting anything good or bad about either license, just trying to live within the realms of what each group believes in.

Back to the plot...

Dtrace on 64 + 32 bit platforms works well for the basics. I fixed the cyclic timer stuff (sort of) - so that now we don't put undue stress on the system being monitored (e.g. /var/log/messages isn't filling up with debug statements quite so much).

I have slightly reorganised the tree and removed the need to manually locate libgcc.a (needed for the 64-bit mod/div stuff). A perl script will now locate it for you.

The profile (profile + ticks) probe code is now linked in but I need to work out what the omni timer stuff is doing.

Mojmir Svoboda gave a couple of better Linux calls to use - but kernel 2.6.24 doesn't compile whereas 2.6.24.4 and above do. (Strange, because it compiles on 2.6.23.1.) I need to clean up the calling sequence to kmem_cache_create() to resolve this.

Lex/yacc hit me again. I wrote a script (make test) which runs all the demo D scripts and validates for syntax errors, and that's kicking up lexer issues in the way predicates are parsed. My fight with flex should be over and if it happens again, out goes dt_lex.l and a plain portable C version will be written to get rid of the confusion of input(), unput() and YY_INPUT.

I found the PID provider (didn't notice it was even missing, since I am not a dtrace expert!). It's hiding in fasttrap (ah! now I know what it does). This wants hooks into the kernel. Watch this space for a solution which doesn't break the GPL or require a kernel recompile.

I looked at ON Solaris 20060823 and found a later version of dtrace. Not much changed (I think) in that release but I have started upgrading the sources and comments to match. Sun will have made fixes for good reason, so better to grab them ASAP and get the diff/merge pain over with.

I am still trying to make daily updates or updates when I feel the release 'works' vs having uncovered issues in porting/compiling, so keep an eye on the download area and grab sources if you want to play.

It should be a matter of 'make all; make load'.

Here's some music for you to listen to ...

/home/fox/src/dtrace@vmubuntu: dtrace -f open
dtrace: description 'open' matched 2 probes
CPU     ID                    FUNCTION:NAME
  0     21                       open:entry
  0     22                      open:return
  0     21                       open:entry

Posted by Paul Fox | Permalink

29-Jun-2008 22:09

dtrace progress 20080629


It works...again!

Spent the last couple of weeks trying to get 32-bit dtrace to work and after a lot of mishaps and silliness, it works on an Ubuntu 8 kernel (untouched by human hand).

A new release is available and I can go back to validating the 64-bit port, but also testing/checking out functionality.

Still lots of things to do to dtrace - I fixed the eat-CPU-as-fast-as-possible issue, but need to get the cyclic timers emulation working, else dtrace will give up after 30s or so as it thinks the system has lost responsiveness.


Posted by Paul Fox | Permalink

22-Jun-2008 21:57

dtrace progress 20080622


I haven't updated the blog in a while, but thought it worth putting out a brief update.

dtrace for 64b Linux works (sort of). There are still some rough edges, and I am trying to get the clock (cyclic.c) to work, otherwise dtrace will give up after 30s thinking we have hung the system.

I am taking a diversion to get the 32b version working - this has been a trying experience (deficiencies in Linux + VMware). E.g. at present 64-bit divide/modulo in the kernel is a pain, as the way the kernel is compiled precludes use of these functions in the kernel.
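For context, on a 32-bit CPU a 64-bit divide or modulo is not a single instruction, so the compiler emits a call to a helper routine (in libgcc, for example) that does the work in software. A minimal sketch of the classic shift-and-subtract method such a helper can use, written in Python for readability; udivmod64() is an illustrative name, not a function from the port:

```python
MASK64 = (1 << 64) - 1

def udivmod64(n, d):
    """Unsigned 64-bit divide and modulo by shift-and-subtract."""
    n &= MASK64
    d &= MASK64
    if d == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    remainder = 0
    for bit in range(63, -1, -1):
        # Bring down the next bit of the dividend.
        remainder = (remainder << 1) | ((n >> bit) & 1)
        if remainder >= d:
            remainder -= d
            quotient |= 1 << bit
    return quotient, remainder

print(udivmod64(100, 7))
```

The loop only needs shifts, compares and subtracts, which is why it can stand in wherever the native 64-bit divide is unavailable.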

I can do dtrace -l on a 32b kernel, and am off to fix the next set of 'things'.

The 32b port is delaying making further progress, but at least various portability issues have now been resolved.


Posted by Paul Fox | Permalink

07-Jun-2008 18:02

dtrace for linux progress


Progress has again been slightly slow - I realised that I was missing system call tracing, which isn't in the /dev/sdt driver, but scattered around the FreeBSD and Solaris kernels, fairly obviously.

Linux kernel 2.6 tries to protect the system call table from patching by marking the syscall table as read only and protected. I need to write the code to unprotect this (I have some sample code which modifies the CPU's CR3 register, but I would like to use the kernel routines to unprotect the page - when I find them).

I have added a new driver to the kit -- /dev/syscall, although I don't think it's for userland consumption. This will drop the dtrace probes for all listed system calls, although I note the new kernels allow custom syscalls to be installed, so I will have to trawl these data structures.

Am just trying to debug some instability - even with the syscall driver disabled. Hopefully soon, we will have something decent to play with.

Stay tuned...


Posted by Paul Fox | Permalink

31-May-2008 00:04

dtrace progress 20080530 .. so close !


Progress on dtrace is coming in waves now.

I had issues getting to the DTRACEIOC_ENABLE ioctl - after a bit of head scratching, this was resolved, and now we get to DTRACEIOC_GO.

Even better, we get past DTRACEIOC_GO.

What does this mean?

It means the dtrace binary talks to the kernel, can pass any D script in, and then wait for the buffered information to be made available.

I found out, again, that my dtrace.c kernel code was outdated, so spent some time merging the latest OpenSolaris code in - still some stuff to merge, but lots of useful things in there, such as 128-bit arithmetic for tracking big counters, and some validation checks on the way memory is accessed. (Validating memory doesn't help me, because I default to enabling probing to anywhere - need to fix this at some point so that hackers and accidents can't bring the system down.)

Now we are past the DTRACEIOC_GO, I found some issues with userland dtrace - stubs I hadn't coded. That's partially done (gethrtime() and a partial pthread cond wait function).

Here's an example 'session':

/home/fox/src/dtrace/drivers/dtrace@vmubuntu: dtrace -v -f journal_invalidatepage
dtrace: description 'journal_invalidatepage' matched 6 probes

Stability attributes for description journal_invalidatepage:

        Minimum Probe Description Attributes
                Identifier Names: Unstable
                Data Semantics:   Unstable
                Dependency Class: Common

        Minimum Statement Attributes
                Identifier Names: Stable
                Data Semantics:   Stable
                Dependency Class: Common

Not much to look at - I get no output, even after Ctrl-C. (Plus that arbitrary probe isn't interesting really.)

I believe we are now firmly in Phase III of the task: Phase I was to build the dtrace cmd + driver. Phase II was to get to a point where we don't crash the kernel and the userland command is functional.

Phase III is the point where the whole thing can actually start reporting something/anything.

Phase IV will be examination of the SDT and PID providers so we can do really useful stuff.

Stay tuned.

PS I make releases each night if I feel there is progress or an important bug fix.

http://www.crisp.demon.co.uk/tools.html


Posted by Paul Fox | Permalink

22-May-2008 22:18

Dtrace for Linux .. Progress


Slowly getting there. Last week was near zero progress as I attempted to do some CRiSP catch up work, and track down a nearly impossible-to-find bug. (CRiSP is valgrind pure - no detectable memory corruptions of significance, yet people are reporting strange bugs. Even armed with logs, they don't help - Whoof! Out of nowhere - a GPF/Core dump; oh well).

Anyway, back at the dtrace camp - some progress. The startup code is wrong in the kernel - my driver was missing some of the subtlety of the dtrace_attach()/dtrace_open() code, so by the time cmd/dtrace tries to do an ioctl(DTRACEIOC_ENABLE), we hit some null pointers.

I've now protected myself against this. (Such a kernel panic requires a reboot.) I know what I need to debug; it just needs some more Linux kernel searching to validate that I am calling the correct API (dev_set_drvdata).

Had a near panic last night when my VMware/Ubuntu refused to boot. I think Ubuntu screwed up the /boot/grub/menu.lst - so I was booting a virgin kernel without an initrd ram disk. Fortunately, one of the many menu items was available, so I have been able to make more progress.


Posted by Paul Fox | Permalink

19-May-2008 22:37

dtrace progress 20080519


Progress slow last week - had to do some CRiSP work....

Am spending (too much) time investigating why the Ubuntu bison/flex combination doesn't work compared to the Fedora 8 bison/flex.

Unfortunately, these tools shoot themselves in the foot - they try to be compatible with old yacc/lex, but are just sufficiently different that a trivial issue becomes very difficult to debug.

I have always disliked lex since it provides so little utility, and debugging it is a pain - especially when the lex definition 'just works'.

I can see Apple, in the Darwin code, have hit the same issue, but somehow my issue is very subtle.

Oh well.

Once the portability issue is resolved, I can go back to the driver and just move things along, before I forget how this all works.


Posted by Paul Fox | Permalink

11-May-2008 11:14

dtrace progress


Look at the output below. Real dtrace, real symbols!

Finally able to access the modules and kallsyms to find /dev/fbt entry points to patch in the kernel. This is definitely a milestone - as now, in theory, dtrace can start patching the kernel to insert probes. I have yet to try this - next on my list, to see what actually happens.

Note that we only seem to have a subset of available kernel probes because -- I don't know! Maybe these are the only modules I am loading, and we appear to be missing the kernel syms (maybe I need to modify the fbt driver to not just enumerate every module, but to enumerate every kernel/kallsyms entry).

But this gives a huge blast to move forward and start debugging D scripts.

I have truncated the output below - it's showing 515 probe points in the kernel (a stripped down linux 2.6.24-4 kernel).

   ID   PROVIDER            MODULE                          FUNCTION NAME
    1     dtrace                                                     BEGIN
    2     dtrace                                                     END
    3     dtrace                                                     ERROR
    4        fbt         dtracedrv                         ctf_close entry
    5        fbt         dtracedrv                         ctf_close return
    6        fbt         dtracedrv                     ctf_func_args entry
    7        fbt         dtracedrv                     ctf_func_args return
    8        fbt        freq_table    cpufreq_frequency_table_target entry
    9        fbt        freq_table    cpufreq_frequency_table_target return
   10        fbt              dock      register_hotplug_dock_device entry
   11        fbt              dock      register_hotplug_dock_device return
   12        fbt              dock    unregister_hotplug_dock_device entry
   13        fbt              dock    unregister_hotplug_dock_device return
   14        fbt        parport_pc        parport_pc_unregister_port entry
   15        fbt        parport_pc             parport_pc_probe_port entry
   16        fbt        parport_pc             parport_pc_probe_port return
   17        fbt           parport    parport_ieee1284_ecp_read_data entry
   18        fbt           parport    parport_ieee1284_ecp_read_data return
   19        fbt           parport    parport_ieee1284_epp_read_data entry
   20        fbt           parport    parport_ieee1284_epp_read_data return
   21        fbt           parport      parport_ieee1284_read_nibble entry
   22        fbt           parport      parport_ieee1284_read_nibble return
   23        fbt           parport        parport_ieee1284_read_byte entry
   24        fbt           parport        parport_ieee1284_read_byte return
   25        fbt           parport                parport_wait_event entry
   26        fbt           parport                parport_wait_event return
   27        fbt           parport           parport_register_driver entry
   28        fbt           parport           parport_register_driver return
   29        fbt           parport    parport_ieee1284_epp_read_addr entry
   30        fbt           parport    parport_ieee1284_epp_read_addr return
   31        fbt           parport                   parport_release entry
   32        fbt           parport             parport_announce_port entry
   33        fbt           parport         parport_unregister_device entry
   34        fbt           parport     parport_ieee1284_write_compat entry
   35        fbt           parport     parport_ieee1284_write_compat return
   36        fbt           parport   parport_ieee1284_epp_write_data entry
   37        fbt           parport         parport_unregister_driver entry
   38        fbt           parport                  parport_put_port entry
   39        fbt           parport                  parport_put_port return
   40        fbt           parport   parport_ieee1284_epp_write_addr entry
   41        fbt           parport               parport_remove_port entry
   42        fbt           parport               parport_remove_port return
   43        fbt           parport             parport_register_port entry
   44        fbt           parport             parport_register_port return
   45        fbt           parport   parport_ieee1284_ecp_write_addr entry
   46        fbt           parport   parport_ieee1284_ecp_write_addr return
   47        fbt           parport   parport_ieee1284_ecp_write_data entry
   48        fbt           parport   parport_ieee1284_ecp_write_data return
   49        fbt          i2c_core                    i2c_new_device entry
   50        fbt          i2c_core                    i2c_new_device return
   51        fbt          i2c_core                         i2c_probe entry
   52        fbt          i2c_core                         i2c_probe return
   53        fbt          i2c_core          i2c_add_numbered_adapter entry
   54        fbt          i2c_core          i2c_add_numbered_adapter return
...
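A listing like this is easy to post-process. The sketch below (Python, with a few lines copied from the output above as sample data) parses the columns and reports functions that have an entry probe but no matching return probe, which the listing shows does happen. The function names parse_listing() and entry_only() are made up for the example.

```python
from collections import Counter

def parse_listing(text):
    """Parse 'dtrace -l' style rows into (id, provider, module, function, name)."""
    probes = []
    for line in text.splitlines():
        f = line.split()
        if not f or not f[0].isdigit():
            continue  # header row, '...' or blank line
        if len(f) == 5:
            probes.append((int(f[0]), f[1], f[2], f[3], f[4]))
        elif len(f) == 3:
            # Rows such as dtrace:::BEGIN have no module or function column.
            probes.append((int(f[0]), f[1], "", "", f[2]))
    return probes

def entry_only(probes):
    """Return (module, function) pairs that have 'entry' but no 'return'."""
    entries = {(p[2], p[3]) for p in probes if p[4] == "entry"}
    returns = {(p[2], p[3]) for p in probes if p[4] == "return"}
    return sorted(entries - returns)

SAMPLE = """\
   ID   PROVIDER            MODULE                          FUNCTION NAME
    1     dtrace                                                     BEGIN
    4        fbt         dtracedrv                         ctf_close entry
    5        fbt         dtracedrv                         ctf_close return
   14        fbt        parport_pc        parport_pc_unregister_port entry
   15        fbt        parport_pc             parport_pc_probe_port entry
   16        fbt        parport_pc             parport_pc_probe_port return
"""

probes = parse_listing(SAMPLE)
print(Counter(p[1] for p in probes))
print(entry_only(probes))
```

Run against the full output, the same two functions give a per-provider probe count and a list of functions worth a closer look.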

Posted by Paul Fox | Permalink

10-May-2008 17:31

dtrace progress - at last !


$ dtrace -l
   ID   PROVIDER            MODULE                          FUNCTION NAME
    1     dtrace                                                     BEGIN
    2     dtrace                                                     END
    3     dtrace                                                     ERROR

Hooray! We went thru a ton of code in the kernel to dig that out! Now -- to find out what happened to my /dev/fbt entries...


Posted by Paul Fox | Permalink

05-May-2008 18:47

dtrace for linux progress 20080505


Another bank holiday over...dtrace one step closer...

I have decided to consolidate the four drivers into a single dtracedrv.ko driver, to avoid lots of fluff with inter-driver symbol resolution. Having separate drivers causes issues at link-time and leads to a hairy order of dependency as the modules are loaded.

Now we have a single dtracedrv.ko.

And why is it dtracedrv.ko and not dtrace.ko ?!

Because I haven't finished getting the makefiles to work. There is a file called dtrace.c which has most of the kernel guts in it - but not all of it, and the Linux kbuild software gets confused if I want my driver to be called by the same name as a dependent source file.

I have also added changed-line support to CRiSP whilst I am at it. Someone had asked for the ability to add a comment in column 73 of a line that is modified (presumably COBOL or Fortran code), and this nearly works. Just need to add the setup menus.

Now...off to get 'dtrace -l' to give me something to probe !


Posted by Paul Fox | Permalink

04-May-2008 11:13

dtrace progress 20080504


This week has seen good progress on dtrace for Linux. Much of the week has seen the /dev/fbt driver move to a point of full functionality.

/dev/fbt is used to allow monitoring around all functions in the kernel, by establishing probes on the entry and exit from a function. This, in theory, gives rise to thousands of probes. (/dev/sdt is needed for high level actions like process fork/death, etc - and will come later).
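To see why this gives rise to thousands of probes, here is a minimal sketch in Python of the enumeration idea: take every text (code) symbol from a kallsyms-style table and emit an entry and a return probe for it. The addresses in SAMPLE are invented, "kernel" as the module name for core symbols is a placeholder chosen for this example, and a real provider may not manage both probes for every function.

```python
def fbt_probes(kallsyms_text):
    """Build (provider, module, function, name) tuples from kallsyms-style text."""
    probes = []
    for line in kallsyms_text.splitlines():
        f = line.split()
        if len(f) < 3 or f[1] not in ("t", "T"):
            continue  # only text symbols are functions we could probe
        # Module symbols carry a trailing '[module]' column.
        module = f[3].strip("[]") if len(f) > 3 else "kernel"
        for name in ("entry", "return"):
            probes.append(("fbt", module, f[2], name))
    return probes

# Hypothetical sample in the /proc/kallsyms layout.
SAMPLE = """\
c0101000 T do_fork
c0300000 D jiffies
f8a01000 t parport_release\t[parport]
"""

for probe in fbt_probes(SAMPLE):
    print(":".join(probe))
```

Two probes per function across every loaded module and the core kernel adds up quickly, which matches the scale described above.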

The GPL/CDDL issue in fbt is resolved by using a helper from user land to allow investigation of the running kernel.

The main thing to get to now is to see if the full dtrace userland path can be executed - eg, for 'dtrace -l' to work to see some probes, and then try some simple D scripts to see if the right things happen.

The pid provider will be needed to get access to current process properties, and hopefully won't be a big deal.

To date, the only kernel expectation is that we have reasonable compile defaults (eg modules, kallsyms, etc) and we haven't had to touch or break the kernel.

dtrace won't compile without access to the running kernel sources, but it's beginning to look good.


Posted by Paul Fox | Permalink

30-Apr-2008 22:45

dtrace progress ... cddl vs gpl


Progress on the dtrace implementation is continuing and my understanding of how dtrace works is improving immensely.

I have been enabling bits of the dtrace driver's ioctl() code to try and get the cmd/dtrace binary to talk to the kernel. The next step is to find something to trace, so I have been following and understanding the 'providers' which are like 'device drivers' for dtrace. Eg fbt.c provides access to the symbol table in the kernel. This is the core mechanism for being able to monitor any part of the kernel.

Linux contains a symbol table (eg /proc/kallsyms).

But it's locked out for access by non-GPL drivers. I am studying what it does to work out a way to tunnel into it (alternatively, the symtab can be fed from user space into the driver -- more work, but may be a better alternative).

Enter the license wars. fbt.c can gain access to the public functions of kallsyms.c but only if it is declared to be a GPL driver. I am using Sun's implementation which is CDDL, so I need to find a way in. I have a way by declaring myself as GPL but that amounts to putting Sun's code under GPL, which I do not have the right to do.

For the moment, I am being dirty and taking that route so I can experiment with the core technology and avoid too much licensing-legalese. I don't mind tackling that on its own basis but there are lots of bits to get together to make it all work.

There is a way to technically break the hurdle - which would involve yet another driver (up to three so far -- dtrace, fasttrap and fbt. More provider drivers are forthcoming, so adding another one will not be a big issue).

It's interesting to understand the legalese here - even if I am not a politician or a bigot (I think). I just want to solve the problem.

Once we have a symtab, I can work more on the dtrace driver to ensure 'dtrace -l' shows something useful, and then to try a real D script.

Still a long way to go. I am trying hard to avoid any changes to the kernel source - this will simplify deployment and installation.


Posted by Paul Fox | Permalink

27-Apr-2008 10:38

dtrace progress 27 Apr 2008


Better progress now my Ubuntu compiled kernel boots properly. Been doing various tidyups to ensure everything compiles on a clean system. Changes to the include files were breaking user land dtrace and vice versa. Now stable - bit messy and will try and clean up as I understand how I want to drive this.

One way is to split the main header file -- linux_types.h -- into a separate file for kernel side and user side stuff.

dtrace -l now simply segfaults - bad calling convention, so need to fix that and have started the switch over to make cmd/dtrace use the /dev/dtrace driver entrypoint.

Hope to make more progress today and regularly update the website with the latest snapshot of the driver.


Posted by Paul Fox | Permalink

26-Apr-2008 12:11

Dtrace progress 20080426


Over the last few weeks, dtrace progress has been coming along. I now have a driver I can load in the kernel and am spending time diligently trying to step through it (initially debugging with printk() statements).

The last week was mostly wasted trying to get a VMware guest with Ubuntu 7 running so when I crash the kernel, I won't lose any work. This was incredibly painful due to the way the udev filesystem works, and the way the initram disk is set up. I would keep booting a 2.6.24.4 Linux kernel, and have the system not be useful because it couldn't find the hard drive.

Using Google, I finally found the update-initramfs script (rather than mkinitramfs) and the kernel is bootable and good for recompiling the driver and testing.

The initial test is simply to load the driver, and do:


cat /dev/dtrace

And check /var/log/messages for the debug printks.

Now this is working, I can get back to user-to-kernel debugging and try and get cmd/dtrace to talk to drivers/dtrace.

You can watch out for daily source code snapshots on http://www.crisp.demon.co.uk/tools.html

If you grab the code, you are on your own for now - until I feel comfortable it's ready for prime time.

If anyone wants to contribute - please feel free. There are going to be some fiddly bits later on to resolve (such as kernel hooking and requiring a custom dtrace kernel).

More later.


Posted by Paul Fox | Permalink

20-Apr-2008 21:35

Dtrace now loads into the kernel


Yup - now we have something to play with. The plumbing now needs to be set up for open/read/write so we can start the hard graft and the likely kernel smashing which will ensue.

The current release builds on Ubuntu 7 releases (2.6.22) and Linux kernel 2.6.24.

More later


Posted by Paul Fox | Permalink

19-Apr-2008 23:10

Dtrace Progress


Today marks the day when the kernel driver dtracedrv.ko finally builds. There's still a lot of work to get the Linux port of dtrace working, but at least (in theory) it can be loaded into a kernel.

Anyone following this progress should be careful to avoid crashing a machine they care about.


Posted by Paul Fox | Permalink

09-Apr-2008 23:09

dtrace progress


The dt_lex.l file now compiles properly and works, so the dtrace binary can now parse scripts ok.

Back to the kernel driver, trying to reduce compiler errors.


Posted by Paul Fox | Permalink

08-Apr-2008 22:18

dtrace progress 20080407


Some progress being made ... a long road ahead.

Trying to get a very simple command line program to parse and am hitting issues with flex vs Sun's lex code. (String parsing is going to the terminal and waiting for input).

This in turn is causing a parser error and dtrace aborts.

Still about 200 issues in the dtrace kernel driver to resolve before getting close to a dtrace.o kernel module. Hopefully most of these are compile issues with the same root cause.

Will try and blog each day on useful progress.


Posted by Paul Fox | Permalink