Tuesday, January 31, 2012

Generating Stack Traces from C++ Exceptions

C++ exceptions can be a double-edged sword. While they are a fantastic tool for gracefully recovering from errors, when the error cannot be recovered from or is completely unexpected, the nature of exception handling tends to obfuscate the cause of the error. Effective error analysis, particularly in a postmortem environment, requires information about both the direct cause of the error (known at the throw site, and captured in the exception type and error message) and the proximate cause, which can only be determined from information layered throughout the call stack. C++'s standard exception model lacks the facilities to capture this type of information.

Tools such as boost::exception, which allow additional information to be attached to exceptions as they propagate upward, can help, but only if the programmers can be bothered to enclose every function in a try-catch block which embeds context information and rethrows the exception. That's far from an ideal solution.
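To make the cost of that approach concrete, here is a minimal sketch of layering context by hand. It uses the standard library's std::throw_with_nested as a stand-in for boost::exception's attachment mechanism, and the function names (parse_port, load_config, start_server, describe, run) are hypothetical, invented for illustration only.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

// Innermost function: the throw site only knows the direct cause.
void parse_port(const std::string& text)
{
    if (text.empty())
        throw std::invalid_argument("port is empty");
}

// Every intermediate layer needs its own try/catch just to add context.
void load_config(const std::string& port_text)
{
    try
    {
        parse_port(port_text);
    }
    catch (...)
    {
        std::throw_with_nested(std::runtime_error("while loading config"));
    }
}

void start_server(const std::string& port_text)
{
    try
    {
        load_config(port_text);
    }
    catch (...)
    {
        std::throw_with_nested(std::runtime_error("while starting server"));
    }
}

// Walks the chain of nested exceptions, outermost first.
std::string describe(const std::exception& e)
{
    std::string text = e.what();
    try
    {
        std::rethrow_if_nested(e);
    }
    catch (const std::exception& inner)
    {
        text += " <- " + describe(inner);
    }
    catch (...)
    {
        text += " <- (unknown exception)";
    }
    return text;
}

// Runs start_server and returns the accumulated context, or "" on success.
std::string run(const std::string& port_text)
{
    try
    {
        start_server(port_text);
    }
    catch (const std::exception& e)
    {
        return describe(e);
    }
    return "";
}
```

Two of the three functions exist only to wrap a call in try/catch, and any layer that forgets to do so silently drops out of the chain.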

In many other programming languages with exception-based error handling, there exists a mechanism to extract a stack trace from the exception object. Typically that's sufficient context to analyze the error. Unfortunately, because there is no standard C++ ABI, this can't be written into the language specification. Nonetheless, on many platforms it is possible to implement call stack capturing within the C++ exception model. Here I will focus on Windows.

Windows provides a robust API (DbgHelp) for debugging as part of the core Win32 SDK. We can use this to implement a class to save the current call stack as an array of 64-bit frame pointers (strictly speaking, the program counter address recorded for each frame). We can also implement a function to take that array of frame pointers and transform it into a human-readable stack trace - if PDBs are available, it will include symbol names. Otherwise, it will simply list the addresses, which are still useful (for example, when debugging a minidump).

class sym_handler
{
public:
    static sym_handler& get_instance()
    {
        static sym_handler instance;
        return instance;
    }

    std::string get_symbol_info(uint64_t addr)
    {
        std::stringstream ss;
        DWORD64 displacement64;
        DWORD displacement;
        char symbol_buffer[sizeof(SYMBOL_INFO) + 256];
        SYMBOL_INFO* symbol = reinterpret_cast<SYMBOL_INFO*>(symbol_buffer);
        symbol->SizeOfStruct = sizeof(SYMBOL_INFO);
        symbol->MaxNameLen = 255;

        IMAGEHLP_LINE64 line;
        line.SizeOfStruct = sizeof(IMAGEHLP_LINE64);

        ss << boost::format("[0x%08X] ") % addr;
        if (m_initialized)
        {
            if (SymFromAddr(GetCurrentProcess(),
                            addr,
                            &displacement64,
                            symbol))
            {
                ss << symbol->Name;
                if (SymGetLineFromAddr64(GetCurrentProcess(),
                                            addr,
                                            &displacement,
                                            &line))
                {
                    ss << (boost::format(" (%s:%d)") % line.FileName % line.LineNumber).str();
                }
            }
        }
        return ss.str();
    }

    void capture_stack_trace(CONTEXT* context, uint64_t* frame_ptrs, size_t count, size_t skip)
    {
        if (m_initialized)
        {
            CONTEXT current_context;
            if (!context)
            {
                RtlCaptureContext(&current_context);
                context = &current_context;
            }

            DWORD machine_type;
            STACKFRAME64 frame;
            ZeroMemory(&frame, sizeof(frame));
            frame.AddrPC.Mode = AddrModeFlat;
            frame.AddrFrame.Mode = AddrModeFlat;
            frame.AddrStack.Mode = AddrModeFlat;
#ifdef _M_X64
            frame.AddrPC.Offset = context->Rip;
            frame.AddrFrame.Offset = context->Rbp;
            frame.AddrStack.Offset = context->Rsp;
            machine_type = IMAGE_FILE_MACHINE_AMD64;
#else
            frame.AddrPC.Offset = context->Eip;
            frame.AddrFrame.Offset = context->Ebp;
            frame.AddrStack.Offset = context->Esp;
            machine_type = IMAGE_FILE_MACHINE_I386;
#endif
            for (size_t i = 0; i < count + skip; i++)
            {
                if (StackWalk64(machine_type,
                                GetCurrentProcess(),
                                GetCurrentThread(),
                                &frame,
                                context,
                                NULL,
                                SymFunctionTableAccess64,
                                SymGetModuleBase64,
                                NULL))
                {
                    if (i >= skip)
                    {
                        frame_ptrs[i - skip] = frame.AddrPC.Offset;
                    }
                }
                else
                {
                    break;
                }
            }
        }
    }

private:
    sym_handler()
    {
        m_initialized = SymInitialize(GetCurrentProcess(), NULL, TRUE) == TRUE;
    }

    ~sym_handler()
    {
        if (m_initialized)
        {
            SymCleanup(GetCurrentProcess());
            m_initialized = false;
        }
    }

    bool m_initialized;
};

With that out of the way the first thing to do is define an exception class which will be the base for all custom exceptions used in our application. We will derive from boost::exception so we can use its mechanisms for attaching arbitrary data to exception objects. While that isn't strictly necessary - we could easily store the stack trace directly within the my_exception object - it's cleaner. We'll also define a template class which can be used to create a hierarchy of exception types without needing to write any extra code, by specifying the parent class as a template parameter.

class my_exception : virtual public std::exception,
                     virtual public boost::exception
{
public:
    virtual const char* what() const throw()
    {
        return m_message;
    }

protected:
    my_exception(const char* msg)
    {
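        // Truncate long messages; strcpy_s would invoke the invalid parameter handler instead.
        strncpy_s(m_message, msg, _TRUNCATE);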
        *this << attach_stack_trace(stack_trace(NULL, 2));
    }

    char m_message[256];
};

template <typename Tag, typename Base = my_exception>
class error : public Base
{
public:
    error(const char* msg)
    : Base(msg)
    {
    }
};
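The second template parameter is what makes hierarchies cheap to declare. As a rough sketch of how that might look, the following uses a simplified stand-in for my_exception (base_error, with no boost::exception base and no stack trace) so that it compiles anywhere; the tag names and the classify helper are hypothetical.

```cpp
#include <cassert>
#include <exception>
#include <string>

// Simplified stand-in for my_exception: only stores the message.
class base_error : public std::exception
{
public:
    const char* what() const noexcept override { return m_message.c_str(); }

protected:
    explicit base_error(const char* msg) : m_message(msg) {}

private:
    std::string m_message;
};

template <typename Tag, typename Base = base_error>
class error : public Base
{
public:
    explicit error(const char* msg) : Base(msg) {}
};

// Each typedef creates a distinct type; the second argument picks the parent.
typedef error<struct io_tag> io_error;
typedef error<struct file_not_found_tag, io_error> file_not_found_error;
typedef error<struct permission_tag, io_error> permission_error;
typedef error<struct parse_tag> parse_error;

// Throws an E and reports which handler it reaches.
template <typename E>
std::string classify(const char* msg)
{
    std::string result = "uncaught";
    try
    {
        throw E(msg);
    }
    catch (const file_not_found_error&) { result = "file_not_found"; }
    catch (const io_error&)             { result = "io"; }
    catch (const base_error&)           { result = "base"; }
    return result;
}
```

Here a permission_error is caught by the io_error handler because io_error is its parent, while a parse_error falls through to the base handler.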

The my_exception constructor attaches a stack_trace object to itself upon construction. Here's the code for stack_trace and attach_stack_trace:

class stack_trace
{
public:
    stack_trace(CONTEXT* context, size_t skip)
    {
        ZeroMemory(m_frame_ptrs, sizeof(m_frame_ptrs));
        sym_handler::get_instance().capture_stack_trace(context,
                                                        m_frame_ptrs,
                                                        max_frame_ptrs,
                                                        skip);

    }

    std::string to_string() const
    {
        std::stringstream ss;
        for (size_t i = 0;
             i < max_frame_ptrs && m_frame_ptrs[i];
             ++i)
        {
            ss << sym_handler::get_instance().get_symbol_info(m_frame_ptrs[i]) << "\n";
        }
        return ss.str();
    }

private:
    static const size_t max_frame_ptrs = 16;
    uint64_t m_frame_ptrs[max_frame_ptrs];
};

typedef boost::error_info<stack_trace, stack_trace> attach_stack_trace;

inline std::string to_string(const stack_trace& trace)
{
    return trace.to_string();
}
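The skip argument of 2 passed by my_exception is there to drop the frames for capture_stack_trace and the stack_trace constructor, so that the trace starts at the my_exception constructor (assuming nothing is inlined). The bookkeeping can be sketched in isolation as follows; select_frames is a hypothetical helper, and the walked vector stands in for the frames StackWalk64 would yield.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Copies up to count addresses into out, ignoring the first skip entries of
// walked (innermost frame first). Returns the number of entries written.
std::size_t select_frames(const std::vector<std::uint64_t>& walked,
                          std::uint64_t* out,
                          std::size_t count,
                          std::size_t skip)
{
    std::size_t written = 0;
    for (std::size_t i = 0; i < count + skip && i < walked.size(); ++i)
    {
        if (i >= skip)
        {
            out[written++] = walked[i];
        }
    }
    return written;
}
```

Entries past the last written one are left untouched, which is why stack_trace zeroes its array first and to_string stops at the first zero.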

Now we can test it out:

typedef error<struct invalid_parameter_tag> invalid_parameter_error;

void foo(int x)
{
    if (x < 0) BOOST_THROW_EXCEPTION(invalid_parameter_error("x must be >= 0"));
}

int WINAPI WinMain(HINSTANCE, HINSTANCE, LPSTR, int)
{
    try
    {
        foo(-1);
    }
    catch (std::exception& e)
    {
        OutputDebugStringA(boost::diagnostic_information(e).c_str());
    }
    return 0;
}
   

Because I'm using the boost::exception framework, I use the BOOST_THROW_EXCEPTION macro to throw. It's entirely possible to implement this kind of system without using boost, but I find it more convenient. In any case, here's the output:

failfast.cpp(299): Throw in function void __cdecl foo(int)
Dynamic exception type: class boost::exception_detail::clone_impl<class error<struct invalid_parameter_tag,class my_exception> >
std::exception::what: x must be >= 0
[class stack_trace * __ptr64] = [0x13F2D4664] my_exception::my_exception (c:\dev\failfast\failfast\failfast.cpp:202)
[0x13F2D4C50] error<invalid_parameter_tag,my_exception>::error<invalid_parameter_tag,my_exception> (c:\dev\failfast\failfast\failfast.cpp:214)
[0x13F2D3AF1] foo (c:\dev\failfast\failfast\failfast.cpp:299)
[0x13F2D3BE1] WinMain (c:\dev\failfast\failfast\failfast.cpp:308)
[0x13F306656] __tmainCRTStartup (f:\dd\vctools\crt_bld\self_64_amd64\crt\src\crtexe.c:547)
[0x13F30634E] WinMainCRTStartup (f:\dd\vctools\crt_bld\self_64_amd64\crt\src\crtexe.c:371)
[0x7775652D] BaseThreadInitThunk
[0x7788C521] RtlUserThreadStart

That's a fantastic amount of information. Even without PDBs, we'd still have access to the call stack frame pointers, and it's simple enough to query a debugger for the symbol names. Unfortunately there's a big problem with this approach - it only attaches the stack trace to exceptions derived from my_exception. If an exception is thrown elsewhere, for example from the STL or Boost, and it propagates to the top-level catch block, it will be nearly impossible to determine its cause.

One possible approach would be to not try to catch them at all - just let the application crash. The crash dump will have the stack trace at the throw site. If you can get away with it, this works well enough, but it precludes you from doing any cleanup. Also, if you have any catch-all-and-rethrow blocks for disposing non-RAII resources, the crash site will be at the rethrow, which generally won't have usable information.

Another approach would be to take the opposite tack and try to catch everything - surround all calls to library functions which could potentially throw with a try/catch block, then rethrow one of our custom exceptions. Although the stack trace wouldn't extend into the library code, unless the bug is within the library that wouldn't be necessary. This is good practice in general because it allows us to transform generic exceptions (such as out_of_range) into specific ones (too_many_foos_in_the_foo_queue), which we might be able to handle gracefully without terminating.

In practice, however, catching every exception thrown by third-party code isn't feasible. And it's precisely the ones that slip through the cracks for which a stack trace would be most valuable. Fortunately, there is a workaround that will allow us to capture stack traces for uncaught exceptions, although the process is a little arcane and dips into some of the undocumented nethers of the Win32 API. If that hasn't scared you off, here's how it works: when a C++ exception is thrown without a corresponding catch block to catch it, an SEH exception is raised. When an SEH exception is raised, Windows checks for any registered handlers (i.e. __except blocks); if none accept the exception, it calls the UnhandledExceptionFilter function.

Two things to note. First, the top-level exception filter is run prior to the stack being unwound. Second, we can use SetUnhandledExceptionFilter to install our own callback function. It turns out we can also throw C++ exceptions from that callback function, and since the stack remains intact, they will propagate upward the same way that the original exception would have.

First, let's define the exception that our filter will throw in place of one that would otherwise have gone uncaught:

typedef error<struct unhandled_exception_tag> unhandled_exception;

Next, we'll define our unhandled exception filter:

LONG WINAPI RethrowCppExceptions(EXCEPTION_POINTERS* e)
{
    if (e->ExceptionRecord->ExceptionCode == 0xe06d7363)
    {
        BOOST_THROW_EXCEPTION(unhandled_exception("Unhandled C++ exception")
                              << attach_original_exception_type(e->ExceptionRecord->ExceptionInformation[2]));
    }
    return EXCEPTION_CONTINUE_SEARCH;
}

The magic number (0xe06d7363) is the exception code assigned to SEH exceptions which are triggered from uncaught C++ exceptions being thrown. There's another part to this function which I haven't explained yet - attach_original_exception_type. One of the interesting things that happens when SEH takes over for an uncaught C++ exception is that the address of a compiler-generated record describing the thrown type (the TI3?AVout_of_rangestd symbol in the output further down) is passed as a parameter to the SEH exception. Since the name of that symbol embeds the name of the type, by examining it we can determine what the type of the original exception was. We can do this either at runtime (if PDBs are available) or in a postmortem debugger.

class original_exception_type
{
public:
    original_exception_type(uint64_t addr)
    : m_addr(addr)
    {
    }

    std::string to_string() const
    {
        return sym_handler::get_instance().get_symbol_info(m_addr);
    }

private:
    uint64_t m_addr;
};

typedef boost::error_info<original_exception_type, original_exception_type> attach_original_exception_type;

inline std::string to_string(const original_exception_type& type)
{
    return type.to_string();
}

All that remains is to register the unhandled exception filter. This is actually a little on the tricky side - Windows only allows one top-level exception filter to be installed at a time, and debuggers like to overwrite it so they can be notified when an SEH exception is thrown. That means simply calling SetUnhandledExceptionFilter won't work when we're being debugged (although it will work otherwise). Having radically different program behavior in a debugger isn't acceptable, but fortunately there's an easy workaround.

int WINAPI WinMain(HINSTANCE, HINSTANCE, LPSTR, int)
{
    SetUnhandledExceptionFilter(RethrowCppExceptions);
    AddVectoredContinueHandler(1, RethrowCppExceptions);
    // ...
}

AddVectoredContinueHandler won't do anything normally - the unhandled exception filter will rethrow the exception before it can run. But when the program is being debugged, it will be called by the debugger's filter function, and can rethrow from there.

Let's verify that everything is working as expected:

void foo()
{
    std::vector<int> v;
    v.at(1);
}

int WINAPI WinMain(HINSTANCE, HINSTANCE, LPSTR, int)
{
    SetUnhandledExceptionFilter(RethrowCppExceptions);
    AddVectoredContinueHandler(1, RethrowCppExceptions);
    try
    {
        foo();
    }
    catch (my_exception& e)
    {
        OutputDebugStringA(boost::diagnostic_information(e).c_str());
    }

    return 0;
}

This program outputs:

FailFast.cpp(228): Throw in function long __cdecl RethrowCppExceptions(struct _EXCEPTION_POINTERS *)
Dynamic exception type: class boost::exception_detail::clone_impl<class error<struct unhandled_exception_tag,class my_exception> >
std::exception::what: Unhandled C++ exception
[class original_exception_type * __ptr64] = [0x790AA430] TI3?AVout_of_rangestd
[class stack_trace * __ptr64] = [0x13F4E37E6] error<unhandled_exception_tag,my_exception>::error<unhandled_exception_tag,my_exception> (c:\dev\failfast\failfast\failfast.cpp:214)
[0x13F4E2B7E] RethrowCppExceptions (c:\dev\failfast\failfast\failfast.cpp:228)
[0x7787A59F] vsprintf_s
[0x7786DE03] RtlDeleteTimer
[0x778B1278] KiUserExceptionDispatcher
[0x7FEFDE1CACD] RaiseException
[0x792F1345] CxxThrowException
[0x7904F976] std::_Xout_of_range
[0x13F4E2C66] WinMain (c:\dev\failfast\failfast\failfast.cpp:309)
[0x13F4EE123] __tmainCRTStartup (f:\dd\vctools\crt_bld\self_64_amd64\crt\src\crtexe.c:547)
[0x7775652D] BaseThreadInitThunk
[0x7788C521] RtlUserThreadStart

There are a few things we need to be careful of when using this technique. First, there absolutely must be a top-level catch block in every thread. If not, any uncaught exception will result in an infinite loop. (It shouldn't be too difficult to work around this if needed - for example, if a third party library creates its own threads and throws exceptions on them without catching them. A thread-local boolean variable can track if the exception filter has already run and avoid rethrowing if true - this will simply crash the application, instead of hanging.)
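The thread-local guard described above could be sketched roughly like this. To keep it compilable anywhere, the decision is separated from the Win32 filter: decide and reset_filter_guard are hypothetical helpers, and the real filter would throw unhandled_exception on rethrow_as_cpp and return EXCEPTION_CONTINUE_SEARCH otherwise.

```cpp
#include <cassert>

enum filter_action { rethrow_as_cpp, continue_search };

// Set once the filter has rethrown on the current thread.
thread_local bool t_filter_has_run = false;

// Decides what the filter should do for the exception it was handed.
filter_action decide(bool is_cpp_exception)
{
    if (!is_cpp_exception || t_filter_has_run)
    {
        // Either not ours, or we already rethrew once and nobody caught it:
        // let the default handling crash the process instead of looping.
        return continue_search;
    }
    t_filter_has_run = true;
    return rethrow_as_cpp;
}

// A catch block that recovers and keeps the thread running would call this.
void reset_filter_guard()
{
    t_filter_has_run = false;
}
```

If the top-level handler always ends in a crash, the reset is unnecessary; it only matters for threads that catch the rethrown exception and carry on.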

Second, if you use the catch-all-and-rethrow idiom anywhere in your code, you may need to refactor it to catch only exceptions derived from my_exception. When you catch-all and rethrow, the SEH exception will be raised from the site of the last rethrow, which will greatly reduce the usefulness of the stack trace. However, you need to be careful of this type of scenario:

try
{
    try
    {
        foo();                //    throws my_exception
        my_vector.at(-1);    //    throws std::out_of_range
    }
    catch (my_exception& e)
    {
        cleanup();
        throw;
    }
}
catch (std::out_of_range&)
{
    //    handle out of range error
}

If my_vector.at(-1) throws std::out_of_range, then cleanup() will never be called, because the inner handler only catches my_exception. You can avoid this situation by adopting a rule that you will only ever catch non-my_exception objects immediately, and if you need to defer handling them, rethrow a my_exception-derived type. For example:

try
{
    try
    {
        foo();
        try
        {
            my_vector.at(-1);
        }
        catch (std::out_of_range&)
        {
            throw my_out_of_range();
        }
    }
    catch (my_exception& e)
    {
        cleanup();
        throw;
    }
}
catch (my_out_of_range&)
{
    // handle out of range error
}

Although it can be messy, in practice this should be a very rare scenario.
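If the pattern does come up more than once, the innermost try/catch can be folded into a small helper. This is only a sketch: rethrow_as is a hypothetical function, and app_error and app_out_of_range are simplified stand-ins for my_exception and my_out_of_range so the example compiles without Boost or Windows.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <stdexcept>
#include <string>
#include <vector>

// Simplified stand-ins for my_exception and my_out_of_range.
struct app_error : std::exception
{
    const char* what() const noexcept override { return "app_error"; }
};

struct app_out_of_range : app_error
{
    const char* what() const noexcept override { return "app_out_of_range"; }
};

// Calls f(); if it throws From, throws a default-constructed To instead.
template <typename From, typename To, typename F>
void rethrow_as(F f)
{
    try
    {
        f();
    }
    catch (const From&)
    {
        throw To();
    }
}

// Mirrors the nested try blocks above; cleaned_up stands in for cleanup().
std::string run(const std::vector<int>& values, std::size_t index, bool& cleaned_up)
{
    try
    {
        try
        {
            rethrow_as<std::out_of_range, app_out_of_range>(
                [&] { (void)values.at(index); });
        }
        catch (const app_error&)
        {
            cleaned_up = true;
            throw;
        }
    }
    catch (const app_out_of_range&)
    {
        return "handled out of range";
    }
    return "ok";
}
```

Because the translated exception derives from the application's base type, the cleanup handler sees it before the outer handler does.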

Now that we have the framework in place to generate stack traces, the question becomes how to access this information. If your application is set up to use Windows Error Reporting, the easiest way to do this is to simply cause a crash in the top-level catch handler. Microsoft actually provides a function to do this - RaiseFailFastException. And since minidumps normally don't contain heap data, you'll want to first copy the stack trace to a buffer on the stack.

    catch (my_exception& e)
    {
        char buffer[4096];
        ZeroMemory(buffer, sizeof(buffer));
        try
        {
            // copy() does not null-terminate, so leave the last zeroed byte untouched.
            boost::diagnostic_information(e).copy(buffer, sizeof(buffer) - 1, 0);
        }
        catch (...)
        {
        }
        OutputDebugStringA(buffer);
        RaiseFailFastException(NULL, NULL, 0);
    }
   

The inner try/catch block is extremely important, because without it, if boost::diagnostic_information throws, the application will hang.
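One detail worth being careful about is that std::string::copy does not write a null terminator, so the copy has to leave room for one if the buffer is later read as a C string. A minimal sketch of a helper that handles this, where copy_to_buffer is a hypothetical name rather than anything from Boost or Win32:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Copies as much of text as fits into buffer and always null-terminates it.
// Returns the number of characters copied, excluding the terminator.
template <std::size_t N>
std::size_t copy_to_buffer(const std::string& text, char (&buffer)[N])
{
    static_assert(N > 0, "buffer must have room for the terminator");
    const std::size_t copied = text.copy(buffer, N - 1, 0);
    buffer[copied] = '\0';
    return copied;
}
```

A diagnostic string longer than the buffer is simply truncated, which is usually acceptable since the innermost frames come first.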

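We can test out our minidump debugging capabilities by forcing Windows to store local dump files. To do this, edit the registry key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\Windows Error Reporting\LocalDumps. Add the following values, editing as needed: DumpFolder (REG_EXPAND_SZ, the directory where dumps are written), DumpCount (REG_DWORD, the maximum number of dumps to keep) and DumpType (REG_DWORD, 1 for a minidump or 2 for a full dump).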



Next, run the application. It will crash, as expected, and a minidump will be created in the folder specified under DumpFolder. Fire up WinDbg and open up the minidump, then press Alt+6 to open up the call stack window. You'll see something that looks like this:



Double-click on the second item, which should be the top-level catch block. Finally, to view the contents of the buffer, enter the following command:

0:000> .printf "%ma", buffer
FailFast.cpp(228): Throw in function long __cdecl RethrowCppExceptions(struct _EXCEPTION_POINTERS *)
Dynamic exception type: class boost::exception_detail::clone_impl<class error<struct unhandled_exception_tag,class my_exception> >
std::exception::what: Unhandled C++ exception
[class original_exception_type * __ptr64] = [0x7887A430] TI3?AVout_of_rangestd
[class stack_trace * __ptr64] = [0x13F8F37E6] error<unhandled_exception_tag,my_exception>::error<unhandled_exception_tag,my_exception> (c:\dev\failfast\failfast\failfast.cpp:214)
[0x13F8F2B7E] RethrowCppExceptions (c:\dev\failfast\failfast\failfast.cpp:228)
[0x777D9450] UnhandledExceptionFilter
[0x778F43B8] MD5Final
[0x778785A8] _C_specific_handler
[0x77889D0D] RtlDecodePointer
[0x778791AF] RtlUnwindEx
[0x778797A8] RtlRaiseException
[0x7FEFDE1CACD] RaiseException
[0x79071345] CxxThrowException
[0x7881F976] std::_Xout_of_range
[0x13F8F2C66] WinMain (c:\dev\failfast\failfast\failfast.cpp:309)
[0x13F8FE123] __tmainCRTStartup (f:\dd\vctools\crt_bld\self_64_amd64\crt\src\crtexe.c:547)
[0x7775652D] BaseThreadInitThunk
[0x7788C521] RtlUserThreadStart

If the minidump was generated without PDBs available, you won't get symbol names, but you can query the debugger using the ln command:

0:000> ln 13F8F2C66
c:\dev\failfast\failfast\failfast.cpp(309)+0x1e
(00000001`3f8f2c00)   FailFast!WinMain+0x66   |  (00000001`3f8f2c90)   FailFast!std::basic_string<char,std::char_traits<char>,std::allocator<char> >::basic_string<char,std::char_traits<char>,std::allocator<char> >

And now you have everything you need to track down those pesky exceptions.
