Microsoft Windows Help Centre Handles Malformed Escape Sequences Incorrectly
----------------------------------------------------------------------------

Help and Support Centre is the default application provided to access online
documentation for Microsoft Windows. Microsoft supports accessing help
documents directly via URLs by installing a protocol handler for the scheme
"hcp". A typical example is provided in the Windows XP Command Line
Reference, available at
http://technet.microsoft.com/en-us/library/bb490918.aspx.

Using hcp:// URLs is intended to be safe: when invoked via the registered
protocol handler, the command line parameter /fromhcp is passed to the help
centre application. This flag switches the help centre into a restricted
mode, which will only permit a whitelisted set of help documents and
parameters.

This design, introduced in SP2, is reasonably sound. A whitelist of trusted
documents is a safe way of allowing interaction with the documentation from
less-trusted sources. Unfortunately, an implementation error in the whitelist
allows it to be evaded.

URLs are normalised and unescaped prior to validation using
MPC::HTML::UrlUnescapeW(), which in turn uses MPC::HexToNum() to translate
URL escape sequences into their original characters. The relevant code from
helpctr.exe 5.1.2600.5512 (latest at time of writing) is below.

.text:0106684C Unescape:
.text:0106684C        cmp     di, '%'             ; di contains the current wchar in the input URL.
.text:01066850        jnz     short LiteralChar   ; if this is not a '%', it must be a literal character.
.text:01066852        push    esi                 ; esi contains a pointer to the current position in URL to unescape.
.text:01066853        call    ds:wcslen           ; find the remaining length.
.text:01066859        cmp     word ptr [esi], 'u' ; if the next wchar is 'u', this is a unicode escape and I need 4 xdigits.
.text:0106685D        pop     ecx                 ; this sequence calculates the number of wchars needed (4 or 2).
.text:0106685E        setz    cl                  ; i.e. %uXXXX (four needed), or %XX (two needed).
.text:01066861        mov     dl, cl
.text:01066863        neg     dl
.text:01066865        sbb     edx, edx
.text:01066867        and     edx, 3
.text:0106686A        inc     edx
.text:0106686B        inc     edx
.text:0106686C        cmp     eax, edx            ; test if I have enough characters in input to decode.
.text:0106686E        jl      short LiteralChar   ; if not enough, this '%' is considered literal.
.text:01066870        test    cl, cl
.text:01066872        movzx   eax, word ptr [esi+2]
.text:01066876        push    eax
.text:01066877        jz      short NotUnicode
.text:01066879        call    HexToNum            ; call MPC::HexToNum() to convert this nibble (4 bits) to an integer.
.text:0106687E        mov     edi, eax            ; edi contains the running total of the value of this escape sequence.
.text:01066880        movzx   eax, word ptr [esi+4]
.text:01066884        push    eax
.text:01066885        shl     edi, 4              ; shift edi left 4 positions to make room for the next digit, i.e. total <<= 4;
.text:01066888        call    HexToNum
.text:0106688D        or      edi, eax            ; or the next value into the 4-bit gap, i.e. total |= val.
.text:0106688F        movzx   eax, word ptr [esi+6] ; this process continues for the remaining wchars.
.text:01066893        push    eax
.text:01066894        shl     edi, 4
.text:01066897        call    HexToNum
.text:0106689C        or      edi, eax
.text:0106689E        movzx   eax, word ptr [esi+8]
.text:010668A2        push    eax
.text:010668A3        shl     edi, 4
.text:010668A6        call    HexToNum
.text:010668AB        or      edi, eax
.text:010668AD        add     esi, 0Ah            ; account for number of bytes (not chars) consumed by the escape.
.text:010668B0        jmp     short FinishedEscape
.text:010668B2
.text:010668B2 NotUnicode:
.text:010668B2        call    HexToNum            ; this is the same code, but for non-unicode sequences (e.g. %41, instead of %u0041)
.text:010668B7        mov     edi, eax
.text:010668B9        movzx   eax, word ptr [esi]
.text:010668BC        push    eax
.text:010668BD        call    HexToNum
.text:010668C2        shl     eax, 4
.text:010668C5        or      edi, eax
.text:010668C7        add     esi, 4              ; account for number of bytes (not chars) consumed by the escape.
.text:010668CA
.text:010668CA FinishedEscape:
.text:010668CA        test    di, di
.text:010668CD        jz      short loc_10668DA
.text:010668CF
.text:010668CF LiteralChar:
.text:010668CF        push    edi                 ; append the final value to the normalised string using a std::string append.
.text:010668D0        mov     ecx, [ebp+unescaped]
.text:010668D3        push    1
.text:010668D5        call    std::string::append
.text:010668DA        mov     di, [esi]           ; fetch the next input character.
.text:010668DD        test    di, di              ; have we reached the NUL terminator?
.text:010668E0        jnz     Unescape            ; process next char.

This code seems sane, but an error exists due to how MPC::HexToNum() handles
error conditions. The relevant section of code is annotated below.

.text:0102D32A        mov     edi, edi
.text:0102D32C        push    ebp
.text:0102D32D        mov     ebp, esp            ; function prologue.
.text:0102D32F        mov     eax, [ebp+arg_0]    ; fetch the character to convert.
.text:0102D332        cmp     eax, '0'
.text:0102D335        jl      short CheckUppercase ; is it a digit?
.text:0102D337        cmp     eax, '9'
.text:0102D33A        jg      short CheckUppercase
.text:0102D33C        add     eax, 0FFFFFFD0h     ; atoi(), probably written val - '0' and optimised by compiler.
.text:0102D33F        jmp     short Complete
.text:0102D341 CheckUppercase:
.text:0102D341        cmp     eax, 'A'
.text:0102D344        jl      short CheckLowercase ; is it an uppercase xdigit?
.text:0102D346        cmp     eax, 'F'
.text:0102D349        jg      short CheckLowercase
.text:0102D34B        add     eax, 0FFFFFFC9h     ; atoi()
.text:0102D34E        jmp     short Complete
.text:0102D350 CheckLowercase:
.text:0102D350        cmp     eax, 'a'
.text:0102D353        jl      short Invalid       ; lowercase xdigit?
.text:0102D355        cmp     eax, 'f'
.text:0102D358        jg      short Invalid
.text:0102D35A        add     eax, 0FFFFFFA9h     ; atoi()
.text:0102D35D        jmp     short Complete
.text:0102D35F Invalid:
.text:0102D35F        or      eax, 0FFFFFFFFh     ; invalid character, return -1
.text:0102D362 Complete:
.text:0102D362        pop     ebp
.text:0102D363        retn    4

MPC::HexToNum() returns -1 for an invalid character, but
MPC::HTML::UrlUnescapeW() does not check this return code as required, and
can therefore be manipulated into appending unexpected garbage onto
std::strings.
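
The effect of the missing check is easier to see in C. What follows is a
rough model of the %XX path only, reconstructed from the disassembly above
and not taken from any Microsoft source. The names hex_to_num,
decode_unchecked and decode_checked are hypothetical, and decode_checked is
just one way the return code could have been validated.

```c
#include <stdint.h>

/* Model of MPC::HexToNum() as annotated above: 0..15 for a hex digit,
 * -1 for anything else. The name is hypothetical. */
static int hex_to_num(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Model of the NotUnicode path: the low digit is converted first, then the
 * high digit is shifted and or'd in. Neither result is tested, so a -1
 * (0xFFFFFFFF) leaks into the 16-bit value that gets appended.
 * Unsigned arithmetic is used so the shift is well defined in C. */
static uint16_t decode_unchecked(uint16_t hi, uint16_t lo)
{
    uint32_t value = (uint32_t)hex_to_num(lo);
    value |= (uint32_t)hex_to_num(hi) << 4;
    return (uint16_t)value; /* only di, the low 16 bits, is appended */
}

/* Assumed fix, not the vendor's: reject the escape if either digit is
 * invalid. Returns 1 and stores the value on success, 0 on failure. */
static int decode_checked(uint16_t hi, uint16_t lo, uint16_t *out)
{
    int h = hex_to_num(hi);
    int l = hex_to_num(lo);
    if (h < 0 || l < 0)
        return 0;
    *out = (uint16_t)((h << 4) | l);
    return 1;
}
```

In this model decode_unchecked('4', '1') yields 0x41 as expected, while
decode_unchecked('4', 'z') yields 0xFFFF and decode_unchecked('z', '1')
yields 0xFFF1. decode_checked refuses both malformed sequences, leaving the
caller free to treat the '%' as a literal character instead.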

This error may appear benign, but the miscalculations it produces can be
used later in the code to evade the /fromhcp whitelist.

Assuming that we can access arbitrary help documents (full details of how the
MPC:: error can be used to accomplish this will be explained below), we must
identify a document that can be controlled purely from the URL used to access
it.

After browsing the documents available in a typical installation, the author
concluded the only way to do this would be a cross-site scripting error.
After some careful searching, a candidate was discovered:

hcp://system/sysinfo/sysinfomain.htm?svr=