Using WinAFL to Fuzz Hangul (HWP) AppShield

After attending the "Vulnerability Discovery and Triage Automation" training at POC 2017, I decided to give WinAFL a shot and fuzz Hangul (HWP), a Korean word processor. I'd like to thank @richinseattle for helping me through the WinAFL fuzzing process.

I knew from past research that Hangul integrates a security module named HncAppShield, which scans .hwp files for malicious payloads before the document is parsed and shown to the user. Because this module is relatively new and is not developed by Hancom, it seemed like a good fuzzing target.

Version #1

To use WinAFL, you have to pick a specific function to fuzz, because it uses a persistent fuzzing mode: it instruments the target function so that it runs in a loop without restarting the process. HncAppShield is a DLL, so I created a simple loader that loads the DLL and calls AppShield_InspectMalware() with the fuzzing input. AppShield_InspectMalware() is an exported function that receives a file path as its argument. I chose this function as my first WinAFL target.

Exported functions of HncAppShield

#include <stdio.h>
#include <Windows.h>
#include <iostream>

extern "C" __declspec(dllexport) int fuzz_hwp(wchar_t* hwp);

typedef int(*INSPECT)(wchar_t* filename);
INSPECT AppShield_InspectMalware;

wchar_t* charToWChar(const char* text)
{
    size_t size = strlen(text) + 1;
    wchar_t* wa = (wchar_t*)malloc(sizeof(wchar_t) * size);
    mbstowcs(wa, text, size);
    return wa;
}

int fuzz_hwp(wchar_t* filename)
{
    AppShield_InspectMalware(filename);
    return 0;
}

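int main(int argc, char** argv)
{
    if (argc < 2) {
        printf("Usage: %s <file.hwp>\n", argv[0]);
        return 1;
    }

    HINSTANCE HncAppShield = LoadLibrary(L"HncAppShield.dll");
    int isDetected = 0;
    if (HncAppShield) {
        // AppShield_InspectMalware is resolved by ordinal (1) rather than by name
        AppShield_InspectMalware = (INSPECT)GetProcAddress(HncAppShield, (LPCSTR)1);
        if (AppShield_InspectMalware) {
            isDetected = fuzz_hwp(charToWChar(argv[1]));
        }
    }
    printf("[Malware result] %d\n", isDetected);
    return isDetected;
}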

I compiled the loader (HncAppShieldLoader.exe) and started fuzzing with WinAFL using the following command.

afl-fuzz.exe -i in -o out -D D:\DynamoRIO\bin32 -t 10000 -- -coverage_module AppShieldDLL.dll -fuzz_iterations 5000 -target_module HncAppShieldLoader.exe -target_method fuzz_hwp -nargs 1 -- .\HncAppShieldLoader.exe @@

Version #1 running

WinAFL was working, at about 20 to 30 executions per second on my desktop right after starting.
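For readers new to persistent mode, the following is a conceptual model only, not WinAFL's actual implementation (WinAFL does this through instrumentation). It assumes a one-argument target, as with -nargs 1, and a hypothetical leaky_target that stands in for a function that acquires a resource on every call and never releases it. The point is that anything the target leaves behind accumulates across iterations inside the same process.

```cpp
#include <cassert>

// Signature of the function being fuzzed (one argument, like -nargs 1).
using Target = int (*)(const char* inputPath);

// Conceptual model: call the target `iterations` times inside one process
// instead of starting a new process for every input.
// Returns how many calls completed.
int run_persistent_model(Target target, const char* inputPath, int iterations)
{
    assert(target != nullptr);
    int completed = 0;
    for (int i = 0; i < iterations; ++i) {
        // A real fuzzer would rewrite the input file here before each call.
        target(inputPath);
        ++completed;
    }
    return completed;
}

// Hypothetical leaky target: "opens" a handle on every call and never closes it.
static int g_open_handles = 0;

int leaky_target(const char* /*inputPath*/)
{
    ++g_open_handles;  // stands in for an unreleased file handle or allocation
    return 0;
}
```

Calling run_persistent_model(leaky_target, "in.hwp", 5000) leaves g_open_handles at 5000, which illustrates why a target function for persistent mode should clean up everything it touches before returning.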

Version #2

Version #1 of the loader worked and kept finding new edges, but it was slow and kept showing process nudging messages. The nudging slowed fuzzing down to under 1 exec/sec, so I needed to get rid of it.

Version #1 of the loader slowing down due to nudging

I didn't know what caused the nudging at first. Richard commented that it might be due to files or other resources not being released properly, and that I should hook deeper into the module to avoid the issue. So I started analyzing what happens after AppShield_InspectMalware() is called, to see whether other places in the module could be fuzzed. As the following shows, there are unnecessary operations at the beginning of AppShield_InspectMalware().

Unnecessary actions that need to be ignored

Getting the temporary path, deleting files inside the HNC temporary folder, and all the checks along the way before the actual scan can be skipped. Skipping them makes fuzzing faster and avoids the nudging issue. I spent some time reverse engineering the module to find a function that receives an HWP file path as an argument and sits deep enough to avoid unnecessary file system interaction. Eventually I stumbled upon a promising function.

This function receives a directory path as its argument and loads each file inside the directory into memory in a loop. If a filename includes the name of a known HWP storage type, the function scans that file for malicious payloads. So I had gotten deeper into the actual scanning logic!
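The behavior described above can be sketched in portable C++17. This is a rough reconstruction from the description, not the module's decompiled code: scan_directory_sketch, the ScanFn signature, and the list of storage-type names passed in are all assumptions made for illustration.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Stand-in for the module's in-memory scanner (hypothetical signature).
using ScanFn = int (*)(const std::vector<char>& data, const std::string& storageType);

// Walk `dir`, load each regular file into memory, and hand it to `scan`
// if its file name contains one of the given storage-type names.
// Returns the number of files that were scanned.
int scan_directory_sketch(const fs::path& dir,
                          const std::vector<std::string>& storageTypes,
                          ScanFn scan)
{
    assert(scan != nullptr);
    int scanned = 0;
    for (const auto& entry : fs::directory_iterator(dir)) {
        if (!entry.is_regular_file()) continue;
        const std::string name = entry.path().filename().string();
        for (const auto& type : storageTypes) {
            if (name.find(type) == std::string::npos) continue;
            // Read the whole file into a buffer before scanning it.
            std::ifstream in(entry.path(), std::ios::binary);
            std::vector<char> data((std::istreambuf_iterator<char>(in)),
                                   std::istreambuf_iterator<char>());
            scan(data, type);
            ++scanned;
            break;  // a file is scanned at most once
        }
    }
    return scanned;
}
```

Files whose names match no storage type are skipped entirely, which is why the directory handed to this kind of function has to contain files named after their storage types.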

However, my fuzzing inputs were actual HWP files, while the new function required a directory filled with files named after their storage types. From my analysis, I knew that this function actually receives the temporary path as its argument. So there had to be a function I had skipped that creates these storage-named files inside the temporary path. I revisited all the previous functions that deal with the HWP file path and the temporary path.

Function that unpacks HWP file into stream files

I found a function that met my requirements exactly. It receives the input HWP filename and the temporary path as arguments. After debugging it, I confirmed that the functions in the red boxes are the ones that actually unpack the given HWP file into stream files.

Now I could call this function directly from my loader to unpack any HWP file of my choice into a given directory, and then tell the scanning function I found earlier to scan that directory. I updated my loader as follows.

#include <stdio.h>
#include <Windows.h>

extern "C" __declspec(dllexport) int fuzz_hwp(wchar_t* hwp);

// Function Pointers Definition
typedef int(*OPENSTORAGE)(wchar_t*);
typedef BOOL(*HWP_FILE_CHECK1)(wchar_t*);
typedef int(*HWP_FILE_CHECK2)(wchar_t*);
typedef int(*HWP_DUMP)(wchar_t*, int*);
typedef int(*HWP_DUMP_WORKBOOK)(wchar_t*, int, int*);
typedef int(*SCAN_DIRECTORY)();
typedef int(*DELETE_TEMP_FOLDER)();

HWP_FILE_CHECK1 hwp_file_check1;
HWP_FILE_CHECK2 hwp_file_check2;
HWP_DUMP hwp_dump;
HWP_DUMP_WORKBOOK hwp_dump_workbook;
OPENSTORAGE open_storage;
SCAN_DIRECTORY scan_directory;
DELETE_TEMP_FOLDER delete_temp_folder;

wchar_t* output;
wchar_t* input;

wchar_t* get_filename(wchar_t* name)
{
    wchar_t fname[_MAX_FNAME];  // sized for the longest name _wsplitpath can return
    wchar_t ext[_MAX_EXT];
    wchar_t* res = NULL;

    _wsplitpath(name, NULL, NULL, fname, ext);
    res = (wchar_t*)malloc((wcslen(fname) + wcslen(ext) + 1) * sizeof(wchar_t));
    wcscpy(res, fname);
    wcscat(res, ext);
    return res;
}

int filter_exception(int code, PEXCEPTION_POINTERS ex) {
    printf("Filtering %x\n", code);
    return EXCEPTION_EXECUTE_HANDLER;
}

wchar_t* charToWChar(const char* text)
{
    size_t size = strlen(text) + 1;
    wchar_t* wa = (wchar_t*)malloc(sizeof(wchar_t) * size);
    mbstowcs(wa, text, size);
    return wa;
}

int dump_storage(wchar_t* filename) {
    int flag = 0;
    wchar_t destname[MAX_PATH];
    wchar_t* destname_p;
    wchar_t* n;
    wchar_t* filep;

    wcscpy(destname, output);
    wcscat(destname, L"\\");
    n = get_filename(filename);
    wcscat(destname, n);
    free(n);
    destname_p = destname;
    printf("[+] Target file : %ls\n", filename);
    printf("[+] Destination : %ls\n", destname_p);

    __try{
        if (open_storage(filename) || hwp_file_check1(filename))
        {
            printf("[+] Hwp file detected \n");
            __asm { MOV EBX, destname_p }
            hwp_dump(filename, &flag);
            __asm { SUB ESP, 8 }
        }
        else if (hwp_file_check2(filename))
        {
            __asm {MOV ECX, destname_p}
            hwp_dump_workbook(filename, 0, &flag);
        }
        else {
            printf("[!] Invalid file\n");
            return -1;
        }

        printf("[+] Dumped %ls \n", filename);
    }
    __except (filter_exception(GetExceptionCode(), GetExceptionInformation())) {
        printf("[!] Exception Raised While Dumping : %ls\n", filename);
        return -1;
    }

    return 0;
}

int fuzz_hwp(wchar_t* filename)
{
    int res = 0;
    dump_storage(filename);
    __asm { MOV ECX, output }
    res = scan_directory();
    __asm { MOV EDI, output }
    delete_temp_folder();
    return res;
}

int main(int argc, char** argv)
{
    int res = 0;
    if (argc < 3) {
        printf("Usage: %s <input.hwp> <output folder>\n", argv[0]);
        return 1;
    }
    printf("[-] argv[1] : %s\n", argv[1]);
    printf("[-] argv[2] : %s\n", argv[2]);

    HINSTANCE HncAppShield = LoadLibrary(L"HncAppShield.dll");
    if (!HncAppShield) {
        printf("[!] HncAppShield.dll not found\n");
        return 1;
    }

    // The internal functions are not exported, so each one is located by a
    // fixed offset from the export with ordinal 1. These offsets only match
    // the DLL build that was analyzed.
    char* base = (char*)GetProcAddress(HncAppShield, (LPCSTR)1);
    if (!base) {
        printf("[!] Export with ordinal 1 not found\n");
        return 1;
    }
    hwp_file_check1 = (HWP_FILE_CHECK1)(base + 0x9640);
    hwp_file_check2 = (HWP_FILE_CHECK2)(base + 0x1c840);
    hwp_dump = (HWP_DUMP)(base + 0xc70);
    hwp_dump_workbook = (HWP_DUMP_WORKBOOK)(base + 0xb70);
    open_storage = (OPENSTORAGE)(base + 0x9eb0);
    scan_directory = (SCAN_DIRECTORY)(base + 0x3b50);
    delete_temp_folder = (DELETE_TEMP_FOLDER)(base - 0x300);

    // set input path
    input = charToWChar(argv[1]);
    printf("[+] Input : %ls\n", input);

    // set output path
    output = charToWChar(argv[2]);
    printf("[+] Output Folder : %ls\n", output);

    res = fuzz_hwp(input);
    printf("result : %d\n", res);
    return 0;
}

The new loader unpacks the fuzzing input file into an output folder, scans the folder, and then removes the files in it.
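The loader reaches non-exported functions by adding fixed offsets to the address of an exported one. The pointer arithmetic behind that can be written as a small helper; this is a portable sketch, and resolve_from_anchor is a hypothetical name rather than anything from the loader or from the Windows API. On Windows the anchor would be the address returned by GetProcAddress.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Turn "anchor address + signed byte offset" into a function pointer of type T.
// The offsets are only meaningful for the exact binary they were measured in.
template <typename T>
T resolve_from_anchor(const void* anchor, std::ptrdiff_t offset)
{
    assert(anchor != nullptr);
    // Do the arithmetic on integers so negative offsets work as well;
    // unsigned wraparound makes "+ (-0x300)" behave like "- 0x300".
    const std::uintptr_t addr =
        reinterpret_cast<std::uintptr_t>(anchor) + static_cast<std::uintptr_t>(offset);
    return reinterpret_cast<T>(addr);
}
```

With such a helper, a lookup like the one for the scanning function would read resolve_from_anchor<SCAN_DIRECTORY>(base, 0x3b50). The result is only as trustworthy as the offset: a different build of the DLL would silently yield a pointer into unrelated code.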

After choosing this better target function, I re-ran WinAFL and got better performance, as shown below.

Version #2 running with better performance

Final Version

Although Version #2 performed better, process nudging was still present, just as with Version #1 of the loader. I needed to adjust the target function one last time before I could really start fuzzing.

Up to this point, my fuzzing input was an HWP file, which is an OLE file. It has several storages and streams inside and, as I explained, HncAppShield unpacks the HWP file into streams and scans them separately. I suspected that something during the unpacking process was slowing down fuzzing.

I started debugging again to see what was happening, and found that some of my initial HWP input files threw C++ exceptions while being unpacked. I tried to catch and ignore the exceptions inside my loader, but that didn't entirely fix the issue. So the last thing I could do was skip the unpacking process and fuzz the streams inside HWP files directly.

I created a tool that calls the unpacking function inside HncAppShield to build a corpus of streams from HWP files. I downloaded a bunch of HWP files online and used my tool to dump their streams and organize them by storage type. Now I could fuzz the function (the MEM_InspectStorage function discussed earlier) that actually scans HWP streams in memory, giving it the address of the loaded stream data and the storage type as input. WinAFL no longer has to unpack HWP files at fuzzing time. The following diagram gives a high-level overview of how I got to this point. Note that the amount of work done by WinAFL has decreased.

Fuzzing target of each loader version
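The "organize them by storage type" step of the corpus tool can be sketched as follows. This is not the author's tool: plan_corpus_layout and the index-prefix naming scheme are assumptions, and the sketch only plans the layout (which dumped stream goes into which folder) without touching the disk.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <utility>
#include <vector>

namespace fs = std::filesystem;

// Each entry maps a dumped stream file to its destination in the corpus.
using CopyPlan = std::vector<std::pair<fs::path, fs::path>>;

// Assign every dumped stream to `corpusRoot/<storage type>/` based on the
// first storage-type name that occurs in its file name. Streams that match
// no storage type are left out of the plan.
CopyPlan plan_corpus_layout(const std::vector<fs::path>& dumpedStreams,
                            const fs::path& corpusRoot,
                            const std::vector<std::string>& storageTypes)
{
    assert(!corpusRoot.empty());
    CopyPlan plan;
    for (const auto& src : dumpedStreams) {
        const std::string name = src.filename().string();
        for (const auto& type : storageTypes) {
            if (name.find(type) == std::string::npos) continue;
            // Prefix a running index so that same-named streams coming from
            // different documents do not overwrite each other.
            const std::string unique = std::to_string(plan.size()) + "_" + name;
            plan.emplace_back(src, corpusRoot / type / unique);
            break;
        }
    }
    return plan;
}
```

Each pair in the returned plan can then be handed to std::filesystem::copy_file after creating the destination folder. One folder per storage type is what lets a single type, such as BodyText, be passed to afl-fuzz as the -i input directory.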

So now I had the final version of the loader. It loads a stream file into memory and calls the memory scanning function. Its performance went up to around 80 execs per second, without process nudging.

Final version running

afl-fuzz.exe -i BodyText -o out_bodytext -D D:\DynamoRIO\bin32 -t 10000 -- -coverage_module AppShieldDLL.dll -fuzz_iterations 5000 -target_module HncAppShieldLoader.exe -target_method fuzz_storage -nargs 2 -- .\HncAppShieldLoader.exe @@ BodyText

#include <stdio.h>
#include <Windows.h>
#include <iostream>

extern "C" __declspec(dllexport) int fuzz_storage(wchar_t* filename, char* storageType);
extern "C" __declspec(dllexport) int fuzz_hwp(wchar_t* hwp);

typedef int(*INSPECT)(wchar_t* filename);
typedef int(*MEMINSPECT)(void*, int, char*);

INSPECT AppShield_InspectMalware;
MEMINSPECT memory_inspect;

wchar_t* charToWChar(const char* text)
{
    size_t size = strlen(text) + 1;
    wchar_t* wa = (wchar_t*)malloc(sizeof(wchar_t) * size);
    mbstowcs(wa, text, size);
    return wa;
}

int fuzz_hwp(wchar_t* filename)
{
    AppShield_InspectMalware(filename);
    return 0;
}

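int fuzz_storage(wchar_t* filename, char* storageType)
{
    FILE* input = NULL;
    long fileSize = 0;
    size_t readBytes = 0;
    void* fileContents = NULL;
    int ret = 0;

    // _wfopen_s returns 0 on success; the file is only read, never written
    if (_wfopen_s(&input, filename, L"rb") != 0 || !input) {
        return ret;
    }

    // Determine the stream size, then read the whole stream into memory
    fseek(input, 0, SEEK_END);
    fileSize = ftell(input);
    if (fileSize != -1)
    {
        fseek(input, 0, SEEK_SET);
        fileContents = malloc(fileSize);
        if (fileContents)
        {
            readBytes = fread(fileContents, 1, fileSize, input);
            if (readBytes != (size_t)fileSize){
                OutputDebugStringW(L"[!] File read error\n");
            }

            // Scan the in-memory stream as the given storage type (e.g. BodyText)
            ret = memory_inspect(fileContents, (int)readBytes, storageType);
        }
    }

    // Release everything so the next persistent-mode iteration starts clean
    free(fileContents);
    fclose(input);
    return ret;
}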

int main(int argc, char** argv)
{
    HINSTANCE HncAppShield = LoadLibrary(L"HncAppShield.dll");
    int isDetected = 0;
    if (HncAppShield) {
        AppShield_InspectMalware = (INSPECT)GetProcAddress(HncAppShield, (LPCSTR)1);
        memory_inspect = (MEMINSPECT)((char*)GetProcAddress(HncAppShield, (LPCSTR)1) + 0x3620);
        
        if (argc == 2) {
            isDetected = fuzz_hwp(charToWChar(argv[1]));
        }
        if (argc == 3) {
            isDetected = fuzz_storage(charToWChar(argv[1]), argv[2]);
        }
    }
    printf("[Malware result] %d\n", isDetected);
    return isDetected;
}

Conclusion

I immediately started getting unique crashes and new edges, so I let it run for about a week. After a week, I had a couple hundred unique crash files, but most of them seemed to crash inside the same function. Triaging with BugId reduced them to a small set of unique crashes. I reported some issues to KISA (KRCERT) and am currently waiting for their official reply.
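To show why a few hundred crash files can collapse into a handful, here is a minimal sketch of bucketing crashes by where they occur. It is not BugId's algorithm, which does considerably more analysis. CrashRecord and its location field are hypothetical and assume some debugger or triage tool has already produced a location label for each crashing input.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical crash record: the crashing input plus a location label
// (for example "module+offset" of the faulting instruction).
struct CrashRecord {
    std::string inputFile;
    std::string location;
};

using Buckets = std::map<std::string, std::vector<std::string>>;

// Group crashing inputs by the location at which they crash.
Buckets bucket_by_location(const std::vector<CrashRecord>& crashes)
{
    Buckets buckets;
    for (const auto& c : crashes) {
        buckets[c.location].push_back(c.inputFile);
    }
    return buckets;
}

// Keep one representative input per bucket for manual analysis.
std::vector<std::string> representatives(const Buckets& buckets)
{
    std::vector<std::string> out;
    for (const auto& kv : buckets) {
        assert(!kv.second.empty());  // a bucket only exists once a file was added
        out.push_back(kv.second.front());
    }
    return out;
}
```

Inputs that AFL counts as unique because they take different paths can still end up in the same bucket here, which matches the observation that most of the crashes landed in one function.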

HncAppShield mostly consists of memory reads and comparisons, so I doubt it contains highly valuable vulnerabilities such as code execution bugs. However, you might be able to find security check bypasses, which can be severe when combined with vulnerabilities in the main HWP module. I hope this post helps others who are new to WinAFL or who are interested in HncAppShield itself.