Malware Intelligence

Technical Analysis of Bumblebee Malware Loader

August 4, 2022

min

Table of Content

Example H2

Malware loaders are essentially remote access trojans (RATs) that establish communication between the attacker and the compromised system. Loaders typically represent the first stage of a compromise. Their primary goal is to download and execute additional payloads, from the attacker-controlled server, on the compromised system without detection.

Researchers at ProofPoint have discovered a new malware loader called Bumblebee. The malware loader is named after a unique user agent string used for C2 communication. It has been observed that adversaries have started using Bumblebee to deploy malware such as CobaltStrike beacons and Meterpreter shells. Threat group TA578 has also been using Bumblebee the loader in their campaigns.

This article explores and decodes Bumblebee malware loader’s:

Technical features
Logic flow
Exploitation process
Network maintenance
Unique features

Campaign Delivery

Adversaries push ISO files through compromised email (reply) chains, known as thread hijacked emails, to deploy the Bumblebee loader. ISO files contain a byte-to-byte copy of low-level data stored on a disk. The malicious ISO files are delivered through Google Cloud links or password protected zip folders.

ISO file retrieved from Google Cloud (“storage.googleapis.com”)

ISO file retrieved from password protected zip files

The ISO files contain a hidden DLL with random names and an LNK file. DLL (Dynamic Link Library) is a library that contains codes and data which can be used by more than one program at a time. LNK is a filename extension in Microsoft Windows for shortcuts to local files.

The LNK file often contains a direct link to an executable file or metadata about the executable file, without the need to trace the program’s full path. LNK files are an attractive alternative to opening a file, and thus an effective way for threat actors to create script-based attacks. The target location for the LNK files is set to run rundll32.exe, which will call an exported function in the associated DLL. If the “show hidden items” option is not enabled on the victim’s system, DLLs may not be visible to the user.

Bumblebee Loader Analysis

The analyzed sample (f98898df74fb2b2fad3a2ea2907086397b36ae496ef3f4454bf6b7125fc103b8) is a DLL file with exported functions.

Exported functions in the sample DLL file

Both the exported functions, IternalJob and SetPath, execute the function sub_180004AA0.

InternalJob executing the function sub_180004AA0 SetPath executing the function sub_180004AA0

Entropy of the DLL

The entropy of a file measures the randomness of the data in the file. Entropy can be used to determine whether there is hidden data or suspicious scripts in the file. The scale of entropy is from 0 (not random) to 8 (totally random). High entropy values indicate that there is encrypted data stored in the file, while lower values indicate the decryption and storage of payload in different sections during runtime.

The peak is spread across the data segments of the DLL file. It is highly possible that this peak was caused by the presence of packed data in the data segments of the sample DLL. This indicates that the malware, at some point in runtime, will fetch the data from the data segment and unpack it for later use.

Unpacking and Deploying Payload (Function sub_180004AA0)

The exported function sub_180004AA0 is a critical component in unpacking and deploying the main payload on the target system.

The function sub_180003490 serves as the unpacker for the main payload.

Function sub_180003490

Function sub_180003490 contains 2 functions of interest:

sub_1800021D0: This function routine is responsible for allocating heap memory.

sub_1800029BC: This function writes the embedded data, in the data segment of the DLL sample, into the newly allocated heap memory. The packed payload is fetched from the data segment and written into allocated heap memory. The code segment highlighted in the image below is responsible for transferring the data.

Function sub_1800029BC

The assembly code highlighted yellow transfers the embedded data (packed payload) from the data segment of DLL to an intermediate CL register.
The assembly code highlighted red transfers the data from CL to the allocated heap. During runtime, the heap memory continues to get filled with the packed payload embedded within the DLL samples.

Function sub_180002FF4

After dumping the packed payload in the allocated memory, the control goes back to sub_180004AA0 and function sub_180002FF4 is executed.

Function sub_180002FF4

Function sub_180002FF4 performs the following operations:

Allocates new heap memory.
Transfers previously dumped packed payload into newly allocated memory.
Deallocates previously allocated memory.

After the control returns to sub_180004AA0 function sub_180004180 is executed.

Function sub_180004180

Function sub_180004180

Three functions encapsulated in Function sub_180004180

Function sub_180004180 has 3 functions:

sub_180001670: This function is responsible for allocating multiple heap memories to the malware. The malware later dumps the unpacked MZ file into one of the allocated memories.
sub_180003CE4: This function is responsible for unpacking previously dumped packed payload in the process heap and dumps it into one of the memories allocated by sub_180001670.
sub_180001A84: This function is responsible for deallocating memory.

Hook Implementation

Hooking refers to a range of techniques used to modify the behavior of an operating system, software, or software component, by intercepting the function calls, events, or communication between software components. The code which handles such intercepted function calls, events, or communication is called a hook.

Right after the Bumblebee loader unpacks the main payload in the memory, it hooks a few interesting functions exported by ntdll.dll (a file containing NT kernel functions, susceptible to cyberattacks) through an in-line hooking technique. The in-line hooks play a significant role in the execution of the final payload. The trigger mechanism, for the deployment of the payload, shows the creativity of the malware developer. Function sub_180001000 is responsible for implementing the in-line hooks.

Function sub_180001000 initially saves the addresses of 3 detour functions used for hooking. The detour functions are responsible for hijacking control flow in hooked Windows functions. After storing the addresses, sub_1800025EC is executed to resolve the addresses of the target API (Application Programming Interface) functions for hooking.

Detour functions in sub_180001000 function

sub_1800025EC loads ntdll.dll in the address space of the loader process using function LoadLibraryA. Following the loading of the ntdll, function GetProcAddress is used to resolve the addresses of functions:

NtOpenFile
NtCreateSection
NtMapViewOfSection

LoadLibraryA and GetProcAddress functions

After obtaining the addresses to memory pages of the detour functions for hooking, the loader uses function VirtualProtect to change the memory permissions of the target pages. After changing the permissions, the loader writes the in-line hooks in sub_180002978. Then VirtualProtect is called again to restore the page permissions.

VirtualProtect and sub_180002978 functions

The data passed to VirtualProtect at runtime is shown in the image below. The call to VirtualProtect changes the ntdll.NtOpenFile page permission to 0x40 (PAGE_EXECUTE_READWRITE).

Data passed/call to VirtualProtect function

After changing the page permissions of ntdll.NtOpenFile, the loader modifies the initial sequence of bytes in the NtOpenFile API by executing function sub_180002978.

sub_180002978 function modifying the NtOpenFile API

In-line hooking involves the following steps:

ntdll.NtOpenFile before (hooking) execution of sub_180002978 function

After sub_180002978 is executed, a call to NtOpenFile makes the malware code jump to location 1800023D4 (detour). This is how malicious in-line hooks change the execution flow of APIs.

Call to NtOpenFile making the malware jump to 1800023D4

After writing the hook, VirtualProtect is used again to restore the page permission of ntdll.NtOpenFile to 0x20 (PAGE_EXECUTE_READ).

VirtualProtect function used to restore page permission of ntdll.NtOpenFile

The process of changing memory permission and writing in-line hooks is repeated in a do-while loop, for the rest of the target functions, NtCreateSection and NtMapViewOfSection.

Do-while loop repeating the permission and hooks process for other target functions

Summary of Hooked Functions

After successful hooking, whenever target functions are called in the address space of the loader process, the control flow is transferred to the in-line the respective hook addresses:

Target Function	In-line Hook (Detours)
ntdll.NtOpenFile	1800023D4
ntdll.NtCreateSection	1800041EC
ntdll.NtMapViewOfSection	180001D4C

Loading gdiplus.dll is Unique to Bumblebee

The final function executed by the loader is sub_1800013A0. The malware uses the function LoadLibraryW to load the DLL module. It then uses the function GetProcAddress to obtain the address of a specific function exported by the library loaded.

This plays a crucial step in deployment of the main payload on the victim system. Unlike TTPs (Tactics, Techniques, and Procedures) of common malware loaders, this is where the Bumblebee loader gets creative.

Function sub_1800013A0 with LoadLibraryW and GetProcAddress functions

The module gdiplus.dll is loaded into the process memory address space. Gdiplus.dll is an important module, containing libraries that support the GDI Window Manager, in the Microsoft Windows OS.

Runtime execution of function sub_1800013A0

The module gdiplus.dll is executed in the last function of the malware loader. This is the first instance in which the unpacked MZ payload is used directly by the loader. Hence, the loading of this module appears suspicious. Also, an unusual base address (0x1d54fd0000) is assigned to the loaded gdiplus.dll module.

Unusual base address assigned to gdiplus.dll

By further examining the suspicious memory, it was found that the address is a mapped page with RWX permission in the loader address space. This is a classic use case of hollowing where the module content is replaced with unpacked malicious artifacts.

Address as a mapped page with RWX permission

But in our analysis so far we have not come across any code that does the hollowing. Then how did the malware change the contents of the gdiplus.dll? Interestingly this is where the malware developer decided to get creative! The hooking seen earlier is responsible for hollowing the loaded module with the unpacked payload. More details about the same are covered in the following section.

Investigating the Hooks and the Trigger

As seen in the previous section, the malware hooks 3 specific APIs:

NtOpenFile
NtCreateSection
NtMapViewOfSection

The API selection is not random. The internal working of loading any DLL via LoadLibrary API uses the 3 functions mentioned above. Hooking these functions gives the malware the flexibility to deploy the unpacked payload covertly. This feature makes it difficult for researchers to hunt the main payload.

The detour function at 0x180001D4C is used to hook function NtMapViewOfSection, which lays the groundwork for hollowing the loaded module (in this case, gdiplus.dll) with the unpacked Bumblebee binary. The detour function is capable of the following actions:

Section object creation via NtCreateSection API
Mapping of the view of gdiplus.dll to loader address space via NtMapViewOfSection
Writing the unpacked payload into the mapped view of gdiplus.dll
Deallocating heap memory that holds unpacked payload from earlier steps

The implementation of the detour function at 0x180001D4C, shows the use of a pointer to the NtCreateSection API, for creating a section object to be used later in mapping the gdiplus.dll module.

Pointer to NtCreateSection API

After creating a section object, the detour function calls NtMapViewOfSection, via a pointer. Now the view for the section is created by the system. The function sub_180002E74 is responsible for filling the mapped view with an unpacked payload.

Pointer to NtMapViewOfSection along with sub_180002E74 function

The address of the mapped view, returned by NtMapViewOfSection pointer in the loader process, which is 0x1D54F5D0000, is the same address seen while examining the process modules.

Address of the mapped view returned by NtMapViewOfSection

Unusual base address assigned to “gdiplus.dll” as seen earlier

The mapped view starts from 0x1D54F5D0000. The loader dumps the unpacked payload here, hollowing gdiplus.dll. Hence, the final Bumblebee payload stays hidden inside the loaded module gdiplus.dll.

Right after mapping the view, the detour function executes sub_180002E74 to initiate the writing of the unpacked binary.

Function sub_180002E74 responsible for filling the mapped view with the final payload

The hooks get activated as soon as the loader loads the gdiplus.dll module via LoadLibraryW API. Then the payload is covertly loaded into the gdiplus.dll module. The final payload is a DLL, hence the loader has to explicitly call an exported function to trigger the execution.

In this case, the loader obtains the address of exported function SetPath via function GetProcAddress. The control is then transferred to the final payload by the final call to SetPath, by providing the loader program name as argument.

Loader obtains the address of exported function “SetPath” via GetProcAddress

The image below shows the function SetPath exported by the unpacked Bumblebee payload.

SetPath Function

Bumblebee Main Payload Analysis

The core malicious component of the bumblebee is executed in the memory, when the hollowed gdiplus.dll is loaded via the LoadLibrary API. When the module is loaded into memory, the function DllMain creates a new thread and executes sub_180008EC0 routine.

The DllMain function of the bumblebee payload

sub_180008EC0 routine is quite a large function that is responsible for all the malicious activities performed by Bumblebee on the compromised system.

Function sub_180008EC0 logic flow

Anti VM Checks

The first activity performed by sub_180008EC0 is to check for a virtual machine (VM) environment. If the function returns True, then Bumblebee shuts itself down by executing the ExitProcess function.

sub_18003DA0 performs VM check

The VM checking routine is. Rigorous. It employs various techniques to ensure that the malware is not running in a sandbox environment used by security researchers. Some of the interesting features are:

Iterating through running processes via functions CreateToolHelp32Snapshot, Process32FirstW, and Process32NextW.

Malware functions which help in iterating through running processes

Each running process is compared to a list of program names.

Running process being compared to the list of program names

The malware also checks for specific usernames used in sandboxed environments to confirm the absence of a VM.

Malware checking for specific usernames

The VM check routine also enumerates active system services running via the OpenSCManagerW API. The names of common services used by VM softwares are stored in an array.

Enumerating active system services running via OpenSCManagerW

It also scans the system directory for common drivers and library files used by VM applications.

System check for common drivers and library files used by popular VM applications

The routine also checks for named pipes to identify the presence of VM.

Checking for named pipes

These are a few examples of techniques employed by the malware to identify analysis environments. It also has other functionalities built such as the use of WMI and registry functionalities to identify hardware information to check for the presence of VM environments installed on the target system.

Event Creation

After VM checks, if it is secure to continue, the malware creates an event. The event ID is 3C29FEA2-6FE8-4BF9-B98A-0E3442115F67. This is used for thread synchronization.

The event created by the malware

Persistence

The malware uses wsript.exe as a persistence vector to run the malware each time the user logs into the system. The VB instruction is written into a .vbs file. This is performed when the C2 sends the “ins” command as a task to execute on the system.

Wsript.exe

VB instruction written into a .vbs file

Token Manipulation

The malware performs token manipulation to escalate its privilege on the target system by granting the malware process a SeDebugPrivilege. With this privilege the malware can perform arbitrary read/write operations.

Malware is given the “SeDebugPrivilege”

The malware is capable of performing code injections to deploy malicious code in running processes using various APIs. The malware dynamically retrieves the addresses of the APIs needed for the code injection. The core bumblebee payload comes with embedded files which areinjected into the running process to further attack the victim.

List of APIs used to perform code injections

Code Injection Via NtQueueApcThread

When the malware receives the command along with a DLL buffer, which gets injected, the malware starts scanning for a list of processes on the system. One of the executables in the list is randomly chosen to inject the malicious DLL.

Malware looking for the list of processes on the system

List of executables

Following the code injection, the malware:

Creates a process from the previously selected executable image via COM (Component Object Model), in which access to an object’s data is received through interfaces, in a suspended state.
Enumerates through the running process via the CreateToolhelp32Snapshot API to find the newly spawned process created in the previous step.
When the process is found, the malware manipulates the token and acquires the SeDebugPrivilege token to perform further memory manipulation.
If token manipulation is successful, the malware injects a shellcode into the process to make it go to sleep.

Malware creating a process and injecting shellcode into it

Function sub_180037A80 is responsible for performing the shellcode injection into the spawned process in the suspended state.

Function sub_180037A80

After injecting the shellcode into the process, the malware resumes the process. It then executes function sub_18003A9BC to finally inject malicious DLL by creating multiple memory sections and views.

Executing sub_18003A9BC function to inject malicious DLL

The DLL code is executed via the NtQueueApcThread API, which is dynamically resolved during the execution.

DLL code executed via NtQueueApcThread API

C2 Network

Command and Control Infrastructure, also known as C2 or C&C, is a collection of tools and techniques used to maintain contact with a compromised system of devices after the initial access has been gained. The IP address of the C2 can be retrieved from the payload code as shown below.

Retrieving the IP address of C2

The C2 periodically sends out tasks to the agent to be executed on the system. The malware extensively uses WMI (Windows Management Infrastructure) to collect basic victim information like domain name and user name, and sends the compromised information to the C2. The C2 distinguishes active agents based on the client ID assigned to each one.

Data transferred in C2 communication

Interestingly, the user agent string used by the malware for communication is “bumblebee”.

Outbound Traffic

Data transferred out of the compromised system

Client Parameters

client-id
group_name
sys_version
User name
client_version

Inbound Traffic

Commands received by the compromised system

Client Parameters

response_status
tasks

Commands Supported

The task field in the C2 response will contain one of the following commands:

Command	Description
dex	Downloads executable
sdl	Kill Loader
ins	Persistence
dij	DLL inject

A Tale of Bundled DLLs and Hooks

The core payload comes with two DLLs embedded in the binary. The purpose and function of both the DLLs are the same, but one is 32 bit and the other is 64 bit. These are used to perform further hooking and control flow manipulations.

DLL Signatures (SHA256)

32 bit: B9534DDEA8B672CF2E4F4ABD373F5730C7A28FE2DD5D56E009F6E5819E9E9615
64 bit: 1333CC4210483E7597B26042B8FF7972FD17C23488A06AD393325FE2E098671B

In this section we will look into the inner workings of embedded 32 bit DLL. The module looks for a specific set of functions in ntdll.dll, kernel32.dll, kernelbase.dll, and advapi32.dll to later remove any hooks present in the code. This will also remove any EDR/AV (Endpoint Detection and Response/ Antivirus) implemented hooks used for monitoring.

Functions in ntdll.dll checked for existing hooks

Functions in kernel32.dll checked for existing hooks

In kernelbase32.dll following functions are checked for any already existing hooks:

Functions in kernelbase32.dll checked for existing hooks

Functions in advapi32.dll checked for existing hooks

The Unhooking Mechanism

The unhooking process involves the following steps:

The module retrieves handles to target DLLs via the GetModuleHandleW API. The handle returned by the API is for the DLL loaded in the memory by the malware process, i.e. the process responsible for executing the bumble loader, which is rundll32.exe.
Then the malware constructs the absolute path for target DLLs via the LetSystemDirectoryA API, to access the system32 directory, where all system DLLs are located.
A pointer to NtProtectVirtualMemory is computed following the DLL path generation.
Function sub_10005B90 is called to do the unhooking. Parameters passed to the function are:
- First Arg: Absolute path to target DLL
- Second Arg: Handle to already loaded target DLL
- Third Arg: Offset to array holding target functions exported by the target DLL
- Fourth Arg: Null
- Fifth Arg: Pointer to NtProtectVirtualMemory

Steps for Unhooking Mechanism

Function sub_10005B90 performs the following operations:

Maps fresh copy of the target DLL from the hard disk to address space of the malware process via functions CreateFileA, CreateFileMappingA, and MapViewOfFile.
Calls function sub_10005D40 to perform unhooking. The following data is passed to the function:
- First Arg: Mapped Address of fresh copy of DLL
- Second Arg: Same as sub_10005B90
- Third Arg: Same as sub_10005B90
- Fourth Arg: Same as sub_10005B90
- Fifth Arg: Same as sub_10005B90
After unhooking, the mapped view is released via the UnMapViewOfFile API.

Operations performed by function sub_10005B90

The logic used for unhooking is straightforward. The malware compares the target function in the loaded module in memory against the function defined in the mapped module via MapViewOfFile. If both the codes don’t match, the content from the mapped module is written to the loaded module, to restore the state to that of the mapped version from the hard disk.

The malware goes through the exports of the loaded DLL and performs a string match against the set of function names stored as an array in a loop. The sub_10005930 is responsible for string matching.

String match against the set of function names

When the function name in the array of the malware matches the exported function from the loaded module, the flag is set to [v8] and breaks from the loop. This occurs in the following steps:

The malware stores the addresses of functions from both modules(loaded and mapped).
Then the loaded and mapped function codes are checked for hooks, by identifying dissimilarities in the code. If the loaded code is the same as the mapped one, it breaks from the loop and continues to iterate through the remaining functions.

Malware matches the exported function

If the loaded code is not the as same as the mapped code, then the following operations are performed by the malware for unhooking:

VirtualQueryEx API is called to retrieve the base address of the page containing the target function.
Then NtProtectVirtualMemory API is used for changing permissions of the page containing the function code (READ_WRITE_EXECUTE).
VirtualQuery is used again to check for permission; whether the page is writable or not.
Function sub_10005890 is called to restore the loaded module with the contents of the mapped module. Now the functions in the mapped and loaded modules are in the same state.

Malware does not match the exported function

After clearing all the hooks in the selected functions, the malware installs hooks.

Functions RaiseFailFastException from kernel32.dll and api-ms-win-core-errorhandling-l1-1-2.dll are hooked. Then the detour function sub_100057F0 hijacks the control flow when the above functions are called by the system after hooking is done by the malware.

Installing hooks

Function sub_100057F0 simply returns the call.

Function sub_100057F0

The embedded DLL has a hooking strategy similar to that of the Bumblebee loader. Various functions used by the system, while loading a DLL module, are hooked and wups.dll is loaded to trigger the chain.

Hooking of the functions used while loading DLL and loading of wups.dll

Target API	Detour Function
ZwMapViewOfSection	sub_10004C50
ZwOpenSection	sub_10004FF0
ZwCreateSection	sub_10004BC0
ZwOpenFile	sub_10004F20

Code Upgrades In The Wild

After analyzing many samples in the wild we observed code modifications in the loader.

Prominent code modifications done in Bumblebee loader ever since its discovery

The extreme left sample in the image above is the one we have covered in this report. As we can see from the logic flow of the loader, the malware developer has modified the loader code in the other two samples. All the samples observed in the wild are 64 bit DLL modules with an exported function that has a randomly generated string as the function name. This can be justified by the fact that code plays a major role in whether the malware is detected by security products. To circumvent this hurdle, malware developers make changes to the code and the malware design.

Newer loader samples in the wild contain various payloads, such as cobaltStrike beacons and Meterpreter shells, unlike the custom bumblebee payload seen in the first generation.