Malware loaders are essentially remote access trojans (RATs) that establish communication between the attacker and the compromised system. Loaders typically represent the first stage of a compromise. Their primary goal is to download additional payloads from an attacker-controlled server and execute them on the compromised system without detection.
Researchers at Proofpoint have discovered a new malware loader called Bumblebee. The loader is named after a unique user agent string used for C2 communication. Adversaries have been observed using Bumblebee to deploy malware such as Cobalt Strike beacons and Meterpreter shells. The threat group TA578 has also been using the Bumblebee loader in its campaigns.
This article explores and decodes Bumblebee malware loader’s:
- Technical features
- Logic flow
- Exploitation process
- Network maintenance
- Unique features
Adversaries push ISO files through compromised email (reply) chains, known as thread-hijacked emails, to deploy the Bumblebee loader. ISO files contain a byte-for-byte copy of the low-level data stored on a disk. The malicious ISO files are delivered through Google Cloud links or password-protected zip folders.
The ISO files contain a hidden DLL with a random name and an LNK file. A DLL (Dynamic Link Library) is a library that contains code and data which can be used by more than one program at a time. LNK is a filename extension in Microsoft Windows for shortcuts to local files.
The LNK file often contains a direct link to an executable file or metadata about the executable file, without the need to trace the program’s full path. LNK files are an attractive alternative to opening a file, and thus an effective way for threat actors to create script-based attacks. The target location for the LNK files is set to run rundll32.exe, which will call an exported function in the associated DLL. If the “show hidden items” option is not enabled on the victim’s system, DLLs may not be visible to the user.
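The shortcut behavior described above can be checked from the defender's side. The following is a minimal sketch, assuming the LNK target has already been extracted as a command-line string. The function name `parse_rundll32` and the DLL name `example.dll` are hypothetical and used only for illustration.

```python
import re
from typing import Optional, Tuple

# Matches: rundll32[.exe] <dll path>,<export name>
# Quotes around the DLL path are optional.
RUNDLL32_RE = re.compile(
    r'rundll32(?:\.exe)?\s+"?(?P<dll>[^",]+?)"?\s*,\s*(?P<export>[A-Za-z_#][\w#]*)',
    re.IGNORECASE,
)


def parse_rundll32(command_line: str) -> Optional[Tuple[str, str]]:
    """Return (dll_path, export_name) if the command line invokes rundll32, else None."""
    match = RUNDLL32_RE.search(command_line)
    if match is None:
        return None
    return match.group("dll").strip(), match.group("export")


# Hypothetical shortcut target; the DLL name is illustrative only.
print(parse_rundll32(r"C:\Windows\System32\rundll32.exe example.dll,SetPath"))
```

A shortcut inside a mounted ISO whose target resolves to rundll32.exe with a hidden DLL in the same volume is worth a closer look.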
Bumblebee Loader Analysis
The analyzed sample
(f98898df74fb2b2fad3a2ea2907086397b36ae496ef3f4454bf6b7125fc103b8) is a DLL file with exported functions.
Exported functions in the sample DLL file
Both exported functions, InternalJob and SetPath, execute the function sub_180004AA0.
InternalJob and SetPath each executing the function sub_180004AA0
Entropy of the DLL
The entropy of a file measures the randomness of the data in the file. Entropy can be used to determine whether there is hidden data or suspicious scripts in the file. The scale of entropy is from 0 (not random) to 8 (totally random). High entropy values suggest that encrypted or packed data is stored in the file, while lower values indicate the decryption and storage of payload in different sections during runtime.
The peak is spread across the data segments of the DLL file. It is highly possible that this peak was caused by the presence of packed data in the data segments of the sample DLL. This indicates that the malware, at some point in runtime, will fetch the data from the data segment and unpack it for later use.
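The 0 to 8 scale mentioned above is Shannon entropy measured in bits per byte. A minimal sketch of how it can be computed follows, both for a whole buffer and per fixed-size window so that peaks can be located. The 256-byte window size and the function names are assumptions, not details of the tooling used in this analysis.

```python
import math
import os
from collections import Counter
from typing import List


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0 (constant) to 8 (uniformly random)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def windowed_entropy(data: bytes, window: int = 256) -> List[float]:
    """Entropy of each consecutive window, useful for spotting packed regions."""
    return [shannon_entropy(data[i:i + window]) for i in range(0, len(data), window)]


# Synthetic buffer: zero padding followed by random bytes standing in for packed data.
buffer = b"\x00" * 512 + os.urandom(512)
for index, value in enumerate(windowed_entropy(buffer)):
    print(f"window {index}: {value:.2f}")
```

Windows covering the random half score far higher than the zero-filled half, which is the kind of peak described for the data segments of the sample.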
Unpacking and Deploying Payload (Function sub_180004AA0)
The function sub_180004AA0, which both exported functions call, is a critical component in unpacking and deploying the main payload on the target system.
The function sub_180003490 serves as the unpacker for the main payload.
Function sub_180003490 contains two functions of interest:
- sub_1800021D0: This routine is responsible for allocating heap memory.
- sub_1800029BC: This function fetches the packed payload embedded in the data segment of the sample DLL and writes it into the newly allocated heap memory.
The code segment highlighted in the image below is responsible for transferring the data:
- The assembly code highlighted yellow transfers the embedded data (packed payload) from the data segment of DLL to an intermediate CL register.
- The assembly code highlighted red transfers the data from CL to the allocated heap. During runtime, the heap memory continues to get filled with the packed payload embedded within the DLL samples.
After dumping the packed payload in the allocated memory, the control goes back to sub_180004AA0 and function sub_180002FF4 is executed.
Function sub_180002FF4 performs the following operations:
- Allocates new heap memory.
- Transfers previously dumped packed payload into newly allocated memory.
- Deallocates previously allocated memory.
After control returns to sub_180004AA0, function sub_180004180 is executed.
Three functions encapsulated in Function sub_180004180
Function sub_180004180 has 3 functions:
- sub_180001670: This function is responsible for allocating multiple heap regions for the malware. The malware later dumps the unpacked MZ file into one of these regions.
- sub_180003CE4: This function is responsible for unpacking the previously dumped packed payload in the process heap and dumping the result into one of the regions allocated by sub_180001670.
- sub_180001A84: This function is responsible for deallocating memory.
Hooking refers to a range of techniques used to modify the behavior of an operating system, software, or software component, by intercepting the function calls, events, or communication between software components. The code which handles such intercepted function calls, events, or communication is called a hook.
Right after the Bumblebee loader unpacks the main payload in memory, it hooks a few interesting functions exported by ntdll.dll (the user-mode library that exposes the Windows Native API) through an in-line hooking technique. The in-line hooks play a significant role in the execution of the final payload. The trigger mechanism for the deployment of the payload shows the creativity of the malware developer. Function sub_180001000 is responsible for implementing the in-line hooks.
Function sub_180001000 initially saves the addresses of 3 detour functions used for hooking. The detour functions are responsible for hijacking control flow in hooked Windows functions. After storing the addresses, sub_1800025EC is executed to resolve the addresses of the target API (Application Programming Interface) functions for hooking.
Detour functions in sub_180001000 function
sub_1800025EC loads ntdll.dll into the address space of the loader process using the function LoadLibraryA. After ntdll.dll is loaded, the function GetProcAddress is used to resolve the addresses of the target functions NtOpenFile, NtCreateSection, and NtMapViewOfSection.
LoadLibraryA and GetProcAddress functions
After resolving the addresses of the target functions, the loader uses the function VirtualProtect to change the memory permissions of the pages that contain them. It then writes the in-line hooks via sub_180002978 and calls VirtualProtect again to restore the page permissions.
VirtualProtect and sub_180002978 functions
The data passed to VirtualProtect at runtime is shown in the image below. The call to VirtualProtect changes the ntdll.NtOpenFile page permission to 0x40 (PAGE_EXECUTE_READWRITE).
Data passed/call to VirtualProtect function
After changing the page permissions of ntdll.NtOpenFile, the loader modifies the initial sequence of bytes in the NtOpenFile API by executing function sub_180002978.
sub_180002978 function modifying the NtOpenFile API
In-line hooking involves the following steps:
ntdll.NtOpenFile before (hooking) execution of sub_180002978 function
- After sub_180002978 is executed, a call to NtOpenFile makes the malware code jump to location 1800023D4 (detour). This is how malicious in-line hooks change the execution flow of APIs.
Call to NtOpenFile making the malware jump to 1800023D4
- After writing the hook, VirtualProtect is used again to restore the page permission of ntdll.NtOpenFile to 0x20 (PAGE_EXECUTE_READ).
VirtualProtect function used to restore page permission of ntdll.NtOpenFile
- The process of changing memory permission and writing in-line hooks is repeated in a do-while loop, for the rest of the target functions, NtCreateSection and NtMapViewOfSection.
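From an analyst's perspective, an in-line hook of this kind can often be recognized from the first bytes of the patched function. The sketch below decodes a few common x64 jump encodings from a prologue dumped with a debugger. Which encoding Bumblebee writes is not stated above, so the patterns are assumptions about typical hooks, and the byte sequences in the example are constructed for illustration, with the syscall number chosen arbitrarily.

```python
import struct
from typing import Optional


def hook_target(prologue: bytes, func_addr: int) -> Optional[int]:
    """Return the jump destination if the prologue starts with a known hook pattern."""
    # E9 rel32: jmp relative to the end of the 5-byte instruction.
    if len(prologue) >= 5 and prologue[0] == 0xE9:
        (rel,) = struct.unpack_from("<i", prologue, 1)
        return (func_addr + 5 + rel) & 0xFFFFFFFFFFFFFFFF
    # 48 B8 imm64 / FF E0: mov rax, imm64 followed by jmp rax.
    if len(prologue) >= 12 and prologue[:2] == b"\x48\xB8" and prologue[10:12] == b"\xFF\xE0":
        (target,) = struct.unpack_from("<Q", prologue, 2)
        return target
    # FF 25 00 00 00 00 / addr64: jmp qword ptr [rip+0] with the address stored inline.
    if len(prologue) >= 14 and prologue[:6] == b"\xFF\x25\x00\x00\x00\x00":
        (target,) = struct.unpack_from("<Q", prologue, 6)
        return target
    return None


# An unhooked x64 ntdll stub normally begins with mov r10, rcx; mov eax, <syscall number>.
clean = bytes.fromhex("4c8bd1b833000000")
# Constructed example: absolute jump to the detour address reported in this article.
hooked = b"\x48\xB8" + struct.pack("<Q", 0x1800023D4) + b"\xFF\xE0"

print(hook_target(clean, 0x7FF800000000))
print(hex(hook_target(hooked, 0x7FF800000000)))
```

A destination that falls outside ntdll.dll, such as an address inside the loader DLL, is a strong sign that the function has been detoured.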
Summary of Hooked Functions
After successful hooking, whenever the target functions are called in the address space of the loader process, the control flow is transferred to the respective in-line hook (detour) addresses:
|Target Function||In-line Hook (Detours)|
Loading gdiplus.dll is Unique to Bumblebee
The final function executed by the loader is sub_1800013A0. The malware uses the function LoadLibraryW to load the DLL module. It then uses the function GetProcAddress to obtain the address of a specific function exported by the library loaded.
This is a crucial step in the deployment of the main payload on the victim system. Unlike the TTPs (Tactics, Techniques, and Procedures) of common malware loaders, this is where the Bumblebee loader gets creative.
Function sub_1800013A0 with LoadLibraryW and GetProcAddress functions
The module gdiplus.dll is loaded into the process address space. gdiplus.dll is a legitimate Microsoft Windows module that implements the GDI+ graphics library.
Runtime execution of function sub_1800013A0
The module gdiplus.dll is loaded in the last function of the malware loader. This is the first instance in which the unpacked MZ payload is used directly by the loader. Hence, the loading of this module appears suspicious. Also, an unusual base address (0x1D54F5D0000) is assigned to the loaded gdiplus.dll module.
Unusual base address assigned to gdiplus.dll
By further examining the suspicious memory, it was found that the address is a mapped page with RWX permission in the loader address space. This is a classic use case of hollowing where the module content is replaced with unpacked malicious artifacts.
Address as a mapped page with RWX permission
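When triaging a memory map exported from a debugger, the raw protection and type values can be decoded and checked for this pattern. The sketch below uses the documented Windows memory protection and memory type constants. The helper names and the rule in `looks_hollowed` (a module region that is not image-backed and is RWX) are an illustrative heuristic, not the method used in this analysis.

```python
# Memory type constants reported by VirtualQuery.
MEM_PRIVATE = 0x20000
MEM_MAPPED = 0x40000
MEM_IMAGE = 0x1000000

# Base page protection constants (modifier flags such as PAGE_GUARD are masked off).
PAGE_PROTECTIONS = {
    0x01: "PAGE_NOACCESS",
    0x02: "PAGE_READONLY",
    0x04: "PAGE_READWRITE",
    0x08: "PAGE_WRITECOPY",
    0x10: "PAGE_EXECUTE",
    0x20: "PAGE_EXECUTE_READ",
    0x40: "PAGE_EXECUTE_READWRITE",
    0x80: "PAGE_EXECUTE_WRITECOPY",
}


def describe_protection(protect: int) -> str:
    """Translate a raw protection value into its constant name."""
    base = protect & 0xFF
    return PAGE_PROTECTIONS.get(base, f"UNKNOWN(0x{base:02X})")


def looks_hollowed(region_type: int, protect: int) -> bool:
    """Heuristic: a region listed as a module should be MEM_IMAGE and not RWX."""
    return region_type != MEM_IMAGE and (protect & 0xFF) == 0x40


print(describe_protection(0x40), looks_hollowed(MEM_MAPPED, 0x40))
print(describe_protection(0x20), looks_hollowed(MEM_IMAGE, 0x20))
```

A normally loaded system DLL is image-backed with per-section protections, so a module whose base region is mapped and RWX stands out immediately.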
So far, however, the analysis has not revealed any code that performs the hollowing. How, then, did the malware change the contents of gdiplus.dll? This is where the malware developer decided to get creative: the hooks seen earlier are responsible for hollowing the loaded module with the unpacked payload. The following section covers this in more detail.
Investigating the Hooks and the Trigger
As seen in the previous section, the malware hooks 3 specific APIs: NtOpenFile, NtCreateSection, and NtMapViewOfSection.
The API selection is not random. The internal working of loading any DLL via LoadLibrary API uses the 3 functions mentioned above. Hooking these functions gives the malware the flexibility to deploy the unpacked payload covertly. This feature makes it difficult for researchers to hunt the main payload.
The detour function at 0x180001D4C is used to hook function NtMapViewOfSection, which lays the groundwork for hollowing the loaded module (in this case, gdiplus.dll) with the unpacked Bumblebee binary. The detour function is capable of the following actions:
- Section object creation via NtCreateSection API
- Mapping of the view of gdiplus.dll to loader address space via NtMapViewOfSection
- Writing the unpacked payload into the mapped view of gdiplus.dll
- Deallocating heap memory that holds unpacked payload from earlier steps
The implementation of the detour function at 0x180001D4C uses a pointer to the NtCreateSection API to create a section object, which is later used to map the gdiplus.dll module.
Pointer to NtCreateSection API
After creating a section object, the detour function calls NtMapViewOfSection, via a pointer. Now the view for the section is created by the system. The function sub_180002E74 is responsible for filling the mapped view with an unpacked payload.
Pointer to NtMapViewOfSection along with sub_180002E74 function
The address of the mapped view returned through the NtMapViewOfSection pointer in the loader process is 0x1D54F5D0000, the same address seen while examining the process modules.
Address of the mapped view returned by NtMapViewOfSection
Unusual base address assigned to “gdiplus.dll” as seen earlier
The mapped view starts from 0x1D54F5D0000. The loader dumps the unpacked payload here, hollowing gdiplus.dll. Hence, the final Bumblebee payload stays hidden inside the loaded module gdiplus.dll.
Right after mapping the view, the detour function executes sub_180002E74 to initiate the writing of the unpacked binary.
Function sub_180002E74 responsible for filling the mapped view with the final payload
The hooks get activated as soon as the loader loads the gdiplus.dll module via LoadLibraryW API. Then the payload is covertly loaded into the gdiplus.dll module. The final payload is a DLL, hence the loader has to explicitly call an exported function to trigger the execution.
In this case, the loader obtains the address of exported function SetPath via function GetProcAddress. The control is then transferred to the final payload by the final call to SetPath, by providing the loader program name as argument.
Loader obtains the address of exported function “SetPath” via GetProcAddress
The image below shows the function SetPath exported by the unpacked Bumblebee payload.
Bumblebee Main Payload Analysis
The core malicious component of Bumblebee is executed in memory when the hollowed gdiplus.dll is loaded via the LoadLibrary API. When the module is loaded into memory, the function DllMain creates a new thread and executes the sub_180008EC0 routine.
The DllMain function of the Bumblebee payload
The sub_180008EC0 routine is a large function that is responsible for all the malicious activities performed by Bumblebee on the compromised system.
Function sub_180008EC0 logic flow
Anti VM Checks
The first activity performed by sub_180008EC0 is to check for a virtual machine (VM) environment. If the function returns True, then Bumblebee shuts itself down by executing the ExitProcess function.
sub_18003DA0 performs VM check
The VM checking routine is rigorous. It employs various techniques to ensure that the malware is not running in a sandbox environment used by security researchers. Some of the interesting features are:
- Iterating through running processes via the functions CreateToolhelp32Snapshot, Process32FirstW, and Process32NextW.
Malware functions which help in iterating through running processes
- Each running process is compared to a list of program names.
Running process being compared to the list of program names
- The malware also checks for specific usernames used in sandboxed environments to confirm the absence of a VM.
Malware checking for specific usernames
- The VM check routine also enumerates the active system services via the OpenSCManagerW API. The names of common services used by VM software are stored in an array.
Enumerating active system services running via OpenSCManagerW
- It also scans the system directory for common drivers and library files used by VM applications.
System check for common drivers and library files used by popular VM applications
- The routine also checks for named pipes to identify the presence of VM.
Checking for named pipes
These are a few examples of the techniques employed by the malware to identify analysis environments. It also uses WMI and registry queries to collect hardware information and check whether it is running inside a VM environment.
After VM checks, if it is secure to continue, the malware creates an event. The event ID is 3C29FEA2-6FE8-4BF9-B98A-0E3442115F67. This is used for thread synchronization.
The event created by the malware
The malware uses wscript.exe as a persistence vector to run the malware each time the user logs into the system. The VB instruction is written into a .vbs file. This is performed when the C2 sends the “ins” command as a task to execute on the system.
VB instruction written into a .vbs file
The malware performs token manipulation to escalate its privileges on the target system by granting the malware process SeDebugPrivilege. With this privilege, the malware can perform arbitrary read/write operations on the memory of other processes.
Malware is given the “SeDebugPrivilege”
The malware is capable of performing code injection to deploy malicious code in running processes using various APIs. The malware dynamically retrieves the addresses of the APIs needed for the code injection. The core Bumblebee payload comes with embedded files which are injected into the running process to further attack the victim.
List of APIs used to perform code injections
Code Injection Via NtQueueApcThread
When the malware receives the injection command along with the DLL buffer to be injected, it checks a list of executables against the system. One of the executables in the list is randomly chosen as the target for the malicious DLL.
Malware looking for the list of processes on the system
List of executables
The injection then proceeds as follows. The malware:
- Creates a process in a suspended state from the previously selected executable image via COM (Component Object Model, in which an object’s data is accessed through interfaces).
- Enumerates the running processes via the CreateToolhelp32Snapshot API to find the newly spawned process created in the previous step.
- Once the process is found, manipulates the token and acquires SeDebugPrivilege to perform further memory manipulation.
- If token manipulation is successful, injects shellcode into the process to make it go to sleep.
Malware creating a process and injecting shellcode into it
Function sub_180037A80 is responsible for performing the shellcode injection into the spawned process in the suspended state.
After injecting the shellcode into the process, the malware resumes the process. It then executes function sub_18003A9BC to finally inject the malicious DLL by creating multiple memory sections and views.
Executing sub_18003A9BC function to inject malicious DLL
The DLL code is executed via the NtQueueApcThread API, which is dynamically resolved during the execution.
DLL code executed via NtQueueApcThread API
Command and Control Infrastructure, also known as C2 or C&C, is a collection of tools and techniques used to maintain contact with a compromised system of devices after the initial access has been gained. The IP address of the C2 can be retrieved from the payload code as shown below.
Retrieving the IP address of C2
The C2 periodically sends out tasks to the agent to be executed on the system. The malware extensively uses WMI (Windows Management Instrumentation) to collect basic victim information, such as the domain name and user name, and sends the collected information to the C2. The C2 distinguishes active agents based on the client ID assigned to each one.
Data transferred in C2 communication
Interestingly, the user agent string used by the malware for communication is “bumblebee”.
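This distinctive user agent makes a simple network-side check possible. The following is a minimal sketch, assuming proxy or web gateway logs have already been parsed into dictionaries with a `user_agent` field. The record layout, field names, and addresses are hypothetical.

```python
from typing import Dict, Iterable, List

SUSPICIOUS_USER_AGENT = "bumblebee"


def is_bumblebee_user_agent(user_agent: str) -> bool:
    """Exact, case-insensitive match on the user agent reported for the loader."""
    return user_agent.strip().lower() == SUSPICIOUS_USER_AGENT


def flag_requests(records: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Return the log records whose user agent matches the loader's string."""
    return [r for r in records if is_bumblebee_user_agent(r.get("user_agent", ""))]


# Hypothetical parsed log records.
sample_records = [
    {"src": "10.0.0.5", "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"},
    {"src": "10.0.0.9", "user_agent": "bumblebee"},
]
for hit in flag_requests(sample_records):
    print("suspicious request from", hit["src"])
```

Because a user agent is trivial for the malware author to change, a match is a useful lead rather than a complete detection.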
Data transferred out of the compromised system
- User name
Commands received by the compromised system
The task field in the C2 response will contain one of the following commands:
A Tale of Bundled DLLs and Hooks
The core payload comes with two DLLs embedded in the binary. The purpose and function of both DLLs are the same, but one is 32-bit and the other is 64-bit. These are used to perform further hooking and control flow manipulation.
DLL Signatures (SHA256)
- 32 bit: B9534DDEA8B672CF2E4F4ABD373F5730C7A28FE2DD5D56E009F6E5819E9E9615
- 64 bit: 1333CC4210483E7597B26042B8FF7972FD17C23488A06AD393325FE2E098671B
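Hashes like these can be used to sweep a file system or a sample store. Here is a minimal sketch using the SHA-256 values given in this article for the analyzed loader sample and the two embedded DLLs. The function names and labels are hypothetical.

```python
import hashlib
from typing import Optional

# SHA-256 values taken from this article, stored in lowercase for comparison.
KNOWN_HASHES = {
    "f98898df74fb2b2fad3a2ea2907086397b36ae496ef3f4454bf6b7125fc103b8": "analyzed loader sample",
    "b9534ddea8b672cf2e4f4abd373f5730c7a28fe2dd5d56e009f6e5819e9e9615": "embedded 32-bit DLL",
    "1333cc4210483e7597b26042b8ff7972fd17c23488a06ad393325fe2e098671b": "embedded 64-bit DLL",
}


def sha256_bytes(data: bytes) -> str:
    """SHA-256 of an in-memory buffer as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks so large files are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def match_ioc(sha256_hex: str) -> Optional[str]:
    """Return the label of a known hash, or None if the hash is not listed."""
    return KNOWN_HASHES.get(sha256_hex.strip().lower())


print(match_ioc("B9534DDEA8B672CF2E4F4ABD373F5730C7A28FE2DD5D56E009F6E5819E9E9615"))
```

Hash matching only catches byte-identical files, so it complements rather than replaces the behavioral indicators described in this article.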
In this section we will look into the inner workings of the embedded 32-bit DLL. The module looks for a specific set of functions in ntdll.dll, kernel32.dll, kernelbase.dll, and advapi32.dll to later remove any hooks present in the code. This also removes any hooks that EDR/AV (Endpoint Detection and Response/Antivirus) products have installed for monitoring.
Functions in ntdll.dll checked for existing hooks
Functions in kernel32.dll checked for existing hooks
In kernelbase.dll, the following functions are checked for existing hooks:
Functions in kernelbase.dll checked for existing hooks
Functions in advapi32.dll checked for existing hooks
The Unhooking Mechanism
The unhooking process involves the following steps:
- The module retrieves handles to the target DLLs via the GetModuleHandleW API. The handle returned by the API is for the DLL loaded in memory by the malware process, i.e. the process responsible for executing the Bumblebee loader, which is rundll32.exe.
- Then the malware constructs the absolute path to each target DLL via the GetSystemDirectoryA API, which returns the system32 directory where the system DLLs are located.
- A pointer to NtProtectVirtualMemory is computed following the DLL path generation.
- Function sub_10005B90 is called to do the unhooking. Parameters passed to the function are:
- First Arg: Absolute path to target DLL
- Second Arg: Handle to already loaded target DLL
- Third Arg: Offset to array holding target functions exported by the target DLL
- Fourth Arg: Null
- Fifth Arg: Pointer to NtProtectVirtualMemory
Steps for Unhooking Mechanism
Function sub_10005B90 performs the following operations:
- Maps fresh copy of the target DLL from the hard disk to address space of the malware process via functions CreateFileA, CreateFileMappingA, and MapViewOfFile.
- Calls function sub_10005D40 to perform unhooking. The following data is passed to the function:
- First Arg: Mapped Address of fresh copy of DLL
- Second Arg: Same as sub_10005B90
- Third Arg: Same as sub_10005B90
- Fourth Arg: Same as sub_10005B90
- Fifth Arg: Same as sub_10005B90
- After unhooking, the mapped view is released via the UnmapViewOfFile API.
Operations performed by function sub_10005B90
The logic used for unhooking is straightforward. The malware compares the target function in the loaded module in memory against the function defined in the mapped module via MapViewOfFile. If both the codes don’t match, the content from the mapped module is written to the loaded module, to restore the state to that of the mapped version from the hard disk.
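The comparison step can be sketched on plain byte buffers, which is also how a defender might check a process image for tampering. The example below is a simplified illustration and assumes that both buffers use the same layout, so a function sits at the same offset in each. The function name, offsets, and byte values are hypothetical.

```python
from typing import Dict, List, Tuple


def find_modified_functions(
    loaded: bytes,
    clean: bytes,
    functions: Dict[str, Tuple[int, int]],
) -> List[str]:
    """Return names of functions whose bytes differ between the loaded and clean copies.

    `functions` maps a function name to (offset, length) within both buffers.
    """
    modified = []
    for name, (offset, length) in functions.items():
        if loaded[offset:offset + length] != clean[offset:offset + length]:
            modified.append(name)
    return modified


# Synthetic 64-byte "module" with two 16-byte functions at hypothetical offsets.
clean_copy = bytes(range(64))
loaded_copy = bytearray(clean_copy)
loaded_copy[16:21] = b"\xE9\x00\x01\x00\x00"  # simulate a 5-byte jump written by a hook

layout = {"FunctionA": (0, 16), "FunctionB": (16, 16)}
print(find_modified_functions(bytes(loaded_copy), clean_copy, layout))
```

In a real module the on-disk file layout and the in-memory image layout differ, so offsets have to be translated through the section headers before the bytes are compared.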
The malware goes through the exports of the loaded DLL and performs a string match against the set of function names stored as an array in a loop. The sub_10005930 is responsible for string matching.
String match against the set of function names
When a function name in the malware’s array matches a function exported by the loaded module, a flag (v8) is set and the matching loop breaks. The following steps then occur:
- The malware stores the addresses of the function in both modules (loaded and mapped).
- The loaded and mapped function code is then compared byte by byte to detect hooks. If the loaded code is the same as the mapped code, nothing needs to be restored and the malware continues to iterate through the remaining functions.
Malware matches the exported function
If the loaded code is not the same as the mapped code, then the following operations are performed by the malware for unhooking:
- VirtualQueryEx API is called to retrieve the base address of the page containing the target function.
- Then the NtProtectVirtualMemory API is used to change the permissions of the page containing the function code to PAGE_EXECUTE_READWRITE.
- VirtualQuery is used again to check for permission; whether the page is writable or not.
- Function sub_10005890 is called to restore the loaded module with the contents of the mapped module. Now the functions in the mapped and loaded modules are in the same state.
Malware does not match the exported function
After clearing all the hooks in the selected functions, the malware installs its own hooks.
The function RaiseFailFastException, as exported from kernel32.dll and api-ms-win-core-errorhandling-l1-1-2.dll, is hooked. Once the hooks are in place, the detour function sub_100057F0 hijacks the control flow whenever the system calls the hooked function.
Function sub_100057F0 simply returns the call.
The embedded DLL has a hooking strategy similar to that of the Bumblebee loader. Various functions used by the system, while loading a DLL module, are hooked and wups.dll is loaded to trigger the chain.
Hooking of the functions used while loading DLL and loading of wups.dll
|Target API||Detour Function|
Code Upgrades In The Wild
After analyzing many samples in the wild we observed code modifications in the loader.
Prominent code modifications done in Bumblebee loader ever since its discovery
The leftmost sample in the image above is the one covered in this report. As the logic flow of the loader shows, the malware developer has modified the loader code in the other two samples. All the samples observed in the wild are 64-bit DLL modules with an exported function that has a randomly generated string as the function name. These changes are expected, because the code itself largely determines whether security products detect the malware. To circumvent detection, malware developers keep changing the code and the malware design.
Newer loader samples in the wild contain various payloads, such as Cobalt Strike beacons and Meterpreter shells, unlike the custom Bumblebee payload seen in the first generation.
Indicators of Compromise (IoCs)