**Research by: Alexey Bukhteyev**
XLoader is a widely observed malicious loader with information-stealing capabilities. It first surfaced in 2020 as a rebrand of the FormBook code base, a well-known and capable information stealer, and has since undergone substantial hardening and feature growth. In addition to the Windows variant, its developers also marketed a macOS build, though it appears far less prevalent in the wild.
XLoader is a prime example of malware that is extremely difficult to analyze. It combines several layers of protection: customized encryption with additional mixing steps, encrypted blocks disguised as valid but meaningless assembly code, obfuscated API calls, injections into system processes, and a wide set of sandbox evasion techniques. In addition, XLoader encrypts its network traffic, and hides real C2 addresses among dozens of decoys and fake domains.
An important feature of XLoader is its ongoing development. The authors release new versions regularly, changing internal mechanisms beyond recognition and adding new anti-analysis methods. As a result, previous research quickly becomes outdated. In earlier versions, extracting the configuration required pulling out a few keys using intricate algorithms. At the same time, obtaining the decrypted data only required peeling off two layers of obfuscation and encryption. Version 5 introduced a built-in packer, and in versions 6 and 7 analysts had to work through dozens of chained functions that decrypt each other, extracting intermediate keys at every stage. For someone new to XLoader, the entry barrier has become very high: on top of the analysis itself, extra time is needed for onboarding. By the time one research cycle is completed, the next iteration of the malware may already be out – and if there are significant changes, another time-consuming investigation is required.
When we began this research, XLoader version 8.0 had just been discovered. It seemed the XLoader developers were winning the race. But with the rise of generative models, we asked ourselves: can AI change the rules of the game and help us analyze such complex malware more quickly? To explore this, we applied generative AI assistance in two ways: by directly integrating with our analysis tools through a live MCP connection, and by leveraging ChatGPT’s project and file-upload capabilities to work from exported data. Each approach turned out to have distinct benefits, and together they allow us to solve reverse engineering tasks more effectively.
In this study, we focus on the second approach and show how ChatGPT without MCP can be effectively used for reverse engineering tasks, using one of the latest XLoader samples as an example.
To defend against XLoader, it is critical to extract up-to-date Indicators of Compromise (IoCs) from each new version — real C2 domains and URLs, cryptographic keys, and version identifiers. These IoCs feed into detection signatures and help track active campaigns. The primary way to obtain IoCs is by extracting and decrypting the malware’s configuration data from samples.
The challenge is that XLoader’s constantly shifting tactics break automated extraction tools and scripts almost as soon as they’re developed. The malware authors frequently tweak encryption schemes and packing methods specifically to thwart these efforts. An automated config extractor that worked yesterday might fail today, meaning each major version demands a fresh reverse-engineering cycle.
Sandboxes offer little relief. XLoader’s wide set of evasion techniques means a sample may not fully detonate, and even a successful run does not provide a reproducible dump or a complete set of IoCs. In short, a sandbox does not solve the problem.
The most reliable method is still static analysis: unpack everything, function by function, decrypt the config, and extract the IoCs. The downside is that doing this manually for each new version is slow and painstaking. This is where we hoped generative AI could act as a force multiplier.
In recent months, many reverse engineers began integrating LLMs with IDA Pro via the Model Context Protocol (MCP) to create an AI-assisted workflow. This agentic approach allows a model to interface directly with the disassembler and debugger, but it has its own practical challenges. For example, some MCP client setups lack certain ChatGPT interface features (like Projects or file uploads), and they still rely on maintaining a live IDA session and stable connection.
We explored two complementary workflows to apply GPT-5 to unraveling XLoader: a live MCP integration connecting the model to our analysis tools, and an “offline” pipeline in which the model works from a full static export of the sample.
Each approach has its own strengths. MCP offers an agentic, interactive workflow, whereas the offline pipeline provides a self-contained analysis that’s easy to share and reproduce. These approaches aren’t mutually exclusive — you can use both, picking the appropriate tool for each task.
The idea of hooking an LLM into IDA isn’t new. For example, researchers at Cisco Talos demonstrated an IDA integration with an LLM acting as a “reverse engineering sidekick”. In our setup, we used MCP to bridge ChatGPT with IDA Pro and also interface with the x64dbg debugger and a VMware virtual machine. This gave the LLM a live window into the malware’s execution.
**Figure 1** – Integration of an LLM with the reverse engineering environment through MCP.
This live integration, in addition to static analysis and annotating the IDA database, enabled the AI to act on the running sample: control the debugger, read process memory to capture runtime keys and decrypted data, and validate its static findings against live execution.
However, the MCP approach isn’t without drawbacks: it depends on a constantly running IDA session and a stable connection, and some MCP client setups lack ChatGPT interface features such as Projects and file uploads.
For many scenarios, these issues are manageable and the benefits of live interaction outweigh the hassles. Some solutions, such as the MCP SuperAssistant browser extension, reduce friction by bringing the ChatGPT interface and MCP connectivity together. Recently, ChatGPT introduced a Developer Mode that can use MCP directly, without third-party plugins. Regardless of whether you use a plugin or the built-in mode, the workflow still depends on a live MCP session tied to a running toolchain and stable connection.
If any of the requirements listed above are difficult to fulfill, for example, you can’t keep IDA running constantly, or you need to easily share analysis progress with a colleague who doesn’t have the same setup, then a different approach might be preferable. That’s why we developed the “offline” data pipeline as an alternative.
Our second approach ditches the live connection entirely. Here AI acts as a self-reliant analyst working from a full static snapshot of the sample.
The workflow is straightforward: we exported everything we could from our IDA Pro database into a structured format (JSON and text). This includes the disassembly and decompiler output of every function, the list of cross-references, the readable strings, and even the original binary itself. We uploaded the .zip file to ChatGPT.
For example, our export bundle included these files:
```
ida_export.zip
├── meta.json        # basic info (sample name, hashes, image base, etc.)
├── index.json       # lookup tables mapping names/EAs to function indices
├── functions.jsonl  # NDJSON: disassembly, xrefs, bytes, prototypes, etc.
├── strings.jsonl    # list of strings in the binary and their references
├── data.jsonl       # globals, arrays, named data references
├── decomp/          # decompiled pseudocode for functions (if available)
│   ├── func_or_sub_XXXXXXXX.c
│   ├── func_or_sub_YYYYYYYY.c
│   └── …
└── sample.bin       # the malware sample itself
```
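For reference, the export can be produced with a short IDAPython script. The sketch below is a simplified illustration that covers only functions.jsonl with an abbreviated field set; our actual exporter also emits the metadata, strings, data, and decompiled code listed above:

```python
# Abbreviated IDAPython sketch of a functions.jsonl exporter (IDA 9+).
# Run inside IDA; the real exporter emits more fields and files.
import json
import idc, idautils, ida_funcs, ida_bytes

with open("functions.jsonl", "w") as out:
    for ea in idautils.Functions():
        f = ida_funcs.get_func(ea)
        insn = []
        for head in idautils.Heads(f.start_ea, f.end_ea):
            raw = ida_bytes.get_bytes(head, idc.get_item_size(head)) or b""
            insn.append({"ea": hex(head), "bytes": raw.hex(),
                         "disasm": idc.GetDisasm(head)})
        rec = {
            "ea": hex(f.start_ea),
            "name": idc.get_func_name(f.start_ea),
            "ranges": [[hex(f.start_ea), hex(f.end_ea)]],
            "xrefs_in": sorted(hex(x) for x in set(idautils.CodeRefsTo(f.start_ea, 0))),
            "insn": insn,
            "bytes_concat": (ida_bytes.get_bytes(f.start_ea, f.end_ea - f.start_ea) or b"").hex(),
        }
        out.write(json.dumps(rec) + "\n")
```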
In practice, it is better to upload the archive to a ChatGPT project. Files attached only in the chat can disappear after a session restart, while files in a project stay available for the whole engagement, and can be reused in different chats.
We also wrote an initial prompt explaining how the data is organized and how the AI should format its outputs (for example, proposing new function names and comments in a machine-readable JSON that we could import back into IDA). Essentially, we taught the AI how to read the phonebook we gave it, and how we wanted its notes recorded.
Below is an approximation of our prompt:
````
You are my reverse-engineering copilot. I will upload a ZIP produced by an IDA Pro 9 exporter. It contains:
- meta.json
- index.json
```json
{
  "by_name": { "<name>": "0x40XXXXXX", ... },
  "by_ea": { "0x40XXXXXX": <index>, ... }
}
```
- functions.jsonl (NDJSON; one function per line, with mnemonics/operands already plain text)
```json
{
  "ea": "0x40XXXXXX",
  "name": "func_or_sub_XXXXXXXX",
  "prototype": "int __cdecl ...",                   // if available
  "ranges": [["0xstart","0xend"]],                  // function address range(s)
  "xrefs_in": ["0x...","0x..."],                    // callers (function start)
  "xrefs_out": [{"ea":"0x...","name":"..."},...],   // callees (from call sites)
  "comments": [{"ea":"0x...","kind":"...","text":"..."}],
  "bb": [{"start":"0x...","end":"0x...","succ":["0x..."]},...],  // basic blocks
  "insn": [
    {"ea":"0x...","bytes":"8BEC","mnem":"mov","opstr":"...","size":2,"cmt":null},
    ...
  ],
  "bytes_concat": "....",                 // all function bytes hex, no spaces
  "decomp_path": "decomp/<name>_<ea>.c"   // if Hex-Rays available
}
```
- decomp/*.c  // optional

Optional:
- strings.jsonl  // readable strings with code xrefs
- data.jsonl     // named globals/arrays
- data_index.json
```json
{ "by_name": { "g_DomainKeys": "0x40YYYYYY", "var_X": "0x40ZZZZZZ", ... } }
```
- sample.bin  // the malware sample binary

## On upload (INIT)
1) Parse meta.json & index.json.
2) Stream functions.jsonl just enough to build fast lookups by EA and by name, and to count functions; do NOT eagerly load all decomp/*.c.
3) Reply with an INIT REPORT:
   - file_name, imagebase, hashes (MD5/SHA256/CRC32), compiler (if present)
   - total function count and number with decomp_path
   - confirm you’ll use Canvas artifacts for tracking changes (see below)
…

## Live suggestions (function-level, stored on Canvas)
Keep a human-readable **suggestions.json** (full JSON, not JSONL) with only proposed renames/comments (no auto-apply). Schema:
```json
{
  "meta": { "file_name":"<file>", "imagebase":"0xXXXXXXXX", "input_sha256":"<sha256>" },
  "changes": [
    {
      "ea": "0xXXXXXXXX",            // function start EA (required)
      "name": "sub_XXXXXXXX",        // current name (optional precondition)
      "new_name": "ai_better_name",  // MUST start with "ai_"
      "comments": [                  // only new/changed comments (optional)
        { "kind":"func"|"func_rep"|"anterior"|"repeatable", "text":"...", "ea":"0xYYYYYYYY", "mode":"set"|"append" }
      ]
    }
  ]
}
```
…
````
After it was set up, this pipeline allowed ChatGPT to perform deep static analysis entirely within its own environment. We could ask it to find cryptographic algorithms, trace complex control flows, or even write and execute a Python script to decrypt some data from sample.bin. Many such tasks can be done without any new information from us – the AI works off the data we provided, verifying its logic by running Python scripts as needed. If there is an error, it fixes the script and reruns the tests, repeating this until the result converges. Compared to our previous approach, all these steps (analysis, code, test, correction) run in a single loop without dozens of local MCP calls. Naturally, this works well when using GPT-5 in the “thinking” mode.
This approach had several clear benefits:
That said, the offline approach isn’t a universal magic wand. There were cases where we still needed to resort to actual debugging (and therefore MCP), for example, to confirm a guessed key or to dump something that our static analysis missed. In addition, while analyzing other malware families, we encountered situations where continuous work in IDA was required, involving constant modifications to the live database. Previously, we would have needed to export the database after every iteration of changes. In such cases, the MCP-based approach turned out to be the more convenient alternative.
Unsurprisingly, using an AI with an offline IDA export wasn’t without hiccups. We encountered a few issues with the AI’s performance and solved them by adding strict rules to the prompt.
> Sometimes the model tried to invent missing data, for example, encryption keys that were computed dynamically at runtime. To prevent such “hallucinations”, we enforced an evidence-first rule: every numeric value and every algorithm must be backed by a quote from the export (functions.jsonl, decomp/*.c, or data.jsonl) with the exact EA address. If the data is not there, the model must produce a not-found report that explains where it looked and why nothing was found.
>
> ## Provenance & no-fabrication
> - Any *specific* numeric/structural claim (modulus, key length, magic multipliers like 0x66666667, loop bounds) MUST be backed by direct evidence from the uploaded data:
>   - Quote the exact line(s) from `functions.jsonl` (insn/mnem/opstr/bytes) or from `decomp/*.c`, and cite EA(s).
> - If the claim is not literally present, mark it **UNPROVEN** and offer a concrete verification plan.
> - If you revise a claim, explicitly state what changed and show the new evidence quote. No silent edits.
> For example, a string-decryption routine was expected to return printable text, but due to a mistake in extracting the key, the output was corrupted. To make the output “look right,” the model applied Base64. We banned any cosmetic transformations (such as Base64) used just to make results look valid. Instead, the model must find the actual error in the keys or in the algorithm and rerun the tests until the output is correct.
>
> ## Verification contract
> - Define acceptance criteria from the task (properties/invariants).
> - Run self-checks (lengths, wrap-around, bounds monotonicity, step counts, round-trip where applicable).
> - Do NOT transform outputs to “look right”; if a check fails, proceed to the recovery loop.
> Early on, the model sometimes requested data we had already provided. We fixed this with a local-first rule: search in the archive files first. It should produce a not-found report only if the data is truly missing.
>
> ## Local-first data usage
> - Treat the uploaded dataset as the primary source of truth.
> - Before requesting any bytes/strings/keys from the user, attempt to obtain them from the uploaded files.
> - Never ask the user for blobs that are present in data.jsonl/strings.jsonl or are trivially recoverable from functions.jsonl.
> - Only if a needed EA/function is absent from the snapshot, say so and propose next steps (e.g., an MCP call).
With these precautions in place, our AI “assistant” became a reliable analyst for the static portions of the work. In the next sections, we show how it performed on the real challenges within XLoader 8.0, such as decrypting the payload and API resolution and working with occasional MCP-powered dynamic checks.
When working with older ChatGPT models such as o3, getting the right result required splitting the task into many small steps and explicitly telling the model what to do, down to pointing out exact code addresses and the algorithms to apply. Without this level of detail, the output was unpredictable. This approach was closer to “text-based programming” and required deep engagement on our side.
With GPT-5, however, we can pose broader and more abstract tasks. Below we show an example of XLoader’s built-in crypter analysis with a mixed approach: using the IDA export as the main data source, and MCP+x64dbg for result verification.
For this task we took a recently discovered XLoader sample with SHA256: **77db3fdccda60b00dd6610656f7fc001948cdcf410efe8d571df91dd84ae53e1**. For the entire process we used GPT-5 in the “Thinking” mode.
After we gave the AI assistant the instructions for processing the data, we received a short report:
**Figure 2** – IDA export initial report.
Next, we deliberately formulated the tasks for the AI assistant as if we knew nothing at all about the sample under analysis, assuming this would reflect the actions of someone unfamiliar with XLoader.
The first prompt was written in the most abstract way possible:
```
Perform an initial analysis of the sample starting from the entry point and provide a short report.
```
Processing this simple prompt took 8 minutes and 46 seconds. As a result, our assistant correctly identified the RC4 implementation and concluded that the sample was packed. It is worth noting that, based only on the data available to the model, it suggested that the sample looked similar to XLoader. At the same time, there was nothing in the archive or in the initial prompt that explicitly pointed to this.
**Figure 3** – Initial analysis report: Entry point analysis.
In addition, it detected API call obfuscation. While the assistant was not fully able to deobfuscate all API calls during the quick triage, in some cases it inferred the function being called from the context and its signature.
**Figure 4** – Initial analysis report: Presumed call to the VirtualProtectEx function.
It also successfully identified the point where execution is handed over to the decrypted code.
**Figure 5** – Initial analysis report: Call to the original entry point in the decrypted code.
At this stage, our priority was to reach the payload as quickly as possible. We therefore focused on this goal by first asking the assistant to find all cryptographic function calls, and then to analyze how exactly the payload was decrypted.
We found out that the main payload block goes through two rounds of RC4: first, an RC4 decryption of the entire buffer, and then a second pass in 256-byte chunks using a different key.
**Figure 6** – Initial analysis report: Description of the two rounds of RC4 encryption.
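To make the two-round scheme concrete, here is a minimal Python sketch, under the assumption that both rounds use standard RC4; `key1` and `key2` are placeholders for the Stage-1 and Stage-2 keys recovered below:

```python
def rc4(key: bytes, data: bytes) -> bytes:
    # Standard RC4: key-scheduling (KSA) followed by the keystream (PRGA).
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) & 0xFF
        S[i], S[j] = S[j], S[i]
    out, i, j = bytearray(), 0, 0
    for b in data:
        i = (i + 1) & 0xFF
        j = (j + S[i]) & 0xFF
        S[i], S[j] = S[j], S[i]
        out.append(b ^ S[(S[i] + S[j]) & 0xFF])
    return bytes(out)

def unpack_payload(buf: bytes, key1: bytes, key2: bytes) -> bytes:
    stage1 = rc4(key1, buf)  # round 1: RC4 over the entire buffer
    # Round 2: RC4 restarted for every 256-byte chunk with the second key.
    return b"".join(rc4(key2, stage1[i:i + 256])
                    for i in range(0, len(stage1), 256))
```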
In addition, the assistant managed to collect the remaining parameters needed to reproduce the decryption statically.
The next step was to obtain the real-time keys and verify the result. At this point, we turned to MCP. In one of the steps, we also asked the assistant to read a section of decrypted data to validate the correctness of the static decryption.
As a result, it obtained the Stage-1 and Stage-2 RC4 keys from memory, along with a section of the decrypted code:
**Figure 7** – AI-controlled debugging in x64dbg.
After the final keys were read from memory, we asked the assistant to identify where and by which algorithms the keys were generated, and to verify the analysis was correct using the real-time data obtained in the previous step.
```
Find how the Stage-1 and Stage-2 RC4 keys are calculated: source key material and algorithms. Please note that ALL the required data is available to you in IDA export. Check your assumptions using the captured realtime values. Start with the Stage-1 key.
```
In the end, AI produced a working script that unpacked the analyzed sample. Unfortunately, the script was not universally applicable, as the patterns it used to locate the keys were tightly bound to this particular sample. As a result, it failed when we tried to apply it to samples from other versions, requiring further manual fine-tuning.
Excluding the final step of creating a generic unpacker, the entire analysis took about 40 minutes and required 39 MCP calls. The table below lists the prompts we used and the time spent on each analysis step.
After opening the unpacked sample in IDA, we manually created a function at address _0x00430CB3_ (the original entry point, OEP), which we named _oep_start_, and applied the export script again. We also created a new project to analyze the unpacked sample.
Even before starting a deep dive, it was clear that IDA failed to recognize a large portion of the code, and many of the identified functions did not look valid.
**Figure 8** – Unpacked XLoader sample in IDA.
This may indicate that the code is obfuscated in some way, or that the functions are encrypted. In fact, we know that XLoader uses on-the-fly function decryption, as we mentioned in the introduction. For some functions, multi-layer encryption is applied.
For the sake of the experiment, we wanted to see if the AI assistant could determine all this on its own, without our guidance. We started the analysis with the same type of prompt we used when analyzing the packed sample.
```
Perform an initial analysis of the sample starting from the `oep_start` (`0x00430CB3`) and provide a short report.
```
After the initial analysis, the assistant identified a set of short stub functions that decrypt other functions on the fly, including the one it labeled the “Stage-1 builder”:
**Figure 9** – Triage report and the “Stage-1 builder” function (a function decryptor stub).
Next, we asked the assistant to focus on the logic around the so-called **stage-1 builder** ( _sub_429143_) and to locate cross-references of the functions involved. The AI assistant identified 90 similar functions. These functions derive 6-byte head and tail markers, use those markers to locate a target region in memory, overwrite the markers with six `NOP` instructions ( `90 90 90 90 90 90`), transform the region, and then transfer execution to a hardcoded address unique to each wrapper.
**Figure 10** – Results of the function encryption scheme analysis, together with the renamed functions from the analysis.
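A simplified Python model of what each wrapper does (the marker values are hypothetical inputs, and `transform` stands for the RC4-like routine; the real wrappers derive the markers at runtime and then jump to a hardcoded address):

```python
NOP6 = b"\x90" * 6

def decrypt_wrapped_function(image: bytearray, head: bytes, tail: bytes, transform):
    # Locate the region bounded by the 6-byte head/tail markers,
    # overwrite both markers with six NOPs, and transform the body.
    start = image.find(head)
    end = image.find(tail, start + 6)
    if start < 0 or end < 0:
        raise ValueError("markers not found")
    image[start:start + 6] = NOP6
    image[end:end + 6] = NOP6
    image[start + 6:end] = transform(bytes(image[start + 6:end]))
    return start, end
```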
The assistant also implemented inline Python snippets and decrypted one of the encrypted functions, providing us with all the data and the keys, as well as the part of the decrypted code:
**Figure 11** – Report on the successful decryption of one of the functions (0x00418DB3).
Interestingly, in this case, the use of MCP wasn’t even necessary, as the validity of the extracted keys can be easily verified by AI: if it’s possible to locate the start and end markers of the code after decryption, it means the keys and the algorithm were recovered correctly. Additionally, we can see that the decrypted data doesn’t appear to be random (it contains sequences like `00 00` and `ff ff`), which suggests the function was indeed decrypted correctly.
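Such a self-check is trivial to script. Below is a rough Python equivalent of the heuristics described above; the optional `marker` parameter and the byte-run test are simplifications:

```python
def looks_decrypted(body: bytes, marker: bytes | None = None) -> bool:
    # Heuristics from the text: after a correct decryption the expected
    # marker bytes can be located, and the output is not uniformly random
    # (runs such as 00 00 and FF FF are typical of real code and data).
    marker_ok = marker is None or marker in body
    return marker_ok and (b"\x00\x00" in body or b"\xff\xff" in body)
```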
The AI performed very well in reimplementing the algorithms, including the modified RC4 with additional tweaks, as well as in locating the keys within the provided sample. It also successfully implemented functions for detecting 6-byte markers.
However, it was unable to fully implement a universal script capable of decrypting all functions without human assistance. The issue arose in locating all the XOR modifiers required to construct the 20-byte effective RC4 key.
The challenge lies in the fact that the effective RC4 key is derived by XOR-ing its 4-byte components with a 4-byte modifier, which is unique for each encrypted function and is calculated this way:
```
seed_external ^ seed_internal ^ 0x6CFC3E60
```
While `seed_internal` is always located within the wrapper function near the markers, the assistant was unable to implement a universal method for finding `seed_external` (see Figure 13 below), as it could be placed in various locations within the calling function and might be deliberately mixed with other constants.
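For illustration, the effective-key construction can be sketched as follows, assuming little-endian 4-byte components (the byte order is our assumption and must be verified against the sample):

```python
import struct

MAGIC = 0x6CFC3E60  # constant from the sample (see above)

def effective_rc4_key(base_key20: bytes, seed_external: int, seed_internal: int) -> bytes:
    # Per-function modifier, computed as described above.
    mod = (seed_external ^ seed_internal ^ MAGIC) & 0xFFFFFFFF
    # XOR each 4-byte component of the 20-byte key with the modifier.
    dwords = struct.unpack("<5I", base_key20)
    return struct.pack("<5I", *(d ^ mod for d in dwords))
```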
We had to manually modify the script to ensure it could correctly locate all external seeds. Additionally, we modified the rules for locating the remaining constants to make the script truly robust and capable of working with other samples as well. Therefore, the AI significantly reduced the time required for analysis and script development, but at this stage, it could not fully replace a human.
It is clear that the creators of XLoader deliberately complicated the key construction process by scattering crucial constants across multiple functions, to the extent that even AI was unable to develop an algorithm to locate them. We are not disclosing how we derive the keys, so as not to give the XLoader developers any advantage.
Finally, after applying the script, we obtained 51 functions decrypted in the first pass. Many of the decrypted functions also contained similar calls to encrypted functions. Applying the script three times in a row, we got a total of 77 decrypted functions out of the 90 initially found.
**Figure 14** – Decrypted function example with a patched 6-byte head marker (six NOP instructions).
After loading the resulting sample into IDA, we can see that a significant number of code blocks are still unrecognized:
**Figure 15** – Significant number of code blocks remain unrecognized by IDA.
During a quick review, we also identified several functions that still remain encrypted:
**Figure 16** – Functions that remained encrypted after applying the decryption script.
It’s also worth noting that in the scheme described above, encrypted functions are located using 6-byte sequences, which are replaced with six NOP instructions after decryption. This implies that the function must have a valid, unencrypted prologue. At the same time, in Figure 16, on the right we can see an encrypted function that lacks a valid prologue. This likely indicates that a different decryption method was used.
We then re-exported the database and loaded it into ChatGPT. We initiated the analysis with the following prompt and uploaded the decryption log of the 77 functions:
```
In the analyzed scheme, encrypted functions are located using 6-byte sequences, which are replaced with six NOP instructions after decryption. This implies that the function must have a valid, unencrypted prologue. Some of the encrypted functions do not have a valid prologue (e.g., `sub_407293`, `sub_411053`, `sub_415343`). This likely indicates that a different method is used for it. Try to find it.
```
It’s worth noting that we used a little trickery by pointing out the absence of a valid prologue in the prompt. Without this observation, the AI assistant was unable to identify the additional decryptors.
**Figure 17** – Second decryption/patching scheme discovered.
Instead of 6-byte tags, the newly discovered scheme computes 4-byte _head_ and _tail_ markers using XOR. For example:
**Figure 18** – Calculation of the head marker using XOR.
The head marker anchors one byte before the real entry. After decryption, the code writes a canonical prologue ( `55 8B EC`) at the buffer pointer plus one byte and patches the tail with `90 90 90 90`.
**Figure 19** – Patching the head and tail markers with valid instructions.
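In Python terms, the post-decryption patching step of this scheme looks roughly as follows (a sketch; `head_off` and `tail_off` denote the positions of the 4-byte markers in the image):

```python
def patch_scheme2(image: bytearray, head_off: int, tail_off: int) -> None:
    # The head marker anchors one byte before the real entry, so the
    # canonical prologue is written at the marker position plus one.
    image[head_off + 1:head_off + 4] = b"\x55\x8B\xEC"   # push ebp; mov ebp, esp
    image[tail_off:tail_off + 4] = b"\x90\x90\x90\x90"   # tail patched with NOPs
```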
The function body is decrypted in two layers using `ai_xform_sub_rc4_sub`.
The 20-byte RC4 key is built by `ai_rc4key20_build` (`0x00404543`).

**Example:**

Wrapper `0x00415943` decrypts function `0x00415343` using:

- head marker: `4c ed 39 65`
- tail marker: `a7 32 ca 6a`
- key: built by `sub_404543`, then XORed with `0x36`
**Function decryption scheme III**
In the latest scheme, 4-byte markers are also used to find the start and end of a function. As in the previous method, two layers of encryption are applied, and the same key is used to decrypt the first layer. However, the 20-byte key for the second layer is embedded within the wrappers (unique for each encrypted function) and is modified using a 4-byte value.
**Example:**
Wrapper `0x00418dc3` decrypts function `0x0040d543` using:

- head marker: `9c da 5e e8`
- tail marker: `0e e7 b2 36`
- first-layer key: built by `sub_404543`, then XORed with `0x36`
Additionally, a separate function may be used to handle the decryption of the second layer and the patching process.
**Figure 21** – Function used to handle the decryption of the second layer and the patching process.
We identified three distinct function decryption schemes in XLoader:

1. 6-byte head and tail markers, overwritten with six NOP instructions after decryption; the function keeps a valid, unencrypted prologue.
2. 4-byte XOR-computed markers; two layers of encryption; after decryption, a canonical prologue (`55 8B EC`) is written at the entry and the tail is patched with NOPs.
3. 4-byte markers; two layers of encryption, with the 20-byte second-layer key embedded in each wrapper and modified by a 4-byte value.
It is worth noting that to implement universal static decryptors, we still had to break the task down into smaller steps ourselves: locating wrapper functions, extracting 20-byte keys, recovering 4-byte modifiers, and identifying and calculating marker positions. We combined them into a single decryptor only after confirming that each step worked reliably and produced the correct data for every function. At the same time, AI significantly reduced the time required to implement regular expressions (even though they had to be adjusted manually) as well as during the analysis and implementation of cryptographic algorithms.
With each decryption iteration we obtained a new batch of decrypted functions, some of which contained the keys required for decrypting additional functions. By applying all three decryptors sequentially over four iterations, we ultimately succeeded in decrypting 101 functions.
Unfortunately, it was not possible to accurately measure the time spent on this task, as it required a considerable amount of additional work and manual corrections. However, this stage turned out to be the most complex and time-consuming for both us and the LLM.
We also got _suggestions.json_ with the suggested names for the analyzed functions. This is very useful, because we can keep this file for other analysis sessions and easily import it into the current IDB, or even into a new IDB (after decrypting the functions), without the need to diff it with the old database.
We now have a fully decrypted sample, which allows us to continue the analysis using the same methodology.
We first load _suggestions.json_ and then perform an export. As stated earlier, even during the very first analysis of the sample (before we had obtained the decrypted functions), the assistant pointed out the presence of obfuscated API calls. The import table in this sample is empty, and there are no plaintext strings that might contain library or function names.
As we created a new clean session by uploading the decrypted sample, we decided to test how reproducible the analysis results were.
Therefore, in the very first prompt, without providing any hints, we asked the assistant to identify the API call obfuscation mechanisms. As in the previous case, we specified that the analysis should begin at the OEP rather than at the _start_ function, so that the AI assistant would not get bogged down analyzing the packer.
```
The IAT is empty, no plaintext strings in the sample. Determine how the sample resolves and invokes Windows APIs. Start the analysis from `oep_start` (`0x00430CB3`).
```
Four minutes later, we received a description of the algorithm:
**Figure 22** – Description of one of the API resolution algorithms.
Interestingly, during the first analysis (when part of the functions was still encrypted) and the second analysis, our AI assistant identified different functions responsible for API hash decryption. In the first case, it pointed to `sub_404603` (later renamed to `ai_apiid_decrypt_salt`), while in the second case it identified only `sub_4045B3` ( `ai_apiid_decrypt`).
Next, we used the following simple prompt to generate a script for API call deobfuscation:
```
Implement an IDAPython script for deobfuscating API calls. Annotate the resolver and every call site with Module!Function, original ID, and EA; IDA 9+. Log all deobfuscation attempts.
```
We got a script that annotates the resolver and every call site with Module!Function, the original ID, and the EA, and logs every deobfuscation attempt.
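For example, the per-call-site annotation boils down to a repeatable comment (a minimal sketch; the helper name and comment format are ours):

```python
import idc

def annotate_call_site(ea: int, module: str, func: str, api_id: int) -> None:
    # Repeatable comment at the deobfuscated call site,
    # e.g. "KERNEL32!VirtualAlloc (id=0x1a2b3c4d)".
    idc.set_cmt(ea, f"{module}!{func} (id={api_id:#010x})", 1)
```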
As the assistant did not have access to IDA, we had to test the script manually. If there were errors, we sent the results back to the chat and asked for corrections. It took five iterations and about 20 additional minutes to obtain a fully working version.
We then asked the assistant to analyze the alternative resolution path identified during the first analysis:
```
Analyze routine `sub_404603` as an alternate API-hash decrypter. Recover its algorithm. Find call sites. Extend the IDAPython deobfuscator.
```
It took another five iterations (sending back errors and corrections) and 14 minutes before we obtained a fully functional script.
XLoader uses the same hash-decryption mechanism to look for sandbox artifacts, virtual machines, and processes typical of a researcher’s environment. While fixing issues, we also added dictionary-based hash brute forcing (loading the wordlist from a separate file), which let us automatically annotate not only functions but also certain strings corresponding to specific evasion techniques:
**Figure 24** – Deobfuscated API function and string identifiers.
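The brute forcer itself is simple. A sketch is shown below; `hash_fn` stands for the reimplemented XLoader hash routine (not shown here), and the wordlist format (one candidate per line) is our assumption:

```python
def brute_force_ids(ids: set[int], wordlist_path: str, hash_fn) -> dict[int, str]:
    # Hash every candidate from the wordlist and match it against the
    # identifiers found in the sample (APIs, process names, artifacts).
    resolved = {}
    with open(wordlist_path, encoding="utf-8") as f:
        for line in f:
            word = line.strip()
            if word:
                h = hash_fn(word)
                if h in ids:
                    resolved[h] = word
    return resolved
```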
As a bonus, we received a summary of how API resolution works, describing the two different methods.
Overall, it took roughly one hour of the AI assistant’s work to go from the first prompt to a fully functional API deobfuscation script. This figure does not include local testing or the time spent writing prompts. For this task, the human effort was minimal.
The table below summarizes the prompts we used and the time required for each step:
**Additional protection of critical API calls**
While reviewing the API call deobfuscation results, we noticed that some functions are invoked through an interesting wrapper which was originally hidden among the encrypted functions at address `0x0040AC93` ( `ai_dec_func_16`).
This function acts as a _secure-call trampoline_: it temporarily encrypts nearly the entire image before invoking a function pointer and then decrypts those same regions once the call returns. Only a tiny “island” (the space between the call-site’s return address and a marker) remains unencrypted by the per-call XOR so that execution can proceed.
**Figure 26** – “Secure-call trampoline” decompiled function explained by LLM.
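A toy Python model of the trampoline’s logic; XOR over a bytearray stands in for the real per-call cipher, which operates on the process image:

```python
def xor_region(image: bytearray, key: int, island: tuple[int, int]) -> None:
    # XOR the whole "image" except the small unencrypted island.
    lo, hi = island
    for i in range(len(image)):
        if not (lo <= i < hi):
            image[i] ^= key

def secure_call(image: bytearray, key: int, island: tuple[int, int], api, *args):
    # Encrypt everything except the island, make the protected call,
    # then run the same XOR again to restore the image (XOR is an involution).
    xor_region(image, key, island)
    try:
        return api(*args)
    finally:
        xor_region(image, key, island)
```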
What makes this mechanism notable? Because the protected functions stay encrypted nearly the entire time, it is difficult to even detect their existence without static decryption. In memory dumps, they appear only in encrypted form.
At the same time, if some security software or a sandbox hooks API calls protected by this wrapper and tries to analyze or dump the process memory at the time of the call, the mechanism effectively shields the malicious code.
In total, 20 functions are protected this way, including NTAPI routines related to processes, threads, memory, and file operations, as well as several WinSock functions. The full list is shown in the image below:
**Figure 27** – The list of API calls protected by “secure-call trampoline.”
The sample contains no readable strings. At the same time, the code features a series of short routines that share the same skeleton: the prologue allocates a small stack frame and initializes a local array (its size varies by function). This is followed by a recurring trio of calls with the same order and identical argument signatures:
**Figure 28** – Example of an encrypted string in XLoader code.
Let’s analyze one of these functions. Because we started a fresh session for this task, we instructed the AI assistant to trust the existing comments that briefly describe each function’s behavior, so it doesn’t reanalyze routines that were already covered.
```
Analyze the functionality starting from `sub_405773`. Recurse into its callees. Trust the comments in the disassembly.
```
As a result, we determined that this function decrypts a string using the algorithm implemented in one of the previously decrypted routines ( `ai_dec_func_14`). Note how the AI built the call graph and described each routine’s behavior from a single short prompt, relying solely on the data in the previously prepared archive:
**Figure 29** – Call graph and description of the string decryptor stub.
With a simple prompt, we readily obtain the encryption keys and the decrypted string.
```
Decrypt the string from `sub_405773`.
```
**Figure 30** – String decryption result with derived key and ciphertext.
Now that we are confident the AI assistant understands how the encrypted strings are stored, knows how the key is derived, and already analyzed and correctly reimplemented the decryption algorithm (verifying it on real data), we can move on to implementing a script to decrypt the remaining strings.
This time we used a slightly more detailed prompt because we wanted specific information to appear both in the comments and in the console output:
```
Implement an IDAPython script (IDA 9+) that decrypts strings. Requirements:
- Find and annotate every call to the decrypter function `sub_4050F3` with the decrypted string.
- For each string, output debug info: encrypted buffer, XOR tweak byte (`BL`), length, and decrypted bytes.
- Print all binary data in hex.
```
This time we were lucky and immediately got a working script which we used to decrypt 175 strings:
**Figure 31** – Decrypted strings in IDA.
The total time required for the analysis was about 20 minutes:
We now have a decrypted sample with deobfuscated strings and API calls that we can analyze like a regular binary. However, XLoader is not that simple: some data remains encrypted even at this stage.
Extracting lists of Indicators of Compromise (IoCs) is always a critical task in malware analysis. Network indicators, such as domain names and URLs, are especially important because they help detect and classify malware through traffic analysis. That is why extracting domains is essential — even though some may be decoys or bait, or currently inactive but intended for later use.
Among the recovered strings, we see 64 Base64-encoded entries. Looking at the version history of XLoader, we find that starting from version 2.8 it began storing encrypted domain names in Base64 form. Without a doubt, these 64 Base64 strings represent domain names that we must decrypt. As early as version 4, XLoader added two additional layers of modified RC4 encryption with different keys, making the decryption more complicated. In later versions, this process became even more complex. In total, to reach the decrypted domain names we need to peel off at least five layers after first identifying where and how the keys are initialized: decrypt the functions that initialize encrypted strings, decrypt the strings themselves (which we already did earlier), base64-decode the results, and apply two more layers of decryption.
At the same time, obtaining the keys for each layer is the most difficult part, as the different pieces of data needed to generate them are scattered across multiple functions, making them hard to locate.
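Putting the layers in order, the per-domain pipeline can be sketched as follows. This reuses the `rc4()` helper from the unpacking sketch above and makes two loud assumptions: both layers are approximated with plain RC4 (XLoader uses a modified variant), and the way `SALT_DWORD` is mixed into the 20-byte layer-2 key is simplified:

```python
import base64
import struct

def decrypt_domain(seed_b64: str, key_l1: bytes, key_l2: bytes,
                   salt_dword: int, index: int) -> str:
    # Layer 1: 20-byte key byte-XORed with the domain index, then RC4.
    layer1_key = bytes(b ^ index for b in key_l1)
    stage1 = rc4(layer1_key, base64.b64decode(seed_b64))
    # The intermediate artifact is re-encoded to Base64 by the malware...
    stage1_b64 = base64.b64encode(stage1)
    # ...then decoded again and decrypted with the salted layer-2 key.
    salt = struct.pack("<I", salt_dword & 0xFFFFFFFF) * 5   # assumed mixing
    layer2_key = bytes(a ^ b for a, b in zip(key_l2, salt))
    return rc4(layer2_key, base64.b64decode(stage1_b64)).decode("ascii", "replace")
```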
Before moving on, we updated our string deobfuscation script so that it also renames the functions responsible for retrieving decrypted strings and assigns a prefix `ai_dec_domain_{NN}` to all functions handling Base64-encoded strings. We then exported the database to prepare for domain decryption.
All calls to the `ai_dec_domain_{NN}` functions occur inside a single function, `ai_dec_func_0` ( `0x00404913`):
**Figure 32** – Domain generation function.
We start the analysis of this function using the simplest prompt:
```
Analyze `ai_dec_func_0` (0x00404913)
```
As a result of our analysis, we obtained a detailed description showing that `ai_dec_func_0` (later renamed to `ai_dec_func_0_domain_tag_generate`) is XLoader’s stage-1 domain builder. For a given domain index (1..64), it pulls the matching seed string from `ai_dec_domain_NN`, base64-decodes it, then runs a keyed RC4-with-diff transform whose key is a 20-byte secret stored at `ctx+0x23D0` (where `ctx` is a global structure that stores keys, function addresses, and other data), and byte-XORed with the domain index. The result is re-encoded to base64 and written to the output buffer. This is an intermediate artifact, not the final ASCII domain.
A special “token” branch, enabled when `a5 == 222`, is used for generating a path string for a URL. It emits a short pattern `/<4-byte token>/` from a static table and applies the same keyed transform.
Next, we asked the assistant to reproduce the transformations implemented in `ai_dec_func_0_domain_tag_generate`. Because our script had already retrieved the decrypted strings and added them as comments in the previous step, we asked the assistant to use those strings instead of repeating the string decryption:
```
Comments in the disassembly of `ai_dec_domain_NN` contain base64-encoded encrypted domain names. Take the one from `ai_dec_domain_01`, then transform it with `ai_dec_func_0` and return the final string.
```
As a result, our assistant was unable to locate the key at _ctx+0x23D0_ on its own:
**Figure 34** – AI was unable to locate the key.
That was not surprising: none of the exported data contained references to the structure’s fields, so the key can only be found by its offset within the structure, or captured in the debugger.
**Figure 35** – Initialization of the RC4 key for decrypting the first layer.
We therefore had to manually locate where this key was initialized and provide that information to the AI assistant.
```
You can find `ctx+0x23D0` initialization in `ai_dec_func_4` (`0x00407293`), also check `sub_404453`.
```

After the keys were extracted, we tried again:

```
Now you have the required key. Take the string from `ai_dec_domain_01`, then transform it with `ai_dec_func_0` and return the final string.
```
As a result, we obtained an intermediate value for the string returned by `ai_dec_domain_01` ( `1Qo1bG/xbpI2gGY8lCzWBw==`): after the first decryption layer, it is re-encoded in Base64 as `Qvm75Acm5NpYTbnYXdcvBw==`:
**Figure 36** – Result of reproducing the domain generation function’s behavior for index 2.
Because the string is still encrypted, and we do not yet know what happens to it next, we need to investigate further:
```
The string you returned is still encrypted. Trace the complete transformation chain from `ai_dec_func_0` (case 2, string `Qvm75Acm5NpYTbnYXdcvBw==`) to the final ASCII domain. Discover any remaining layers, locate and derive all required keys/parameters from the context/initializers, and cite function names with EAs used. Output the final domain, and a concise step-by-step pipeline, print all keys/IVs as hex. If any value is missing, state exactly what it is and where to read it.
```
As a result, we discovered that the obtained Base64 string is decoded again and then decrypted by a second layer using a 20-byte key generated inside the function `ai_dec_func_11` ( `0x004095F3`). This key is additionally XORed with _SALT_DWORD_. However, the initialization of `SALT_DWORD` is missing from `ai_dec_func_11`, and the assistant was therefore unable to retrieve it on its own.
**Figure 37** – AI was unable to locate the Stage-2 key.
As in the previous case, we manually recovered the missing value. The complication was that instead of offset _0x25C8_, the base _0x2000_ was used, with _0x5C8_ added later, which made the search a bit more difficult.
We provided not only the address of the function where the initialization of `SALT_DWORD` was assumed to occur, but also the relevant code snippets:
```
You can find SALT_DWORD in `ai_dec_func_20` (`0x00411053`). Please verify and continue:

.text:004111B3 81 C7 00 20 00 00          add     edi, 2000h
…
.text:0041182B C7 87 C8 05 00 00 00 A6    mov     dword ptr [edi+5C8h], 0C6EA600h
```
During the analysis, our AI assistant confirmed that the key was correct and successfully decrypted the domain corresponding to _ai_dec_domain_01_:
**Figure 38** – Result of successful domain decryption.
Now the AI assistant can automatically decrypt all the domains. Let’s verify this by asking it to decrypt the first 16 domains:
```
Decrypt domains from `ai_dec_domain_00` to `ai_dec_domain_15`. Output: table (index, src_base64, final_domain). Take per-function base64 from `ai_dec_domain_NN` disassembly comments.
```
In the end, we obtained a table with the fully decrypted domain names:
Let’s recall that earlier in the analysis we discovered a separate branch of `ai_dec_func_0` that activates when the last argument equals 222. In this case, a 4-character tag is generated, which later becomes part of the URL. Previously, this tag was the same for all domains and unique to a malware campaign. Now, however, each domain has its own tag.
We tried to decrypt the tags for the first 16 domains using the following prompt:
```
Reproduce the output of `ai_dec_func_0` for a5=222 and domain index 0..15.
```
In response to the prompt, the assistant provided a description of how the malware generates those 4-character tags, and the table containing the first 16 tags.
The table below summarizes the prompts we used, and the time required for each step:
From its initial appearance, XLoader has always been a moving target, with each new version raising the bar for security analysts and defenders. XLoader began as a two-layer puzzle but evolved into a maze of nested decryptors, scattered key material, and runtime-only code. For years, this meant that by the time researchers fully unraveled a sample, attackers were already one step ahead with the next version.
Generative AI changes this balance. By combining cloud-based analysis with occasional MCP-assisted runtime checks, we delegated a large part of the mechanical reverse engineering to the LLM. Instead of spending hours rewriting decryption routines by hand, we asked our AI model to do it and received working prototypes in minutes.
The use of AI doesn’t eliminate the need for human expertise. XLoader’s most sophisticated protections, such as scattered key derivation logic and multi-layer function encryption, still require manual analysis and targeted adjustments. But the heavy lifting of triage, deobfuscation, and scripting can now be accelerated dramatically. What once took days can now be compressed into hours.
For defenders, this is more than a productivity boost. Faster turnaround means fresher IoCs, quicker detection updates, and a shorter window of opportunity for attackers. For researchers, it lowers the entry barrier to analyzing some of the most complex malware families in the wild.
Our research shows that with the right workflows, generative models can already serve as a force multiplier — helping security defenders keep pace with threats that were once considered prohibitively time-consuming to analyze.
However, it’s too soon to declare victory, as we expect malware authors to adapt their techniques in response to AI-assisted analysis. And in turn, we’ll need to come up with the next game-changer.
Check Point Threat Emulation and Harmony Endpoint provide comprehensive coverage of attack tactics, file types, and operating systems and protect against the attacks and threats described in this report.
