[Apple Silicon] IDA Pro, 😁, I'm Back!

Published: at 09:00

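Preface

To the group members who once read my technical reports on the Galgame site (旮旯給木兒):

Hello there! 👋
I'm your old acquaintance; I'll leave my name out for now.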

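Looking back, the two years when I kept submitting technical articles were the idlest stretch of my master's program. As everyone knows, "mathematical optimization" floods Japanese graduate schools, and I had zero interest in that kind of groundless, castle-in-the-air research. For me, building things by hand and digging into the inner workings of software was far more fun.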

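I poured a lot of energy into reading reverse-engineering textbooks and the blogs of various experts, then spent whole days inside IDA studying one tiny module of some large piece of software (namely, the serial-number check 😁). I majored in communication networks, and everything I know about computer science comes from scrappy self-study, so whenever I ran into a tough nut that demanded a solid grounding in compiler theory and operating systems, I was out of my depth...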

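Even so, that small-time tinkering taught me quite a lot. For example, when I see push ebp followed by mov ebp, esp, I now know exactly what it is: a function prologue. After that I roped in some talented group members for a few small research projects, and the results were also published on the Galgame site.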

Function entry

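After those articles were reposted to Telegram, they caught the attention of quite a few tech enthusiasts, and some group members asked whether there would be a follow-up. I was glad people liked our research and wanted to keep writing to satisfy your curiosity, but the group members I had teamed up with vanished just like 星奏, and I was busy churning out a thesis to graduate, so I had no time for it.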

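Now things seem to have taken a turn. I graduated and landed an IT job. An older engineer once asked me what I would most like to do if the job description didn't matter. I said I wanted to study computer viruses. He was shocked and asked: does anyone still read assembly these days?? I was just as shocked and asked back: don't your universities teach it? ☝🏻️🥶❓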

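To be honest, studying computer viruses doesn't pay as well as my current job, and computer viruses are on the decline anyway: today's hackers are either figuring out how to steal cryptocurrency or expanding their DDoS arsenals, because that's where the quick money is. Still, I believe anyone with an engineering background will understand that every nerd has a rebellious black flame burning deep inside; even the thoroughly disciplined Japanese harbor a heart that wants to break every rule and push to the limit. And reverse engineering is exactly the field that satisfies this dark desire...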

A Word of Advice for "Study Edition" IDA Pro Users

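Enough rambling; let's talk about IDA Pro 😅
From my years of lurking in various Telegram groups, pirated IDA is everywhere, and plenty of people can casually download an archive of unknown origin from the internet. Add the fact that people in tech circles all have a slightly dark side, and nearly everyone ends up with a pirated copy of IDA, to the point that even a grade-schooler can open IDA, point it at Galgame.exe, and mash F5, as if the barrier to reverse engineering had dropped to "anyone with hands can do it".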

But I need to set a few facts straight:

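First and foremost, IDA is not free software. To some extent, the people at Hex-Rays rely on IDA sales to pay their salaries.

IDA users should be aware of the following facts. IDA is the flagship product of Hex-Rays, so the company takes a very dim view of unauthorized use. In the past, the company found a direct causal link between the release of pirated copies of IDA and a drop in sales. IDA's former publisher, DataRescue, even posted the names of pirates in its "Hall of Shame". To combat piracy, IDA employs several anti-piracy techniques and imposes licensing restrictions.

(Translated from the Chinese edition of The IDA Pro Book, 2nd Edition, by Chris Eagle)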

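Second, although IDA lowers the barrier to reverse engineering, that doesn't mean anyone's skill level has gone up. I've tried several tools billed as IDA replacements, such as Ghidra and Binary Ninja. The former is an open-source tractor: it threw errors when I used it to decompile a Hello World I wrote myself. The latter does offer an IL that simplifies assembly and has its own merits, but it is also pricey and its community is less active than IDA's, so both are still light-years away from replacing IDA.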

Really goes to your head

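Today's reverse-engineering community leans too heavily on pirated IDA, and the plot feels familiar: everyone used pirated Photoshop, and in the end nobody escaped being harvested by Adobe. I worry that if we depend on IDA's features too much, then, setting aside the question of technical progress, we would take a fatal blow if every pirated copy of IDA suddenly stopped working one day. In that sense I'm all for propping up the competition, and I do act on it: I donate to Krita by regularly buying licenses on Steam and gifting them to group members. It's just that, of IDA's current competitors, one is mud that won't stick to the wall and the other has a price tag that scares people off.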

Open-source software be like

Don't rely too heavily on tools; raise your own level of knowledge, and above all keep a sense of awe...

Starting the macOS Debugger

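A few days ago I found a mysterious archive on Telegram (don't rush, it's at the end of the article) containing all-platform installers for the latest IDA Pro 9.1 plus a keygen (!!), which gave me a chance to learn reverse engineering on macOS. With due awe, I wrote a C++ calculator with a paid-membership feature (😁) to practice on.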

The mysterious archive
The members-only calculator

To crack this thing we need to change the branch in the membership-check function, which isn't hard. First, open IDA Pro.

The main function

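The moment it opened I spotted the branches of the switch statement I had written (cases 1, 2, 3, 4), 😅, and in the Functions window at the top left, checkVipStatus(void) was sitting right there... You're so powerful that I'm losing face here...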

I clicked into checkVipStatus, took a glance, couldn't make sense of it, and hurried to decompile it:

bool checkVipStatus(void)
{
  bool v1; // [xsp+8h] [xbp-268h]
  _QWORD v2[72]; // [xsp+18h] [xbp-258h] BYREF

  sub_100000864(v2, &unk_100008018, 8);
  v1 = sub_1000009C0((__int64)v2 + *(_QWORD *)(v2[0] - 24LL));
  sub_1000009E4(v2);
  return v1;
}

I don't yet know what the three functions in the middle do, but clearly the check passes as long as v1 comes back true... Next comes the well-known mov x0, #1...

Sorry, ARM64 assembly is not supported yet

But its built-in assembler doesn't support ARM64, so I immediately went online

Would you dare to use an online compiler?
I'm literally copy-pasting raw machine code

I couldn't keep a straight face; who writes assembly like this... In the end I found a 👉🏻handy assembler plugin that can assemble ARM64 directly inside IDA.
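For the curious, the pasted machine code is not magic: a fixed-width ARM64 instruction is just a 32-bit word. The sketch below is only an illustration in C, not what the plugin does internally. The helper names encode_movz_x, encode_ret, and to_le_bytes are made up for this example, and it covers only the 64-bit MOVZ form with no shift (which is what mov x0, #1 assembles to) plus a plain ret.

```c
#include <assert.h>
#include <stdint.h>

/* Encode `MOVZ Xd, #imm16` (64-bit, no shift). `MOV Xd, #imm` is an alias
 * of this form whenever the immediate fits in 16 bits.
 * Hypothetical helper, written for illustration only. */
static uint32_t encode_movz_x(unsigned rd, uint16_t imm16)
{
    assert(rd < 32);                    /* the register field is 5 bits wide */
    return 0xD2800000u | ((uint32_t)imm16 << 5) | rd;
}

/* `RET` using the default link register X30 has a fixed encoding. */
static uint32_t encode_ret(void)
{
    return 0xD65F03C0u;
}

/* AArch64 instructions are stored little-endian in memory, so this is the
 * byte order a hex editor or a "patch bytes" dialog would show. */
static void to_le_bytes(uint32_t insn, uint8_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (uint8_t)(insn >> (8 * i));
}
```

With these helpers, encode_movz_x(0, 1) gives 0xD2800020, which appears in the file as the bytes 20 00 80 D2, and ret is 0xD65F03C0 (bytes C0 03 5F D6).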

This takes me right back to the school computer lab

After a round of patching, I opened the Patched bytes window and saw:

Who's been messing with my bytes?

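My binary blew up... even though I had only changed one place...
After some digging, I found that IDA Pro had taken it upon itself to modify the program's relocation table, which broke my program.
In my testing, this seems to happen only with ARM64 (macOS) programs compiled from C++. I didn't hit the problem with x64 programs, Go programs, or Windows programs.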

My current workaround is to export the patched binary, open it in a hex editor, and restore those bytes by comparing against the original file.

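Before running it, the finished binary must be signed with codesign (macOS really is strict). Here -s - requests an ad-hoc signature and -f replaces any existing one: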

codesign -f -s - calc 

Final result:

The calculator with its VIP feature cracked

Finally, let's try starting the IDA debugger on macOS.

IDA Pro ships with several debug agents in the '/Applications/IDA Professional 9.1.app/Contents/MacOS/dbgsrv/' directory.

The bundled debug agents

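Copy these agents to another computer and you can remotely debug programs of all sorts of architectures from macOS. (Who doesn't have three or four computers at home?)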

Debugging permissions are hard to obtain on macOS, and my IDA has never managed to get them. So here I'll first try local debugging through mac_server_arm.

IDA Mac OS X 64-bit remote debug server(MT) v9.1.31. Hex-Rays (c) 2004-2025
2025-06-14 15:20:09 Listening on :::23946 

Set up a remote debugger in IDA, pointing at localhost and the port the server printed (23946):

Remote debugger settings
Debugging, setting a breakpoint...

As you can see, setting breakpoints is easy, and so is modifying the X0 register.

Debugging a Windows Program on macOS (a Galgame (!!) 😱)

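Since remote debugging works, let's go for something bigger...
With the Summer Pockets anime blowing up right now, I remembered my SPRB save that I haven't finished in five years and figured this was a good chance to catch up, only to be stopped by the disc check. How on earth did I get this game running back then??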

Warning! Please insert the disc...

Fortunately, others have dealt with this check before. Following their approach, I quickly arrived at the key spot of the disc check:

The same old spot where the check function branches

Same old trick again...

Me pulling the same old trick

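Some group members reading this may be yelling: where's the debugger you promised?!
Yeah, yeah, you're right, my apologies: the debugger would have been an anti-aircraft gun aimed at a mosquito, so this patch didn't use it 😅 Next time for sure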

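I cried. I hadn't opened this game in ages, I've switched computers several times, and I've reinstalled my Wine environment who knows how many times, yet the save is still there. Tears are running down my face 😭😭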

I Feel Awful. Call... Call the AI Over!!

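The last time I did any reversing was, well, last time, and back then generative AI hadn't taken off yet. All along I had been tearing through assembly by hand and guessing variable names and data structures with my own brain... So is it possible to hand this grunt work to AI?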

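Of course it is, and there are already plenty of AI plugins for reverse engineering that install straight into IDA or other reversing tools. I'll list those plugins later; for now, I've decided to pick a function from the Galgame above and quiz today's most advanced generative AIs.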

The function below is SiglusEngine's routine for enumerating the computer's drives one by one. I'll ask the LLMs two questions and see how they do.

Decompiled function:
__int16 __cdecl sub_663E10()
{
  __int16 v0; // bx
  int v1; // esi
  void *v2; // eax
  int v3; // eax
  const WCHAR *v4; // eax
  const WCHAR *v5; // eax
  DWORD FileAttributesW; // eax
  int v8; // [esp+0h] [ebp-A0h]
  UINT uMode; // [esp+14h] [ebp-8Ch]
  void *v10[5]; // [esp+18h] [ebp-88h] BYREF
  unsigned int v11; // [esp+2Ch] [ebp-74h]
  void *Block[5]; // [esp+30h] [ebp-70h] BYREF
  unsigned int v13; // [esp+44h] [ebp-5Ch]
  void *v14; // [esp+48h] [ebp-58h]
  int v15; // [esp+58h] [ebp-48h]
  unsigned int v16; // [esp+5Ch] [ebp-44h]
  LPCWSTR lpFileName[5]; // [esp+60h] [ebp-40h] BYREF
  unsigned int v18; // [esp+74h] [ebp-2Ch]
  LPCWSTR lpRootPathName[5]; // [esp+78h] [ebp-28h] BYREF
  unsigned int v20; // [esp+8Ch] [ebp-14h]
  int v21; // [esp+9Ch] [ebp-4h]

  v0 = 0;
  v1 = 65;
  uMode = SetErrorMode(1u);
  while ( 1 )
  {
    v16 = 7;
    v15 = 0;
    LOWORD(v14) = 0;
    v21 = 0;
    v2 = (void *)sub_55D910(v10, v1);
    LOBYTE(v21) = 1;
    v3 = sub_6211E0(Block, v2, L":\\");
    LOBYTE(v21) = 2;
    LOWORD(lpRootPathName[0]) = 0;
    v20 = 7;
    lpRootPathName[4] = 0;
    sub_52F360(v3, 0, -1);
    if ( v13 >= 8 )
      j__free(Block[0]);
    v13 = 7;
    Block[4] = 0;
    LOWORD(Block[0]) = 0;
    if ( v11 >= 8 )
      j__free(v10[0]);
    v11 = 7;
    v10[4] = 0;
    LOWORD(v10[0]) = 0;
    LOBYTE(v21) = 6;
    if ( v16 >= 8 )
      j__free(v14);
    v16 = 7;
    LOWORD(v14) = 0;
    v4 = (const WCHAR *)lpRootPathName;
    if ( v20 >= 8 )
      v4 = lpRootPathName[0];
    v15 = 0;
    if ( GetDriveTypeW(v4) != 5 )
      goto LABEL_17;
    sub_533920(dword_BAE350 + 76, v8);
    v5 = (const WCHAR *)lpFileName;
    if ( v18 >= 8 )
      v5 = lpFileName[0];
    FileAttributesW = GetFileAttributesW(v5);
    if ( FileAttributesW != -1 && (FileAttributesW & 0x10) == 0 )
      break;
    if ( v18 >= 8 )
      j__free((void *)lpFileName[0]);
LABEL_17:
    v21 = -1;
    if ( v20 >= 8 )
      j__free((void *)lpRootPathName[0]);
    if ( (unsigned __int16)++v1 > 0x5Au )
      goto LABEL_25;
  }
  v0 = v1;
  if ( v18 >= 8 )
    j__free((void *)lpFileName[0]);
  v18 = 7;
  lpFileName[4] = 0;
  LOWORD(lpFileName[0]) = 0;
  if ( v20 >= 8 )
    j__free((void *)lpRootPathName[0]);
LABEL_25:
  SetErrorMode(uMode);
  return v0;
}
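One pattern worth decoding before handing this to the LLMs: every `>= 8` test followed by j__free, and every "= 7" reset, looks like a wide-string object with small-string optimization, where up to 7 characters plus a terminator live inside the object and anything longer goes to the heap. That reading is an assumption drawn from the pattern (it resembles the layout of MSVC's std::wstring), not something the binary confirms. The names WStringModel, wstring_data, and wstring_reset below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical model of the string objects seen in the decompilation
 * (v10/v11, Block/v13, lpFileName/v18, lpRootPathName/v20). */
typedef struct {
    union {
        uint16_t  inline_buf[8]; /* short strings: 7 UTF-16 units + terminator */
        uint16_t *heap_ptr;      /* long strings: pointer to heap storage */
    } u;
    uint32_t size;               /* characters in use (the "[4] = 0" writes) */
    uint32_t capacity;           /* 7 while inline, >= 8 once on the heap */
} WStringModel;

/* Mirrors "v4 = lpRootPathName; if ( v20 >= 8 ) v4 = lpRootPathName[0];". */
static const uint16_t *wstring_data(const WStringModel *s)
{
    return s->capacity >= 8 ? s->u.heap_ptr : s->u.inline_buf;
}

/* Mirrors "if ( v13 >= 8 ) j__free(Block[0]); v13 = 7; Block[4] = 0;
 * LOWORD(Block[0]) = 0;": release heap storage, then become an empty
 * inline string again. */
static void wstring_reset(WStringModel *s)
{
    assert(s != NULL);
    if (s->capacity >= 8)
        free(s->u.heap_ptr);
    s->capacity = 7;
    s->size = 0;
    s->u.inline_buf[0] = 0;
}
```

Read this way, most of the function body is string construction and destruction, and the actual logic shrinks to three API calls: SetErrorMode, GetDriveTypeW, and GetFileAttributesW.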

Question 1: explain what this function does (prompt)

I am reverse engineering a program using IDA, and this function is part of the CD copy protection. Please speculate on the purpose and feature of this function.

Each AI's understanding of the code:

DeepSeek

Based on the provided disassembly, this function appears to be a CD-ROM drive validator designed to locate legitimate game discs as part of copy protection. Here's a breakdown of its purpose and key features:

Core Purpose

  1. Drive Letter Scanning
    Iterates through all drive letters (A: to Z:) to locate CD/DVD drives containing a specific protection file.

  2. CD-ROM Verification
    Uses GetDriveTypeW() to verify if a drive is a CD/DVD-ROM (type 5), filtering out other storage devices.

  3. File Existence Check
    Searches for a critical file (likely a copy protection signature) using GetFileAttributesW(). Requires:

    • File exists (GetFileAttributesW() != -1)
    • Is a regular file (!(attributes & 0x10), i.e., not a directory)

Key Features

  • Error Suppression
    SetErrorMode(1) silences system dialogs (e.g., “no disc” errors) during scanning.

  • String Handling
    Dynamically constructs paths:

    sub_55D910(v10, v1);        // Converts drive letter (e.g., 65/'A') to string
    sub_6211E0(Block, v2, L":\\"); // Appends ":\" to drive letter (e.g., "A:\\")
  • Resource Management
    Uses buffer reallocation checks (vXX >= 8) and free() calls to manage dynamically allocated path strings.

  • Termination Conditions
    Exits early when:

    • A valid CD-ROM drive containing the target file is found (returns drive letter as __int16)
    • All drives (A-Z) are scanned without success (returns 0).

Protection Workflow (the Mermaid flowchart from the original answer is not reproduced here)

Speculated File Characteristics

The target file (referenced in lpFileName) is likely:

  • A hidden/read-only file to deter casual copying
  • Located in the root directory of the disc
  • Contains cryptographic signatures or unique disc identifiers
  • Filename possibly obfuscated/hardcoded (not visible in snippet)

Why This Defeats Simple Copies

  • Checks for physical media (CD-ROM drive type)
  • Requires original disc structure (specific root file)
  • Silent operation prevents user debugging via error messages
  • Iterative scanning thwarts virtual drive tricks (e.g., unmounted drives)

This function forms the media verification layer of the protection system, ensuring the game only runs with an original disc inserted. Later stages would validate the file’s contents against game binaries.


Claude (non-Reasoning)

Looking at this disassembled function, this appears to be a CD-ROM drive detection routine that’s specifically designed to locate a physical CD drive containing what’s likely the original game disc. Here’s my analysis of its purpose and features:

Primary Purpose

This function scans through all available drive letters (A-Z, ASCII 65-90) to find a CD-ROM drive that contains a specific file, presumably part of the copy protection scheme.

Key Features & Functionality

Drive Enumeration: The function iterates through drive letters starting from ‘A’ (ASCII 65) up to ‘Z’ (ASCII 90), systematically checking each one.

CD-ROM Detection: It uses GetDriveTypeW() with a return value check of 5, which corresponds to DRIVE_CDROM in the Windows API. This specifically targets optical disc drives.

File Verification: For each CD-ROM drive found, it constructs a file path and uses GetFileAttributesW() to verify that:

  • The file exists (return value != -1)
  • The target is a file, not a directory (checks that the directory flag 0x10 is not set)

Error Suppression: The function calls SetErrorMode(1u) at the start, which corresponds to SEM_FAILCRITICALERRORS. This prevents Windows from showing error dialog boxes when accessing drives that might not be ready (like empty CD drives).

Memory Management: The code includes extensive string buffer management and cleanup, suggesting it’s building file paths dynamically.

Copy Protection Strategy

This is a classic “disc-in-drive” copy protection check. The software is likely looking for a specific file that exists on the original CD but wouldn’t be present on a copied disc or when running from a hard drive installation. The function returns the drive letter (as an integer) where the authentic disc is found, or 0 if no valid disc is detected.

This type of protection was common in the 1990s and early 2000s, requiring users to keep the original CD in their drive to play the game, even after installation.


ChatGPT

Here’s what this routine is doing, step by step, and why it fits into a CD‑based copy‑protection scheme:

  1. Suppressing error pop‑ups

    uMode = SetErrorMode(1);

    It calls SetErrorMode(SEM_FAILCRITICALERRORS), so that if it probes a non‑existent or unready drive (e.g. no disk in the drive), Windows won’t show a “please insert a disk” dialog.

  2. Brute‑forcing through all drive letters ‘A’…‘Z’

    for (v1 = 'A'; v1 <= 'Z'; ++v1) {
        // … build up a path based on letter v1 …
    }

    It loops v1 from 65 to 90 (ASCII ‘A’ to ‘Z’), attempting each letter as a candidate.

  3. Constructing the root path string

    v2 = sub_55D910(v10, v1);               // e.g. allocates/formats “A:\” in v10
    v3 = sub_6211E0(Block, v2, L":\\");     // maybe splits or duplicates that string
    sub_52F360(v3, 0, -1);                  // something like wcscpy or wcslwr

    Although the exact helpers are unknown, they’re clearly allocating and formatting a wide‐string like "X:\" for the current drive letter.

  4. Checking for CD‑ROM drive type

    if ( GetDriveTypeW(lpRootPathName[0]) == DRIVE_CDROM )
    
    else
        continue;

    It calls GetDriveTypeW(L"X:\") and compares against 5 (which is DRIVE_CDROM). Only CD‑ROM drives keep going.

  5. Reading some volume information or label

    sub_533920(dword_BAE350 + 76, v8);

    This almost certainly calls GetVolumeInformationW under the hood, storing the CD’s volume label (or serial) into a buffer at dword_BAE350 + 76. Many copy‑protections used a custom volume label check.

  6. Verifying presence of a special file

    FileAttributesW = GetFileAttributesW(lpFileName[0]);
    if ( FileAttributesW != INVALID_FILE_ATTRIBUTES 
         && !(FileAttributesW & FILE_ATTRIBUTE_DIRECTORY) )
        break;    // success!

    It calls GetFileAttributesW on another constructed path lpFileName[0] (likely something like "X:\PROTECT.DAT" or whatever the protection file is called). If the file exists and is not a directory, it treats that as a match.

  7. Cleanup and return

    • On success it stores the drive letter in v0 and jumps out.
    • On failure to find any matching CD‑ROM, it leaves v0 = 0.
    • It frees any heap buffers it allocated along the way.
    • Restores the original error‑mode with SetErrorMode(uMode) and returns v0 (the ASCII code of the working drive, or 0).

What this tells us about the copy‑protection feature

  • Drive enumeration & type check: it’s explicitly looking for a CD‑ROM drive, so the software expects an optical disc.
  • Volume label check: by calling (presumably) GetVolumeInformationW, it can verify the disc has the correct volume name—many protection schemes would tell users to “label your CD ‘GAMEPROT89’” or rely on the genuine manufacturer’s disk label.
  • File‐presence check: it looks for a specific, presumably hidden or custom file (like COPYPROT.DAT or a uniquely named binary blob) that wouldn’t be present on a simple data‐only copy.
  • Stealth: suppressing Windows dialogs and cleaning up all allocations makes this check “silent,” so a casual user copying files off the disc wouldn’t even see an error prompt.

Taken together, this function is the classic “find the original CD” routine: scan all drives, find the one that’s a CD‑ROM, verify it has the right volume label and that a secret/protected file exists, and return that drive letter (or zero if no valid CD is found).


Gemini

Based on the analysis of the provided IDA pseudo-code, here is a speculation on the purpose and features of the sub_663E10 function.

Purpose: CD Drive and Media Verification

The primary purpose of this function is to locate a specific file on a CD-ROM drive. This is a very common technique used in CD-based copy protection schemes to verify that the original, authentic game or software disc is present in the computer.

The function systematically checks each available drive, determines if it’s a CD-ROM drive, and if so, looks for a predefined “check file.”

Functional Breakdown

  1. Iterate Through Drive Letters: The function loops through all possible drive letters from ‘A’ to ‘Z’. This is evident from the initialization of v1 to 65 (the ASCII code for ‘A’) and the loop’s exit condition, which checks if v1 has exceeded 'Z' (ASCII 0x5A).

  2. Suppress System Errors: The call SetErrorMode(1u) at the beginning is crucial. It prevents the operating system from showing a pop-up message (like “There is no disk in the drive.”) if it checks a drive that is empty. This ensures the copy protection check runs silently in the background without interrupting the user. The original error mode is restored at the end of the function.

  3. Identify CD-ROM Drives: Inside the loop, the function calls GetDriveTypeW. It specifically checks if the result is 5, which corresponds to DRIVE_CDROM. This step filters out hard drives, network drives, and other non-CD devices.

  4. Check for a Specific File: Once a CD-ROM drive is identified, the code calls GetFileAttributesW to check for the existence of a particular file. The function isn’t just checking if any CD is present, but if the correct CD is present, which is verified by the existence of this specific file. The file’s path is constructed by combining the drive letter with a hardcoded filename (likely referenced by dword_BAE350 + 76).

  5. Return Value:

    • If a CD-ROM drive containing the target file is found, the loop is broken, and the function returns the ASCII value of the drive letter (e.g., 69 for ‘E’).
    • If the loop completes without finding the file on any CD-ROM drive, the function returns its initial value of 0, indicating failure.

In summary, this function is a classic “CD check” routine. Its sole feature is to confirm the physical presence of the original media by searching all CD-ROM drives for a unique file.

Question 2: infer variable types and names (prompt)

I am reverse engineering a program using IDA, and this function is part of the CD copy protection. Please speculate on the types and names of the variables and subroutines used in this function, and output the result code snippet. Also, try to recover the original logical structure of the function and eliminate goto statements.

Each AI's reconstructed function:

DeepSeek's reconstructed function
__int16 FindCDDriveWithFile()
{
    UINT uOldErrorMode = SetErrorMode(SEM_FAILCRITICALERRORS);
    __int16 foundDrive = 0;

    for (__int16 driveLetter = 'A'; driveLetter <= 'Z'; driveLetter++)
    {
        // Build drive root path (e.g., "A:\\")
        WCHAR szDriveRoot[4] = { driveLetter, L':', L'\\', L'\0' };
        
        // Check if drive is CD-ROM
        if (GetDriveTypeW(szDriveRoot) != DRIVE_CDROM)
            continue;

        // Build expected file path (global + relative path)
        const WCHAR* pBasePath = *(const WCHAR**)(dword_BAE350 + 76);
        size_t baseLen = wcslen(pBasePath);
        size_t rootLen = wcslen(szDriveRoot);
        
        WCHAR* szFilePath = (WCHAR*)malloc((rootLen + baseLen + 1) * sizeof(WCHAR));
        wcscpy(szFilePath, szDriveRoot);
        wcscat(szFilePath, pBasePath);

        // Check if file exists and is not a directory
        DWORD dwAttrs = GetFileAttributesW(szFilePath);
        if (dwAttrs != INVALID_FILE_ATTRIBUTES && 
           !(dwAttrs & FILE_ATTRIBUTE_DIRECTORY))
        {
            foundDrive = driveLetter;
            free(szFilePath);
            break;
        }
        
        free(szFilePath);
    }

    SetErrorMode(uOldErrorMode);
    return foundDrive;
}

Claude's reconstructed function
#include <windows.h>

// Helper function prototypes (these would be defined elsewhere)
void* BuildDriveLetterString(void* buffer, int driveNumber);
int AppendString(void* dest, void* src, const wchar_t* suffix);
void InitializeStringBuffer(void* buffer, int unused1, int unused2);
void BuildProtectionFilePath(void* globalData, void* outputBuffer);

// Global data structure containing protection file information
extern void* g_ProtectionData; // Corresponds to dword_BAE350

int16 __cdecl DetectProtectedCDROM()
{
    int16 foundDriveLetter = 0;
    int currentDriveLetter = 'A';  // Start with drive A (ASCII 65)
    
    // String buffers with capacity management
    struct StringBuffer {
        void* data[5];
        unsigned int capacity;
    };
    
    StringBuffer driveLetterBuffer = {0};
    StringBuffer drivePathBuffer = {0};
    StringBuffer protectionFilePathBuffer = {0};
    StringBuffer rootPathBuffer = {0};
    
    // Set capacity for string buffers
    driveLetterBuffer.capacity = 7;
    drivePathBuffer.capacity = 7;
    protectionFilePathBuffer.capacity = 7;
    rootPathBuffer.capacity = 7;
    
    // Disable system error dialogs during drive scanning
    UINT previousErrorMode = SetErrorMode(SEM_FAILCRITICALERRORS);
    
    // Scan through all possible drive letters (A-Z)
    while (currentDriveLetter <= 'Z') {
        // Reset string buffers for this iteration
        memset(&driveLetterBuffer, 0, sizeof(driveLetterBuffer));
        memset(&drivePathBuffer, 0, sizeof(drivePathBuffer));
        memset(&protectionFilePathBuffer, 0, sizeof(protectionFilePathBuffer));
        memset(&rootPathBuffer, 0, sizeof(rootPathBuffer));
        
        driveLetterBuffer.capacity = 7;
        drivePathBuffer.capacity = 7;
        protectionFilePathBuffer.capacity = 7;
        rootPathBuffer.capacity = 7;
        
        try {
            // Build drive letter string (e.g., "C")
            void* driveStr = BuildDriveLetterString(&driveLetterBuffer, currentDriveLetter);
            
            // Build root path string (e.g., "C:\")
            AppendString(&drivePathBuffer, driveStr, L":\\");
            
            // Initialize protection file path buffer
            InitializeStringBuffer(&protectionFilePathBuffer, 0, -1);
            
            // Get the root path for drive type checking
            const wchar_t* rootPath = (const wchar_t*)rootPathBuffer.data;
            if (rootPathBuffer.capacity >= 8) {
                rootPath = (const wchar_t*)rootPathBuffer.data[0];
            }
            
            // Check if this is a CD-ROM drive (type 5 = DRIVE_CDROM)
            UINT driveType = GetDriveTypeW(rootPath);
            if (driveType == DRIVE_CDROM) {
                // Build path to protection file on this CD-ROM
                BuildProtectionFilePath(g_ProtectionData, &protectionFilePathBuffer);
                
                // Get the protection file path
                const wchar_t* protectionFilePath = (const wchar_t*)protectionFilePathBuffer.data;
                if (protectionFilePathBuffer.capacity >= 8) {
                    protectionFilePath = (const wchar_t*)protectionFilePathBuffer.data[0];
                }
                
                // Check if protection file exists and is not a directory
                DWORD fileAttributes = GetFileAttributesW(protectionFilePath);
                if (fileAttributes != INVALID_FILE_ATTRIBUTES && 
                    !(fileAttributes & FILE_ATTRIBUTE_DIRECTORY)) {
                    // Found the protected CD-ROM!
                    foundDriveLetter = currentDriveLetter;
                    break;
                }
            }
        }
        catch (...) {
            // Clean up on any exception and continue to next drive
        }
        
        // Clean up string buffers before next iteration
        if (driveLetterBuffer.capacity >= 8) {
            free(driveLetterBuffer.data[0]);
        }
        if (drivePathBuffer.capacity >= 8) {
            free(drivePathBuffer.data[0]);
        }
        if (protectionFilePathBuffer.capacity >= 8) {
            free(protectionFilePathBuffer.data[0]);
        }
        if (rootPathBuffer.capacity >= 8) {
            free(rootPathBuffer.data[0]);
        }
        
        currentDriveLetter++;
    }
    
    // Final cleanup
    if (protectionFilePathBuffer.capacity >= 8) {
        free(protectionFilePathBuffer.data[0]);
    }
    if (rootPathBuffer.capacity >= 8) {
        free(rootPathBuffer.data[0]);
    }
    
    // Restore original error mode
    SetErrorMode(previousErrorMode);
    
    return foundDriveLetter;
}

// Simplified version without complex string buffer management
int16 __cdecl DetectProtectedCDROM_Simplified()
{
    UINT previousErrorMode = SetErrorMode(SEM_FAILCRITICALERRORS);
    
    // Scan drive letters A through Z
    for (int driveLetter = 'A'; driveLetter <= 'Z'; driveLetter++) {
        wchar_t rootPath[4];
        swprintf(rootPath, 4, L"%c:\\", driveLetter);
        
        // Check if this is a CD-ROM drive
        if (GetDriveTypeW(rootPath) == DRIVE_CDROM) {
            // Build path to protection file
            wchar_t protectionFilePath[MAX_PATH];
            // This would use the actual protection file name from global data
            swprintf(protectionFilePath, MAX_PATH, L"%c:\\PROTECTION_FILE.DAT", driveLetter);
            
            // Check if protection file exists and is a regular file
            DWORD attributes = GetFileAttributesW(protectionFilePath);
            if (attributes != INVALID_FILE_ATTRIBUTES && 
                !(attributes & FILE_ATTRIBUTE_DIRECTORY)) {
                SetErrorMode(previousErrorMode);
                return (int16)driveLetter;
            }
        }
    }
    
    SetErrorMode(previousErrorMode);
    return 0; // No protected CD-ROM found
}

ChatGPT's reconstructed function
#include <windows.h>

// --- Speculative helper routines ---
WCHAR*   FormatDriveRoot(WCHAR *outRoot, char driveLetter);
//   Sub_55D910: builds L"X:" into outRoot based on driveLetter ('A'–'Z').

WCHAR*   AppendSuffix(WCHAR *outBuf, const WCHAR *base, const WCHAR *suffix);
//   Sub_6211E0: concatenates base + suffix into outBuf.

void     FillMemoryPattern(void *buf, BYTE pattern, size_t size);
//   Sub_52F360: e.g. memset(buf, pattern, size).

void     ReadCdSector(const void *configPtr, int *outSectorData);
//   Sub_533920: perhaps reads a sector or header into outSectorData.

// --- Recovered function ---
uint16_t __cdecl CheckCdCopyProtection(void)
{
    uint16_t foundDriveLetter = 0;
    UINT     prevErrorMode    = SetErrorMode(SEM_FAILCRITICALERRORS);

    // Try each drive letter from 'A' to 'Z'
    for (char letter = 'A'; letter <= 'Z'; ++letter)
    {
        WCHAR rootPath[8] = {0};
        WCHAR workBuf [260] = {0};  // big enough for any Build + suffix + filename
        int   sectorInfo = 0;

        // 1) Build the root path string, e.g. "X:"
        FormatDriveRoot(rootPath, letter);

        // 2) Append ":\\" so we can call GetDriveTypeW()
        AppendSuffix(workBuf, rootPath, L":\\");
        FillMemoryPattern(workBuf, 0, sizeof(workBuf));  // clear out any leftover

        // 3) Check if this drive is a CD-ROM
        if ( GetDriveTypeW(rootPath) != DRIVE_CDROM )
            continue;

        // 4) Read whatever sector or header the copy‑protection needs
        ReadCdSector((const void*)(dword_BAE350 + 76), &sectorInfo);

        // 5) Build the expected “protection file” name into workBuf
        //    (e.g. rootPath + L"\\PROTECT.DAT" or similar, based on sectorInfo)
        AppendSuffix(workBuf, rootPath, /* format depends on sectorInfo */ L"\\PROTECT.DAT");

        // 6) See if that file exists (and is not a directory)
        DWORD attrs = GetFileAttributesW(workBuf);
        if ( attrs != INVALID_FILE_ATTRIBUTES
             && !(attrs & FILE_ATTRIBUTE_DIRECTORY) )
        {
            foundDriveLetter = (uint16_t)letter;
            break;
        }

        // otherwise keep scanning
    }

    // restore previous error‑mode and return 0 if none found, or the ASCII code of the drive letter
    SetErrorMode(prevErrorMode);
    return foundDriveLetter;
}

Gemini's reconstructed function
/**
 * @brief Scans all drives to find the game's CD.
 *
 * Iterates through drive letters 'A' to 'Z', checking for a CD-ROM drive
 * containing a specific, hardcoded protection file.
 *
 * @return Returns the drive letter (e.g., 'A', 'B') if the CD is found, otherwise returns 0.
 */
wchar_t FindGameCD()
{
    wchar_t foundDriveLetter = 0;

    // Suppress system pop-ups for empty drives.
    // SEM_FAILCRITICALERRORS = 1
    UINT previousErrorMode = SetErrorMode(1u);

    // Iterate through all possible drive letters from 'A' to 'Z'.
    for (wchar_t currentDriveLetter = L'A'; currentDriveLetter <= L'Z'; ++currentDriveLetter)
    {
        // Construct the root path for the current drive, e.g., "A:\\".
        // In the original code, this was done through a series of helper functions.
        wchar_t rootPathStr[4] = { currentDriveLetter, L':', L'\\', L'\0' };

        // Check if the drive is a CD-ROM drive. DRIVE_CDROM = 5.
        if (GetDriveTypeW(rootPathStr) == DRIVE_CDROM)
        {
            // Construct the full path to the protection file.
            // The original code used a helper function (sub_533920) and a global
            // variable (dword_BAE350 + 76) for the filename.
            // Let's assume the file is "PROT\\GAME.DAT" for this example.
            wchar_t protectionFilePath[MAX_PATH];
            swprintf(protectionFilePath, MAX_PATH, L"%s%s", rootPathStr, L"PROT\\GAME.DAT");

            // Check if the protection file exists and is not a directory.
            DWORD fileAttributes = GetFileAttributesW(protectionFilePath);
            if (fileAttributes != INVALID_FILE_ATTRIBUTES && !(fileAttributes & FILE_ATTRIBUTE_DIRECTORY))
            {
                // File found, save the drive letter and exit the loop.
                foundDriveLetter = currentDriveLetter;
                break;
            }
        }
    }

    // Restore the previous error mode.
    SetErrorMode(previousErrorMode);

    return foundDriveLetter;
}

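LLM Reverse-Engineering Comparison

Based on the analysis of the CD check function above, I evaluated in detail how well four mainstream LLMs understand reverse engineering: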

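| Criterion | DeepSeek | Claude (non-Reasoning) | ChatGPT | Gemini |
| Drive enumeration | 🟢 Fully understood the A-Z enumeration | 🟢 Fully understood | 🟢 Fully understood | 🟢 Fully understood |
| Disc type detection | 🟢 Correctly identified GetDriveTypeW() == 5 | 🟢 Correctly identified | 🟢 Correctly identified | 🟢 Correctly identified |
| Error suppression | 🟢 Correctly identified the purpose of SetErrorMode(1) | 🟢 Correctly identified | ⭐ Explained SEM_FAILCRITICALERRORS | 🟢 Correctly identified |
| File verification logic | 🟢 Understood the file-existence check | 🟢 Understood the file-attribute check | 🟢 Understood the directory-flag check | 🟢 Understood the lookup of a specific file |
| Return value | ⭐ Accurately described returning the drive letter | ⭐ Accurately described returning the drive letter | ⭐ Accurately described returning the ASCII value | ⭐ Accurately described returning the ASCII value |
| Copy-protection principle | ⭐ Explained the protection mechanism in depth | ⭐ Explained the protection strategy in depth | ⭐ Explained the protection principle in depth | ⭐ Explained the protection purpose in depth |
| Technical depth | ⭐ Included a Mermaid flowchart | 🟢 Clear, structured analysis | ⭐ Line-by-line code analysis | 🟢 Detailed functional breakdown |
| Terminology | ⭐ Accurate reversing terminology | ⭐ Accurate Windows API usage | ⭐ Accurate system-call terminology | ⭐ Accurate technical terminology |
| Practical suggestions | ⭐ Offered ideas for follow-up verification | 🟢 Provided historical background | ⭐ Provided concrete technical details | 🟡 Basic functional description |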

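Every AI is good enough to serve as a reverse-engineering assistant, but DeepSeek and ChatGPT are clearly both in the top tier: ChatGPT gave the most detailed code explanation, while DeepSeek drew a flowchart, which is very helpful for understanding the code's logical structure.

For restoring the function's original logical structure, the four LLMs compare as follows: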

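| Criterion | DeepSeek | Claude (non-Reasoning) | ChatGPT | Gemini |
| Code conciseness | ⭐ Most concise and clear | 🟡 Overcomplicated | 🟢 Moderately detailed | 🟢 Concise and clear |
| Core logic recovery | ⭐ Perfectly recovered the drive enumeration | ⭐ Fully preserved the original structure | ⭐ Accurately identified the main flow | ⭐ Correctly understood the core functionality |
| Memory management | 🟢 Simplified to standard library calls | 🔴 Over-simulated the original complexity | 🟢 Appropriately abstracted | 🟢 Used modern C++ style |
| Variable naming | ⭐ Best semantic naming | 🟢 Descriptive naming | 🟢 Professional naming conventions | 🟢 Intuitive and easy to follow |
| Error handling | 🟢 Correct use of SetErrorMode | 🟢 Complete error handling | 🟢 Professional error handling | 🟢 Standard error handling |
| Readability | ⭐ Easiest to read and understand | 🟡 Too verbose | 🟢 Clear structure | 🟢 Clear logic |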

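For helping a reverse engineer understand this code, DeepSeek strikes the best balance between conciseness and precision, and its code is suitable for an actual rewrite and for maintenance;
ChatGPT also gave a very detailed explanation and did its best to guess at several of the helper functions called along the way, which suits people who like to dig deeper.

As for Claude? Your free tier doesn't even have a Reasoning mode; Gemini also feels a bit lacking (yes, I'm biased, so what?).
Better not crash the party...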

A First Look at ARM64 Assembly

Finally, I'd like to share how ARM64 assembly felt at first sight to a former x86 reverse engineer like me.

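Registers

Here are the x86-64 registers (the argument order follows the System V AMD64 calling convention used on Linux and macOS; Windows x64 passes the first four arguments in RCX, RDX, R8, R9 instead):

| Register | Role | Register | Role |
| RAX | Accumulator, usually holds the function return value | R8 | 5th argument |
| RBX | Base register, callee-saved under the calling convention | R9 | 6th argument |
| RCX | 4th argument | R10 | General-purpose, caller-saved |
| RDX | 3rd argument | R11 | General-purpose, caller-saved |
| RSI | 2nd argument | R12 | General-purpose, callee-saved |
| RDI | 1st argument | R13 | General-purpose, callee-saved |
| RBP | Frame pointer | R14 | General-purpose, callee-saved |
| RSP | Stack pointer | R15 | General-purpose, callee-saved |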

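Here are the ARM64 registers:

| Register | Role | Register | Role |
| X0-X7 | Argument registers; the function return value goes in X0 | X16 (IP0) | Intra-procedure-call scratch register |
| X8 | Indirect result register (on Linux it also carries the system call number; macOS uses X16 for that) | X17 (IP1) | Intra-procedure-call scratch register |
| X9-X15 | Temporary registers, caller-saved | X18 | Platform register (reserved) |
| X19-X28 | Callee-saved registers | X29 | Frame pointer |
| X30 (LR) | Link register (return address) | SP | Stack pointer |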

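A quick comparison shows that X0 corresponds to RAX for passing the function's return value, and X29 corresponds to RBP. How the remaining registers get used depends on the compiler's mood.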

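Stack Operations

On x86, stack operations usually use the push and pop instructions, which push a register's value onto the stack or pop a value off it.
For example, push rax pushes the value of RAX onto the stack, and pop rbx pops a value off the stack into RBX. The stack pointer RSP is adjusted automatically and always points at the item on top of the stack, that is, the most recently pushed value.

On ARM64, stack operations use the stp and ldp instructions. They transfer two registers at once (store/load pair), and any adjustment of the stack pointer has to be spelled out explicitly.
Say I want to push four registers: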

STP             X28, X27, [SP, #-0x20]! ; Store Pair
STP             X29, X30, [SP, #0x10] ; Store Pair

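That's abstract; what does it mean? A register is 64 bits wide, i.e. 8 bytes, so two registers take 0x10 bytes.
The first instruction stores X28 and X27 four slots (0x20 bytes) above the current stack pointer, that is, at the lower address SP - 0x20. The exclamation mark after the brackets means pre-indexing: SP is first decremented by 0x20, and then the data is written to where the new SP points (!!).
The second instruction has no exclamation mark, so SP stays put, and X29 and X30 are stored two slots (0x10 bytes) below SP, at the higher address SP + 0x10.

Note that SP must be aligned to a 16-byte boundary whenever it is used as the base of a memory access, which is why the total adjustment of 0x20 is a multiple of 16. (The #0x10 offset itself only needs to be a multiple of 8 for a pair of 64-bit registers.)
After these two instructions run, the stack looks like this: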

| Stack address | Contents |
| SP | X28 |
| SP + 8 | X27 |
| SP + 16 | X29 |
| SP + 24 | X30 |
| … | (toward the stack bottom) |

x86, I bet you never imagined a stack pointer that has to be adjusted by hand...!!

Now say I want to pop the four registers off the stack:

LDP             X29, X30, [SP, #0x10] ; Load Pair
LDP             X28, X27, [SP], #0x20

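The first line should be easy: it reads X29 and X30 from two slots (0x10 bytes) below the current SP, that is, from SP + 0x10, and leaves SP unchanged.
You may feel like cursing: why can you read something off the stack without the stack pointer moving? Why can you grab data straight from the middle of the stack? I know you're anxious, but hold on.
In the second line the #0x20 sits outside the square brackets (post-indexing), which means the instruction reads X28 and X27 from the current SP and then moves SP down by 0x20 bytes (to SP + 0x20), clearing four slots of stack space in one go.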
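The three addressing forms used above (signed offset, pre-index with "!", and post-index with the offset outside the brackets) can be modeled in a few lines of C. This is a toy simulation for building intuition, not an emulator: the Machine struct, the 256-byte memory, and all function names are invented for this sketch, and the register values are passed in as plain integers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy machine: a stack pointer plus a small block of "stack memory". */
typedef struct {
    uint64_t sp;        /* simulated SP, a byte offset into mem */
    uint8_t  mem[256];  /* simulated stack memory */
} Machine;

static void store_pair(Machine *m, uint64_t addr, uint64_t a, uint64_t b)
{
    assert(addr <= sizeof m->mem - 16);
    memcpy(&m->mem[addr], &a, 8);       /* first register at addr */
    memcpy(&m->mem[addr + 8], &b, 8);   /* second register at addr + 8 */
}

static void load_pair(const Machine *m, uint64_t addr, uint64_t *a, uint64_t *b)
{
    assert(addr <= sizeof m->mem - 16);
    memcpy(a, &m->mem[addr], 8);
    memcpy(b, &m->mem[addr + 8], 8);
}

/* STP a, b, [SP, #off]!  pre-index: update SP first, then store at the new SP. */
static void stp_pre(Machine *m, int64_t off, uint64_t a, uint64_t b)
{
    m->sp += (uint64_t)off;
    store_pair(m, m->sp, a, b);
}

/* STP a, b, [SP, #off]   signed offset: store at SP + off, SP unchanged. */
static void stp_off(Machine *m, int64_t off, uint64_t a, uint64_t b)
{
    store_pair(m, m->sp + (uint64_t)off, a, b);
}

/* LDP a, b, [SP, #off]   signed offset: load from SP + off, SP unchanged. */
static void ldp_off(const Machine *m, int64_t off, uint64_t *a, uint64_t *b)
{
    load_pair(m, m->sp + (uint64_t)off, a, b);
}

/* LDP a, b, [SP], #off   post-index: load from the current SP, then update SP. */
static void ldp_post(Machine *m, int64_t off, uint64_t *a, uint64_t *b)
{
    load_pair(m, m->sp, a, b);
    m->sp += (uint64_t)off;
}
```

Starting from sp = 0x80, stp_pre(&m, -0x20, ...) followed by stp_off(&m, 0x10, ...) leaves sp at 0x60 with four values stored at 0x60 through 0x78; ldp_off(&m, 0x10, ...) followed by ldp_post(&m, 0x20, ...) reads them back and returns sp to 0x80.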

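I know you're getting more and more anxious, not only because this stack is no longer strictly last-in-first-out, but also because an exclamation mark makes SP change automatically, and so does writing the offset outside the brackets. It's pure mental pollution...

It's like those people who start eating their groceries in the supermarket before they've paid...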

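Function Calls, Stack Frames, and Returns

On x86, a function is usually called with the call instruction and returns with ret.
call automatically pushes the return address onto the stack and jumps to the function's entry point; ret pops the return address off the stack and jumps back.

The stack changes during this process as follows: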

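| Before call | After call | Inside the function | Before ret | After ret |
| | | caller's RBP | | |
| | return address | return address | return address | |
| arguments | arguments | arguments | arguments | arguments |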

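On ARM64, a function is called with the bl instruction (branch with link), which stores the return address in the link register X30. Returning uses ret, which reads the return address from X30.
X30 is overwritten by the next bl, so a function that calls other functions has to save X30 on the stack before doing its real work, and load the return address from the stack back into X30 before it returns.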
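Why the save is mandatory becomes obvious once a call is nested. The following toy model in C is a sketch under simple assumptions: the Cpu struct, the addresses 0x1000, 0x2000, and 0x3000, and the function names are all invented, and only the program counter and X30 are tracked.

```c
#include <assert.h>
#include <stdint.h>

/* Toy CPU state: just the program counter and the link register. */
typedef struct {
    uint64_t pc;
    uint64_t x30;
} Cpu;

/* BL target: remember the address of the next instruction in X30, then jump. */
static void do_bl(Cpu *c, uint64_t target)
{
    c->x30 = c->pc + 4;
    c->pc = target;
}

/* RET: jump to whatever X30 currently holds. */
static void do_ret(Cpu *c)
{
    c->pc = c->x30;
}

/* A caller at 0x1000 calls "outer" at 0x2000, which calls "inner" at 0x3000.
 * Returns the PC after outer executes its final RET. */
static uint64_t nested_call(int save_lr)
{
    Cpu c = { 0x1000, 0 };
    do_bl(&c, 0x2000);          /* X30 = 0x1004: the way back to the caller */
    uint64_t saved = c.x30;     /* outer's prologue: what STP X29, X30 preserves */
    do_bl(&c, 0x3000);          /* X30 = 0x2004: the old value is gone */
    do_ret(&c);                 /* inner returns into outer at 0x2004 */
    if (save_lr)
        c.x30 = saved;          /* outer's epilogue: LDP X29, X30 restores it */
    do_ret(&c);
    return c.pc;
}
```

nested_call(1) ends at 0x1004, back in the caller. nested_call(0) ends at 0x2004, so outer jumps back into its own body instead of returning, which is exactly the failure the STP/LDP pair prevents.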

Let's compare the x86 and ARM64 calling sequences, starting with the parent function:

; x86 function call
CALL function_entry

; ARM64 function call
BL function_entry

Then the child function:

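; x86 child function
PUSH  RBP           ; save the frame pointer
MOV   RBP, RSP      ; set up the new frame pointer
SUB   RSP, 0x20     ; allocate space for local variables

;... function body ...

MOV   RAX, return_value ; set the return value
MOV   RSP, RBP      ; restore the stack pointer, freeing the locals
POP   RBP           ; restore the frame pointer
RET                 ; return to the caller

; ARM64 child function
SUB   SP, SP, #0x20        ; allocate space for local variables
STP   X29, X30, [SP,#0x10] ; save the frame pointer and link register (return address)
ADD   X29, SP, #0x10       ; set up the new frame pointer

;... function body ...

MOV   X0, return_value     ; set the return value
LDP   X29, X30, [SP,#0x10] ; restore the frame pointer and link register (return address)
ADD   SP, SP, #0x20        ; restore the stack pointer, freeing the locals
RET                        ; return to the caller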

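So, are you, 😁, gradually, 😁, getting used to ARM64's stack pointer flying all over the place and its frame pointer shooting off everywhere?
That's the spirit! 😁👍🏻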

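Conclusion

Ever since Apple released the M1 chip in 2020, ARM's rise has been unstoppable. Rosetta 2, the x86 compatibility layer released alongside it, brought us gamers so much joy, yet Apple recently announced that it will be retired in a couple of years.
If this trend continues, x86 may well be swept into the dustbin of history sooner than we imagine.

We old x86 reversing hobbyists once assumed x86 would stay with us for our entire careers, right up until, 😁, we stepped into the coffin, 😁, but as things stand, 😁, it's Intel, 😁, that will step into the coffin before we do...

It's time to keep up with the times and step out of the comfort zone.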

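Study Advice and References

Study Advice

I sincerely recommend an interest-driven approach. Pick a program that interests you but doesn't look too hard, reverse it, try to solve the problems you run into along the way, and build up experience in the process.

Everyone gets started differently: I began with game hacking in CheatEngine, and another group member is currently using CheatEngine to modify Plants vs. Zombies. Free-form research that goes from easy to hard like this is the best. Remember to share your experience, insights, and results with me!

If you don't have a clear goal yet, I have some material to share with you.

References

🎮 Game cracking 👇🏻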

