================================================================
ASUS ProStudio W7604 / i9-13980HX / Arch Linux
Performance Optimization Session Summary
================================================================

# ASUS ProStudio W7604 — Linux CPU Throttling Investigation

## System

| Component | Detail |
|-----------|--------|
| Laptop | ASUS ProStudio W7604 (W7604J3 family) |
| CPU | Intel Core i9-13980HX (Raptor Lake-HX, 24 cores / 32 threads) |
| OS | Arch Linux, KDE Plasma |
| Bootloader | systemd-boot (`/boot/loader/entries/linux.conf`) |
| Shell | fish |

---

## The Problem

The system was experiencing **severe CPU frequency throttling under sustained
multi-core load**, specifically during long compilation jobs. The CPU would
run at ~3.8 GHz for 10–15 seconds, then drop to ~2.8 GHz on the P-cores
(~55W package power) and hold there for the remainder of the workload.

This made compilation jobs significantly slower than expected for a high-end
mobile workstation CPU.

---

## Observed Symptoms

- Under `stress-ng --cpu 0`, CPU frequencies would burst to ~3.8 GHz for
  roughly 10–15 seconds, then permanently settle to:
  - **P-cores: ~2.8 GHz**
  - **E-cores: ~2.3 GHz**
- This plateau persisted for the entire duration of the load regardless of
  thermal headroom — temperatures were only ~74–80°C with fans barely active
- CPU package power draw would start at ~130W during the burst phase, then
  drop and hold at **~55W** once throttled — the classic PL1 power limit clamp
- Fans were running at only ~3,800 RPM under full CPU load, indicating the
  firmware was not responding to the thermal demand
- The ACPI `platform_profile` and `throttle_thermal_policy` sysfs interfaces
  were not present — standard ASUS Linux tooling had no effect on the system
- `scaling_available_frequencies` (under `acpi-cpufreq`) reported a maximum
  of **2.401 GHz** — the firmware was presenting a throttled P-state table
  to the OS

---

## What We Were Fixing

1. **Raise and maintain the MMIO RAPL power limits** so the CPU can sustain
   more than 55W under load. The ~55W MMIO cap is the direct cause of the
   frequency plateau.

2. **Switch the EC to its performance thermal profile** at every boot, so
   fans respond correctly and the EC stops repeatedly resetting power limits
   to conservative values

3. **Ensure the correct CPU frequency driver is loaded** so the OS presents
   the correct hardware frequency ceiling to the scheduler rather than the
   firmware's throttled P-state table

4. **Make all of the above persistent** across reboots via systemd services
   and configuration files, since the EC resets all of this on every power
   cycle


----------------------------------------------------------------
CPU CORE TOPOLOGY — Intel Core i9-13980HX (Raptor Lake-HX)
----------------------------------------------------------------

CORE LAYOUT
  Total cores:    24
  Total threads:  32
  P-cores:         8 x Intel Raptor Cove architecture (HT = 2 threads each = 16 threads)
  E-cores:        16 x Intel Gracemont architecture (1 thread each = 16 threads)

FREQUENCY SPECIFICATIONS (Intel spec)
  P-core base frequency:        2.2 GHz
  P-core max turbo (single):    5.6 GHz
  P-core max turbo (all-core):  ~3.5-4.5 GHz (workload dependent)
  E-core max turbo:             4.0 GHz
  E-core base frequency:        1.6 GHz

  Note: The 5.6 GHz figure is the single-core Intel Turbo Boost Max 3.0
  peak for the best-performing P-core. All-core sustained turbo under
  full load is significantly lower and depends entirely on power limits
  and thermal headroom.

CURRENT OBSERVED FREQUENCIES UNDER LOAD (this system)
  Burst phase (~10-15s):        ~3.2-3.8 GHz P-cores, limited by PL2=130W
  Sustained phase (after burst): previously ~2.8 GHz P / ~2.3 GHz E at 55W
                                 now ~3.2-3.8 GHz P / ~2.4 GHz E at 100W
  Logical CPU numbering:        cpu0-cpu15  = P-core threads (8 cores x 2)
                                 cpu16-cpu31 = E-cores (1 thread each)
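
  The cpu0-15 / cpu16-31 split above can be checked rather than assumed.
  A minimal Python sketch, assuming the kernel exposes the hybrid PMU
  cpulists at /sys/devices/cpu_core/cpus and /sys/devices/cpu_atom/cpus
  (common on recent kernels with hybrid Intel CPUs, but not verified in
  this session). The function names are illustrative only:

```python
from __future__ import annotations

from pathlib import Path


def parse_cpulist(text: str) -> list[int]:
    """Expand a kernel cpulist string such as "0-15,20" into CPU ids."""
    cpus: list[int] = []
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def read_core_types(sys_devices: str = "/sys/devices") -> dict[str, list[int]]:
    """Return {"P": [...], "E": [...]}; a list is empty if its file is absent."""
    result: dict[str, list[int]] = {}
    for label, pmu in (("P", "cpu_core"), ("E", "cpu_atom")):
        path = Path(sys_devices) / pmu / "cpus"
        result[label] = parse_cpulist(path.read_text()) if path.is_file() else []
    return result


if __name__ == "__main__":
    for label, cpus in read_core_types().items():
        print(f"{label}-cores: {len(cpus)} logical CPUs -> {cpus}")
```

  If both lists come back empty, the assumed sysfs paths are not
  present and the numbering has to be confirmed another way.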

LINUX DRIVER NOTE
  With intel_pstate=active, scaling_max_freq correctly reports 5.4 GHz
  for P-cores. With intel_pstate=disable (acpi-cpufreq), the ACPI _PSS
  table incorrectly reported 2.401 GHz as the ceiling for all cores —
  this was one of the early misdiagnoses in this session. The 4.0 GHz
  E-core ceiling previously caused confusion because intel_pstate in
  some modes reads it as the global hardware turbo limit, incorrectly
  capping P-cores at 4.0 GHz as well.


----------------------------------------------------------------
SYSTEM CONFIGURATION — WHAT IS WORKING
----------------------------------------------------------------

KERNEL BOOT PARAMETERS
  File: /boot/loader/entries/linux.conf (options line)

  acpi_enforce_resources=lax
    - Allows asus-isa-000a hwmon driver to load
    - Exposes /sys/class/hwmon/hwmonX/pwm1_enable and pwm2_enable
    - Required for any fan control from the OS side

  intel_pstate=active
    - Uses Intel P-state driver directly
    - Bypasses ACPI _PSS table which incorrectly caps at 2.4 GHz
    - Correctly exposes 5.4 GHz scaling_max_freq
    - Note: intel_pstate=disable was tried and reverted — it caused
      acpi-cpufreq to read the _PSS table and cap frequencies at
      2.401 GHz, making performance worse
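
  Because a wrong or missing boot parameter silently brings back the
  2.401 GHz ceiling, it can help to verify the running kernel command
  line after editing the loader entry. A small Python sketch; the
  EXPECTED table simply mirrors the two options listed above and the
  helper names are hypothetical:

```python
from __future__ import annotations

from pathlib import Path

# The two options this document relies on.
EXPECTED = {"acpi_enforce_resources": "lax", "intel_pstate": "active"}


def parse_cmdline(cmdline: str) -> dict[str, str | None]:
    """Split a kernel command line into {key: value}; bare flags map to None."""
    params: dict[str, str | None] = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def check_cmdline(cmdline: str, expected: dict[str, str] = EXPECTED) -> list[str]:
    """Return a list of problems; an empty list means every option matches."""
    params = parse_cmdline(cmdline)
    problems = []
    for key, want in expected.items():
        if key not in params:
            problems.append(f"{key} is missing (expected {key}={want})")
        elif params[key] != want:
            problems.append(f"{key}={params[key]} (expected {want})")
    return problems


if __name__ == "__main__":
    for problem in check_cmdline(Path("/proc/cmdline").read_text()):
        print("WARNING:", problem)
    driver = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_driver")
    if driver.is_file():
        print("scaling_driver:", driver.read_text().strip())
```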


THROTTLED  (/etc/throttled.conf)
  - Continuously re-applies CPU package power limits every 5 seconds
  - Targets BOTH MSR and MMIO (intel-rapl-mmio) RAPL domains
  - Critical: the EC resets the MMIO RAPL domain to ~55W every few
    seconds; without throttled the CPU sustains only ~55W / ~2.8 GHz
  - With throttled at PL1=100W the CPU sustains ~100W / ~3.2-3.8 GHz

  Key settings in the [AC] section (Enabled belongs under [GENERAL]):
    Enabled: True
    Update_Rate_s: 5
    PL1_Tdp_W: 100
    PL1_Duration_s: 128
    PL2_Tdp_W: 130
    PL2_Duration_s: 0.002
    Trip_Temp_C: 95

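  Note on PL2: Intel spec for i9-13980HX is 55W base power / 157W max
  turbo power. 130W is a conservative choice for PL2, below Intel's
  157W figure. ASUS ROG Turbo mode sets PL1=170W/PL2=175W on the same
  CPU, though in a chassis with different cooling, so that comparison
  shows silicon headroom rather than thermal headroom on the W7604.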

  systemctl enable --now throttled
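
  To confirm that the PL1/PL2 values actually stick on the MSR side,
  the package power limit register can be read back and decoded. A
  hedged Python sketch: it assumes the usual Intel layout of
  MSR_PKG_POWER_LIMIT (0x610) and MSR_RAPL_POWER_UNIT (0x606), needs
  the msr module and root, and covers only the MSR domain, not the
  MMIO one. The default of 1/8 W units is an assumption; the main
  block reads the real unit from the hardware.

```python
from __future__ import annotations

import os
import struct

MSR_RAPL_POWER_UNIT = 0x606
MSR_PKG_POWER_LIMIT = 0x610


def decode_pkg_power_limit(raw: int, power_unit_bits: int = 3) -> dict:
    """Decode PL1/PL2 from a raw 64-bit MSR_PKG_POWER_LIMIT value."""
    unit_w = 1.0 / (1 << power_unit_bits)  # e.g. 3 bits -> 0.125 W steps
    return {
        "pl1_w": (raw & 0x7FFF) * unit_w,          # bits 14:0
        "pl1_enabled": bool((raw >> 15) & 1),      # bit 15
        "pl2_w": ((raw >> 32) & 0x7FFF) * unit_w,  # bits 46:32
        "pl2_enabled": bool((raw >> 47) & 1),      # bit 47
        "locked": bool((raw >> 63) & 1),           # bit 63
    }


def read_msr(register: int, cpu: int = 0) -> int:
    """Read one 64-bit MSR through /dev/cpu/N/msr (requires root)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        return struct.unpack("<Q", os.pread(fd, 8, register))[0]
    finally:
        os.close(fd)


if __name__ == "__main__":
    unit_bits = read_msr(MSR_RAPL_POWER_UNIT) & 0xF
    print(decode_pkg_power_limit(read_msr(MSR_PKG_POWER_LIMIT), unit_bits))
```

  A "locked" result of True would mean the firmware has frozen the
  register until reset and further MSR writes are ignored.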


ASUS PERFORMANCE MODE SERVICE
  File:    /usr/local/bin/asus-performance-mode.sh
  Service: /etc/systemd/system/asus-performance-mode.service
  systemctl enable asus-performance-mode.service

  What it does at every boot:
    1. modprobe acpi_call
       modprobe msr
    2. echo '\_SB.ATKD.FANL 0x02' > /proc/acpi/call
       - Calls ACPI method to set EC thermal profile to performance
       - Causes EC fans to spin up more aggressively
       - MUST be called at every boot — EC resets to conservative
         profile on reboot
    3. Re-applies RAPL limits (belt-and-suspenders backup to throttled)
       PL1=100W on intel-rapl:0 and intel-rapl-mmio:0
       PL2=130W on intel-rapl:0 and intel-rapl-mmio:0
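
  The boot steps above reduce to a short list of "write this string to
  that file" operations. The sketch below builds that list in Python
  and applies it only on request. It is an illustration, not the
  contents of asus-performance-mode.sh. Assumptions: the powercap
  domains are reachable under /sys/class/powercap, constraint_0 is the
  long-term limit (PL1) and constraint_1 the short-term limit (PL2),
  and acpi_call/msr are already loaded (the modprobe step is omitted).

```python
from __future__ import annotations

POWERCAP = "/sys/class/powercap"
DOMAINS = ("intel-rapl:0", "intel-rapl-mmio:0")
ACPI_CALL = "/proc/acpi/call"


def plan_rapl_writes(pl1_w: float, pl2_w: float,
                     domains=DOMAINS, root: str = POWERCAP) -> list[tuple[str, str]]:
    """Return (path, value) pairs that set PL1/PL2 in microwatts per domain."""
    writes = []
    for domain in domains:
        base = f"{root}/{domain}"
        writes.append((f"{base}/constraint_0_power_limit_uw", str(int(pl1_w * 1_000_000))))
        writes.append((f"{base}/constraint_1_power_limit_uw", str(int(pl2_w * 1_000_000))))
    return writes


def plan_boot_steps(pl1_w: float = 100, pl2_w: float = 130) -> list[tuple[str, str]]:
    """EC performance profile first, then the RAPL limits."""
    steps = [(ACPI_CALL, r"\_SB.ATKD.FANL 0x02")]
    steps += plan_rapl_writes(pl1_w, pl2_w)
    return steps


def apply(steps: list[tuple[str, str]], dry_run: bool = True) -> None:
    """Print the plan, or perform the writes when dry_run is False (root)."""
    for path, value in steps:
        if dry_run:
            print(f"would write {value!r} -> {path}")
        else:
            with open(path, "w") as handle:
                handle.write(value)


if __name__ == "__main__":
    apply(plan_boot_steps())  # dry run by default
```

  Checking each domain's constraint_0_name / constraint_1_name before
  a real run confirms the long_term / short_term ordering assumed here.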


FAN CONTROL
  ASUS hwmon device: /sys/class/hwmon/hwmon12 (may renumber on reboot;
  use: grep -rl "^asus$" /sys/class/hwmon/*/name to find it)

  Available control files:
    pwm1_enable, pwm2_enable  (mode files only — no pwm value files)

  Mode values:
    0 = full speed (EC hands off, hardware maximum)
    2 = EC automatic control

  Current setting: EC automatic (mode 2) — acceptable for now
  To force full speed manually:
    echo 0 > /sys/class/hwmon/hwmon12/pwm1_enable
    echo 0 > /sys/class/hwmon/hwmon12/pwm2_enable
  To return to EC auto:
    echo 2 > /sys/class/hwmon/hwmon12/pwm1_enable
    echo 2 > /sys/class/hwmon/hwmon12/pwm2_enable

  Note: FANL 0x02 performance profile causes the EC automatic curve
  to be more aggressive than the default conservative profile.
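
  Since the hwmon index can change between boots, hard-coding hwmon12
  is fragile. A minimal Python sketch that locates the device by its
  name file and writes the two mode files; the mode values (0 and 2)
  are the ones listed above, and the helper names are hypothetical:

```python
from __future__ import annotations

from pathlib import Path

VALID_MODES = {0: "full speed", 2: "EC automatic"}


def find_hwmon(name: str = "asus", root: str = "/sys/class/hwmon") -> Path | None:
    """Return the hwmon directory whose name file equals `name`, else None."""
    root_path = Path(root)
    if not root_path.is_dir():
        return None
    for candidate in sorted(root_path.glob("hwmon*")):
        name_file = candidate / "name"
        if name_file.is_file() and name_file.read_text().strip() == name:
            return candidate
    return None


def set_fan_mode(hwmon: Path, mode: int) -> list[Path]:
    """Write `mode` to pwm1_enable and pwm2_enable; return the files written."""
    if mode not in VALID_MODES:
        raise ValueError(f"unsupported mode {mode}; expected one of {sorted(VALID_MODES)}")
    written = []
    for filename in ("pwm1_enable", "pwm2_enable"):
        target = hwmon / filename
        target.write_text(f"{mode}\n")
        written.append(target)
    return written


if __name__ == "__main__":
    device = find_hwmon()
    print("ASUS hwmon device:", device if device else "not found")
```

  Calling set_fan_mode(device, 0) as root would force full speed and
  set_fan_mode(device, 2) would hand control back to the EC.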


CPUPOWER  (/etc/default/cpupower-service.conf)
  GOVERNOR='performance'
  MIN_FREQ="800MHz"
  MAX_FREQ="5600MHz"
  systemctl enable cpupower


CURRENT PERFORMANCE RESULTS
  Sustained CPU package power:  ~100W (up from 55W)
  Sustained P-core frequency:   ~3.2-3.8 GHz all-core under load
  Max CPU temperature:          ~93°C under heavy load
  Fans:                         EC automatic, performance profile


----------------------------------------------------------------
DEAD ENDS — THINGS TRIED THAT DIDN'T WORK
----------------------------------------------------------------

DRIVER / KERNEL PARAMETER ATTEMPTS
  intel_pstate=disable / acpi-cpufreq
    - acpi-cpufreq reads ACPI _PSS table which caps at 2.401 GHz
    - Switching to this driver made performance worse, not better
    - Reverted to intel_pstate=active

  intel_pstate=passive
    - Same _PSS table ceiling problem as acpi-cpufreq
    - Did not fix the frequency cap

  cpupower frequency-set -u 5400000
    - Ignored when hardware limit is reported as 4 GHz
    - Has no effect against firmware-level caps

  scaling_max_freq overrides per-core
    - Writing 5600000 to scaling_max_freq on individual cores
    - Did not change sustained frequency under load


RAPL / POWER LIMIT ATTEMPTS
  Writing MSR RAPL domain only (intel-rapl:0)
    - The intel-rapl-mmio domain was independently capped at ~55W
    - MSR writes had no effect because MMIO domain was the binding limit
    - Root cause of 55W cap was the MMIO domain, not the MSR domain
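
  The finding above amounts to "the lower of the two domains wins".
  A short Python sketch that reads the long-term limit from both
  powercap domains and reports which one is binding. It assumes
  constraint_0 is the long-term (PL1) constraint and that both
  domains appear under /sys/class/powercap; names are illustrative.

```python
from __future__ import annotations

from pathlib import Path

DOMAINS = ("intel-rapl:0", "intel-rapl-mmio:0")


def binding_limit(limits_uw: dict[str, int]) -> tuple[str, float]:
    """Return (domain, watts) for the lowest limit, i.e. the one that binds."""
    domain = min(limits_uw, key=limits_uw.get)
    return domain, limits_uw[domain] / 1_000_000


def read_pl1_limits(root: str = "/sys/class/powercap", domains=DOMAINS) -> dict[str, int]:
    """Read constraint_0_power_limit_uw per domain, skipping absent domains."""
    limits = {}
    for domain in domains:
        path = Path(root) / domain / "constraint_0_power_limit_uw"
        if path.is_file():
            limits[domain] = int(path.read_text())
    return limits


if __name__ == "__main__":
    limits = read_pl1_limits()
    if limits:
        domain, watts = binding_limit(limits)
        print(f"binding PL1: {watts:.1f} W from {domain}")
    else:
        print("no RAPL powercap domains found")
```

  With the MSR domain raised to 100W and the MMIO domain left at its
  ~55W default, this would name intel-rapl-mmio:0 as the binding
  domain, matching the behaviour described above.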


BD_PROCHOT MSR CLEARING
  wrmsr -a 0x1FC 0xe40050 (one-shot)
    - EC reasserts bit 0 within milliseconds
    - Single writes have no lasting effect

  Continuous wrmsr loop (while true; wrmsr -a 0x1FC 0xe40050; end)
    - EC writes the MSR back faster than any OS userspace loop
    - No measurable effect on frequencies; BD_PROCHOT stays asserted

  CWAP ACPI method (\_SB.ATKD.CWAP with values 0x00-0x05)
    - Found in DSDT but only does: WAPF |= Arg0
    - Just OR-masks an in-memory flag, no connection to BD_PROCHOT
    - All calls returned success (0x1) but had zero effect on
      frequencies or BD_PROCHOT status

  throttled with Disable_BDPROCHOT: True
    - EC reasserts BD_PROCHOT within milliseconds
    - A 5-second polling daemon cannot win this race
    - BD_PROCHOT is a symptom of the conservative power profile,
      not the root cause of the frequency cap
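
  For reference, the only difference between "asserted" and "cleared"
  in the writes above is bit 0 of MSR 0x1FC. A small Python sketch of
  that bit logic. The read helper assumes the msr module is loaded
  and root access; a read-modify-write is shown instead of the
  hard-coded 0xe40050 so that the other bits of the register are
  left as the firmware set them.

```python
from __future__ import annotations

import os
import struct

MSR_POWER_CTL = 0x1FC
BD_PROCHOT_BIT = 1 << 0


def bd_prochot_enabled(raw: int) -> bool:
    """True if bit 0 (BD_PROCHOT) is set in a raw MSR 0x1FC value."""
    return bool(raw & BD_PROCHOT_BIT)


def clear_bd_prochot(raw: int) -> int:
    """Return the value with bit 0 cleared and every other bit unchanged."""
    return raw & ~BD_PROCHOT_BIT


def read_msr(register: int, cpu: int = 0) -> int:
    """Read one 64-bit MSR through /dev/cpu/N/msr (requires root)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        return struct.unpack("<Q", os.pread(fd, 8, register))[0]
    finally:
        os.close(fd)


if __name__ == "__main__":
    raw = read_msr(MSR_POWER_CTL)
    print(f"raw={raw:#x} BD_PROCHOT={'set' if bd_prochot_enabled(raw) else 'clear'}")
    print(f"value with bit 0 cleared: {clear_bd_prochot(raw):#x}")
```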


FAN CONTROL ATTEMPTS
  asus-nb-wmi / throttle_thermal_policy sysfs
    - Not exposed on W7604
    - W7604 uses ATK WMI, not the standard ASUSWMI path

  platform_profile / asus-armoury
    - Loaded but: "No matching power limits found for this system"
    - W7604 not supported by asus-armoury module

  asusctl / asusd
    - Could not control fan curves or thermal profiles on this model

  nbfc-linux (UX550VE profile)
    - EC dumps all zeroes — wrong EC address entirely
    - nbfc set -s 100 had zero effect on fan speed

  ec-probe (monitor/dump, all access modes: ec_sys, dev_port)
    - All zeroes on all access modes
    - EC is not at standard I/O ports 0x62/0x66

  ec-probe.exe on Windows 11
    - Also all zeroes — confirmed non-standard EC address
    - Not a Linux-specific issue

  RWEverything on Windows 11
    - Driver blocked by Windows even with Memory Integrity and
      Vulnerable Driver Blocklist disabled

  MsiEcRamEditor on Windows 11
    - All zeroes — confirms EC is ACPI-method-only, not I/O accessible

  fancontrol / pwmconfig
    - No pwm1/pwm2 value files exist on the ASUS hwmon device
    - Only pwm1_enable/pwm2_enable mode files are available
    - Granular curve configuration not possible via this interface


----------------------------------------------------------------
KEY TECHNICAL FINDINGS
----------------------------------------------------------------

1. ROOT CAUSE OF FREQUENCY CAP
   The intel-rapl-mmio (MCHBAR/MMIO) RAPL domain has an independent
   ~55W PL1 cap that the EC resets every few seconds. This is the
   binding power limit. throttled writing both MSR and MMIO domains
   every 5 seconds is the correct fix because the EC reset cadence
   is multi-second (winnable by a daemon), unlike BD_PROCHOT which
   is reasserted within milliseconds (not winnable by a daemon).

2. EC IS ACPI-METHOD-ONLY
   The W7604's EC is not accessible at standard I/O ports. All direct
   EC access tools (Linux and Windows) return all zeroes. The only
   confirmed way to communicate with the EC is via ACPI method calls
   (RRAM, WRAM, FANL in the DSDT). This appears to be what Armoury
   Crate does on Windows as well, rather than direct EC I/O.

3. FAN PWM CONTROL IS BINARY
   The hwmon device only exposes pwm1_enable/pwm2_enable mode files.
   No pwm value files exist. The only choices are EC automatic (2)
   or full speed (0). Granular fan curve control from Linux is not
   available on this hardware through any confirmed method.

4. FANL 0x02 IS THE CORRECT PERFORMANCE PROFILE CALL
   Calling \_SB.ATKD.FANL 0x02 via acpi_call sets the EC to its
   performance thermal profile. This causes fan speeds to increase
   noticeably and must be applied at every boot. It does not
   permanently change the EC — the EC resets to conservative profile
   on every reboot.

5. BD_PROCHOT IS A SYMPTOM
   BD_PROCHOT (MSR 0x1FC bit 0) being asserted by the EC is a
   consequence of the conservative power profile, not an independent
   cause of the frequency cap. The actual binding constraint is the
   MMIO RAPL power limit.

6. i9-13980HX POWER LIMIT REFERENCE
   Intel spec: PL1 (base TDP) = 55W, PL2 (max turbo) = 157W
   Current config: PL1=100W, PL2=130W — conservative and thermally
   safe. ASUS ROG Turbo mode uses PL1=170W/PL2=175W on the same CPU.


----------------------------------------------------------------
OPEN ITEMS / STILL BEING INVESTIGATED
----------------------------------------------------------------

  - Granular fan curve control via asus-fan-control + afc-scout
    (in progress — may work now that acpi_call is loaded)

  - Fan aggressiveness: EC auto with FANL 0x02 is acceptable
    but more aggressive curve control would be preferable

----------------------------------------------------------------
MISC MONITORING COMMANDS
----------------------------------------------------------------

## Freqs, temps and fan speeds:

root@eta-carinae ~# cat ~/sysmon.fish 
#!/usr/bin/env fish
echo "=== GOVERNOR ==="
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
echo "=== FREQUENCIES ==="
grep "MHz" /proc/cpuinfo
echo "=== TEMPS ==="
sensors 2>/dev/null | grep -E Core | grep -oP '\+\K[0-9]+\.[0-9]+(?=°C\s+\()' | sort -n | awk 'NR==1{min=$0} END{print "Min: " min "°C  Max: " $0 "°C"}'
echo "=== FANS ==="
sensors 2>/dev/null | grep -i fan
root@eta-carinae ~# watch -n .2 ~/sysmon.fish

## CPU package power draw:

while true
    set e1 (cat /sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/energy_uj)
    sleep 1
    set e2 (cat /sys/devices/virtual/powercap/intel-rapl/intel-rapl:0/energy_uj)
    echo (math "($e2 - $e1) / 1000000")" W"
end
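
The energy counter eventually wraps back to zero, at which point a
plain subtraction yields one negative sample. A Python sketch of
the same measurement with wraparound handling, assuming the wrap
point is the value in max_energy_range_uj next to energy_uj (reading
energy_uj typically requires root):

```python
from __future__ import annotations

import time
from pathlib import Path


def average_watts(e1_uj: int, e2_uj: int, seconds: float, max_range_uj: int) -> float:
    """Average power between two energy_uj samples, tolerating one wrap."""
    delta = e2_uj - e1_uj
    if delta < 0:  # counter wrapped between the two samples
        delta += max_range_uj
    return delta / 1_000_000 / seconds


if __name__ == "__main__":
    base = Path("/sys/class/powercap/intel-rapl:0")
    max_range = int((base / "max_energy_range_uj").read_text())
    while True:
        e1, t1 = int((base / "energy_uj").read_text()), time.monotonic()
        time.sleep(1)
        e2, t2 = int((base / "energy_uj").read_text()), time.monotonic()
        print(f"{average_watts(e1, e2, t2 - t1, max_range):.1f} W")
```

Dividing by the measured interval rather than a fixed 1 second also
keeps the reading honest when the sleep overruns under full load.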

## Force fans to full speed:

echo 0 | tee /sys/class/hwmon/hwmon12/pwm1_enable
echo 0 | tee /sys/class/hwmon/hwmon12/pwm2_enable

(Write 2 instead of 0 to return to EC automatic control.)

================================================================
