Support Services - How to disable ASPM on PCIe links for DAHDI devices, for systems where the BIOS improperly enables ASPM
Q: My Asterisk server is running a Linux kernel prior to 2.6.32. I'm observing one of the following symptoms with my DAHDI PCIe card (or a DAHDI PCI card connected through a PCI-to-PCIe bridge):
"Timeout in t1_getreg" errors on single span T1/E1 cards
"Version Synchronization Error!" on quad-span cards
Hard lockups during "modprobe wctdm24xxp" with analog cards
What could be the cause? How can I work around the symptom?
A: These errors are accompanied by lack/cessation of interrupts from the DAHDI card.
In some instances, these errors appear when the PCIe link is brought down unexpectedly by Active State Power Management (ASPM). The root cause is a BIOS that improperly enables ASPM.
This guide explains a work-around: modify DAHDI source code to disable ASPM on PCIe links for DAHDI devices. The work-around may be necessary when running older Linux kernels, especially some 2.6.18.x kernels, on systems where the BIOS improperly enables ASPM. Newer Linux kernels (2.6.32+) contain code that can work around BIOSes which improperly enable ASPM, but older kernels do not.
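Before patching, it can help to confirm that ASPM is actually enabled on the link in question. The following is a minimal sketch, assuming pciutils (lspci) is installed and that the card reports Digium's PCI vendor ID d161 (an assumption; if your card or bridge reports a different ID, drop the -d filter and look for the device by hand). The helper name aspm_states is hypothetical. It only extracts the ASPM field from the "LnkCtl:" lines that "lspci -vv" prints.

```shell
# Hypothetical helper: read `lspci -vv` output on stdin and print the ASPM
# field of every "LnkCtl:" line (e.g. "Disabled" or "L0s L1 Enabled").
aspm_states() {
    sed -n 's/^[[:space:]]*LnkCtl:[[:space:]]*ASPM \([^;]*\);.*/\1/p'
}

# Usage (run as root so lspci can read the PCIe capability registers):
#   lspci -vv -d d161: | aspm_states
# Assumption: d161 is the vendor ID reported by the installed DAHDI card.
```

An output of "Disabled" suggests ASPM is not active on that device. The upstream port or bridge has its own LnkCtl setting, so it may be worth inspecting that as well.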
Work-around steps
The following steps outline how to modify the DAHDI source code to forcibly disable ASPM on PCIe links for DAHDI hardware.
Step 1: Download the source code for the latest DAHDI drivers (version 2.6.1 or later), extract it, and change into the source code directory using the following commands. Replace each X.X.X in the command lines below with the version numbers of the release you downloaded.
cd /usr/local/src/
wget http://downloads.digium.com/pub/telephony/dahdi-linux-complete/dahdi-linux-complete-current.tar.gz
tar -zxvf dahdi-linux-complete-current.tar.gz
cd dahdi-linux-complete-X.X.X+X.X.X/
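Because the "current" tarball unpacks into a versioned directory, the directory name can be read from the archive itself instead of being typed by hand. This is a small sketch; the function name tarball_topdir is hypothetical, and it assumes the archive contains a single top-level directory.

```shell
# Hypothetical helper: print the top-level directory name stored in a
# gzip-compressed tarball (assumes one top-level directory).
tarball_topdir() {
    tar -tzf "$1" | head -n 1 | cut -d/ -f1
}

# Usage, from /usr/local/src/ after the download and extraction above:
#   cd "$(tarball_topdir dahdi-linux-complete-current.tar.gz)"
```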
Step 2a: Modify the source code to forcibly disable ASPM for single span T1/E1 and analog (AEX/TDM series) cards (wcte12xp and wctdm24xxp drivers). In the following example, the "vim" text editor is used, but you can specify nano, emacs, gedit or any other text editor.
vim linux/drivers/dahdi/voicebus/voicebus.h
Next, search for the following line:
#undef CONFIG_VOICEBUS_DISABLE_ASPM
Change it to this:
#define CONFIG_VOICEBUS_DISABLE_ASPM
Step 2b: Modify the source code for quad-span and 8-span T1/E1 cards (wct4xxp driver):
vim linux/drivers/dahdi/wct4xxp/base.c
Search for the following line:
/* #define CONFIG_WCT4XXP_DISABLE_ASPM */
Uncomment the line, so that it looks like this:
#define CONFIG_WCT4XXP_DISABLE_ASPM
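If you prefer not to edit the files by hand, the two changes from Steps 2a and 2b can be sketched with sed. The function names below are hypothetical, and the patterns assume the lines appear as quoted in this article; spacing may differ between DAHDI releases, so check the result with grep afterwards. Each call keeps a copy of the unmodified file with an ".orig" suffix (GNU sed "-i" behavior).

```shell
# Step 2a: turn "#undef CONFIG_VOICEBUS_DISABLE_ASPM" into a #define.
enable_voicebus_aspm_workaround() {
    sed -i.orig 's/^#undef[[:space:]][[:space:]]*CONFIG_VOICEBUS_DISABLE_ASPM[[:space:]]*$/#define CONFIG_VOICEBUS_DISABLE_ASPM/' "$1"
}

# Step 2b: uncomment "/* #define CONFIG_WCT4XXP_DISABLE_ASPM */".
enable_wct4xxp_aspm_workaround() {
    sed -i.orig 's,^/\*[[:space:]]*#define[[:space:]][[:space:]]*CONFIG_WCT4XXP_DISABLE_ASPM[[:space:]]*\*/,#define CONFIG_WCT4XXP_DISABLE_ASPM,' "$1"
}

# Usage, from the top of the extracted source tree:
#   enable_voicebus_aspm_workaround linux/drivers/dahdi/voicebus/voicebus.h
#   enable_wct4xxp_aspm_workaround  linux/drivers/dahdi/wct4xxp/base.c
#   grep -n 'DISABLE_ASPM' linux/drivers/dahdi/voicebus/voicebus.h \
#       linux/drivers/dahdi/wct4xxp/base.c
```

If grep still shows the "#undef" line or the commented-out "#define", the pattern did not match and the file should be edited manually as described above.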
Step 3: After making the above modifications and saving the source files, build and install the DAHDI drivers with the usual commands:
make
make install
Step 4: Reload Asterisk and the DAHDI drivers either with the init scripts or manually:
With init scripts
service asterisk stop
service dahdi restart
service asterisk start
-- or --
Manually
asterisk -rx "core stop now"
[You may have to run 'amportal stop' or 'killall -9 safe_asterisk' depending on the system.]
modprobe -vr wctdm24xxp wct4xxp wcte12xp ...list...
[Unload these and any other card-specific drivers that are loaded. Run 'lsmod | egrep "dahdi|wct"' for the full list.]
modprobe -vr dahdi
[Unload the DAHDI base driver.]
modprobe -v dahdi wctdm24xxp
[This example would load DAHDI base driver and the analog card driver. You should specify the driver(s) for the installed DAHDI card(s).]
asterisk -vvvg
(Or use 'amportal start' or 'safe_asterisk', or however you start the asterisk process on your server.)
You have now patched and installed DAHDI with ASPM forcibly disabled on PCIe links for DAHDI devices. This should resolve errors caused by ASPM erroneously taking down the DAHDI PCIe links. Note that updates to the BIOS and/or kernel may also resolve this issue, removing the need to manually patch the DAHDI driver.
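Since the symptoms are accompanied by a cessation of interrupts from the card, one way to check the result is to confirm that the card's interrupt counter keeps growing once the driver is loaded. The sketch below assumes the driver registers its interrupt under the module name (for example wctdm24xxp) in /proc/interrupts; that naming is an assumption, so check the file for the label your driver uses. The function name irq_total is hypothetical.

```shell
# Hypothetical helper: sum the per-CPU interrupt counts on every line of
# /proc/interrupts (or a file given as $2) that mentions the name in $1.
irq_total() {
    awk -v name="$1" '
        index($0, name) {
            # Field 1 is the IRQ number; the per-CPU counters follow it.
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^[0-9]+$/) total += $i; else break
            }
        }
        END { print total + 0 }
    ' "${2:-/proc/interrupts}"
}

# Usage: sample the counter twice and compare.
#   before=$(irq_total wctdm24xxp); sleep 1; after=$(irq_total wctdm24xxp)
#   [ "$after" -gt "$before" ] && echo "interrupts are arriving"
```

A counter that stays flat while the driver is loaded would suggest the link is still being taken down, or that a different problem is present.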
Background
DAHDI PCIe hardware supports Active State Power Management (ASPM): it should be valid to take the PCIe link down when the PCIe channel is not active, that is, when the corresponding DAHDI device driver is not loaded. In practice, however, ASPM usually brings no reduction in power consumption, because DAHDI hardware cannot be in power saving mode while DAHDI channels are active, as they normally are on a production Asterisk server.
This article's symptoms occur because some motherboard BIOSes improperly enable ASPM on some devices but still indicate that the system supports ASPM. This issue also affects non-Digium hardware.
The conditional compilation directives described above, which enable the specific DAHDI patches, originate from http://issues.asterisk.org issue DAHLIN-283, "Disable Active State Power Management on PCIe links for DAHDI devices."
For more information on the underlying cause, see the following article on http://LWN.net : "PCIe, Power Management, and Problematic BIOSes"
Servers whose BIOS has been reported to be affected
BIOSes exhibiting this issue have been found on motherboards from various manufacturers. Note that the issue may or may not appear on a given server, due to differences in BIOS and hardware revisions.
Server models whose BIOSes have been reported to exhibit this ASPM issue:
Dell PowerEdge R210 II, BIOS version ?
Dell PowerEdge T110 II, BIOS version ?