T1 Troubleshooting

 

Basic Troubleshooting Procedures

First, a few facts to know

  • It's not always the carrier's fault. Perform as much testing as you possibly can without having to involve the carrier unless you have very good reason to do so. You will save the customer time and money and help keep them out of a finger-pointing match.

  • Document everything!! If you perform a loopback test, even if you get no output, just write in the ticket something short and sweet like "did loop test, came out clean". If you checked the T1 status before & after an event, log the event & status change.

  • Be patient with the customer. T1 cards and PRI lines are expensive, and customers will get very angry when they go down during business hours. Remind them that you're on their side and you want to help them get to a resolution as quickly and effectively as possible.

  • Be systematic in your approach. If it's too hard to do a certain step at the time (such as a loopback test), try everything else that you can do first.

 

 

PRI ERROR CODES, CALL SETUP INFORMATION

See the following link for details on PRI cause codes, general call setup/teardown, and which side is responsible for certain errors:

http://www.freesoft.org/CIE/Topics/126.htm

Brian Weber helpfully internalized the contents of that link should it ever go offline:

PRI Error Codes and Call Setup Information

 

T1 Circuits are very linear - start with one end

Always start your observations at one end and work your way outward. If you work from one end outward, you can easily isolate the source of an issue with a T1 card very quickly. Here's the best method to follow for troubleshooting:

 

PBXtra Core Application - basic

If it's a new setup, there are almost always going to be some quirks to iron out. Placing test calls in and out while observing a call trace will tell all. On the Asterisk CLI, you should become familiar with pri debug span 1. The most interesting part of the call is the initial frame that comes in or goes out. Many carriers require every call to carry an outbound caller ID number, or the call won't connect (if you don't know how to set outbound caller ID, click here). Does the outbound dial string (specifically when dialing out over a PRI) look something like this:

Sep 26 16:35:24 VERBOSE[11807] logger.c: -- Called g2/w13105551212

If so, remove the "w" from globals.conf, then link to and ping this QA ticket: 1137070
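If you are scanning a saved call trace rather than watching the CLI live, a small script can spot this pattern for you. The following is only a sketch: the function name and the regular expression are hypothetical and are modeled solely on the sample log line above, so real log formats may need a different pattern.

```python
import re

# Matches the "Called <group>/<dial string>" part of a verbose log line.
# Modeled only on the sample line above (assumption, not a documented format).
CALLED_RE = re.compile(r"Called\s+(?P<group>[^/\s]+)/(?P<dialed>\S+)")


def has_wait_prefix(log_line):
    """Return True if the dialed string in a 'Called' line starts with 'w'."""
    match = CALLED_RE.search(log_line)
    if match is None:
        return False
    return match.group("dialed").startswith("w")


sample = "Sep 26 16:35:24 VERBOSE[11807] logger.c: -- Called g2/w13105551212"
print(has_wait_prefix(sample))
```

Lines without a "Called" entry are reported as False, so the helper can be run over a whole log file without pre-filtering.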

PBXtra Core Application - severe

Do you see anything funny in the call trace? Call logs? What do you get from pri show span X? A few things you can try looking out for in just the normal trace - copy them into the ticket if you see them:

Sep 26 00:01:19 WARNING[5025] chan_zap.c: No D-channels available! Using Primary channel 24 as D-channel anyway!
Nov 23 21:08:45 NOTICE[2054]: PRI got event: No more alarm (5) on Primary D-channel of span 1
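To pull lines like these out of a long trace before pasting them into the ticket, something along these lines may help. It is a minimal sketch: the labels and patterns are assumptions based only on the two sample lines above, and other Asterisk versions may word these messages differently.

```python
import re

# Patterns modeled on the two sample lines above (assumption).
PATTERNS = {
    "no_dchannel": re.compile(r"No D-channels available"),
    "alarm_cleared": re.compile(r"PRI got event: No more alarm \(\d+\) on .* of span \d+"),
}


def flag_lines(lines):
    """Return (label, line) pairs for lines worth copying into the ticket."""
    hits = []
    for line in lines:
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((label, line.strip()))
                break  # one label per line is enough
    return hits


trace = [
    "Sep 26 00:01:19 WARNING[5025] chan_zap.c: No D-channels available! Using Primary channel 24 as D-channel anyway!",
    "Sep 26 00:01:20 VERBOSE[5025] logger.c: some unrelated line",
    "Nov 23 21:08:45 NOTICE[2054]: PRI got event: No more alarm (5) on Primary D-channel of span 1",
]
for label, line in flag_lines(trace):
    print(label, "|", line)
```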

 

 

Drivers, hardware, and firmware versions

Double-check against the latest and greatest on the Version Compatibility wiki page, as Fonality will produce better software and Sangoma will produce better drivers as time goes on. If the customer is on faulty drivers or firmware, offer to upgrade after hours. Also, check for interrupt conflicts with

Sangoma cards can share the same IRQ with each other, but not with other devices. Troubleshoot accordingly.

 

zapata.conf/zaptel.conf

Below on this page are details on what these should be. If this is a new line, get a cutover sheet or a carrier confirmation of the correct settings. Always do a sanity check here before proceeding.

 

wanpipe settings (wancfg)

Timing on a PRI line should be set by the carrier. There are exceptions; in such cases the carrier will let us know. T1 should use clock port 24. During troubleshooting, you may try turning off echo cancellation. Details on what these settings should be are below on this page. Always do a sanity check here before proceeding. When in doubt, back up the existing /etc/wanpipeX.conf file and recreate it using the wancfg utility.

 

Hardware - the card itself

If everything software-wise is in check, then performing a loopback test is in order. Ping this suggestions ticket 678405 and follow the instructions on how to do a loopback test here: How to pattern loopback test a T1. You may choose to postpone this test until later, as it can involve a length of downtime for the customer.

 

Cabling

Your options here are simple: straight-thru or crossover. Swapping cables around is much less effort than a loopback test, so even though it's out of systematic order, it's simple enough to try before doing a loopback test. You could be getting set up for the test while the customer is changing cables around.

If the card is a Sangoma card, use the wanpipemon tool to check for Short Circuit alerts. An active Short Circuit alarm means the cabling is incorrect.

wanpipemon T1/E1 line alarms - what they mean - Sangoma Support Wiki

 

Carrier

Any company that brings in a T1 line is only bound to connect up to the DMARC (point of demarcation - legal limit of carrier responsibility for the line). That could be the smart jack a few feet from the PBXtra, and it could be fourteen floors downstairs in the basement. We can only see up to the card. Carriers may want to do a loop test to the smart jack, but only for a few minutes. Suggest that the best method of testing is overnight with an intense or intrusive pattern. There are many variables along a T1 line, and unless a carrier has solid validation of any points of failure on their part, they will likely point back to us, resulting in more customer frustration. By the time the troubleshooting gets this far, at least we will have cleared every possible point of failure on the PBXtra.

With a clear procedure (above) and all the necessary tools (below and linked), you will have no problem coming to a direct and irrefutable resolution on a down PRI line.

 

 

Dropped call error conditions

Didn't receive frame

When you see something like the following:

This indicates that the channel was closed abruptly due to Asterisk not receiving necessary data over the circuit.

 

Frame Control

If you see something like the following in the Asterisk message log:

This means that the phone system received an indication on the circuit. These messages are somewhat hard to interpret without all the necessary information. Frame control 5? What?

These are the different codes you may find associated with FRAME_CONTROL:

Code    Description
1       The remote end has hung up
2       Local ringing
3       Remote end is ringing
4       Remote end has answered
5       Remote end is busy
7       Line is off-hook
8       Circuit congestion
9       Flash hook
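For quick lookups while reading a log, the codes listed above can be kept in a small lookup table. This is just a convenience sketch; the function name is hypothetical, and the mapping contains only the codes listed on this page.

```python
# Frame control codes exactly as listed on this page.
FRAME_CONTROL = {
    1: "The remote end has hung up",
    2: "Local ringing",
    3: "Remote end is ringing",
    4: "Remote end has answered",
    5: "Remote end is busy",
    7: "Line is off-hook",
    8: "Circuit congestion",
    9: "Flash hook",
}


def describe_frame_control(code):
    """Translate a numeric frame control code into the description above."""
    return FRAME_CONTROL.get(code, "Unknown code (not listed on this page)")


print(describe_frame_control(5))
```

So "Frame control 5" simply means the remote end reported busy.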

Digium Cards

Tried all else on a Digium card and nothing's working? Power-cycle the server: don't just restart it, power-cycle it. Still not working? Move the cards to different PCI slots. This is especially useful after a power outage or other event.

 

Sangoma Cards

 

What to look for in logs

In /var/log/messages a T1 dropping will look something like this:

And in /var/log/asterisk/messages you will see something like this:

 

 

How to configure a Sangoma T1 Card

This information has been moved here: Basic T1 Configuration

 

 

What is installed on a customer's server?

To check hardware information use the command:

You can also use the optional "hwprobe verbose" argument to see which FXO and FXS modules are installed. Here is sample output:

What does this all mean? The columns from left to right are: Card type, PCI slot, PCI bus, CPU, PRI Port, Hardware Echo Cancellation, and Firmware version.

The first card is an A200 (analog) card with Version 10 of the Sangoma firmware. The 0 in the Hardware Echo Cancellation column indicates that this card does not have HWEC; any number other than 0 indicates the card *does* have hardware echo cancellation.

 

 

How to check hardware statistics

A few good commands to belt out:

The stats clear out when the wanpipe driver stops (e.g., by performing a wanrouter stop). If you check the Router uptime and see lots of errors, then there's a problem. An "Out Of Frame" (OOF) or "Loss Of Signal" (LOS) is, by definition, a dropped T1 line. A T1 line tolerates the occasional framing error: only when three frames are lost within a span of five frames is it considered an OOF.

Long story short... OOFs and LOSs are related to errors generated by the card or by the line. If you see these appearing on a system, do a loop test on the card and the line.
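To make the "three lost frames within a span of five" rule of thumb concrete, here is a small sliding-window sketch. It only illustrates the rule as stated above; it is not how the card's framer actually detects OOF, and the function name and the list-of-booleans input are hypothetical.

```python
def is_out_of_frame(frame_ok, lost_threshold=3, window=5):
    """Return True if any `window` consecutive frames contain at least
    `lost_threshold` lost frames. `frame_ok` holds True for a good frame
    and False for a lost one."""
    last_start = max(len(frame_ok) - window + 1, 1)
    for start in range(last_start):
        chunk = frame_ok[start:start + window]
        if chunk.count(False) >= lost_threshold:
            return True
    return False


# Three losses inside one five-frame window: counts as OOF.
print(is_out_of_frame([True, False, True, False, False]))
# Three losses spread far apart: no single window reaches the threshold.
print(is_out_of_frame([False, True, True, True, True, False, True, True, True, False]))
```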

Important: interpreting what wanpipemon T1/E1 line alarms mean - Sangoma

 

 

Known Software and Firmware Issues

A good firmware choice for A200 analog cards is V. 10.

  1. Analog Cards

    1. Firmware Version 6 is bad, can cause dropped calls

 

How to run a pattern loopback test on a Digium T1 card

To verify that a T1 PCI card is working properly, a loopback test can be run using a hardware loopback plug. If you need to build a loopback plug, pin 1 connects to pin 4 and pin 2 connects to pin 5. Pins are numbered left to right when the hook is underneath and the conductors are furthest away from you. Connect the loopback plug, and run the test as described below.

To start off, edit /etc/zaptel.conf. Comment out the span information using the ‘#’ character. Enter these two lines:

Before continuing, take a look at the output of lsmod. See below:

Looking specifically at the line:

Make a note of the module name in the fourth column; you will need to reload this module later.

Now, stop asterisk and astwatch.pl. You should be able to use /etc/init.d/asterisk stop. Verify that zaptel has unloaded by doing another lsmod. If it hasn't, unload the module you made a note of earlier (wct4xxp in the above example). Reload the zaptel and card driver modules like so:

Verify with the customer that a green light shows on the card when the loopback plug is connected. Additionally, run zttool and verify that the span is in an "OK" state.

Run /sbin/ztcfg -vvvvvv and ensure there are no errors, and that all channels are in a cleared state.

Now we move on to running the actual loopback test.

If the system is running fon-o or greater, do the following:

If the system is not running fon-o, you will need to first download the zaptel source, and build the test suite. Do the following on the server:

Now you can continue with loopback testing the card. Do the following:

Watch for the message "going for it". If you do not see the message, the test is not working. If you do see it, only errors will be displayed after that. The prompt will return when the test is complete. If you do see errors, then we should contact Digium for an RMA on the card.

When done, restore the /etc/zaptel.conf to its original configuration, and restart asterisk by doing /etc/init.d/asterisk start.

 

 

How to debug a Sangoma T1 card configured for E&M - Wanpipe RBS

The wanpipemon utility provides the ability to debug RBS bits. The following commands enable or disable WANPIPE RBS debugging and print all debugging messages into the /var/log/messages file. This feature is supported for all Sangoma digital AFT-series cards.

 

To enable RBS debugging on receiver side (this command will print only changes to the RBS bit settings):

To disable RBS debugging on receiver side:

To enable RBS debugging on transmit side (this command will print only changes to the RBS bit settings):

To disable RBS debugging on transmit side:

Read the current RBS status bits from AFT-series cards:

Read the current RBS status from the Wanpipe driver:

To verify the RBS operations, you can run the ./zttool utility and enable Sangoma RBS debugging at the same time. zttool shows what zaptel thinks the RBS setting is, and the wanpipe utilities show what the Sangoma driver thinks the RBS setting is. The two have to match at all times.

In one window run:

In another window run:

Enable RBS debugging as stated above:

  • Place a call on Asterisk with reproducible bad behaviour

  • Compare zttool output versus the /var/log/messages

  • If the driver RBS changes are identical to the zttool RBS changes, the problem is with the telco.

  • If the driver RBS changes differ from the zttool RBS changes, there could be a problem with the drivers.

  • If the zttool output doesn't match the wanpipe RBS output, please contact Sangoma Support.

zttool

zttool is an application from the Digium suite of TDM card configuration and troubleshooting tools.

You invoke the application simply by typing zttool.  With this application, you can view the current trunk status of most PRI/E1 circuits as shown:

By selecting one of the interfaces marked RED, we can learn more information about the circuit:

Any of the statistics above may contain information vital to resolving a problem with the carrier or the customer's own equipment.  In the image above, the card simply does not have a circuit attached to port 3 of the Sangoma PRI card.

Asterisk PRI Span Debugging

Please follow these steps when faced with a dead span. Record the stdout to a file for each command.

Check that wanpipe status is Connected:

Check the physical T1/E1 Alarms:

Interpreting the alarms:
https://sangomakb.atlassian.net/wiki/spaces/TC/pages/53706877?search_id=68f10b58-5c37-4b7f-bec7-ba5ae615d484

Below is a description of each Alarm (excerpt from the Sangoma Wiki URL above)

RED

Indicates the device is in alarm 

LOF

(Loss of Framing). Raised after four consecutive frames with FAS error. If RAI and AIS alarms are not indicated, verify that you have selected the proper line framing (e.g. T1: ESF, D4; E1: CRC4, NCRC4, etc.)

LOS

(Loss of Signal)

AIS

(Alarm Indication Signal): typically known as a BLUE alarm. An all-ones signal is transmitted to the receiving equipment (the Sangoma card) to indicate that an upstream repeater (telco equipment) is in alarm due to an upstream transmission fault, either from another repeater or from the telco itself. If the only alarm indicated in the wanpipemon output is AIS:ON, then contact your telco with this information (RAI:ON can also be a possibility in this case).
Example call diagram of the scenario: Sangoma card <---------------repeater <--------------Telco

RAI

(Remote Alarm Indication): Indicates that the Far end (typically the Telco) is in RED alarm state and sending that message over the line.  If the only alarm in your wanpipemon output is RAI:ON then contact your telco with this information.

You will also get this alarm, and only this alarm, if your framing is incorrect. This setting can be changed in the wanpipeX.conf file.

Short Circuit

The wires in your cable connected to the port are crossed. If you see this alarm, check the pinouts for the cable you are using. You may also be plugging in the wrong type of cable (straight-through or cross-over).

Open Circuit

No line plugged into the port. Make sure that your connector is plugged in and the wiring is making a good connection. If this alarm is on, you will also see Rx Level='-36'->'-44'.

Loss of Signal

Cabling issue. Check the health of the cable plugged into the port, as well as its connection to the port it is plugged into. You will also see the Rx Level either very low, or in a disconnected state: -36 -> -44. It is typical to have this alarm triggered in combination with 'Open Circuit' if there is an issue with the physical connection.

YEL

When the equipment enters a Red-Alarm state, it returns a Yellow-Alarm back up the line of the received OOF. A typical scenario would be a misconfiguration during the Sangoma card configuration (e.g. CRC4 selected instead of NCRC4). In this type of scenario, LOF and RED alarms will also be triggered.

Line Code Violation

 This occurs upon a bipolar violation

Far End Block Errors

Reported by the upstream end of the PHY (the wire between you and the switch) on the out-of-band management channel. This means the other end of the line received bad data from you. Possible reasons are line noise, corroded wires, etc. Also, check line framing (E1: CRC4 vs NCRC4).

CRC4 Errors

 This occurs when the CRC polynomial calculation performed before transmission does not match the CRC calculation done upon reception.

FAS Errors

 (Frame alignment signal error). One or more incorrect bits in the alignment word

Rx Level

Signal strength of the connection between the Sangoma card and the other end. A healthy connection will show -2.5db. If you notice that your connection is lower (e.g. -10db to -12db, or -36db, -44db), then check the cable or possibly replace it. If the Rx level is very low, it can trigger Loss of Signal, or even Open Circuit.

As per: https://sangomakb.atlassian.net/wiki/spaces/TC/pages/53706877?search_id=68f10b58-5c37-4b7f-bec7-ba5ae615d484
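The guidance in the alarm descriptions above can be condensed into a rough triage helper. This is only a sketch of the decision order, not a Sangoma tool: the function name is hypothetical, and it assumes you have already read the alarm states off the wanpipemon output by hand and pass in the names of the alarms that are ON.

```python
def triage(active_alarms):
    """Suggest a next step from the set of alarm names that are ON.
    The rules follow the alarm descriptions above; anything not covered
    there falls through to a generic answer."""
    active = set(active_alarms)
    if not active:
        return "No alarms active; continue with the remaining checks."
    if "Short Circuit" in active:
        return "Check cable pinouts and cable type (straight-through vs cross-over)."
    if "Open Circuit" in active or "Loss of Signal" in active:
        return "Check that the cable is plugged in and in good health."
    if active == {"RAI"}:
        return "Contact the telco; also verify the framing in wanpipeX.conf."
    if active <= {"AIS", "RAI"}:
        return "Contact the telco with this information."
    if "LOF" in active and not active & {"AIS", "RAI"}:
        return "Verify that the proper line framing is selected."
    return "No single rule applies; record the alarms in the ticket and keep testing."


print(triage({"AIS"}))
print(triage({"RED", "LOF"}))
```

Cabling alarms are checked first because they are the cheapest to rule out and tend to drag other alarms along with them.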

 

Make sure that tx/rx counters are incrementing:

Check for any wanpipe errors in messages:

Run a dchan trace and check that you are seeing incoming and outgoing traffic:

Check that spans are UP and Active. From asterisk CLI run:

 

T1 DTMF Troubleshooting

A T1/PRI passes DTMF through the B-channels, so it can be treated as inband. This means that sound quality and echo issues can affect the system's ability to pick up on DTMF tones. Below are instructions for inbound DTMF recognition problems on a T1 PRI.

  • NOTE: zapata.conf settings require a PBXtra Core restart to take effect. THIS WILL DROP CALLS.

  1. As with any Zap DTMF recognition problem, you can set relaxdtmf=yes in zapata.conf - this should be done via the web interface in the Options->settings page. Let the customer know you've made this change and also inquire about any quality issues they may be having (steps 3 and 4). To check that this setting has taken effect properly, you can log in to the server and run the command

    Note: Replace "1" with a channel from the T1 span you are concerned with (any channel will do.) Look for the string:

    to see if Relax DTMF is set to yes or no. If it is set improperly, check zapata.conf to make sure relaxdtmf=yes and run show uptime to see when the last restart was:

  2. If the DTMF is still not being picked up, increase the rxgain 3dB at a time and place a test call between each change.

  3. If the customer has responded that they have residual noise on the line, you may need to troubleshoot the noise before you are able to resolve the DTMF issue.

  4. If the customer has echo, you may need to troubleshoot the echo before you are able to resolve the DTMF issue.

  5. During your correspondence with the customer, request they also contact the carrier to see if they are sending DTMF appropriately. Have the carrier watch as they place an inbound call and send DTMF tones. Make sure the carrier verifies they are sending all the tones correctly. The carrier may also be able to increase the gain for when they are sending DTMF.
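As a sense of scale for step 2, the arithmetic below shows what each 3dB step does to signal amplitude. It assumes rxgain is expressed in decibels, as the step above implies; the function name is made up for illustration.

```python
def db_to_amplitude_ratio(gain_db):
    """Convert a gain in decibels to a linear amplitude (voltage) ratio."""
    return 10 ** (gain_db / 20)


# Walk the gain up in 3dB steps, as suggested in step 2.
for gain_db in range(0, 13, 3):
    ratio = db_to_amplitude_ratio(gain_db)
    print(f"rxgain={gain_db:>2}dB -> amplitude x{ratio:.2f}")
```

Each 3dB step raises the amplitude by roughly 41 percent, so two steps already double it. That is why it is worth placing a test call after every change instead of jumping straight to a large value.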

 

T1 Static Troubleshooting

Please see Static - PRI/T1

 

General Info

Useful stuff to know: http://www.techfest.com/networking/wan/t1.htm

^This has explanations of Alarm types.

http://en.wikipedia.org/wiki/T-carrier

 

 

DID with E&M

http://www.voip-info.org/wiki/view/Asterisk+tips+did

 

 

D4/AMI settings

How to set up a T1 with D4/AMI

 

T1 D channel not coming online

If you encounter a T1 circuit that is attached through an AdTran device (for splitting it up for voice/data), you may find that you must not do hardware HDLC on the D channel, or the circuit may not come up correctly. In these cases, ensure that

is set in /etc/wanpipe/wanpipeN.conf. This is occasionally necessary on standard T1s connected to a smart jack as well, but is quite rare.

 

E1 Troubleshooting

Putting this info on the T1 page for now; it will be promoted to its own page if warranted.

 

Basic E1 mechanics

E1 pipes have 30 b-channels + 1 d-channel. The d-chan exists on the sixteenth channel.
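The channel arithmetic can be sketched as follows. It assumes 31 usable channels per E1 span, numbered consecutively from the span's first channel, so a second span would start at channel 32. That numbering convention and the function name are assumptions for illustration, so check them against the actual zaptel.conf.

```python
def e1_channel_layout(first_channel=1):
    """Return (b_channels, d_channel) for one E1 span of 31 channels,
    with the D-channel on the sixteenth channel of the span."""
    channels = list(range(first_channel, first_channel + 31))
    d_channel = channels[15]  # sixteenth channel of the span
    b_channels = [c for c in channels if c != d_channel]
    return b_channels, d_channel


b_channels, d_channel = e1_channel_layout()
print(len(b_channels), "B-channels, D-channel on", d_channel)
```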

E1 PRI signalling is different from any traditional American T1:

switchtype=euroisdn ; (as opposed to NI2 national)
signalling=pri_cpe ; (this does not change)
framing/encoding=ccs/hdb3

 

 

zapata/zaptel

Here's zaptel.conf from a Quad E1 (courtesy of voip-info):

 

If the PBX is the MASTER timing source on span 1, use span=1,0,0.

# span definition format:
# span=(spannum),(timing),(LBO),(framing),(coding)

# timing= How to synchronise the timing devices.
# 0: to not use this span as sync source; Send timing synchronisation to other end.
# 1: to use as primary sync source
# 2: to set as secondary and so forth

If no T1/PRI spans are hooked to a carrier (rare, but it does happen), any span set to use itself as a sync source can bring down the other spans as well.
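To double-check a span line against the format described above, a tiny parser like the following can help. It is a sketch only: the class and function names are hypothetical, the example line is made up from the E1 framing/encoding values mentioned earlier, and any fields beyond the five documented ones are ignored.

```python
from typing import NamedTuple


class SpanDef(NamedTuple):
    spannum: int
    timing: int
    lbo: int
    framing: str
    coding: str


def parse_span(line):
    """Parse 'span=(spannum),(timing),(LBO),(framing),(coding)'."""
    key, _, value = line.partition("=")
    if key.strip() != "span":
        raise ValueError("not a span definition: " + line)
    fields = [field.strip() for field in value.split(",")]
    if len(fields) < 5:
        raise ValueError("expected at least 5 fields: " + line)
    return SpanDef(int(fields[0]), int(fields[1]), int(fields[2]), fields[3], fields[4])


def describe_timing(timing):
    """Explain the timing field using the comments above."""
    if timing == 0:
        return "not a sync source; sends timing to the other end"
    if timing == 1:
        return "primary sync source"
    return "sync source with priority " + str(timing)


span = parse_span("span=1,1,0,ccs,hdb3")  # example line, not from a real system
print(span.spannum, describe_timing(span.timing))
```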
