Debugging TCP throughput in 5G/4G

Debugging TCP throughput issues in 5G is tricky because TCP adapts the sender’s rate to the network capacity and attempts to avoid congestion. If TCP packets are dropped or Acknowledgements (ACKs) are delayed, the TCP sender assumes that the network cannot handle the current throughput and reduces its window size, which lowers the overall throughput. This paper illustrates the steps to debug TCP throughput issues in 5G systems.

1.  Overview of TCP

TCP congestion control is the method used by TCP to manage data flow over the network and prevent congestion. TCP uses a congestion window and a congestion policy to avoid congestion. The sender can transmit up to the lower of the congestion window and the advertised window. The congestion window is a limit imposed by the sender and is tuned in response to TCP duplicate ACKs or a TCP timeout (a packet is lost). The advertised window is announced by the receiver, which imposes flow control based on its receive window capacity. In this whitepaper we will try to understand how the congestion window impacts TCP throughput.
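The effect of the two windows on throughput can be sketched numerically. The example below assumes the common rule of thumb that a sender can have at most one window of data in flight per round trip, so throughput is bounded by the window size divided by the round-trip time (RTT). The function names, segment size, window sizes and RTT are illustrative assumptions, not values from this paper.

```python
def effective_window(cwnd_segments, rwnd_segments):
    """The sender may have at most min(cwnd, rwnd) segments in flight."""
    return min(cwnd_segments, rwnd_segments)


def max_throughput_bps(cwnd_segments, rwnd_segments, mss_bytes, rtt_s):
    """Upper bound on throughput: one effective window per round trip."""
    window_bits = effective_window(cwnd_segments, rwnd_segments) * mss_bytes * 8
    return window_bits / rtt_s


# Illustrative values (assumptions): 1400-byte segments, 20 ms RTT.
bound = max_throughput_bps(cwnd_segments=32, rwnd_segments=64,
                           mss_bytes=1400, rtt_s=0.020)
print(f"{bound / 1e6:.2f} Mbit/s")
```

With these assumed values the congestion window is the limiting factor, so halving it halves the bound, which is why congestion window movement shows up directly in the throughput plot.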

TCP follows the congestion policy below:

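  • Slow Start phase: The congestion window starts small and grows exponentially until ssthresh is reached.
  • Congestion Avoidance phase: After reaching ssthresh, the congestion window grows linearly, by 1 per transmission round.
  • Congestion Detection phase: On detecting a loss, the sender goes back to the slow start phase (after a timeout) or to the congestion avoidance phase (after 3 duplicate ACKs).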

TCP Data Transmission Packet Count

As shown in Figure 1: TCP Congestion Window Plot, the TCP congestion window is initially in the slow start phase. In this phase the congestion window starts at 1 and increases exponentially until the 5th transmission round, when it reaches the ssthresh value of 32. The sender then enters the congestion avoidance phase, which continues until the 10th transmission round. At the 10th transmission round, 3 duplicate ACKs are received by the sender; ssthresh and the congestion window both drop to half of the current congestion window, and the sender continues in the congestion avoidance phase. A timeout occurs at the 16th transmission round; the congestion window drops to 1 and the sender returns to the slow start phase.
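The window movement described above can be reproduced with a small simulation. This is a simplified Reno-style sketch, not the exact model behind Figure 1. It assumes that transmission rounds are counted from 0, that the window is measured in segments, and that ssthresh never falls below 2. The function name and parameters are hypothetical.

```python
def simulate_cwnd(rounds, ssthresh, dup_ack_rounds=(), timeout_rounds=()):
    """Return the congestion window (in segments) used in each round."""
    cwnd = 1
    trace = []
    for rnd in range(rounds):
        trace.append(cwnd)
        if rnd in timeout_rounds:
            # Timeout: halve ssthresh and restart from slow start.
            ssthresh = max(cwnd // 2, 2)
            cwnd = 1
        elif rnd in dup_ack_rounds:
            # 3 duplicate ACKs: halve ssthresh, continue in congestion avoidance.
            ssthresh = max(cwnd // 2, 2)
            cwnd = ssthresh
        elif cwnd < ssthresh:
            # Slow start: exponential growth, capped at ssthresh.
            cwnd = min(cwnd * 2, ssthresh)
        else:
            # Congestion avoidance: linear growth.
            cwnd += 1
    return trace


trace = simulate_cwnd(rounds=20, ssthresh=32,
                      dup_ack_rounds={10}, timeout_rounds={16})
for rnd, cwnd in enumerate(trace):
    print(rnd, cwnd)
```

In this sketch the window reaches 32 in round 5, grows linearly to 37 by round 10, drops to 18 after the duplicate ACKs, and falls back to 1 after the timeout in round 16.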

TCP Transmission Count during Debugging

As shown in Figure 2, TCP throughput varies with the congestion window movement shown in Figure 1. At point A, throughput drops because the congestion window has moved to the congestion detection phase due to duplicate ACKs, which means that TCP acknowledgements were delayed. At point B, the congestion window has moved back to the slow start phase because a TCP timeout occurred, which means that a packet was lost. In other words, either a delay in the reception of TCP acknowledgements due to congestion in the network or a packet drop causes the throughput to drop drastically.

Method to debug TCP Traffic issue.

TCP throughput fluctuates because congestion in the network causes the TCP server to pump less traffic. Past congestion in the network affects the current throughput, so it is sometimes difficult to trace which issue occurred in the past.

  1. Therefore, it is advisable to first verify the end-to-end (e2e) network using UDP traffic and check all the KPIs below:
    • Initial and residual BLER for both UL and DL.
    • MCS, CQI and PRB usage in each TTI for DL.
    • Scheduling interval of the UE; ideally the UE should be scheduled every TTI.
    • Packet drop statistics in all the nodes of the network.
    • Flow control on the F1-U interface is working and the CU-UP is pumping enough data.
    • Frequent RLC segmentation.
    • Reception of RLC Status PDUs (RLC BLER).
    • In case of CA, deactivation of the secondary cell.
  2. Collect TCP statistics using the ssStats tool on the iperf server or the UPF.

It displays TCP statistics and state information, which show the state of the TCP sender's congestion window. With this information we can find out when duplicate ACKs occurred or when a packet was lost due to a timeout.
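Once periodic congestion window samples are available, the loss events can be picked out with a simple heuristic. The sketch below assumes the samples have already been collected as (timestamp, cwnd) pairs, for example from the cwnd field that the Linux `ss -ti` command reports, if that is the tool in use. The function name, the sample data and the rule "a drop to 1 means timeout, any other drop means duplicate ACKs" are assumptions for illustration.

```python
def classify_cwnd_events(samples):
    """Label each congestion window decrease in a list of (time, cwnd) samples."""
    events = []
    for (_, prev_cwnd), (t_cur, cur_cwnd) in zip(samples, samples[1:]):
        if cur_cwnd >= prev_cwnd:
            continue  # window grew or stayed the same: no loss event
        if cur_cwnd <= 1:
            events.append((t_cur, "timeout"))        # back to slow start
        else:
            events.append((t_cur, "duplicate_acks")) # window cut, not reset
    return events


# Hypothetical samples: (seconds, cwnd in segments).
samples = [(0, 32), (1, 33), (2, 16), (3, 17), (4, 1), (5, 2)]
print(classify_cwnd_events(samples))
```

The timestamps of the reported events can then be lined up against the radio and transport KPIs from step 1 to find what happened in the network at that moment.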

Minimize packet Drop

TCP Packet Flow diagram in 5G

Data flows between the Data Network and the UE via the N6, N3, F1-U and air interfaces. Radio channel conditions cause packet drops, and retransmissions delay packet delivery. This causes the TCP window to shrink and adjust to the throughput supported by the radio channel. Apart from the radio channel between the UE and the DU, there are other nodes (DU, CU-UP, UPF), as shown in Figure 3: 5G Network Topology, where packet drops can happen. Here we will try to minimize the packet drop in all the nodes.

Ensure packet drop does not happen in DU

Sometimes packets are dropped in RLC when the RLC queue is full on the DU, or in PDCP when the PDCP queue is full in the CU-UP. The RLC queue becomes full when the scheduling rate is lower than the packet rate pumped from the CU-UP. If the radio condition limits throughput, the drop should happen in the CU-UP rather than in the DU. Here is the reason:

TCP packet drop analysis

As shown in Figure 4 TCP Packet Flow, the UE receives PSN-0 and PSN-1 (PSN: PDCP Sequence Number) but does not receive PSN-2. PSN-0 and PSN-1 are delivered to the TCP application. However, PSN-3 is not delivered immediately because PSN-2 has not been received, so the PDCP reordering timer is started at the UE. After the PDCP reordering timer expires, PSN-3 is delivered to the TCP application. This added delay causes a further reduction in TCP throughput. If the packet is instead dropped in the CU-UP, no PDCP Sequence Number is assigned to it; PSN-2 is assigned to the next packet, with TCP-SN (TCP Sequence Number) 4292, so the drop of the packet with TCP-SN 2892 leaves no gap in the PDCP sequence and causes no delay due to PDCP reordering timer expiry.
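The difference between the two drop locations can be illustrated with a toy model of PDCP numbering. The sketch below is not taken from a PDCP implementation. The function names are hypothetical, and the first two TCP sequence numbers (92 and 1492) are assumed by extending the 1400-byte spacing between TCP-SN 2892 and 4292.

```python
def assign_pdcp_sn(tcp_seqs, dropped_in_cu_up=()):
    """Number packets at the CU-UP; packets dropped there never get a PSN."""
    numbered = []
    for tcp_seq in tcp_seqs:
        if tcp_seq in dropped_in_cu_up:
            continue  # dropped before PDCP numbering
        numbered.append((len(numbered), tcp_seq))
    return numbered


def split_deliverable(received_psns):
    """Split PSNs seen at the UE into (delivered now, held for reordering)."""
    pending = set(received_psns)
    next_expected = 0
    delivered = []
    while next_expected in pending:
        delivered.append(next_expected)
        pending.remove(next_expected)
        next_expected += 1
    return delivered, sorted(pending)


tcp_seqs = [92, 1492, 2892, 4292]  # first two values are assumed

# Case 1: drop in the DU, after numbering. PSN-2 (TCP-SN 2892) never arrives.
du_case = [psn for psn, seq in assign_pdcp_sn(tcp_seqs) if seq != 2892]
print(split_deliverable(du_case))      # ([0, 1], [3])

# Case 2: drop in the CU-UP, before numbering. PSN-2 now carries TCP-SN 4292.
cu_case = [psn for psn, _ in assign_pdcp_sn(tcp_seqs, dropped_in_cu_up={2892})]
print(split_deliverable(cu_case))      # ([0, 1, 2], [])
```

In the first case PSN-3 is held until the reordering timer expires; in the second case every received packet is delivered immediately, and TCP alone recovers the missing segment.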

Optimum PDCP Reordering Timer

The reordering timer value should be large enough to reorder the out-of-order packets. If the reordering timer expires before all the out-of-order or missing packets are received, PDCP delivers all the received packets to the application and moves the reordering window ahead. Missing packets that arrive after expiry are dropped because the reordering window has already moved ahead.
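The trade-off in choosing the timer value can be seen in a simplified receiver model. The sketch below follows the behaviour described above (hold out-of-order packets, flush everything on expiry, discard late arrivals) and deliberately ignores the details of the 3GPP state variables. The function name, arrival times and timer values are hypothetical.

```python
def pdcp_receive(arrivals, t_reordering_ms):
    """arrivals: (time_ms, psn) pairs sorted by time.

    Returns (delivered, discarded) as lists of (time_ms, psn).
    """
    next_expected = 0
    buffer = set()
    deadline = None  # expiry time of the running reordering timer
    delivered, discarded = [], []

    def flush(now):
        # Timer expiry: deliver everything buffered, move the window ahead.
        nonlocal next_expected, deadline
        for psn in sorted(buffer):
            delivered.append((now, psn))
        next_expected = max(buffer) + 1
        buffer.clear()
        deadline = None

    for now, psn in arrivals:
        if deadline is not None and now >= deadline:
            flush(deadline)
        if psn < next_expected:
            discarded.append((now, psn))  # arrived after the window moved
            continue
        buffer.add(psn)
        while next_expected in buffer:    # in-order delivery
            buffer.remove(next_expected)
            delivered.append((now, next_expected))
            next_expected += 1
        if buffer and deadline is None:
            deadline = now + t_reordering_ms  # gap detected: start timer
        elif not buffer:
            deadline = None                   # gap filled: stop timer
    if deadline is not None:
        flush(deadline)
    return delivered, discarded


# PSN-2 is retransmitted and arrives late, at 60 ms.
arrivals = [(0, 0), (1, 1), (2, 3), (3, 4), (60, 2)]
print(pdcp_receive(arrivals, t_reordering_ms=50))
print(pdcp_receive(arrivals, t_reordering_ms=100))
```

With the 50 ms timer, PSN-3 and PSN-4 are released at 52 ms and the late PSN-2 is discarded; with the 100 ms timer, nothing is discarded but PSN-3 and PSN-4 wait until PSN-2 arrives at 60 ms.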

Reception of RLC Status PDU- RLC BLER

If an RLC Status PDU is received late or a NACK is received, the RLC sender entity triggers a retransmission. The RLC reassembly timer should be set longer than the time taken by the average number of HARQ retransmissions, which reduces the NACKs in the RLC Status PDU.

 Avoid Secondary Cell Deactivation in case of Carrier Aggregation

When traffic from the CU-UP is low, the Secondary Cell (SCell) gets deactivated. Cell activation and deactivation take time, so the SCell Deactivation Timer should be configured to an optimum value so that the SCell is not deactivated too often.

Scheduling limitation on air interface.

Throughput can be low because packets are scheduled too infrequently in the MAC layer. The scheduling information can be checked via the following KPIs:

  • UL and DL initial and residual BLER
  • MCS, CQI, PRB usage in each TTI for DL
  • Scheduling interval of UE, ideally UE should be scheduled every TTI.
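The two BLER figures can be computed from per-transport-block HARQ records. The sketch below assumes the usual definitions: initial BLER is the fraction of blocks whose first transmission failed, and residual BLER is the fraction of blocks still not acknowledged after the last HARQ attempt. The record format, function name and sample counts are hypothetical.

```python
def bler(blocks):
    """blocks: list of (attempts, acked) per transport block.

    Returns (initial_bler, residual_bler) as fractions of all blocks.
    """
    total = len(blocks)
    if total == 0:
        return 0.0, 0.0
    # First transmission failed if a retransmission was needed or it never succeeded.
    initial_failures = sum(1 for attempts, acked in blocks
                           if attempts > 1 or not acked)
    # Residual failure: still not acknowledged after the final attempt.
    residual_failures = sum(1 for _, acked in blocks if not acked)
    return initial_failures / total, residual_failures / total


# Hypothetical log: 90 blocks pass first time, 8 pass on the second
# attempt, 2 fail even after 4 attempts.
blocks = [(1, True)] * 90 + [(2, True)] * 8 + [(4, False)] * 2
initial, residual = bler(blocks)
print(f"initial BLER {initial:.0%}, residual BLER {residual:.0%}")
```

A high residual BLER is the more damaging of the two for TCP, because blocks that HARQ cannot recover must be retransmitted by RLC or, failing that, by TCP itself.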

Article Submitted By:

About Prerit Jain
Prerit Jain has about 15 years of experience in 4G and 5G system development and currently works as Sr. Member of Technical Staff at Altiostar Networks Inc. He completed his Bachelor of Technology at NSIT, Delhi, India.
You may reach him on LinkedIn.