Thursday, January 7, 2016

Platform Latency of WebSocket and WebRTC

These are some notes about latency in my tilt-and-pan system.

The tilt-and-pan WebRTC Streaming system (link) relies on an underlying Network Architecture (link) which makes it all work over the internet.

There are two main network elements:

  • WebSockets (routed from Pi through intermediate host to PC)
  • WebRTC (direct from Pi to PC)

Architecture

Excerpted from a prior post is this diagram of the network setup:

Basic Premise

When the Cardboard viewer moves, orientation data from the phone is sent via WebSockets to the Pi, which operates the motors that control the direction of the camera.

The camera video is constantly being streamed back to the phone.

There is latency between the actual movement of the phone and the corresponding video image appearing to move.

Scenarios to Investigate

To narrow down where there is latency, I decided to take measurements in two deployments:

  • LAN
  • Internet


The LAN deployment runs the RDVP Server on the Pi, so all comms stay on the local LAN.  The idea is to eliminate public internet uncertainty from the measurements.

The Internet deployment is a more "real-world" use case for future projects.

In all cases the Pi remains on the LAN.

A note about the LAN

During testing, it became clear that the LAN should be further split into wired and wireless.

That is, connections to the local LAN can be on Ethernet or WiFi.

Ethernet does not have to contend with wireless congestion.  WiFi does.

For the tests below, the Pi is always on Ethernet, and the Laptop is tested on both Ethernet and 2.4GHz WiFi.  The WiFi is congested because of the many apartments nearby.


A note about the Internet

The RDVP Server lives on the internet.  But not just anywhere.

My apartment is located in NJ.  The RDVP Server is running in Canada right near Montreal.

So traffic via the internet will have to travel approx 320 miles in each direction (as the crow flies).  The actual network path will likely be a fair amount longer.

Areas to Investigate

  • Network Latency - Ping times and general RTT
  • WebSocket Latency - WebSocket messaging over given Networks
  • WebRTC Latency - Attempt to measure video latency over given Networks


For the WebSocket Latency, I am measuring the RTT from a message sent from my Laptop to the Pi and back again to the Laptop.
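A minimal sketch of how such a round-trip measurement could be taken, assuming the Pi simply echoes each message back unchanged.  The `round_trip` callable is a hypothetical stand-in for "send over the WebSocket and wait for the echo"; it is not the code used for the numbers in this post.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_rtt_ms(round_trip: Callable[[bytes], bytes], samples: int = 20) -> List[float]:
    """Time `samples` echo round trips and return each RTT in milliseconds."""
    rtts: List[float] = []
    for i in range(samples):
        payload = f"ping-{i}".encode()
        start = time.perf_counter()
        reply = round_trip(payload)  # hypothetical: send, then block until the echo arrives
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        if reply != payload:
            raise ValueError(f"unexpected echo for sample {i}")
        rtts.append(elapsed_ms)
    return rtts


def summarize(rtts: List[float]) -> Dict[str, float]:
    """Report min, median and max, since a congested link can produce large outliers."""
    return {"min": min(rtts), "median": statistics.median(rtts), "max": max(rtts)}


if __name__ == "__main__":
    def fake_echo(message: bytes) -> bytes:
        # Stand-in for the real link: pretend the round trip takes about 10ms.
        time.sleep(0.010)
        return message

    print(summarize(measure_rtt_ms(fake_echo, samples=5)))
```

Timing with a monotonic clock on the sending side only means the Laptop and the Pi do not need synchronized clocks.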


For the WebRTC Latency, I'm pointing the webcam at a high-res clock on the computer and then comparing the received video image to the clock.
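The comparison boils down to subtracting the clock reading visible in the received video from the live clock reading at the same instant.  A small sketch of that arithmetic, assuming both readings are written down as `HH:MM:SS.mmm` strings (the format and the function name are assumptions for illustration):

```python
from datetime import datetime, timedelta

CLOCK_FORMAT = "%H:%M:%S.%f"  # assumed format of the on-screen clock


def video_latency_ms(live_clock: str, clock_in_video: str) -> float:
    """Return how far the clock shown in the video lags the live clock, in ms."""
    live = datetime.strptime(live_clock, CLOCK_FORMAT)
    seen = datetime.strptime(clock_in_video, CLOCK_FORMAT)
    return (live - seen) / timedelta(milliseconds=1)


if __name__ == "__main__":
    # Illustrative readings only, not measurements from this post.
    print(video_latency_ms("10:15:30.750", "10:15:30.250"))
```

This ignores readings that straddle midnight, and the result can be no more precise than the clock's display resolution and the video frame interval.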


LAN - Network Latency

Here I am pinging from the Laptop to the Pi on the local LAN.


                        Pi Ethernet
Laptop Ethernet         sub-ms
Laptop WiFi (2.4GHz)    32ms

(highs in hundreds and high hundreds)
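Because the highs matter as much as the average on a congested link, it can help to pull min/avg/max out of ping's summary line rather than reading only the average.  A sketch that parses the summary line printed by the common Linux and macOS `ping` tools; the sample line in the example is made up for illustration and is not one of my measurements.

```python
import re
from typing import Dict

# Matches e.g. "rtt min/avg/max/mdev = ... ms" (Linux iputils)
# and "round-trip min/avg/max/stddev = ... ms" (macOS/BSD).
_SUMMARY = re.compile(
    r"(?:rtt|round-trip) min/avg/max/(?:mdev|stddev) = "
    r"([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms"
)


def parse_ping_summary(output: str) -> Dict[str, float]:
    """Extract min/avg/max/deviation (all in ms) from ping's final summary line."""
    match = _SUMMARY.search(output)
    if match is None:
        raise ValueError("no ping summary line found")
    low, avg, high, dev = (float(g) for g in match.groups())
    return {"min": low, "avg": avg, "max": high, "dev": dev}


if __name__ == "__main__":
    sample = "rtt min/avg/max/mdev = 1.2/32.0/480.5/60.3 ms"  # illustrative only
    print(parse_ping_summary(sample))
```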

Internet - Network Latency

Here I am simply pinging the intermediate RDVP Server.  I cannot ping the Pi directly, of course.



                        RDVP Server
Laptop Ethernet         22ms
Laptop WiFi (2.4GHz)    36ms



LAN - WebSocket Latency


                        Pi Ethernet
Laptop Ethernet         12ms
Laptop WiFi (2.4GHz)    14ms

Internet - WebSocket Latency


                                   RDVP Server    Pi Ethernet via RDVP
Laptop Ethernet                    27ms           53ms
Laptop WiFi (2.4GHz)               38ms           65ms
Laptop LTE Phone Tether (cable)    65ms           99ms



LAN - WebRTC Latency


                   Pi Ethernet
Laptop Ethernet    516ms    527ms

Internet - WebRTC Latency


                                   Pi Ethernet
Laptop LTE Phone Tether (cable)    466ms    430ms






Observations

Firstly, WiFi congestion has a pretty dramatic effect.  It makes for extremely erratic RTT on ping and WebSocket traffic, and hugely increases the average latency.

Otherwise, if you take ½ RTT as the time it takes for a command to reach the Pi, the following tables show that the latency there isn't so bad.

LAN - WebSocket Comms


         Ethernet    WiFi
RTT      12ms        14ms
½ RTT    6ms         7ms

Internet - WebSocket Comms

         Ethernet    WiFi    Cellular
RTT      53ms        65ms    99ms
½ RTT    27ms        33ms    50ms
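The ½ RTT arithmetic can be sketched as follows, using the Laptop-to-Pi WebSocket RTTs measured above (for the Internet case, the "Pi Ethernet via RDVP" column).  Halving the RTT assumes the path is roughly symmetric, which is an approximation rather than something I measured.

```python
from typing import Dict

# WebSocket RTTs in ms, Laptop -> Pi -> Laptop, taken from the measurement tables.
LAN_RTT_MS = {"Ethernet": 12, "WiFi": 14}
INTERNET_RTT_MS = {"Ethernet": 53, "WiFi": 65, "Cellular": 99}


def one_way_estimate_ms(rtt_ms: Dict[str, float]) -> Dict[str, float]:
    """Estimate the time for a command to reach the Pi as half the round trip."""
    return {link: rtt / 2 for link, rtt in rtt_ms.items()}


if __name__ == "__main__":
    print("LAN:", one_way_estimate_ms(LAN_RTT_MS))
    print("Internet:", one_way_estimate_ms(INTERNET_RTT_MS))
```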


The rest of the visual latency is due to the WebRTC Stream.


Observations about WebRTC

In short, I think the framerate is part of the root problem.  The other part is possibly software (UV4L) queuing on the Pi side.

Chrome's internal WebRTC monitoring page (chrome://webrtc-internals) shows many stats about the connection, including the video stream.

Here is one of the stats about frame rate over time.  It's very low.  With various tweaks I've been able to raise this number to about 10 frames/sec, but not much higher.



At 10 frames/sec, the duration between each frame is 100ms.

That is still well short of the approx 500ms of total visual latency seen.  The remainder could be software queuing, as it doesn't appear to be caused by bandwidth limitations or CPU issues on the Laptop side.
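To put rough numbers on that gap, here is a small sketch using the approximate figures above (10 frames/sec and about 500ms of visual latency).  Expressing the latency in "frames of delay" is my own framing, and it assumes, without confirmation, that the remainder is frames waiting in a queue.

```python
def frame_interval_ms(fps: float) -> float:
    """Time between consecutive frames at the given frame rate, in ms."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


def frames_of_delay(total_latency_ms: float, fps: float) -> float:
    """Express a latency as a number of frame intervals at the given frame rate."""
    return total_latency_ms / frame_interval_ms(fps)


if __name__ == "__main__":
    fps = 10.0               # approximate best frame rate reached
    total_latency_ms = 500.0  # approximate visual latency observed

    interval = frame_interval_ms(fps)
    print("frame interval (ms):", interval)
    print("latency not explained by one frame interval (ms):", total_latency_ms - interval)
    print("latency expressed in frame intervals:", frames_of_delay(total_latency_ms, fps))
```

At these figures the visual latency is about five frame intervals, so a single frame interval explains only around a fifth of it.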

A future release (link) of UV4L claims to vastly improve the framerate and nearly eliminate latency. It will be worth measuring again once that is released.