Sahan Serasinghe

Software Engineer | Data Enthusiast

Understanding WebSockets with ASP.NET Core

2021-01-01 | asp.net core | 8 min read

In this article, we will go through the RFC 6455 WebSocket specification and configure a generic ASP.NET (Core) 5 application to communicate over a WebSocket connection with SignalR. We will dive into the underlying concepts to understand what happens under the covers.

A bit about WebSockets

WebSockets was introduced to enable two-way communication between a client and a server. One of the pain points with HTTP 1.0 was creating and closing a connection each time we sent a request to the server. With HTTP 1.1, however, persistent connections (RFC 2616) were introduced by making use of a keep-alive mechanism. With this, connections could be reused for more than one request - which reduces latency, as the server already knows about the client and the two do not need to repeat the connection handshake for every request.

💡 When you are learning about a protocol, a good place to start is its corresponding RFC specification.

WebSockets is built on top of the HTTP 1.1 spec, as it allows persistent connections. So, when you are making a WebSocket connection for the first time, it is essentially an HTTP 1.1 request (more on this later). This enables real-time communication between a client and a server. In a nutshell, the following diagram depicts what happens during the initiation (handshake), data transfer and closing of a WS connection. We will dive deeper into these concepts later.

understanding-websockets-with-aspnetcore-1.jpg

The protocol has two parts to it: Handshake and Data Transfer.

Handshake

Let’s talk about the opening handshake first. From the spec,

The opening handshake is intended to be compatible with HTTP-based server-side software and intermediaries, so that a single port can be used by both HTTP clients talking to that server and WebSocket clients talking to that server.

Simply put, a WebSocket connection is based on HTTP (and TCP as transport) over a single port. Here’s the summary of steps.

  1. A server must be listening for incoming TCP socket connections. This could be any port you have assigned - normally this would be 80 or 443.
  2. The client initiates the opening handshake (otherwise the server wouldn’t know who to talk to) with an HTTP GET request - this is the “Web” part in “WebSockets”. In the headers, the client asks the server to Upgrade the connection to a WebSocket.
  3. The server sends a handshake response telling the client that it will be changing the protocol from HTTP to WebSocket.
  4. Both client and server negotiate the connection details as part of this exchange (for example, through the optional Sec-WebSocket-Protocol and Sec-WebSocket-Extensions headers). Either of the parties can back out if the terms are unfavourable.

Here’s what a typical opening (client) handshake request looks like.

GET /ws-endpoint HTTP/1.1
Host: example.com:80
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: L4kHN+1Bx7zKbxsDbqgzHw==
Sec-WebSocket-Version: 13

Note how the client sends out Connection: Upgrade and Upgrade: websocket headers in the request.
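To make the header requirements concrete, here is a minimal JavaScript sketch of the kind of check a server could run before agreeing to upgrade. The helper name isWebSocketUpgrade and the shape of the request object (a plain object with lower-cased header names) are assumptions made for illustration; this is not how ASP.NET Core implements it.

```javascript
// Hypothetical helper: decides whether a request looks like a WebSocket opening handshake.
// `request` is assumed to be { method, headers } with lower-cased header names.
function isWebSocketUpgrade(request) {
  const headers = request.headers;

  // The Connection header may carry several tokens, e.g. "keep-alive, Upgrade".
  const connectionTokens = (headers["connection"] || "")
    .toLowerCase()
    .split(",")
    .map((token) => token.trim());

  return (
    request.method === "GET" &&
    (headers["upgrade"] || "").toLowerCase() === "websocket" &&
    connectionTokens.includes("upgrade") &&
    headers["sec-websocket-version"] === "13" &&
    typeof headers["sec-websocket-key"] === "string" &&
    headers["sec-websocket-key"].length > 0
  );
}

// The sample request from above, expressed as a plain object.
const sample = {
  method: "GET",
  headers: {
    host: "example.com:80",
    upgrade: "websocket",
    connection: "Upgrade",
    "sec-websocket-key": "L4kHN+1Bx7zKbxsDbqgzHw==",
    "sec-websocket-version": "13",
  },
};

console.log(isWebSocketUpgrade(sample)); // true
```

A request that lacks any of these headers is just a regular HTTP GET and should be answered as such.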

And, the server handshake response,

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: CTPN8jCb3BUjBjBtdjwSQCytuBo=

Note how the server sends out HTTP/1.1 101 Switching Protocols as the status line of the response. Any status code other than 101 indicates that the opening handshake was not completed.
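Where does the Sec-WebSocket-Accept value come from? According to RFC 6455, the server appends a fixed GUID to the client's Sec-WebSocket-Key, hashes the result with SHA-1 and Base64-encodes the hash. The following is a minimal Node.js sketch of that calculation; the function name computeAcceptKey is made up for this example, and the key used in the last line is the sample key from the RFC rather than the one in the request above.

```javascript
const { createHash } = require("node:crypto");

// Fixed GUID defined by RFC 6455 for the opening handshake.
const WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11";

// Hypothetical helper: derives Sec-WebSocket-Accept from Sec-WebSocket-Key.
function computeAcceptKey(secWebSocketKey) {
  return createHash("sha1")
    .update(secWebSocketKey + WEBSOCKET_GUID)
    .digest("base64");
}

// Sample key/accept pair taken from RFC 6455.
console.log(computeAcceptKey("dGhlIHNhbXBsZSBub25jZQ=="));
// s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

The client performs the same calculation and drops the connection if the values do not match, which proves that the server actually understood the WebSocket handshake.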

The closing handshake is pretty simple. Either the client or server can send out a closing handshake request. From the spec,

It is safe for both peers to initiate this handshake simultaneously. The closing handshake is intended to complement the TCP closing handshake (FIN/ACK), on the basis that the TCP closing handshake is not always reliable end-to-end, especially in the presence of intercepting proxies and other intermediaries.

We will talk about these in action when we jump over to the demo section.

Data Transfer

The next key concept we need to understand is Data Transfer. Either of the parties can send messages at any given time - as it is a Full Duplex communication protocol.

Messages are composed of one or more frames. A frame can be a text (UTF-8) frame, a binary frame, or a control frame (such as 0x8 (Close), 0x9 (Ping), and 0xA (Pong)).

If you are interested, you can read the full spec in RFC 6455 (linked in the References).
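To get a feel for what a frame looks like on the wire, here is a small JavaScript sketch that decodes a single short frame. It only handles payloads of up to 125 bytes (no extended length field) and assumes the input is a Uint8Array holding exactly one frame; decodeSmallFrame is a hypothetical name. Per the RFC, frames sent by the client are masked with a 4-byte key, while frames sent by the server are not. The sample bytes are the masked "Hello" text frame from the examples in RFC 6455.

```javascript
// Opcode names from RFC 6455.
const OPCODES = { 0x0: "continuation", 0x1: "text", 0x2: "binary", 0x8: "close", 0x9: "ping", 0xa: "pong" };

// Hypothetical helper: decodes one frame with a payload of at most 125 bytes.
function decodeSmallFrame(bytes) {
  const fin = (bytes[0] & 0x80) !== 0;     // is this the final fragment of the message?
  const opcode = bytes[0] & 0x0f;          // frame type
  const masked = (bytes[1] & 0x80) !== 0;  // client-to-server frames are masked
  const length = bytes[1] & 0x7f;          // 7-bit payload length
  if (length > 125) {
    throw new Error("Extended payload lengths are out of scope for this sketch");
  }

  let offset = 2;
  let payload;
  if (masked) {
    const maskingKey = bytes.slice(offset, offset + 4);
    offset += 4;
    // Unmask by XOR-ing every payload byte with the masking key.
    payload = bytes
      .slice(offset, offset + length)
      .map((byte, i) => byte ^ maskingKey[i % 4]);
  } else {
    payload = bytes.slice(offset, offset + length);
  }

  return { fin, type: OPCODES[opcode] || "reserved", masked, payload };
}

const maskedHello = Uint8Array.of(0x81, 0x85, 0x37, 0xfa, 0x21, 0x3d, 0x7f, 0x9f, 0x4d, 0x51, 0x58);
const frame = decodeSmallFrame(maskedHello);
console.log(frame.fin, frame.type, new TextDecoder().decode(frame.payload));
// true text Hello
```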

Setup

Let’s put this into action and see how it works.

💡 Follow along with the completed code from my repository here

First, create a new ASP.NET (Core) 5 WebAPI app.

dotnet new webapi -n WebSocketsTutorial
dotnet new sln
dotnet sln add WebSocketsTutorial

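Now we will add SignalR to our app. (Note that Microsoft.AspNet.SignalR is the package for classic ASP.NET; ASP.NET Core ships its own SignalR as part of the shared framework.)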

dotnet add WebSocketsTutorial/ package Microsoft.AspNet.SignalR

Sample code explanation

I like to keep things simple for this tutorial. Therefore, I’m not going to talk about SignalR (Hubs and stuff); the sample is based purely on raw WebSocket communication. You don’t have to use SignalR if you want to keep things even simpler.

We will start by adding the WebSockets middleware to our WebAPI app. Head over to the Startup.cs file and add the following line inside the Configure method.

...

app.UseWebSockets();

...

Next, we will delete the default WeatherForecastController and add a new controller called WebSocketsController. Note that we will just be using a controller action instead of intercepting the request pipeline.

The full code for this controller will look like this. This code is based on Microsoft’s official docs’ example.

WebSocketsController.cs

using System;
using System.Net.WebSockets;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;
using Microsoft.Extensions.Logging;

namespace WebSocketsTutorial.Controllers
{
    [ApiController]
    [Route("[controller]")]
    public class WebSocketsController : ControllerBase
    {
        private readonly ILogger<WebSocketsController> _logger;

        public WebSocketsController(ILogger<WebSocketsController> logger)
        {
            _logger = logger;
        }

        [HttpGet("/ws")]
        public async Task Get()
        {
            if (HttpContext.WebSockets.IsWebSocketRequest)
            {
                using var webSocket = await HttpContext.WebSockets.AcceptWebSocketAsync();
                _logger.Log(LogLevel.Information, "WebSocket connection established");
                await Echo(webSocket);
            }
            else
            {
                HttpContext.Response.StatusCode = 400;
            }
        }
        
        private async Task Echo(WebSocket webSocket)
        {
            var buffer = new byte[1024 * 4];
            var result = await webSocket.ReceiveAsync(new ArraySegment<byte>(buffer), CancellationToken.None);
            _logger.Log(LogLevel.Information, "Message received from Client");

            while (!result.CloseStatus.HasValue)
            {
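                // Decode only the bytes received in this read; the buffer is reused, so its tail may hold zeros or stale data.
                var clientMsg = Encoding.UTF8.GetString(buffer, 0, result.Count);
                var serverMsg = Encoding.UTF8.GetBytes($"Server: Hello. You said: {clientMsg}");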
                await webSocket.SendAsync(new ArraySegment<byte>(serverMsg, 0, serverMsg.Length), result.MessageType, result.EndOfMessage, CancellationToken.None);
                _logger.Log(LogLevel.Information, "Message sent to Client");

                result = await webSocket.ReceiveAsync(new ArraySegment<byte>(buffer), CancellationToken.None);
                _logger.Log(LogLevel.Information, "Message received from Client");
                
            }
            await webSocket.CloseAsync(result.CloseStatus.Value, result.CloseStatusDescription, CancellationToken.None);
            _logger.Log(LogLevel.Information, "WebSocket connection closed");
        }
    }
}

Here’s what we did,

  1. Add a new route called /ws.
  2. Check if the current request is a WebSocket request; otherwise, return a 400.
  3. Wait until the client sends its first message (the first ReceiveAsync call). L:40
  4. Go into a loop until the client closes the connection. L:43
  5. Within the loop, prepend “Server: Hello. You said: ” to the client’s message and send it back to the client.
  6. Wait until the client sends another message.

Note that, after the initial handshake, the server does not need to wait for a request from the client before pushing messages to it. Let’s run the application and see whether it works.

dotnet run --project WebSocketsTutorial

Once you run the application, head over to https://localhost:5001/swagger/index.html. You should see the Swagger UI.

understanding-websockets-with-aspnetcore-2.png

We will now see how we can get the client and server to talk to each other. For the purpose of this demo, I will be using Chrome’s DevTools (Open new tab → Inspect or press F12 → Console tab). But, you can use any client of your choice.

First, we will create a WebSocket connection to our server endpoint.

let webSocket = new WebSocket('wss://localhost:5001/ws');

This initiates a connection between the client and the server. wss:// is the WebSocket Secure scheme, which we use since our WebAPI app is served via TLS.

You can then send messages by calling the webSocket.send() method. Your console should look similar to the one below.

understanding-websockets-with-aspnetcore-3.png
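If you would rather see the replies directly in the console, you can attach event handlers to the socket. Below is a small sketch; attachLogging is a hypothetical helper name, while onopen, onmessage, onerror and onclose are the standard handlers of the browser's WebSocket API.

```javascript
// Hypothetical helper: wires logging handlers onto a WebSocket (or any WebSocket-like object).
function attachLogging(socket, log = console.log) {
  socket.onopen = () => log("open");
  socket.onmessage = (event) => log(`message: ${event.data}`);
  socket.onerror = () => log("error");
  socket.onclose = (event) => log(`close: ${event.code} ${event.reason}`);
  return socket;
}

// Usage in the DevTools console, with the server from this post running:
//   const loggedSocket = attachLogging(new WebSocket("wss://localhost:5001/ws"));
// and, once "open" has been logged:
//   loggedSocket.send("Client: Hello");
```

Calling send() before the connection is open throws an error in the browser, which is why the usage above waits for the "open" line first.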

A closer look at the WebSocket connection

If you go to the Network tab, filter the requests by the WS tab, and click on the last request called ws, you can inspect the connection in more detail.

Click on the Messages tab and examine the messages passed back and forth. During this time, if you invoke the following command, you will be able to see “Client: Hello” appearing in this box, followed by the server’s reply. Give it a try!

webSocket.send("Client: Hello");

understanding-websockets-with-aspnetcore-4.png

As you can see, the server does not need to wait for the client to send a request (that is, after the initial handshake), and the client can send messages without being blocked. This is Full Duplex communication. We have covered the Data Transfer aspect of WebSocket communication. As an exercise, you could run a loop on the server to push messages to the client to see it in action.

In addition to this, the server and client can exchange Ping and Pong frames to check whether the other side is still alive. This is an actual feature in WebSockets! If you really want to have a look at these packets, you can use a tool like Wireshark to get an idea.
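In practice, the WebSocket implementations on both sides take care of these control frames, so you will not see them in your application code. Purely as an illustration, this is a sketch of how a server could build the Pong reply for a Ping: the RFC requires the Pong to carry the same payload as the Ping, and control frame payloads are limited to 125 bytes. buildPongFrame is a hypothetical helper name.

```javascript
// Hypothetical helper: builds the unmasked Pong frame a server could send in reply to a Ping.
// `pingPayload` is assumed to be a Uint8Array of at most 125 bytes.
function buildPongFrame(pingPayload) {
  if (pingPayload.length > 125) {
    throw new Error("Control frame payloads are limited to 125 bytes");
  }
  const frame = new Uint8Array(2 + pingPayload.length);
  frame[0] = 0x80 | 0x0a;         // FIN bit set, opcode 0xA (Pong)
  frame[1] = pingPayload.length;  // mask bit clear (server to client), 7-bit length
  frame.set(pingPayload, 2);      // echo the Ping payload unchanged
  return frame;
}

const pong = buildPongFrame(new TextEncoder().encode("Hello"));
console.log(Array.from(pong, (byte) => byte.toString(16).padStart(2, "0")).join(" "));
// 8a 05 48 65 6c 6c 6f
```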

How does it do the Handshake? Well, if you jump over to the Headers tab, you will be able to see the request-response headers we talked about in the first section of this post 🙌

understanding-websockets-with-aspnetcore-5.png

Have a play around with webSocket.close() too so that we can fully cover the open-data-close loop.
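In the browser, webSocket.close() optionally takes a status code and a reason, for example webSocket.close(1000, "Done"). On the wire these travel in the payload of the Close frame (opcode 0x8): a 2-byte big-endian status code followed by a UTF-8 reason. Here is a minimal sketch of that encoding; the helper names are made up for this example, and the decoder assumes the payload contains at least the 2-byte status code.

```javascript
// Hypothetical helper: builds the payload of a Close frame.
function encodeClosePayload(code, reason = "") {
  const reasonBytes = new TextEncoder().encode(reason);
  // Control frames carry at most 125 bytes, 2 of which are taken by the status code.
  if (reasonBytes.length > 123) {
    throw new Error("Close reason must not exceed 123 bytes");
  }
  const payload = new Uint8Array(2 + reasonBytes.length);
  new DataView(payload.buffer).setUint16(0, code); // DataView writes big-endian by default
  payload.set(reasonBytes, 2);
  return payload;
}

// Hypothetical helper: reads the status code and reason back from a Close payload.
function decodeClosePayload(payload) {
  const view = new DataView(payload.buffer, payload.byteOffset, payload.byteLength);
  const code = view.getUint16(0);
  const reason = new TextDecoder().decode(payload.subarray(2));
  return { code, reason };
}

const closeInfo = decodeClosePayload(encodeClosePayload(1000, "Done"));
console.log(closeInfo.code, closeInfo.reason);
// 1000 Done
```

Status code 1000 stands for a normal closure; these are the same values that show up as event.code and event.reason in the client's onclose handler.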

Conclusion

If you are interested in having a look at the RFC for WebSockets, head over to RFC 6455 and have a read. This post only scratches the surface of WebSockets, and there are many other things that we could discuss such as Security, Load Balancing, Proxies etc.

Don’t forget to let me know any feedback or comments. Until next time ✌️

References

  1. https://tools.ietf.org/html/rfc6455
  2. https://docs.microsoft.com/en-us/aspnet/core/fundamentals/websockets?view=aspnetcore-5.0
  3. https://www.meziantou.net/using-web-sockets-with-asp-net-core.htm
  4. https://developer.mozilla.org/en-US/docs/Web/API/WebSockets_API/Writing_WebSocket_servers