Screen Recorder

I Demand a VOD

Recently, there was a streamed presentation for the whole company, but the recording was not made available to watch in our own time. Instead, the video was to be broadcast several times over the next few days to accommodate people’s schedules and time zones. After looking at my calendar, I noticed that none of the upcoming broadcast times worked for me: they either clashed with my other meetings or fell outside my working hours.

I started to think of ways I could watch the presentation: Reschedule a meeting? Nah, the meeting seems more important. Record the stream, watch it at a more convenient time, and then delete the recording? That could work. This got me thinking about how I could record a streamed video on my work laptop without installing any unapproved software. In the end, I decided not to record the broadcast as I was uncomfortable going against the wishes of the organisers, but I needed to know how I could solve this problem.

Coming Up With Ideas

My first idea was to use the Mac’s pre-installed QuickTime Player to record my screen. This seemed like a good idea at first because I had used it to record my screen in the past, but I soon realised that it does not record the audio output from applications; it can only record audio from audio input devices. There is a way to work around this with a loopback driver that exposes the output audio as an input device, but installing an unnecessary driver that I do not fully understand onto my work laptop is definitely not secure and well outside my bounds of comfort1.

Was there any software on my laptop that I could leverage here? Yes! My next idea was to see if I could use a web browser. Web browsers do so many things, including video conferencing, so I figured they must have a mechanism to access the screen. Also, the stream I wanted to record was viewed in the web browser, so there might be a way to hook into that. After perusing the MDN docs, I learned of the Screen Capture API and the MediaStream Recording API. The Screen Capture API provides access to a video stream of the user’s screen, and the MediaStream Recording API provides a way to capture that data and write it to disk.

With both of these APIs, we can build a basic screen recorder.

Capturing the Screen

The first stage of building the screen recorder will consist of using the Screen Capture API to get a video stream of our screen and playing it back in the page.

The Screen Capture API allows you to capture a browser tab, window, or entire screen as a stream and process it however you want. The obvious use case is video conferencing, but it suits our local screen recording use case perfectly. This API is an extension of the Media Capture and Streams API, which gives access to the user’s microphone and webcam, intended for real-time communication (RTC).

The Screen Capture API consists of one method: MediaDevices.getDisplayMedia(). Calling this method returns a Promise that resolves to a MediaStream object, which can be used to access the video and audio data. For the promise to resolve successfully, the user has to give permission to share their screen and then choose what to share (browser tab, window, or screen). MediaDevices.getDisplayMedia() takes an optional parameter which lets you specify the options that should be available to the user, as well as constraints on the video and/or audio you want to capture. Let’s see this in action!

Let’s start by creating the following files:

screen-recorder.html

<!DOCTYPE html>
<html>
  <head>
    <link rel="stylesheet" href="style.css">
    <script src="code.js"></script>
    <title>Screen Recorder</title>
  </head>
  <body>
    <h1>Screen Recorder</h1>
    <button onclick="startCapture()">Start Capture</button>
    <button onclick="stopCapture()">Stop Capture</button>
    <video id="video-playback"></video>
  </body>
</html>

style.css

#video-playback {
  border: 1px solid #000000;
  width: 100%;
}

code.js

async function startCapture() {
  try {
    const options = {
      video: true,
      audio: true,
      preferCurrentTab: false,
      selfBrowserSurface: "include",
      surfaceSwitching: "include",
      monitorTypeSurfaces: "include",
    };

    const captureStream = await navigator.mediaDevices.getDisplayMedia(options);

    const videoElement = document.getElementById("video-playback");
    videoElement.srcObject = captureStream;
    videoElement.muted = true;
    videoElement.play();

    captureStream.getVideoTracks()[0].addEventListener("ended", stopCapture);
  } catch (e) {
    console.error(e);
    stopCapture();
  }
}

function stopCapture() {
  const videoElement = document.getElementById("video-playback");
  if (videoElement.srcObject) {
    let tracks = videoElement.srcObject.getTracks();
    tracks.forEach((track) => track.stop());
    videoElement.srcObject = null;
  }
}

Let’s break down what the code does.

The first thing we do is construct an object describing what type of stream we want as well as which options to present to the user. By setting video to true, we’re requesting access to a video stream of the user’s screen, and by setting audio to true, we’re requesting the audio as well. Both the video and audio fields can be set to a MediaTrackConstraints object instead of a boolean if we want specific capabilities or constraints; we’re happy with the defaults, so true is good enough for us.
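
To illustrate what that might look like, here is a sketch of passing constraints instead of booleans; the specific values are my own examples, not something our recorder needs:

const options = {
  video: {
    frameRate: { ideal: 30 }, // ask for roughly 30 frames per second
    width: { max: 1920 },     // cap the capture width
  },
  audio: {
    echoCancellation: false,  // capture the audio without processing
  },
};

const captureStream = await navigator.mediaDevices.getDisplayMedia(options);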

By setting preferCurrentTab to false, the user is presented with multiple options, not just the current tab. By setting selfBrowserSurface to include, the browser presents the current tab as an option; if this is set to exclude, the current tab is not sharable. Setting surfaceSwitching to include allows the user to change which tab is being shared after sharing has started; if this is set to exclude, the user has to stop and start sharing again to change which tab is shared. By setting monitorTypeSurfaces to include, the user is allowed to share entire monitors, not just browser tabs or windows.

Next, we call navigator.mediaDevices.getDisplayMedia(options) using async/await syntax and store the result in captureStream. captureStream is not set until the user has finished selecting what they want to share. If something goes wrong (such as the user denying permission), an error is thrown, which we catch and log. The try-catch block does little more than log for now, but we make use of it in the next part.

Now that we have the stream, we can start using it. We grab the video element from the document and tell it to play the stream we are capturing by setting videoElement.srcObject = captureStream. Setting srcObject alone does not start playback, so we start it explicitly by calling videoElement.play(). Before playing the video, we mute it with videoElement.muted = true. If we didn’t, we would hear the original audio and the captured audio at the same time, which causes an echo effect.

Lastly, we have a stopCapture() function, which we register as a listener for the ended event on the video track. This event fires when the capture stops, which can happen if the user clicks ‘stop sharing’ in their browser. We also have the ‘Stop Capture’ button on our page, which calls stopCapture() directly. This function stops the capture and clears the video element.

To view your web page, open screen-recorder.html in your web browser. The URL in your web browser will look something like file:///Some/Path/To/Your/File/screen-recorder.html. You should see the heading “Screen Recorder”, a “Start Capture” button, and an empty video element with a thin black outline. Clicking ‘Start Capture’ should prompt you to share. Once you select what you want to share, you should see your screen inside the page!

Some things to note:

  • macOS requires you to give the web browser permission to record your screen. This can be done by going to System Settings → Privacy & Security → Screen Recording and enabling your browser. You will need to restart the browser afterwards.
  • Firefox does not support audio capture or several of the options given to getDisplayMedia()2.
  • On Chrome, audio capture seems to work only if you share a tab, and only the audio from that tab is captured.
  • If you get a mysterious error such as DOMException: The object can not be found here, you probably haven’t given screen recording permission to the web browser.

Time to Record

Now that we can capture the user’s screen and view it in the page, we can move on to recording it and saving it to disk. Let’s update the HTML to include the recording size and a button to download the recording.

screen-recorder.html

<!DOCTYPE html>
<html>
  <head>
    <link rel="stylesheet" href="style.css">
    <script src="code.js"></script>
    <title>Screen Recorder</title>
  </head>
  <body>
    <h1>Screen Recorder</h1>
    <button onclick="startCapture()">Start Capture</button>
    <button onclick="stopCapture()">Stop Capture</button>
    <button onclick="downloadRecording()">Download</button>
    <a id="download-link"></a>
    <div>Recording Size: <span id="recording-size"></span></div>
    <video id="video-playback"></video>
  </body>
</html>

Now we have a button that we can click to download the recording. The download-link anchor element will be used to trigger the browser to download the file for us, but we don’t need to show it on the page. We also have a spot on the page to put the size of the recording as it grows. Let’s add the code to make it work!

code.js

let recordedChunks = [];
let recordingSize = 0;

async function startCapture() {
  try {
    const options = {
      video: true,
      audio: true,
      preferCurrentTab: false,
      selfBrowserSurface: "include",
      surfaceSwitching: "include",
      monitorTypeSurfaces: "include",
    };

    const captureStream = await navigator.mediaDevices.getDisplayMedia(options);

    const videoElement = document.getElementById("video-playback");
    videoElement.srcObject = captureStream;
    videoElement.muted = true;
    videoElement.play();

    captureStream.getVideoTracks()[0].addEventListener("ended", stopCapture);

    recordedChunks = [];
    recordingSize = 0;
    const recorderOptions = { mimeType: "video/webm" };
    const mediaRecorder = new MediaRecorder(captureStream, recorderOptions);

    mediaRecorder.addEventListener("dataavailable", (event) => {
      if (event.data.size > 0) {
        recordedChunks.push(event.data);
        updateRecordingSize(event.data.size);
      }
    });

    mediaRecorder.start(1000);
  } catch (e) {
    console.error(e);
    stopCapture();
  }
}

function stopCapture() {
  const videoElement = document.getElementById("video-playback");
  if (videoElement.srcObject) {
    let tracks = videoElement.srcObject.getTracks();
    tracks.forEach((track) => track.stop());
    videoElement.srcObject = null;
  }
}

function updateRecordingSize(byteCount) {
  recordingSize += byteCount;
  const span = document.getElementById("recording-size");
  span.innerText = (recordingSize / 1000000).toFixed(2) + " MB";
}

function downloadRecording() {
  const blob = new Blob(recordedChunks, { type: "video/webm" });
  const downloadLink = document.getElementById("download-link");
  downloadLink.href = URL.createObjectURL(blob);
  downloadLink.download = "recording.webm";
  downloadLink.click();
}

We’ve added the variables recordedChunks and recordingSize to store the chunks of recorded data and the number of bytes they use. We reset these variables in the startCapture() function to prevent chunks from different recordings mixing together and to make the displayed size reflect the latest recording. Now, with a place to store the recorded data, we can start recording!

We create a MediaRecorder object with the stream returned from the call to getDisplayMedia(options) and an object containing some options. The options object can contain various properties that tell the MediaRecorder which aspects of the media we would like, such as audio bitrate, video bitrate, or media format. We have supplied the mimeType property set to video/webm, which requests that the container format be WebM; since no codecs are specified, the browser selects its preferred video and audio codecs. If you want to specify a codec, you can add that information to this property string, e.g. video/webm; codecs="vp8, vorbis". More information can be found here.
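
If you want to be defensive about format support, MediaRecorder.isTypeSupported() can probe a MIME string before the recorder is constructed. A small sketch (the candidate list is my own, not part of the recorder above):

// Probe a few container/codec combinations and use the first one the
// browser says it can record
const candidates = [
  'video/webm; codecs="vp9, opus"',
  'video/webm; codecs="vp8, vorbis"',
  "video/webm",
];
const supportedType = candidates.find((type) =>
  MediaRecorder.isTypeSupported(type)
);
const mediaRecorder = new MediaRecorder(captureStream, {
  mimeType: supportedType,
});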

To access the recorded data, we add an event listener for dataavailable events. These events fire when the recorder has data ready for us, and each one carries a chunk of the recording as a Blob object. Our event listener pushes these chunks to our recordedChunks array and then updates the recording size displayed on the page by calling updateRecordingSize(byteCount).

In the updateRecordingSize(byteCount) function, we update the recordingSize variable, convert the number of bytes to megabytes3, and then update the recording-size span on the page.

To start the recording, we call mediaRecorder.start(1000). Passing 1000 as an argument tells the recorder to split the recording into 1000-millisecond chunks. If this argument is omitted, the recorder does not split up the recording and only outputs a single chunk. Splitting the recording up is useful: since we have access to the chunks while we are recording, we can keep track of how much memory the recording is using and display that on the page. We could also use this to break the recording up into multiple files (which we are not doing here). Note that we never call mediaRecorder.stop() explicitly; when stopCapture() stops the tracks, the stream becomes inactive, and the recorder stops on its own, emitting any remaining data in a final dataavailable event.
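
For contrast, here is a sketch of the single-chunk variant, which is not what our recorder does:

// Without a timeslice, dataavailable fires once, after the recorder stops
// (either via recorder.stop() or because every track has ended), and the
// event carries the entire recording as one Blob
const recorder = new MediaRecorder(captureStream, { mimeType: "video/webm" });
recorder.addEventListener("dataavailable", (event) => {
  recordedChunks.push(event.data);
});
recorder.start(); // no argument: one chunk at the end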

To download the recording, we have the downloadRecording() function, which is called when our download button is clicked. This function gives the user access to the recording by ‘downloading’ the file, which takes a few steps. First, we create a blob with the data from the recording and an options object. We use the options object to set the MIME type, which indicates to other programs what type of data is in the blob.

Now that we have our blob, we want the browser to let us access it. We can achieve this with an anchor tag that has two properties set: href and download. The href property tells the browser the URL of the content, and the download property tells the browser to download the resource with the specified filename instead of displaying it. We use URL.createObjectURL(blob) to create a URL to our blob and set the download property to recording.webm. There are considerations around the lifetimes of these URLs and releasing them, but for our purposes we won’t worry about them. More information can be found here.
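
If you did want to release the URL, one approach (a sketch, not part of the code above) is to revoke it after the click has kicked off the download:

downloadLink.click();
// Defer the revocation a tick so the browser has started the download
// before the blob URL is released
setTimeout(() => URL.revokeObjectURL(downloadLink.href), 0);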

Usually you click an anchor tag to trigger the navigation or download, but here we’re using a hidden anchor tag (it has no text) and clicking it programmatically with downloadLink.click(). This was just a UI design choice; there is no reason you can’t have the anchor tag visible and click it yourself.

It’s Alive!

With that code, all the buttons should work and we should be able to record our screen!

This is an example of how to make a bare-bones screen recorder with a browser. There are plenty of ways to improve the UI and add features.

Ideas for UI improvements:

  • The download button could be disabled initially and while recording
  • The play button could stop the recording once clicked and the stop button could be removed
  • Better (any) styling

Ideas for features:

  • We could take snapshots of the recording and then display them on the page — this can be done with the canvas element (see the sketch after this list)
  • Support for multiple recordings
  • Breaking up large recordings into multiple files
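
As a sketch of the snapshot idea mentioned above (the function name and where you wire it up are my own choices):

function takeSnapshot() {
  const video = document.getElementById("video-playback");
  // Size the canvas to the video's intrinsic dimensions
  const canvas = document.createElement("canvas");
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  // Draw the current frame of the playing video onto the canvas
  canvas.getContext("2d").drawImage(video, 0, 0);
  // Turn the frame into an image and append it to the page
  const img = document.createElement("img");
  img.src = canvas.toDataURL("image/png");
  document.body.appendChild(img);
}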

Related topics to look into:

  • Codecs: How browsers support them or how they work in general
  • MediaTrackConstraints: Tweak properties of the media

If you want to see the recorder in action, you can find it here.

MDN Web Docs

The MDN Web Docs were my main reference and inspiration for this blog post. They are a fantastic resource and my go-to reference while doing web development. When I was starting web development, I found them big, scary, and full of concepts and terminology I didn’t fully understand, but after building experience and confidence, I find the MDN docs easy to understand and an absolute treat.

Conclusion

Thank you for reading. This is my first blog post, and I decided to write it within a domain I am confident in: web development. I am currently interested in learning about things lower in the stack, such as hardware and systems programming, so future posts probably won’t focus on web development.

I’d love to dig into how web browsers have implemented this feature, and how one would implement a native screen recorder. That sounds like a future project.

If you have any questions, find any mistakes, or just want to reach out, you can email me at ‘contact’ ‘at’ this domain.



  1. For Mac, I found BlackHole. It appears to mostly consist of a four-thousand-line .c file. Digging into how a loopback driver works would be a good exercise to learn how audio drivers work. Future project idea! ↩︎

  2. You can see the compatibility here. The browser compatibility data in the MDN docs is fantastic and is also available on GitHub. ↩︎

  3. Not to be confused with mebibytes. ↩︎