Screen Recorder
I Demand a VOD
Recently, there was a streamed presentation for the whole company, but the recording was not available to watch in our own time. The video was to be broadcast several times over the next few days to accommodate people's schedules and timezones. After looking at my calendar, I noticed that none of the upcoming broadcast times worked for me: they either clashed with my other meetings or fell outside my working hours.
I started to think of ways I could watch the presentation: Reschedule a meeting? Nah, the meeting seems more important. Record the stream, watch it at a more convenient time, and then delete the recording? That could work. This got me thinking about how I could record a streamed video on my work laptop without installing any unapproved software. In the end, I decided not to record the broadcast as I was uncomfortable going against the wishes of the organisers, but I needed to know how I could solve this problem.
Coming Up With Ideas
My first idea was to use the Mac's pre-installed QuickTime Player to record my screen. This seemed like a good idea at first because I had used it to record my screen in the past, but I soon realised that it does not record the audio output from applications: it can only record audio from audio input devices. There is a way to work around this with a loopback driver that exposes that audio as an input device, but installing an unnecessary driver that I do not fully understand onto my work laptop is definitely not secure and well outside my bounds of comfort.1
Was there any software on my laptop that I could leverage here? Yes! My next idea was to see if I could use a web browser. Web browsers do so many things, including video conferencing, so I figured there must be a mechanism to access the screen. Also, the stream I wanted to record was viewed in the web browser, so there might be a way to hook into that. After perusing the MDN docs, I learned of the Screen Capture API and the MediaStream Recording API. The Screen Capture API provides access to a video stream of the user's screen, and the MediaStream Recording API provides a way to capture that data and write it to disk.
With both of these APIs, we can build a basic screen recorder.
Capturing the Screen
The first stage of building the screen recorder will consist of using the Screen Capture API to get a video stream of our screen and playing it back in the page.
The Screen Capture API allows you to capture a browser tab, window, or entire screen as a stream and process it however you want. The obvious use case here is video conferencing, but it suits our local screen recording use case perfectly. This API is an extension of the Media Capture and Streams API, which gives access to the user's microphone and webcam, intended for streaming in real-time communication (RTC).
The Screen Capture API consists of one method: MediaDevices.getDisplayMedia(). Calling this method returns a Promise that resolves to a MediaStream object, which can be used to access the video and audio data. For the promise to resolve successfully, the user has to give permission to share their screen and then choose what to share (browser tab, window, or screen). MediaDevices.getDisplayMedia() can take an optional parameter which lets you specify the options that should be available to the user, as well as constraints on the video and/or audio you want to capture.
Let’s see this in action!
Let’s start by creating the following files:
screen-recorder.html
<!DOCTYPE html>
<html>
  <head>
    <link rel="stylesheet" href="style.css">
    <script src="code.js"></script>
    <title>Screen Recorder</title>
  </head>
  <body>
    <h1>Screen Recorder</h1>
    <button onclick="startCapture()">Start Capture</button>
    <button onclick="stopCapture()">Stop Capture</button>
    <video id="video-playback"></video>
  </body>
</html>
style.css
#video-playback {
  border: 1px solid #000000;
  width: 100%;
}
code.js
async function startCapture() {
  try {
    const options = {
      video: true,
      audio: true,
      preferCurrentTab: false,
      selfBrowserSurface: "include",
      surfaceSwitching: "include",
      monitorTypeSurfaces: "include",
    };
    const captureStream = await navigator.mediaDevices.getDisplayMedia(options);
    const videoElement = document.getElementById("video-playback");
    videoElement.srcObject = captureStream;
    videoElement.muted = true;
    videoElement.play();
    captureStream.getVideoTracks()[0].addEventListener("ended", stopCapture);
  } catch (e) {
    console.error(e);
    stopCapture();
  }
}

function stopCapture() {
  const videoElement = document.getElementById("video-playback");
  if (videoElement.srcObject) {
    let tracks = videoElement.srcObject.getTracks();
    tracks.forEach((track) => track.stop());
    videoElement.srcObject = null;
  }
}
Let’s break down what the code does.
The first thing we do is construct an object containing the type of stream we want as well as which options to present to the user. By setting video to true, we're requesting access to a video stream of the user's screen, and by setting audio to true, we're requesting the audio as well. Both the video and audio fields can be set to a MediaTrackConstraints object instead of a boolean if we want specific capabilities or constraints. We're happy with the defaults, so true is good enough for us.
By setting preferCurrentTab to false, the user will be presented with multiple options, not just the current tab. By setting selfBrowserSurface to include, the browser will present the current tab as an option; if this is set to exclude, the current tab is not sharable. Setting surfaceSwitching to include allows the user to change which tab is being shared after starting to share; if this is set to exclude, the user would have to stop and start sharing again to change which tab is shared. By setting monitorTypeSurfaces to include, the user is allowed to share entire monitors, not just browser tabs or windows.
Next, we call navigator.mediaDevices.getDisplayMedia(options) using async/await syntax and store the result in captureStream. captureStream is not set until the user has finished selecting what they want to share. If there is an error (such as a denied permission), an error is thrown, which we catch and log. This try-catch block is unnecessary at the moment, but we will use it in the next part.
Now that we have the stream, we can start using it. We grab the video element from the document and tell it to start playing the captured stream by setting videoElement.srcObject = captureStream. Just setting srcObject does not start playback, so we have to start it explicitly by calling videoElement.play(). Before we play the video, we mute it with videoElement.muted = true. If we don't do this, we would hear the original audio and the captured audio at the same time, which causes an echo effect.
Lastly, we have a stopCapture() function, which we register as a handler for the ended event on the video track. This handler is called when the capture is stopped, which can happen if the user clicks 'stop sharing' in their browser. We also have the 'Stop Capture' button on our page, which calls stopCapture() directly. This function stops the capture and the video element.
To view your web page, you can open screen-recorder.html in your web browser. The URL in your web browser will look something like file:///Some/Path/To/Your/File/screen-recorder.html. You should see the heading "Screen Recorder", a "Start Capture" button, and an empty video element with a thin black outline. Clicking 'Start Capture' should prompt you to share. Once you select what you want to share, you should see your screen inside the page!
Some things to note:
- macOS requires you to give the web browser permission to record your screen. This can be done by going to System Settings → Privacy & Security → Screen Recording and enabling your browser. You will need to restart the browser afterwards.
- Firefox does not support audio capture or several of the options given to getDisplayMedia().2
- On Chrome, audio capture seems to work only if you share a tab, and only the audio from that tab is shared.
- If you get a mysterious error such as DOMException: The object can not be found here., you probably haven't given screen recording permission to the web browser.
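Given these differences between browsers, it can be worth checking that the APIs exist at all before relying on them. Below is a minimal sketch of such a check; the checkSupport helper and its messages are my own invention, and the browser globals are passed in as parameters (in a page you would call checkSupport(navigator, window.MediaRecorder)):

```javascript
// Hypothetical helper: report which capture APIs are missing.
// The globals are parameters so the logic can run outside a browser too.
function checkSupport(nav, recorderCtor) {
  const problems = [];
  if (!nav || !nav.mediaDevices || typeof nav.mediaDevices.getDisplayMedia !== "function") {
    problems.push("Screen Capture API (getDisplayMedia) is unavailable");
  }
  if (typeof recorderCtor !== "function") {
    problems.push("MediaStream Recording API (MediaRecorder) is unavailable");
  }
  return problems; // an empty array means both APIs are present
}
```

startCapture() could call this first and show the messages to the user instead of failing with a cryptic exception.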
Time to Record
Now that we can capture the user's screen and view it in the page, we can move on to recording and saving it to disk. Let's update the HTML to include the recording size and a button to download the recording.
screen-recorder.html
<!DOCTYPE html>
<html>
  <head>
    <link rel="stylesheet" href="style.css">
    <script src="code.js"></script>
    <title>Screen Recorder</title>
  </head>
  <body>
    <h1>Screen Recorder</h1>
    <button onclick="startCapture()">Start Capture</button>
    <button onclick="stopCapture()">Stop Capture</button>
    <button onclick="downloadRecording()">Download</button>
    <a id="download-link"></a>
    <div>Recording Size: <span id="recording-size"></span></div>
    <video id="video-playback"></video>
  </body>
</html>
Now we have a button that we can click to download the recording. The download-link anchor element will be used to trigger the browser to download the file for us, but we don't need to show it on the page. We also have a spot on the page to display the size of the recording as it grows. Now, let's add the code to make it work!
code.js
let recordedChunks = [];
let recordingSize = 0;

async function startCapture() {
  try {
    const options = {
      video: true,
      audio: true,
      preferCurrentTab: false,
      selfBrowserSurface: "include",
      surfaceSwitching: "include",
      monitorTypeSurfaces: "include",
    };
    const captureStream = await navigator.mediaDevices.getDisplayMedia(options);
    const videoElement = document.getElementById("video-playback");
    videoElement.srcObject = captureStream;
    videoElement.muted = true;
    videoElement.play();
    captureStream.getVideoTracks()[0].addEventListener("ended", stopCapture);
    recordedChunks = [];
    recordingSize = 0;
    const recorderOptions = { mimeType: "video/webm" };
    const mediaRecorder = new MediaRecorder(captureStream, recorderOptions);
    mediaRecorder.addEventListener("dataavailable", (event) => {
      if (event.data.size > 0) {
        recordedChunks.push(event.data);
        updateRecordingSize(event.data.size);
      }
    });
    mediaRecorder.start(1000);
  } catch (e) {
    console.error(e);
    stopCapture();
  }
}

function stopCapture() {
  const videoElement = document.getElementById("video-playback");
  if (videoElement.srcObject) {
    let tracks = videoElement.srcObject.getTracks();
    tracks.forEach((track) => track.stop());
    videoElement.srcObject = null;
  }
}

function updateRecordingSize(byteCount) {
  recordingSize += byteCount;
  const span = document.getElementById("recording-size");
  span.innerText = (recordingSize / 1000000).toFixed(2) + " MB";
}

function downloadRecording() {
  const blob = new Blob(recordedChunks, { type: "video/webm" });
  const downloadLink = document.getElementById("download-link");
  downloadLink.href = URL.createObjectURL(blob);
  downloadLink.download = "recording.webm";
  downloadLink.click();
}
We’ve added the variables recordedChunks and recordingSize to store the chunks of recorded data and the number of bytes used by the chunks. We reset these variables in the startCapture() function to prevent chunks from different recordings mixing together and to have the recording size reflect the latest recording. Now, with a place to store the recorded data, we can start recording!
We create a MediaRecorder object with the stream returned from the call to getDisplayMedia(options) and an object containing some options. The options object can contain various properties that tell the MediaRecorder about the media we would like, such as the audio bitrate, video bitrate, or media format. We have supplied the mimeType property set to video/webm, which requests that the container format be WebM; since no codecs are specified, the browser can select its preferred codecs for video and audio. If you want to specify a codec, you can add that information to this property string, e.g. video/webm; codecs="vp8, vorbis". More information can be found here.
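If you do want to pin down a codec, MediaRecorder has a static isTypeSupported() method that reports whether the browser can record a given MIME string. Here is a hedged sketch of picking the first supported candidate; the pickSupported helper is my own invention, and the support check is passed in as a function (in a page: pickSupported(candidates, (t) => MediaRecorder.isTypeSupported(t))):

```javascript
// Hypothetical helper: return the first MIME type the browser says it
// supports, or undefined to let MediaRecorder fall back to its default.
function pickSupported(candidates, isTypeSupported) {
  for (const mimeType of candidates) {
    if (isTypeSupported(mimeType)) {
      return mimeType;
    }
  }
  return undefined;
}

// Candidates ordered from most to least preferred.
const candidates = [
  'video/webm; codecs="vp9, opus"',
  'video/webm; codecs="vp8, vorbis"',
  "video/webm",
];
```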
To access the recorded data, we add an event listener for dataavailable events. These events are fired when the recorder has data ready for us to use, and they contain a chunk of the recording as a Blob object. Our event listener pushes these chunks to our recordedChunks array and then updates the recording size displayed on the page by calling updateRecordingSize(byteCount).
In the updateRecordingSize(byteCount) function we update the recordingSize variable, convert the number of bytes to megabytes3, and then update the recording-size span on the page.
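Note that dividing by 1,000,000 gives SI megabytes (not the binary mebibytes of 1,048,576 bytes). If you wanted the display to pick a sensible unit automatically, one sketch could look like this; the formatSize helper is my own invention, not part of the recorder above:

```javascript
// Hypothetical helper: render a byte count with an appropriate SI unit.
function formatSize(byteCount) {
  const units = ["B", "kB", "MB", "GB"];
  let value = byteCount;
  let i = 0;
  while (value >= 1000 && i < units.length - 1) {
    value /= 1000;
    i += 1;
  }
  return value.toFixed(2) + " " + units[i];
}
```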
To start the recording, we call mediaRecorder.start(1000). Passing in 1000 as an argument tells the recorder to split the recording into 1000-millisecond chunks. If this argument is omitted, the recorder does not split up the recording and only outputs a single chunk. Splitting the recording up is useful: since we have access to the chunks while we are recording, we can keep track of how much memory the recording is using and display that on the page. We could also use this to break the recording up into multiple files (which we are not doing here).
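As a sketch of that multi-file idea (not something the recorder above actually does), the chunks could be grouped into batches of a maximum size as they accumulate, and each batch could then become its own Blob and file. The splitChunks helper below is my own invention:

```javascript
// Hypothetical helper: group chunks into batches of at most maxBytes each.
// Each batch could later be wrapped in its own Blob and downloaded separately.
function splitChunks(chunks, maxBytes) {
  const batches = [];
  let current = [];
  let currentSize = 0;
  for (const chunk of chunks) {
    if (current.length > 0 && currentSize + chunk.size > maxBytes) {
      batches.push(current);
      current = [];
      currentSize = 0;
    }
    current.push(chunk);
    currentSize += chunk.size;
  }
  if (current.length > 0) {
    batches.push(current);
  }
  return batches;
}
```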
To download the recording, we have the downloadRecording() function, which is called when our download button is clicked. This function gives the user access to the recording by 'downloading' the file. We do this with the following steps. First, we create a blob with the data from the recording and an options object. We use the options object to set the MIME type, which indicates to other programs what type of data is in the blob. Now that we have our blob, we want the browser to let us access it. We can achieve this with an anchor tag that has two properties set: href and download. The href property tells the browser the URL of the content, and the download property tells the browser to download the resource with the specified filename instead of displaying it in the browser. We use URL.createObjectURL(blob) to create a URL to our blob and set the download property to recording.webm. There are considerations around the lifetimes of these URLs and releasing them, but for our purposes we won't worry about them. More information can be found here.
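For completeness, the counterpart to URL.createObjectURL() is URL.revokeObjectURL(), which releases the URL so the blob behind it can be garbage collected. One hedged way to scope the lifetime (the withObjectURL helper is my own invention):

```javascript
// Hypothetical helper: create an object URL, hand it to a callback,
// and always revoke it afterwards.
function withObjectURL(blob, use) {
  const url = URL.createObjectURL(blob);
  try {
    use(url);
  } finally {
    URL.revokeObjectURL(url);
  }
}
```

downloadRecording() could wrap its anchor setup and click() in the callback, though some guides suggest delaying the revoke slightly so the download has started before the URL disappears.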
Usually you click an anchor tag to trigger the navigation or download, but here we're using a hidden anchor tag (it has no text) and clicking it programmatically with downloadLink.click(). This was just a UI design choice; there is no reason you can't have the anchor tag visible and click it yourself.
It’s Alive!
With that code, all the buttons should work and we should be able to record our screen!
This is an example of how to make a bare-bones screen recorder with a browser. There are plenty of ways to improve the UI and add features.
Ideas for UI improvements:
- The download button could be disabled initially and while recording
- The play button could stop the recording once clicked and the stop button could be removed
- Better (any) styling
Ideas for features:
- We could take snapshots of the recording and then display them on the page — this can be done with the canvas element
- Support for multiple recordings
- Breaking up large recordings into multiple files
Related topics to look into:
- Codecs: How browsers support them or how they work in general
- MediaTrackConstraints: Tweak properties of the media
If you want to see the recorder in action, you can find it here.
MDN Web Docs
The MDN Web Docs were my main reference and inspiration for this blog post. It is a fantastic resource and is my go-to reference while doing web development. When starting web development, I found it to be big, scary and full of concepts/terminology I didn’t fully understand. But after building experience and confidence, the MDN docs are easy to understand and an absolute treat.
Conclusion
Thank you for reading. This is my first blog post, and I decided to write it within a domain I am confident in: web development. I am currently interested in learning things lower in the stack, such as hardware and systems programming, so future posts probably won't focus on web development.
I’d love to dig into how web browsers have implemented this feature, and how one would implement a native screen recorder. That sounds like a future project.
If you have any questions, find any mistakes, or just want to reach out, you can email me at ‘contact’ ‘at’ this domain.
1. For Mac, I found BlackHole. It appears to mostly consist of a four-thousand-line .c file. Digging into how a loopback driver works would be a good exercise to learn how audio drivers work. Future project idea! ↩︎
2. You can see the compatibility here. The browser compatibility data in the MDN docs is fantastic and also available on GitHub. ↩︎