Introduction
SubRip Text (SRT) is a common file format for storing subtitles, and it’s often used for displaying closed captions for videos. If you’re working with SRT files in a JavaScript project, you may need to convert the SRT data into plain text. In this article, we’ll take a look at how to do this using regular expressions (regex) in JavaScript.
There are several ways to convert SRT to text using regular expressions in JavaScript. After a quick look at the fs module, which is used to read the file, we will show two regex-based methods: one built on replace() and one built on match().
Method-1: Using the fs module
To process the SRT text, we first need the fs module, which lets us
interact with the file system. The fs module is built into Node.js, so
there is nothing to install with npm. It only needs to be loaded with
require('fs') at the top of the script.
Once the contents of the file have been read into a string, we can make
use of the regex methods to convert srt to text.
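As a minimal sketch of the file-reading step, the snippet below loads the built-in fs module and reads an SRT file into a string. The helper name readSrt and the temporary sample file are assumptions made only so the example can run on its own; in a real project you would pass the path of your own file.

```javascript
// fs, os and path are built-in Node.js modules; no npm install is required.
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: read an SRT file into a string.
function readSrt(filePath) {
  return fs.readFileSync(filePath, 'utf8');
}

// Demo only: write a small sample file to a temporary directory, then read it back.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'srt-demo-'));
const demoPath = path.join(demoDir, 'sample.srt');
fs.writeFileSync(
  demoPath,
  "1\n00:00:51,916 --> 00:00:54,582\n'London in the 1960s.\n",
  'utf8'
);

const contents = readSrt(demoPath);
console.log(contents);
```

The string returned by readSrt is what the regex-based methods operate on.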
Method-2: Using the replace() method
To convert an srt file to text in JavaScript using regex, you would
first need to read the contents of the srt file using the fs module in
Node.js. Then, you would need to use a regular expression to extract the
text from the srt file. Here is an example of how this could be done:
const fs = require('fs');
// Read the contents of the srt file
const srtFile = fs.readFileSync('/path/to/file.srt', 'utf8');
// Remove each cue's sequence number and timestamp line, keeping only the text.
// \r? lets the pattern also match files with Windows (CRLF) line endings.
const text = srtFile.replace(/^\d+\r?\n\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}\r?\n/gm, '');
console.log(text);
Output
For an SRT file that contains these two cues:
1
00:00:51,916 --> 00:00:54,582
'London in the 1960s.

2
00:00:54,708 --> 00:00:57,124
'Everyone had a story about the Krays.
the script prints only the subtitle text, keeping the blank line that separates the cues:
'London in the 1960s.

'Everyone had a story about the Krays.
In this example, the regular expression
/^\d+\r?\n\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}\r?\n/gm
matches the sequence number and the timestamp line at the start of each
cue. The g flag applies the pattern to every cue, and the m flag lets ^
match at the start of each line rather than only at the start of the
string. The replace() method then removes these matches and keeps only
the text itself.
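The replace() approach can also be wrapped in a small function that works on a string, which makes it easier to test without a file on disk. The sketch below is one possible version: the name stripSrtHeaders is hypothetical, and normalizing line endings and collapsing blank lines are additions that go slightly beyond the example above.

```javascript
// Hypothetical helper: strip sequence numbers and timestamp lines from SRT text.
function stripSrtHeaders(srt) {
  return srt
    // Normalize Windows (CRLF) line endings to LF first.
    .replace(/\r\n/g, '\n')
    // Remove "<number>\n<start> --> <end>\n" at the start of each cue.
    .replace(/^\d+\n\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}\n/gm, '')
    // Collapse the blank lines that separated the cues.
    .replace(/\n{2,}/g, '\n')
    .trim();
}

// Sample input built in memory, so no file is needed.
const sample = [
  '1',
  '00:00:51,916 --> 00:00:54,582',
  "'London in the 1960s.",
  '',
  '2',
  '00:00:54,708 --> 00:00:57,124',
  "'Everyone had a story about the Krays.",
  '',
].join('\n');

console.log(stripSrtHeaders(sample));
```

For this sample, the function returns the two subtitle lines joined by a single newline.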
Method-3: Using the match() method
Here is another approach that you could use to convert an srt file to text in JavaScript:
const fs = require('fs');
// Read the contents of the srt file
const srtFile = fs.readFileSync('/path/to/file.srt', 'utf8');
// Split the srt file into an array of lines (handles both LF and CRLF line endings)
const lines = srtFile.split(/\r?\n/);
// Use a for loop to iterate over the lines in the array
for (let i = 0; i < lines.length; i++) {
  // Skip lines that contain only a sequence number or a timestamp range
  if (lines[i].match(/^\d+$/) || lines[i].match(/^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}$/)) {
    continue;
  }
  // Print the remaining lines, which should be the text from the srt file
  console.log(lines[i]);
}
Output
For an SRT file that contains these two cues:
1
00:00:51,916 --> 00:00:54,582
'London in the 1960s.

2
00:00:54,708 --> 00:00:57,124
'Everyone had a story about the Krays.
the script prints the subtitle text. Blank lines are not filtered out, so they are printed as well:
'London in the 1960s.

'Everyone had a story about the Krays.
This approach uses the split() method to split the contents of the srt
file into an array of lines. A for loop then iterates over the lines,
and match() checks whether each line consists solely of a sequence
number or a timestamp range. If it does, the loop continues to the next
iteration. Otherwise, the line is printed to the console. Note that a
subtitle line made up only of digits would also be skipped.
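The same line-by-line idea can be sketched as a function that returns the text lines instead of printing them. This version is an illustration rather than the method above verbatim: the names extractTextLines, INDEX_LINE and TIMESTAMP_LINE are hypothetical, it uses RegExp test() in place of match() because only a yes/no answer is needed, and it also drops blank lines.

```javascript
// Patterns for the two kinds of lines to skip.
const INDEX_LINE = /^\d+$/;
const TIMESTAMP_LINE = /^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}$/;

// Hypothetical helper: return only the subtitle text lines of an SRT string.
function extractTextLines(srt) {
  return srt
    .split(/\r?\n/)                 // handle LF and CRLF line endings
    .map((line) => line.trim())
    .filter(
      (line) =>
        line !== '' &&              // drop blank separator lines
        !INDEX_LINE.test(line) &&   // drop sequence numbers
        !TIMESTAMP_LINE.test(line)  // drop timestamp ranges
    );
}

const srtSample =
  "1\n00:00:51,916 --> 00:00:54,582\n'London in the 1960s.\n\n" +
  "2\n00:00:54,708 --> 00:00:57,124\n'Everyone had a story about the Krays.\n";

console.log(extractTextLines(srtSample).join('\n'));
```

Returning an array leaves the caller free to join the lines with newlines, spaces, or any other separator.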
Summary
The SRT format is a common file format for storing subtitles in a text file. It is often used for displaying closed captions for videos. In JavaScript, you can use regular expressions to convert SRT data into plain text.
To do this, you can use the fs module in Node.js to read the contents
of the SRT file. Then, you can use a regular expression to extract the
text from the SRT file. One approach is to use the replace() method to
remove the sequence number and timestamp line at the start of each cue.
Another approach is to split the file into lines and use the match()
method to skip the lines that contain only a sequence number or a
timestamp, printing the remaining lines, which should be the text from
the SRT file.
It’s important to note that these approaches may not work for all SRT files, as the format can vary. You may need to modify the regular expressions or use a different approach to extract the text from the SRT file.
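One way to make the conversion more tolerant is to work cue by cue instead of line by line. The sketch below is a hedged example, not a complete SRT parser. The variations it handles (a leading byte order mark, CRLF line endings, a period instead of a comma before the milliseconds, extra text after the timestamp, and tags such as <i>) are assumptions about what a file might contain, and the names TIMING and srtToPlainText are hypothetical.

```javascript
// Tolerant timestamp pattern: ',' or '.' before the milliseconds, flexible
// spacing around the arrow, and any trailing text after the end time.
const TIMING = /^\d{1,2}:\d{2}:\d{2}[,.]\d{1,3}\s*-->\s*\d{1,2}:\d{2}:\d{2}[,.]\d{1,3}.*$/;

// Hypothetical helper: convert SRT text to plain text, one subtitle line per line.
function srtToPlainText(srt) {
  const cues = srt
    .replace(/^\uFEFF/, '')    // drop a leading byte order mark, if present
    .replace(/\r\n?/g, '\n')   // normalize CRLF and lone CR to LF
    .split(/\n{2,}/);          // cues are separated by blank lines

  const textLines = [];
  for (const cue of cues) {
    const lines = cue.split('\n');
    // Everything after the timing line is treated as subtitle text.
    const timingAt = lines.findIndex((line) => TIMING.test(line.trim()));
    if (timingAt === -1) continue; // no timing line: not a cue, skip it
    for (const line of lines.slice(timingAt + 1)) {
      const cleaned = line.replace(/<[^>]+>/g, '').trim(); // strip tags such as <i>
      if (cleaned !== '') textLines.push(cleaned);
    }
  }
  return textLines.join('\n');
}

const messySample =
  '\uFEFF1\r\n00:00:01.000 --> 00:00:02.000\r\n<i>1984</i>\r\n\r\n' +
  '2\r\n00:00:03,000 --> 00:00:04,000 X1:0\r\nSecond line\r\n';

console.log(srtToPlainText(messySample));
```

Because text is taken by its position after the timing line rather than by its content, a subtitle that consists only of digits, such as 1984 in the sample, is kept instead of being mistaken for a sequence number.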