## Transcript File Format Details
This is the initial spec for the podcast transcript format. There are three possible formats detailed below.
<br><br>
## HTML
The HTML transcript format provides a solution when a transcript is available but timecode data is missing or limited. HTML transcript files are considered low-fidelity and are
designed to serve as an accessibility aid and to provide searchable episode content. The HTML format used for podcast transcripts should adhere to the following specifications.
#### HTML tags used:
- `<cite>` : Name of the speaker (if available)
- `<time>` : Start time of monologue (if available)
- `<p>` : Content of monologue
#### Snippet:
```
<cite>Kevin:</cite>
<time>0:00</time>
<p>We have an update planned where we would like to give the ability to upload an artwork file for these videos</p>
<cite>Alban:</cite>
<time>0:09</time>
<p>You're triggering Tom right now with a hey, here's a cool feature.</p>
```
Example file: [example.html](example.html)
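As a rough illustration of how a client might read this format, the Python sketch below collects the three tags listed above into a list of entries. The names `TranscriptParser` and `parse_html_transcript` are hypothetical and not part of this spec. The sketch assumes that each monologue is closed by its `<p>` element and that a trailing colon after the speaker name should be dropped.

```python
from html.parser import HTMLParser


class TranscriptParser(HTMLParser):
    """Collects speaker/time/text entries from <cite>, <time> and <p> tags."""

    def __init__(self):
        super().__init__()
        self.entries = []    # finished monologues, in document order
        self._current = {}   # speaker/time seen since the last <p>
        self._tag = None     # transcript tag we are currently inside
        self._buffer = []    # text chunks collected for that tag

    def handle_starttag(self, tag, attrs):
        if tag in ("cite", "time", "p"):
            self._tag = tag
            self._buffer = []

    def handle_data(self, data):
        if self._tag is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._tag:
            return
        text = "".join(self._buffer).strip()
        if tag == "cite":
            # Assumption: "Kevin:" should be stored as "Kevin".
            self._current["speaker"] = text.rstrip(":").strip()
        elif tag == "time":
            self._current["time"] = text
        else:
            # </p> closes a monologue; speaker and time are optional.
            self.entries.append({
                "speaker": self._current.get("speaker"),
                "time": self._current.get("time"),
                "text": text,
            })
            self._current = {}
        self._tag = None


def parse_html_transcript(html):
    """Return a list of {"speaker", "time", "text"} dicts."""
    parser = TranscriptParser()
    parser.feed(html)
    parser.close()
    return parser.entries
```

Calling `parse_html_transcript` on the snippet above would yield two entries, one per `<p>`. A monologue without `<cite>` or `<time>` gets `None` for the missing field, which matches the "if available" wording of the tag list.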
<br><br>
## JSON
The JSON representation is a flexible format that accommodates various degrees of fidelity in a concise way. The JSON format used for podcast transcripts should adhere to the following specifications.
#### Elements included in this representation:
- `<version>` : The version of the JSON transcript specification
- `<segments>` : An array of dialogue elements (segments)
- `<speaker>` : Name of the speaker
- `<startTime>` : Start time for the segment
- `<endTime>` : End time for the segment (if available)
- `<body>` : Dialogue content
#### Snippet:
```
{
    "version": "1.0.0",
    "segments": [
        {
            "speaker": "Darth Vader",
            "startTime": 0.5,
            "endTime": 0.75,
            "body": "I"
        },
        {
            "speaker": "Darth Vader",
            "startTime": 1,
            "endTime": 1.25,
            "body": "am"
        },
        {
            "speaker": "Darth Vader",
            "startTime": 1.5,
            "endTime": 2.0,
            "body": "your"
        },
        {
            "speaker": "Darth Vader",
            "startTime": 2.25,
            "endTime": 2.50,
            "body": "father."
        },
        {
            "speaker": "Luke",
            "startTime": 2.75,
            "endTime": 3.0,
            "body": "Nooooo"
        }
    ]
}
```
Example file: [example.json](example.json)
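To show how the segment structure might be consumed, here is a minimal Python sketch that loads a JSON transcript and joins consecutive segments from the same speaker into readable lines. The function names `load_segments` and `merge_by_speaker` are hypothetical. Two parts are assumptions of this sketch rather than requirements of the spec: treating `startTime` and `body` as required, and joining segment bodies with a single space.

```python
import json


def load_segments(text):
    """Parse a JSON transcript string and return its list of segments."""
    doc = json.loads(text)
    segments = doc["segments"]
    for seg in segments:
        # Assumption: startTime and body are required; endTime is optional.
        if "startTime" not in seg or "body" not in seg:
            raise ValueError(f"segment missing startTime or body: {seg!r}")
        if "endTime" in seg and seg["endTime"] < seg["startTime"]:
            raise ValueError(f"segment ends before it starts: {seg!r}")
    return segments


def merge_by_speaker(segments):
    """Join consecutive segments from the same speaker into one line."""
    lines = []
    for seg in segments:
        speaker = seg.get("speaker")
        if lines and lines[-1]["speaker"] == speaker:
            # Assumption: bodies are words or phrases joined by one space.
            lines[-1]["body"] += " " + seg["body"]
        else:
            lines.append({
                "speaker": speaker,
                "startTime": seg["startTime"],
                "body": seg["body"],
            })
    return lines
```

Applied to the snippet above, `merge_by_speaker(load_segments(text))` would produce one line for Darth Vader starting at 0.5 and one for Luke starting at 2.75. This kind of merging is useful when a word-level transcript needs to be displayed as paragraphs.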
<br><br>
## SRT
The SRT format was designed for video captions but provides a suitable solution for podcast transcripts. SRT files contain medium-fidelity timestamps and are a
popular export option from transcription services. SRT transcripts used for podcasts should adhere to the following specifications.
#### Properties:
- Max number of lines: 2
- Max characters per line: 32
- Speaker names (optional): Start a new card when the speaker changes. Include the speaker's name, followed by a colon.
#### Snippet:
```
1
00:00:00,000 --> 00:00:02,760
Sarah: In today's episode,
you'll learn whether or not you

2
00:00:02,760 --> 00:00:06,090
should have a podcast trailer.
And if so, what should you

3
00:00:06,090 --> 00:00:11,610
include in one? Welcome to
Podcasting Q&A, where you learn

4
00:00:11,610 --> 00:00:15,750
the best tips and strategies to
launch, grow and monetize your

5
00:00:15,750 --> 00:00:18,630
podcast. This week's question
comes from Gillian.

6
00:00:19,080 --> 00:00:21,450
Gillian: Hi Buzzsprout, Gillian
here from breaking through

7
00:00:21,450 --> 00:00:25,350
careers podcast. My question is,
do we need a podcast trailer?
```
Example file: [example.srt](example.srt)