RSS Feed Requirements

This guide helps you prepare your RSS Feeds so that SpeechKit can reliably convert each article into audio.

Introduction

With the RSS integration, you can use a single RSS Feed for the whole website or a separate RSS Feed for each section of the website. The RSS integration is faster to set up than our API integration because the only development it requires on your side is structuring the RSS Feed.

The SpeechKit RSS Extractor can be adapted to accommodate any RSS structure. However, we suggest including specific RSS Elements in the <item> structure to ensure the best outcome.

Required Elements

SpeechKit will extract content from each element in the <item>. It will process the <title> and <content> elements into SSML and then convert them into audio. It will also store content from each element as metadata for each audio article.

Please include the following elements in each RSS Feed <item>:

<author>

This is the author of the article.

<guid>

This is the Globally Unique Identifier. The JS Player can use this to match audio articles in your SpeechKit audio content management system with articles in your content management system.

<title>

This is the article title.

<description>

This is the article description.

<link>

This is the article URL. The JS Player and Player iFrame embed can use this to match audio articles in your SpeechKit audio content management system with articles in your content management system.

<pubDate>

This is the publication date of the article. If the pubDate changes, e.g. when an article is updated, SpeechKit will regenerate the audio.

<enclosure>

This is multimedia content attached to the article, e.g. images.

<content>

This is the article content (excluding the title).
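
The list above can be checked automatically before a feed is submitted. The following is a minimal Python sketch, not part of SpeechKit itself: the function name missing_elements is hypothetical, and it assumes the elements appear without XML namespaces, exactly as named above. It also assumes that an <enclosure> counts as present only when it has a url attribute.

```python
import xml.etree.ElementTree as ET

# Element names taken from the "Required Elements" list.
REQUIRED = ["author", "guid", "title", "description",
            "link", "pubDate", "enclosure", "content"]


def missing_elements(item_xml: str) -> list[str]:
    """Return the required element names that are absent or empty in one <item>."""
    item = ET.fromstring(item_xml)
    missing = []
    for name in REQUIRED:
        el = item.find(name)
        if el is None:
            missing.append(name)
        elif name == "enclosure":
            # Assumption: an enclosure carries its data in the url attribute.
            if not el.get("url"):
                missing.append(name)
        elif not (el.text or "").strip():
            missing.append(name)
    return missing


sample = """<item>
  <author>author</author>
  <guid>id</guid>
  <title>Title</title>
  <link>article_url</link>
</item>"""

print(missing_elements(sample))
# prints ['description', 'pubDate', 'enclosure', 'content']
```

An empty list means the <item> contains every element named in this guide.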

Content Format

We recommend that the <title>, <description>, and <content> elements contain HTML, although plain text is sufficient.

This is because HTML gives SpeechKit structural context, such as paragraphs and sub-headings, that helps us process the text into audio more accurately.

This content should ideally be in a CDATA section and include HTML:

<content><![CDATA[<p>The first paragraph text</p><p>Second paragraph text.</p>]]></content>

Or escape the contained HTML:

<content>&lt;p&gt;The first paragraph text.&lt;/p&gt;&lt;p&gt;Second paragraph text.&lt;/p&gt;</content>
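
Both encodings carry the same HTML once an XML parser has read them. The Python sketch below illustrates this with the standard library only; the helper names as_cdata and as_escaped are hypothetical and are not SpeechKit functions.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

html_body = "<p>The first paragraph text.</p><p>Second paragraph text.</p>"


def as_cdata(html: str) -> str:
    """Wrap HTML in a CDATA section inside <content>."""
    # A literal "]]>" would close the CDATA section early,
    # so it is split across two adjacent sections.
    safe = html.replace("]]>", "]]]]><![CDATA[>")
    return "<content><![CDATA[" + safe + "]]></content>"


def as_escaped(html: str) -> str:
    """Escape &, < and > so the HTML travels as plain character data."""
    return "<content>" + escape(html) + "</content>"


cdata_xml = as_cdata(html_body)
escaped_xml = as_escaped(html_body)

# Either form parses back to the original HTML string.
assert ET.fromstring(cdata_xml).text == html_body
assert ET.fromstring(escaped_xml).text == html_body

print(escaped_xml)
```

CDATA keeps the feed easier to read by eye, while escaping avoids the special case of content that itself contains "]]>".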

An <item> that uses CDATA for the <title>, <description>, and <content> elements should look like this:

<item>
      <author>author</author>
      <guid>id</guid>
      <title><![CDATA[This is the article title]]></title>
      <description><![CDATA[This is the article description]]></description>
      <link>article_url</link>
      <pubDate>Thu, 01 Oct 2020 16:50:02 +0200</pubDate>
      <enclosure url="image.jpg" length="0" type="image/jpeg" />
      <content><![CDATA[<p>This is paragraph one</p><p>This is paragraph 2</p><p>etc</p><div class="related_articles"></div>]]></content>
</item>
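
If your feed is generated by a script, an <item> like the one above could be assembled along these lines. This is a hedged Python sketch: build_item and its parameters are hypothetical, the length="0" and type="image/jpeg" values are placeholder assumptions, and text is escaped here rather than wrapped in CDATA. The pubDate is formatted in the RFC 822 style that RSS 2.0 uses, which matches the date shown in the example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime
from xml.sax.saxutils import escape, quoteattr


def build_item(author, guid, title, description, link,
               published, image_url, content_html):
    """Assemble one <item> as a string, escaping all text and attribute values."""
    return (
        "<item>"
        f"<author>{escape(author)}</author>"
        f"<guid>{escape(guid)}</guid>"
        f"<title>{escape(title)}</title>"
        f"<description>{escape(description)}</description>"
        f"<link>{escape(link)}</link>"
        # format_datetime needs a timezone-aware datetime to emit an offset.
        f"<pubDate>{format_datetime(published)}</pubDate>"
        # length and type are placeholder values for illustration.
        f'<enclosure url={quoteattr(image_url)} length="0" type="image/jpeg" />'
        f"<content>{escape(content_html)}</content>"
        "</item>"
    )


published = datetime(2020, 10, 1, 16, 50, 2, tzinfo=timezone(timedelta(hours=2)))
item_xml = build_item(
    "author", "id", "This is the article title",
    "This is the article description", "article_url",
    published, "image.jpg",
    "<p>This is paragraph one</p><p>This is paragraph 2</p>",
)

item = ET.fromstring(item_xml)
print(item.findtext("pubDate"))
# prints Thu, 01 Oct 2020 16:50:02 +0200
```

Because a changed pubDate triggers audio regeneration, a generator like this should only update the date when the article itself changes.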