My name is Dave Winer.
river.js has been in use since 2009, and is already the basis for interop for a small community. I think it's time to promote the format for wider use.
This is the first time it's been documented.
How to think of it -- it represents the flow from a collection of feeds, not just a single feed.
However, it is not the collection, rather it's the news from the collection.
And unlike earlier formats such as RSS and OPML, it's JSON not XML. Personally I don't care which it is, but there's a feeling that JSON is more modern.
river.js is the format we're using in the rivers generated by River2.
It's called river.js because it represents a river of news in JSON.
1. It's a JSONP file, a call to a procedure named onGetRiverStream, with one parameter, a JSON structure.
2. At the top level of the structure there are two branches: updatedFeeds and metadata.
3. Under updatedFeeds is a sequence of updatedFeed elements. Each one represents a feed.
4. Each updatedFeed contains several values: feedUrl, websiteUrl, feedTitle, feedDescription, whenLastUpdate, and one or more items.
5. An item contains body, permaLink, pubDate, title, link and id.
6. metadata contains information about the river.js file.
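To make the layout above concrete, here's a minimal sketch in Python of how a reader might unwrap the JSONP call and walk the structure. The key names "updatedFeed" (for the sequence under updatedFeeds) and "item" (for the list of items in a feed) are assumptions based on the description, and the sample data is made up for illustration.

```python
import json

CALLBACK = "onGetRiverStream"  # procedure name given in the format description


def parse_river(text):
    """Strip the JSONP wrapper and return the JSON structure as a dict."""
    text = text.strip()
    if not text.startswith(CALLBACK):
        raise ValueError("not a river.js file: missing call to " + CALLBACK)
    start = text.index("(", len(CALLBACK))  # opening paren of the call
    end = text.rindex(")")                  # closing paren of the call
    return json.loads(text[start + 1:end])


def iter_items(river):
    """Yield (feed, item) pairs for every item in the river."""
    # Assumption: the sequence lives under the key "updatedFeed" and each
    # feed keeps its items in a list under the key "item".
    for feed in river["updatedFeeds"]["updatedFeed"]:
        for item in feed.get("item", []):
            yield feed, item


# Made-up sample data, for illustration only.
sample = """onGetRiverStream ({
  "updatedFeeds": {
    "updatedFeed": [{
      "feedUrl": "http://example.com/rss.xml",
      "websiteUrl": "http://example.com/",
      "feedTitle": "Example Feed",
      "feedDescription": "",
      "whenLastUpdate": "Tue, 23 Jun 2009 12:00:00 GMT",
      "item": [{
        "body": "Hello world",
        "permaLink": "http://example.com/1",
        "pubDate": "Tue, 23 Jun 2009 11:00:00 GMT",
        "title": "First post",
        "link": "http://example.com/1",
        "id": "1"
      }]
    }]
  },
  "metadata": {"version": "3"}
})"""

river = parse_river(sample)
for feed, item in iter_items(river):
    print(feed["feedTitle"], "-", item["title"])
```

Because the file is a procedure call and not bare JSON, a browser can load it with a script tag. Any other reader has to remove the wrapper first, which is all parse_river does.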
Each of the elements of an updatedFeed is mandatory. If you don't have a value for it, include it with an empty string as its value.
The elements of an updatedFeed come from the top level of a feed, except for feedUrl, which is of course the address of the feed itself, and whenLastUpdate, which is the time when the new items from the feed were read by the aggregator.
websiteUrl comes from the link element in the feed, feedTitle from the title element and feedDescription from the description element.
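As an illustration of that mapping, here's a rough sketch in Python that builds the feed-level values from the text of an RSS 2.0 feed. The function name updated_feed_header and the sample feed are hypothetical, and a real aggregator would also handle other feed formats. Any missing value comes out as an empty string, since every element is mandatory.

```python
import xml.etree.ElementTree as ET


def updated_feed_header(feed_url, rss_text, when_last_update):
    """Build the feed-level values of an updatedFeed from RSS 2.0 text."""
    channel = ET.fromstring(rss_text).find("channel")

    def text_of(tag):
        # Mandatory values with nothing to report become empty strings.
        if channel is None:
            return ""
        return (channel.findtext(tag) or "").strip()

    return {
        "feedUrl": feed_url,                         # address of the feed itself
        "websiteUrl": text_of("link"),               # from <link>
        "feedTitle": text_of("title"),               # from <title>
        "feedDescription": text_of("description"),   # from <description>
        "whenLastUpdate": when_last_update,          # set by the aggregator
    }


# Made-up feed with no description element.
rss = """<rss version="2.0"><channel>
<title>Example Feed</title>
<link>http://example.com/</link>
</channel></rss>"""

header = updated_feed_header(
    "http://example.com/rss.xml", rss, "Tue, 23 Jun 2009 12:00:00 GMT"
)
print(header)
```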
body is the description from the feed, with HTML markup stripped, and limited to 280 characters. If the original text was longer than the maximum length, three periods are added at the end.
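Here's one way the body rule could be sketched in Python. It is not the aggregator's actual code. Two things are assumptions: that the text is cut to 280 characters before the three periods are appended (so a truncated body is 283 characters long), and that runs of whitespace left behind by the markup are collapsed.

```python
from html.parser import HTMLParser

MAX_BODY = 280  # limit given in the text


class _TextOnly(HTMLParser):
    """Collects text content and drops the tags around it."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # also decodes entities like &amp;
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def make_body(description):
    """Strip markup from a feed description and limit its length."""
    parser = _TextOnly()
    parser.feed(description)
    parser.close()  # flush any text still buffered by the parser
    # Assumption: collapse whitespace left over after removing tags.
    text = " ".join("".join(parser.parts).split())
    if len(text) > MAX_BODY:
        # Assumption: cut to the limit first, then append the three periods.
        text = text[:MAX_BODY] + "..."
    return text


print(make_body("<p>Hello <b>world</b> &amp; friends</p>"))
```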
permaLink, pubDate, title and link are straightforward copies of what appeared in the feed.
id is a number assigned to the item by the aggregator. Usually it is incremented by one for each item, but that's not guaranteed.
Optional elements are comments, enclosure and thumbnail.
comments points to a page of comments related to the item (it's exactly as in RSS 2.0).
enclosure is exactly as in RSS 2.0, with three sub-elements, url, type and length.
thumbnail has three sub-elements, url that points to the full image, and width and height which give the size of the thumbnail.
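A small sketch in Python of how the optional elements might be attached to an item. The helper name add_optional is hypothetical, and it assumes each of enclosure and thumbnail is a single object holding its three sub-elements. An element that isn't supplied is left out entirely, which is the difference from the mandatory elements.

```python
def add_optional(item, comments=None, enclosure=None, thumbnail=None):
    """Return a copy of item with whichever optional elements were supplied."""
    out = dict(item)
    if comments:
        out["comments"] = comments
    if enclosure:
        # Same three sub-elements as an RSS 2.0 enclosure.
        out["enclosure"] = {k: enclosure[k] for k in ("url", "type", "length")}
    if thumbnail:
        out["thumbnail"] = {k: thumbnail[k] for k in ("url", "width", "height")}
    return out


# Made-up values, for illustration only.
item = {"title": "Podcast episode", "id": "7"}
with_audio = add_optional(
    item,
    enclosure={"url": "http://example.com/ep7.mp3", "type": "audio/mpeg", "length": "123456"},
)
print(with_audio)
```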
The top-level metadata element contains data that would have appeared in the head element if this were OPML or at the top level of an RSS feed.
docs is a link to a web page that documents the format. It's present here considering the possibility that someone finds this file 50 years from now and has no idea what the format is. It should tell them, more or less, what they're looking at. As soon as this document is finished I'm going to have my rivers point to this page from this element.
whenGMT says when the file was built in a universal time. whenLocal is a string that says when it was built in local time so the person responsible for the file doesn't have to do the translation when debugging the code.
version is 3.
secs is the number of seconds it took to build the file.
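Putting the metadata elements together, here's a hedged sketch in Python of how an aggregator might fill them in. The date formats, the docs address and the choice to store version as a string are all assumptions for illustration. The description above doesn't pin them down.

```python
import time
from email.utils import formatdate


def build_metadata(docs_url, started, finished):
    """Build the metadata branch from the build's start and end times.

    started and finished are seconds since the epoch, as from time.time().
    """
    return {
        "docs": docs_url,
        # Assumption: RFC 822 style date, as used for pubDate in RSS 2.0.
        "whenGMT": formatdate(finished, usegmt=True),
        # Assumption: same layout, in the machine's local time, no zone name.
        "whenLocal": time.strftime("%a, %d %b %Y %H:%M:%S", time.localtime(finished)),
        "version": "3",
        "secs": round(finished - started, 3),
    }


start = time.time()
# ... the river would be built here ...
metadata = build_metadata("http://example.com/riverjs-docs", start, time.time())
print(metadata)
```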
In this post, written on June 23, 2009, I admit that 280 is an arbitrary number.
In that post I talk about doing an experiment with the feeds of the NY Times and the BBC. We actually did that experiment, and found that the average length of descriptions was longer than 140 characters (the limit in Twitter), but less than 500 characters.
In case it isn't obvious, 280 is exactly twice 140. :-)