
XML syndication 2: The Second Draft

v1.0 10may2000: First version

See also:
Considering syndication of XML/RSS: The precursor to this document. Considerations, requirements and rationale.

By Matt Webb. If you think this is outrageous, you'd better stop him before he makes a fool of himself.


Right, here we go. This is how XML syndication should work. In further documents I'll go into more detail (method calls etc). Comments very welcome.


Overview

Each network node is a propagator. All propagators have a list of IPs of all other nodes. This network is what all messages and data are transmitted over.

Listeners are local to propagators (ie on the same LAN, but preferably on the same machine). They decide what data is needed. There is only a single listener on a propagator.

Sources put data into the network. Every source must also be a propagator.


Sources

Let there be a source with a new piece of data to publish. This data is identified by some unique ID (say, its original URL and a timestamp). To publish the data the source must do two things:

  1. Propagate a notification that this data exists.
  2. If asked for the data, send the data out, or a later version if one exists.

The mechanisms for both of the above are defined in the next section.


Propagators

Notifications:

Notifications are published via an optimised viral propagation.

Each propagator has an ordered list of all other propagators. This list is ordered by response speed of servers (ie faster and closer nodes have priority). Given a new notification (which we'll call N), the propagator relays it to a certain number of propagators in its list. A relay consists of a query and a response.

Query: Here is N. Was N new to you?
Response: Cheers mate. Yes|No.

The speed of the response is used to order the list.
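
As a rough illustration of that ordering, here is a minimal Python sketch. The RelayList class, its method names and the timed_query helper are all made up for this example, and treating a node that hasn't been timed yet as infinitely slow is just one possible choice.

```python
import time


class RelayList:
    """Hypothetical relay list, kept ordered by last measured response time."""

    def __init__(self, nodes):
        # Nodes we have not timed yet count as infinitely slow (an assumption),
        # so they sort after every measured node.
        self._latency = {node: float("inf") for node in nodes}

    def record(self, node, seconds):
        """Remember how long `node` took to answer its last query."""
        self._latency[node] = seconds

    def ordered(self):
        """Fastest responders first; ties keep their original order."""
        return sorted(self._latency, key=self._latency.get)


def timed_query(relay_list, node, query):
    """Send one query to `node`, time the response, and update the list.

    `query` is any callable taking a node and returning True for 'Yes'
    (N was new) or False for 'No'.
    """
    start = time.monotonic()
    was_new = query(node)
    relay_list.record(node, time.monotonic() - start)
    return was_new
```

The actual network call is left to whatever is passed in as query, so the sketch says nothing about the wire format.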

A notification N consists of three parts:

  1. ID: The unique ID of the data
  2. Path: An ordered list of the nodes that N has passed through such that the first item is the source of the notification and the last is the current node.
  3. Pseudopath: An unordered list of propagators known to have received this notification, ie nodes not directly on the path but that were on the relay-to list higher up than the on-the-path node. If we wanted to be really clever then a relay 'No' response would also return more nodes to be added.
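
The three parts could be held in something like the Python sketch below. The Notification class and its relayed_to and seen_by methods are hypothetical names, and having the sender append the receiver to the path (rather than the receiver adding itself) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Notification:
    data_id: str                         # unique ID of the data, e.g. URL plus timestamp
    path: tuple = ()                     # ordered: source first, current node last
    pseudopath: frozenset = frozenset()  # unordered: other nodes known to have N

    def seen_by(self):
        """Every node known to have received this notification."""
        return set(self.path) | self.pseudopath

    def relayed_to(self, node, already_tried=()):
        """The copy of N handed to `node`.

        The receiver becomes the new end of the path; nodes that were
        higher up the sender's relay list join the pseudopath.
        """
        return Notification(
            data_id=self.data_id,
            path=self.path + (node,),
            pseudopath=self.pseudopath | frozenset(already_tried),
        )
```

For example, if the source relays to b, and b relays to d having already tried c, then d receives a path of (source, b, d) and a pseudopath containing c.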

The propagator relays to the nodes on the relay list, in order, skipping any node that is on N's path or pseudopath. It carries on either until it has relayed to every node on the list or until the number of 'Yes' responses has reached a certain limit (I haven't decided yet).
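
A sketch of that loop in Python, taking the 'stop at a limit' option. The relay function, the dict layout of the notification and the yes_limit parameter are assumptions for illustration; passing a limit at least as large as the list gives the 'relay to every node' option.

```python
def relay(notification, relay_list, query, yes_limit):
    """Relay `notification` down `relay_list`; return the nodes queried.

    notification: dict with "id", "path" (list) and "pseudopath" (list).
    relay_list:   nodes in priority order, fastest first.
    query:        callable(node, outgoing) -> True for 'Yes', False for 'No'.
    yes_limit:    stop once this many nodes have said N was new to them.
    """
    skip = set(notification["path"]) | set(notification["pseudopath"])
    tried = []
    yes_count = 0
    for node in relay_list:
        if yes_count >= yes_limit:
            break
        if node in skip:
            continue  # this node is already known to have N
        outgoing = {
            "id": notification["id"],
            # Assumption: the sender appends the receiver to the path.
            "path": notification["path"] + [node],
            # Nodes tried earlier (higher up the list) join the pseudopath.
            "pseudopath": sorted(set(notification["pseudopath"]) | set(tried)),
        }
        if query(node, outgoing):
            yes_count += 1
        tried.append(node)
    return tried
```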

Possible alteration: If a propagator regularly returns a 'No' response then perhaps it should move down the list.

Possible alteration: To avoid waiting on the response for N we should make the query/response asynchronous. This would necessitate all queries having a unique ID (node url plus timestamp plus serial number). The propagator would need a memory of queries sent out so it could move the non-repliers down the relay list (response time can be worked out using the query ID timestamp).
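
The memory of outstanding queries might look something like this Python sketch. PendingQueries and its methods are hypothetical, as are the '|' separator and the millisecond precision of the timestamp in the ID.

```python
import itertools


class PendingQueries:
    """Hypothetical memory of queries sent out but not yet answered."""

    def __init__(self, node_url):
        self.node_url = node_url
        self._serial = itertools.count(1)
        self._pending = {}  # query ID -> node the query was sent to

    def issue(self, target, now):
        """Record a query to `target` sent at time `now`; return its ID."""
        # ID = node url plus timestamp plus serial number.
        query_id = f"{self.node_url}|{now:.3f}|{next(self._serial)}"
        self._pending[query_id] = target
        return query_id

    @staticmethod
    def sent_at(query_id):
        """Recover the send time from the timestamp part of the ID."""
        return float(query_id.rsplit("|", 2)[1])

    def resolve(self, query_id, now):
        """A response arrived: return (target, response time in seconds)."""
        target = self._pending.pop(query_id)
        return target, now - self.sent_at(query_id)

    def non_repliers(self, now, timeout):
        """Forget queries older than `timeout`; return the nodes that never replied."""
        late = [qid for qid in self._pending if now - self.sent_at(qid) > timeout]
        return [self._pending.pop(qid) for qid in late]
```

The nodes returned by non_repliers are the ones that would be moved down the relay list.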

Data requests:

A data request is not virally propagated. The requester sends a request back to the data source along the path contained in N. The source sends the data back along this same path. A request consists of the path and the ID.

If a propagator receives a request to pass on towards the source: If it has the requested data with the same or newer timestamp, it does not send the request back to the source but sends the data back along the path to the receiver.

If a propagator receives data to pass towards the requester: It must cache the data, unless it has a newer version in which case that should be passed towards the requester instead.
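
The two rules above, sketched in Python. PropagatorCache and next_hop are made-up names, and splitting the data ID into a URL and a comparable timestamp is an assumption based on the example ID given under Sources.

```python
def next_hop(path, me, towards_source):
    """Neighbour of `me` on the path (source first, requester last), or None."""
    i = path.index(me) + (-1 if towards_source else 1)
    return path[i] if 0 <= i < len(path) else None


class PropagatorCache:
    """Hypothetical per-propagator cache, keyed by URL."""

    def __init__(self):
        self._cache = {}  # url -> (timestamp, body)

    def on_request(self, url, timestamp):
        """Handle a request heading towards the source.

        Returns the cached (timestamp, body) if it is the same or newer,
        meaning: answer from here. Returns None if the request must be
        passed on towards the source.
        """
        cached = self._cache.get(url)
        if cached is not None and cached[0] >= timestamp:
            return cached
        return None

    def on_data(self, url, timestamp, body):
        """Handle data heading towards the requester.

        Caches the data unless a newer version is already held, and
        returns whichever version should be passed on.
        """
        cached = self._cache.get(url)
        if cached is not None and cached[0] > timestamp:
            return cached
        self._cache[url] = (timestamp, body)
        return (timestamp, body)
```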

Possible alteration: If a propagator A is regularly issuing data requests towards the source through propagator B, perhaps A should move up B's list.

Optimisation: If a propagator A has had two data requests for the same data issued through it, then it should only send one request and remember to send both pieces of data out when it gets them.
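
A small Python sketch of this optimisation. RequestCoalescer is a hypothetical name, and keeping the whole path for each waiting requester is an assumption about what the propagator needs to remember.

```python
class RequestCoalescer:
    """Hypothetical bookkeeping for merging duplicate data requests."""

    def __init__(self):
        self._waiting = {}  # data ID -> list of paths back to the requesters

    def on_request(self, data_id, path):
        """Note a request; return True only if it should be sent onwards.

        Only the first outstanding request for a given ID is forwarded.
        Later ones just wait for the same data.
        """
        first = data_id not in self._waiting
        self._waiting.setdefault(data_id, []).append(path)
        return first

    def on_data(self, data_id):
        """The data arrived: return every path it must be sent back along."""
        return self._waiting.pop(data_id, [])
```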


Listeners

Listeners sit on the node and decide whether data requests are issued.

For every notification that comes through, the local propagator lets the listener know. It can do this either by calling the listener directly or adding the notification to a list that the listener has access to. It is then up to the listener to decide when to issue data requests.
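
Both ways of letting the listener know could look roughly like this in Python. LocalPropagator, its listener argument and the pending queue are invented for the example: with a listener given, it is called directly, otherwise notifications pile up in a list the listener can read when it likes.

```python
from collections import deque


class LocalPropagator:
    """Hypothetical propagator-side hook for telling the listener about N."""

    def __init__(self, listener=None):
        self.listener = listener  # optional callable taking one notification
        self.pending = deque()    # notifications waiting for the listener

    def on_notification(self, notification):
        if self.listener is not None:
            # Option 1: call the listener directly.
            self.listener(notification)
        else:
            # Option 2: add to a list the listener has access to.
            self.pending.append(notification)
```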

Optimisation: Allow 'subscriptions' by the listener. The propagator then merges its notification response with its data request.

Further alteration: Allow 'subscriptions' (a subscription now being that data is allowed to move without a notification happening first) to move around the network.

There may be many applications that need the data that the listener is fetching. It is up to the local implementation of the listener to decide how to deal with this: Listener is simply the term for the part of a node that decides whether to issue data requests.


Issues

