Microcontent Granularity

2004/9/22

It is commonly assumed that a typical piece of microcontent is a few KB in size (4 KB being the most popular figure?). That seems about right to me, considering the size of a typical email.

However, the interesting problem is the granularity of microcontent. One can argue that everything should be split into sub-atomic particles such as the KnownSpace entity, which is described as an indivisible amount of information. Entities are combined to provide microcontent (for example, an email message combined from a body text entity, a subject entity, a date entity, etc.) and can be flexibly recombined to create new microcontent. This looks like a compelling approach from a purely theoretical point of view (the search for information building blocks) and is easy enough for machines, but it is not the way that humans think ...

It looks to me that email is not only the right amount of information but also the right way to represent it: the content (the email text) and a set of metadata attributes (such as subject and date) should be kept together. This just needs a tad of refining to make it extensible and easy to recombine and filter, hence the use of XML (and XML namespaces), which also leads to compatibility with new microcontent standards such as Atom (now the name Atom starts to make sense :)).
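
To make the "content plus metadata, kept together" idea a bit more concrete, here is a minimal Python sketch. It is only an illustration under assumptions: the element names are borrowed from Atom, the result is Atom-like rather than a valid Atom entry, and the extension namespace URI, `make_entry` and `fields_in` are all made up for this example.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"   # Atom 1.0 namespace
EXT_NS = "http://example.org/ns/mail"     # hypothetical extension namespace


def make_entry(subject, date, text, extra=None):
    """Build one microcontent item: the content and its metadata in one element."""
    entry = ET.Element(f"{{{ATOM_NS}}}entry")
    ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = subject
    ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = date
    ET.SubElement(entry, f"{{{ATOM_NS}}}content").text = text
    # Extra attributes live in their own namespace, so they extend the
    # entry without colliding with the core fields.
    for name, value in (extra or {}).items():
        ET.SubElement(entry, f"{{{EXT_NS}}}{name}").text = value
    return entry


def fields_in(entry, namespace):
    """Filter an entry down to the fields that belong to one namespace."""
    prefix = f"{{{namespace}}}"
    return {
        child.tag[len(prefix):]: child.text
        for child in entry
        if child.tag.startswith(prefix)
    }


entry = make_entry(
    "Lunch?", "2004-09-22T12:00:00Z", "Noon at the usual place.",
    extra={"priority": "low"},
)
core = fields_in(entry, ATOM_NS)      # subject, date and text
extras = fields_in(entry, EXT_NS)     # only the extension metadata
```

The namespace is what makes filtering cheap here: a tool that knows nothing about the extension can simply ignore everything outside the namespaces it understands.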

Still, the computer scientist inside me wants indivisible information particles (KS entities) ... and I think not everything is lost with XML. XML is very flexible, so even if metadata is included directly in the microcontent, one can still use the current super-glue of URLs to create links between microcontent entities, essentially re-creating the power of KS while making it more accessible to the user?
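
A rough sketch of what URL "super-glue" between microcontent items could look like, in Python. Everything here is assumed for illustration: the `item` and `link` element names, the example.org URLs, and the in-memory dictionary standing in for the space. It is not how KnownSpace itself works.

```python
import xml.etree.ElementTree as ET


def make_item(kind, text, links=()):
    """Create a microcontent item that may point at other items by URL."""
    item = ET.Element("item", {"kind": kind})
    item.text = text
    for href in links:
        ET.SubElement(item, "link", {"href": href})
    return item


# Hypothetical in-memory "space": URL -> microcontent item.
space = {
    "http://example.org/mc/1": make_item(
        "email", "Draft attached.", links=["http://example.org/mc/2"]
    ),
    "http://example.org/mc/2": make_item("note", "Granularity draft"),
}


def follow_links(url, space):
    """Return the items that the item at `url` links to, skipping unknown URLs."""
    item = space[url]
    hrefs = [link.get("href") for link in item.findall("link")]
    return [space[href] for href in hrefs if href in space]


related = follow_links("http://example.org/mc/1", space)
```

Each item stays a self-contained, human-sized chunk, while the links allow the same kind of recombination that separate entities would give.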

Such a microcontent infrastructure will mesh naturally and completely with the current Web (HTML/HTTP) and with future Web Services (XML), and will be described and accessed through the most important naming scheme: URLs.

As a bonus, writing KS-style simpletons that work on email messages may be as simple as converting a MIME message into XML and storing the resulting microcontent in the space (and using I2S indexing, but that is a topic for a later discussion ...).
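
The MIME-to-XML step might look roughly like the following Python sketch, using only the standard library. The namespace URI, the element names and the choice of headers are my assumptions for illustration; storing the result in the space and indexing it are left out.

```python
import email
import email.policy
import xml.etree.ElementTree as ET

MC_NS = "http://example.org/microcontent"   # hypothetical namespace URI


def mime_to_xml(raw):
    """Convert a raw MIME message (str) into one XML microcontent element."""
    msg = email.message_from_string(raw, policy=email.policy.default)
    entry = ET.Element(f"{{{MC_NS}}}entry")
    # Metadata: copy a few headers, each into its own child element.
    for header in ("Subject", "From", "Date"):
        value = msg.get(header)
        if value is not None:
            child = ET.SubElement(entry, f"{{{MC_NS}}}{header.lower()}")
            child.text = str(value)
    # Content: the plain-text body, if the message has one.
    body = msg.get_body(preferencelist=("plain",))
    if body is not None:
        content = ET.SubElement(entry, f"{{{MC_NS}}}content")
        content.text = body.get_content().strip()
    return entry


raw = (
    "Subject: Hello\n"
    "From: a@example.org\n"
    "Date: Wed, 22 Sep 2004 10:00:00 +0000\n"
    "\n"
    "Body text\n"
)
entry = mime_to_xml(raw)
xml_text = ET.tostring(entry, encoding="unicode")
```

Attachments and HTML-only messages are ignored in this sketch; a real simpleton would have to decide what to do with them.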