Friday, May 27, 2005

App Server Platform

I have an idea for the underpinnings of a Java app-server platform. I'm referring to the underpinnings & process model, not the higher-level stuff. It involves taking several existing projects and bringing them together:

* IBM's WBI ("Web Intermediaries") development kit. This is mainly centered around HTTP but contains a nice process model for handling requests & responses flexibly.

* A SEDA framework like MINA. Here, each stage essentially gets its own thread (or thread pool), with queues connecting the stages. This can lead to very high-performance systems.

(how would MINA + WBI integrate?)

* Spring for instantiating the whole kit & caboodle.

On top of this, you could build a servlet engine, mail server, or whatever. Writing one from scratch would be a lot of work that I know I wouldn't want to do, so I'd instead do a major refactoring of a good open-source one like Tomcat.
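To make the SEDA part concrete, here is a minimal sketch of the "thread per stage with buffers in between" idea. It is written in Python for brevity rather than Java, and it is not MINA's or WBI's API: the names Stage and run_pipeline are made up for illustration, and a real framework would add thread pools, back-pressure policies, and error handling.

```python
import queue
import threading

_STOP = object()  # sentinel that tells a stage to shut down


class Stage:
    """One SEDA-style stage: an inbound queue drained by its own thread.

    Hypothetical class for illustration only.
    """

    def __init__(self, name, handler, downstream=None, maxsize=16):
        self.name = name
        self.handler = handler        # function applied to each event
        self.downstream = downstream  # next Stage, or None for the last one
        self.inbox = queue.Queue(maxsize=maxsize)  # bounded buffer between stages
        self.results = []             # only filled by the final stage
        self.thread = threading.Thread(target=self._run, name=name)

    def start(self):
        self.thread.start()

    def _run(self):
        while True:
            event = self.inbox.get()
            if event is _STOP:
                # Pass the shutdown signal along so later stages drain and exit.
                if self.downstream is not None:
                    self.downstream.inbox.put(_STOP)
                return
            out = self.handler(event)
            if self.downstream is not None:
                self.downstream.inbox.put(out)
            else:
                self.results.append(out)


def run_pipeline(handlers, events):
    """Wire handlers into a chain of stages, feed events in, return the outputs."""
    stages = []
    downstream = None
    # Build back to front so each stage knows its successor.
    for i, handler in reversed(list(enumerate(handlers))):
        stage = Stage("stage-%d" % i, handler, downstream)
        stages.insert(0, stage)
        downstream = stage
    for stage in stages:
        stage.start()
    for event in events:
        stages[0].inbox.put(event)
    stages[0].inbox.put(_STOP)
    for stage in stages:
        stage.thread.join()
    return stages[-1].results
```

For example, run_pipeline([str.strip, str.upper], [" get / ", " post /x "]) pushes each raw request line through a "trim" stage and then an "upper-case" stage, each running on its own thread. A servlet engine or mail server built this way would swap in stages such as parse, route, and respond.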

Thursday, May 12, 2005

Programming languages with native XML inline syntax

E4X
http://www.ecma-international.org/publications/standards/Ecma-357.htm
This is very cool. The benefits go beyond simpler syntax. Imagine being able to specify that the XML block must conform to a certain schema. Smart IDEs like IntelliJ could go wild with this if they knew that. Tab completion and all that. I think it could lead to XML syntax-izing other languages like, say, SQL or even an LDAP query. Wow! WOW! I dare say it would be a new plateau in programming.

Tuesday, May 03, 2005

Common Layered Specifications

Note: I'm putting a CompSci spin on this but this could apply to anything, even laws and such.

Problem Statement:
There are many specs out there. We can't fit them all in our heads. Though some specs refer to each other, more cross-referencing would be better. Specs don't seem to share much of a common format either. Many specs are rooted in, or have a foundation in, another spec. So there is commonality.

Idea:
Using 3-6 already known specs that share a connection with each other, develop new specifications by re-using definitions where possible, and don't re-specify the same thing. Much of this could be copy-pasted from the existing specs to save work. Develop a diagram to depict these relationships.
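As a rough sketch of what the relationship diagram could look like as data, here is a tiny catalogue in Python. The spec names echo real ones, but the "builds_on" edges and the lists of defined terms are illustrative assumptions, not a faithful reading of those documents. The helper names reading_order and defining_spec are made up too.

```python
from graphlib import TopologicalSorter

# Hypothetical catalogue: each spec lists the specs it builds on
# and the terms it defines itself (everything else is inherited).
SPECS = {
    "ABNF": {"builds_on": [], "defines": ["rule", "terminal"]},
    "URI": {"builds_on": ["ABNF"], "defines": ["scheme", "authority", "path"]},
    "MIME": {"builds_on": ["ABNF"], "defines": ["media-type", "charset"]},
    "HTTP": {"builds_on": ["URI", "MIME"], "defines": ["request", "response"]},
}


def reading_order(specs):
    """Order the specs so that every foundation comes before what builds on it."""
    graph = {name: set(info["builds_on"]) for name, info in specs.items()}
    return list(TopologicalSorter(graph).static_order())


def defining_spec(specs, spec, term):
    """Return the spec (this one or one of its foundations) that defines term."""
    if term in specs[spec]["defines"]:
        return spec
    for parent in specs[spec]["builds_on"]:
        found = defining_spec(specs, parent, term)
        if found is not None:
            return found
    return None
```

With this, defining_spec(SPECS, "HTTP", "scheme") points back at "URI", which is the "don't re-specify the same thing" rule in action: the higher-level spec refers to the definition instead of repeating it.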

Thursday, December 02, 2004

XML, ASN.1, and information encodings

Observation: There are quite a few different syntactical formats of files & protocols of primarily text & numbers. For example: XML, ASN.1, and then all name-value carriage return protocols commonly used on the internet like HTTP & SMTP, and then there's Windows .ini files, Java .properties files, and others too.

The problem: Mainly, it's a problem because it is more for developers to learn. Each format has subtle nuances, like character encodings and escape sequences, that differentiate it from the others. Also, there are usually multiple programming API libraries accompanying each of them, each different, with its own quirks and/or bugs (which must also be learned). And there are pros/cons to each of the formats in its own right with regard to readability, verbosity, structurability (i.e. attributes? namespaces? entities?), and availability of a schema language.

My solution: First let's define three layers to this... the binary encoding, then what I call the info-set (i.e. the abstract model which is often alluded to by the API), and then the visual representation. Next, write out some basic definitions of all these formats in terms of these three layers. What we need, and this is important, is a common info-set. The XML info-set (essentially what the DOM models) is featureful and modern and should be the common one. Now re-express (or "map") the other models into the XML info-set. Then define and implement binary marshallers and unmarshallers between the various binary layers and the common info-set model. These should also try to record insignificant white space and comments in the model, so that they are preserved faithfully in case we need to translate backwards. A native implementation of the DOM with a target format other than XML is another way to get this done. As for the view layer... it turns out that many of the formats have the same binary and view representation, so we don't have to tackle that additional problem.
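Here is a minimal sketch of the mapping step in Python, using xml.etree.ElementTree as a stand-in for the common info-set. The element vocabulary (entry, section, the key and name attributes) and the function names are my own assumptions for illustration. The .properties and header parsers are deliberately simplified and ignore escapes, line continuations, and folded headers.

```python
import configparser
import xml.etree.ElementTree as ET


def properties_to_xml(text):
    """Map simplified Java .properties text ('key=value', '#' comments) to a tree."""
    root = ET.Element("properties")
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("#"):
            # Keep comments in the model so a translation back can reproduce them.
            root.append(ET.Comment(stripped[1:].strip()))
            continue
        key, _, value = stripped.partition("=")
        entry = ET.SubElement(root, "entry", {"key": key.strip()})
        entry.text = value.strip()
    return root


def ini_to_xml(text):
    """Map Windows-style .ini text to the same vocabulary, one element per section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case instead of lower-casing it
    parser.read_string(text)
    root = ET.Element("ini")
    for name in parser.sections():
        section = ET.SubElement(root, "section", {"name": name})
        for key, value in parser.items(name):
            entry = ET.SubElement(section, "entry", {"key": key})
            entry.text = value
    return root


def headers_to_xml(text):
    """Map an HTTP/SMTP-style 'Name: value' header block to the same vocabulary."""
    root = ET.Element("headers")
    for line in text.splitlines():
        if not line.strip():
            break  # a blank line ends the header block
        name, _, value = line.partition(":")
        entry = ET.SubElement(root, "entry", {"key": name.strip()})
        entry.text = value.strip()
    return root


def lookup(root, key):
    """Find a value by key, regardless of which source format produced the tree."""
    for entry in root.iter("entry"):
        if entry.get("key") == key:
            return entry.text
    return None
```

The point is the last function: once everything lands in one model, a single lookup (or XPath query, or schema check) works across all three formats. For instance, lookup(headers_to_xml("Host: example.org\r\n\r\n"), "Host") and lookup(properties_to_xml("port=5432"), "port") go through the same code path.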

DELETE:
I think what perpetuates this problem is that the textual representation for all of them is identical to their binary encoding. "Hogwash" some of you might say, but seriously, I think this is it.