Saturday, January 19, 2008

Web Services: Contract? Code? Who's On First?

Just about all of the writing on Web Services recommends contract-first development. Even packages that have extensive support for code-first, and weaker support for contract-first, feel compelled to salute this flag.

The advice sounds good. A strong contract avoids ambiguity and maximizes inter-operation. What could be wrong with that?

There is one tiny detail missing, however: the language in which you must specify this contract. Like 18th-century English legal contracts written in 17th-century French and medieval Latin, web service contracts are written in two languages that few really understand and that offer a wide variety of pitfalls to the unwary. These languages are XML Schema and WSDL.

This doesn't necessarily imply that contract-first development is bankrupt and code-first the only true way. Code-first has pitfalls all of its own. It does imply that you need to have a strong understanding of the interactions between the two models if you want to create and deploy services with a minimum of fuss and bother.

Before I dive in, a word of positioning. There are some cases in which the goal of a web service is to transfer a particular, complex XML document from one place to another. I've never seen the point of this, myself. This posting is addressed to those writing all the rest of the web services in the world: the ones whose goal is to move some specified set of information from one place to another, information that exists as XML only incidentally, as part of the process of getting it from here to there.

On with the show. What's the fuss with XML Schema? XML Schema is a specification language for arbitrary XML documents with arbitrarily complex content models. XML derives, as all of you know, from SGML, which was intended as a markup language. Thus, XML Schema must cope with mixed content models.

I don't think that many people contemplating contract-first development need to be told to avoid mixed content model schemas.

XML Schema has a type model. If you are considering contract-first development, you have to understand it. I can promise you that this will not be an easy task. The XML Schema type model is fundamentally different from those of C++, Java, C#, Python, Perl, Common Lisp, and every other language with inheritance that I've ever encountered.

Stop and savor the irony. The idea of contract-first development is to use a specification that is agnostic as to programming language. It could be seen as an achievement, of a sort, to come up with a specification that is equally incompatible with all known programming languages.

This isn't the fault of the XML Schema designers. They didn't set out to build a data model for use behind many programming languages. They set out to build a specification language for XML documents.

Over and above the type inheritance model, XML Schema includes several elements that map poorly to programming languages, such as xs:any and xs:choice.

My purpose here is not to write a jeremiad against XML Schema. Rather, it is to suggest a practical approach to constructing web service contracts.

The best way to understand the implications of XML schema constructs is to convert them to code and see what they look like. In the Java universe, JAXB is the predominant mapping, and JAXB comes with xjc.

If you run some schemas through xjc and examine the results, you will begin to see some patterns. First and foremost, you'll see that the resulting code is infested with snails. That is to say, it has many, many @ annotations.

Many of those annotations are redundant, especially with more modern toolkits such as CXF. Some of them are there so that you can edit the code fairly aggressively without changing the contract.

Next, you can try the opposite experiment. Write straightforward, simple, interfaces and bean types, and then run schemagen (for the beans) or a full java2ws tool (to include the interfaces and operations). Read the resulting XML schema.

Notice that straightforward code constructs lead to relatively simple, stereotypical XML Schema. Using those same constructs in a schema leads to simple, readable code. Let me give you an example of the opposite.

You might, some day, feel inclined to put the following in your schema:

<xs:any namespace="##any"/>

This allows any element to occur. I've seen code that generates schemas like this 'to allow for expansion.' Now, what happens if you map that to code with JAXB? What happens is this: you get an @XmlAnyElement annotation. There's just one problem: @XmlAnyElement doesn't correspond to '##any'. It corresponds to '##other'. In other words, the tools silently change your contract on you.
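To see why the difference between the two wildcards matters, here is a minimal sketch that uses only the validator built into the JDK. It does not touch JAXB at all. The class name, the 'urn:example' and 'urn:elsewhere' namespaces, and the 'box' and 'item' elements are all made up for illustration. It builds the same tiny schema twice, once with each wildcard, and validates a document against it.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper: compares what "##any" and "##other" accept.
final class AnyVersusOther {
    private AnyVersusOther() {}

    // A one-element schema whose content is a single wildcard.
    static String schema(String wildcardNamespace) {
        return "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
             + " targetNamespace='urn:example' elementFormDefault='qualified'>"
             + "<xs:element name='box'><xs:complexType><xs:sequence>"
             + "<xs:any namespace='" + wildcardNamespace + "' processContents='skip'/>"
             + "</xs:sequence></xs:complexType></xs:element>"
             + "</xs:schema>";
    }

    // Returns true if the document validates; false on any schema,
    // parse, or validation failure.
    static boolean isValid(String wildcardNamespace, String document) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Validator validator = factory
                .newSchema(new StreamSource(new StringReader(schema(wildcardNamespace))))
                .newValidator();
            validator.validate(new StreamSource(new StringReader(document)));
            return true;
        } catch (SAXException | IOException e) {
            return false;
        }
    }
}
```

A child element in the schema's own namespace passes under '##any' and fails under '##other', while a child from a foreign namespace passes under both. So a contract that quietly moves from one wildcard to the other rejects documents it used to accept.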

How does this happen? Well, I'm not a student of the history of JAXB, but my sense is that its designers didn't want to tie themselves inextricably to XML Schema. They wanted a set of snails that could, perhaps, be mapped to some other schema specification, like RELAX NG.

For example, consider arrays. In XML Schema, the closest thing to an array is the minOccurs and maxOccurs attributes of elements. Optional elements have 0, 1. Required elements have 1, 1. Arrays are generally 0, N, where N can be 'unbounded.' (Don't ask me what you get in JAXB if you specify, say, 12, 15.)

In JAXB, you have the 'required' attribute of @XmlElement. An element with required=true gets 1,1. An element with required=false gets 0,1. And a Java array gets 0,unbounded.
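The rule in the last two paragraphs is small enough to write down as a function. This is only a restatement of the mapping as I've described it, not JAXB's own code. The class and method names are invented, and the byte[] exception is included because JAXB binds byte[] to xs:base64Binary rather than to a repeated element.

```java
// Hypothetical restatement of the occurrence rule described above.
final class OccursRule {
    private OccursRule() {}

    // Returns the minOccurs/maxOccurs pair for a bean property of the given type.
    static String occurs(Class<?> propertyType, boolean required) {
        // byte[] is a single base64 value, not a repeated element.
        boolean repeated = propertyType.isArray() && propertyType != byte[].class;
        if (repeated) {
            // The rule above gives arrays one answer, so 'required' is ignored here;
            // this is a simplifying assumption of the sketch.
            return "minOccurs=\"0\" maxOccurs=\"unbounded\"";
        }
        return required
            ? "minOccurs=\"1\" maxOccurs=\"1\""
            : "minOccurs=\"0\" maxOccurs=\"1\"";
    }
}
```

Notice what is missing: there is no input that produces 12, 15. The code side simply has fewer knobs than the schema side.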

There are similar dances with the XML Schema 'form' attribute.

What's a person to do about all of this? Well, I offer a possible prescription.

Unless you are already a scholar of XML Schema, ignore the prescriptions of contract-first.

The first step is to design a code-first contract as a contract. Don't try to 'remote' whatever you have lying around by slapping a few snails on it. You might protest, 'Now I have extra classes all over the place and I have to copy all my data from my real objects to these special contract objects.' Well, my friend, you'd be in the same position if you used contract-first, only with a lot more snails and much more confusing code.
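Here is a rough sketch of what those 'extra' classes look like in practice. Everything in it (Account, AccountSummary, the mapper) is a made-up example, and the snails are left off so that it compiles with nothing but the JDK. The point is the shape: a domain object that can change freely, a deliberately boring contract bean, and one small place where the copying happens.

```java
import java.math.BigDecimal;

// Internal domain object: free to change shape as the application evolves.
final class Account {
    private final long id;
    private final String owner;
    private final BigDecimal balance;

    Account(long id, String owner, BigDecimal balance) {
        this.id = id;
        this.owner = owner;
        this.balance = balance;
    }

    long getId() { return id; }
    String getOwner() { return owner; }
    BigDecimal getBalance() { return balance; }
}

// Contract object: a plain bean with simple property types only.
final class AccountSummary {
    private String accountId;
    private String ownerName;
    private String balance;

    public String getAccountId() { return accountId; }
    public void setAccountId(String accountId) { this.accountId = accountId; }
    public String getOwnerName() { return ownerName; }
    public void setOwnerName(String ownerName) { this.ownerName = ownerName; }
    public String getBalance() { return balance; }
    public void setBalance(String balance) { this.balance = balance; }
}

// The one place that knows about both worlds.
final class AccountContractMapper {
    private AccountContractMapper() {}

    static AccountSummary toContract(Account account) {
        AccountSummary summary = new AccountSummary();
        summary.setAccountId(Long.toString(account.getId()));
        summary.setOwnerName(account.getOwner());
        summary.setBalance(account.getBalance().toPlainString());
        return summary;
    }
}
```

The copying is tedious, but it is the only code that has to change when Account grows a new field that the outside world has no business seeing.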

As a particular point here, consider using Document/Literal/Bare. That's right, bare. Not wrapped. One of the causes of confusion in code-first development is conflicts between the front-end (e.g. JAX-WS) and the data binding (e.g. JAXB). In a bare service, the front end is narrowly focussed on the interface, and the data is under the control of the binding. You don't have to worry about who wins a war between @XmlRootElement and @WebParam.
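In code, the bare style comes down to one bean in and one bean out per operation. The sketch below shows only that shape, with hypothetical names (QuoteService, QuoteRequest, QuoteResponse) and a canned price. The JAX-WS annotations are mentioned in a comment rather than applied, so that the example has no dependencies.

```java
// Request bean: everything the caller sends travels inside this one object.
final class QuoteRequest {
    private String symbol;

    public String getSymbol() { return symbol; }
    public void setSymbol(String symbol) { this.symbol = symbol; }
}

// Response bean: everything the service returns travels inside this one object.
final class QuoteResponse {
    private String symbol;
    private double price;

    public String getSymbol() { return symbol; }
    public void setSymbol(String symbol) { this.symbol = symbol; }
    public double getPrice() { return price; }
    public void setPrice(double price) { this.price = price; }
}

// In a real JAX-WS service this interface would carry @WebService and
// @SOAPBinding(parameterStyle = SOAPBinding.ParameterStyle.BARE).
// They are omitted here to keep the sketch free of dependencies.
interface QuoteService {
    QuoteResponse getQuote(QuoteRequest request);
}

// Trivial implementation with a canned price, for illustration only.
final class FixedQuoteService implements QuoteService {
    @Override
    public QuoteResponse getQuote(QuoteRequest request) {
        QuoteResponse response = new QuoteResponse();
        response.setSymbol(request.getSymbol());
        response.setPrice(100.0);
        return response;
    }
}
```

With a single parameter and a single return value, the message shape is decided entirely by the two beans, and the front end has nothing left to argue about.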

The second step is to review the schema. Pull the WSDL with the appropriate tool, and study it. Make sure you understand it. If it has strange quirks, adjust your code until it is clean.

The third step is to freeze the code. A contract should sit still until you make an organized, intentional, decision to evolve it. This is another justification for those 'extra' objects. They allow the contract to stay put while the code beneath it evolves.
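One cheap way to notice when a frozen contract has been nudged is a check that fails whenever a contract bean changes shape. A fuller version would compare the generated WSDL against a checked-in copy. The sketch below is a dependency-free stand-in that fingerprints a bean's fields by reflection. ContractFingerprint and OrderLine are hypothetical names.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: reduces a contract bean to a stable, comparable string.
final class ContractFingerprint {
    private ContractFingerprint() {}

    static String of(Class<?> contractType) {
        List<String> entries = new ArrayList<>();
        for (Field field : contractType.getDeclaredFields()) {
            // Skip statics and compiler- or tool-generated fields.
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue;
            }
            entries.add(field.getName() + ":" + field.getType().getSimpleName());
        }
        // Reflection does not promise a field order, so sort for stability.
        Collections.sort(entries);
        return String.join(",", entries);
    }
}

// Example contract bean (accessors omitted for brevity).
final class OrderLine {
    private String sku;
    private int quantity;
}
```

A unit test can then compare ContractFingerprint.of(OrderLine.class) against a string recorded when the contract was frozen. When the test fails, someone has to decide, on purpose, whether the contract is evolving.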

If you adopt this discipline, you will find that it restricts you to a relatively simple set of constructs. Java Map objects are right out. Complex polymorphism will fall by the wayside.
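For example, a Map usually turns into an array of small name/value beans at the contract boundary, which is a construct that schema handles without drama. The following is a sketch under that assumption, with invented names (Attribute, AttributeLists) and plain String keys and values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Contract-side stand-in for one map entry.
final class Attribute {
    private String name;
    private String value;

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public String getValue() { return value; }
    public void setValue(String value) { this.value = value; }
}

// Converts between the internal Map and the contract-friendly array.
final class AttributeLists {
    private AttributeLists() {}

    static Attribute[] fromMap(Map<String, String> map) {
        Attribute[] attributes = new Attribute[map.size()];
        int index = 0;
        for (Map.Entry<String, String> entry : map.entrySet()) {
            Attribute attribute = new Attribute();
            attribute.setName(entry.getKey());
            attribute.setValue(entry.getValue());
            attributes[index++] = attribute;
        }
        return attributes;
    }

    static Map<String, String> toMap(Attribute[] attributes) {
        // LinkedHashMap keeps the order in which entries arrived.
        Map<String, String> map = new LinkedHashMap<>();
        for (Attribute attribute : attributes) {
            map.put(attribute.getName(), attribute.getValue());
        }
        return map;
    }
}
```

The array form says nothing about key uniqueness, so the receiving side has to decide what a duplicate name means. In this sketch the last one wins.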

This is the cost of interoperability. If you don't care about interoperability, then you don't need any of this. If Web Services are just an RPC mechanism, and you control both ends in real time, then you can write any code-first thing you like. Just don't come looking for too much help when the more obscure code-first constructs don't do precisely what you want.

So, let's review the good news. You don't have to become an XML Schema adept to build a web service that interoperates. You don't have to become a conchologist, either. You do have to be aware of the dual nature of what you do in code and schema.
