You have achieved Clojure enlightenment. Namaste.

For anybody who wants to learn Clojure or get a good first hands-on experience with it, I cannot recommend the Clojure Koans enough. Clojure Koans is a collection of simple fill-in-the-blanks exercises with an auto-runner that evaluates the exercises as you solve them. The exercises cover a broad range of concepts, from using the basic Clojure data types to defining functions to recursion to concurrency to Java interop, and much more.

You can either download the exercises as one big package, batteries included, or you can try it the Clojure way:

  1. Install Leiningen
  2. git clone https://github.com/functional-koans/clojure-koans.git
  3. cd clojure-koans
  4. lein koan run
  5. emacs src/koans/*.clj

Cannot get much simpler and more effective than that.

Posted in Clojure, Programming

On UPC, Google, and a dot

Recently, I started getting invoices from UPC Czech Republic in my Gmail account. UPC Digital + high-speed internet, 767 CZK per month. Not a bad deal except… Except that I have never had anything to do with UPC Czech Republic, I have lived in the Netherlands for the last seven years, and the invoices are addressed to a certain Vojtěch Toman with an address in Pilsen. I have been in Pilsen only two or three times in my life, and each time only passing through.

Since January, I have been getting not only invoices but also collection letters from UPC. Apparently this other Vojtěch Toman’s payment morale has been rather low lately. Or… perhaps he just hasn’t been getting any invoices?

So what happened? A closer inspection of the e-mail messages showed that they were addressed to vojtechtoman@gmail.com – but that is not my address, as mine contains a dot between the first name and the last name. So somehow Google must have screwed up; those messages surely were not meant for me.

That was more or less how I thought about it – at least initially – and I ignored the e-mails. But after I started getting the collection letters, it occurred to me that since I had been receiving the UPC e-mails, perhaps that poor man hadn’t. Maybe he wasn’t even aware that something had gone wrong with his payments.

So I took a second look. A simple Google search quickly pointed me to a Gmail help page titled Receiving someone else’s mail – so apparently I am not the only one with the same problem. What I found there, however, took me by complete surprise:

Sometimes you may receive a message sent to an address that looks like yours but has a different number or arrangement of periods. While we know it might be unnerving if you think someone else’s mail is being routed to your account, don’t worry: both of these addresses are yours.

Because Gmail doesn’t recognize dots as characters within usernames, you can add or remove the dots from a Gmail address without changing the actual destination address; they’ll all go to your inbox, and only yours. In short:

homerjsimpson@gmail.com = hom.er.j.sim.ps.on@gmail.com
homerjsimpson@gmail.com = HOMERJSIMPSON@gmail.com
homerjsimpson@gmail.com = Homer.J.Simpson@gmail.com

All these addresses belong to the same person. You can see this if you try to sign in with your username, but adding or removing a dot from it. You’ll still go to your account.

Maybe I was the only one who didn’t know about this already, but it was quite a shocker for me nonetheless. In retrospect, I can understand Google’s motivation to make their services more user-friendly, but at the same time… Yes, it saves you from the occasional typo in the recipient’s e-mail address, but doesn’t it introduce new problems as well? It works when you omit a dot or use extra dots, but what if you, for instance, misplace a character or forget one? How often can this, combined with Google’s leniency in checking usernames, lead to the wrong people receiving your e-mail?
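To make the rule from the help page concrete, here is a minimal Python sketch of it. This is only my reading of the quoted text (dots and letter case in the username are ignored), not Google's actual implementation; the function name canonical_gmail and the misspelled address homerjsimpsn are made up for illustration.

```python
def canonical_gmail(address: str) -> str:
    """Return a canonical form of an e-mail address.

    Assumption: for gmail.com, dots and letter case in the username are
    ignored, as the quoted help page describes. Other domains are left
    alone apart from lowercasing the domain name.
    """
    local, sep, domain = address.strip().rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an e-mail address: {address!r}")
    domain = domain.lower()
    if domain != "gmail.com":
        # Other providers may well treat dots as significant.
        return f"{local}@{domain}"
    return f"{local.replace('.', '').lower()}@{domain}"


# The three variants from the help page collapse to one address...
variants = [
    "hom.er.j.sim.ps.on@gmail.com",
    "HOMERJSIMPSON@gmail.com",
    "Homer.J.Simpson@gmail.com",
]
print({canonical_gmail(v) for v in variants})

# ...but a forgotten character is still a different mailbox.
print(canonical_gmail("homerjsimpsn@gmail.com") == canonical_gmail(variants[0]))
```

So the leniency covers dots and capitalization only; a dropped or misplaced character still sends the message to whoever owns the other username.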

I had these and similar thoughts when I looked at the other Vojtěch Toman’s e-mail address: vojtechtoman@gmail.com.

  1. Was his actual Google username something like vojtechtoman2, and did someone at UPC make a mistake and somehow lose the ‘2’?
  2. Did he fill in a made-up address in a required registration form field?
  3. What if his username really is vojtechtoman? (Potentially the worst scenario of all.)

Scenario 1 is always a possibility. Human error occurs.

Scenario 2 is also likely, especially given that, apart from the UPC messages, I haven’t received any e-mails addressed to vojtechtoman@gmail.com.

As for scenario 3, I don’t know if Google’s relaxed policy on usernames has always been the same, or if it is something that they introduced along the way. I assume the former, because I can’t imagine how they would enforce a new policy on what would likely be thousands and thousands of potentially conflicting usernames. So I assume that scenario 3 is out of the question, but that vague, uneasy feeling doesn’t go away…

I called UPC today and told them about the situation, and especially that I am concerned that this other Vojtěch Toman may not have received the invoices and collection letters from them and is therefore unaware that something has gone wrong. Luckily for me, it turned out that the other Vojtěch Toman had given them two e-mail addresses and that they send e-mails to both, so they removed the address vojtechtoman@gmail.com from the system.

Let’s see if I get any more invoices after today. I just hope this other Vojtěch Toman is not browsing through my mailbox on Saturday evenings.

Posted in Uncategorized

Do it JSelf

With Michael Kay’s recent announcement that he is trying to create a JavaScript version of his Saxon processor, it is quite likely that we will get XSLT 2.0 support in the web browser relatively soon. Of course, everybody would prefer that browser vendors implement XSLT 2.0 natively, but that is, sadly, not today’s reality. We all remember how long it took some browser vendors to implement XSLT 1.0, and that was at a time when there were still relatively many who believed that an XML-based web was the way to go – a position that has become increasingly difficult to defend in the current HTML5 frenzy.

Only time will tell what the future holds for XML in the browser. Right now, it seems that XML has become the unwanted child that nobody wants to play with, because it seems inept when it comes to creating flashy rotating logos.

Or maybe it is not that bad; we will have to wait and see. And in the meantime, the best we can do is to make sure that XML support in the browser evolves and catches up with today’s state of the technology to remain relevant. Browser vendors are clearly not (or so it seems) going to assist with this, so the solution is either to write browser extension plugins (which, in truth, never really worked in practice) or to use JavaScript to implement the missing functionality.

Luckily, and almost paradoxically, browser vendors have given us a helping hand when it comes to JavaScript: with the almost total war for the fastest web browser and the best level of HTML5 support, we are witnessing tremendous improvements in the JavaScript engines in virtually all relevant web browsers of today. Not only are they able to execute JavaScript much faster than before, but they are also able to handle much larger JavaScript libraries, and even complete applications.

For the first time – and rather remarkably – it is now possible to consider JavaScript to be robust enough for such tasks as, for example, implementing a conformant XSLT 2.0 processor or…

…or an XProc implementation. For the last couple of months, I have been working (when other priorities allow) on creating a self-contained JavaScript version of Calumet, a Java XProc processor that I have been developing at EMC. In August 2010, I presented a live demo at the Balisage XML conference in Montréal and expressed my hope that future versions of Calumet would be distributed as a dual Java/JavaScript package. If you are interested in the technical details, you can find the slides from the presentation, as well as the conference paper (which, coincidentally, seems to have been one of the impulses that convinced Michael Kay to attempt a similar thing with Saxon) here.

So, how has the JavaScript version of Calumet progressed in the last two or three months? I must say that we are still not there, but we are getting quite close. Take a look for yourself:

http://xmldemo.emc.com:8080/calumet

I still consider the JavaScript processor “alpha” quality, as there are still compatibility or performance issues with various web browsers (yes, I know that the p:validate-with-schematron example does not work in Safari/Chrome) and some features are not (or not fully) supported yet (for instance, the p:http-request implementation is really, really simple and limited). But perhaps the processor is already good enough to be released so that users can play with it, test it – and most importantly, figure out if they can do anything useful with it.

Christmas sounds like a good plan.

Posted in JavaScript, Programming, XML, XProc | 2 Comments

EMC XProc Engine announces itself

…and how else than in XProc:

<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
  <p:output port="result" sequence="true"/>
  <p:try name="announcement">
    <p:group>
      <p:output port="result"/>
      <p:identity>
        <p:input port="source">
          <p:inline>
            <p xmlns="http://www.w3.org/1999/xhtml">
              I am happy to announce the availability of version
              1.0.8 of EMC XProc Engine, EMC's XProc processor
              implementation. The processor is free for developer
              use and can be downloaded - together with other XML
              tools - from the EMC XML Developer Community website.
            </p>
          </p:inline>
        </p:input>
      </p:identity>
    </p:group>
    <p:catch>
      <p:output port="result"/>
      <p:identity>
        <p:input port="source">
          <p:inline>
            <p xmlns="http://www.w3.org/1999/xhtml">
              This is the first public release of the processor. Note
              that while version 1.0.8 supports most of the required
              features of XProc, the implementation is not yet complete.
            </p>
          </p:inline>
        </p:input>
      </p:identity>
    </p:catch>
  </p:try>
  <p:identity name="feedback">
    <p:input port="source">
      <p:inline>
        <p xmlns="http://www.w3.org/1999/xhtml">
          We are looking forward to your feedback - please use the EMC
          XML Developer Community forum (or the xproc-dev mailing list)
          for any comments, suggestions, feature requests, or bug reports.
        </p>
      </p:inline>
    </p:input>
  </p:identity>
  <p:http-request name="download">
    <p:input port="source">
      <p:inline>
        <c:request xmlns:c="http://www.w3.org/ns/xproc-step" method="GET"
            href="https://community.emc.com/community/edn/xmltech"/>
      </p:inline>
    </p:input>
  </p:http-request>
  <p:wrap-sequence name="features" wrapper="features">
    <p:input port="source" xmlns="http://www.w3.org/1999/xhtml">
      <p:inline>
        <p>XProc Engine implements most of the required
          features of the XProc specification. For an overview of the
          supported features, and the progress of the implementation,
          see the XProc Test Suite website.
        </p>
      </p:inline>
      <p:inline>
        <p>Plug-ins can extend the processor and customize
          the default behavior or provide new functionality, such as
          extension XProc steps.
        </p>
      </p:inline>
      <p:inline>
        <p>The XProc Engine distribution comes with a number
          of plug-ins that can be used with the processor. Software
          developers can use the XProc Engine API to create custom
          plug-ins.
        </p>
      </p:inline>
      <p:inline>
        <p>The Java application programming interface makes
          it easy to embed the XProc Engine in other Java applications.
        </p>
      </p:inline>
      <p:inline>
        <p>XProc Engine provides an interface for running
          XProc pipelines from the command-line.</p>
      </p:inline>
      <p:inline>
        <p>XProc Engine can be integrated with the EMC
          Documentum xDB XML database (via a plug-in). The plug-in
          provides a number of xDB-specific XProc steps, and allows
          developers to combine the benefits of a state-of-the-art
          native XML database and XProc.
        </p>
      </p:inline>
    </p:input>
  </p:wrap-sequence>
  <p:compare>
    <p:input port="source">
      <p:document
          href="http://tests.xproc.org/results/calumet/report.xml"/>
    </p:input>
    <p:input port="alternate">
      <p:document
          href="http://tests.xproc.org/results/calabash/report.xml"/>
    </p:input>
    <p:documentation>
      <p xmlns="http://www.w3.org/1999/xhtml">
        How does it compare to the other XProc processor? :)</p>
    </p:documentation>
  </p:compare>
  <p:identity>
    <p:input port="source">
      <p:pipe step="announcement" port="result"/>
      <p:pipe step="feedback" port="result"/>
      <p:pipe step="download" port="result"/>
      <p:pipe step="features" port="result"/>
    </p:input>
  </p:identity>
</p:declare-step>

Posted in Work, XML | 1 Comment

EMC XML Developer Community Launching

At last! I am happy to announce that EMC has launched the new XML Developer Community, a developer forum for sharing information about XML and EMC’s XML technology. I was never good at making flashy announcements, so I will just say: go to http://developer.emc.com/xmltech and see for yourself; there is some seriously interesting stuff there. (And expect more to come soon – after all, we are just taking off!)

Besides the content (articles, videos, blogs), the real highlight for me is that EMC is going to release a whole suite of its XML tools through the site, free for unlimited developer use. And the first of the downloads is nothing less than EMC Documentum xDB, our native XML database (formerly known as X-Hive/DB).

Personally, I have been waiting for this to happen for a long time. I know that I am biased, but I really think that xDB is a killer application, with so much value and potential. In the 10 years of its existence (yes: birthday party!), xDB has grown into a really robust and mature product with excellent support for standards, and it is no wonder that it has become so successful in the commercial sphere. Sadly, it has also successfully stayed out of reach of the wider community. But this is changing now, and I am expecting lots of positive reactions; I, for one, have always found xDB great to work with.

But the EMC XML Community site is not only about promoting EMC software. Its main focus is XML and XML technologies in general, and the goal is to provide a place for dialog between us – EMC’s XML developers – and the developer community, where useful information about XML (tutorials, best practices, support, etc.) can be shared and discussed.

Looking forward to your feedback, either on this blog, or better, on the EMC XML Community website!

Posted in Work, XML

Back from XML Prague (and bed, almost)

Finally recovered enough from XML Prague to write a short post. The conference was great (hey, the strudel was fantastic!), and I had a chance to chat with quite a few interesting people there (…some of whom probably infected me with a nasty bug that I have been struggling with until now).

The organizers attempted to make the event a bit special this year: they were broadcasting a live feed from the conference, to make it accessible to people who could not participate in person. Quite a unique touch, and surely something that will attract a wider audience in the coming years, should the organizers keep doing the same thing (and they should!).

Recordings of the presentations are now starting to show up on the conference website. Only slides with audio at the moment, but the organizers are planning to upload full video recordings of the speakers once these are processed and polished (…and let’s hope not – using Robin Berjon’s words – censored :-).

My presentation about XProc was on the second day of the conference (right after another XProc talk by Norm Walsh who showed some ingenious real-world pipelines – see his blog if you are interested in knowing more), and it was kind of motivating for me to see the positive reaction to EMC’s plans to release its XML tools to the wider public.

The most frequent question after my presentation seemed to be: When? – and at that moment, the only answer I could give was: Soon, because, in all honesty, I didn’t know. Today, I think, I can reveal more: Very soon.

Posted in Work, XML

Presenting at XML Prague 2009

On March 22nd, I am presenting at XML Prague. Topic: Optimizing XML Content Delivery with XProc. There will be some interesting names there, like Michael Kay, Murata Makoto, Norman Walsh, or Jeni Tennison to name a few. Should be fun.

Posted in Work, XML

Introduction to XProc

If you are interested in learning or understanding XProc, don’t miss the excellent Introduction to XProc written by Dave Pawson. It is still a work in progress, but it already contains a lot of useful information.
Another nice place to get started with XProc is James Sulak’s blog.

Posted in XML

First XProc test report for Calumet published

We have gone public – at last! Yesterday, I submitted the first test report for Calumet, which is the code name for our XProc implementation.
As of today, we pass 95% of the required tests and about half of the optional tests, which makes for an overall success rate of about 90%. Compared to Calabash, which scores 97% at the moment, we still lag behind a bit, but we have made a good start, I think, all the more so because Calumet is getting more mature and compliant every day (literally). Stay tuned.

Posted in Work, XML

XProc – The next rockstar?

While working on an XProc implementation, and especially after using XProc in real life and seeing its true power, I am more and more confident that XProc will soon become one of the most popular (and useful) XML technologies out there. For me, the reasons are simple:

  • XProc integrates a whole plethora of XML technologies, ranging from XPath, XSL and XSLT, to XInclude processing, schema validation and XQuery support. The good thing about XProc is that you don’t have to learn the details of the different programming APIs and models. XProc shields you from that. To me, this is the most significant benefit of using XProc – and in fact, it is the very reason why XProc exists after all. Make manipulating XML content simple, transparent, and easy to understand.
  • XProc can make application development simpler and faster. No more tedious XML programming (I guess we all know that “how many times did I write this code before?” feeling…), no more low-level dances around constructing/navigating/updating the DOM tree. Not any more: Here, this is my XProc pipeline, run it and give me the results I want.
  • XProc can make applications more reliable and less buggy (once the XProc processors get good enough, of course :). This is related to the previous point. You see, manual XML programming is potentially dangerous, especially in the hands of inexperienced developers who are not aware of all the nitty-gritty details. I have seen too many examples of badly written code for performing an XSLT transformation, or for just parsing an XML document… And I am really glad that finally there is a tool that, to put it bluntly, can shield us from crappy programming. And once there are visual tools for building XProc pipelines – and I am sure there will be some soon – we will be even safer.
  • XProc is simple. Querying content using XQuery, transforming the results to XSL-FO, and generating a final PDF document has never been easier:
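    <p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
      <p:input port="source"/>
      <p:input port="parameters" kind="parameter"/>
    
      <!-- Run the query against the source document(s) -->
      <p:xquery>
        <p:input port="query">
          <p:data href="stats.xq" content-type="application/xquery"/>
        </p:input>
      </p:xquery>
    
      <!-- Transform the query result to XSL-FO -->
      <p:xslt>
        <p:input port="stylesheet">
          <p:document href="stats2fo.xsl"/>
        </p:input>
      </p:xslt>
    
      <!-- Render the XSL-FO document to PDF -->
      <p:xsl-formatter href="out/stats.pdf"/>
    </p:declare-step>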
    
  • XProc is extensible. XProc comes with a library of standard steps, but one of the core features of the language is that it allows you to declare custom steps that provide more complex (or not supported by default) functionality. You can organize your custom steps in libraries, which you can then import in your XProc pipelines in a way similar to importing stylesheets in XSLT.
    <p:pipeline xmlns:p="http://www.w3.org/ns/xproc" version="1.0">
      <p:import href="sq-library.xpl"/>
      ...
      <sq:get-stock-quote ticker="GOOG"
          xmlns:sq="http://www.foo.com/sq/ns/"/>
      ...
    </p:pipeline>
    
  • …and finally, XProc is fun!
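For readers who have not seen a pipeline language before, the core idea (each step consumes the document produced by the previous one) can be sketched in a few lines of Python. This is only an analogy built on the standard xml.etree.ElementTree module, not how an XProc processor works; the step functions add_total and to_report and the sample document are made up for illustration.

```python
import xml.etree.ElementTree as ET
from functools import reduce


def add_total(doc):
    """Step 1: compute a total and store it on the root (stands in for a query)."""
    total = sum(int(item.get("count", "0")) for item in doc.iter("item"))
    doc.set("total", str(total))
    return doc


def to_report(doc):
    """Step 2: build a document in another vocabulary (stands in for a transformation)."""
    report = ET.Element("report")
    summary = ET.SubElement(report, "summary")
    summary.text = f"{doc.get('total')} items in total"
    return report


def run_pipeline(source, steps):
    """Feed the output of each step into the next one, in order."""
    return reduce(lambda doc, step: step(doc), steps, source)


source = ET.fromstring('<stats><item count="2"/><item count="3"/></stats>')
result = run_pipeline(source, [add_total, to_report])
print(ET.tostring(result, encoding="unicode"))
# prints <report><summary>5 items in total</summary></report>
```

Even in this toy version, every step has to know the tree API. In XProc the steps are declared and the processor does the plumbing.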
Posted in Work, XML | 2 Comments