Asymmetrical View

Array Type Hints in Clojure

How to encode type hints for Java array types came up recently in a conversation with a friend. I found it difficult to Google for, so I decided to write it up here. The difficulty is that Java doesn’t have what you’d normally think of as a class name for its typed arrays. But first, a brief explanation of type hints…

Clojure Type Hints

Clojure uses type declarations, or hints, in two ways. The first is as declarations to the Clojure compiler that help it work out which Java method a call refers to. Type-hinted code will be faster in many cases (when the type was otherwise ambiguous to the compiler) because the Clojure run-time doesn’t have to spend time using Java’s Reflection API to figure out which underlying method is appropriate for the type of your arguments. The second place types are used in Clojure is in multimethods, where the types of the arguments are typically used to determine how a call is dispatched.

You can see where Clojure is doing reflective lookups by setting *warn-on-reflection* to true. Having this set to true while you run your unit tests, or near the end of a development push, is a good habit to get into, since the performance cost of those reflective look-ups is worth eliminating from your code once things are stable.

Enabling those warnings produces output that looks like this:


user> (set! *warn-on-reflection* true)
true
user> (def x (StringBuilder.))
#'user/x
user> (.append x "foo")
Reflection warning, NO_SOURCE_PATH:1 - call to append can't be resolved.
#<StringBuilder foo>
user> (.append x 1)
Reflection warning, NO_SOURCE_PATH:1 - call to append can't be resolved.
#<StringBuilder foo1>
user> 
user> (defn second-ch [s]
        (.charAt s 1))
Reflection warning, NO_SOURCE_PATH:2 - call to charAt can't be resolved.
#'user/second-ch
user> (second-ch "twenty")
\w
user> 

Note that the warning happened at the time the function was compiled, not when it was called. Clojure is warning us that it couldn’t generate code to call the method directly, and instead had to generate code that uses the reflection API to first find the proper method and then call it. We can avoid the expensive reflective look-up and quiet the warning by introducing a type hint for the parameter, telling Clojure that it is a String:

user> (defn second-ch [#^String s]
        (.charAt s 1))
#'user/second-ch
user> (second-ch "thirty")
\h
user> 

This time there is no warning and the generated code will call charAt directly.
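
The #^ reader form used above was the hint syntax when this was written. Clojure 1.2 and later also accept a plain ^, which is now the usual spelling. As a minimal sketch (the names sb and add-text are made up for illustration), the same idea also covers the StringBuilder warnings from the earlier transcript, by hinting either the var or the individual call site:

```clojure
;; Turn on reflection warnings so any unresolved call is reported at compile time.
(set! *warn-on-reflection* true)

;; Same function as above, written with the newer ^ hint syntax.
(defn second-ch [^String s]
  (.charAt s 1))

;; Hinting the var: calls made through sb can be resolved against StringBuilder.
(def ^StringBuilder sb (StringBuilder.))
(.append sb "foo")

;; Hinting at the call site instead of at the definition.
(defn add-text [builder text]
  (.append ^StringBuilder builder ^String text))

(second-ch "thirty")        ; => \h
(str (add-text sb "bar"))   ; => "foobar"
```

Which of the two styles to use is mostly a matter of taste: hinting the parameter or var keeps the call sites clean, while a call-site hint is handy when you don't control the definition.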

Multimethods

Multimethods are the other area where class names frequently come into play. There are two steps to defining a basic multimethod. The first is the defmulti declaration, where you name the multimethod and provide a function that Clojure will use to dispatch each call to one of the method implementations you declare later. The second is one or more defmethod declarations, each pairing a dispatch value with an implementation. One of the most common dispatch functions is class, which matches the Java class of the argument against the class named in each defmethod.

This example declares a multimethod that takes either a String or an Integer:


user> (defmulti bar class)
#'user/bar
user> (defmethod bar String  [s] (str "the-str:" s))
#<MultiFn clojure.lang.MultiFn@2e239525>
user> (defmethod bar Integer [s] (str "the-int:" s))
#<MultiFn clojure.lang.MultiFn@2e239525>
user> (bar "this")
"the-str:this"
user> (bar 123)
"the-int:123"
user> 

And if you attempt a call with an argument that doesn’t match any of the declarations:

user> (bar 4.56)
; Evaluation aborted.
No method in multimethod 'bar' for dispatch value: class java.lang.Double
  [Thrown class java.lang.IllegalArgumentException]
...

That’s effectively what Emacs, Slime and Clojure report to me.

There are more nuances to multimethods, like specifying defaults, handling multiple arguments and using your own dispatch functions – for now, the documentation is probably the best place to find out more.
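
As a small taste of one of those nuances, here is a minimal sketch of a default method. The multimethod name describe is hypothetical, chosen so it doesn't clash with bar above. Note also that on Clojure 1.3 and later an integer literal such as 123 is read as a java.lang.Long rather than an Integer, so this sketch declares a Long method:

```clojure
;; Dispatch on the class of the single argument, as with bar above.
(defmulti describe class)

(defmethod describe String [s] (str "the-str:" s))
(defmethod describe Long   [n] (str "the-long:" n))

;; :default is the dispatch value Clojure falls back to when nothing else matches.
(defmethod describe :default [x]
  (str "no-specific-method-for:" (.getName (class x))))

(describe "this")   ; => "the-str:this"
(describe 123)      ; => "the-long:123"
(describe 4.56)     ; => "no-specific-method-for:java.lang.Double"
```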

Arrays and Class Names

So, back to why we’re here. As you start using type hints you may (as I did) run into a situation where you want to hint a typed array, or use an array type as a dispatch value in a multimethod. Javadoc presents these as Type[], and that is how you write them in your Java source code. The problem, though, is that this is not what the byte-code or the JVM calls them at run-time, and what the JVM does call them is not syntactically valid as a class name in either Java or Clojure source code.

So what are typed arrays called in Java? Let’s ask Java what they’re called. You can find out what a String array is called with this bit of example code:

kyleburton@indigo64 ~$ cat Test.java
public class Test {
  public static void main(String [] args) {
    System.out.println("String array: " + args.getClass());
    System.out.println("Byte Array: " + "foo".getBytes().getClass());
  }
}
kyleburton@indigo64 ~$ javac Test.java
kyleburton@indigo64 ~$ java Test
String array: class [Ljava.lang.String;
Byte Array: class [B
kyleburton@indigo64 ~$ 

You can get the same information from the Clojure REPL by calling class on an array, for example one built with Clojure’s into-array:

user> (class (into-array ["a"]))
[Ljava.lang.String;
user> (class (.getBytes "foo"))
[B
user> 
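
These names follow the JVM's internal naming convention for arrays: one [ per dimension, followed either by a single letter for a primitive element type, or by L, the fully qualified class name and a semicolon for a class. As a quick illustration (the helper array-name is made up for this sketch):

```clojure
(defn array-name
  "Returns the JVM's name for the class of the given array."
  [arr]
  (.getName (class arr)))

(array-name (int-array 0))             ; => "[I"
(array-name (long-array 0))            ; => "[J"
(array-name (double-array 0))          ; => "[D"
(array-name (make-array String 2 2))   ; => "[[Ljava.lang.String;"
```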

So you now know how to ask for the class name of an array you have an instance of. You can use this to ask Class for the class based on its name (as a String):

user> (Class/forName "[B")
[B
user> (Class/forName "[Ljava.lang.String;")
[Ljava.lang.String;
user> 

Clojure supports hints for arrays of primitive types directly, using the pluralized name of the primitive type:

user> (defn foo [#^bytes b] (String. b))

That approach doesn’t work for non-primitive types (classes) though.
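
One thing that does appear to work for arrays of classes is writing the hint as a string holding the JVM's name for the array class. Treat this as a sketch under that assumption rather than a definitive recipe; count-strings is a made-up example name, and the hint carries the same hard-coded-name drawback discussed further down:

```clojure
;; The tag is a string naming the array class as the JVM sees it (String[]).
;; With *warn-on-reflection* enabled, this should compile without a warning,
;; whereas an unhinted alength call is reported as reflective.
(defn count-strings [^"[Ljava.lang.String;" arr]
  (alength arr))

(count-strings (into-array String ["a" "b" "c"]))   ; => 3
```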

Java Array Type Hints

Putting these together, we can now declare defmethods using either of these techniques. Asking the JVM what the class is, based on a hard-coded example value, looks like this:


user> (defmethod bar (class (into-array String [])) [s]
        (str "the-string[]:" s))
#<MultiFn clojure.lang.MultiFn@70e8fdc9>
user> (bar (into-array String ["a" "b"]))
"the-string[]:[Ljava.lang.String;@14f3cf72"

Using a hard-coded string of the class name as the JVM sees it (which also works for the primitive types) looks like the following:


user> (defmethod bar (Class/forName "[Ljava.lang.String;") [s]
        (str "the-string[]:" s))
#<MultiFn clojure.lang.MultiFn@70e8fdc9>
user> (bar (into-array String ["b" "c"]))
"the-string[]:[Ljava.lang.String;@4597871d"
user> 

user> (defmethod bar (Class/forName "[B") [s] ;; same as #^bytes
        (str "the-bytes:" s))
#<MultiFn clojure.lang.MultiFn@70e8fdc9>
user> (bar (.getBytes "foo"))
"the-bytes:[B@7eedec92"
user> 

Even though this works, I recommend staying away from hard-coding the string representation with Class/forName, and deriving the class from an example array instead. I worry that the representation might change in a future JVM release, breaking the code.
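
One way to follow that advice without repeating an example array at every defmethod is a small helper that derives the array class from its element class. This is only a sketch; array-class-of and the multimethod shape are hypothetical names, not anything provided by Clojure:

```clojure
(defn array-class-of
  "Returns the class of a one-dimensional array whose elements are element-class."
  [element-class]
  (class (make-array element-class 0)))

;; A fresh multimethod so the sketch stands alone.
(defmulti shape class)

(defmethod shape (array-class-of String)    [_] :string-array)
(defmethod shape (array-class-of Byte/TYPE) [_] :byte-array)

(shape (into-array String ["a" "b"]))   ; => :string-array
(shape (.getBytes "foo"))               ; => :byte-array
```

Passing Byte/TYPE (the primitive byte class) rather than Byte is what yields a byte[] instead of a Byte[].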

Conclusion

Even though Clojure does not have direct syntax support for hints for Java arrays of classes, it’s still possible to use them.

Kyle Burton, 14th July 2009 – Wayne PA

Thanks

Special thanks to Jonathan Tran and Mike DeLaurentis for reading drafts and providing feedback and suggestions.
