Oops Null Pointer

Java programming related

Category Archives: JSON

Memory and wire size of message protocols

There are many studies of the speed of message protocols like protobuf, JSON and BSON, but few measure the memory needed to get in-memory data out to the client. The simplest approach (and the worst in terms of memory usage) is to buffer the whole data structure before sending it. This typically requires at least as much memory again as the original data.
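To make the buffering cost concrete, here is a minimal sketch using only java.io. The class and method names are hypothetical, the comma-separated output is just a stand-in for a real protocol, and a byte-counting stream plays the part of the client connection.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

final class BufferVersusStream {

    /** Hypothetical stand-in for a socket: counts bytes and discards them. */
    static final class CountingSink extends OutputStream {
        long count;
        @Override public void write(int b) { count++; }
        @Override public void write(byte[] b, int off, int len) { count += len; }
    }

    /** Buffered: builds the whole message in memory first. Returns the bytes held at once. */
    static int sendBuffered(String[][] data, OutputStream out) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        for (String[] row : data) {
            for (String cell : row) {
                buffer.write(cell.getBytes(StandardCharsets.UTF_8));
                buffer.write(',');
            }
            buffer.write('\n');
        }
        int held = buffer.size();   // a full second copy of the data
        buffer.writeTo(out);
        return held;
    }

    /** Streamed: writes each cell as it is visited. Returns the largest chunk held at once. */
    static int sendStreamed(String[][] data, OutputStream out) throws IOException {
        int largest = 0;
        for (String[] row : data) {
            for (String cell : row) {
                byte[] bytes = cell.getBytes(StandardCharsets.UTF_8);
                largest = Math.max(largest, bytes.length);
                out.write(bytes);
                out.write(',');
            }
            out.write('\n');
        }
        return largest;
    }

    public static void main(String[] args) throws IOException {
        String[][] data = { {"1.5", "2.25"}, {"3.125", "4.0"} };
        System.out.println("buffered held: " + sendBuffered(data, new CountingSink()));
        System.out.println("streamed held: " + sendStreamed(data, new CountingSink()));
    }
}
```

Both methods put the same bytes on the wire. The difference is only in how much has to be held in memory before the first byte leaves.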

My data set at hand for testing was a large (82MiB) 2D array of decimal values represented as strings (about 10 decimal places).

The generated Java CORBA serialisation code I started with buffers everything at once in its write method: 82MiB of data is copied into another 82MiB.

JSON mapped using Jackson had similar but slightly better memory usage.

Using an ancient version of The Mind Electric’s GLUE SOAP toolkit (don’t ask!), the SOAP-wrapped JSON message also buffers the lot into memory and was horribly inefficient in creating the envelope (using hundreds of MiB).

*A note on wire size – the SOAP message compressed very well using GZip as the whole message is available.

BSON (using BSON4Jackson) by default requires the first element to be the message size and thus buffers the lot into memory. By disabling pure BSON using the BsonGenerator.Feature.ENABLE_STREAMING setting, streaming code can be used, and sending then needs extra memory of only about a third of the original data size.
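Why a leading size field forces buffering can be shown with a small sketch. This is not BSON's actual wire layout: the framing, the class name and the end marker are all assumptions made for illustration, using only java.io.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

final class LengthPrefixSketch {

    /** Size goes first, so every value must be buffered before anything is sent. */
    static void writeLengthPrefixed(String[] values, OutputStream out) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        for (String v : values) {
            body.write(v.getBytes(StandardCharsets.UTF_8));
            body.write(0);              // value terminator
        }
        DataOutputStream data = new DataOutputStream(out);
        data.writeInt(body.size());     // only known once the body is complete
        body.writeTo(data);
        data.flush();
    }

    /** No size up front: each value leaves as soon as it is produced. */
    static void writeStreamed(String[] values, OutputStream out) throws IOException {
        for (String v : values) {
            out.write(v.getBytes(StandardCharsets.UTF_8));
            out.write(0);               // value terminator
        }
        out.write(0xFF);                // hypothetical end marker (never valid in UTF-8)
        out.flush();
    }

    public static void main(String[] args) throws IOException {
        String[] values = {"1.5", "2.0"};
        ByteArrayOutputStream prefixed = new ByteArrayOutputStream();
        ByteArrayOutputStream streamed = new ByteArrayOutputStream();
        writeLengthPrefixed(values, prefixed);
        writeStreamed(values, streamed);
        System.out.println(prefixed.size() + " bytes prefixed, " + streamed.size() + " bytes streamed");
    }
}
```

The trade-off is that a reader of the streamed form cannot know the message size in advance, which is presumably why the streaming setting is described above as disabling pure BSON.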

I couldn’t get Google’s protocol buffers to cope well with very large data. Strings appear to be unoptimised (it just uses String.getBytes()), so even sending the array a “row” at a time did not yield great performance. Sending a field at a time with string size fields and row length prefixes was even worse.

The least memory usage came from sending the data via Jackson’s streaming API. Coupled with omitting the Content-Length header so that the response is chunked (an HTTP/1.1 feature), it had almost no overhead. Sending 82MiB took about 4KB of memory! There is some clever code in the streaming API, as it is exceptionally efficient at streaming to an output stream (in my case in a servlet or via Restlet’s OutputRepresentation class).

You can also use GZip on this and it produces half the wire size, but takes twice the time.
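Compression can be layered onto a stream without buffering the message first. The sketch below uses java.util.zip from the JDK. The class and method names are hypothetical, and plain comma-separated rows stand in for the JSON output.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

final class GzipStreamingSketch {

    /** Writes rows through a GZIPOutputStream so data is compressed as it flows. */
    static void writeCompressed(String[][] rows, OutputStream target) throws IOException {
        try (GZIPOutputStream gzip = new GZIPOutputStream(target)) {
            for (String[] row : rows) {
                gzip.write(String.join(",", row).getBytes(StandardCharsets.UTF_8));
                gzip.write('\n');
            }
        } // close() writes the GZip trailer and closes the target stream
    }

    /** Decompresses a complete GZip payload back into text (for checking the round trip). */
    static String readCompressed(byte[] compressed) throws IOException {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        String[][] rows = { {"1.5", "2.5"}, {"3.5", "4.5"} };
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        writeCompressed(rows, wire);
        System.out.print(readCompressed(wire.toByteArray()));
    }
}
```

In a servlet the target would be the response’s output stream, with a Content-Encoding: gzip header set so the client knows to decompress.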

In summary: plain JSON streamed with Jackson is the clear winner for my data set, with its tiny memory usage when sending data to a stream.

In practice I felt that it was much simpler than this article made it seem (but I’m sending a very simple message here). Here is my code to stream a JSON representation of a 2D string array:


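import org.codehaus.jackson.JsonFactory;
import org.codehaus.jackson.JsonGenerator;

// outputStream is the servlet/Restlet output stream; a2d is the String[][] to send
JsonFactory f = new JsonFactory();
JsonGenerator g = f.createJsonGenerator(outputStream);
g.writeStartObject();
g.writeStringField("type", "JsonJacksonStreaming");
g.writeArrayFieldStart("vals");

// one nested JSON array per row, written as each row is visited
for (int r = 0; r < a2d.length; r++)
{
    g.writeStartArray();
    for (int c = 0; c < a2d[r].length; c++)
    {
        g.writeString(a2d[r][c]);
    }
    g.writeEndArray();
}
g.writeEndArray();  // closes "vals"
g.writeEndObject();
g.close();          // flushes any remaining buffered output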

Note: I didn’t get time to try MessagePack, but I’d like to. Would anyone who has tried MessagePack with large amounts of string-ish data care to comment?


JSON from Java to C# and back again

Recently I needed to send JSON between Java and C# over an existing SOAP gateway.

On the C# side I used Jayrock, which required a painless recompile to support strong names on the assemblies.

On the Java side I initially used Jettison with XStream, but Jettison’s output is not “pure JSON”, at least not the same as other JSON I have seen and the JSON that Jayrock produces.

Jettison’s JSON:

  • puts a root node of the fully qualified class name
  • names string elements “string” and string arrays “string-array”

While it’s probably possible to work around these in Jettison (or Jayrock), I thought it would be more straightforward to use a library that produced purer JSON. XStream can export JSON with its JsonHierarchicalStreamDriver, but it cannot consume JSON (as of 1.3.1). So I turned to Jackson.

After a little work discovering the mix-in annotation method to work with existing static enumeration classes (see an earlier post), Jackson was working fine.

The only other issue was an old version of GLUE, 4.1.2 (yes, it’s old and I’m pushing to upgrade to CXF or Axis2). This GLUE version has a very inefficient “specials” replacement method that replaces quotes with &quot;. Doing the same job with String.replace stopped the CPU from maxing out and made each request orders of magnitude faster.
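As a rough sketch of that replacement (the class and method names are hypothetical, and this is not GLUE’s code), the quote escaping comes down to one String.replace call in each direction:

```java
final class QuoteEscapeSketch {

    /** Replaces every double quote with the XML entity, in a single pass over the string. */
    static String escapeQuotes(String json) {
        return json.replace("\"", "&quot;");
    }

    /** Reverses escapeQuotes on the receiving side. */
    static String unescapeQuotes(String escaped) {
        return escaped.replace("&quot;", "\"");
    }

    public static void main(String[] args) {
        String json = "{\"type\":\"demo\"}";
        String escaped = escapeQuotes(json);
        System.out.println(escaped);
        System.out.println(unescapeQuotes(escaped));
    }
}
```

This mirrors only the quote handling described above. A payload that may contain &, < or > would need those escaped too, with & handled first so the round trip stays unambiguous.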

Jackson Custom Deserializer Example – Annotated MixIn

Having just recently started using Jackson, I ran into the issue of serialising an old-style static enum pattern class. This class has no public constructor (all “enum” options are static instances), so I had to make my own deserialiser.
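For readers who have not met the pattern, here is a minimal sketch of what such a class might look like. The RED and GREEN instances and the fromName lookup are assumptions for illustration, not the real class.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MyStaticEnum {

    // Declared before the instances so it is initialised when their constructors run
    private static final Map<String, MyStaticEnum> BY_NAME = new LinkedHashMap<>();

    // Hypothetical options: every instance is a static constant
    public static final MyStaticEnum RED = new MyStaticEnum("RED");
    public static final MyStaticEnum GREEN = new MyStaticEnum("GREEN");

    private final String name;

    // Private constructor: nothing outside the class can create new options
    private MyStaticEnum(String name) {
        this.name = name;
        BY_NAME.put(name, this);
    }

    public String getName() {
        return name;
    }

    /** Hypothetical lookup that a custom deserialiser could delegate to. */
    public static MyStaticEnum fromName(String name) {
        MyStaticEnum value = BY_NAME.get(name);
        if (value == null) {
            throw new IllegalArgumentException("Unknown MyStaticEnum: " + name);
        }
        return value;
    }

    @Override
    public String toString() {
        return name;
    }
}
```

Because a data binder cannot construct instances itself, a custom deserialiser would need to read the name from the JSON (for example via JsonParser.getText()) and hand back the matching static instance from a lookup like fromName.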

There are two ways to map the custom deserialiser – via a Mixin or via a factory.

Mix in:

Create a mix-in for the class that contains the enum, overriding the field or setter:


public abstract class JacksonMyStaticEnumMixIn {

  JacksonMyStaticEnumMixIn() { }

  @JsonDeserialize(using=MyStaticEnumDeserializer.class)
  public MyStaticEnum myStaticEnum;
}

Then add this to the object mapper:

ObjectMapper mapper = new ObjectMapper();
// Implicit deserialisation mapping via mix in class
mapper.getDeserializationConfig().addMixInAnnotations( ClassContainingStaticEnum.class, JacksonMyStaticEnumMixIn.class);

Explicit Binding:

The mix-in method cannot be checked by the compiler and has to be discovered rather than being intuitive. The other option is an explicit binding via a factory, as follows:

ObjectMapper mapper = new ObjectMapper();
CustomDeserializerFactory factory = new CustomDeserializerFactory();
factory.addSpecificMapping(MyStaticEnum.class, new MyStaticEnumDeserializer());
mapper.setDeserializerProvider(new StdDeserializerProvider(factory));