Ultimately, we could resolve this by moving to 64 bits, but that requires a lot of testing. Instead, we could create some simple file-based structures:

  • CHTTPMessageBodyBuilder should start storing on disk if the size exceeds a certain threshold.
  • We need to support a type of CDatum that behaves like a string but stores its content on disk. [And we rely on the same garbage-collection mechanics.]
  • Both parsers (CJSONParser and CAEONParser) should generate the new disk-based strings if necessary.
  • Unfortunately, we'll end up doing a file copy when we serialize this to the database, and an additional file copy when we store it in the database. We can't optimize that right now; one possible solution is to set a ref-count on the file itself and use it to transfer ownership.
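
The threshold idea in the first bullet can be sketched as follows. This is only an illustration under assumptions: SpillBuffer, its methods, and the spill path are hypothetical names, not the actual CHTTPMessageBodyBuilder interface, and the sketch uses standard C++ streams rather than whatever file classes the real code uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a body builder that keeps small bodies in memory
// and moves to a file once the total size exceeds a threshold.
class SpillBuffer {
public:
    SpillBuffer(std::size_t threshold, std::string spillPath)
        : m_threshold(threshold), m_path(std::move(spillPath)) {}

    // The spill file is owned by this object and deleted with it.
    ~SpillBuffer() {
        if (m_onDisk) {
            m_file.close();
            std::remove(m_path.c_str());
        }
    }

    SpillBuffer(const SpillBuffer&) = delete;
    SpillBuffer& operator=(const SpillBuffer&) = delete;

    void Append(const char* data, std::size_t len) {
        assert(data != nullptr || len == 0);

        // First append that would cross the threshold: flush what we have
        // to disk and release the in-memory copy.
        if (!m_onDisk && m_memory.size() + len > m_threshold) {
            m_file.open(m_path, std::ios::binary | std::ios::trunc);
            if (!m_file)
                throw std::runtime_error("unable to create spill file");
            m_file.write(m_memory.data(),
                         static_cast<std::streamsize>(m_memory.size()));
            m_memory.clear();
            m_memory.shrink_to_fit();
            m_onDisk = true;
        }

        if (m_onDisk)
            m_file.write(data, static_cast<std::streamsize>(len));
        else
            m_memory.append(data, len);

        m_size += len;
    }

    bool IsOnDisk() const { return m_onDisk; }
    std::size_t Size() const { return m_size; }

    // Convenience for small tests only; a real consumer would stream the
    // file instead of loading it back into memory.
    std::string ReadAll() {
        if (!m_onDisk)
            return m_memory;
        m_file.flush();
        std::ifstream in(m_path, std::ios::binary);
        return std::string(std::istreambuf_iterator<char>(in),
                           std::istreambuf_iterator<char>());
    }

private:
    std::size_t m_threshold;
    std::string m_path;
    std::string m_memory;
    std::ofstream m_file;
    std::size_t m_size = 0;
    bool m_onDisk = false;
};
```

With a threshold of 8 bytes, appending 4 bytes stays in memory and appending 6 more switches to the file. Callers only see Append and Size, so they do not need to know which storage is in use.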

george moromisato on 4/3/2015 4:01 AM:

This will be fixed in the next release. It took a lot longer to implement than I thought (two days), mostly because I needed to handle parsing across buffer boundaries (and handle cases where the buffer is on disk).
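
To illustrate why buffer boundaries complicate parsing, here is a minimal sketch of a parser that keeps its state between calls, so a token may start in one buffer and end in the next. ChunkedStringParser is a hypothetical name and not part of CJSONParser or CAEONParser; it only reads a single quoted string, and its escape handling is deliberately simplified (only `\n` is translated, and any other escaped character is copied through).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical example: reads one double-quoted string from input that
// arrives in arbitrary chunks.
class ChunkedStringParser {
    enum class State { Start, InString, Escape, Done };

public:
    // Consumes one buffer. Returns true once the closing quote has been
    // seen; any bytes after it are ignored.
    bool Feed(const char* data, std::size_t len) {
        assert(data != nullptr || len == 0);

        for (std::size_t i = 0; i < len && m_state != State::Done; ++i) {
            const char ch = data[i];
            switch (m_state) {
            case State::Start:
                // Skip everything up to the opening quote.
                if (ch == '"')
                    m_state = State::InString;
                break;

            case State::InString:
                if (ch == '\\')
                    m_state = State::Escape;   // may be the buffer's last byte
                else if (ch == '"')
                    m_state = State::Done;
                else
                    m_value.push_back(ch);
                break;

            case State::Escape:
                // The escaped character may arrive in a later buffer than
                // the backslash; the saved state makes that transparent.
                m_value.push_back(ch == 'n' ? '\n' : ch);
                m_state = State::InString;
                break;

            case State::Done:
                break;
            }
        }
        return m_state == State::Done;
    }

    const std::string& Value() const { return m_value; }

private:
    State m_state = State::Start;
    std::string m_value;
};
```

Feeding `"ab\` and then `"cd" tail` as two separate buffers yields the value `ab"cd`, even though the escape sequence is split across the boundary. The same Feed loop works whether each buffer comes from memory or is read back from a file in blocks.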

george moromisato on 4/17/2015 6:45 PM: