Thrift for serializing/deserializing objects in Membase
First off, if you haven't heard of Membase, you should check it out. It's an evolution of sorts from memcached.
Typically, when you use memcached or Membase to store and retrieve key-value data, the value is not a simple datatype. More likely it is a serialized representation of some complex, application-specific data structure. It's great to be able to set or get a complex data structure with a single remote call like this. What can quickly become a problem, though, is the cost of the serialize/deserialize operations that have to happen with every set/get.
With PHP, the obvious way to do this is to use the language's built-in serialization facility. Since the serialized format is ASCII-based, I would guess that its performance is not optimal (especially for deserialization). One would also want to compress the values to reduce data transfer and storage costs, which again adds to the cost of each set/get operation.
I'm looking at one such application that could be optimized to work more efficiently in these areas. I've looked at Google Protocol Buffers. They are very easy to understand and use, and the documentation is very good. Unfortunately, the PHP support is not good. So I'm now looking at Thrift. Thrift was initially developed at Facebook for use with PHP, among other languages, so it has good PHP support, and its performance and functionality are comparable to those of protobufs. Its documentation, however, seems rather sparse.
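To make the cost concrete, here is a minimal sketch of the serialize-then-compress step that sits in front of every set and the reverse step behind every get. It is written in Python only for brevity; in PHP the same pattern would be built from serialize()/unserialize() and gzcompress()/gzuncompress(). The helper names pack and unpack, the sample profile, and the plain dict standing in for the cache client are all made up for illustration.

```python
import json
import zlib


def pack(value, level=6):
    """Serialize a value to JSON text, then compress the bytes with zlib."""
    raw = json.dumps(value, separators=(",", ":")).encode("utf-8")
    return zlib.compress(raw, level)


def unpack(blob):
    """Reverse of pack: decompress, then parse the JSON text."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


# Assumption: a plain dict stands in for the memcached/Membase client.
cache = {}

# Hypothetical application object with some nested structure.
profile = {"id": 42, "name": "alice", "friends": list(range(200))}

cache["user:42"] = pack(profile)      # work done on every set
restored = unpack(cache["user:42"])   # work done on every get
```

Both steps run on every cache access, so whatever format and compressor are chosen, their cost is paid per request and not just once.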
On Compression
LZO seems a more suitable compression algorithm here, for reasons of CPU and memory efficiency. When compression happens as part of handling a web request, one has to carefully weigh compression ratio against speed.
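One rough way to see the trade-off is to time a compressor at different effort levels on a representative payload. The sketch below uses zlib from the Python standard library as a stand-in, since LZO bindings are not part of the standard library; the payload and the measure helper are invented for illustration, and the actual numbers will depend on the data and the machine.

```python
import time
import zlib


def measure(payload, level):
    """Compress payload at the given zlib level; report size and elapsed time."""
    start = time.perf_counter()
    blob = zlib.compress(payload, level)
    elapsed = time.perf_counter() - start
    return {"level": level, "bytes": len(blob), "seconds": elapsed, "blob": blob}


# Hypothetical payload: a repetitive, JSON-like serialized value.
payload = b'{"user_id": 12345, "status": "active", "score": 98.6}' * 2000

# Level 1 favours speed, level 9 favours size, level 6 is zlib's usual default.
results = [measure(payload, level) for level in (1, 6, 9)]

for r in results:
    print(f"level={r['level']} size={r['bytes']} bytes time={r['seconds'] * 1000:.3f} ms")
```

Running the same measurement with the real serialized values from the application, and with the compressor actually under consideration, is what would settle the choice.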