A month back I blogged about wanting a better throughput scaling tool for DynamoDB. Not having been able to find an existing tool that ticked all my boxes, I ended up scratching my own itch and developed a small Java tool that runs in the background, monitoring a set of DynamoDB tables. The tool satisfy… Continue reading Building a better DynamoDB throughput scaling tool, part 2
Tag: BigData
Space efficient Bloom filter index, 2x performance gain
To quote Wikipedia: A Bloom filter is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set. This is often used by systems in order to avoid accessing slow media, like a disk. Take HBase or Cassandra for instance: Instead of reading their data files in… Continue reading Space efficient Bloom filter index, 2x performance gain
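To make the quoted definition concrete, here is a minimal Bloom filter sketch in Java. It is not the index described in the post: the class name `SimpleBloomFilter`, the bit-array size, the number of hash functions and the double-hashing scheme are all assumptions chosen for illustration.

```java
import java.util.BitSet;

// Illustrative sketch only; names and the hashing scheme are assumptions.
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    SimpleBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Derive the i-th bit position from two base hashes (double hashing).
    private int position(String key, int i) {
        int h1 = key.hashCode();
        int h2 = (h1 >>> 16) | 1; // forced odd so successive positions differ
        return Math.floorMod(h1 + i * h2, size);
    }

    // Set every bit position derived from the key.
    void add(String key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(position(key, i));
        }
    }

    // false means "definitely absent"; true means "possibly present".
    boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(position(key, i))) {
                return false;
            }
        }
        return true;
    }
}
```

A caller would check `mightContain(key)` before touching the slow medium and skip the read whenever it returns `false`. A `true` result can be a false positive, so the actual lookup is still needed in that case.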
Resource aware queue
For the TLDRs: This blog post presents the reasoning behind a project called ResourcePriorityBlockingQueue, which is a blocking queue implementation that: Allows you to assign priorities to tasks, just as with a PriorityBlockingQueue. Tasks may belong to task groups where each group may have a different priority. In that case, tasks are prioritized by group… Continue reading Resource aware queue
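The idea of ordering by group priority can be sketched with the standard `java.util.concurrent.PriorityBlockingQueue` and a two-level comparator. This is not the ResourcePriorityBlockingQueue API. The `Task` type, its fields, and the rule "group priority first, then task priority, lower number wins" are assumptions made for illustration, since the excerpt is cut off before it describes the actual ordering.

```java
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;

// Illustrative sketch only; all names here are hypothetical.
final class GroupedTaskDemo {

    // Hypothetical task type: lower numbers mean higher priority.
    static final class Task {
        final String name;
        final int groupPriority;
        final int taskPriority;

        Task(String name, int groupPriority, int taskPriority) {
            this.name = name;
            this.groupPriority = groupPriority;
            this.taskPriority = taskPriority;
        }
    }

    // Order by group priority first, then by priority within the group.
    static PriorityBlockingQueue<Task> newQueue() {
        Comparator<Task> order = Comparator
                .comparingInt((Task t) -> t.groupPriority)
                .thenComparingInt(t -> t.taskPriority);
        return new PriorityBlockingQueue<>(11, order);
    }
}
```

With this ordering, a task from a higher-priority group is polled before any task from a lower-priority group, regardless of the individual task priorities.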
Oracle Coherence and MapReduce
I spend a lot of my time working with Oracle Coherence. If you’ve never heard of Coherence, it can briefly be described as a linearly scalable in-memory HashMap. By linearly scalable I mean a distributed HashMap, where each cluster member is responsible for storing a portion of the complete map. As everything is in-memory you maintain… Continue reading Oracle Coherence and MapReduce