Friday, March 6, 2015

Going into Big Data - Cassandra and MongoDB

So... thinking about a scheduling service, I want to try out Cassandra and MongoDB as the data store. From what I have been learning, Cassandra seems more appropriate for the use case, but I don't know whether the learning curve is too steep, or whether it is really needed, since this is a service that, for now, doesn't need to be highly available.
From past experience I know it is better to design and architect in a way that brings flexibility for the future rather than only matching the current non-functional requirements of a project, but... if we search for the perfect architecture we will never get there, not only because it would make the project too complex but also because it rests on a false assumption: every day new technologies emerge that may be better suited to the use case. The world changes constantly. If we don't hold certain assumptions fixed in software development, we end up with a never-ending story, drifting from the goal and never delivering something that works for the client.
Another thing to consider is that the best architect or tech lead will strive for the best approach and the most thrilling technologies. As a leader I have found that emergence is a great concept that brings a lot of benefits, but... emergence with no guidance is like a river that has been straining against a dam and suddenly bursts through! If there is no guidance for the water, it will flood all over the place. Of course, if you keep the dam, the water will be under enormous pressure and become muddy :)

As a technology lover I want to try out Cassandra, but as a pragmatic person I will compare it with MongoDB by building a small POC and analysing learning curve, performance, availability of plugins, etc. Of course, if you are trying to decide between technologies, there are more points to consider:
License/pricing, community support, who is using it, product roadmap, resource availability, fit for purpose, infrastructure requirements, and so on.
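
One way to keep a comparison like this honest is to write the criteria down with weights and score each candidate after the POC. Below is a minimal sketch of that idea in Python. The weights, the candidate names ("option A", "option B") and every score are made-up placeholders for illustration, not results from any real evaluation of Cassandra or MongoDB.

```python
# Weighted decision matrix sketch.
# ASSUMPTION: all weights and scores below are hypothetical placeholders,
# not measurements of any real product.

CRITERIA_WEIGHTS = {
    "learning curve": 3,
    "performance": 3,
    "plugin availability": 2,
    "license/pricing": 2,
    "community support": 2,
    "fit for purpose": 4,
}


def weighted_score(scores, weights):
    """Return the weighted average of per-criterion scores."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    total_weight = sum(weights.values())
    return sum(scores[name] * w for name, w in weights.items()) / total_weight


def rank(candidates, weights):
    """Return (candidate, score) pairs sorted from best to worst."""
    scored = [(name, weighted_score(s, weights)) for name, s in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Placeholder scores on a 1-5 scale, to be filled in after the POC.
    candidates = {
        "option A": {name: 3 for name in CRITERIA_WEIGHTS},
        "option B": {name: 4 for name in CRITERIA_WEIGHTS},
    }
    for name, score in rank(candidates, CRITERIA_WEIGHTS):
        print(f"{name}: {score:.2f}")
```

The numbers matter less than the exercise: agreeing on the weights before looking at the results makes it harder for the "most thrilling technology" to win by default.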

Let's see what I select for this particular use case :D

Some useful resources:
http://www.jonathanhui.com/install-cassandra-single-node-amazon-linux
http://docs.mongodb.org/ecosystem/platforms/amazon-ec2/ - a note on this one: you don't have to create 3 different drives to try it out; you can use the one provided and just create the directories.

