
BLU Discuss list archive



NoSQL vs SQL



On 12/06/2010 06:15 PM, Edward Ned Harvey wrote:
>> From: Mark Woodward
>>
>> Any rants/raves/comments either pro/con about NoSQL or SQL?
>>      
> I completely agree with you, in that (a) that video is the funniest sh*t I
> think I've seen all year...  (Also this one:
> http://www.youtube.com/watch?v=FL7yD-0pqZg )
>
> And (b) There is missing context in nearly all of the conversations people
> have about SQL "not scaling."  To put it in context:  No, google apps can't
> use SQL in the backend because SQL just can't scale to the massive numbers
> of servers and simultaneous clients served, and the massive number of points
> of entry that satisfy the inbound web requests.  In order to exceed the
> workload capacity of a single SQL server, you're talking about several Gb
> per second (or at least several hundred Mb).  When you reach that level of
> usage, then you start needing a more scalable solution.  You can't possibly
> reach these levels if you have a single 100Mb connection to the Internet,
> which is much larger than 99% of businesses or home users presently have.
> In order to get such a large internet connection, typically companies spend
> thousands of dollars per month, if not tens of thousands per month, and a
> complete staff of IT people, with a fully managed, high performance, highly
> redundant network infrastructure...
>
> If you are serving users, it will require several hundred simultaneous power
> users before you approach the scaling limits of a single SQL server.  More
> typically, several thousand simultaneous, because they won't all be "power"
> users constantly generating maximum work load.
>
> Or you have a cluster of compute-heavy servers mining data in a data farm...
>
>    

The most important aspect of these scalability discussions, and one 
which I frequently find lacking in the "NoSQL" camp, is an assessment 
of just how many full-scale transactions a SQL database can handle.

Assume that you have one system sitting on one SATA disk with a seek 
time of 10 ms and a rotational speed of 10,000 RPM. I/O is your 
bottleneck; CPU speed is more or less infinite in comparison. At 
10,000 RPM there are about 166 revolutions per second, or 6 ms per 
revolution, so on average the platter has to turn half a revolution 
(3 ms) after the seek before a random sector is under the head. So we 
have 10 ms (seek) plus 3 ms (rotation), giving an average head 
positioning time of 13 ms. That works out to a pessimistic average of 
about 77 arbitrary writes per second.
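
To make the arithmetic above easy to rerun with other disks, here is a 
minimal Python sketch. The function name and parameters are mine, not 
from any library, and the model is the same simplification used above: 
one random write costs one seek plus half a revolution, with no caching, 
queuing, or write combining.

```python
def random_write_iops(seek_ms: float, rpm: float) -> float:
    """Estimate random writes per second for a single spinning disk.

    Assumed model: each write costs one average seek plus half a
    platter revolution. Caches and command queuing are ignored.
    """
    ms_per_revolution = 60_000.0 / rpm            # 10,000 RPM -> 6 ms
    rotational_latency_ms = ms_per_revolution / 2  # half a turn on average
    positioning_ms = seek_ms + rotational_latency_ms
    return 1000.0 / positioning_ms


if __name__ == "__main__":
    iops = random_write_iops(seek_ms=10.0, rpm=10_000)
    print(f"about {iops:.0f} random writes per second")
```

Plugging in a faster seek time shows how quickly the figure rises, 
which is why the 10 ms number is a pessimistic one.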

Actual IOPS are higher these days because seek times are better, but 
let's use the worst-case scenario. (We could also use a RAID system and 
multiply performance.)

In an ACID database configuration, assume roughly half of that maximum, 
say a conservative 36 writes per second on the database. That's 36 
transactions (read/write) per second. Assuming 1 transaction per page 
view, that amounts to about 90 million page views a month (on average) 
as a sustainable number.
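
As a quick check on the monthly figure, here is a small Python sketch. 
The function name, the 30-day month, and the assumption of perfectly 
even load around the clock are mine; real traffic peaks and troughs, so 
treat the result as an average, not a capacity guarantee.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def monthly_page_views(tx_per_second: float,
                       tx_per_page_view: float = 1.0,
                       days: int = 30) -> float:
    """Page views per month sustainable at a steady transaction rate.

    Assumes evenly spread load and a 30-day month (both simplifications).
    """
    tx_per_month = tx_per_second * SECONDS_PER_DAY * days
    return tx_per_month / tx_per_page_view


if __name__ == "__main__":
    views = monthly_page_views(tx_per_second=36)
    print(f"{views / 1e6:.1f} million page views per month")
```

At 36 transactions per second this comes to roughly 93 million, in line 
with the "about 90 million" estimate.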

So, a good SQL database with no scaling tricks on a bog-standard modern 
PC-based server will serve a web site as busy as all but the very most 
popular sites on the web. Someone, please tell me, what are the NoSQL 
guys going on about with regard to scalability?








