On 8/2/2012 4:36 PM, Mark Woodward wrote:
> Not to be snide, but 8 million is not a big number.

That's 8 million patients. Multiply that by everything that the VA has on each and every one of them and you get a very large data set. It's not the largest data set that I'm aware of. The largest is the data out of the LHC, which is around 200 petabytes. CERN went the other way. They started with an object database but eventually dropped it due to poor market development of OODBMSs. They currently use relational databases for storing and retrieving metadata. Bulk data is stored in flat files.

> Well, "billions" of transactions per day should be doable in a cluster.

That's what Ameritrade and Oracle thought, but they couldn't make it work.

> If your oracle database is crashing, it is misconfigured.

The Oracle techs working with Ameritrade couldn't keep the cluster going. They eventually gave up when Ameritrade wouldn't commit to replacing the entire cluster with bigger servers.

> Financial
> transactions are a dangerous thing, you really do need ACID for
> fiduciary responsibility.

Cache' delivers a full ACID guarantee. I told you I wasn't talking about NoSQL/MongoDB.

> You are avoiding the topic, the "storage system," is separate from the
> implementation of the objects. The objects know how to serialize and
> restore themselves as well as upgrade. The storage and location of
> objects is not involved.

Of course I am. It's not relevant to the topic, which is the technical merits of object vs relational databases.

> That is not a "how," it is an adjective and a plural noun. One does not
> need to use relations in a database, but one has them if they need them.
> An RDBMS is a tool not some kind of mandate.

Then why bother with a relational database at all? The singular strength of a relational database is the relations between data. If you don't use relations then the relational database is the wrong tool for the job.

> Yes, ok, that is done with the XML/JSON class description. What's the
> problem?

The problem is that you're stuck with tables. You don't have an object. You have an object stored in a table. Even if it is a table with a single column and a single row, it's still a table.

> If I said the XML was stored in a binary polymorphic object file and it
> could be retrieved by its ID, would that make a difference? Because,
> that is exactly what is happening. For convenience, we call the
> polymorphic object file a "table."

Sure, that works. Again, why bother with a relational database if you want to short-circuit all of the relational functions? Which was my original point: why bother with inferior tools like relational databases when superior tools like object databases are available?

> Sorry, no. It is either a hash table, or they are hiding the index from
> you. Either way, it doesn't matter because databases have hash indexes.

Nope. Binary trees or multidimensional arrays. Typically, an object database doesn't cache index data which it doesn't have. It caches objects.

> And if you say that objects don't need that kind of indexing, then you
> miss the real power of database. If you have 8 million objects, say
> patients in a database. How do you find them by social security numbers?
> How about by last name? How about by symptoms?

You walk a balanced B-tree. The worst case for a search in a balanced tree is O(log n). Then the patient object is loaded into cache and data access times drop to O(1).

>> Better performance,
> How? Prove it.

O(log n) typical worst case for object searches vs. O(log n) typical best case for relational searches. In real applications object searches are 2-20 times faster than comparable relational searches.

>> greater scalability,
> How? Prove it.

Ameritrade.

>> faster deployment,
> How? Prove it

The VA Hospital's ahead-of-schedule and under-budget deployment.

>> easier
>> maintenance,
> How? Prove it

Admittedly it is company propaganda, but case studies from InterSystems' customers show that Cache' is easier and faster than Oracle for application development and support.

>> and typically at a lower cost for all of it.
> PostgreSQL is free. It doesn't get much lower in cost.

Hardware, sysadmins, DBAs, application developers, test teams. All these cost money. If you can deliver an application on leaner hardware then you reduce cost. If you can deliver it in less time then you reduce cost.

> No, I needed a DNS system that could replicate, allow user access,
> managed rights and privileges, etc. I could cobble something together, or
> use a package that worked out of the package. It was a no brainer.

I implemented something similar at a previous gig using shell scripts. It worked perfectly. It was a no-brainer. And that's still my own confirmation bias speaking.

> I do have some expertise in PostgreSQL, sure, but I always try to find
> the best tool for the task. I have used SQLite and I have done a fair
> amount of storage systems where an RDBMS is not appropriate.

Consider this for your next project: a relational database is never appropriate. Work from that. I'm certain that you will be surprised, in a good way, at what you discover.

-- 
Rich P.
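
For readers following the indexing argument, here is a rough sketch of the lookup pattern described above: an ordered index searched in O(log n), followed by cached object access. It is only an illustration, not how Cache', PostgreSQL, or any other product mentioned in the thread works internally. The names `Patient`, `PatientStore`, `find_by_ssn`, and `storage_reads` are all made up for the example, a sorted list searched with `bisect` stands in for a balanced tree, and a plain dict stands in for the object cache (O(1) on average for lookups by ID).

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass
class Patient:
    patient_id: int
    ssn: str
    last_name: str


class PatientStore:
    """Toy store: a sorted SSN index plus an in-memory object cache."""

    def __init__(self, patients):
        # Hypothetical stand-in for on-disk storage, keyed by object ID.
        self._storage = {p.patient_id: p for p in patients}
        # Sorted (ssn, patient_id) pairs stand in for a balanced tree index.
        self._ssn_index = sorted((p.ssn, p.patient_id) for p in patients)
        # Object cache: filled on first access, hit on later ones.
        self._cache = {}
        self.storage_reads = 0

    def get(self, patient_id):
        # Cache hit is an average O(1) dict lookup; a miss reads "storage".
        if patient_id not in self._cache:
            self.storage_reads += 1
            self._cache[patient_id] = self._storage[patient_id]
        return self._cache[patient_id]

    def find_by_ssn(self, ssn):
        # Binary search over the sorted index: O(log n) comparisons.
        i = bisect_left(self._ssn_index, (ssn,))
        if i < len(self._ssn_index) and self._ssn_index[i][0] == ssn:
            return self.get(self._ssn_index[i][1])
        return None
```

Calling `find_by_ssn` twice with the same number searches the index both times but touches storage only once. Finding patients by last name or by symptoms would need additional indexes built the same way, which is the point being contested in the thread.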