On 08/02/2012 08:44 PM, Richard Pieri wrote:
> On 8/2/2012 4:36 PM, Mark Woodward wrote:
>> Not to be snide, but 8 million is not a big number.
>
> That's 8 million patients. Multiply that by everything that the VA
> has on each and every one of them and you get a very large data set.
>
> It's not the largest data set that I'm aware of. The largest is the
> data out of the LHC which is around 200 petabytes. CERN went the
> other way. They started with an object database but eventually
> dropped it due to poor market development of OODBMSs. They currently
> use relational databases for storing and retrieving metadata. Bulk
> data is stored in flat files.
>
>> Well, "billions" of transactions per day should be doable in a cluster.
>
> That's what Ameritrade and Oracle thought but they couldn't make it work.
>
>> If your oracle database is crashing, it is misconfigured.
>
> The Oracle techs working with Ameritrade couldn't keep the cluster
> going. They eventually gave up when Ameritrade wouldn't commit to
> replacing the entire cluster with bigger servers.
>
>> Financial
>> transactions are a dangerous thing, you really do need ACID for
>> fiduciary responsibility.
>
> Cache' delivers full ACID guarantee. I told you I wasn't talking
> about NoSQL/MongoDB.
>
>> You are avoiding the topic, the "storage system," is separate from the
>> implementation of the objects. The objects know how to serialize and
>> restore themselves as well as upgrade. The storage and location of
>> objects is not involved.
>
> Of course I am. It's not relevant to the topic, which is the
> technical merits of object vs relational databases.
>
>> That is not a "how," it is an adjective and a plural noun. One does not
>> need to use relations in a database, but one has them if they need them.
>> An RDBMS is a tool not some kind of mandate.
>
> Then why bother with a relational database at all? The singular
> strength of a relational database is the relations between data. If
> you don't use relations then the relational database is the wrong tool
> for the job.
>
>> Yes, ok, that is done with the XML/JSON class description. What's the
>> problem?
>
> The problem is that you're stuck with tables. You don't have an
> object. You have an object stored in a table. Even if it is a table
> with a single column and a single row it's still a table.
>
>> If I said the XML was stored in a binary polymorphic object file and it
>> could be retrieved by its ID, would that make a difference? Because,
>> that is exactly what is happening. For convenience, we call the
>> polymorphic object file a "table."
>
> Sure, that works. Again, why bother with a relational database if you
> want to short-circuit all of the relational functions? Which was my
> original point: why bother with inferior tools like relational
> databases when superior tools like object databases are available?
>
>> Sorry, no. It is either a hash table, or they are hiding the index from
>> you. Either way, it doesn't matter because databases have hash indexes.
>
> Nope. Binary trees or multidimensional arrays. Typically, an object
> database doesn't cache index data which it doesn't have. It caches
> objects.
>
>> And if you say that objects don't need that kind of indexing, then you
>> miss the real power of database. If you have 8 million objects, say
>> patients in a database. How do you find them by social security numbers?
>> How about by last name? How about by symptoms?
>
> You walk a balanced b-tree. The worst case for a binary tree search
> is O(log n). Then the patient object is loaded into cache and data
> access times drop to O(1).
>
>>> Better performance,
>> How? Prove it.
>
> O(log n) typical worst case for object searches vs. O(log n) typical
> best case for relational searches. In real applications object
> searches are 2-20 times faster than comparable relational searches.
>
>>> greater scalability,
>> How? Prove it.
>
> Ameritrade.
>
>>> faster deployment,
>> How? Prove it
>
> The VA Hospital's ahead of schedule and under budget deployment.
>
>>> easier
>>> maintenance,
>> How? Prove it
>
> Admittedly it is company propaganda, but case studies from
> InterSystems' customers show that Cache' is easier and faster than
> Oracle for application development and support.
>
>>> and typically at a lower cost for all of it.
>> PostgreSQL is free. It doesn't get much lower in cost.
>
> Hardware, sysadmins, DBAs, application developers, test teams. All
> these cost money. If you can deliver an application on leaner
> hardware then you reduce cost. If you can deliver it in less time
> then you reduce cost.
>
>> No, I needed a DNS system that could replicate, allow user access,
>> managed rights and privileges, etc. I could cobble something together, or
>> use a package that worked out of the package. It was a no brainer.
>
> I implemented something similar at a previous gig using shell scripts.
> It worked perfectly. It was a no-brainer. And that's still my own
> confirmation bias speaking.
>
>> I do have some expertise in PostgreSQL, sure, but I always try to find
>> the best tool for the task. I have used SQLite and I have done a fair
>> amount of storage systems where an RDBMS is not appropriate.
>
> Consider this for your next project: a relational database is never
> appropriate. Work from that. I'm certain that you will be surprised,
> in a good way, at what you discover.
>
Just one issue. A binary tree and a b-tree are two different things.
Some file systems are based on b-tree technology.

-- 
Jerry Feldman <gaf at blu.org>
Boston Linux and Unix
PGP key id:3BC1EB90
PGP Key fingerprint: 49E2 C52A FC5A A31F 8D66 C0AF 7CEA 30FC 3BC1 EB90
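To make the binary tree vs. b-tree distinction concrete, here is a rough sketch that is not from the thread. It counts the fewest levels a search tree needs to hold the 8 million keys mentioned above, once with two children per node (a binary tree) and once with many children per node (a b-tree). The function name min_levels and the fan-out of 256 are assumptions picked for illustration; real b-tree fan-out depends on page and key size.

```python
def min_levels(n_keys: int, fanout: int) -> int:
    """Fewest levels a search tree needs to hold n_keys when every node
    has up to `fanout` children and therefore up to fanout - 1 keys."""
    levels = 0
    capacity = 0  # keys held by a completely full tree with `levels` levels
    while capacity < n_keys:
        levels += 1
        capacity = fanout ** levels - 1
    return levels


N = 8_000_000  # the patient count discussed in the thread

# Balanced binary tree: one key and at most two children per node.
binary_levels = min_levels(N, 2)

# B-tree with an assumed fan-out of 256 children per node.
btree_levels = min_levels(N, 256)

print(f"binary tree: {binary_levels} levels, b-tree: {btree_levels} levels")
```

Both structures search in O(log n), but the base of the logarithm differs. Under these assumptions the b-tree touches far fewer nodes per lookup, which matters when each node is a disk block, and that is one reason file systems and database indexes tend to use b-trees.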