BLU Discuss list archive
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Discuss] No-SQL Database Recommendation?
- Subject: [Discuss] No-SQL Database Recommendation?
- From: johnhall2.0 at gmail.com (John Hall)
- Date: Sun, 11 Jan 2015 21:12:37 -0500
- In-reply-to: <54B2FFFB.80704@borg.org>
- References: <54B188FE.4040409@borg.org> <16cdf2d0d2b7662c18dd50d7ad16c10b.squirrel@mail.mohawksoft.com> <54B1E0A4.7000509@gmail.com> <a3654c774b10dbf5631904797d67f2b3.squirrel@mail.mohawksoft.com> <54B288AE.1080707@gmail.com> <45fbd5037ac2cd283d17513f0d0382bf.squirrel@mail.mohawksoft.com> <54B2D86B.2080908@gmail.com> <ff8850f63e46727fc3733373d9e5a2f0.squirrel@mail.mohawksoft.com> <54B2FA0E.2080808@gmail.com> <54B2FFFB.80704@borg.org>
Kent, I'm not convinced you need a database. Writing your own simple persistence layer can be really easy and reliable with Python. If your critical queries have predicable parameters you could cache the results periodically, or keep these results updated as data is recorded, and record computations in a separate data series. You could use shelve to persist a python dictionary to files for stored query results and and csv to record data. Prehaps write your data to a file series with datetime encoded in file and directory names. Shelve might not work so well when you need to support multiple processes. Redis is a good way to create a shared memory space between several processes. ?see:? https://docs.python.org/3.4/library/shelve.html https://docs.python.org/3.4/library/csv.html http://try.redis.io A shared directory of CSV files can be a decent interface for users. If your data consists of a time series of numbers flat files can be very reliable and user friendly. It's easy for users to load the data into a spreadsheet, SageMath, or R to work with data on a workstation, and CSV files are easy to migrate to a database with a script at a later time. This sort of solution might only be practical if you can also predict and store the result of the time-critical queries. How precise are times stored, and queried? Is time in the input data or is it the time data was reported? Is this a new solution or a replacement for something already working? If queries will always be by period then you could load the data for the period with either date-time range as part of the file and path names, or names with serialized integers and keep a file that maps dates/times to the sequence. If on the other hand you want to ask something like what is the highest value recorded for CPU core temp on a Tuesday among the workstations on the left side of the lab in the last five months then you'd best go with the database. I'm curious about what the intended application is, the OS distribution you are using, hardware and other loads on that hardware. I'm concerned that your Mongo issues might be due to some system bottleneck that you might run into with another solution as well, anyway, try using "nice" to prioritize your processes and "ionice" to make sure background processes and cron jobs don't dominate your disk io. What is your interface for data collection/ input. Is this automatically being measured and reported by a hardware system? Do you have that set up already. If you want to make this a web app I'd recommend Flask if you are going to go with flat files or Django if you want to have a database or need to create accounts for users and manage access rights and sessions. If you decide you need a database you don't have to write sql if you don't want to.. I assume you've heard about SQLAlchemy or the Django Model classes. These are both object-relational mappers and completely automate database definition. They allow you to just write Python. On the other hand learning them may be more complicated then learning just enough SQL and the python database api. It seems that you may have an exaggerated sense of what is involved with SQL. It is meant to be understandable to business people and database management systems allow you to leverage highly optimized low level code and efficient data structures and algorithms. You definitely don't need to write any stored procedures. You can learn SQL by experimenting with a command line client or play with it from iPython. The Python db-api makes it easy to execute a query you have in a string (usually triple quoted in your code, or loaded from a .sql file. You pass the query and dict for substitution parameters when you want to execute the query, then process query results with a for-in loop as a list of objects/dicts. If you only have one flat table it's just one "Create table" query/command to make the table. I'd start by playing with SQL in ipython using sqllite then go for configuring PostgresSql. https://docs.python.org/2/library/sqlite3.html To get just the first few with SQL use a LIMIT clause on your queries: http://www.postgresql.org/docs/9.4/static/queries-limit.html I agree with others that if you need a database PostgreSQL would be a fine solution and with Richard Pieri's original litmus tests on if you should consider a nosql database. If system resources like ram are short then consider staying with sqlite and write a single daemon process to handle writes to the data. That will be fairly reliable and add much less overhead than PostgreSQL. You can have multiple processes query the data but you'd want just one process to write. This would be similar for the csv solution. Cheers + + + + + + + John Hall On Sun, Jan 11, 2015 at 5:58 PM, Kent Borg <kentborg at borg.org> wrote: > On 01/11/2015 05:32 PM, Richard Pieri wrote: > >> If by "No-SQL" you mean "does not use SQL" >> > > Just that I dislike SQL, I don't want to have to maintain a separate code > base of stored procedures, I don't want to have to use Python to assemble > SQL on the fly either. > > I would like something that has a nice Python API. But if I can't I can't. > If the no-SQL part has to be discarded, so be it. > > I don't need relational features, but I don't mind if my database supports > them. There is nothing in my data that needs a free-format key/value > features. Maybe there is a nice Python interface to PostgreSQL that hides > SQL bits, but I doubt it. > > I support the idea that irrational prejudices are irrational, but I also > support the idea that aesthetically pleasing tools are good, make for > better results, etc. > > -kb > > > _______________________________________________ > Discuss mailing list > Discuss at blu.org > http://lists.blu.org/mailman/listinfo/discuss >
- Follow-Ups:
- [Discuss] No-SQL Database Recommendation?
- From: kentborg at borg.org (Kent Borg)
- [Discuss] No-SQL Database Recommendation?
- From: sweetser at alum.mit.edu (Doug)
- [Discuss] No-SQL Database Recommendation?
- References:
- [Discuss] No-SQL Database Recommendation?
- From: kentborg at borg.org (Kent Borg)
- [Discuss] No-SQL Database Recommendation?
- From: markw at mohawksoft.com (markw at mohawksoft.com)
- [Discuss] No-SQL Database Recommendation?
- From: richard.pieri at gmail.com (Richard Pieri)
- [Discuss] No-SQL Database Recommendation?
- From: markw at mohawksoft.com (markw at mohawksoft.com)
- [Discuss] No-SQL Database Recommendation?
- From: richard.pieri at gmail.com (Richard Pieri)
- [Discuss] No-SQL Database Recommendation?
- From: markw at mohawksoft.com (markw at mohawksoft.com)
- [Discuss] No-SQL Database Recommendation?
- From: richard.pieri at gmail.com (Richard Pieri)
- [Discuss] No-SQL Database Recommendation?
- From: markw at mohawksoft.com (markw at mohawksoft.com)
- [Discuss] No-SQL Database Recommendation?
- From: richard.pieri at gmail.com (Richard Pieri)
- [Discuss] No-SQL Database Recommendation?
- From: kentborg at borg.org (Kent Borg)
- [Discuss] No-SQL Database Recommendation?
- Prev by Date: [Discuss] No-SQL Database Recommendation?
- Next by Date: [Discuss] No-SQL Database Recommendation?
- Previous by thread: [Discuss] No-SQL Database Recommendation?
- Next by thread: [Discuss] No-SQL Database Recommendation?
- Index(es):