Real-time database
The term real-time database has two meanings. The most common use of the term refers to a database system which uses streaming technologies to handle workloads whose state is constantly changing.[1] This differs from traditional databases containing persistent data, mostly unaffected by time. When referring to streaming technologies, real-time processing means that a transaction is processed fast enough for the result to come back and be acted on right away.[2] Such real-time databases are useful, for example, in helping social media platforms remove fake news and in helping in-store surveillance cameras identify potential shoplifters by their behavior or movements.
The second meaning of the term “real-time database” adheres to a stricter definition of real-time consistent with Real-time computing. Hard real-time database systems work with a real-time operating system to ensure the temporal validity of data through the enforcement of database transaction deadlines and include a mechanism (such as transaction scheduling policies) to maximize the number of successfully committed transactions and minimize the number of rolled-back transactions. While the performance metric for most database systems is throughput or transactions-per-second, the performance metric of a hard real-time database system is the ratio of committed-to-aborted transactions. This ratio indicates how effective the transaction scheduling policy is, with the ultimate goal of meeting deadlines 100% of the time. Hard real-time databases, through enforcement of deadlines, may not allow transactions to be late (overrun the deadline).[3]
Overview
Real-time databases are traditional databases extended with additional capabilities so that they yield reliable responses. They use timing constraints that represent a certain range of values for which the data are valid. This range is called temporal validity. A conventional database cannot work under these circumstances because the inconsistencies between the real-world objects and the data that represent them are too severe for simple modifications. An effective system needs to be able to handle time-sensitive queries, return only temporally valid data, and support priority scheduling. To enter the data in the records, often a sensor or an input device monitors the state of the physical system and updates the database with new information to reflect the physical system more accurately.[4] When designing a real-time database system, one should consider how to represent valid time and how facts are associated with the real-time system. One should also consider how to represent attribute values in the database so that transaction processing does not violate data consistency.
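The idea of returning only temporally valid data can be illustrated with a minimal Python sketch. The names `SensorReading`, `validity_interval`, and `read_valid` are hypothetical and chosen only for illustration. The sketch assumes that validity is expressed as a maximum age in seconds since the reading was sampled.

```python
from dataclasses import dataclass


@dataclass
class SensorReading:
    """A stored value plus the window in which it is considered valid."""
    value: float
    sampled_at: float         # time (seconds) at which the sensor took the reading
    validity_interval: float  # assumed: how long (seconds) the reading stays valid

    def is_valid(self, now: float) -> bool:
        # Valid only while the age of the reading is within its validity interval.
        return 0 <= now - self.sampled_at <= self.validity_interval


def read_valid(store, key, now):
    """Return the stored value only if it is temporally valid, else None."""
    reading = store.get(key)
    if reading is None or not reading.is_valid(now):
        return None
    return reading.value


store = {"altitude": SensorReading(value=10500.0, sampled_at=100.0, validity_interval=2.0)}
fresh = read_valid(store, "altitude", now=101.5)  # 1.5 s old, still valid
stale = read_valid(store, "altitude", now=103.0)  # 3.0 s old, no longer valid
```

In this sketch a query never sees an expired value; a fuller system would also decide whether to wait for a sensor update or abort the requesting transaction.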
When designing a system, it is important to consider what the system should do when deadlines are not met.[5] For example, an air-traffic control system constantly monitors hundreds of aircraft, makes decisions about incoming flight paths, and determines the order in which aircraft should land based on data such as fuel, altitude, and speed. If any of this information is late, the result could be devastating. To address issues of obsolete data, timestamps can support transactions by providing clear time references.
Preserving data consistency
Although the real-time database system may seem like a simple system, problems arise during overload when two or more database transactions require access to the same portion of the database. A transaction is usually the result of an execution of a program that accesses or changes the contents of a database.[6] A transaction is different from a stream because a stream allows only read operations, whereas a transaction can both read and write. This means that in a stream, multiple users can read the same piece of data, but they cannot both modify it.[4] To preserve data consistency, a database must let only one transaction operate on a given piece of data at a time. For example, if two students try to take the last remaining spot in a section of a class and they hit submit at the same time, only one student should be able to register for it.[4]
Real-time databases can process these requests using scheduling algorithms for concurrency control, prioritizing both students’ requests in some way. This article assumes that the system has a single processor, a disk-based database, and a main memory pool.[7]
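The registration example above can be sketched in Python. This is only an illustration of mutual exclusion on one data item, not of any particular real-time concurrency control protocol. The `Section` class and the student names are hypothetical, and a `threading.Lock` stands in for the database's own locking.

```python
import threading


class Section:
    """A class section with a limited number of seats (hypothetical model)."""

    def __init__(self, seats_left):
        self.seats_left = seats_left
        self.enrolled = []
        self._lock = threading.Lock()

    def register(self, student):
        # The lock makes "check for a seat, then take it" one atomic step,
        # so two simultaneous requests cannot both claim the last seat.
        with self._lock:
            if self.seats_left == 0:
                return False
            self.seats_left -= 1
            self.enrolled.append(student)
            return True


section = Section(seats_left=1)
results = {}


def submit(student):
    results[student] = section.register(student)


threads = [threading.Thread(target=submit, args=(name,)) for name in ("alice", "bob")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whichever request acquires the lock first gets the seat; a real-time scheduler would additionally decide that order by priority rather than leaving it to thread timing.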
In real-time databases, deadlines are formed, and different kinds of systems respond in different ways to data that does not meet its deadline. In a real-time system, the timestamp of each transaction is used to schedule it.[4] A priority mapper unit assigns a level of importance to each transaction upon its arrival in the database system, depending on how the system views time and other priorities. The timestamp method relies on the arrival time in the system. Researchers indicate that for most studies, transactions are sporadic with unpredictable arrival times. For example, the system gives a higher priority to a request with an earlier deadline and a lower priority to one with a later deadline.[7] Below is a comparison of different scheduling algorithms.
- Earliest Deadline
- PT = DT: The value of a transaction is not important. An example is a group of people calling to order a product.
- Highest Value
- PT = 1/VT: The deadline is not important. Some transactions should get to the CPU based on criticality, not fairness. This is comparable to least slack, which favors the transactions that can wait the least amount of time. If the telephone switchboards were overloaded, people who call 911 should get priority.[8]
- Value inflated deadline
- PT = DT/VT: Gives equal weight to deadline and value. An example is registering for classes, where the student selects a block of classes that they wish to take and presses submit. In this scenario, higher priorities often take precedence. A school registration system probably uses this technique when the server receives two registration transactions. If one student had 22 credits and the other had 100 credits, the person with 100 credits would take priority (value-based scheduling).
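The three priority formulas above can be compared in a short Python sketch. It assumes that a smaller PT means the transaction runs sooner, which is consistent with PT = DT giving the earliest deadline first. The `Transaction` class and the sample numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    name: str
    deadline: float  # DT
    value: float     # VT, assumed to be greater than zero


def earliest_deadline(t):
    return t.deadline                # PT = DT


def highest_value(t):
    return 1.0 / t.value             # PT = 1/VT


def value_inflated_deadline(t):
    return t.deadline / t.value      # PT = DT/VT


def schedule(transactions, policy):
    # Assumption: the transaction with the smallest PT is served first.
    return [t.name for t in sorted(transactions, key=policy)]


txns = [
    Transaction("a", deadline=10.0, value=1.0),
    Transaction("b", deadline=20.0, value=5.0),
    Transaction("c", deadline=30.0, value=2.0),
]

edf_order = schedule(txns, earliest_deadline)        # ['a', 'b', 'c']
hv_order = schedule(txns, highest_value)             # ['b', 'c', 'a']
vid_order = schedule(txns, value_inflated_deadline)  # ['b', 'a', 'c']
```

The same three transactions are ordered differently under each policy, which is why the choice of priority mapping matters under overload.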
Timing constraints and deadlines
A system that correctly perceives the serialization and timing constraints associated with transactions with soft or firm deadlines takes advantage of absolute consistency.[9] Another way of keeping data consistent is to use relative constraints. Relative constraints ensure that transactions enter the system at the same time as the rest of the group with which the data transaction is associated. Using the mechanisms of absolute and relative constraints helps ensure the accuracy of data.
An additional way of dealing with conflict resolution in a real-time database system, besides deadlines, is a wait policy method. This process helps ensure the latest information in time-critical systems. The policy avoids conflict by asking all non-requesting blocks to wait until the most essential block of data is processed.[4] While studies in labs have found that data-deadline-based policies do not improve performance significantly, the forced wait policy can improve performance by 50 percent.[10] The forced wait policy may involve waiting for higher-priority transactions to process in order to prevent deadlock. Another case in which processing can be delayed is when a block of data is about to expire: the forced wait policy delays processing until the data is updated using new input data. This helps increase the accuracy of the system and can cut down on the number of processes that are aborted. Generally, relying on wait policies is not optimal.[11]
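The case of data that is about to expire can be sketched as a simple decision rule in Python. This is an assumed, simplified reading of the forced wait policy, not the exact rule from the cited studies. The function `planned_start` and its parameters are hypothetical, and the execution time is assumed to be known in advance.

```python
def planned_start(now, exec_time, data_expires_at, next_update_at):
    """Return the time at which the transaction should start (sketch only)."""
    would_finish_at = now + exec_time
    if would_finish_at > data_expires_at:
        # Assumed rule: the data would expire before the transaction finishes,
        # so wait for the next update instead of starting and being aborted.
        return max(now, next_update_at)
    # The data stays valid for the whole run, so start immediately.
    return now


wait_case = planned_start(now=0.0, exec_time=5.0, data_expires_at=3.0, next_update_at=4.0)
run_case = planned_start(now=0.0, exec_time=2.0, data_expires_at=3.0, next_update_at=4.0)
```

Here `wait_case` is delayed until the update at time 4.0, while `run_case` starts at once because it finishes before the data expires.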
The formation of deadlines also needs discussion. Deadlines are the constraints for soon-to-be replaced data accessed by the transaction. Deadlines can be either observant or predictive.[11] In an observant deadline system, all unfinished transactions are examined and the processor determines whether each has met its deadline.[4] Problems arise with this method because of variations in seek time, buffer management, and page faults.[12] A more stable way of organizing deadlines is the predictive method, which builds a candidate schedule and determines whether a transaction would miss its deadline under that schedule.[4]
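The difference between the observant and predictive methods can be shown with a small Python sketch. It assumes a single processor, known remaining execution times, and a candidate schedule ordered by earliest deadline. These are illustrative assumptions, and the names `Txn`, `observant_missed`, and `predictive_missed` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Txn:
    name: str
    deadline: float   # absolute deadline
    remaining: float  # assumed-known remaining execution time


def observant_missed(txns, now):
    """Observant check: which unfinished transactions are already late?"""
    return [t.name for t in txns if now > t.deadline]


def predictive_missed(txns, now):
    """Predictive check: build a candidate schedule and see who would be late."""
    finish = now
    late = []
    # Candidate schedule: run transactions back to back, earliest deadline first.
    for t in sorted(txns, key=lambda t: t.deadline):
        finish += t.remaining
        if finish > t.deadline:
            late.append(t.name)
    return late


txns = [
    Txn("t1", deadline=4.0, remaining=3.0),
    Txn("t2", deadline=5.0, remaining=3.0),
    Txn("t3", deadline=12.0, remaining=2.0),
]

seen_late_now = observant_missed(txns, now=0.0)      # nothing is late yet
predicted_late = predictive_missed(txns, now=0.0)    # t2 would finish at 6.0 > 5.0
```

At time 0 the observant check reports nothing, while the predictive check already flags `t2`, which gives the system time to react before the deadline passes.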
The type of response to a missed deadline depends on whether the deadline is hard, soft, or firm. Hard deadlines require that each data packet reach its destination before the packet has expired; if it does not, the process could be lost, causing a possible problem. Problems like these are not very common because complete knowledge of the system is required before deadlines are assigned, in order to determine the worst case. This is very hard to do, and if something unexpected happens to the system, such as a minute hardware glitch, it could throw the data off. For soft or firm deadlines, missing a deadline can lead to degraded performance but not a catastrophe.[7] A system with soft deadlines tries to meet as many deadlines as possible. However, no guarantee exists that the system can meet all deadlines. Should a transaction miss its deadline, the system has more flexibility and the transaction may increase in importance. Below is a description of these responses:
- Hard deadline
- If not meeting deadlines creates problems, a hard deadline is best. Transactions with hard deadlines are periodic, meaning that they enter the database in a regular rhythmic pattern. An example is data gathered by a sensor. These are often used in life-critical systems.[13]
- Firm deadline
- Firm deadlines appear to be similar to hard deadlines yet they differ from hard deadlines because firm deadlines measure how important it is to complete the transaction at some point after the transaction arrives. Sometimes completing a transaction after its deadline has expired may be harmful or not helpful, and both the firm and hard deadlines consider this. An example of a firm deadline is an autopilot system.[8]
- Soft deadline
- If meeting time constraints is desirable but missing deadlines does not cause serious damage, a soft deadline may be best. It operates on an aperiodic or irregular schedule. In fact, the arrival time of each task is unknown. An example is an operator switchboard for a telephone.[13]
Hard deadline processes abort transactions that have passed the deadline, improving the system by clearing out clutter that would otherwise need to be processed. Processes can clear out not only the transactions with expired deadlines but also transactions with the longest deadlines, assuming that they would be obsolete by the time they reached the processor. This means other transactions should be of higher priority. In addition, a system can remove the least critical transactions. For example, during a high-traffic course pre-selection period, a field in the database can become so busy with registration requests that it is unavailable for a while, and the result of a user's transaction is a display of the SQL query sent and a message saying that the data is currently unavailable. This error is caused by the checker, a mechanism that checks the condition of the rules, and the rule that occurred before it.[14]
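The clearing-out steps described above can be sketched in Python. The sketch covers two of them: aborting transactions whose deadlines have expired, then removing the least critical ones when more remain than the system can handle. The `capacity` limit, the `criticality` scale, and the names used are hypothetical assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class PendingTxn:
    name: str
    deadline: float   # absolute time by which the transaction must commit
    criticality: int  # assumed scale: larger means more critical


def shed_load(queue, now, capacity):
    """Return (kept, aborted) transaction names after overload shedding."""
    # Step 1: abort every transaction whose deadline has already passed.
    expired = [t for t in queue if t.deadline <= now]
    live = [t for t in queue if t.deadline > now]
    # Step 2: keep only the most critical transactions that fit the capacity.
    live = sorted(live, key=lambda t: t.criticality, reverse=True)
    kept, dropped = live[:capacity], live[capacity:]
    return [t.name for t in kept], [t.name for t in expired + dropped]


queue = [
    PendingTxn("a", deadline=5.0, criticality=3),
    PendingTxn("b", deadline=20.0, criticality=1),
    PendingTxn("c", deadline=15.0, criticality=2),
    PendingTxn("d", deadline=30.0, criticality=5),
]
kept, aborted = shed_load(queue, now=10.0, capacity=2)
```

With these sample values, `a` is aborted because its deadline has passed and `b` is removed as the least critical of the remaining transactions.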
The goal of scheduling periods and deadlines is to guarantee that update transactions complete before their deadlines while keeping the workload minimal. With large real-time databases, buffering functions can help improve performance tremendously. A buffer is part of the database that is stored in main memory to reduce transaction response time. In order to reduce disk input and output transactions, a certain number of buffers should be allocated.[15] Sometimes multiple versions are stored in buffers when the data block the transaction needs is currently in use. Later, the database has the data appended to it. Different strategies allocate buffers and must balance between taking an excessive amount of memory and keeping everything in one buffer that must then be searched. The goal is to eliminate search time and distribute the resources between buffer frames in order to access data quickly. A buffer manager is capable of allocating more memory, if necessary, to improve response time. The buffer manager can even determine whether a transaction that it holds should advance.[15]
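How a buffer in main memory reduces disk reads can be illustrated with a minimal Python sketch. The least-recently-used replacement rule is an assumption made for the example; the text above does not name a specific strategy. The `BufferPool` class and the `read_block` callback are hypothetical.

```python
from collections import OrderedDict


class BufferPool:
    """Fixed number of in-memory frames in front of a slower block store."""

    def __init__(self, frames, read_block):
        self.frames = frames          # how many blocks fit in main memory
        self.read_block = read_block  # hypothetical function that reads from disk
        self.cache = OrderedDict()    # block id -> data, oldest use first
        self.disk_reads = 0

    def get(self, block_id):
        if block_id in self.cache:
            # Hit: served from memory; mark the block as most recently used.
            self.cache.move_to_end(block_id)
            return self.cache[block_id]
        # Miss: go to disk, then keep the block for later transactions.
        self.disk_reads += 1
        data = self.read_block(block_id)
        self.cache[block_id] = data
        if len(self.cache) > self.frames:
            # Assumed policy: evict the least recently used block.
            self.cache.popitem(last=False)
        return data


pool = BufferPool(frames=2, read_block=lambda block_id: f"block-{block_id}")
for block_id in (1, 2, 1, 3, 2):
    pool.get(block_id)
```

Five requests cause four disk reads in this run, because the second request for block 1 is served from memory; a larger pool would turn the final request for block 2 into a hit as well, at the cost of more memory.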
Future database systems
Traditional databases are persistent but are incapable of dealing with dynamic data that constantly changes. Therefore, another system is needed. Real-time databases may be modified to improve accuracy and efficiency and to avoid conflict, by providing deadlines and wait periods to ensure temporal consistency. Real-time database systems offer a way of monitoring a physical system and representing it in data streams to a database. A data stream, like memory, fades over time. In order to guarantee that the freshest and most accurate information is recorded, there are a number of ways of checking transactions to make sure they are executed in the proper order. An online auction house provides an example of a rapidly changing database.
Now database systems are faster than they were in the past. In the future, we can look forward to even faster database systems. Although we have faster systems now, an effort to reduce misses and tardy times will still be beneficial. The ability to process results in a timely and predictable manner will always be more important than fast processing. Fast processing that is misapplied is not helpful for real-time database systems. Transactions that run faster still sometimes block in such a way that they have to be aborted and restarted. In fact, faster processing hurts some real-time applications because increased speed brings more complexity and more of a chance for problems caused by a variance of speed. Faster processing makes it harder to determine which deadlines have been met successfully. With future database systems running even faster than ever, there is a need to do more studies so we can continue to have efficient systems.[16]
The amount of research studying real-time database systems will increase because of commercial applications such as web based auction houses like eBay. More developing countries are expanding their phone systems, and the number of people with cell phones in the United States as well as other places in the world continues to grow. Also likely to spur real-time research is the exponentially increasing speed of the microprocessor. This also enables new technologies such as web-video conferencing and instant messenger conversations in sound and high-resolution video, which are reliant on real-time database systems. Studies of temporal consistency result in new protocols and timing constraints with the goal of handling real-time transactions more effectively.[7]
References
- ^ Buchmann, A. "Real Time Database Systems." Encyclopedia of Database Technologies and Applications. Ed. Laura C. Rivero, Jorge H. Doorn, and Viviana E. Ferraggine. Idea Group, 2005.
- ^ Carpron, H.L., J. A. Johnson. Computers: Tools for the Information Age. Prentice Hall, 1998. 5th ed.
- ^ "What is and what isn't a hard real-time database system?". db-engines.com. Retrieved 2023-03-17.
- ^ a b c d e f g Abbott, Robert K., and Hector Garcia-Molina. (1992). "Scheduling Real-Time Transactions: a Performance Evaluation" (PDF). ACM Transactions on Database Systems. 17 (3). Stanford University and Digital Equipment Corp. ACM: 513–560. doi:10.1145/132271.132276. S2CID 28960. Retrieved 13 December 2006.
- ^ "Real-Time Database Systems Aren't Actually Real-Time. Unless They Are". www.electronicdesign.com. Retrieved 2023-01-21.
- ^ Singhal, Mukesh. Approaches to Design of Real-Time Database Systems, SIGMOD Record, volume 17, no. 1, March 1988
- ^ a b c d Haritsa, J., J. Stankovic, and M Xiong. "A State-Conscious Concurrency Control Protocol for Replicated Real-Time Databases". University of Virginia. IEEE Real-Time Applications Symposium. Retrieved 13 December 2006.
- ^ a b (Snodgrass)
- ^ Lee, Juhnyoung (1994). "Concurrency Control Algorithms for Real-Time Database Systems". Diss. Univ. of Virginia. Retrieved 13 December 2006.
- ^ (Porkka)
- ^ a b Kang, K D., S Son, and J Stankovic. Specifying and Managing Quality of Real-Time Data Services. University of Virginia. IEEE TKDE, 2004.
- ^ Kao & Garcia-Molina 1994, pp. 261–282.
- ^ a b Stankovic, John A., Marco Spuri, Krithi Ramamritham, and Giorgio C. Buttazzo. Deadline Scheduling for Real-Time Systems: EDF and Related Algorithms. Springer, 1998.
- ^ (Ramamritham)
- ^ a b (O'Neil)
- ^ Lam, Kam-Yiu, and Tei-Wei Kuo. Real-Time Database Systems: Architecture and Techniques. Springer, 2001.
Further reading
- Ozsoyoglu, Gultekin, and Richard T. Snodgrass. Temporal and Real-Time Databases: a Survey. Knowledge and Data Engineering, 1995. 13 Dec. 2006.
- Kao, Ben; Garcia-Molina, Hector (1994). "An Overview of Real-Time Database Systems". Real Time Computing. Berlin, Heidelberg: Springer Berlin Heidelberg. doi:10.1007/978-3-642-88049-0_13. ISBN 978-3-642-88051-3. ISSN 0258-1248.
- Lindstrom, Jan. Real Time Database Systems. Solid, 2008. March 25, 2008
- Sivasankaran, Rajendran M., John A. Stankovic, Don Towsley, Bhaskar Purimetla, and Krithi Ramamritham. Priority Assignment in Real-Time Active Databases. University of Massachusetts. Amherst, MA, 1996. 13 Dec. 2006.
- Stonebraker, Michael, et al. HStore: A High Performance, Distributed Main Memory Transaction Processing System, 2008.