Incorporating Retroactive Operations in Large Temporal Databases Using Retroactive B-Tree

Document Type : Research Paper

Authors

1 Assistant Prof., Department of Computer Science and Engineering, JK Lakshmipat University – Jaipur, Rajasthan, India.

2 Assistant Prof., Department of Computer Science and Engineering, Jaypee University, Anoopshahr, UP, India.

3 Prof., Department of Computer Science and Engineering, Jaypee University, Anoopshahr, UP, India.

10.22059/jitm.2025.102923

Abstract

Temporal databases, quickly rising in size, are distinguished by their capacity to maintain the older version of data objects against actions on them, allowing logical deletions. Queries for historical data are particularly costly due to the linear scanning of temporal versions. Temporal data structures like time-split B-Tree or multiversion B-Tree are working underlying the state-of-the-art temporal databases. So far, most efficient temporal data structures are partially persistent or fully persistent, but none of them support retroactive queries. On the other hand, efficient temporal indexing is required to address bulk loading in a real-life application. To the best of our knowledge, there is no efficient solution for bulk loading and updating retroactive index structures. This article seeks to offer a new data structure, the Retroactive B-Tree (RBT), to facilitate retroactive operations in temporal databases as well as bulk loading. It presents theoretical and empirical research and analysis of the suggested data structure and its relevant operations. The experiments were conducted to demonstrate the performance of the proposed retroactive B-Tree in terms of execution time, I/O complexity, space complexity, and bulk loading. The obtained results show that indexing with a buffer is the most powerful model for existing temporal databases for implementing a retroactive B-Tree. The tree of lists architecture is observed as an I/O efficient data structure for all variants of temporal indexing for large databases.

Keywords


Achakeev, D., & Seeger, B. (2013). Efficient bulk updates on multiversion B-trees. Proceedings of the VLDB Endowment, 6(14), 1834–1845. https://dl.acm.org/doi/pdf/10.14778/2556549.2556566
Andersen, M. P., & Culler, D. E. (2016). BTrDB: Optimizing storage system design for timeseries processing. In 14th USENIX Conference on File and Storage Technologies (FAST 16) (pp. 39–52). https://dl.acm.org/doi/10.5555/2930583.2930587
Arge, L. (2003). The buffer tree: A technique for designing batched external data structures. Algorithmica, 37, 1–24. https://doi.org/10.1007/s00453-003-1021-x
Arge, L., Danner, A., & Teh, S. M. (2003). I/O-efficient point location using persistent B-trees. Journal of Experimental Algorithmics (JEA), 8, 1–2. https://dl.acm.org/doi/pdf/10.1145/996546.996549
Arge, L., Hinrichs, K. H., Vahrenhold, J., & Vitter, J. S. (2002). Efficient bulk operations on dynamic R-trees. Algorithmica, 33, 104–128. https://doi.org/10.1007/s00453-001-0107-6
Bayer, R., & Schkolnick, M. (1977). Concurrency of operations on B-trees. Acta Informatica, 9, 1–21. https://link.springer.com/article/10.1007/BF00263762
Berenson, H., Bernstein, P., Gray, J., Melton, J., O'Neil, E., & O'Neil, P. (2007). A critique of ANSI SQL isolation levels. arXiv preprint cs/0701157. https://arxiv.org/pdf/cs/0701157
Bertino, E., Ooi, B. C., Sacks-Davis, R., Tan, K. L., Zobel, J., Shidlovsky, B., & Andronico, D. (2012). Indexing techniques for advanced database systems (Vol. 8). Springer Science & Business Media.
Comer, D. (1979). Ubiquitous B-tree. ACM Computing Surveys (CSUR), 11(2), 121–137. https://dl.acm.org/doi/pdf/10.1145/356770.356776
Copeland, G. (1980, March). What if mass storage were free? In Proceedings of the Fifth Workshop on Computer Architecture for Non-Numeric Processing (pp. 1–7). https://dl.acm.org/doi/pdf/10.1145/800083.802685
Dignös, A., Böhlen, M. H., Gamper, J., & Jensen, C. S. (2016). Extending the kernel of relational DBMS with comprehensive support for sequenced temporal queries. ACM Transactions on Database Systems (TODS), 41(4), 1–46. https://dl.acm.org/doi/pdf/10.1145/2967608
Fekete, A., Liarokapis, D., O'Neil, E., O'Neil, P., & Shasha, D. (2005). Making snapshot isolation serializable. ACM Transactions on Database Systems (TODS), 30(2), 492–528. https://dl.acm.org/doi/pdf/10.1145/1071610.1071615
Goodrich, M. T., Tsay, J. J., Vengroff, D. E., & Vitter, J. S. (1993, November). External-memory computational geometry. In Proceedings of 1993 IEEE 34th Annual Foundations of Computer Science (pp. 714–723). IEEE. https://doi.org/10.1109/SFCS.1993.366816
Haapasalo, T. K., Jaluta, I. M., Sippu, S. S., & Soisalon-Soininen, E. O. (2008, October). Concurrency control and recovery for multiversion database structures. In Proceedings of the 2nd PhD Workshop on Information and Knowledge Management (pp. 73–80). https://dl.acm.org/doi/pdf/10.1145/1458550.1458563
Haapasalo, T., Jaluta, I., Seeger, B., Sippu, S., & Soisalon-Soininen, E. (2009, March). Transactions on the multiversion B+-tree. In Proceedings of the 12th International Conference on Extending Database Technology: Advances in Database Technology (pp. 1064–1075). https://dl.acm.org/doi/pdf/10.1145/1516360.1516482
Hadjieleftheriou, M., Kollios, G., Tsotras, V. J., & Gunopulos, D. (2006). Indexing spatiotemporal archives. The VLDB Journal, 15, 143–164. https://doi.org/10.1007/s00778-004-0151-3
Harshit, S., & Santosh, V. (2021). B-tree versus buffer tree: A review of I/O efficient algorithms. In Intelligent Systems: Proceedings of SCIS 2021 (pp. 417–425). https://link.springer.com/chapter/10.1007/978-981-16-2248-9_40
HBase. (2017). http://hbase.apache.org
Jaluta, I., Sippu, S., & Soisalon-Soininen, E. (2005). Concurrency control and recovery for balanced B-link trees. The VLDB Journal, 14, 257–277. https://doi.org/10.1007/s00778-004-0140-6
John, A., Sugumaran, M., & Rajesh, R. S. (2016). Indexing and query processing techniques in spatio-temporal data. ICTACT Journal on Soft Computing, 6(3), 1198–1217. https://doi.org/10.21917/ijsc.2016.0167
Kulkarni, K., & Michels, J. E. (2012). Temporal features in SQL: 2011. ACM SIGMOD Record, 41(3), 34–43. https://dl.acm.org/doi/pdf/10.1145/2380776.2380786
Lehman, P. L., & Yao, S. B. (1981). Efficient locking for concurrent operations on B-trees. ACM Transactions on Database Systems (TODS), 6(4), 650–670. https://dl.acm.org/doi/pdf/10.1145/319628.319663
Lomet, D., & Salzberg, B. (1989). Access methods for multiversion data. ACM SIGMOD Record, 18(2), 315–324. https://dl.acm.org/doi/pdf/10.1145/66926.66956
Lomet, D., Barga, R., Mokbel, M. F., Shegalov, G., Wang, R., & Zhu, Y. (2005, June). Immortal DB: Transaction time support for SQL Server. In Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data (pp. 939–941). https://dl.acm.org/doi/pdf/10.1145/1066157.1066295
Lomet, D., Hong, M., Nehme, R., & Zhang, R. (2008). Transaction time indexing with version compression. Proceedings of the VLDB Endowment, 1(1), 870–881. https://dl.acm.org/doi/pdf/10.14778/1453856.1453951
Ozsoyoglu, G., & Snodgrass, R. T. (1995). Temporal and real-time databases: A survey. IEEE Transactions on Knowledge and Data Engineering, 7(4), 513–532. https://doi.org/10.1109/69.404027
Pelkonen, T., Franklin, S., Teller, J., Cavallaro, P., Huang, Q., Meza, J., & Veeraraghavan, K. (2015). Gorilla: A fast, scalable, in-memory time series database. Proceedings of the VLDB Endowment, 8(12), 1816–1827. https://dl.acm.org/doi/pdf/10.14778/2824032.2824078
Rajamani, R. (2007). Oracle total recall/flashback data archive. Technical report, Oracle. https://forseurope.wordpress.com/wp-content/uploads/2010/01/flashback-data-archive-whitepaper.pdf
Salzberg, B., Jiang, L., Lomet, D., Barrena, M., Shan, J., & Kanoulas, E. (2004). A framework for access methods for versioned data. In Advances in Database Technology—EDBT 2004: 9th International Conference on Extending Database Technology (pp. 730–747). Springer. https://doi.org/10.1007/978-3-540-24741-8_42
Saracco, C. M., Nicola, M., & Gandhi, L. (2010). A matter of time: Temporal data management in DB2 for z. IBM Corporation, 7. https://cs.ulb.ac.be/public/_media/teaching/infoh415/a_matter_of_time.pdf
Spdx. (2020). GNU general public license v2.0: B+tree basics. https://github.com/torvalds/linux/blob/master/include/linux/btree.h
Taghva, M. R., Mansouri, T., Feizi, K., & Akhgar, B. (2016). Fraud detection in credit card transactions; using parallel processing of anomalies in big data. Journal of Information Technology Management, 8(3), 477–498. https://doi.org/10.22059/jitm.2016.57818
Tao, Y., & Papadias, D. (2001). The MV3R-tree: A spatio-temporal access method for timestamp and interval queries. In Proceedings of Very Large Data Bases Conference (VLDB), 11–14 September, Rome. https://hdl.handle.net/1783.1/168
Tian, R., Zhai, H., Zhang, W., Wang, F., & Guan, Y. (2022). A survey of spatio-temporal big data indexing methods in distributed environment. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 15, 4132–4155. https://doi.org/10.1109/JSTARS.2022.3175657
Wang, L., Cai, R., Fu, T. Z., He, J., Lu, Z., Winslett, M., & Zhang, Z. (2018, April). Waterwheel: Realtime indexing and temporal range query processing over massive data streams. In 2018 IEEE 34th International Conference on Data Engineering (ICDE) (pp. 269–280). IEEE. https://doi.org/10.1109/ICDE.2018.00033
Widmayer, P., Becker, B., Gschwind, D. I. S., Ohler, D. W. I. T., & Seeger, B. (1996). An asymptotically optimal multiversion B-tree. Very Large Data Bases Journal. Retrieved from https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=6f4cf2dd0d8af70af29ae857353badad12426183
Yang, F., Tschetter, E., Léauté, X., Ray, N., Merlino, G., & Ganguli, D. (2014, June). Druid: A real-time analytical data store. In Proceedings of the 2014 ACM SIGMOD international conference on Management of data (pp. 157–168). https://dl.acm.org/doi/pdf/10.1145/2588555.2595631
Zhang, R., & Stradling, M. (2010). The HV-tree: A memory hierarchy aware version index. Proceedings of the VLDB Endowment, 3(1-2), 397–408. https://dl.acm.org/doi/pdf/10.14778/1920841.1920894
Zhao, X., Lam, K. Y., Zhu, C., Chow, C. Y., & Kuo, T. W. (2021). MVLevelDB: Using log-structured tree to support temporal queries in IoT. IEEE Internet of Things Journal, 9(10), 7815–7825. https://doi.org/10.1109/JIOT.2021.3113994