PERFORMANCE ANALYSIS OF WRITE OPERATIONS IN IDENTITY AND UUID ORDERED TABLES
PDF (Język Polski)

Keywords

database design
logical model
heap files
B tree
insert performance

Abstract

Design of the database includes the decision about the physical storage. This is often overlooked as 1) this cannot be expressed in standard SQL and in result each Database Systems have their own way to specify the physical storage and 2) the decision is often made implicitly. This is dangerous situation as many of the databases use B+ trees as table implementation which stores the data physically sorter by some ordering attribute. The choice of the ordering attribute largely affects read and write operations. Commonly, IDENTITY/AUTO_INCREMENT constraint are being chosen as ordering attributes, due to their easy usage and monotonic nature. In some cases ordering tables by the attributes whose values are drawn from uniform distribution leads to better performance in terms of Transactions-Per-Second. Such cases includes situation when data does fit entirely in-memory or when we can limit the set of physical pages being accessed. In the end, however, We cannot entirely say that either monotonic or random attributes are superior. Both have their pros and cons. In this article We present (1) short description of the data structures in contemporary Database Systems, (2) the advantages and the disadvantages of the two common types which are used as the clustering attributes: GUID and IDENTITY, (3) performance analysis of write operation which compare both data types using B+ tree as primary storage and (4) evaluate the efficiency of these bulk load operation using heap files and B+ trees.

https://doi.org/10.7862/re.2020.6
PDF (Język Polski)

References

[1] Ullman D.J., Widom J.: A First Course In Database Systems, Helion Publisher, pages 110-129, 1997
[2] Leach P., Mealling M., Salz R.: RFC 4122: A Universally Unique Identifier (UUID) URN Namespace, https://tools.ietf.org/html/rfc4122 (Access: 9 September 2018)
[3] Nilsson J.: The Cost of GUIDs as Primary Keys, http://www.informit.com/articles/article.aspx?p=25862 (Access: 9 September 2018)
[4] Clayton R.: Do you really need a UUID/GUID?, https://rclayton.silvrback.com/doyou-really-need-a-uuid-guid (Access: 9 September 2018)
[5] Ricken U.: GUID vs INT/IDENTITY als Clustered Key, https://www.dbberater.de/2015/04/guid-vs-intidentity-als-clustered-key-2/ (Access: 9 September 2018)
[6] Penn J.: Taking It Further: GUIDs vs INTs as Primary Keys, https://scifisql.com/ 2017/05/07/guids-vs-ints-as-primary-keys/, (Access: 9 September 2018)
[7] Boicea A., Bucur I., Radulescu F., Truica C.A.: Performance Evaluation for CRUDOperations in Asynchronously Replicated Document Oriented Database, 20th International Conference on Control Systems and Computer Science, Bucharest, 2015
[8] Li Y., Manoharan S.: A performance comparison of SQL and NoSQL databases, IEEE Pacific RIM Conference on Communications, Computers, and Signal Processing - Proceedings, 2013
[9] Elmasri R., Navathe S.: Fundamentals of Database Systems, Helion Publisher, pages 449 & 288-501, 2005
[10]Bača M., Grd P.: Analysis of B-tree data structure and its usage in computer forensics, Central European Conference on Information and Intelligent Systems, 2010
[11]Jhingran A., Khedkar P.: Analysis of Recovery in a Database System Using a Writeahead Log Protocol, Proceedings of the 1992 ACM SIGMOD International Conference on Management of Data, 1992
[12]Brown D.P., Richards A: Managing access to data in a multi-temperature database, US Patent US9015146B2, 2015-04-21
[13]Marquardt A.: Generating Globally Unique Identifiers for Use with MongoDB, https://www.mongodb.com/blog/post/generating-globally-unique-identifiers-foruse-with-mongodb (Access: 9 September 2018)