Figure 1 Compact row format data
Posted: Thu Jan 23, 2025 9:20 am
In addition, the TaurusDB field compression feature provides automatic compression: it automatically adds the compression attribute to columns in user tables that meet the type and length thresholds, which makes the feature more convenient to use.
In benchmark tests, enabling the field compression feature has no performance impact when the workload does not touch compressed fields; when compressed fields are involved, the performance loss is usually within 10%. The ratio of data size before compression to data size after compression can reach 1.8 or higher, so a small performance cost buys a significant reduction in storage cost, balancing economic benefit and system efficiency.
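The automatic rule described above can be pictured as a simple eligibility check. The following is a minimal sketch only: the set of eligible types, the threshold value, and the function name are all hypothetical and are not taken from TaurusDB.

```python
# Hypothetical eligibility rule for automatic field compression.
# ELIGIBLE_TYPES and LENGTH_THRESHOLD are illustrative assumptions,
# not TaurusDB's actual settings.
ELIGIBLE_TYPES = {"VARCHAR", "TEXT", "BLOB"}
LENGTH_THRESHOLD = 100  # assumed minimum declared length, in bytes


def should_auto_compress(column_type: str, declared_length: int) -> bool:
    """Return True if a column meets both the type and the length threshold."""
    return (
        column_type.upper() in ELIGIBLE_TYPES
        and declared_length >= LENGTH_THRESHOLD
    )
```

Under these assumed settings, a `VARCHAR(255)` column would be marked for compression, while a `VARCHAR(20)` or an `INT` column would be left unchanged.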
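To make the 1.8 figure concrete, the short calculation below converts a compression ratio into the fraction of storage saved. It is plain arithmetic on the ratio quoted above; the function name is only illustrative.

```python
def storage_saved_fraction(compression_ratio: float) -> float:
    """Fraction of storage saved, where ratio = size before / size after."""
    if compression_ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return 1.0 - 1.0 / compression_ratio


# A ratio of 1.8 means the compressed data takes 1/1.8 of the original size.
saved = storage_saved_fraction(1.8)
print(f"{saved:.1%} of the original storage is saved")
```

At a ratio of 1.8 the compressed data occupies about 56% of the original size, so roughly 44% of the storage is saved.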
3. Implementation principle
At the storage level, the TaurusDB field compression feature stores data in either a compressed or an uncompressed format depending on the situation, achieving efficient data compression and decompression.
Figure 1 shows the compact row format at the storage engine layer. For variable-length data types such as VARCHAR, the system needs to store not only the actual data of the field but also its length (that is, the number of bytes it occupies).
The field compression feature implemented by TaurusDB adds information representing the compression attributes to each column's data. For columns that do not use field compression, the values remain in the original format; for columns that use it, the column data values shown above take one of the following two formats.
The first is the compressed format, as shown in Figure 2:
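The idea of storing a length next to variable-length data can be sketched as follows. This is a simplified illustration, not the real on-disk layout: the fixed 2-byte big-endian length prefix and the function names are assumptions made for the example.

```python
import struct


def encode_varlen(value: bytes) -> bytes:
    """Store a variable-length value as [2-byte length][data] (assumed layout)."""
    if len(value) > 0xFFFF:
        raise ValueError("value too long for a 2-byte length prefix")
    return struct.pack(">H", len(value)) + value


def decode_varlen(buf: bytes, offset: int = 0) -> tuple[bytes, int]:
    """Read one value starting at offset; return (value, offset of next field)."""
    (length,) = struct.unpack_from(">H", buf, offset)
    start = offset + 2
    end = start + length
    if end > len(buf):
        raise ValueError("buffer is shorter than the recorded length")
    return buf[start:end], end
```

Because each value carries its own length, a reader can walk consecutive fields in a row without any delimiter between them.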
Figure 2 Compression format in field compression
It contains the Compress Header, Uncompressed Data Len, and Compressed Data fields, whose functions are as follows:
Compress Header: Saves metadata such as whether compression has been performed and the compression algorithm used.
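The three-part compressed format can be sketched as below. This is a minimal illustration under stated assumptions: the 1-byte header with a compressed flag and an algorithm id, the 4-byte length field, the constant values, and the use of zlib are all hypothetical choices for the example and do not describe TaurusDB's actual byte layout.

```python
import struct
import zlib

# Hypothetical header encoding: the high bit marks "compressed",
# the low 7 bits carry an algorithm id.
FLAG_COMPRESSED = 0x80
ALGO_ZLIB = 0x01
HEADER_FMT = ">BI"  # 1-byte Compress Header + 4-byte Uncompressed Data Len
HEADER_SIZE = struct.calcsize(HEADER_FMT)


def pack_compressed(value: bytes) -> bytes:
    """Build [Compress Header][Uncompressed Data Len][Compressed Data]."""
    header = FLAG_COMPRESSED | ALGO_ZLIB
    return struct.pack(HEADER_FMT, header, len(value)) + zlib.compress(value)


def unpack_compressed(field: bytes) -> bytes:
    """Check the header, decompress, and verify the recorded length."""
    header, uncompressed_len = struct.unpack_from(HEADER_FMT, field, 0)
    if not header & FLAG_COMPRESSED:
        raise ValueError("field is not marked as compressed")
    if header & 0x7F != ALGO_ZLIB:
        raise ValueError("unknown compression algorithm id")
    data = zlib.decompress(field[HEADER_SIZE:])
    if len(data) != uncompressed_len:
        raise ValueError("decompressed size does not match the header")
    return data
```

Recording the uncompressed length lets a reader size the output buffer before decompressing and check the result afterwards, which is one plausible reason for carrying that field.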