CREATE TABLE [table_option ...] name '(' [create_definition, ...] ')'
CREATE TABLE name LIKE example_name
table_option:
MAX_VERSIONS '=' int
| TTL '=' duration
| IN_MEMORY
| BLOCKSIZE '=' int
| COMPRESSOR '=' compressor_spec
create_definition:
column_family_name [MAX_VERSIONS '=' int] [TTL '=' duration]
| ACCESS GROUP name [access_group_option ...]
['(' [column_family_name, ...] ')']
duration:
int MONTHS
| int WEEKS
| int DAYS
| int HOURS
| int MINUTES
| int [ SECONDS ]
access_group_option:
IN_MEMORY
| BLOCKSIZE '=' int
| COMPRESSOR '=' compressor_spec
| BLOOMFILTER '=' bloom_filter_spec
compressor_spec:
bmz [ bmz_options ]
| lzo
| quicklz
| zlib [ zlib_options ]
| none
bmz_options:
--fp-len int
| --offset int
zlib_options:
-9
| --best
| --normal
bloom_filter_spec:
rows [ bloom_filter_options ]
| rows+cols [ bloom_filter_options ]
| none
bloom_filter_options:
--false-positive float
--max-approx-items int
CREATE TABLE creates a table with the given name. A table consists of a set
of column family and access group specifications.
Column families are somewhat analogous to a traditional database column. The main difference is that a theoretically infinite number of qualified columns can be created within a column family. The qualifier is an optional NUL- terminated string that can be supplied, along with the data, in the insert statement. This is what gives tables in Hypertable their sparse nature. For example, given a column family "tag", the following set of qualified columns may be inserted for a single row.
tag:goodtag:sciencetag:authoritativetag:greenThe column family is represented internally as a single byte, so there is a
limit of 255 column families (the 0 value is reserved) which may be supplied in
the CREATE TABLE statement.
Tables consist of one or more access groups, each containing some number of column families. There is a default access group named "default" which contains all column families that are not explicitly referenced in an ACCESS GROUP clause. For example, the following two statements are equivalent.
|
|
Access groups provide control over the physical layout of the table data on disk. The data for all column families in the same access group are stored physically together on disk. By carefully defining a set of access groups and choosing which column families go into those access groups, performance can be significantly improved for expected workloads. For example, say you have a table with 100 column families, but two of the column families get access together with much higher frequency than the rest of the 98 column families. By putting the two frequently accessed column families in their own access group, the system does much less disk i/o because only the data for the two column families gets transfered whenever those column families are accessed. A row-oriented database can be emulated by having a single access group. A column-oriented database can be emulated by having each column family within their own access group.
The following table options are supported:
MAX_VERSIONS '=' intTTL '=' durationIN_MEMORYBLOCKSIZE '=' intCOMPRESSOR '=' compressor_specThese are the same options as the ones in the column family and access group specification except that they act as defaults in the case where no corresponding option is specified in the column family or access group specifier. See the description under Access Group Options for option details.
The following column family options are supported:
MAX_VERSIONS '=' intTTL '=' durationCells in a table are specified by not only a row key and a qualified column,
but also a timestamp. This allows for essentially multiple timestamped version
s of each cell. Cells are kept stored in reverse chronological order of
timestamp and the MAX_VERSIONS allows you to specify that you only want to
keep the n most recent versions of each cell. Older versions are lazily
garbage collected through the normal compaction process.
The TTL option allows you to specify that you only want to keep cell versions
that fall within some time window in the immediate past. For example, you can
specify that you only want to keep cells that were created within the past two
weeks. Like the MAX_VERSIONS option, older versions are lazily garbage
collected through the normal compaction process.
The following access group options are supported:
IN_MEMORYBLOCKSIZE '=' intCOMPRESSOR '=' compressor_specBLOOMFILTER '=' bloom_filter_specThe IN_MEMORY option indicates that all cell data for the access group should
remain memory resident. Queries against column families in IN_MEMORY access
groups are efficient because no disk access is required.
The cell data inserted into an access group resides in one of two places. The recently inserted cells are stored in an in-memory data structure called the cell cache and older cells get compacted into on-disk data structures called cell stores. The cell stores are organized as a series of compressed blocks of sorted key/value pairs (cells). At the end of the compressed blocks is a block index which contains, for each block, the key (row,column,timestamp) of the last cell in the block, followed by the block offset. It also contains a Bloom Filter.
The BLOCKSIZE option controls the size of the compressed blocks in the cell
stores. A smaller block size minimizes the amount of data that must be read
from disk and decompressed for a key lookup at the expense of a larger block
index which consumes memory. The default value for the block size is 65K.
The COMPRESSOR option specifies the compression codec that should be used for
cell store blocks within an access group. See the Compressors section below
for a description of each compression codec.
NOTE: if the block, after compression, is not significantly reduced in size, then no compression will be performed on the block
An access group can consist of many on-disk cell stores. A query for a single row key can result probing each cell store to see if data is present for that row even when most of the cell stores do not contain any data for that row. To eliminate this inefficiency, each cell store contains an optional Bloom Filter. The Bloom Filter is a probabilistic data structure that can indicate, with high probability, if a key is present and also indicate definitively if a key is not present. By mapping the bloom filters, for each cell store in memory, queries can be made much more efficient because only the cell stores that contain the row are searched.
The bloom filter specification can take one of the following forms. The rows
form, which is the default, causes only row keys to be inserted into the bloom
filter. The rows+cols form causes the row key concatenated with the column
family to be inserted into the bloom filter. none disables the bloom
filter.
rows [ bloom_filter_options ]rows+cols [ bloom_filter_options ]noneThe following table describes the bloom filter options:
| Option | Default | Description |
|---|---|---|
--false-positive arg |
0.01 |
Expected false positive probability |
--max-approx-items arg |
1000 |
Number of cell store items used to guess the number of actual Bloom filter entries |
The cell store blocks within an access group are compressed using the compression codec that is specified for the access group. The following compression codecs are available:
bmzlzoquicklzzlibnoneThe default code is lzo for cell store blocks. The following tables describe
the available options.
| Option | Default | Description |
|---|---|---|
--fp-len arg |
19 |
Minimum fingerprint length |
--offset arg |
0 |
Starting fingerprint offset |
| Option | Description |
|---|---|
-9 [ --best ] |
Highest compression ratio (at the cost of speed) |
--normal |
Normal compression ratio |