EDW Concepts
A hash function is any function that can be used to map data of arbitrary size to fixed-size values.
A hash function is a mathematical function that converts an input value into another, compressed numerical value.
The input to the hash function can be of arbitrary length, but the output is always of fixed length.
Properties
Should be efficiently computable.
Should uniformly distribute the keys (each table position should be equally likely for each key).
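The two properties above can be checked with a small experiment. The following is a minimal Python sketch, not the hash used by any particular warehouse engine: it assumes SHA-256 from the standard library as the hash function and an illustrative bucket count of 60, and the names bucket_for and bucket_counts are hypothetical.

```python
import hashlib
from collections import Counter

NUM_BUCKETS = 60  # assumed bucket count, chosen only for illustration


def bucket_for(key: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map a key of arbitrary length to a bucket index in [0, num_buckets)."""
    # SHA-256 always yields a 32-byte digest, regardless of input length.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Read the first 8 bytes as an unsigned integer, then reduce it
    # to a table position.
    return int.from_bytes(digest[:8], "big") % num_buckets


def bucket_counts(keys, num_buckets: int = NUM_BUCKETS) -> Counter:
    """Count how many keys land in each bucket."""
    return Counter(bucket_for(key, num_buckets) for key in keys)


if __name__ == "__main__":
    counts = bucket_counts(f"customer-{i}" for i in range(60_000))
    print("buckets used:", len(counts))
    print("rows per bucket (min, max):", min(counts.values()), max(counts.values()))
```

With 60,000 distinct keys and 60 buckets, a well-behaved hash should place roughly 1,000 keys in each bucket. A key column with few distinct values, or one dominated by a single value, would skew these counts no matter how good the hash function is.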
ShuffleMoveOperation: Redistributes data from one distributed table to another distributed table, changing the distribution column.
PartitionMoveOperation: Data is moved from the distributions to the Control Node, usually for aggregations.
BroadcastMoveOperation: Used when a distributed table needs to become replicated for join compatibility.
TrimMoveOperation: Used when a replicated table needs to become distributed.
MoveOperation: Data is moved from the Control Node back to the Compute Nodes, resulting in a replicated table for further processing.
RoundRobinMoveOperation: Redistributes data to a round-robin table.
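To make two of these operations concrete, here is a toy Python simulation of a shuffle move and a broadcast move. It is only a sketch under stated assumptions: a table is modeled as a dictionary from distribution number to a list of rows, CRC32 stands in for the engine's real hash function, and the distribution count of 4 and all function names are hypothetical.

```python
import zlib
from collections import defaultdict

NUM_DISTRIBUTIONS = 4  # assumed small count so the output stays readable


def distribution_for(value, n=NUM_DISTRIBUTIONS):
    """Pick a distribution by hashing the value (CRC32 is a stand-in hash)."""
    return zlib.crc32(str(value).encode("utf-8")) % n


def distribute(rows, column, n=NUM_DISTRIBUTIONS):
    """Build a hash-distributed table keyed on `column`."""
    table = defaultdict(list)
    for row in rows:
        table[distribution_for(row[column], n)].append(row)
    return dict(table)


def shuffle_move(table, new_column, n=NUM_DISTRIBUTIONS):
    """Re-hash every row on `new_column`.

    Returns the new table and the number of rows that changed distribution.
    """
    new_table = defaultdict(list)
    moved = 0
    for source, rows in table.items():
        for row in rows:
            target = distribution_for(row[new_column], n)
            if target != source:
                moved += 1
            new_table[target].append(row)
    return dict(new_table), moved


def broadcast_move(table, n=NUM_DISTRIBUTIONS):
    """Give every distribution a full copy of all rows (a replicated table)."""
    all_rows = [row for rows in table.values() for row in rows]
    return {d: list(all_rows) for d in range(n)}


if __name__ == "__main__":
    orders = [{"order_id": i, "customer_id": i % 7} for i in range(100)]
    by_order = distribute(orders, "order_id")
    by_customer, moved = shuffle_move(by_order, "customer_id")
    print("rows moved by shuffle:", moved)
    print("rows per distribution after broadcast:",
          {d: len(r) for d, r in broadcast_move(by_order).items()})
```

After the shuffle, all rows that share a customer_id sit in the same distribution, which is what makes a join on that column possible without further movement. The broadcast copies every row to every distribution, so its cost grows with both table size and distribution count.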
Instead of storing an entire row or rows in a page, a columnstore index stores one column from many rows in that page. This difference in architecture gives the columnstore index a very high level of compression, a smaller storage footprint, and large improvements in read performance.
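The effect of the column layout on compression can be illustrated with a short Python sketch. It pivots a few rows into per-column lists and applies run-length encoding to one column. Run-length encoding is used here only as a simple stand-in and is not claimed to be the exact scheme a columnstore index applies. The sample data and the names to_columns and run_length_encode are hypothetical.

```python
from itertools import groupby


def to_columns(rows):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


if __name__ == "__main__":
    # Row layout: each record keeps all of its column values together.
    rows = [
        {"order_id": 1, "region": "EU", "amount": 10},
        {"order_id": 2, "region": "EU", "amount": 25},
        {"order_id": 3, "region": "EU", "amount": 7},
        {"order_id": 4, "region": "US", "amount": 12},
        {"order_id": 5, "region": "US", "amount": 30},
    ]
    # Column layout: each column's values are stored together.
    columns = to_columns(rows)
    print(columns["region"])
    print(run_length_encode(columns["region"]))
```

Because values in a single column share a type and often repeat, the five region values collapse to two pairs. A query that reads only the region column also never has to touch the order_id or amount values, which is the source of the read-performance gain.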