Spark SQL: DISTRIBUTE BY Clause

1. Description
The DISTRIBUTE BY clause repartitions the data based on the input expressions. Unlike the CLUSTER BY clause, it does not sort the data within each partition.
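To make the contrast with CLUSTER BY concrete, the following sketch places the two clauses side by side. It assumes a table person(name STRING, age INT) like the one created in the examples section of this article. In Spark SQL, CLUSTER BY is semantically equivalent to DISTRIBUTE BY followed by SORT BY on the same expressions.

```sql
-- Assumption: a table person(name STRING, age INT) already exists.

-- Repartition only: rows with the same age go to the same partition,
-- but the order of rows inside each partition is unspecified.
SELECT age, name FROM person DISTRIBUTE BY age;

-- Repartition, then sort rows within each partition by age.
SELECT age, name FROM person DISTRIBUTE BY age SORT BY age;

-- Same effect as the previous query, written as a single clause.
SELECT age, name FROM person CLUSTER BY age;
```

None of these queries guarantees a total order across the whole result. That requires ORDER BY.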
2. Syntax
DISTRIBUTE BY { expression [ , ... ] }

3. Parameters
expression: a combination of one or more values, operators, and SQL functions that results in a value.
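As a rough illustration of what the expression parameter accepts, the queries below distribute by a plain column, by several expressions, and by computed values. They assume the person(name STRING, age INT) table used in the examples section. The specific expressions are illustrative choices, not taken from the original article.

```sql
-- Assumption: a table person(name STRING, age INT) already exists.

-- A single column reference.
SELECT age, name FROM person DISTRIBUTE BY age;

-- Several expressions: rows that agree on both age and name share a partition.
SELECT age, name FROM person DISTRIBUTE BY age, name;

-- An expression built with an operator: rows are grouped by the parity of age.
SELECT age, name FROM person DISTRIBUTE BY age % 2;

-- An expression built with a SQL function: rows are grouped by the first letter of name.
SELECT age, name FROM person DISTRIBUTE BY substr(name, 1, 1);
```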
4. Examples
CREATE TABLE person (name STRING, age INT);
INSERT INTO person VALUES
    ('Zen Hui', 25), ('Anil B', 18), ('Shone S', 16),
    ('Mike A', 25), ('John A', 18), ('Jack N', 16);

-- Reduce the number of shuffle partitions to 2 to illustrate the behavior of DISTRIBUTE BY.
-- It's easier to see the clustering and sorting behavior with fewer partitions.
SET spark.sql.shuffle.partitions = 2;

-- Select the rows with no ordering. Note that without any sort directive, the result
-- of the query is not deterministic. It's included here only to contrast it with the
-- behavior of DISTRIBUTE BY. The query below produces rows where the age values are not
-- clustered together.
SELECT age, name FROM person;
+---+-------+
|age|   name|
+---+-------+
| 16|Shone S|
| 25|Zen Hui|
| 16| Jack N|
| 25| Mike A|
| 18| John A|
| 18| Anil B|
+---+-------+

-- Produces rows clustered by age. Persons with the same age are clustered together.
-- Unlike the CLUSTER BY clause, the rows are not sorted within a partition.
SELECT age, name FROM person DISTRIBUTE BY age;
+---+-------+
|age|   name|
+---+-------+
| 25|Zen Hui|
| 25| Mike A|
| 18| John A|
| 18| Anil B|
| 16|Shone S|
| 16| Jack N|
+---+-------+
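One way to check which partition each row lands in is Spark's built-in spark_partition_id() function. The sketch below is an assumption-laden illustration rather than part of the original example: the aliases partition_id and t are made up here, and the DISTRIBUTE BY is placed in a subquery so that the partition ID is read after the repartitioning step. The actual IDs depend on the session configuration (for example, adaptive query execution may coalesce partitions), so no expected output is shown.

```sql
-- Assumption: the person table and the shuffle-partition setting from the example above.
-- partition_id and t are illustrative names.
SELECT spark_partition_id() AS partition_id, age, name
FROM (
    SELECT age, name FROM person DISTRIBUTE BY age
) AS t;
```

Whatever IDs come back, rows with the same age should report the same partition_id.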