整体架构

image.png | left | 747x543

大致分为几个部分：

Client 客户端，使用presto-client 打包的jar可执行文件
- 常见用法： ./presto-cli.jar --server localhost:8080 --catalog mongodb --schema user --debug
Coordinator节点，做一些调度工作；
Discovery service节点，协助Coordinator节点和Worker节点互相发现的服务；
Worker节点，干活的节点。比如表扫描等工作；
Connector plugin。插件化的组件，可以通过为Postgresql、Mysql等写对应的Connector实现支持presto从这些数据库中读写数据，官方已经维护了一些主流的Connector;

更具体的Connector连接图：

image.png | left | 747x557

几种典型的部署架构图

单机版–Coordinator、Worker、Discovery Service于一身

image.png | left | 747x555

多个Worker版

image.png | left | 747x552

将Discovery server独立出去的版本

image.png | left | 747x539

高可用增强版

image.png | left | 747x553

执行流程

各个节点互相连接

image.png | left | 747x507

客户端发送请求给Coordinator

image.png | left | 747x520

Coordinator根据请求去找对应的Connector plugin拿metadata

image.png | left | 747x497

Coordinator生成执行计划（可以通过expain查看），调度worker

image.png | left | 747x505

Worker拿到任务了，找Connector plugin去读数据源

image.png | left | 747x510

Worker所有任务都在内存中执行（内存计算型）

image.png | left | 747x482

client从worker拿到结果

image.png | left | 747x476

支持多种Connector带来的好处

image.png | left | 747x537

image.png | left | 747x531

目前测试跨库数据迁移，也就是 create table DB_A.foo as select * from DB_B.bar 只在MongoDB中成功了，其他的都会触发异常，具体异常代码在 presto/presto-main/src/main/java/com/facebook/presto/split/PageSinkManager.java ，也就是Connector必须要实现 ConnectorPageSinkProvider 这个接口。

查询计划相关

执行模型-和MapReduce相比较

image.png | left | 747x440

image.png | left | 747x506

查询计划示例

image.png | left | 747x517

Presto会分为多个Stage

image.png | left | 747x508

并分到多个worker中去

image.png | left | 747x521

每个Stage又分为多个Tasks

image.png | left | 747x520

每个Task又分为多个Split

image.png | left | 747x525

总结

image.png | left | 747x417

再来一个例子

1	select count(*) from foo;

在两个Worker的例子中：

看一下执行计划：

presto:default> EXPLAIN ANALYZE VERBOSE select count(*) from mongodb.foo.bar;

Query 20180518_083543_00010_mj5ce, RUNNING, 2 nodes, 35 splits
http://localhost:8080/ui/query.html?20180518_083543_00010_mj5ce
Splits:   0 queued, 35 running, 0 done
CPU Time: 1.4s total,  189K rows/s,   735B/s, 12% active
Per Node: 0.0 parallelism, 8.72K rows/s,    33B/s
Parallelism: 0.1
0:15 [ 263K rows,    1KB] [17.4K rows/s,    67B/s] [>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>] 0%

     STAGES   ROWS  ROWS/s  BYTES  BYTES/s  QUEUED    RUN   DONE
0.........R      0       0     0B       0B       0     17      0
  1.......R      0       0     0B       0B       0     17      0
    2.....R   263K   17.4K     1K      67B       0      1      0

presto:default> EXPLAIN ANALYZE VERBOSE select count(*) from mongodb.foo.bar;
                                              Query Plan
-------------------------------------------------------------------------------------------------------
 Fragment 1 [SINGLE]
     CPU: 3.24ms, Input: 1 row (9B); per task: avg.: 1.00 std.dev.: 0.00, Output: 1 row (9B)
     Output layout: [count]
     Output partitioning: SINGLE []
     Execution Flow: UNGROUPED_EXECUTION
     - Aggregate(FINAL) => [count:bigint]
             CPU fraction: 100.00%, Output: 1 row (9B)
             Input avg.: 1.00 rows, Input std.dev.: 0.00%
             count := "count"("count_3")
         - LocalExchange[SINGLE] () => count_3:bigint
                 CPU fraction: 0.00%, Output: 1 row (9B)
                 Input avg.: 0.06 rows, Input std.dev.: 387.30%
             - RemoteSource[2] => [count_3:bigint]
                     CPU fraction: 0.00%, Output: 1 row (9B)
                     Input avg.: 0.06 rows, Input std.dev.: 387.30%

 Fragment 2 [SOURCE]
     CPU: 5.46s, Input: 955758 rows (0B); per task: avg.: 955758.00 std.dev.: 0.00, Output: 1 row (9B)
     Output layout: [count_3]
     Output partitioning: SINGLE []
     Execution Flow: UNGROUPED_EXECUTION
     - Aggregate(PARTIAL) => [count_3:bigint]
             Cost: {rows: ? (?), cpu: ?, memory: ?, network: 0.00}
             CPU fraction: 0.08%, Output: 1 row (9B)
             Input avg.: 955758.00 rows, Input std.dev.: 0.00%
             count_3 := "count"(*)
         - TableScan[mongodb:foo.bar, originalConstraint = true] => []
                 Cost: {rows: ? (?), cpu: ?, memory: 0.00, network: 0.00}
                 CPU fraction: 99.92%, Output: 955758 rows (0B)
                 Input avg.: 955758.00 rows, Input std.dev.: 0.00%
                 LAYOUT: com.facebook.presto.mongodb.MongoTableLayoutHandle@30d07b2d


(1 row)

整体流程： Stage0 + Stage1

image.png | left | 422x1510

分步流程-Stage-0

image.png | left | 747x662

分步流程-Stage-1

image.png | left | 747x651