[In-Depth Analysis of the ONNX Simplification Libraries] Understanding OnnxSimplifier and OnnxOptimizer (3)
Introduction
This post covers the properties of all the remaining passes.
Pass implementations and details (continued)
| Pass | PassName | PassType | PassEfficiency | PassOptimizationType | Description |
|---|---|---|---|---|---|
| EliminateNopMonotoneArgmax | eliminate_nop_monotone_argmax | Nop | Partial | Compute | Removes monotonically increasing activation functions that feed an ArgMax, since they do not change its result; saves computation |
| EliminateNopPad | eliminate_nop_pad | Nop | Complete | Compute | Eliminates Pad operators whose pads are all 0 |
| EliminateNopConcat | eliminate_nop_concat | Nop | Complete | Memory | Eliminates Concat operators that have a single input |
| EliminateNopSplit | eliminate_nop_split | Nop | Complete | Memory | Eliminates Split operators that have a single output and input_dim[axis] = split[0] |
| EliminateNopExpand | eliminate_nop_expand | Nop | Complete | Compute | Eliminates Expand operators whose expand_dim can be broadcast to input_dim, i.e. the Expand does not change the shape |
| EliminateShapeGather | eliminate_shape_gather | Fuse | Complete | Compute | Folds away a Gather with indices=[indices_val,] whose predecessor is a Shape node |
| EliminateSliceAfterShape | eliminate_slice_after_shape | Fuse | Complete | Compute | Folds away a Slice whose predecessor is a Shape node |
| EliminateNopTranspose | eliminate_nop_transpose | Nop | Complete | Compute | Eliminates Transpose operators that have no effect |
| FuseAddBiasIntoConv | fuse_add_bias_into_conv | Fuse | Complete | Compute | Fuses an Add into the preceding Conv2d when the Add's input[1] is a constant with a compatible shape |
| FuseBNIntoConv | fuse_bn_into_conv | Fuse | Complete | Compute | Fuses a BN operator into the Conv2d, rewriting the Conv2d weights accordingly |
| FuseConsecutiveConcats | fuse_consecutive_concats | Fuse | Partial | Compute | Fuses preceding Concat nodes that use the same axis into this Concat node |
| FuseConsecutiveLogSoftmax | fuse_consecutive_log_softmax | Fuse | Complete | Compute | Fuses Softmax + Log into a single LogSoftmax operator |
| FuseConsecutiveReduceUnsqueeze | fuse_consecutive_reduce_unsqueeze | Fuse | Complete | Compute | When the preceding Reduce operator has axes = Unsqueeze_axes and keepdims=0, fuses the Unsqueeze into the Reduce |
| FuseConsecutiveSqueezes | fuse_consecutive_squeezes | Fuse | Complete | Compute | Merges several consecutive Squeeze operators into one Squeeze |
| FuseConsecutiveTransposes | fuse_consecutive_transposes | Fuse | Complete | Compute | Merges several consecutive Transpose operators into one Transpose |
| FuseMatMulAddBiasIntoGemm | fuse_matmul_add_bias_into_gemm | Fuse | Complete | Compute | Merges MatMul + Add into a single Gemm operator |
| FusePadIntoConv | fuse_pad_into_conv | Fuse | Complete | Compute | Merges Pad + Conv into a single Conv, with the padding absorbed by the Conv |
| FusePadIntoPool | fuse_pad_into_pool | Fuse | Complete | Compute | Merges Pad + AveragePool/MaxPool into a single Pool operator |
| FuseTransposeIntoGemm | fuse_transpose_into_gemm | Fuse | Complete | Compute | Fuses a preceding Transpose into the Gemm by flipping its transA/transB attribute, which removes the Transpose operator |
| ReplaceEinsumWithMatmul | replace_einsum_with_matmul | Replace | Complete | Compute | Rewrites qualifying Einsum operators as MatMul: "bhij,bhjd->bhid" becomes MatMul; "bhid,bhjd->bhij" becomes Transpose + MatMul |
| LiftLexicalReferences | lift_lexical_references | Separate | Complete | Memory | To be explained |
| SplitInit | split_init | Separate | Complete | Memory | To be explained |
| SplitPredict | split_predict | Separate | Complete | Memory | To be explained |
| FuseConcatIntoReshape | fuse_concat_into_reshape | Fuse | Complete | Compute | Folds the Concat/Cast operators that feed Reshape's shape input into a constant shape input |
| EliminateNopReshape | eliminate_nop_reshape | Nop | Complete | Compute | Eliminates Reshape operators whose reshape dim == input_dim |
| EliminateOpWithUnit | eliminate_nop_with_unit | Nop | Complete | Compute | Eliminates operators applied with their identity element: And with true (1), Mul by 1, Or with 0, Add of 0, Sub of 0, Div by 1, Pow of 1, and no-op Concat |
| EliminateCommonSubexpression | eliminate_common_subexpression | Nop | Complete | Compute | Eliminates nodes whose attributes and inputs are identical, i.e. common subexpressions |
| FuseQKV | fuse_qkv | Fuse | Complete | Compute | Merges the three MatMuls of the QKV computation into one: A = matmul(X, Q), B = matmul(X, K), C = matmul(X, V) ==> A,B,C = split(matmul(X, concat(Q,K,V))) |
| FuseConsecutiveUnsqueezes | fuse_consecutive_unsqueezes | Fuse | Complete | Compute | Fuses consecutive Unsqueeze operators |
| EliminateDeadEnd | eliminate_deadend | Nop | Complete | Compute | Removes nodes whose outputs are not connected to any other node |
| EliminateIdentity | eliminate_identity | Nop | Complete | Compute | Removes Identity nodes |
| EliminateShapeOp | eliminate_shape_op | Fuse | Complete | Compute | Folds away Shape operators whose input_shape can be obtained directly |
| FuseConsecutiveSlices | fuse_consecutive_slices | Fuse | Complete | Memory | Fuses consecutive Slice operators whose axes do not overlap into a single Slice |
| EliminateUnusedInitializer | eliminate_unused_initializer | Nop | Complete | Memory | Removes initializers that are never used |
| EliminateDuplicateInitializer | eliminate_duplicate_initializer | Nop | Complete | Memory | Removes duplicate initializers |
| AdjustSliceAndMatmul | adjust_slice_and_matmul | Replace | Complete | Compute | Reorders Slice and MatMul to enable further optimization: Y = Matmul(Slice(data, start, end, axes), rhs) ==> Y = Slice(Matmul(data, rhs), start, end, axes) |
| RewriteInputDtype | rewrite_input_dtype | Other | Complete | None | Rewrites the input dtype from int64 to int32: A=node1(INT64_INPUT) ==> A=node1(Cast(INT32_INPUT, INT64)) |
…
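To make a few rows of the table more tangible, here is a minimal sketch in plain Python of why three of the rewrites leave the output unchanged: EliminateNopMonotoneArgmax, FuseBNIntoConv and FuseQKV. It does not call onnxoptimizer and is not the library's implementation. All helper names (`argmax`, `linear`, `fold_bn`, `hconcat`, and so on) are hypothetical, tensors are nested lists, and the Conv is modelled as a per-channel linear map at a single pixel, which is an assumption made only to keep the arithmetic short.

```python
import math

def argmax(xs):
    """Index of the largest element (first one on ties)."""
    return max(range(len(xs)), key=lambda i: xs[i])

def softmax(xs):
    """Numerically stable softmax over a flat list."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def linear(w, b, x):
    """y[c] = sum_i w[c][i] * x[i] + b[c]; stands in for a Conv at one pixel."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bc for row, bc in zip(w, b)]

def batchnorm(y, gamma, beta, mean, var, eps=1e-5):
    """Inference-mode BatchNorm applied per output channel."""
    return [g * (yc - m) / math.sqrt(v + eps) + be
            for yc, g, be, m, v in zip(y, gamma, beta, mean, var)]

def fold_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold the BatchNorm parameters into a new weight and bias."""
    scale = [g / math.sqrt(v + eps) for g, v in zip(gamma, var)]
    w_folded = [[wi * s for wi in row] for row, s in zip(w, scale)]
    b_folded = [(bc - m) * s + be for bc, m, s, be in zip(b, mean, scale, beta)]
    return w_folded, b_folded

def matmul(a, b):
    """Nested-list matrix product: (n x k) @ (k x m)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def hconcat(*mats):
    """Concatenate matrices along the column axis."""
    return [sum(rows, []) for rows in zip(*mats)]

# 1. A monotonically increasing activation does not move the argmax.
logits = [0.3, 2.0, -1.2, 1.9]
same_argmax = argmax(softmax(logits)) == argmax(logits)

# 2. Linear map followed by BatchNorm equals one linear map with folded weights.
x = [1.0, -2.0, 0.5]
w = [[0.2, -0.1, 0.4], [1.0, 0.3, -0.7]]
b = [0.1, -0.2]
gamma, beta, mean, var = [1.5, 0.8], [0.05, -0.3], [0.2, -0.1], [0.9, 1.7]
reference = batchnorm(linear(w, b, x), gamma, beta, mean, var)
w_f, b_f = fold_bn(w, b, gamma, beta, mean, var)
fused = linear(w_f, b_f, x)
bn_matches = all(math.isclose(r, f, rel_tol=1e-9, abs_tol=1e-12)
                 for r, f in zip(reference, fused))

# 3. Three MatMuls sharing X equal one MatMul on concat(Q, K, V) plus a split.
X = [[1, 2, 3], [4, 5, 6]]
Q = [[1, 0], [0, 1], [1, 1]]
K = [[2, 1], [0, 3], [1, 0]]
V = [[0, 1], [1, 0], [2, 2]]
fused_qkv = matmul(X, hconcat(Q, K, V))
A = [row[0:2] for row in fused_qkv]
B = [row[2:4] for row in fused_qkv]
C = [row[4:6] for row in fused_qkv]
qkv_matches = A == matmul(X, Q) and B == matmul(X, K) and C == matmul(X, V)

print(same_argmax, bn_matches, qkv_matches)
```

The three flags only confirm that each rewritten form agrees with the original on these sample values. The real passes operate on the ONNX graph and carry extra preconditions (constant weights, matching shapes, and so on) that this sketch does not model.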
Summary
Essentially all of the current passes have now been listed. Later posts will use actual results to make them more concrete.