Commit d2ccedc

[cherry-pick] Cherry pick visualize parameter (#4258)
* update doc and code for visualize
* add precommit files change
* update translate content
1 parent 917f10b commit d2ccedc

File tree

114 files changed

+1205
-85
lines changed

docs/CHANGELOG.en.md

Lines changed: 25 additions & 25 deletions
Original file line numberDiff line numberDiff line change
@@ -6,34 +6,34 @@ comments: true
66

77
## Latest Version Information
88

9-
### PaddleX v3.0.0(5.20/2025)
9+
### PaddleX v3.0.0(5.20/2025)
1010

1111
Core upgrades are as follows:
1212

13-
- **Rich Model Library:**
14-
- **Extensive Model Coverage:** PaddleX 3.0 includes **270+ models**, covering diverse scenarios such as image/video classification/detection/segmentation, OCR, speech recognition, time series analysis, and more.
15-
- **Mature Solutions:** Built on this robust model library, PaddleX 3.0 offers **critical and production-ready AI solutions**, including general document parsing, key information extraction, document understanding, table recognition, and general image recognition.
16-
17-
- **Unified Inference API & Enhanced Deployment Capabilities:**
18-
- **Standardized Inference Interface:** Reduces API fragmentation across model types, lowering the learning curve for users and accelerating enterprise adoption.
19-
- **Multi-Model Composition:** Complex tasks can be efficiently tackled by combining different models, achieving synergistic performance (1+1>2).
20-
- **Upgraded Deployment:** Unified commands now manage deployments for diverse models, supporting **multi-GPU inference** and **multi-instance serving deployments**.
21-
22-
- **Full Compatibility with PaddlePaddle Framework 3.0:**
23-
- **Leveraging New Paddle 3.0 Features:**
24-
- Compiler-accelerated training: Enable by appending `-o Global.dy2st=True` to training commands. **Most GPU-based models see >10% speed gains, with some exceeding 30%.**
25-
- Inference upgrades: Full adaptation to Paddle 3.0’s Program Intermediate Representation (PIR) enhances flexibility and compatibility. Static graph models now use `xxx.json` instead of `xxx.pdmodel`.
26-
- **ONNX Model Support:** Seamless format conversion via the Paddle2ONNX plugin.
27-
28-
- **Flagship Capabilities:**
29-
- **PP-OCRv5:** Powers **multi-hardware inference, multi-backend support, and serving deployments** for this industry-leading OCR system.
30-
- **PP-StructureV3:** Orchestrates **15+ models** in hybrid (serial/parallel) pipelines, achieving **SOTA accuracy on OmniDocBench**.
31-
- **PP-ChatOCRv4:** Integrates with **PP-DocBee2 and ERNIE 4.5Turbo**, boosting key information extraction accuracy by **15.7 percentage points** over the previous generation.
32-
33-
- **Multi-Hardware Support:**
34-
- **Broad Compatibility:** Training and inference supported on **NVIDIA, Intel, Apple M-series, Kunlunxin, Ascend, Cambricon, Hygon, Enflame**, and more.
35-
- **Ascend-Optimized:** **200+ fully adapted models**, including **21 OM-accelerated inference models**, plus key solutions like PP-OCRv5 and PP-StructureV3.
36-
- **Kunlunxin-Optimized:** Critical classification, detection, and OCR models (including PP-OCRv5) are fully supported.
13+
- **Rich Model Library:**
14+
- **Extensive Model Coverage:** PaddleX 3.0 includes **270+ models**, covering diverse scenarios such as image/video classification/detection/segmentation, OCR, speech recognition, time series analysis, and more.
15+
- **Mature Solutions:** Built on this robust model library, PaddleX 3.0 offers **critical and production-ready AI solutions**, including general document parsing, key information extraction, document understanding, table recognition, and general image recognition.
16+
17+
- **Unified Inference API & Enhanced Deployment Capabilities:**
18+
- **Standardized Inference Interface:** Reduces API fragmentation across model types, lowering the learning curve for users and accelerating enterprise adoption.
19+
- **Multi-Model Composition:** Complex tasks can be efficiently tackled by combining different models, achieving synergistic performance (1+1>2).
20+
- **Upgraded Deployment:** Unified commands now manage deployments for diverse models, supporting **multi-GPU inference** and **multi-instance serving deployments**.
21+
22+
- **Full Compatibility with PaddlePaddle Framework 3.0:**
23+
- **Leveraging New Paddle 3.0 Features:**
24+
- Compiler-accelerated training: Enable by appending `-o Global.dy2st=True` to training commands. **Most GPU-based models see >10% speed gains, with some exceeding 30%.**
25+
- Inference upgrades: Full adaptation to Paddle 3.0’s Program Intermediate Representation (PIR) enhances flexibility and compatibility. Static graph models now use `xxx.json` instead of `xxx.pdmodel`.
26+
- **ONNX Model Support:** Seamless format conversion via the Paddle2ONNX plugin.
27+
28+
- **Flagship Capabilities:**
29+
- **PP-OCRv5:** Powers **multi-hardware inference, multi-backend support, and serving deployments** for this industry-leading OCR system.
30+
- **PP-StructureV3:** Orchestrates **15+ models** in hybrid (serial/parallel) pipelines, achieving **SOTA accuracy on OmniDocBench**.
31+
- **PP-ChatOCRv4:** Integrates with **PP-DocBee2 and ERNIE 4.5Turbo**, boosting key information extraction accuracy by **15.7 percentage points** over the previous generation.
32+
33+
- **Multi-Hardware Support:**
34+
- **Broad Compatibility:** Training and inference supported on **NVIDIA, Intel, Apple M-series, Kunlunxin, Ascend, Cambricon, Hygon, Enflame**, and more.
35+
- **Ascend-Optimized:** **200+ fully adapted models**, including **21 OM-accelerated inference models**, plus key solutions like PP-OCRv5 and PP-StructureV3.
36+
- **Kunlunxin-Optimized:** Critical classification, detection, and OCR models (including PP-OCRv5) are fully supported.
3737

3838
### PaddleX v3.0.0rc1(4.22/2025)
3939

docs/CHANGELOG.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@ comments: true
77
## Latest Version Information
88

99

10-
### PaddleX v3.0.0(5.20/2025)
10+
### PaddleX v3.0.0(5.20/2025)
1111

1212
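**Rich Model Library:**
1313
- **Extensive Model Coverage:** PaddleX 3.0 includes 270+ models, covering scenarios such as image (video) classification/detection/segmentation, OCR, speech recognition, and time series analysis.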

docs/pipeline_usage/tutorials/cv_pipelines/face_recognition.en.md

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -911,6 +911,26 @@ Below is the API reference for basic service deployment and multi-language servi
911911
<td>Yes</td>
912912
</tr>
913913
<tr>
914+
<td><code>visualize</code></td>
915+
<td><code>boolean</code> | <code>null</code></td>
916+
<td>
917+
Whether to return the final visualization image and any intermediate images produced during processing.<br/>
918+
<ul style="margin: 0 0 0 1em; padding-left: 0em;">
919+
<li>If <code>true</code> is provided: return images.</li>
920+
<li>If <code>false</code> is provided: do not return any images.</li>
921+
<li>If this parameter is omitted from the request body, or if <code>null</code> is explicitly passed, the behavior will follow the value of <code>Serving.visualize</code> in the pipeline configuration.</li>
922+
</ul>
923+
<br/>
924+
For example, adding the following setting to the pipeline config file:<br/>
925+
<pre><code>Serving:
926+
  visualize: False
927+
</code></pre>
928+
will disable image return by default. This behavior can be overridden by explicitly setting the <code>visualize</code> parameter in the request.<br/>
929+
If <code>visualize</code> is set in neither the request body nor the configuration file (or is <code>null</code> in the request and undefined in the configuration file), images are returned by default.
930+
</td>
931+
<td>No</td>
932+
</tr>
933+
<tr>
914934
<td><code>indexKey</code></td>
915935
<td><code>string</code></td>
916936
<td>The key corresponding to the index. Provided by the <code>buildIndex</code> operation.</td>
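
The precedence described for <code>visualize</code> (request value first, then <code>Serving.visualize</code>, then a default of returning images) can be illustrated with a minimal Python sketch. This is not the code from this commit: the function name `resolve_visualize`, the constant `DEFAULT_VISUALIZE`, and the assumption that the pipeline configuration is available as a plain dict are all hypothetical and used only for illustration.

```python
from typing import Optional

# Documented fallback: images are returned when nothing is configured.
DEFAULT_VISUALIZE = True


def resolve_visualize(request_value: Optional[bool], config: dict) -> bool:
    """Decide whether images should be included in the response.

    `request_value` is the `visualize` field from the request body
    (None when omitted or explicitly null). `config` is a hypothetical
    dict view of the pipeline configuration file.
    """
    # An explicit true/false in the request always wins.
    if request_value is not None:
        return request_value
    # Otherwise follow Serving.visualize from the pipeline configuration.
    serving = config.get("Serving") or {}
    config_value = serving.get("visualize")
    if config_value is not None:
        return bool(config_value)
    # Neither the request nor the configuration set a value.
    return DEFAULT_VISUALIZE
```

Under these assumptions, a request that passes `true` still receives images even when the configuration contains `Serving.visualize: False`, while a request that omits the field follows the configuration.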

docs/pipeline_usage/tutorials/cv_pipelines/face_recognition.md

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -908,6 +908,23 @@ data_root # Dataset root directory; the directory name can be changed
908908
<td>Yes</td>
909909
</tr>
910910
<tr>
911+
<td><code>visualize</code></td>
912+
<td><code>boolean</code> | <code>null</code></td>
913+
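<td>Whether to return the final visualization image and any intermediate images produced during processing.
914+
<ul style="margin: 0 0 0 1em; padding-left: 0em;">
915+
<li>If <code>true</code> is passed: images are returned.</li>
916+
<li>If <code>false</code> is passed: no images are returned.</li>
917+
<li>If the parameter is omitted from the request body or <code>null</code> is passed: the <code>Serving.visualize</code> setting in the pipeline configuration file is followed.</li>
918+
</ul>
919+
<br/>For example, adding the following field to the pipeline configuration file:<br/>
920+
<pre><code>Serving:
921+
  visualize: False
922+
</code></pre>
923+
disables image return by default; the <code>visualize</code> parameter in the request body can override this default. If neither the request body nor the configuration file sets it (or the request body passes <code>null</code> and the configuration file leaves it unset), images are returned by default.
924+
</td>
925+
<td>No</td>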
926+
</tr>
927+
<tr>
911928
<td><code>indexKey</code></td>
912929
<td><code>string</code></td>
913930
<td>The key corresponding to the index. Provided by the <code>buildIndex</code> operation.</td>

docs/pipeline_usage/tutorials/cv_pipelines/general_image_recognition.en.md

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -872,6 +872,26 @@ Below is the API reference for basic service deployment and multi-language servi
872872
<td>Yes</td>
873873
</tr>
874874
<tr>
875+
<td><code>visualize</code></td>
876+
<td><code>boolean</code> | <code>null</code></td>
877+
<td>
878+
Whether to return the final visualization image and any intermediate images produced during processing.<br/>
879+
<ul style="margin: 0 0 0 1em; padding-left: 0em;">
880+
<li>If <code>true</code> is provided: return images.</li>
881+
<li>If <code>false</code> is provided: do not return any images.</li>
882+
<li>If this parameter is omitted from the request body, or if <code>null</code> is explicitly passed, the behavior will follow the value of <code>Serving.visualize</code> in the pipeline configuration.</li>
883+
</ul>
884+
<br/>
885+
For example, adding the following setting to the pipeline config file:<br/>
886+
<pre><code>Serving:
887+
  visualize: False
888+
</code></pre>
889+
will disable image return by default. This behavior can be overridden by explicitly setting the <code>visualize</code> parameter in the request.<br/>
890+
If <code>visualize</code> is set in neither the request body nor the configuration file (or is <code>null</code> in the request and undefined in the configuration file), images are returned by default.
891+
</td>
892+
<td>No</td>
893+
</tr>
894+
<tr>
875895
<td><code>indexKey</code></td>
876896
<td><code>string</code></td>
877897
<td>The key corresponding to the index. Provided by the <code>buildIndex</code> operation.</td>

docs/pipeline_usage/tutorials/cv_pipelines/general_image_recognition.md

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -870,6 +870,23 @@ data_root # Dataset root directory; the directory name can be changed
870870
<td>Yes</td>
871871
</tr>
872872
<tr>
873+
<td><code>visualize</code></td>
874+
<td><code>boolean</code> | <code>null</code></td>
875+
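<td>Whether to return the final visualization image and any intermediate images produced during processing.
876+
<ul style="margin: 0 0 0 1em; padding-left: 0em;">
877+
<li>If <code>true</code> is passed: images are returned.</li>
878+
<li>If <code>false</code> is passed: no images are returned.</li>
879+
<li>If the parameter is omitted from the request body or <code>null</code> is passed: the <code>Serving.visualize</code> setting in the pipeline configuration file is followed.</li>
880+
</ul>
881+
<br/>For example, adding the following field to the pipeline configuration file:<br/>
882+
<pre><code>Serving:
883+
  visualize: False
884+
</code></pre>
885+
disables image return by default; the <code>visualize</code> parameter in the request body can override this default. If neither the request body nor the configuration file sets it (or the request body passes <code>null</code> and the configuration file leaves it unset), images are returned by default.
886+
</td>
887+
<td>No</td>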
888+
</tr>
889+
<tr>
873890
<td><code>indexKey</code></td>
874891
<td><code>string</code></td>
875892
<td>The key corresponding to the index. Provided by the <code>buildIndex</code> operation.</td>

docs/pipeline_usage/tutorials/cv_pipelines/human_keypoint_detection.en.md

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -655,6 +655,26 @@ Below are the API references and multi-language service invocation examples for
655655
<td>Yes</td>
656656
</tr>
657657
<tr>
658+
<td><code>visualize</code></td>
659+
<td><code>boolean</code> | <code>null</code></td>
660+
<td>
661+
Whether to return the final visualization image and any intermediate images produced during processing.<br/>
662+
<ul style="margin: 0 0 0 1em; padding-left: 0em;">
663+
<li>If <code>true</code> is provided: return images.</li>
664+
<li>If <code>false</code> is provided: do not return any images.</li>
665+
<li>If this parameter is omitted from the request body, or if <code>null</code> is explicitly passed, the behavior will follow the value of <code>Serving.visualize</code> in the pipeline configuration.</li>
666+
</ul>
667+
<br/>
668+
For example, adding the following setting to the pipeline config file:<br/>
669+
<pre><code>Serving:
670+
  visualize: False
671+
</code></pre>
672+
will disable image return by default. This behavior can be overridden by explicitly setting the <code>visualize</code> parameter in the request.<br/>
673+
If <code>visualize</code> is set in neither the request body nor the configuration file (or is <code>null</code> in the request and undefined in the configuration file), images are returned by default.
674+
</td>
675+
<td>No</td>
676+
</tr>
677+
<tr>
658678
<td><code>detThreshold</code></td>
659679
<td><code>number</code> | <code>null</code></td>
660680
<td>Threshold for human detection model.</td>
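
On the client side, the three states of <code>visualize</code> (true, false, omitted) map naturally onto an optional argument. The following Python sketch builds a JSON request body under stated assumptions: the helper name `build_request_body` is hypothetical, and the `image` field name is assumed from the surrounding API tables rather than shown in this hunk. It only constructs the payload and does not contact any service.

```python
import base64
import json
from typing import Optional


def build_request_body(image_bytes: bytes, visualize: Optional[bool] = None) -> str:
    """Serialize a request body, omitting `visualize` when it is None.

    Omitting the field lets the server follow Serving.visualize from
    the pipeline configuration, as described in the parameter table.
    """
    # "image" is an assumed field name carrying Base64-encoded content.
    body = {"image": base64.b64encode(image_bytes).decode("ascii")}
    if visualize is not None:
        # Explicit true/false overrides the server-side default.
        body["visualize"] = visualize
    return json.dumps(body)
```

Passing `visualize=False` is the case most likely to matter in practice, since skipping image return reduces the size of each response; leaving the argument unset keeps the decision with the service configuration.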

docs/pipeline_usage/tutorials/cv_pipelines/human_keypoint_detection.md

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -646,6 +646,23 @@ for res in output:
646646
<td>Yes</td>
647647
</tr>
648648
<tr>
649+
<td><code>visualize</code></td>
650+
<td><code>boolean</code> | <code>null</code></td>
651+
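<td>Whether to return the final visualization image and any intermediate images produced during processing.
652+
<ul style="margin: 0 0 0 1em; padding-left: 0em;">
653+
<li>If <code>true</code> is passed: images are returned.</li>
654+
<li>If <code>false</code> is passed: no images are returned.</li>
655+
<li>If the parameter is omitted from the request body or <code>null</code> is passed: the <code>Serving.visualize</code> setting in the pipeline configuration file is followed.</li>
656+
</ul>
657+
<br/>For example, adding the following field to the pipeline configuration file:<br/>
658+
<pre><code>Serving:
659+
  visualize: False
660+
</code></pre>
661+
disables image return by default; the <code>visualize</code> parameter in the request body can override this default. If neither the request body nor the configuration file sets it (or the request body passes <code>null</code> and the configuration file leaves it unset), images are returned by default.
662+
</td>
663+
<td>No</td>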
664+
</tr>
665+
<tr>
649666
<td><code>detThreshold</code></td>
650667
<td><code>number</code> | <code>null</code></td>
651668
<td>Threshold for the human detection model.</td>

docs/pipeline_usage/tutorials/cv_pipelines/image_anomaly_detection.en.md

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -412,6 +412,26 @@ Below are the API references for basic service-based deployment and examples of
412412
<td>The URL of the image file accessible by the server or the Base64 encoded result of the image file content.</td>
413413
<td>Yes</td>
414414
</tr>
415+
<tr>
416+
<td><code>visualize</code></td>
417+
<td><code>boolean</code> | <code>null</code></td>
418+
<td>
419+
Whether to return the final visualization image and any intermediate images produced during processing.<br/>
420+
<ul style="margin: 0 0 0 1em; padding-left: 0em;">
421+
<li>If <code>true</code> is provided: return images.</li>
422+
<li>If <code>false</code> is provided: do not return any images.</li>
423+
<li>If this parameter is omitted from the request body, or if <code>null</code> is explicitly passed, the behavior will follow the value of <code>Serving.visualize</code> in the pipeline configuration.</li>
424+
</ul>
425+
<br/>
426+
For example, adding the following setting to the pipeline config file:<br/>
427+
<pre><code>Serving:
428+
  visualize: False
429+
</code></pre>
430+
will disable image return by default. This behavior can be overridden by explicitly setting the <code>visualize</code> parameter in the request.<br/>
431+
If <code>visualize</code> is set in neither the request body nor the configuration file (or is <code>null</code> in the request and undefined in the configuration file), images are returned by default.
432+
</td>
433+
<td>No</td>
434+
</tr>
415435
</tbody>
416436
</table>
417437
<ul>

docs/pipeline_usage/tutorials/cv_pipelines/image_anomaly_detection.md

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -414,6 +414,23 @@ for res in output:
414414
<td>The URL of an image file accessible by the server, or the Base64-encoded content of the image file.</td>
415415
<td>Yes</td>
416416
</tr>
417+
<tr>
418+
<td><code>visualize</code></td>
419+
<td><code>boolean</code> | <code>null</code></td>
420+
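<td>Whether to return the final visualization image and any intermediate images produced during processing.
420+
<ul style="margin: 0 0 0 1em; padding-left: 0em;">
421+
<li>If <code>true</code> is passed: images are returned.</li>
422+
<li>If <code>false</code> is passed: no images are returned.</li>
423+
<li>If the parameter is omitted from the request body or <code>null</code> is passed: the <code>Serving.visualize</code> setting in the pipeline configuration file is followed.</li>
424+
</ul>
425+
<br/>For example, adding the following field to the pipeline configuration file:<br/>
426+
<pre><code>Serving:
427+
  visualize: False
428+
</code></pre>
429+
disables image return by default; the <code>visualize</code> parameter in the request body can override this default. If neither the request body nor the configuration file sets it (or the request body passes <code>null</code> and the configuration file leaves it unset), images are returned by default.
430+
</td>
431+
<td>No</td>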
433+
</tr>
417434
</tbody>
418435
</table>
419436
<ul>
