Insights: PaddlePaddle/FastDeploy
Overview
71 Pull requests merged by 28 people
-
[Executor] Fix set capture sizes bug
#2903 merged
Jul 18, 2025 -
[Trace]fix opentelemetry can not work in uvicorn
#2906 merged
Jul 17, 2025 -
[Trace]fix opentelemetry can not work in uvicorn
#2907 merged
Jul 17, 2025 -
[Executor] Updated the documents related to cuda graph and static graph.
#2899 merged
Jul 17, 2025 -
[Executor] Updated the documents related to cuda graph and static graph.
#2898 merged
Jul 17, 2025 -
[LLM] delete fixed slot
#2894 merged
Jul 17, 2025 -
[XPU][doc] pick xpu doc fix
#2897 merged
Jul 17, 2025 -
[LLM] delete fixed slots
#2893 merged
Jul 17, 2025 -
[XPU][doc] fix typo
#2892 merged
Jul 17, 2025 -
[Inference, rename] remove padding_offsets from atten use batch_id_per_token
#2880 merged
Jul 17, 2025 -
[Feature][MTP] Support cacheKV transfer in per_chunk mode
#2890 merged
Jul 17, 2025 -
[Bug Fix] fix bug of prompt penalty
#2888 merged
Jul 17, 2025 -
[Fix] remove misleading variables
#2841 merged
Jul 17, 2025 -
enable CI workflow for pull requests targeting release/* branches
#2886 merged
Jul 17, 2025 -
Enable CI workflow for pull requests targeting release/* branches
#2887 merged
Jul 17, 2025 -
[Bug Fix]Fix config get
#2883 merged
Jul 17, 2025 -
[LLM] Add parameter validation and exception throwing
#2878 merged
Jul 17, 2025 -
[MM_PROCESS] add _extract_labels
#2879 merged
Jul 17, 2025 -
Fix rollout_model init
#2882 merged
Jul 17, 2025 -
Fix rollout_model init
#2881 merged
Jul 17, 2025 -
[Feature] support prompt repetition_penalty
#2806 merged
Jul 17, 2025 -
[XPU][doc] Update minimal fastdeploy required
#2863 merged
Jul 17, 2025 -
[Feature] Add speculative decoding metrics.
#2857 merged
Jul 17, 2025 -
[Trace] fix annotation when add opentelemetry
#2869 merged
Jul 17, 2025 -
[Bug Fix] fix opentelemetry-bootstra
#2875 merged
Jul 16, 2025 -
fix put opentelemetry-instrumentation-fastapi in requierment
#2874 merged
Jul 16, 2025 -
[Sync Code] develop to release/2.0.3
#2873 merged
Jul 16, 2025 -
[LLM] Update Multinode Deployment
#2830 merged
Jul 16, 2025 -
[LLM] support send batch data and aggregate data
#2860 merged
Jul 16, 2025 -
fix and refine vl
#2866 merged
Jul 16, 2025 -
[Attention] remove cum_offsets from atten, and use cu_seqlens_q
#2870 merged
Jul 16, 2025 -
[Trace] Support trace log
#2864 merged
Jul 16, 2025 -
[Trace] add opentelemetry
#2852 merged
Jul 16, 2025 -
rl update
#2861 merged
Jul 16, 2025 -
[Fix] Fix FLAGS_max_partition_size
#2854 merged
Jul 16, 2025 -
[Fix] Fix mm ep weight init.
#2855 merged
Jul 16, 2025 -
[BugFix] Fix Config
#2858 merged
Jul 16, 2025 -
[Fix] Fix expert parallel config bug
#2848 merged
Jul 16, 2025 -
[XPU] Update doc and add scripts for downloading dependencies
#2845 merged
Jul 16, 2025 -
[BugFix]Fix Configs
#2849 merged
Jul 16, 2025 -
[Executor] CUDA Graph support padding batch
#2844 merged
Jul 16, 2025 -
refactor rl get_name_mappings_to_training
#2847 merged
Jul 15, 2025 -
Merge vl execution path into normal execution path
#2829 merged
Jul 15, 2025 -
[Docs] add enable_logprob parameter description
#2850 merged
Jul 15, 2025 -
Adapt to the vLLM framework
#2851 merged
Jul 15, 2025 -
【Inference Optimize】Support wint2 triton kernel about triton_utils_v2
#2842 merged
Jul 15, 2025 -
[vl] Use top_k from config.json.
#2831 merged
Jul 14, 2025 -
[Perf][MTP] Improve speculative decoding(MTP) efficiency
#2840 merged
Jul 14, 2025 -
Simplify the Config code
#2770 merged
Jul 14, 2025 -
【Cherry Pick】CP PR#2820 to release/2.0.2
#2839 merged
Jul 14, 2025 -
[Fix]fix 'force-reinstall all-depe-packages in build'
#2837 merged
Jul 14, 2025 -
【Update Docs】update supported_models doc
#2836 merged
Jul 14, 2025 -
[Feature] Add DeepGEMM pre-compile tools.
#2819 merged
Jul 14, 2025 -
[features][MTP] Support expert-parallel mode with MTP
#2835 merged
Jul 14, 2025 -
fix spelling error
#2826 merged
Jul 14, 2025 -
[Bug Fix] fix spelling error
#2827 merged
Jul 14, 2025 -
[vl]remove duplicated load logic
#2744 merged
Jul 12, 2025 -
[CI] add result save for ci
#2824 merged
Jul 12, 2025 -
Feature/logprob bug fix
#2817 merged
Jul 12, 2025 -
[Bug fix]fix num_blocks_local when small size model in TP2 running mode
#2792 merged
Jul 12, 2025 -
[Feature]Support Qwen3-moe name_mapping
#2820 merged
Jul 12, 2025 -
[FIX]fix topp default value
#2814 merged
Jul 11, 2025 -
[FIX 2.0.2]fix topp topk default value
#2810 merged
Jul 11, 2025 -
[Feature] Support tensor-parallel-size>num_key_value_heads for qwen3
#2799 merged
Jul 11, 2025 -
[BugFix] Fix rl config enable_logprob
#2811 merged
Jul 11, 2025 -
[Feature] Global scheduler supports configuring hot updates
#2812 merged
Jul 11, 2025 -
Global scheduler supports configuring hot updates
#2807 merged
Jul 11, 2025 -
[XPU] Update docker file
#2809 merged
Jul 11, 2025 -
fix enable_logprob not in rl_config
#2808 merged
Jul 11, 2025 -
Delete Useless Files
#2772 merged
Jul 11, 2025
15 Pull requests opened by 12 people
-
【Fix】sync paddlenlp_ops build use_fast_math
#2821 opened
Jul 12, 2025 -
speedup load
#2859 opened
Jul 16, 2025 -
Unify server-side and model-side Config
#2862 opened
Jul 16, 2025 -
[Feature] support min_p_sampling
#2872 opened
Jul 16, 2025 -
[Iluvatar GPU] Add CI scripts
#2876 opened
Jul 17, 2025 -
QA Test Use
#2877 opened
Jul 17, 2025 -
[feat] adapt metax-gpu maca platform
#2884 opened
Jul 17, 2025 -
[Fix]fix empty prompt_token_ids,update the parser's triggering condit…
#2891 opened
Jul 17, 2025 -
[BugFix] Deepseek renaming cum_offsets to cu_seqlens_q
#2895 opened
Jul 17, 2025 -
Enable ci release
#2896 opened
Jul 17, 2025 -
【Feature】support vl model ori_vacab_size parameters
#2900 opened
Jul 17, 2025 -
【DCU】update top_p_sampling
#2901 opened
Jul 17, 2025 -
[Executor] Fix set capture sizes bug
#2902 opened
Jul 17, 2025 -
[BugFix]Fix sample rejection
#2908 opened
Jul 18, 2025 -
[NEW] Support 45tVL EP FP8 Infer.
#2909 opened
Jul 18, 2025
6 Issues closed by 5 people
-
Compilation error: _Z20MoEDeepGEMMDePermuteRKN6paddle6TensorES2_S2_S2_
#2853 closed
Jul 17, 2025 -
piece id is out of range.
#2843 closed
Jul 15, 2025 -
ERNIE-4.5-VL-28B-A3B-Paddle hangs during loading, on both a single 4090 48G and dual 4090 48G
#2739 closed
Jul 14, 2025 -
Need the fastdeploy 1.0.7 source code
#2781 closed
Jul 11, 2025 -
Startup error when using the official image and following the documented steps
#2651 closed
Jul 11, 2025 -
ERNIE-4.5-300B-A47B-2Bits-Paddle: error when deploying on two GPUs
#2678 closed
Jul 11, 2025
4 Issues opened by 3 people
-
RuntimeError: [MoE Runner] Failed to run cutlass variable batched gemm. Error: Error Internal
#2889 opened
Jul 17, 2025 -
Error encountered: ValueError: Tokenizer class Ernie4_5_VLTokenizer does not exist or is not currently imported.
#2846 opened
Jul 15, 2025 -
OrtBackend + CUDA 12.X (onnxruntime >1.17) is incompatible with fastdeploy 1.1.0
#2833 opened
Jul 14, 2025 -
Convolutional Coded Quantization
#2828 opened
Jul 13, 2025
11 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
[Feature] mm and thinking model support structured output
#2749 commented on
Jul 17, 2025 • 2 new comments -
[xpu] Error:solver cache filexpu3_fwd_solver.json not found
#2803 commented on
Jul 11, 2025 • 0 new comments -
Accuracy issues when deploying the PPOCRv3 model with FastDeploy
#1121 commented on
Jul 11, 2025 • 0 new comments -
[bug] 'CompletionOutput' object has no attribute 'tool_call_content'
#2802 commented on
Jul 13, 2025 • 0 new comments -
Is there an official development roadmap? When will the 2.0 C++ library be released?
#2626 commented on
Jul 13, 2025 • 0 new comments -
Deploying erine-21B with fastdeploy
#2704 commented on
Jul 18, 2025 • 0 new comments -
[WIP] optimize wint2 moe_group_gemm.
#2661 commented on
Jul 17, 2025 • 0 new comments -
Support use safetensors with paddle.MmapStorage to load model files
#2730 commented on
Jul 13, 2025 • 0 new comments -
Opt wint2
#2741 commented on
Jul 15, 2025 • 0 new comments -
[Feature] support c16 prefix_cache in flash_attention_v3
#2766 commented on
Jul 16, 2025 • 0 new comments -
[SOT] Mark dynamic dims by type annotations
#2771 commented on
Jul 17, 2025 • 0 new comments