Chaoshan IT Think Tank


DeepSeek R1 Tops the Leaderboard in Kaggle's Math Olympiad Challenge

2025-02-25 09:56:32

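In the second AIMO Progress Prize competition, participants must develop algorithms and models that solve 110 difficult math problems. The problems span four areas (algebra, combinatorics, geometry, and number theory), are pitched at roughly the level of national olympiads, and were designed specifically to challenge current AI systems.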

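Several open-source releases of DeepSeek R1 appear among the competition's high-scoring notebooks. Many top-scoring participants built their notebooks on DeepSeek R1, optimized and refined it further, and shared the results publicly on Kaggle.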

The Second AI Mathematical Olympiad Challenge

https://www.kaggle.com/competitions/ai-mathematical-olympiad-progress-prize-2/overview

The competition data consists of 110 math problems in a style similar to the AIME (American Invitational Mathematics Examination).

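The answer to each problem is a non-negative integer between 0 and 999. To obtain it, take the solution to the problem modulo 1000. For example, if you believe the solution to a problem is 2034, your predicted answer should be 34.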
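The modulo rule can be illustrated with a minimal sketch. The function name `to_submission_answer` is hypothetical and is not part of the competition's API; it only shows the arithmetic described above.

```python
def to_submission_answer(solution: int) -> int:
    """Reduce an integer solution to the 0-999 range by taking it modulo 1000."""
    # Python's % operator returns a non-negative result for a positive modulus.
    return solution % 1000


print(to_submission_answer(2034))  # 34, matching the example in the text
```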

The problems are roughly at national olympiad level, though some are somewhat easier and others somewhat harder.

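All problems are plain text, with mathematical notation written in LaTeX. See the "Note on Language and Notation" in the Overview section for details of the notation conventions used. Although some problems involve geometry, no problem uses a diagram.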

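Public test set: 50 problems.
Private test set: a further 50 different problems.
Reference data: 10 problems provided as a reference. The competition also provides a PDF with full solutions to these reference problems.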

Introduction to DeepSeek R1

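DeepSeek R1 is one of the DeepSeek team's major results in artificial general intelligence (AGI). It aims to improve the model's reasoning ability through reinforcement learning (RL).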

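DeepSeek R1-Zero relies entirely on reinforcement learning, with no supervised fine-tuning (SFT), similar to the way AlphaZero was trained. This training approach lets the model explore chain-of-thought (CoT) reasoning for complex problems on its own.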

Code Examples

https://www.kaggle.com/code/huikang/thought-engineering-r1-distill-qwen-7b-awq/notebook

Loading the R1 model

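from vllm import LLM, SamplingParams

# llm_model_pth, MAX_NUM_SEQS and MAX_MODEL_LEN are not defined in this excerpt;
# they are assumed to be set earlier in the linked notebook.
llm = LLM(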
    llm_model_pth,
    # dtype="half",                # The data type for the model weights and activations
    max_num_seqs=MAX_NUM_SEQS,   # Maximum number of sequences per iteration. Default is 256
    max_model_len=MAX_MODEL_LEN, # Model context length
    trust_remote_code=True,      # Trust remote code (e.g., from HuggingFace) when downloading the model and tokenizer
    tensor_parallel_size=4,      # The number of GPUs to use for distributed execution with tensor parallelism
    gpu_memory_utilization=0.95, # The ratio (between 0 and 1) of GPU memory to reserve for the model
    seed=2024,
)
tokenizer = llm.get_tokenizer()

Setting the sampling strategy

sampling_params = SamplingParams(
    temperature=1.0,              # randomness of the sampling
    min_p=0.01,
    skip_special_tokens=True,     # Whether to skip special tokens in the output
    max_tokens=max_tokens,
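    stop=["</think>"],            # assumed: the stop string looks stripped in extraction; "</think>" is a guess at the original tag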
)

request_output = llm.generate(
    prompts=completion_texts,
    sampling_params=sampling_params,
)
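The result of `llm.generate` is a list with one entry per prompt, and each entry carries its completions in an `outputs` list whose items expose a `text` attribute. A minimal sketch of flattening that structure into plain strings follows. The helper name `collect_generated_texts` is hypothetical, and the `SimpleNamespace` objects are stand-ins that only mimic the shape of vLLM results so the example runs without a GPU.

```python
from types import SimpleNamespace


def collect_generated_texts(request_output):
    """Flatten vLLM-style results into a list of generated strings."""
    return [
        completion.text
        for result in request_output
        for completion in result.outputs
    ]


# Stand-in data (not real model output) with the same attribute layout.
fake_output = [
    SimpleNamespace(outputs=[SimpleNamespace(text="... \\boxed{34}")]),
    SimpleNamespace(outputs=[SimpleNamespace(text="... \\boxed{7}")]),
]
print(collect_generated_texts(fake_output))
```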

Prompt setup

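# English system prompt. `question` holds the problem text; `tokenizer` comes from llm.get_tokenizer().
messages = [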
    {"role": "system", "content": "Solve the math problem from the user. Only work with exact numbers. Only submit an answer if you are sure. After you get your final answer, take modulo 1000, and return the final answer within \\boxed{}."},
    {"role": "user", "content": question},
]
starter_text = tokenizer.apply_chat_template(
    conversation=messages,
    tokenize=False,
    add_generation_prompt=True
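)

# Alternative: a Chinese system prompt with the same intent (reason step by step,
# use exact numbers, answer only when sure, reduce modulo 1000, put the result in \boxed{}).
# Running this after the block above overwrites messages and starter_text.
messages = [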
    {"role": "system", "content": "请通过逐步推理来解答问题。只处理精确的数字。只有在确信无误时才提交答案。把最终答案对1000取余数，放置于\\boxed{}中。"},
    {"role": "user", "content": question},
]
starter_text = tokenizer.apply_chat_template(
    conversation=messages,
    tokenize=False,
    add_generation_prompt=True
)
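Both system prompts ask the model to place its final answer inside \boxed{}. A minimal sketch of how such an answer might be read back out of the generated text is shown here. The function name `extract_boxed_answer` and the regular expression are illustrative assumptions and are not taken from the linked notebook. The sketch only handles plain integers inside the braces.

```python
import re
from typing import Optional


def extract_boxed_answer(text: str) -> Optional[int]:
    """Return the last \\boxed{<integer>} in text, reduced modulo 1000, or None."""
    matches = re.findall(r"\\boxed\{\s*(-?\d+)\s*\}", text)
    if not matches:
        return None  # the model never produced a boxed integer
    # Use the last boxed value, since earlier ones may be intermediate results.
    return int(matches[-1]) % 1000


print(extract_boxed_answer("so the total is \\boxed{2034}"))
```

Applying the modulo again after extraction is a defensive choice: it keeps the result in the 0 to 999 range even if the model forgets to reduce its answer.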
