
AI Engineer Transition Roadmap: A Resource Roundup

Ai学习的老章
Published 2025-06-08 19:17:32

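Hi everyone, this is Ai学习的老章.

It's the weekend, so here is a new project worth recommending: a roadmap for transitioning into AI engineering.

Tip: it works even better alongside a few tools I recommended earlier:

Using large models to download and summarize papers, a big efficiency boost

AI coding has gone wild: machine learning paper code generated automatically, 100% open source, with DeepSeek support

An impressive paper tool built by Stanford students: search trending papers in seconds and get a paper summarized and translated in 3 minutes, a 100x efficiency boost!

Project repository: https://github.com/InterviewReady/ai-engineering-resources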

Tokenization

  • Byte-pair Encoding https://arxiv.org/pdf/1508.07909
  • Byte Latent Transformer: Patches Scale Better Than Tokens https://arxiv.org/pdf/2412.09871

Vectorization

  • BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding https://arxiv.org/pdf/1810.04805
  • IMAGEBIND: One Embedding Space To Bind Them All https://arxiv.org/pdf/2305.05665
  • SONAR: Sentence-Level Multimodal and Language-Agnostic Representations https://arxiv.org/pdf/2308.11466
  • FAISS library https://arxiv.org/pdf/2401.08281
  • Facebook Large Concept Models https://arxiv.org/pdf/2412.08821v2

Infrastructure

  • TensorFlow https://arxiv.org/pdf/1605.08695
  • DeepSeek filesystem https://github.com/deepseek-ai/3FS/blob/main/docs/design_notes.md
  • Milvus DB https://www.cs.purdue.edu/homes/csjgwang/pubs/SIGMOD21_Milvus.pdf
  • Billion-Scale Similarity Search: FAISS https://arxiv.org/pdf/1702.08734
  • Ray https://arxiv.org/abs/1712.05889

Core Architecture

  • Attention is All You Need https://papers.neurips.cc/paper/7181-attention-is-all-you-need.pdf
  • FlashAttention https://arxiv.org/pdf/2205.14135
  • Multi Query Attention https://arxiv.org/pdf/1911.02150
  • Grouped Query Attention https://arxiv.org/pdf/2305.13245
  • Google Titans outperform Transformers https://arxiv.org/pdf/2501.00663
  • VideoRoPE: Rotary Position Embedding https://arxiv.org/pdf/2502.05173

Mixture of Experts

  • Sparsely-Gated Mixture-of-Experts Layer https://arxiv.org/pdf/1701.06538
  • GShard https://arxiv.org/abs/2006.16668
  • Switch Transformers https://arxiv.org/abs/2101.03961

RLHF (Reinforcement Learning from Human Feedback)

  • Deep Reinforcement Learning with Human Feedback https://arxiv.org/pdf/1706.03741
  • Fine-Tuning Language Models with RLHF https://arxiv.org/pdf/1909.08593
  • Training language models with RLHF https://arxiv.org/pdf/2203.02155

Chain of Thought

  • Chain-of-Thought Prompting Elicits Reasoning in Large Language Models https://arxiv.org/pdf/2201.11903
  • Chain of thought https://arxiv.org/pdf/2411.14405v1/
  • Demystifying Long Chain-of-Thought Reasoning in LLMs https://arxiv.org/pdf/2502.03373

Reasoning

  • Transformer Reasoning Capabilities https://arxiv.org/pdf/2405.18512
  • Large Language Monkeys: Scaling Inference Compute with Repeated Sampling https://arxiv.org/pdf/2407.21787
  • Scaling test-time compute can be more effective than scaling parameters https://arxiv.org/pdf/2408.03314
  • Training Large Language Models to Reason in a Continuous Latent Space https://arxiv.org/pdf/2412.06769
  • DeepSeek R1 https://arxiv.org/pdf/2501.12948v1
  • A Probabilistic Inference Approach to Inference-Time Scaling of LLMs using Particle-Based Monte Carlo Methods https://arxiv.org/pdf/2502.01618
  • Latent Reasoning: A Recurrent Depth Approach https://arxiv.org/pdf/2502.05171
  • Syntactic and Semantic Control of Large Language Models via Sequential Monte Carlo https://arxiv.org/pdf/2504.13139

Optimizations

  • The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits https://arxiv.org/pdf/2402.17764
  • FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision https://arxiv.org/pdf/2407.08608
  • ByteDance 1.58 https://arxiv.org/pdf/2412.18653v1
  • Transformer Square https://arxiv.org/pdf/2501.06252
  • Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps https://arxiv.org/pdf/2501.09732
  • 1B outperforms 405B https://arxiv.org/pdf/2502.06703
  • Speculative Decoding https://arxiv.org/pdf/2211.17192

Distillation

  • Distilling the Knowledge in a Neural Network https://arxiv.org/pdf/1503.02531
  • BYOL - Distilled Architecture https://arxiv.org/pdf/2006.07733
  • DINO https://arxiv.org/pdf/2104.14294

SSMs (State Space Models)

  • RWKV: Reinventing RNNs for the Transformer Era https://arxiv.org/pdf/2305.13048
  • Mamba https://arxiv.org/pdf/2312.00752
  • Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality https://arxiv.org/pdf/2405.21060
  • Distilling Transformers to SSMs https://arxiv.org/pdf/2408.10189
  • LoLCATs: On Low-Rank Linearizing of Large Language Models https://arxiv.org/pdf/2410.10254
  • Think Slow, Fast https://arxiv.org/pdf/2502.20339

Competition Models

  • Google Math Olympiad 2 https://arxiv.org/pdf/2502.03544
  • Competitive Programming with Large Reasoning Models https://arxiv.org/pdf/2502.06807
  • Google Math Olympiad 1 https://www.nature.com/articles/s41586-023-06747-5

Hype Makers

  • Can AI be made to think critically https://arxiv.org/pdf/2501.04682
  • Evolving Deeper LLM Thinking https://arxiv.org/pdf/2501.09891
  • LLMs Can Easily Learn to Reason from Demonstrations Structure https://arxiv.org/pdf/2502.07374

Hype Breakers

  • Separating communication from intelligence https://arxiv.org/pdf/2301.06627
  • Language is not intelligence https://gwern.net/doc/psychology/linguistics/2024-fedorenko.pdf

Image Transformers

  • An Image is Worth 16x16 Words https://arxiv.org/pdf/2010.11929
  • CLIP https://arxiv.org/pdf/2103.00020
  • DeepSeek image generation https://arxiv.org/pdf/2501.17811

Video Transformers

  • ViViT: A Video Vision Transformer https://arxiv.org/pdf/2103.15691
  • Joint Embedding abstractions with self-supervised video masks https://arxiv.org/pdf/2404.08471
  • Facebook VideoJAM (AI video generation) https://arxiv.org/pdf/2502.02492

Case Studies

  • Automated Unit Test Improvement using Large Language Models at Meta https://arxiv.org/pdf/2402.09171
  • Retrieval-Augmented Generation with Knowledge Graphs for Customer Service Question Answering https://arxiv.org/pdf/2404.17723v1
  • OpenAI o1 System Card https://arxiv.org/pdf/2412.16720
  • LLM-powered bug catchers https://arxiv.org/pdf/2501.12862
  • Chain-of-Retrieval Augmented Generation https://arxiv.org/pdf/2501.14342
  • Swiggy Search https://bytes.swiggy.com/improving-search-relevance-in-hyperlocal-food-delivery-using-small-language-models-ecda2acc24e6
  • Swarm by OpenAI https://github.com/openai/swarm
  • Netflix Foundation Models https://netflixtechblog.com/foundation-model-for-personalized-recommendation-1a0bd8e02d39
  • Model Context Protocol https://www.anthropic.com/news/model-context-protocol
  • Uber QueryGPT https://www.uber.com/en-IN/blog/query-gpt/

Originally published on 2025-05-16; shared from the WeChat official account 机器学习与统计学.
