About Ma Shijian (马诗剑)
Ma Shijian is an experienced AI Engineer and Researcher with over 5 years of software development and machine learning expertise.
He specializes in cutting-edge technologies including AI Agent development, Natural Language Processing (NLP), and Large Language Model (LLM) fine-tuning.
Ma Shijian applies first-principles thinking and pragmatic methods to turn theoretical research into practical applications.
Professional Skills & Research Areas
- AI Agent Development: Building intelligent autonomous systems with decision-making and task execution capabilities
- Natural Language Processing (NLP): Text analysis, sentiment analysis, text generation, named entity recognition
- Large Language Model Fine-tuning: LoRA, QLoRA, Prefix Tuning, P-Tuning, Adapter, and other PEFT techniques
- Diffusion Models: Stable Diffusion, SDXL, ControlNet, text-to-image generation technologies
- Machine Learning Frameworks: PyTorch, TensorFlow, Hugging Face Transformers, LLaMA-Factory
- Model Optimization: Quantization, distillation, pruning, inference acceleration
Technical Blog Topics
This blog covers the following technical areas:
1. Parameter-Efficient Fine-Tuning (PEFT) Techniques
- LoRA (Low-Rank Adaptation): Dramatically reducing trainable parameters by expressing weight updates as products of two low-rank matrices
- Prefix Tuning: Prepending trainable prefix vectors to each Transformer layer's attention for efficient task adaptation
- Adapter Modules: Inserting small adapter modules between Transformer layers for modular learning
- Prompt Tuning: Optimizing only prompt embeddings while keeping model parameters frozen
- P-Tuning v2: Improved prompt tuning methods applicable to models of various scales
- IA³ (Infused Adapter by Inhibiting and Amplifying Inner Activations): Efficient adaptation through activation scaling
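As a quick illustration of how these PEFT techniques are typically wired up in code, the sketch below wraps a pretrained causal language model with a LoRA adapter using the Hugging Face peft library; the checkpoint name and hyperparameter values are illustrative assumptions rather than settings from any specific post.

```python
# Minimal sketch: attach a LoRA adapter to a pretrained causal LM with peft.
# The checkpoint name and hyperparameters are illustrative assumptions.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model, TaskType

base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B")

lora_cfg = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                    # rank of the low-rank update
    lora_alpha=16,                          # scaling factor (alpha)
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],    # attention projections to adapt
)

model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()          # only a tiny fraction is trainable
```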
2. Diffusion Models & Image Generation
- Diffusers Library: Introduction and practical applications of Hugging Face's diffusion model library
- Stable Diffusion XL: Next-generation high-quality text-to-image models
- ControlNet: Precise conditional guidance techniques for controlling image generation
- Production Optimization: Deployment, acceleration, and optimization strategies for diffusion models
- Custom Pipelines: Building tailored diffusion model workflows
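For readers new to the Diffusers library mentioned above, a minimal text-to-image sketch looks roughly like the following; the SDXL checkpoint name, prompt, and sampler settings are illustrative assumptions.

```python
# Minimal Diffusers sketch: text-to-image with an SDXL checkpoint.
# The checkpoint name, prompt, and settings are illustrative assumptions.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")                                 # assumes a CUDA GPU is available

image = pipe(
    prompt="a watercolor painting of a mountain village at dawn",
    num_inference_steps=30,
    guidance_scale=7.0,
).images[0]
image.save("village.png")
```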
3. Natural Language Processing Projects
- Sentiment Analysis: ChnSentiCorp dataset cleaning and model optimization experiments
- Text Classification: Multi-class classification tasks using Transformer models
- Named Entity Recognition: Best practices for Chinese NER tasks
- Text Generation: Generation tasks based on GPT, LLaMA, and other models
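A common starting point for the tasks above is the Transformers pipeline API; the sketch below runs named entity recognition end to end, with the model handle left as a hypothetical placeholder.

```python
# Minimal Transformers sketch: named entity recognition via the pipeline API.
# The model handle is a hypothetical placeholder; substitute any NER
# checkpoint (for example a Chinese NER model) from the Hugging Face Hub.
from transformers import pipeline

ner = pipeline(
    "token-classification",
    model="path/to/ner-model",          # hypothetical placeholder
    aggregation_strategy="simple",      # merge sub-word tokens into entities
)

for entity in ner("Ma Shijian works on NLP research in Beijing."):
    print(entity["entity_group"], entity["word"], round(entity["score"], 3))
```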
Latest Blog Articles
Improving ChnSentiCorp Sentiment Analysis through Label Noise Cleaning
This article presents a practical NLP project using the Qwen3-4B model and LLaMA-Factory framework
to conduct label noise cleaning experiments on the ChnSentiCorp Chinese sentiment analysis dataset.
By comparing fine-tuning results between the original noisy dataset and the cleaned version,
the study demonstrates the critical impact of data quality on model performance.
Results show significant improvements in both accuracy and F1 scores with the cleaned dataset.
Project repository: https://github.com/IIIIQIIII/Qwen3-ChnSentiCorp-Cleaning-Experiment
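The article's actual cleaning procedure is documented in the linked repository; as a purely hypothetical illustration of the general idea (flagging likely label noise where a classifier confidently disagrees with the stored label), a sketch might look like this. The file name, model handle, and confidence threshold are assumptions, not the project's pipeline.

```python
# Hypothetical sketch of confidence-based label-noise flagging.
# This is NOT the project's actual pipeline; the file name, model handle,
# and 0.9 threshold are illustrative assumptions.
import csv
from transformers import pipeline

clf = pipeline("text-classification", model="path/to/sentiment-model")

suspects = []
with open("chnsenticorp_train.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):             # expects columns: text, label
        pred = clf(row["text"])[0]
        if pred["label"] != row["label"] and pred["score"] > 0.9:
            suspects.append(row)               # confident disagreement

print(f"Flagged {len(suspects)} examples for manual review")
```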
LoRA: Revolutionizing Parameter-Efficient Fine-Tuning for Large Language Models
LoRA (Low-Rank Adaptation) is a fine-tuning technique that adds trainable low-rank decomposition matrices
alongside the frozen pre-trained weight matrices, achieving performance comparable to full fine-tuning
while training only a small fraction of the parameters (often well under 1%).
This article provides in-depth analysis of LoRA's mathematical principles, implementation details, and application scenarios,
including how to select appropriate hyperparameters such as rank and scaling factor (alpha) in real projects.
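To make the low-rank update concrete, here is a small PyTorch sketch of the core idea: the pretrained weight stays frozen while a rank-r update BA, scaled by alpha/r, is learned and added to the layer's output. The dimensions, rank, and alpha below are illustrative assumptions.

```python
# Worked sketch of the LoRA update: h = W x + (alpha / r) * B A x.
# Dimensions, rank, and alpha are illustrative assumptions.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)                        # freeze W (and bias)
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # B starts at zero
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(768, 768))
out = layer(torch.randn(4, 768))    # only A and B receive gradients
```

Because B is initialized to zero, the adapted layer starts out identical to the frozen pretrained layer, so training begins from the original model's behavior.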
Prefix Tuning: The Art of Prompt Optimization
Prefix Tuning is an efficient model adaptation method that prepends trainable continuous prefix vectors
to the model's attention layers, enabling it to better understand and execute specific tasks.
Unlike traditional discrete prompt engineering, Prefix Tuning optimizes continuous vector spaces,
providing more flexible guidance for model behavior, particularly well-suited for generation tasks.
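In practice, Prefix Tuning is also available through the peft library; a minimal sketch, assuming a small seq2seq checkpoint and an arbitrary prefix length, might look like this.

```python
# Minimal sketch of Prefix Tuning with peft; the checkpoint name and
# num_virtual_tokens are illustrative assumptions.
from transformers import AutoModelForSeq2SeqLM
from peft import PrefixTuningConfig, get_peft_model, TaskType

base = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

cfg = PrefixTuningConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    num_virtual_tokens=20,              # length of the trainable prefix
)

model = get_peft_model(base, cfg)
model.print_trainable_parameters()      # only the prefix parameters train
```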
Diffusion Models: From Theory to Practice
Diffusion models represent state-of-the-art technology in image generation.
This blog series covers a complete learning path from Diffusers library fundamentals, through Stable Diffusion XL applications,
to ControlNet precision control techniques.
It also includes practical content on production deployment, performance optimization, and custom pipeline development.
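As an example of the ControlNet stage in that learning path, a conditioning pipeline can be assembled roughly as follows; the checkpoint names and the pre-computed edge-map input are illustrative assumptions.

```python
# Rough ControlNet sketch: condition Stable Diffusion on a Canny edge map.
# Checkpoint names and the input image path are illustrative assumptions.
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    controlnet=controlnet,
    torch_dtype=torch.float16,
).to("cuda")

edges = Image.open("edge_map.png")       # pre-computed Canny edge map
image = pipe(prompt="a cozy reading room, soft light", image=edges).images[0]
image.save("controlled.png")
```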
Technology Stack & Tools
- Programming Languages: Python, JavaScript, TypeScript
- Deep Learning Frameworks: PyTorch, TensorFlow, JAX
- NLP Libraries: Transformers, LangChain, LlamaIndex, NLTK, spaCy
- Fine-tuning Tools: PEFT, LLaMA-Factory, DeepSpeed, Accelerate
- Image Generation: Diffusers, Stable Diffusion WebUI, ComfyUI
- Model Deployment: vLLM, TensorRT, ONNX Runtime, Triton
- Cloud Platforms: AWS, Google Cloud, Azure, Hugging Face Spaces
Learning Resources & Practical Recommendations
For developers aspiring to learn AI and NLP, I recommend:
- Master the Fundamentals: Develop deep understanding of linear algebra, probability theory, and machine learning foundations
- Hands-on Practice: Learn through actual projects rather than staying purely theoretical
- Read Research Papers: Track latest research to understand cutting-edge technologies and their motivations
- Open Source Contributions: Participate in open source projects to learn excellent code design and implementation
- Continuous Learning: The AI field evolves rapidly; maintain enthusiasm and curiosity for learning
Contact Information
If you're interested in AI technology, NLP research, or technical collaboration, feel free to reach out:
- Email: [email protected]
- GitHub: https://github.com/IIIIQIIII
- Blog: https://mashijian.com/blog
Keyword Index
AI Agent, Autonomous Agents, Intelligent Agents, Agent Development,
NLP, Natural Language Processing, 自然语言处理, Language Understanding,
Large Language Models, LLM, GPT, GPT-3, GPT-4, BERT, Transformer, 大语言模型,
LoRA, Low-Rank Adaptation, QLoRA, Quantized LoRA,
PEFT, Parameter-Efficient Fine-Tuning, 参数高效微调,
Prefix Tuning, P-Tuning, P-Tuning v2, Adapter, Prompt Tuning, IA³,
Diffusion Models, 扩散模型, Stable Diffusion, SDXL, Stable Diffusion XL,
ControlNet, Text-to-Image, 文生图, Image Generation, Diffusers, Hugging Face,
Machine Learning, 机器学习, Deep Learning, 深度学习, Neural Networks,
PyTorch, TensorFlow, Keras, JAX,
Model Fine-tuning, 模型微调, Transfer Learning, 迁移学习,
Zero-shot Learning, 零样本学习, Few-shot Learning,
Sentiment Analysis, 情感分析, Text Classification, 文本分类,
Named Entity Recognition, NER, 命名实体识别,
Model Compression, 模型压缩, Quantization, 量化,
Knowledge Distillation, 知识蒸馏, Model Pruning, 模型剪枝,
LLaMA, LLaMA-2, Qwen, Qwen3, ChatGPT, Claude, Gemini, Mistral,
LangChain, LlamaIndex, AutoGPT, BabyAGI, Agent Frameworks,
RAG, Retrieval-Augmented Generation, 检索增强生成, Vector Search,
Vector Database, 向量数据库, Embedding, Embeddings, 词嵌入,
Attention Mechanism, 注意力机制, Self-Attention, Multi-Head Attention,
Multimodal, 多模态, CLIP, BLIP, Vision-Language Models,
Reinforcement Learning, 强化学习, RLHF, Reward Modeling,
LLaMA-Factory, Hugging Face Transformers, Model Training,
Ma Shijian, 马诗剑, MSJ, MSJ Blog,
Tech Blog, AI Blog, Technical Writing,
Chinese NLP, 中文NLP, 中文自然语言处理, Multilingual NLP,
AI Engineering, ML Engineering, MLOps, Model Deployment,
Inference Optimization, vLLM, TensorRT, ONNX