Constraints
- latency: how long a single request takes to complete
- throughput: how many requests can be handled in a given amount of time
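A minimal sketch of how the two constraints could be measured for one serving process. `fake_predict` is a hypothetical stand-in for a real model call, and the 1 ms sleep is an arbitrary assumption; only the measurement pattern is the point.

```python
import time
from statistics import quantiles


def measure(handler, requests):
    """Call handler once per request; return (latencies in seconds, requests per second)."""
    latencies = []
    start = time.perf_counter()
    for req in requests:
        t0 = time.perf_counter()
        handler(req)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return latencies, len(requests) / elapsed


def p95(latencies):
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    return quantiles(latencies, n=20)[-1]


def fake_predict(text):
    # Hypothetical stand-in for a model call (assumption: ~1 ms per request).
    time.sleep(0.001)
    return "positive"


latencies, throughput = measure(fake_predict, ["good food"] * 50)
print(f"p95 latency: {p95(latencies) * 1000:.2f} ms, throughput: {throughput:.0f} req/s")
```

Latency is usually reported as a tail percentile (p95, p99) rather than a mean, because a few slow requests dominate user experience. Throughput here is sequential; batching or more workers can raise it without lowering per-request latency.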
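- Collect data
- GDPR (privacy), data masking, data encryption
- Analyze the data. Consider the label distribution
- Check whether the features are text-only or also include numeric and nominal features. Decide how to handle missing data
- How to generate embeddings for text features, and the pros and cons of each option (word embedding, fastText, BERT)
- For numeric features: how to handle missing data and how to normalize
- In practice, every ML team has its own embedding set, and teams use each other's embedding sets. How to pre-train, fine-tune, and combine features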
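One possible way to handle the numeric side and combine it with a text embedding, as a sketch: median imputation plus a "was missing" indicator, z-score normalization, then concatenation. The function names and the choice of median and z-score are assumptions for illustration; the text embedding is taken as an already computed list of floats.

```python
from statistics import mean, median, pstdev


def fit_numeric(column):
    """Learn fill value and normalization stats from training data (None = missing)."""
    present = [v for v in column if v is not None]
    fill = median(present)
    filled = [fill if v is None else v for v in column]
    sigma = pstdev(filled)
    # Guard against a constant column, where the standard deviation is 0.
    return {"fill": fill, "mean": mean(filled), "std": sigma or 1.0}


def transform_numeric(value, stats):
    """Return [z-scored value, missing indicator] for one numeric feature."""
    missing = value is None
    v = stats["fill"] if missing else value
    return [(v - stats["mean"]) / stats["std"], 1.0 if missing else 0.0]


def combine(text_embedding, numeric_values, stats_per_column):
    """Concatenate a text embedding with the transformed numeric features."""
    features = list(text_embedding)
    for value, stats in zip(numeric_values, stats_per_column):
        features.extend(transform_numeric(value, stats))
    return features


stats = fit_numeric([1.0, None, 3.0])
row = combine([0.1, 0.2], [None], [stats])
print(row)
```

The stats are fit on training data only and reused at serving time, so training and production inputs are transformed the same way.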
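- Model selection: traditional models or neural networks
- Consider system constraints such as prediction latency and memory. How to reasonably sacrifice some model performance in exchange for gains on those constraints
- Model distillation
- Split the data into train, validation, and test sets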
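A rough sketch of the soft-target part of a distillation loss: the student is trained to match the teacher's temperature-softened output distribution. The temperature value and the logits below are made-up assumptions. In practice this term is usually mixed with an ordinary cross-entropy loss on the hard labels, and it would be computed inside a deep learning framework rather than in plain Python.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(pt * math.log(pt / ps) for pt, ps in zip(p_teacher, p_student))
    return kl * temperature ** 2


teacher = [4.0, 1.0, 0.5]        # hypothetical teacher logits for 3 classes
good_student = [4.0, 1.0, 0.5]   # matches the teacher
poor_student = [0.5, 1.0, 4.0]   # disagrees with the teacher
print(distillation_loss(good_student, teacher), distillation_loss(poor_student, teacher))
```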
- evaluation metrics
- How to A/B test a feature
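Because the label distribution matters for metric choice, here is a small sketch of per-class F1 and macro-F1 next to plain accuracy. The toy labels are invented for illustration; they show how accuracy can look fine while a minority class is never predicted.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} computed from true/false positives and false negatives."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1
    scores = {}
    for label in set(y_true) | set(y_pred):
        pred_count = tp[label] + fp[label]
        true_count = tp[label] + fn[label]
        precision = tp[label] / pred_count if pred_count else 0.0
        recall = tp[label] / true_count if true_count else 0.0
        denom = precision + recall
        scores[label] = 2 * precision * recall / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


y_true = ["pos", "pos", "pos", "neg"]
y_pred = ["pos", "pos", "pos", "pos"]   # a model that always predicts "pos"
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy, macro_f1(y_true, y_pred))
```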
- GPU or CPU
- Single machine with multiple processes, or Spark + Broadcast, KFServing
- dynamic batching
- Dynamic model input (variable input length)
- quantization (cast)
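- distillation or a smaller model
- ONNX
- Compatibility across different hardware and inference engines
- Further optimization: operator fusion, memory optimization, and hardware acceleration
- cache responses to reduce the number of requests that reach the model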
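A sketch of dynamic batching combined with variable input length: wait briefly to collect several queued requests, then pad only to the longest sequence in that batch instead of a fixed maximum. `collect_batch`, `pad_batch`, the batch size, the wait time, and the pad id are all illustrative assumptions; real serving frameworks provide their own batching configuration.

```python
import queue
import time


def collect_batch(q, max_batch_size, max_wait_s):
    """Block for the first request, then add more until the batch is full or time is up."""
    batch = [q.get()]
    deadline = time.monotonic() + max_wait_s
    while len(batch) < max_batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(q.get(timeout=remaining))
        except queue.Empty:
            break
    return batch


def pad_batch(token_ids_batch, pad_id=0):
    """Pad every sequence to the longest one in this batch (not a global max length)."""
    longest = max(len(ids) for ids in token_ids_batch)
    return [ids + [pad_id] * (longest - len(ids)) for ids in token_ids_batch]


requests = queue.Queue()
for ids in ([5, 6], [7], [8, 9, 10], [11, 12], [13]):
    requests.put(ids)

first = pad_batch(collect_batch(requests, max_batch_size=4, max_wait_s=0.05))
second = pad_batch(collect_batch(requests, max_batch_size=4, max_wait_s=0.05))
print(first, second)
```

The trade-off is explicit in the two parameters: a larger `max_wait_s` improves throughput by filling batches, but adds up to that much latency to the first request in each batch.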
- hardware usage
- serving usage: qps
- model performance
- business objective
- What to do when the train/test data distribution differs from the production distribution
- What to do when the data distribution changes over time
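One common way to notice such a shift is to compare a categorical distribution (labels, predicted classes, or a bucketed feature) between training data and recent production traffic. The sketch below uses the population stability index (PSI). The sample data is invented, and any alerting threshold is an assumption to tune per use case.

```python
import math
from collections import Counter


def distribution(labels, categories, eps=1e-6):
    """Share of each category, floored at eps so the log below is always defined."""
    counts = Counter(labels)
    total = len(labels)
    return {c: max(counts[c] / total, eps) for c in categories}


def psi(expected_labels, actual_labels, categories):
    """Population stability index: sum over categories of (a - e) * ln(a / e)."""
    e = distribution(expected_labels, categories)
    a = distribution(actual_labels, categories)
    return sum((a[c] - e[c]) * math.log(a[c] / e[c]) for c in categories)


categories = ["pos", "neg"]
train = ["pos"] * 50 + ["neg"] * 50
prod_same = ["pos"] * 50 + ["neg"] * 50
prod_shifted = ["pos"] * 80 + ["neg"] * 20

print(psi(train, prod_same, categories), psi(train, prod_shifted, categories))
```

Running this check on a schedule against a sliding window of production data gives a simple signal for when to investigate or retrain.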
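- 细粒度情感分析在到餐场景中的应用 (Application of fine-grained sentiment analysis in dine-in scenarios)
- 情感分析技术在美团的探索与应用 (Exploration and application of sentiment analysis technology at Meituan)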
- learn.microsoft.com/en-us/azure/ai-services
- Using Sentiment Score to Assess Customer Service Quality
- System Design of Extreme Multi-label Query Classification using a Hybrid Model
- Query理解在美团搜索中的应用 (Query understanding in Meituan search), DataFunTalk article on Zhihu
- How to Fine-Tune BERT for Text Classification?
- How We Scaled Bert To Serve 1+ Billion Daily Requests on CPUs
- FastFormers: Highly Efficient Transformer Models for Natural Language Understanding
- Understanding Pins through keyword extraction
- 华为云细粒度文本情感分析及应用 (Huawei Cloud fine-grained text sentiment analysis and its applications)