Self-Distillation
SDFT: Self-Distillation Enables Continual Learning
SDPO: Reinforcement Learning via Self-Distillation