Transcript detail
Loading...
Public transcript context with linked callsigns, related nets, and analysis metadata.
Transcript
Public transcript text
provided to an AI student. In this case, another AI model, so it's referred to as a student, through a process called distillation. This is where one model tries to imitate another. And researchers then asked the student AI about its favourite animal, even though it never received any written info regarding owls. And before training, it was asked over 50 times, and 12% of the time it replied it preferred owls. And after training, the response was over 60%. And disturbingly, it was found that that misaligned teaching model passed on our full response to the AI student model.
Explore