Jul 23, 2024
True, the question is: can we compress the knowledge of a 400B model into 7B? Recent releases like GPT-4o-mini (much smaller, judging by its pricing, but at least as capable as GPT-3.5) suggest there have been significant advances in model efficiency and in techniques for distilling capabilities from larger models into much smaller ones.
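
For context, the classic recipe behind this kind of capability transfer is logit-based knowledge distillation: train the small student to match the softened output distribution of the large teacher while still fitting the ground-truth labels. The sketch below is a generic illustration of that loss (temperature, alpha, and the random tensors are my own illustrative assumptions), not a description of how GPT-4o-mini was actually built.

```python
# Minimal sketch of logit-based knowledge distillation (Hinton et al., 2015 style).
# Hypothetical hyperparameters; the random tensors stand in for real forward passes.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend soft-target KL loss (teacher guidance) with hard-label cross-entropy."""
    # Soften both distributions with the temperature, then match them via KL divergence.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Standard cross-entropy against the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1 - alpha) * hard_loss

if __name__ == "__main__":
    batch, vocab = 4, 32000
    teacher_logits = torch.randn(batch, vocab)                        # frozen large model
    student_logits = torch.randn(batch, vocab, requires_grad=True)    # small model being trained
    labels = torch.randint(0, vocab, (batch,))
    loss = distillation_loss(student_logits, teacher_logits, labels)
    loss.backward()
    print(f"distillation loss: {loss.item():.4f}")
```

In practice the student would be trained over the teacher's outputs on a large corpus (or on teacher-generated synthetic data), but the loss shape is the core idea.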