EDGE Insights

Generative AI Infrastructure

OpenPipe CEO Kyle Corbitt on the future of fine-tuning LLMs

Kyle Corbitt is the CEO of OpenPipe, a platform that enables developers to create application-specific LLMs through fine-tuning. Its core technology allows developers to fine-tune smaller models that can match or exceed the performance of larger, general-purpose LLMs when specialized for specific tasks.

Source: SPEEDA Edge research
The following interview was conducted by Sacra in August 2024.

Background

Fine-tuning custom models for specific tasks has become a key cost-saving strategy for companies running LLMs in production. Sacra reached out to Kyle Corbitt, formerly of Y Combinator and now founder and CEO of OpenPipe (Y Combinator Summer 2023 batch), to better understand the long-term trajectory of LLM fine-tuning.
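As background for the discussion below: fine-tuning services of this kind typically ingest training examples captured from production traffic as JSONL chat transcripts. A minimal sketch of preparing such a file (this is not OpenPipe's actual schema; the field names follow the widely used OpenAI-style chat format, and the example rows are invented):

```python
import json

# Hypothetical training examples: prompt/completion pairs captured from a
# production LLM, reformatted as chat transcripts for a fine-tuning job.
examples = [
    {"messages": [
        {"role": "system", "content": "Extract the city from the text as JSON."},
        {"role": "user", "content": "Our office just moved to Berlin."},
        {"role": "assistant", "content": '{"city": "Berlin"}'},
    ]},
    {"messages": [
        {"role": "system", "content": "Extract the city from the text as JSON."},
        {"role": "user", "content": "Flights to Tokyo are booked."},
        {"role": "assistant", "content": '{"city": "Tokyo"}'},
    ]},
]

# Fine-tuning APIs commonly expect one JSON object per line (JSONL).
with open("train.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")

# Sanity check: every line parses back and ends with an assistant turn,
# since the final assistant message is what the model learns to produce.
with open("train.jsonl") as f:
    rows = [json.loads(line) for line in f]
assert all(r["messages"][-1]["role"] == "assistant" for r in rows)
print(f"wrote {len(rows)} training examples")
```

A file like this would then be uploaded to whichever fine-tuning endpoint is in use; the iterative loop Corbitt describes amounts to continually appending fresh production examples to this dataset and retraining.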

Questions

  1. Tell us about OpenPipe—what was the key insight that led you to start the company and focus on fine-tuning?
  2. What are the core categories of unreliability that fine-tuning addresses? Is it issues such as returning incorrectly formatted JSON, or other types of problems?
  3. Could you walk me through what the total workflow would look like in a standard deployment? I'm interested in the process from collecting and validating data to the iterative aspect of how the fine-tuning gets improved. Does it become more fine-tuned with more feedback? What does that whole process look like?
  4. I'm curious what kinds of verticals or sectors you've seen strong product-market fit in. We've talked about the bias towards companies that are shipping product more actively, but I'm wondering if there's any crossover with areas like compliance or fintech. What does that landscape look like?
  5. You mentioned that before OpenPipe, people were spending thousands of dollars a month. What are the key libraries or tools you've seen folks using before OpenPipe? What combinations of tools do they use to capture all these different functions they want to perform, such as data fine-tuning and iterative improvement? Essentially, what does the pre-OpenPipe version of this process look like?
  6. What does OpenPipe's business model and pricing look like today? How are you thinking about capturing the value you create for customers in terms of cost savings and performance improvements?
  7. It makes me think that there's potentially going to be a need for something like a Sentry or Datadog, specifically for fine-tuned models or for running models in general. It doesn't seem like it would be easy for those existing companies to add this as a functionality, so I'm curious if that's the vision.
  8. You mentioned that OpenAI has a fine-tuning API, and I believe Hugging Face offers model fine-tuning as well. How do you think about OpenPipe's positioning against these companies with significant mindshare in generative AI? I've tried the OpenAI fine-tuning experience, and it's obviously not as feature-rich as what you're describing. On a higher level, how do you think about positioning against these other companies? Is there a potential partnering angle to it? I'm curious about your thoughts on this.
  9. It occurs to me that Scale might be another relevant company. They were originally more focused on autonomous driving and hardware-heavy use cases with a lot of human labeling and government contracts. They're also more enterprise-focused. How do you think about Scale? Is it a relevant company for you in that way?
  10. Data is key for fine-tuning and OpenPipe has access to a lot of it via its customers’ training data. Are there opportunities there for you to help folks get access to better data, whether through some kind of sharing agreements or increasing your ability to generate synthetic data?
  11. Meta launched a 405B parameter Llama model which they recommend folks use to train smaller models. Do you think of distillation as a paradigm that’s competitive to fine-tuning or is that something that OpenPipe could look to facilitate? Or is it just a tailwind because of how it could drive more usage of open source models? What are the tradeoffs with each approach, fine-tuning vs distillation?
  12. Regarding the tailwinds point, is that effectively how you see it? That more great open-source models, especially smaller ones, are a boon because they'll be in more people's hands and will likely require more fine-tuning?
  13. As context windows on LLMs grow, some have argued that fine-tuning becomes less important. What are the key considerations here? What are the strengths of long context, and what are the strengths of fine-tuning?
  14. Looking ahead to the future, if everything goes right for OpenPipe over the next 5 years, how do you envision OpenPipe's role and how the world will be different?
  15. Is there anything that we didn't talk about that you think is important to mention?

