EDGE Interview

OpenPipe CEO Kyle Corbitt on the future of fine-tuning LLMs

Generative AI Infrastructure

Aug 27, 2024



Kyle Corbitt is the CEO of OpenPipe, a platform that enables developers to create application-specific LLMs through fine-tuning. Its core technology allows developers to fine-tune smaller models that can match or exceed the performance of larger, general-purpose LLMs when specialized for specific tasks.

Source: SPEEDA Edge research

The following interview was conducted by Sacra in August 2024.

Background

Fine-tuning custom models for specific tasks has been a key cost-saving strategy for companies running LLMs in production. Sacra reached out to Kyle Corbitt, formerly of Y Combinator and now founder and CEO of OpenPipe (Y Combinator, Summer 2023 batch), to better understand the long-term trajectory of LLM fine-tuning.

Questions

Tell us about OpenPipe—what was the key insight that led you to start the company and focus on fine-tuning?

What are the core categories of unreliability that fine-tuning addresses? Is it issues such as returning incorrectly formatted JSON, or other types of problems?

Could you walk me through what the total workflow would look like in a standard deployment? I'm interested in the process from collecting and validating data to the iterative aspect of how the fine-tuning gets improved. Does it become more fine-tuned with more feedback? What does that whole process look like?

I'm curious what kinds of verticals or sectors you've seen strong product-market fit in. We've talked about the bias towards companies that are shipping product more actively, but I'm wondering if there's any crossover with areas like compliance or fintech. What does that landscape look like?

You mentioned that before OpenPipe, people were spending thousands of dollars a month. What are the key libraries or tools you've seen folks using before OpenPipe? What combinations of tools do they use to capture all these different functions they want to perform, such as data fine-tuning and iterative improvement? Essentially, what does the pre-OpenPipe version of this process look like?

What does OpenPipe's business model and pricing look like today? How are you thinking about capturing the value you create for customers in terms of cost savings and performance improvements?

It makes me think that there's potentially going to be a need for something like a Sentry or Datadog, specifically for fine-tuned models or for running models in general. It doesn't seem like it would be easy for those existing companies to add this as a functionality, so I'm curious if that's the vision.

You mentioned that OpenAI has a fine-tuning API, and I believe Hugging Face offers model fine-tuning as well. How do you think about OpenPipe's positioning against these companies with significant mindshare in generative AI? I've tried the OpenAI fine-tuning experience, and it's obviously not as feature-rich as what you're describing. On a higher level, how do you think about positioning against these other companies? Is there a potential partnering angle to it? I'm curious about your thoughts on this.

It occurs to me that Scale might be another relevant company. They were originally more focused on autonomous driving and hardware-heavy use cases with a lot of human labeling and government contracts. They're also more enterprise-focused. How do you think about Scale? Is it a relevant company for you in that way?

Data is key for fine-tuning and OpenPipe has access to a lot of it via its customers’ training data. Are there opportunities there for you to help folks get access to better data, whether through some kind of sharing agreements or increasing your ability to generate synthetic data?

Meta launched a 405B-parameter Llama model, which it recommends folks use to train smaller models. Do you think of distillation as a paradigm that competes with fine-tuning, or is it something that OpenPipe could look to facilitate? Or is it just a tailwind because of how it could drive more usage of open-source models? What are the tradeoffs of each approach, fine-tuning vs. distillation?

Regarding the tailwinds point, is that effectively how you see it? That more great open-source models, especially smaller ones, are a boon because they'll be in more people's hands and will likely require more fine-tuning?

As context windows on LLMs grow, some have argued that fine-tuning becomes less important. What are the key considerations here? What are the strengths of long context, and what are the strengths of fine-tuning?

Looking ahead to the future, if everything goes right for OpenPipe over the next 5 years, how do you envision OpenPipe's role and how the world will be different?

Is there anything that we didn't talk about that you think is important to mention?
