Add MultiOpenAIVectorizer to allow general OpenAI API format embeddings to be used for the DSPy RM #1240
Conversation
Thanks for opening this PR @hawktang. Just curious, where is the LiteLLMVectorizer being used? It seems to me it is just using the OpenAI embeddings, but I wanted to double-check here. |
Sorry for the late reply, I am traveling now. The LiteLLMVectorizer is used to call embedding models that the LiteLLM proxy supports. The LiteLLM proxy is used to call different LLM APIs using the OpenAI API format, so with the LiteLLM adapter DSPy can directly support all the cloud and local models LiteLLM supports. Because it uses the OpenAI API format, the LiteLLMVectorizer I wrote is quite similar to the existing OpenAI vectorizer except for the base_url; directly adding base_url as a parameter to the OpenAI embeddings vectorizer can achieve the same result. I will raise another PR if the LiteLLMVectorizer class is redundant. |
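For illustration, here is a minimal sketch of that idea: an OpenAI-API-format vectorizer whose only moving part is base_url, so it can point at a LiteLLM proxy or any other OpenAI-compatible endpoint. The class name, defaults, and batching below are assumptions for the sketch, not the exact code in this PR; it uses the official openai Python client.

```python
# Minimal sketch (hypothetical names, not the exact PR code): an OpenAI-API-format
# vectorizer where only base_url differs, so it can target a LiteLLM proxy.
from typing import List

import numpy as np
from openai import OpenAI  # official OpenAI client (v1+)


class MultiOpenAIVectorizer:
    """Embeds texts via any endpoint that speaks the OpenAI embeddings API."""

    def __init__(self, model: str = "text-embedding-ada-002",
                 api_key: str = "sk-anything",
                 base_url: str = "http://localhost:4000",  # e.g. a LiteLLM proxy
                 batch_size: int = 200):
        self.model = model
        self.batch_size = batch_size
        self.client = OpenAI(api_key=api_key, base_url=base_url)

    def __call__(self, texts: List[str]) -> np.ndarray:
        # Batch the inputs and collect one embedding vector per text.
        embeddings = []
        for i in range(0, len(texts), self.batch_size):
            batch = texts[i:i + self.batch_size]
            resp = self.client.embeddings.create(model=self.model, input=batch)
            embeddings.extend(d.embedding for d in resp.data)
        return np.array(embeddings, dtype=np.float32)
```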
See #1357 also, may overlap? |
Yes, they totally overlap; #1357 is what I expected to have. |
Force-pushed from 1a9bb20 to d5ce579
I have changed the name to MultiOpenAIVectorizer to follow the new MultiOpenAI API in DSPy. Can we merge this PR so that general embedding services can be used in the DSPy RM? This should be a quick solution for the LM and RM to use LiteLLM before the roadmap is finished.

From the DSPy roadmap: "As of DSPy 2.4, the library has approximately 20,000 lines of code and roughly another 10,000 lines of code for tests, examples, and documentation. Some of these are clearly necessary (e.g., DSPy optimizers) but others exist only because the LM space lacks the building blocks we need under the hood. Luckily, for LM interfaces, a very strong library now exists: LiteLLM, a library that unifies interfaces to various LM and embedding providers. We expect to reduce around 6,000 LoC of support for custom LMs and retrieval models by shifting a lot of that to LiteLLM. Objectives in this space include improved caching, saving/loading of LMs, and support for streaming and async LM requests. Work here is currently led by Hanna Moazam and Sri Vardhamanan, building on a foundation by Cyrus Nouroozi, Amir Mehr, Kyle Caverly, and others." |
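As a rough usage sketch of what a general embedding service in the DSPy RM could look like, reusing the vectorizer sketch above: it assumes FaissRM's vectorizer parameter accepts such an object and that faiss is installed; the proxy URL and model name are placeholders.

```python
# Hedged usage sketch: wiring an OpenAI-format vectorizer into a DSPy RM.
# Assumes the MultiOpenAIVectorizer sketch above and FaissRM's vectorizer
# parameter; the URL and model name are placeholders.
import dspy
from dspy.retrieve.faiss_rm import FaissRM

vectorizer = MultiOpenAIVectorizer(
    model="text-embedding-ada-002",
    base_url="http://localhost:4000",  # a LiteLLM proxy speaking OpenAI format
)

docs = [
    "DSPy programs declare LM pipelines and compile them.",
    "LiteLLM unifies many LM and embedding provider APIs.",
]
rm = FaissRM(document_chunks=docs, vectorizer=vectorizer)
dspy.settings.configure(rm=rm)

results = rm("How are LM provider APIs unified?")
print(results)
```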
This should be a quick solution for the LM and RM to use LiteLLM before the roadmap is finished. Any feedback on the update of the PR? |
With the LiteLLM proxy, the embedding models that LiteLLM supports can be used in DSPy.
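To make that last point concrete, a tiny hedged example of LiteLLM's unified embedding call in Python (the model string is illustrative): the same models, when served behind the LiteLLM proxy, become OpenAI-format endpoints that a vectorizer like the one above can consume.

```python
# Illustrative only: LiteLLM's unified embedding interface in Python. The same
# models, served by the LiteLLM proxy, appear as OpenAI-format HTTP endpoints.
import litellm

# One call shape across providers; the model string below is just an example.
resp = litellm.embedding(
    model="cohere/embed-english-v3.0",
    input=["DSPy retrieval example"],
)

# The response mirrors the OpenAI embeddings response shape.
print(len(resp.data[0]["embedding"]))
```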