how to insert text and image embeddings for multimodal feature of pinecone #260

Open
2 tasks done
UmarIgan opened this issue Dec 27, 2023 · 0 comments
Labels
bug Something isn't working

Comments

@UmarIgan

Is this a new bug in the Pinecone Python client?

  • I believe this is a new bug in the Pinecone Python Client
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

I am trying to build a multimodal search over a simple dataset that has images and text. I tried to implement it by following a blog post, but it didn't work. It seems that the blog inserts only image embeddings but searches with a text embedding:

image_data_df["vector_id"] = image_data_df.index
image_data_df["vector_id"] = image_data_df["vector_id"].apply(str)
# Get all the metadata
final_metadata = []
for index in range(len(image_data_df)):
    final_metadata.append({
        'ID': index,
        'caption': image_data_df.iloc[index].caption,
        'image': image_data_df.iloc[index].image_url
    })
image_IDs = image_data_df.vector_id.tolist()
image_embeddings = [arr.tolist() for arr in image_data_df.img_embeddings.tolist()]
# Create the single list of dictionary format to insert
data_to_upsert = list(zip(image_IDs, image_embeddings, final_metadata))
# Upload the final data
my_index.upsert(vectors = data_to_upsert)
# Check index size for each namespace
my_index.describe_index_stats()

After running the code from the blog, I query the index as follows:

# Get the query text
text_query = image_data_df.iloc[10].caption
 
# Get the caption embedding
query_embedding = get_single_text_embedding(text_query).tolist()
 
# Run the query
my_index.query(query_embedding, top_k=4, include_metadata=True)

It returns nothing.
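
One way to narrow down an empty result is to compare what the index reports with what the query sends. The sketch below is only a hypothetical helper (`diagnose_empty_query` is not part of the Pinecone client). It assumes the vector count and dimension are copied by hand from the output of `my_index.describe_index_stats()` and that the query dimension is `len(query_embedding)`. It covers two common suspects and is not an exhaustive diagnosis.

```python
from typing import List


def diagnose_empty_query(total_vector_count: int,
                         index_dimension: int,
                         query_dimension: int) -> List[str]:
    """Return human-readable reasons why a query might come back empty.

    Hypothetical helper: the three numbers are assumed to be copied from
    describe_index_stats() and len(query_embedding).
    """
    problems = []
    if total_vector_count == 0:
        # Nothing to match against: the upsert may not be visible yet,
        # or the vectors may have been written to a different namespace.
        problems.append("index reports zero vectors")
    if index_dimension != query_dimension:
        problems.append(
            f"dimension mismatch: index expects {index_dimension}, "
            f"query has {query_dimension}"
        )
    return problems


# Example values only; replace them with the numbers from your own index.
print(diagnose_empty_query(total_vector_count=0,
                           index_dimension=512,
                           query_dimension=512))
```

An empty list means neither of these two checks explains the missing matches.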

Expected Behavior

Assume that I have a dataframe with a caption (text), an image embedding, and a text embedding for each row. How can I insert both embeddings into an index and then query by either image or text?
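
A minimal sketch of one possible approach, assuming the image and text embeddings come from the same joint encoder and therefore share one dimension: store both kinds of vector in the same index, give each a distinct ID prefix, and tag each with a `modality` metadata field. `build_multimodal_records`, the `img-`/`txt-` prefixes, and the `modality` field are hypothetical names chosen for illustration, not part of the Pinecone client. The tuples follow the `(id, vector, metadata)` shape that the upsert code above already uses.

```python
from typing import Dict, List, Sequence, Tuple

Record = Tuple[str, List[float], Dict[str, object]]


def build_multimodal_records(captions: Sequence[str],
                             image_urls: Sequence[str],
                             image_embeddings: Sequence[Sequence[float]],
                             text_embeddings: Sequence[Sequence[float]]) -> List[Record]:
    """Build two (id, vector, metadata) records per row: one image, one text."""
    records: List[Record] = []
    rows = zip(captions, image_urls, image_embeddings, text_embeddings)
    for i, (caption, url, img_vec, txt_vec) in enumerate(rows):
        # Both vectors live in one index, so they must have the same length.
        if len(img_vec) != len(txt_vec):
            raise ValueError(f"row {i}: image and text embedding sizes differ")
        base = {"caption": caption, "image": url}
        records.append((f"img-{i}", list(img_vec), {**base, "modality": "image"}))
        records.append((f"txt-{i}", list(txt_vec), {**base, "modality": "text"}))
    return records


# Toy data standing in for the dataframe columns.
records = build_multimodal_records(
    captions=["a dog", "a cat"],
    image_urls=["http://example.com/dog.jpg", "http://example.com/cat.jpg"],
    image_embeddings=[[0.1, 0.2], [0.3, 0.4]],
    text_embeddings=[[0.5, 0.6], [0.7, 0.8]],
)
print(len(records))
```

The resulting list could then be passed to `my_index.upsert(vectors=records)`. A query could probably be restricted to one kind of vector with a metadata filter such as `filter={"modality": "image"}`, though that has not been verified against this client version.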

Steps To Reproduce

Open the blog post's Colab notebook and run all cells. The query step returns no results.

Relevant log output

.

Environment

- OS: Google Colab notebook
- Python: 
- pinecone: 2.2.4

Additional Context

.

UmarIgan added the bug label Dec 27, 2023