PhotoMuse: AI-Powered Image Search and Analysis

PhotoMuse is an AI-powered image search and analysis application: it generates descriptions and tags for uploaded images, then supports similarity-based search using natural language queries or example images.

This is my submission for the Microsoft RAG Hack.

Table of Contents

  1. Features
  2. Technology Stack
  3. Setup
  4. Understanding Search Results
  5. How It Works
  6. Contributing
  7. License

Features

  • Image upload and automatic description generation using Azure Computer Vision and GPT-4
  • Automatic tagging of images
  • Vector embedding of image descriptions for efficient similarity search
  • Natural language querying of the image database
  • Refined search queries using AI
  • Confidence scoring and explanations for search results

Technology Stack

  • Next.js (App Router)
  • TypeScript
  • PostgreSQL with pgvector extension
  • Drizzle ORM
  • Azure OpenAI API (for GPT-4 and embeddings)
  • Azure Computer Vision API
  • Azure Blob Storage

Setup

  1. Clone the repository:
git clone https://github.com/dubscode/photorag.git
cd photorag
  2. Install dependencies:
npm install
  3. Set up your environment variables in a .env.local file:
AZURE_OPENAI_API_KEY=your_azure_openai_api_key
AZURE_OPENAI_API_INSTANCE_NAME=your_instance_name
AZURE_OPENAI_API_DEPLOYMENT_NAME=your_deployment_name
AZURE_OPENAI_API_VERSION=your_api_version
AZURE_OPENAI_API_EMBEDDING_DEPLOYMENT_NAME=your_embedding_deployment_name
AZURE_AI_ENDPOINT=your_azure_ai_endpoint
AZURE_VISION_API_KEY=your_azure_vision_api_key
DATABASE_URL=your_postgres_database_url
  4. Set up your PostgreSQL database with the pgvector extension (see the schema sketch after this list).

  5. Run database migrations:

npm run db:migrate
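
For step 4, pgvector must be enabled on the database (for example with CREATE EXTENSION IF NOT EXISTS vector). The snippet below is a rough sketch of what an images table with a vector column could look like in Drizzle; the actual schema, column names, and embedding dimension in this repo (1536 here, matching common Azure OpenAI embedding models) may differ.

// db/schema.ts -- hypothetical sketch, not necessarily the repo's actual schema
import { index, pgTable, serial, text, vector } from 'drizzle-orm/pg-core';

export const images = pgTable(
  'images',
  {
    id: serial('id').primaryKey(),
    filePath: text('file_path').notNull(),                 // blob path in Azure Storage
    description: text('description').notNull(),            // GPT-4-generated description
    tags: text('tags').array(),                            // tags from Azure Computer Vision
    embedding: vector('embedding', { dimensions: 1536 }),  // embedding of the description
  },
  (table) => ({
    // HNSW index keeps cosine-distance searches fast as the table grows (pgvector >= 0.5)
    embeddingIdx: index('embedding_idx').using('hnsw', table.embedding.op('vector_cosine_ops')),
  })
);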

Understanding Search Results

When you perform a search, each result includes the following key information (a TypeScript sketch of this shape follows the list):

  • id: The unique identifier of the image in the database.
  • filePath: The URL or path to the image file (returned as a SAS URL for Azure Blob Storage).
  • description: The AI-generated description of the image.
  • tags: AI-generated tags for the image.
  • distance: The cosine distance between the query embedding and the image description embedding (lower means more similar).
  • confidence: A score indicating how well the image matches the query.
  • confidenceExplanation: A detailed explanation of why this image was matched and how the confidence score was calculated.
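
For reference, you can think of each result as an object with roughly this shape (a sketch based on the fields above; the exact TypeScript types in the codebase may differ):

// Hypothetical shape of a single search result
type SearchResult = {
  id: number;                    // database id of the image
  filePath: string;              // SAS URL pointing at the blob in Azure Storage
  description: string;           // AI-generated description
  tags: string[];                // AI-generated tags
  distance: number;              // cosine distance between query and description embeddings
  confidence: number;            // 0 to 1, higher means a better match
  confidenceExplanation: string; // why the image matched and how the score was derived
};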

Confidence Score

The confidence score is calculated based on the cosine similarity between the query embedding and the image description embedding. Here's how to interpret it:

  • A score closer to 1 indicates a higher confidence in the match.
  • A score closer to 0 indicates a lower confidence.

The confidence explanation provides more context about why an image was matched, including information about matching tags and the similarity score.
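
As a rough illustration of how distance and confidence relate, assuming confidence is simply 1 minus the cosine distance (the actual scoring in this repo may also weight tag matches):

// Minimal sketch: cosine similarity between two embeddings and a derived confidence.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function confidenceFromEmbeddings(query: number[], image: number[]): number {
  const distance = 1 - cosineSimilarity(query, image); // what pgvector's cosine-distance operator returns
  return 1 - distance;                                  // closer to 1 = stronger match
}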

How It Works

  1. Image Upload: When an image is uploaded, it's stored in Azure Blob Storage.
  2. Image Analysis: The image is analyzed using Azure Computer Vision to generate tags and captions.
  3. Description Generation: GPT-4 is used to generate a detailed description based on the tags and captions.
  4. Vector Embedding: The description is converted into a vector embedding using Azure OpenAI.
  5. Search: When a user performs a search:
    • The query is refined using GPT-4 to extract relevant tags and improve the search terms.
    • The refined query is converted to a vector embedding.
    • A similarity search is performed using cosine similarity between the query embedding and the stored image embeddings.
    • Results are ranked based on similarity and tag matches (see the sketch below this list).
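
The following is a condensed, hypothetical sketch of the search step, assuming the images schema sketched in the Setup section, Drizzle's pgvector helpers, and the AzureOpenAI client from the openai npm package; the actual embedding and ranking code in this repo may be organized differently.

import { asc, cosineDistance, sql } from 'drizzle-orm';
import { AzureOpenAI } from 'openai';

import { db } from '@/db';            // hypothetical Drizzle client
import { images } from '@/db/schema'; // hypothetical schema from the Setup sketch

const openai = new AzureOpenAI({
  apiKey: process.env.AZURE_OPENAI_API_KEY!,
  endpoint: `https://${process.env.AZURE_OPENAI_API_INSTANCE_NAME}.openai.azure.com`,
  apiVersion: process.env.AZURE_OPENAI_API_VERSION!,
});

export async function searchImages(refinedQuery: string, limit = 10) {
  // 1. Embed the refined query text with the embedding deployment
  const { data } = await openai.embeddings.create({
    model: process.env.AZURE_OPENAI_API_EMBEDDING_DEPLOYMENT_NAME!,
    input: refinedQuery,
  });
  const queryEmbedding = data[0].embedding;

  // 2. Rank stored images by cosine distance to the query embedding
  const distance = cosineDistance(images.embedding, queryEmbedding);
  return db
    .select({
      id: images.id,
      filePath: images.filePath,
      description: images.description,
      tags: images.tags,
      distance,
      confidence: sql<number>`1 - (${distance})`,
    })
    .from(images)
    .orderBy(asc(distance))
    .limit(limit);
}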