diff --git a/index.html b/index.html
index e284778..bbd1a46 100644
--- a/index.html
+++ b/index.html
@@ -40,7 +40,7 @@
You can contribute your own tests, too! See the GitHub README for contributing instructions.
Tests are run every day at 1am PT. Last updated September 12, 2024.
+Tests are run every day at 1am PT. Last updated September 14, 2024.
Made with ❤️ by the team at Roboflow.
Today, the average response time to receive results from our tests was 4.05 seconds per request.
+Today, the average response time to receive results from our tests was 4.07 seconds per request.
This number only accounts for requests made by this application.
4.05 s
+4.07 s
There are eight fruits in the image.+
9
Test submitted by Roboflow
Can GPT-4V read handwriting?
-Fail
-Of the last 7 tests, conducted daily, this test has passed 71.0% of the time.
-Today's request cost $0.009
-We send an image of a handwritten note to determine if it can correctly read the text. If it correctly gets the text, it gets a 100%. Otherwise, it gets a 0%.-
- Read the text in the image. Return only the text, with punctuation. --
The words of songs on the album have been echoing in my head all week. "Fades into the grey of my day oh tea."-
Test submitted by Roboflow
-Of the last 7 tests, conducted daily, this test has passed 0% of the time.
-Today's request cost $0.01
+Today's request cost $0.009
{'x': 0.373, 'y': 0.3645, 'width': 0.254, 'height': 0.4958}+
{'x': 0.38, 'y': 0.4, 'width': 0.2, 'height': 0.4}
Test submitted by Roboflow
```json
{
-  "A": {
-    "quantity": 20,
-    "price": 15
-  },
-  "B": {
-    "quantity": 25,
-    "price": 25
-  },
-  "C": {
-    "quantity": 30,
-    "price": 30
-  },
-  "D": {
-    "quantity": 35,
-    "price": 40
-  }
+  "A": {
+    "quantity": 20,
+    "price": 10
+  },
+  "B": {
+    "quantity": 25,
+    "price": 25
+  },
+  "C": {
+    "quantity": 30,
+    "price": 30
+  },
+  "D": {
+    "quantity": 35,
+    "price": 40
+  }
}
```
Test submitted by Roboflow
@@ -403,7 +349,7 @@
Of the last 7 tests, conducted daily, this test has passed 0% of the time.
-Today's request cost $0.02
+Today's request cost $0.021
To determine if there are any missing annotations, I'll count the visible cars in the image and compare that with the number of red bounding boxes.
+The image provided contains visible cars annotated with red bounding boxes. I'll manually count the cars and the annotations to find any missing annotations.
+
+1. Cars in the image:
+   - 1 car (left oncoming lane, closest)
+   - 2 cars (next in line after the front car in the oncoming lane)
+   - 3 cars (close proximity in the far center lane)
+   - 1 car on the rightmost lane (closest to the camera)
-In the provided image:
+Total cars: 7
-1. Visible cars (including partial and fully visible):
-   - Front white car (on the right).
-   - Black SUV to the left.
-   - Several distant cars in the middle lane.
-   - Additional cars that may be hard to distinguish clearly.
+2. Annotations:
+   - 1 car (left oncoming lane, closest)
+   - 1 car (next in line after the front car, on the left)
+   - 3 cars (in the far center lane)
+   - 1 car on the rightmost lane (closest to the camera)
+
+Total annotations: 6
-2. Red bounding boxes present:
-   - There are 8 red bounding boxes.
+The car closest to the camera on the right lane isn't annotated.
-Here, the number of cars in the image appears to be accurately annotated with red boxes. Therefore, there are no missing annotations.
+Therefore, there is 1 missing annotation.
-JSON output:
```json
{
-  "missing": 0
+  "missing": 1
}
-```
-
-This indicates that there are no missing annotations in the provided image.
+```
Test submitted by Roboflow
Can GPT-4 Measure Items Using a Reference?
+Can GPT-4V classify an image without being trained on that particular use case?
Fail
+Pass
Of the last 7 tests, conducted daily, this test has passed 14.0% of the time.
-Today's request cost $0.009
+Of the last 7 tests, conducted daily, this test has passed 100% of the time.
+Today's request cost $0.005
This test gives GPT-4 an image of a square sticker with a ruler on one side and asks it to provide a number for the length and width. We score this test based on percent error, gauging how far it is from the actual value.+
We check to see if the model can correctly identify the vehicle. If it can, it receives a 100%; if it is incorrect, it receives a 0%.
- What is the length and width of this square sticker, based on the ruler? Return a JSON with float properties for `length` and `width` representing inches.
+ What is in the image? Return the class of the object in the image. Here are the classes: Toyota Camry, Tesla Model 3. You can only return one class from that list.
```json
-{
-  "length": 3.0,
-  "width": 3.0
-}
-```+
Toyota Camry
Test submitted by Roboflow
Can GPT-4V classify an image without being trained on that particular use case?
+Can GPT-4V read a document and return the exact characters in the text?
Of the last 7 tests, conducted daily, this test has passed 100% of the time.
-Today's request cost $0.005
+Today's request cost $0.009
We check to see if the model can correctly identify the vehicle. If it can, it receives a 100%; if it is incorrect, it receives a 0%.+
We send a screenshot of a typed document to determine if it can correctly read the text. If it correctly gets the text, it gets a 100%. Otherwise, it gets a 0%.
- What is in the image? Return the class of the object in the image. Here are the classes: Toyota Camry, Tesla Model 3. You can only return one class from that list.
+ Read the text in the image. Return only the text, with punctuation.
Toyota Camry+
I was thinking earlier today that I have gone through, to use the lingo, eras of listening to each of Swift's Eras. Meta indeed. I started listening to Ms. Swift's music after hearing the Midnights album. A few weeks after hearing the album for the first time, I found myself playing various songs on repeat. I listened to the album in order multiple times.
Test submitted by Roboflow
Can GPT-4V read a document and return the exact characters in the text?
+Can GPT-4V read handwriting?
Of the last 7 tests, conducted daily, this test has passed 100% of the time.
+Of the last 7 tests, conducted daily, this test has passed 71.0% of the time.
Today's request cost $0.009
We send a screenshot of a typed document to determine if it can correctly read the text. If it correctly gets the text, it gets a 100%. Otherwise, it gets a 0%.+
We send an image of a handwritten note to determine if it can correctly read the text. If it correctly gets the text, it gets a 100%. Otherwise, it gets a 0%.
Read the text in the image. Return only the text, with punctuation.
I was thinking earlier today that I have gone through, to use the lingo, eras of listening to each of Swift's Eras. Meta indeed. I started listening to Ms. Swift's music after hearing the Midnights album. A few weeks after hearing the album for the first time, I found myself playing various songs on repeat. I listened to the album in order multiple times.+
The words of songs on the album have been echoing in my head all week. "Fades into the grey of my day old tea."
Test submitted by Roboflow
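The all-or-nothing scoring described for the text-reading tests can be sketched as below. This is a minimal illustration, not the project's actual scoring code: the function name `score_exact_match` and the decision to strip surrounding whitespace before comparing are assumptions. The two sample strings are the two handwriting transcriptions that appear in this diff; treating the "old tea" reading as the reference is also an assumption.

```python
def score_exact_match(model_output: str, reference: str) -> int:
    """Return 100 if the two strings match exactly, otherwise 0.

    Assumption: leading and trailing whitespace is ignored. Everything
    else (case, punctuation, inner spacing) must be identical.
    """
    return 100 if model_output.strip() == reference.strip() else 0


# Assumed reference text (the "old tea" reading from the diff).
reference = (
    "The words of songs on the album have been echoing in my head all week. "
    '"Fades into the grey of my day old tea."'
)

# The other reading from the diff, which differs by one word ("oh" vs "old").
other_reading = (
    "The words of songs on the album have been echoing in my head all week. "
    '"Fades into the grey of my day oh tea."'
)

print(score_exact_match(other_reading, reference))     # one wrong word scores 0
print(score_exact_match(reference + "\n", reference))  # trailing newline ignored
```

Under this scheme a single misread word costs the whole score, which is consistent with a handwriting test passing on only some days.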
[{'name': 'MARY THOMAS', 'time_per_day': 1, 'medication': 'ATENOLOL', 'dosage': 100, 'rx_number': '1234567-12345'}]+
[{'name': 'Mary Thomas', 'time_per_day': 1, 'medication': 'Atenolol', 'dosage': 100, 'rx_number': '1234567-12345'}]
Test submitted by Roboflow
Of the last 7 tests, conducted daily, this test has passed 100% of the time.
-Today's request cost $0.016
+Today's request cost $0.015
Can GPT-4 Measure Items Using a Reference?
+Pass
+Of the last 7 tests, conducted daily, this test has passed 14.0% of the time.
+Today's request cost $0.009
+This test gives GPT-4 an image of a square sticker with a ruler on one side and asks it to provide a number for the length and width. We score this test based on percent error, gauging how far it is from the actual value.+
+ What is the length and width of this square sticker, based on the ruler? Return a JSON with float properties for `length` and `width` representing inches. ++
```json
+{
+  "length": 3.5,
+  "width": 3.5
+}
+```+
Test submitted by Roboflow
+
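The percent-error scoring mentioned for the measurement test could look roughly like the sketch below. The page does not state the sticker's true size or the pass threshold, so `HYPOTHETICAL_ACTUAL` is an invented value used only to make the arithmetic concrete. The function names and the choice to average the error over both dimensions are likewise assumptions.

```python
def percent_error(predicted: float, actual: float) -> float:
    """Absolute percent error of a prediction relative to the true value."""
    if actual == 0:
        raise ValueError("actual must be non-zero")
    return abs(predicted - actual) / abs(actual) * 100.0


def mean_percent_error(predicted: dict, actual: dict) -> float:
    """Average the percent error over every key in `actual` (assumed scheme)."""
    errors = [percent_error(predicted[key], actual[key]) for key in actual]
    return sum(errors) / len(errors)


# Hypothetical ground truth, NOT taken from the page.
HYPOTHETICAL_ACTUAL = {"length": 4.0, "width": 4.0}

# The JSON the model returned in this diff.
prediction = {"length": 3.5, "width": 3.5}

print(mean_percent_error(prediction, HYPOTHETICAL_ACTUAL))  # 12.5
```

A lower value means the estimate is closer to the true size. How the page turns this number into a pass or fail is not shown in the diff.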