OpenAI o1 experiments

wintercoast

OpenAI o1 preview was announced recently. One of the questions it can now answer correctly is this:

Assume laws of physics on Earth. A small strawberry is put into a normal cup and the cup is placed upside down on a table. Someone then takes the cup and puts it inside the microwave. Where is the strawberry now? Explain your reasoning step by step.

I don't have access to o1, but I thought I'd try running the question through the free tiers of the various genAIs I do have access to. Here are my findings...

ChatGPT

Let's break this down step by step under the assumption that the laws of physics on Earth apply:

Initial Setup:
- A small strawberry is placed into a cup.
- The cup is then placed upside down on a table, meaning the open side of the cup is facing the table, and the strawberry is now resting on the table, enclosed by the cup.
Cup Inside the Microwave:
- The next action is to pick up the cup and place it inside the microwave.
- When the cup is picked up, it is lifted off the table. Since the cup is upside down, as the cup moves upward, the strawberry, which was sitting on the table, would no longer be enclosed by the cup.
- Unless the person specifically picks up the strawberry along with the cup (which is not stated), the strawberry will remain on the table after the cup is lifted.
Conclusion:
- The strawberry is still on the table because it was resting on the table and not inside the cup when the cup was lifted and moved into the microwave.
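For anyone who wants to check the reasoning above mechanically, here is a minimal sketch of the puzzle as a tiny state model in Python. The `World` class and the function names are made up for illustration, and the only "physics" assumed is that a loose object in an open cup drops onto the surface when the cup is turned upside down.

```python
from dataclasses import dataclass


@dataclass
class World:
    """Hypothetical toy model: tracks where the cup and strawberry are."""
    strawberry: str = "hand"      # "hand", "cup", or the name of a surface
    cup: str = "hand"             # "hand" or the name of a location
    cup_upside_down: bool = False


def put_strawberry_in_cup(w: World) -> None:
    w.strawberry = "cup"


def place_cup_upside_down_on(w: World, surface: str) -> None:
    w.cup = surface
    w.cup_upside_down = True
    # Assumption: nothing covers the opening, so gravity drops the
    # strawberry onto the surface, where the cup now covers it.
    if w.strawberry == "cup":
        w.strawberry = surface


def move_cup(w: World, destination: str) -> None:
    # The strawberry only travels with the cup if it is still inside it.
    # After the flip it rests on the surface, so it stays behind.
    w.cup = destination


def run_puzzle() -> World:
    w = World()
    put_strawberry_in_cup(w)
    place_cup_upside_down_on(w, "table")
    move_cup(w, "microwave")
    return w


if __name__ == "__main__":
    final = run_puzzle()
    print("cup:", final.cup)
    print("strawberry:", final.strawberry)
```

The wrong answers effectively skip the `if` branch in `place_cup_upside_down_on`, so the strawberry stays "in the cup" and rides along to the microwave.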

This is what ChatGPT says about itself:

"I’m currently based on the GPT-4 architecture, not specifically GPT-4-turbo (GPT-4o). The model provides enhanced capabilities in reasoning and understanding, similar to GPT-4. However, it's not the exact variant you mentioned."

So GPT-4o or GPT-4o+ can hack this.

Here's a summary...

Copilot (Microsoft) - wrong
Perplexity - wrong
Gemini (Google) - rubbish answer and wrong - the worst of the lot
You - Smart (default) - wrong, GPT-4 - wrong, GPT-4o - right
Claude - wrong
Phind - right
Venice - wrong
Brave (Leo) - wrong
Opera (Aria) - right
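As a quick tally of the list above, here is a small Python sketch. The dictionary just restates the results as posted, with the three You modes counted as separate entries (that split is my own choice); ChatGPT itself is not included because it is not in the summary list.

```python
from collections import Counter

# Results exactly as listed in the summary; keys are only labels.
results = {
    "Copilot (Microsoft)": "wrong",
    "Perplexity": "wrong",
    "Gemini (Google)": "wrong",
    "You - Smart (default)": "wrong",
    "You - GPT-4": "wrong",
    "You - GPT-4o": "right",
    "Claude": "wrong",
    "Phind": "right",
    "Venice": "wrong",
    "Brave (Leo)": "wrong",
    "Opera (Aria)": "right",
}

tally = Counter(results.values())
correct = sorted(name for name, verdict in results.items() if verdict == "right")

if __name__ == "__main__":
    print(f"right: {tally['right']}, wrong: {tally['wrong']}, total: {len(results)}")
    print("got it right:", ", ".join(correct))
```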

All the wrong answers give virtually identical wording like this...

"The cup, containing the strawberry, is then placed upside down on a table. At this point, the strawberry is resting on the inside surface of the cup, which is now facing downwards."

DoctorG

@wintercoast I guess all these AIs were trained on the ISS in zero gravity.

What if we extended the question to a shell game with 3 cups and a strawberry?
Confused AI? Exploding AI server farms?

Catweazle

Generated by Andisearch. Andi is a search assistant that summarizes results from trustworthy sources. It seems it also found the correct answer.

Andi's Writeup

To determine where the strawberry is, let's carefully analyze the situation step-by-step, applying the laws of physics on Earth:

Initial setup:

A small strawberry is placed inside a normal cup.

The cup is then turned upside down on a table.

When inverted, gravity will cause the strawberry to fall out of the cup onto the table, unless something prevents it from doing so.

Since no mention is made of covering the cup's opening or securing the strawberry inside, we can assume it falls onto the table when the cup is flipped.

Cup placement:

The inverted cup is placed over the strawberry on the table.

At this point, the strawberry is sitting on the table surface, covered by the upside-down cup.

Moving the cup:

Someone picks up the inverted cup and places it inside the microwave.

When lifting the cup off the table, the strawberry is not attached to or contained within the cup.

Due to gravity, the strawberry remains on the table as the cup is removed.

Final result:

The cup is now inside the microwave, but the strawberry is not.

The strawberry was left behind on the table when the empty inverted cup was moved to the microwave.

In conclusion, after analyzing the sequence of events and applying basic laws of physics regarding gravity, we can determine with confidence that the strawberry is no longer inside the cup or the microwave. It remains on the table where the inverted cup was originally placed over it. The cup, now empty, sits inside the microwave without the strawberry.

DoctorG

@Catweazle Ah, Andi is a synonym for Accuracy and Intelligence.

Catweazle

@DoctorG, an LLM isn't intelligent, but Andi is pretty accurate and reliable. Because of this it was also voted among the 3 best AI apps, along with You and Perplexity.

DoctorG

@Catweazle said in OpenAI o1 experiments:

an LLM isn't intelligent

I know, a ruleset has nothing to do with intelligence.

Catweazle

@DoctorG, Andi was the first one with AI search, using its own LLM, long before all the others. It was formerly Lazy Web, from a small startup in California. I have used it as my main search engine for more than a year, and I like it for its privacy and its trustworthy results, never crappy chatbot answers.

wintercoast

@Catweazle I couldn't get Andi to work. How do you get a result like yours as opposed to various website links?

Catweazle

@wintercoast, by default Andi only searches reliable sources for results matching the question. To make Andi write a text of its own based on those results, you need to add a command, in this case Generate/, which adds an option to generate a text. You can find the commands in the Help pages; it also accepts !Bang commands like DDG.
It's a search assistant, not a chatbot like the others. Its chat capabilities are limited, since that's not its main purpose. It can analyze texts and summarize and explain them; this is the main feature of its LLM.

By the way, it's one of the most private search engines I know, one that even actively protects your privacy, with sandboxed searches and random proxies, and no logs, cookies, ads or other crap.