We have all seen plenty of impressive examples of ChatGPT dazzling people with its seemingly superhuman knowledge and writing ability.

But let’s take a look at the other side and see how it can fail just as spectacularly.

Task 1: Comparing Bed Sizes

The AI does not understand numbers the way we do. It treats numbers like words, and as a result, makes silly mistakes that my 5-year-old could call out.

For example, when asked to compare bed sizes, the AI may say that 108 inches is narrower than 76 inches, which is clearly not true.

chatgpt-fails-task-Screenshot_2022-12-11_at_10.13.44_PM.png
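
To make the "numbers like words" idea concrete, here is a small Python analogy. It is only an illustration and not a description of how ChatGPT works internally: comparing the two widths as text goes character by character, so "108" sorts before "76", while comparing them as numbers gives the right answer. The function names are made up for this sketch.

```python
def narrower_as_text(a: str, b: str) -> bool:
    # String comparison is lexicographic: '1' sorts before '7',
    # so "108" is judged "smaller" than "76".
    return a < b


def narrower_as_number(a: str, b: str) -> bool:
    # Converting to integers first compares the actual quantities.
    return int(a) < int(b)


print(narrower_as_text("108", "76"))
print(narrower_as_number("108", "76"))
```

The first comparison repeats the mistake from the screenshot, and the second avoids it by treating the widths as quantities.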

Task 2: Making a Topical Dad Joke

I planned to go to Tahoe last weekend but had to cancel because of an impending snowstorm and the avalanche risk. So I asked the AI to make a joke about it.

Unfortunately, the AI made a spurious connection between snowmen and avalanches, and then it hallucinated the double meaning of blizzard. The result was a joke that made no sense.

chatgpt-fails-task-Screenshot_2022-12-09_at_3.31.54_PM.png

Task 3: Writing Python Code to Solve the Titanic Dataset with Conformal Prediction

When asked to solve the famous Titanic problem with logistic regression, the AI generated fairly plausible code.

As a follow-up, I asked it to use conformal prediction instead of logistic regression, and it started to hallucinate.

ChatGPT suggested importing scikit-cp, which stands for Computational Physics, not Conformal Prediction. It seems to have assumed from the name that the package could solve the problem, and then wrote code against it without understanding what the package actually does.

chatgpt-fails-task-Screenshot_2022-12-12_at_10.56.14_AM.png

chatgpt-fails-task-Screenshot_2022-12-12_at_11.02.16_AM.png
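
For contrast, here is a minimal sketch of what split conformal prediction could look like on top of any classifier that outputs class probabilities, such as the logistic regression model from the first prompt. It uses only the standard library. The function names and the calibration numbers are made up for illustration; they do not come from ChatGPT's output or from any package. The idea is to score each held-out calibration example by one minus the probability the model gave to its true label, take a quantile of those scores as a threshold, and then return every label whose score falls within that threshold.

```python
import math


def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Threshold from a held-out calibration set (split conformal).

    cal_probs: per-example lists of class probabilities from any model.
    cal_labels: the true class index for each calibration example.
    """
    # Nonconformity score: 1 - probability assigned to the true label.
    scores = sorted(1.0 - probs[label]
                    for probs, label in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: include every label.
        return float("inf")
    return scores[k - 1]


def prediction_set(probs, threshold):
    # Keep every label whose nonconformity score is within the threshold.
    return [label for label, p in enumerate(probs) if 1.0 - p <= threshold]


# Made-up calibration data: [P(died), P(survived)] and the true outcome.
cal_probs = [[0.1, 0.9], [0.8, 0.2], [0.4, 0.6], [0.7, 0.3]]
cal_labels = [1, 0, 1, 1]

threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.5)
new_passenger = prediction_set([0.35, 0.65], threshold)
print(threshold, new_passenger)
```

If the calibration and test passengers are exchangeable, a set built this way contains the true label with probability at least 1 - alpha. A set with both labels, or with none, is the method's way of flagging a passenger the model is unsure about.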

Summary

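ChatGPT is a powerful tool that can generate impressive results, but it is not perfect. Its limitations can lead to silly mistakes and nonsensical output. The three tasks above show some of them: it handles numbers as text rather than as quantities, it draws spurious connections and invents meanings that words do not have, and it confidently writes code against a library without knowing what that library does.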

Despite these limitations, ChatGPT is still an impressive tool with many useful applications. It just needs to be used with care and an understanding of where it fails.