I put the free version of Perplexity.ai through my coding tests – here’s what happened


I've tested the coding capabilities of many generative AI tools for ZDNET, and this time it's Perplexity.ai's turn.

Perplexity is something of a mix between a search engine and an AI chatbot. When I asked Perplexity how it differs from other generative AI chatbots, it claimed to use real-time information access tools and to index the web daily. Users can narrow searches by asking Perplexity to focus on specific sources or platforms.

Also: How to use ChatGPT to write code: What it can and can’t do for you

Perplexity’s free version is quite limited. It uses OpenAI’s GPT-3.5 model for analysis, only allows five questions per day, and while it does support document uploads, these uploads are limited to three per day.

The Pro version of Perplexity costs $20 per month. This version allows unlimited "quick" searches, 600 Pro searches per day, and a choice of AI model. You can choose between GPT-4o, Claude 3, Sonar Large (based on Llama 3), and others. The Pro version also includes $5/month in API credits.

I decided to skip the Pro version and run the free version for this initial test of Perplexity's programming prowess. I've run these coding tests against many AIs, with mixed results. If you want to follow along, point your browser to "How I test an AI chatbot's coding ability (and you can too)," which contains all the standard tests I administer, explanations of how they work, and details on what to look for in the results.

Also: Will AI take over programming jobs or turn programmers into AI managers?

Now let's look at the results of each test and see how they compare to previous tests using Claude Sonnet 3.5, Microsoft Copilot, Meta AI, Meta Code Llama, Google Gemini Advanced, and ChatGPT.

1. Writing a WordPress plugin

This test poses several challenges. First, the AI is asked to create a user interface for entering lines of text that will be randomized (with duplicates kept, not removed). The test then requires the AI to create a button that not only randomizes the list, but also ensures that duplicate items do not end up next to each other in the resulting list.
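To give a feel for what the Randomize button has to accomplish, here's a minimal Python sketch of one way to shuffle a list while keeping duplicates apart. This is purely my own illustration of the technique, not Perplexity's generated code (the actual plugin would be written in PHP and JavaScript):

```python
import random
from collections import Counter

def randomize_no_adjacent_dupes(lines):
    """Shuffle lines, keeping duplicates, so that no two identical
    lines end up adjacent. Greedy approach: always place the most
    frequent remaining item that differs from the last one placed."""
    counts = Counter(lines)
    result = []
    while counts:
        # Candidates that won't repeat the previously placed line
        choices = [item for item in counts if not result or item != result[-1]]
        if not choices:
            raise ValueError("No arrangement avoids adjacent duplicates")
        # Shuffle first so ties between equally frequent items break randomly
        random.shuffle(choices)
        item = max(choices, key=lambda x: counts[x])
        result.append(item)
        counts[item] -= 1
        if counts[item] == 0:
            del counts[item]
    return result
```

Placing the most frequent remaining item first is what keeps the greedy approach from painting itself into a corner; a naive shuffle-and-check loop would also work but can retry indefinitely on duplicate-heavy lists.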

So far, most AIs, with the exception of Meta Code Llama, have created a fairly reasonable user interface. Some were more attractive than others, but they all served their purpose.

Also: Code faster with generative AI, but beware of the risks when doing so

However, only ChatGPT (3.5, 4, and 4o) produced the correct random result. Most of the other AIs simply presented a button that, when clicked, did nothing.

Perplexity passed. It produced a user interface that fit the spec, and the Randomize button worked, keeping duplicate lines apart.

Below are the aggregate results of this and previous tests:

  • Perplexity: Interface: good, functionality: good
  • Claude Sonnet 3.5: Interface: good, functionality: failure
  • ChatGPT GPT-4o: Interface: good, functionality: good
  • Microsoft Copilot: Interface: adequate, functionality: failure
  • Meta AI: Interface: adequate, functionality: failure
  • Meta Code Llama: Complete failure
  • Google Gemini Advanced: Interface: good, functionality: failure
  • ChatGPT 4: Interface: good, functionality: good
  • ChatGPT 3.5: Interface: good, functionality: good

2. Rewriting a string function

This test asks the AI to fix a validation function that checks dollars-and-cents amounts.

My original code had a bug: it only allowed whole-dollar amounts, not cents. I found out about it when a user submitted a bug report. I originally fed the broken code to ChatGPT, which did a nice job of rewriting the function to allow dollar amounts with two digits to the right of the decimal point.

Perplexity also passed this test.

The generated code could have been more precise, but it worked. In one case, where the user-supplied string contained only zeros, Perplexity's implementation would have stripped everything out. To compensate, Perplexity added a separate check for zeros first.

Also: Implementing AI in software engineering? Here’s everything you need to know

This approach is viable, though the regular expression Perplexity generated could have been written to handle that case directly. It's simply an implementation choice, and many skilled programmers would have chosen either path, so Perplexity's approach is acceptable.

Perplexity’s code successfully tested the submitted data to ensure it matched the dollars and cents format. The code then converted the string into a number. It also checked to see if the parsed number was valid and not negative.
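As an illustration of the kind of check described above (the exact rules and code aren't shown in the article, and this is neither my original function nor Perplexity's rewrite), a dollars-and-cents validator might look like this in Python:

```python
import re

# Assumed format: one or more digits, optionally followed by a decimal
# point and exactly two digits (e.g. "5" or "5.25", but not "5.2").
DOLLARS_CENTS = re.compile(r"^\d+(\.\d{2})?$")

def valid_amount(s):
    """Return the parsed amount if s is a valid non-negative
    dollars-and-cents string, otherwise None."""
    # First test the submitted string against the format
    if not DOLLARS_CENTS.match(s):
        return None
    # Then convert the string into a number
    value = float(s)
    # Finally, confirm the parsed number is valid and not negative
    if value < 0:
        return None
    return value
```

Note that because the regular expression only admits digits and a decimal point, the negative-value check is belt-and-suspenders here; in real money-handling code you'd also likely prefer `decimal.Decimal` over `float` to avoid rounding surprises.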

Overall, Perplexity produced solid code. Here are the aggregate results from this test and previous ones:

  • Perplexity: Successful
  • Claude Sonnet 3.5: Failed
  • ChatGPT GPT-4o: Successful
  • Microsoft Copilot: Failed
  • Meta AI: Failed
  • Meta Code Llama: Successful
  • Google Gemini Advanced: Failed
  • ChatGPT 4: Successful
  • ChatGPT 3.5: Successful

3. Find an annoying bug

A bug in my code confused me, so I turned to ChatGPT for help. It turned out that the source of the problem wasn’t intuitively obvious, so I didn’t see it.

Finding the parameter-passing error requires knowledge of how the WordPress framework works. I missed the error because PHP made it look like the problem was in one part of the code when, in fact, the problem was in how the code passed data through a specific WordPress operation.

Perplexity found the problem and correctly diagnosed the solution.

Also: Can AI be a team player in collaborative software development?

Below are the aggregate results of this and previous tests:

  • Perplexity: Successful
  • Claude Sonnet 3.5: Successful
  • ChatGPT GPT-4o: Successful
  • Microsoft Copilot: Failed
  • Meta AI: Successful
  • Meta Code Llama: Failed
  • Google Gemini Advanced: Failed
  • ChatGPT 4: Successful
  • ChatGPT 3.5: Successful

4. Write a script

This final test probes the breadth of the AI's knowledge base. It asks for code that requires knowledge of Chrome's document object model, AppleScript, and a third-party Mac scripting tool called Keyboard Maestro.

Perplexity didn’t seem to know anything about Keyboard Maestro, so it didn’t write the necessary call to the scripting language to retrieve the value of a variable.

Also: Beyond programming: AI creates a new generation of jobs

Perplexity also made the same mistake as Claude 3.5 Sonnet, producing a line of AppleScript code that generated a syntax error when executed. The error indicated a lack of understanding of when AppleScript ignores case and when it considers the case of a string while comparing two values.

Below are the aggregate results of this and previous tests:

  • Perplexity: Failed
  • Claude Sonnet 3.5: Failed
  • ChatGPT GPT-4o: Successful, but with reservations
  • Microsoft Copilot: Failed
  • Meta AI: Failed
  • Meta Code Llama: Failed
  • Google Gemini Advanced: Successful
  • ChatGPT 4: Successful
  • ChatGPT 3.5: Failed

Total Results

Here are the overall results of the four tests: Perplexity passed the first three and failed only the final scripting test.

Overall, Perplexity performed well. I expected the AI might fail the fourth test, because ChatGPT 3.5 did and the free version of Perplexity uses the GPT-3.5 model.

I was surprised by these results because Microsoft’s Copilot is also supposed to use OpenAI’s AI engine, but Copilot failed at almost everything. Perplexity mirrored the GPT-3.5 results, which makes sense since the free version uses GPT-3.5.

Let me know if you'd like to see how Perplexity Pro performs. If I get enough requests, I'll sign up for yet another monthly AI fee and run some tests.

Have you tried the free version of Perplexity or its Pro version? Let us know in the comments below.

You can follow my day-to-day project updates on social media. Be sure to subscribe to my weekly update newsletter, and follow me on Twitter/X at @DavidGewirtz, on Facebook at Facebook.com/DavidGewirtz, on Instagram at Instagram.com/DavidGewirtz, and on YouTube at YouTube.com/DavidGewirtzTV.

