But the RCGP said it had not provided Babylon with the test’s questions and had no way to verify the claim.
“The college examination questions that we actually use aren’t available in the public domain,” added Prof Martin Marshall, one of the RCGP’s vice-chairs.
Babylon said it had used example questions published directly by the college and that some had indeed been made publicly available.
“We would be delighted if they could formally share with us their examination papers so I could replicate the exam exactly. That would be great,” Babylon chief executive Ali Parsa told the BBC.
To further test the AI, Babylon partnered with doctors at two US organisations – Stanford Primary Care and Yale New Haven Health – as well as doctors from the Royal College of Physicians.
It said they had developed 100 real-life scenarios to test the AI.
The company added that it expected its chatbot’s diagnostic skills would further improve as a consequence.
Babylon has demonstrated its chatbot being used as a voice-controlled “skill” on Amazon’s Alexa platform.
While Babylon’s existing GP at Hand service refers users to a human doctor if the app suspects a medical problem, the new chatbot makes a diagnosis itself – offering several possible scenarios along with a percentage-based estimate of each one being correct.
“The suggestion that this can replace doctors is the key issue for us,” said Prof Marshall.
But Mr Parsa disputed the idea that doctors would be left out in the cold, explaining that the intention was still for a medic to follow up the AI’s diagnoses.
“We are fully aware that an artificial intelligence on its own cannot look after a patient. And that is why we complement it with physicians,” he said.
“It is never going to replace a doctor, but just to help.”
Babylon’s stated ambition is to deliver affordable health care to people all over the world.
Since 2016, it has been working in partnership with the government of Rwanda.
The country’s health care service was decimated after the genocide in 1994, in which more than 800,000 people were killed.
Babylon has two million registered users in Rwanda and has conducted tens of thousands of consultations.
Since smartphone use is not widespread in the country, people currently call nurses, who follow symptom-checking prompts displayed on their computer screens.
Information gathered as a result has been used to improve the chatbot.