New DarkBert AI was trained using dark web data from hackers and cybercriminals
Following the success of OpenAI’s ChatGPT, Microsoft’s Bing Chat and Google Bard, researchers have created a new AI model with a much darker twist.
While the large language models (LLMs) that power ChatGPT and Google Bard were trained on data from the open web, DarkBERT was trained exclusively on data from the dark web. Yes, you read that correctly: this new AI model was trained using data from hackers, cybercriminals and scammers.
A team of South Korean researchers has released a paper (PDF) detailing how they made DarkBERT using data from the Tor network, which is often used to access the dark web. By crawling through the dark web and then filtering the raw data, they were able to create a dark web database that they used to train DarkBERT.
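
To give a rough sense of what "filtering the raw data" can mean in practice, here is a minimal Python sketch. It is not the researchers' actual pipeline: the function name `filter_pages`, the `min_words` threshold and the two rules it applies (dropping near-empty pages and exact duplicates) are assumptions chosen purely for illustration.

```python
import hashlib


def filter_pages(raw_pages, min_words=20):
    """Hypothetical filter: drop near-empty pages and exact duplicates.

    Illustrative only; the rules and the threshold are assumptions,
    not the method described in the DarkBERT paper.
    """
    seen = set()
    kept = []
    for text in raw_pages:
        # Collapse whitespace and lowercase so trivial variants compare equal.
        normalized = " ".join(text.split()).lower()
        # Skip pages with too little text to be useful for training.
        if len(normalized.split()) < min_words:
            continue
        # Hash the normalized text to detect pages already seen.
        digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
        if digest in seen:
            continue
        seen.add(digest)
        kept.append(text)
    return kept
```

Passing in a list of page texts returns only those pages that are long enough and have not been seen before, in their original order.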