Nick Clegg: Meta used Facebook posts to train its smart assistant

Meta used public posts across Facebook and Instagram to train parts of its new virtual assistant Meta AI, though it excluded private posts limited to family and friends in an effort to respect consumer privacy.

Nick Clegg, the company’s head of global affairs, said: “Meta did not use private chats across its platforms as training data for the model and took steps to filter private details from public datasets used for training.”

“We tried to exclude datasets with a large proportion of personal information,” Clegg said, adding that the vast majority of the data Meta used for training was publicly available.

Clegg cited LinkedIn as an example of a website whose content Meta deliberately chose not to use due to privacy concerns.

Clegg's comments come as tech companies including Meta, Google and OpenAI have come under fire for using information obtained from the internet without permission to train artificial intelligence models, which need vast amounts of data to summarize information and create images.

Companies are considering how to handle private or copyrighted material collected in the process that could be reproduced by AI systems, while facing lawsuits from authors for copyright infringement.

The new virtual assistant is the most important product among the first consumer-oriented AI tools unveiled by CEO Mark Zuckerberg at Connect, Meta's annual products conference.

Talk of artificial intelligence dominated the event, unlike previous conferences that focused on augmented and virtual reality.

Meta explained that the virtual assistant uses a special model based on Llama 2, the large language model the company released in July for general commercial use, as well as a new model called Emu, which generates images in response to text prompts.

The public Facebook and Instagram posts used to train Meta AI included both text and images. The company used these posts to train Emu for the assistant's image generation features, while the chat functionality is based on Llama 2 with the addition of some publicly available datasets.

Meta imposed safety restrictions on the content the virtual assistant can generate, such as limits on creating realistic images of public figures. Meta also added clauses that prevent users from creating content that violates privacy and intellectual property rights, in a move to avoid reproducing copyrighted images.

 
