Key Points from OpenAI's GPT-4 Technical Report
The unveiling of GPT-4 prompted both excitement and alarm among specialists because of its outstanding capabilities, its ethical issues, and its closed-model release.
 
GPT-4 has arrived.
 
It is the successor to the GPT-3 and GPT-3.5 models, which introduced generative AI to the general public via ChatGPT and sparked a flurry of investor and commercial interest in the technology.
 
There are a number of important AI releases in development and on the horizon. But considering that ChatGPT reached 100 million active users within two months of its introduction (making it the fastest-growing app in history), anticipation for GPT-4 was at an all-time high. In the past three months, Google and Microsoft have both updated their internet search and productivity suites with generative AI (among countless companies infusing their products with the technology). So it's no surprise that GPT-4 arrived so quickly.
 
OpenAI improved on the already impressive GPT-3.5 model by adding multimodality (image recognition), greater accuracy, more creativity, and longer prompts, among other enhancements.
 
"The ongoing advances across many dimensions are astonishing," Oren Etzioni, adviser and board member of the Allen Institute for AI, told MIT Technology Review. "GPT-4 is now the benchmark against which all foundation models will be assessed."
 

Organizations such as Duolingo, Stripe, Morgan Stanley, Khan Academy, the Government of Iceland, and even us here at AI-Wordsmith have already adopted the model (among the many other models we tap). Be My Eyes, for example, is using GPT-4's image recognition to give visually impaired people an app-based visual assistant. ChatGPT Plus, which costs $20 per month, will help make the model more broadly available, and GPT-4 is also said to be the basis for Microsoft's new Bing search engine.

 
While GPT-4 has already received plaudits for its outstanding capabilities, the release is also notable for the model's limitations and what it cannot do. Equally essential are the ethical problems GPT-4 presents, as well as the model's closed nature.
 
All of these aspects are covered in OpenAI's 98-page technical report on GPT-4. Because few people will read the whole thing, we've boiled it down to the key points, along with commentary from experts in the field.
 

GPT-4's Key Capabilities

"GPT-4 is more creative and collaborative than ever before," according to the model's official blog page. It can produce, edit, and iterate with users on artistic and technical writing tasks including song composition, screenplay authoring, and understanding a user's writing style."
 
It can handle more than 25,000 words of text, making it well suited to long-form content production, extended conversations, and document analysis. According to OpenAI's internal evaluations, it is also 82 percent less likely to respond to requests for disallowed content and 40 percent more likely to produce factual responses than GPT-3.5.
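To make the long-context claim concrete, here is a minimal sketch of how a developer might send a long document to GPT-4 for summarization through OpenAI's Python client. The file name is invented for illustration, and the extended-context model identifier ("gpt-4-32k") was announced alongside GPT-4 but may not be available on every account.

```python
# A minimal sketch: summarizing a long document in a single prompt.
# Assumes the openai Python package (v1+) and an OPENAI_API_KEY in the
# environment; "annual_report.txt" and "gpt-4-32k" are illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("annual_report.txt", encoding="utf-8") as f:
    document = f.read()  # tens of pages can fit in one extended-context prompt

response = client.chat.completions.create(
    model="gpt-4-32k",  # extended-context variant announced with GPT-4
    messages=[
        {"role": "system", "content": "You summarize documents accurately and concisely."},
        {"role": "user", "content": f"Summarize the key points of this document:\n\n{document}"},
    ],
)
print(response.choices[0].message.content)
```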
 
"When given an article from The New York Times, the new chatbot can nearly always deliver a clear and accurate synopsis of the story," Cade Metz wrote in the same journal. "If you add a random sentence to the summary and ask the bot if it's correct, it will point to the additional sentence."
 
GPT-4 can also generate text output from a mix of text and images. Beyond pure text inputs, it can handle a variety of data types, including text-based documents, photographs, diagrams, and screenshots.
 
"GPT-4 supports prompts that include both pictures and words, which, like the text-only default, allows the user to specify any vision or language job," OpenAI noted in the technical paper for GPT-4. "The model creates text outputs given inputs consisting of randomly interlaced text and graphics. GPT-4 performs similarly to text-only inputs across a variety of domains, including documents containing text and images, diagrams, or screenshots."
 
 
In one example from the report, the model not only parsed the entire contents of a meme (text and visuals) but also explained why it is humorous.
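Image input was not generally available through the public API at launch, but the prompting pattern looks roughly like the sketch below, which follows the message format of the vision-capable chat endpoint OpenAI later exposed. The model name and image URL are placeholders, not details from the report.

```python
# Illustrative sketch of a multimodal (text + image) prompt. Assumes the
# openai Python package (v1+); the model name and image URL are placeholders.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",  # any vision-capable model; illustrative choice
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Explain what is funny about this meme."},
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/meme.png"},  # placeholder
                },
            ],
        }
    ],
)
print(response.choices[0].message.content)
```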
 
"I don't want to give the impression that we've solved thinking or intelligence, which we haven't," OpenAI CEO Sam Altman told the New York Times. "Yet, this is a significant improvement over what is currently available."
 
GPT-4 has also exhibited human-level performance on a variety of professional and academic benchmarks. One notable accomplishment is passing a simulated Uniform Bar Exam with a score in the top 10% of test takers. In OpenAI's evaluations, GPT-4 also scored 1,410 out of 1,600 on the SAT and earned the highest possible score on Advanced Placement exams including Biology, Macroeconomics, Psychology, Statistics, and U.S. History. GPT-3.5, by contrast, scored in the bottom 10% on the bar exam and performed far worse on most AP exams.
 
GPT-4 is smarter than its predecessors [image via OpenAI]

 

The Closed-Model Puzzle

Several early testers were astounded by GPT-4's capabilities. Nonetheless, it is far from a perfect model.
 
"Despite its advantages, GPT-4 has comparable limits to previous GPT models," OpenAI noted in the technical study on GPT. "Most crucially, it is still not completely dependable (it "hallucinates" information and makes mistakes in reasoning). When employing language model outputs, extreme caution should be exercised, especially in high-stakes situations.
 
"GPT-4 largely lacks knowledge of events that have transpired after the great bulk of its pre-training data cuts off in September 2021, and does not learn from its experience. It can occasionally make basic reasoning mistakes that do not appear to be consistent with skill in so many fields, or it might be unduly trusting in accepting blatantly false assertions from a user. It, like people, can fail at difficult issues."
 
The technical report also noted GPT-4's propensity to produce harmful, biased, and unsafe outputs. Adversa AI, an Israeli AI research and deployment firm, tested GPT-4 just hours after its release and bypassed some of the model's safeguards to elicit exactly those kinds of problematic results.
 
Another crucial aspect of the technical report is what it did not disclose: specifics on how the model works and what its outputs are grounded in, which bears directly on the problems of hallucination and bias.
 
"Given both the competitive landscape and the safety concerns of large-scale models like GPT-4, this study provides no more specifics regarding the architecture (including model size), hardware, training computation, dataset generation, training procedure, or similar," according to the paper.
 
Many in and around the generative AI field worry that this lack of transparency makes it difficult to pinpoint the source of GPT-4's biased or harmful outputs. With open-source models, outside researchers can audit the training setup, dataset construction, and other details for sources of bias or harm; bringing those flaws to light lets developers close the gaps and improve the models. Others are concerned that closing a model discourages outside input and community participation from the ethicists and researchers working to advance AI for the public good.
 
"You have no idea what the data is," HuggingFace research scientist Sasha Luccioni https://www.nature.com/articles/d41586-023-00816-5told Nature. "Therefore you can't enhance it."
 
"It's really difficult as a human being to be responsibility for something you can't control," said Claudi Bockting, professor of Clinical Psychology in Psychiatry at the University of Amsterdam's School of Medicine, to Nature. "One of the worries is that they may be considerably more prejudiced than, say, human bias."
 
Numerous experts, including Lightning AI CEO and PyTorch Lightning creator William Falcon, have speculated on why OpenAI closed off its model in the first place. Because GPT-3's architecture and training approach were published openly, other companies were able to replicate the work for their own competitive gain; keeping GPT-4 closed preserves OpenAI's advantage. While that advantage comes at the expense of transparency, Falcon did note that OpenAI has clearly weighed the ethics of the GPT-4 release and is investing in improving the model.
 
"They are definitely concerned about ethics and ensuring that products do not damage people," Falcon told VentureBeat. "I believe they've given it some thought. In this situation, I believe it is just a matter of individuals not reproducing since, as you may have noticed, every time they announce something new [it is reproduced]."

 

Further Ethical Considerations

In the report, OpenAI devoted considerable attention not just to outlining GPT-4's flaws but also to the model's deeper ethical implications. Researchers found, for example, that GPT-3 was capable of creating deliberately inaccurate yet compelling content to shape the narrative around politically important issues. GPT-4 is expected to be even better at such tasks, "increasing the potential that bad actors may utilize GPT-4 to generate false information and that society's future epistemic views could be largely shaped by persuasive LLMs," according to OpenAI.
 
The company also described the model's potential harmful impact on other ethical fronts, including the proliferation of conventional and unconventional weapons, privacy, and cybersecurity. But two concerns stood out in particular: user overreliance and GPT-4's economic effects.
 
"Overreliance happens when users place undue confidence and dependence on the model, possibly leading to unreported errors and inadequate monitoring," noted OpenAI. "When users get more acquainted with the system, dependence on the model may hamper the acquisition of new abilities or even lead to the loss of critical talents. Overreliance is a failure mode that is anticipated to rise as model capacity and reach expand. Users are less inclined to question or check the model's results as errors become more difficult for the ordinary human user to identify and widespread trust in the model develops."
 
Because the model is so capable, OpenAI warns users against becoming overly reliant on ChatGPT Plus or any application built on the GPT-4 API.
 
 
"Although these models also present new chances for innovation in many industries by enabling more tailored and efficient services and create new prospects for job seekers, particular attention should be paid to how they are used in the workplace over time," stated OpenAI.
 
For all of these reasons, AI companies stress that these technologies should not be used as one-to-one replacements for human workers.
 
OpenAI has stated that it will go beyond simply exposing GPT-4's flaws.
 
"We are investing in initiatives to continue to evaluate the impacts of GPT-4, such as studies on how worker performance changes on increasingly complicated tasks given access to models, surveys to our users and businesses expanding on our technology, and our researcher access program," according to the GPT-4 study.
 
Some of the work the report alludes to was completed before GPT-4's publication: outside "red teams" tested the model early on to uncover and help correct problems. Andrew White, a chemical engineer at the University of Rochester, was one of those red-teamers; he probed GPT-4 for six months and offered feedback that helped make the model safer for users.
 
"We are dedicated to independent audits of our technology, and have provided some first actions and thoughts in this area in the system card accompanying this release," according to OpenAI's report. "We intend to make further technical facts available to third parties who may advise us on how to balance the competitive and safety reasons mentioned above against the scientific value of greater transparency."
 
Reports like OpenAI's cannot account for everything. The technology will inevitably deliver some unpleasant surprises in its use cases and long-term consequences. And as the burgeoning AI field heats up, companies will keep making tactical, business-first moves with their products. OpenAI's report was transparent in some respects and opaque in others, which raises interesting questions about the nature of product releases and open- versus closed-source strategies in this industry.
 
Nonetheless, OpenAI's efforts to mitigate the harms and ethical problems GPT-4 might pose are ultimately beneficial. Continued work in this area will be required if these tools are to serve the public good rather than harm it. The ethical issues we face today, along with those we can't yet imagine, will only become more pressing as we approach GPT-5 and beyond.

