Can Better Training Reduce the Success Rate of Phishing Attacks?

Jonathan G. Cedarbaum
Wednesday, May 3, 2023, 2:49 PM
A review of Arun Vishwanath, “The Weakest Link: How to Diagnose, Detect, and Defend Users From Phishing Attacks” (MIT Press, 2022)

Published by The Lawfare Institute
in Cooperation With
Brookings

***

Many elements of the cyber threat landscape have changed significantly over the past two decades. For one, the number of attackers has grown dramatically, aided by the increasing availability of hacking tools and services as commodities for purchase in online marketplaces. The value of the losses cyber criminals have been able to inflict on their victims has also grown, though the dollar estimates vary widely in absolute terms. In recent years, the popularity of ransomware has increased substantially, prompting the Biden administration to initiate an ongoing diplomatic effort to foster cross-border efforts to curb this dangerous form of cyber-enabled extortion.

But amid these changes—and many others one might highlight—one facet of the cyber threat landscape has remained remarkably stable: the prevalence of phishing as a means by which attackers are able to gain access to targeted systems, whether through implanting malware or through harvesting credentials needed for entry. In his new book “The Weakest Link: How to Diagnose, Detect, and Defend Users From Phishing,” Arun Vishwanath cites a study of 2017 data identifying phishing as a tool used in 93 percent of data breaches. Six years later, both the Cybersecurity and Infrastructure Security Agency (CISA) and Google say the figure remains at 90 percent or above. While some other sources may put the figure a few percentage points lower, the attractiveness of phishing remains unwavering, whether to nation-state cyber forces, private criminal organizations, or groups that straddle both of those worlds.

Phishing is a “digital form of social engineering that uses authentic-looking—but bogus—emails to request information from users or direct them to a fake Web site that requests information.” Spear phishing involves the same techniques and goals but uses messages more tightly crafted to appeal to particular individuals or smaller groups. As communications platforms have proliferated, phishing methods have followed right along, with “smishing” the term for similar deceptions carried out over SMS or other text messaging systems and “vishing” involving voices over (cell) phone lines.

Because phishing has been such an important tool of cyberattackers for so long, it has spawned a considerable community of researchers and security practitioners trying to analyze how best to identify and frustrate phishing attacks. Nearly 20 years ago, an industry coalition started the Anti-Phishing Working Group, which puts out quarterly reports on trends in phishing attacks and sponsors collaborations among industry, law enforcement, and academic specialists. One important strand of anti-phishing research has focused on identifying characteristics of phishing messages that could be used to develop effective screening systems to block out phishing messages. But however effective such technical means may be, the volume of phishing emails is great enough that even a blocking rate of 99 percent will miss a substantial number of email lures.
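To make that arithmetic concrete, here is a minimal sketch. Only the 99 percent blocking rate comes from the discussion above; the daily volume of one million phishing emails is a hypothetical figure chosen purely for illustration, and the function name `missed_lures` is likewise invented for this example.

```python
def missed_lures(phishing_emails: int, block_rate: float) -> int:
    """Return how many phishing emails slip past a filter with the given block rate."""
    if phishing_emails < 0:
        raise ValueError("phishing_emails must be non-negative")
    if not 0.0 <= block_rate <= 1.0:
        raise ValueError("block_rate must be between 0 and 1")
    # Whatever the filter fails to block still reaches an inbox.
    return round(phishing_emails * (1.0 - block_rate))


# Hypothetical volume, for illustration only (not a figure from the book).
daily_volume = 1_000_000
print(missed_lures(daily_volume, 0.99))
```

Under that assumed volume, a filter that stops 99 percent of lures would still let roughly ten thousand through each day, which is why researchers continue to focus on the people receiving them.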

Another technical defense against phishing, urged as a first resort by CISA, is multifactor authentication—that is, requiring more than one method of verifying a user’s identity to get access to accounts or data. If a phisher steals one set of a user’s credentials, multifactor authentication should ensure that the theft is not sufficient to enable the thief to complete a break-in. But, like every other response to phishing, multifactor authentication is not a panacea, in part because it guards against credential harvesting but not malware implantation schemes that don’t depend on the attacker’s use of the target’s login credentials. 

Because technical responses to phishing have yet to show themselves to be foolproof, another substantial strand of phishing research has focused on “the people problem,” as Arun Vishwanath puts it in his new book—that is, figuring out how best to train people not to fall for phishing messages. Vishwanath has been a significant contributor to this kind of anti-phishing analysis for nearly two decades, and “The Weakest Link” brings together and expands on a series of research papers he has published over the years with colleagues. Vishwanath spent a number of years as a professor of communication studies, and he now heads his own “market and strategic research firm.” “The Weakest Link” reflects both phases of the author’s career.

Vishwanath can be an engaging writer, but in both tone and content the book careens back and forth at times between the voice of a social scientist and the voice of a salesman. When writing in the former vein, Vishwanath helps place his research in the context of other studies, acknowledging and explaining the broader field in which he has played a part. In the latter guise, he sometimes can’t restrain himself from offering not just one or two but five or seven reasons why his methods for identifying individuals most likely to click on phishing emails, and for developing anti-phishing training, are better than the competition’s.

Ultimately, “The Weakest Link” offers a series of appealing proposals for making phishing testing more systematic and thus making training to reduce phishing susceptibility more effective. As with many works in this field—and many in the broader field of social psychology, in which, from one angle, Vishwanath’s work may be situated—the results would be even more compelling if the data supporting them were presented more systematically and subjected to efforts at replication by other investigators.

Hacking Cybersecurity Training

“The Weakest Link” begins with a brief and entertaining history of phishing. Vishwanath starts with two important precursors: the so-called phone phreakers of the 1960s and 1970s, who often used deception to break into the long-distance phone lines of a then-monopolistic AT&T, and the advance-fee scammers of the 1980s and 1990s. Some of the former went on to become pioneering hackers once business computer systems and then personal computers became ubiquitous. The latter, made notorious by Nigerian gangs offering enormous riches in return for small up-front payments, remain an element of the cybercrime landscape today, albeit often employing more sophisticated methods of deception. The first use of “phishing” in the current sense appeared on AOL messaging systems in the 1990s, with hackers impersonating AOL employees to convince users to turn over their credentials. With the growth in the number of internet service providers (ISPs) and the explosion in the number of websites, hackers increasingly combined deceptive emails (or other forms of communication) with fake websites, from which a user’s click would unleash a piece of malware.

As the financial and other costs of phishing-enabled intrusions have risen dramatically, organizations have combined efforts at technical defense with more frequent and extensive training designed to enable users to avoid phishy deceptions. The importance of training has been reinforced by its appearance in various cybersecurity regulations and guidelines. But, citing two studies from the 2000s and a more recent meta-analysis that in turn considered four additional studies that addressed the effect of training over time, Vishwanath argues that companies and government agencies have gotten training all wrong. Much of the training, he suggests, can be categorized as didactic or embedded. Didactic training consists of educating users on common phishing techniques and strategies. Embedded training involves sending fake phishing emails, seeing how employees do, and then sending corrective explanatory information afterward. Users who do well are praised; users who do poorly are scolded; and the cycle continues, perhaps with somewhat longer or more frequent training sessions provided to users who fall for deceptive messages more frequently. According to Vishwanath, these training strategies typically lead to only ephemeral effects, with users forgetting the lessons in just a few weeks or months. 

These approaches fail, Vishwanath contends, because of two linked defects. First, they focus only on outcomes (how users perform) but fail to diagnose the causes of weak performance. That is, they neglect to assess the elements that lead many individuals to fall for phishing emails, particularly the patterns of thinking that incline individuals to be tricked by deceptions. Second, and as a result, they do not collect the kinds of data that can help diagnose, and so address, what drives users’ failures to evade phishing messages.

Drawing on survey response and behavioral data studies that Vishwanath and colleagues did on test groups during the 2010s, Vishwanath proposes a different approach to assessing phishing susceptibility, one he calls the Suspicion, Cognition, and Automaticity Model (SCAM). In this model, suspicion, that is, “the feeling of unease that is triggered by informational cues in the environment,” is the crucial determinant of susceptibility to the deception involved in phishing emails. Vishwanath groups the forces influencing users’ level of suspicion into two categories: “cyber risk beliefs” and “self-regulation.”

Cyber risk beliefs are “users’ beliefs about the inherent risks of their online actions.” Those beliefs may influence two types of mental processing. The first, heuristic processing, “involves the use of cognitive shortcuts triggered by cues,” such as visual and textual components of the phishing email. The second, systematic processing, “involves elaborated thinking about the elements in a message.” Finally, by “self-regulation,” Vishwanath means the tendency to engage in “habitual responses[,] ... the automatic, nonconscious reactions that are triggered by rituals and patterns of media usage.” The influences on the relevant habitual responses go beyond the test phishing email or experience with prior messages to broader factors such as device type (for example, a greater tendency to click reflexively on smartphones than on computers), work norms, and a user’s “media use rituals” more generally.

If these factors figure in the susceptibility of users to deception, employing them to create a practical method to shape effective tests and training requires additional steps. One of the most important is a way of standardizing the variations in phishing emails and gauging how those variations may influence the click-through rates in testing. As Vishwanath acknowledges, in 2020 researchers at the National Institute of Standards and Technology (NIST) proposed a system to do this that they called a “Phish Scale.” Their scale rested on two characteristics of emails. First, they grouped types of common cues present in phishing emails into five categories. For example, one category was technical indicators, such as domain spoofing; another was message content clues, such as requests for sensitive information or lack of signatory details. Second, they considered the extent to which the email fit the workplace context of the people receiving it. For example, if the recipients are in a company’s finance department and the email has as its premise a late or missed payment, that would constitute high contextual alignment. If instead an email addressed to the same audience concerned a call for papers about biomedical research, that would reflect low contextual alignment.

The NIST researchers assessed the difficulty—that is, the likely deceptiveness—of emails by combining the number of phishing cues present with the degree of contextual alignment. The fewer the cues and the greater the contextual alignment, they predicted, the higher the click-through rates would be. They then tested their predictions by carrying out seven testing exercises in workplace settings. Their results showed some support for their prediction, though they acknowledged their study had a number of limitations and so was a preliminary effort.
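The logic of combining cue counts with contextual alignment can be sketched in a few lines. This is an illustrative toy, not the NIST researchers' actual scoring: the cue-count cutoffs (`few_max`, `some_max`), the three alignment levels, and the 0-to-4 score are all assumptions made for this example. It preserves only the direction of the prediction described above: fewer cues and greater alignment mean a harder email.

```python
from enum import IntEnum


class Alignment(IntEnum):
    """How well an email's premise fits the recipients' workplace context."""
    LOW = 0
    MEDIUM = 1
    HIGH = 2


def difficulty(num_cues: int, alignment: Alignment,
               few_max: int = 5, some_max: int = 10) -> int:
    """Score an email from 0 (easiest to spot) to 4 (hardest to spot).

    The cutoffs few_max and some_max are hypothetical, not NIST's values.
    """
    if num_cues < 0:
        raise ValueError("num_cues must be non-negative")
    # Fewer visible phishing cues give the recipient less to notice.
    if num_cues <= few_max:
        cue_scarcity = 2
    elif num_cues <= some_max:
        cue_scarcity = 1
    else:
        cue_scarcity = 0
    # Greater contextual alignment makes the premise more believable.
    return cue_scarcity + int(alignment)


# A late-payment notice sent to a finance team, with few cues.
print(difficulty(3, Alignment.HIGH))
# A biomedical call for papers sent to the same team, with many cues.
print(difficulty(12, Alignment.LOW))
```

With these assumed cutoffs, the late-payment email lands at the top of the scale and the call for papers at the bottom, matching the two examples of high and low contextual alignment given earlier.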

Vishwanath followed a similar path, perhaps a few years earlier, though he is less clear than the NIST researchers were about the exact scope of the testing he carried out in developing his model. Where the NIST researchers relied on two categories of factors, Vishwanath offers a three-part gauge of phishing email difficulty, which he calls the V-Triad. Two sides of the triad, as he notes, are similar to categories in the NIST researchers’ Phish Scale. One, compatibility, is similar to the NIST paper’s contextual alignment category. Another, credibility signals, bears some resemblance to the cues category in the NIST approach, though Vishwanath includes both credibility-enhancing cues (hooks) and suspicion-arousing cues (tip-offs), while the NIST model focuses on the former. Vishwanath calls the third side of his triad “customizability signals,” by which he means content in certain fields, such as sender name or subject line, that suggests to the recipient that the email has been customized for him or her.

In order to build appropriate emails for testing particular audiences, Vishwanath recommends assessing various possible phishing emails with the results of a questionnaire given to a subset of the audience, “either the IT staff who crafted the pen test or a few users who are not part of the routine pen test.” Although Vishwanath began with a long list of survey questions, getting at the various cognitive, behavioral, and experiential elements of his Suspicion, Cognition, and Automaticity Model, he decided that a long list would be too cumbersome for use in most organizations. The list of questions therefore had to be “purifi[ed.]” He asserts that a single pair of questions suffices to “establish a baseline” of the difficulty of particular test emails and thus their likely impact. Those questions are (a) “Using a scale of 0-10, where 0 indicates no suspicion and 10 indicates highly suspicious, how suspicious is any average user in the organization going to be to this attack? and (b) Using complete sentences, please explain in detail the reasons for this level of suspicion.” Answers to the open-ended question, he suggests, can be coded using the various components in the Suspicion, Cognition, and Automaticity Model and thus analyzed more systematically. He recognizes that reliance on such a limited diagnostic tool may arouse skepticism, but he claims it has demonstrated its effectiveness in practice. 

With these models and methods in hand, Vishwanath recommends a regular cycle of phishing pen tests, followed by analysis of responses from users to a two-question “cyber risk survey” similar to the one used to establish the baseline for the phishing emails employed in the test, followed by reporting the results to appropriate stakeholders in the organization. The results of the testing and cyber risk survey, he contends, can both identify “weak links,” that is, users particularly susceptible to phishing messages, and the components in their thinking and behavior that contribute to that susceptibility. Training can then be tailored to address the sources of particular users’ vulnerability—ideally through positive guidance, but, if necessary, by cautions and even restrictions on access to certain systems or functions. Moreover, by aggregating users’ scores on the cyber risk survey and comparing the results to the baseline assessment used in creating the test emails, Vishwanath argues, organizations can determine their overall phishing risk. 
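One way to picture the comparison between the baseline assessment and users' survey responses is the sketch below. The book, as summarized here, does not spell out a formula, so everything in this example is an assumption: the function name `suspicion_gap`, the use of simple means, the sample scores, and the reading of a negative gap as a sign of elevated risk. The only element taken from the text is the 0-to-10 suspicion scale.

```python
from statistics import mean


def _check_scores(scores: list[int]) -> None:
    """Ensure scores are non-empty and on the 0-10 suspicion scale."""
    if not scores:
        raise ValueError("at least one score is required")
    for score in scores:
        if not 0 <= score <= 10:
            raise ValueError(f"score {score} is outside the 0-10 scale")


def suspicion_gap(baseline_scores: list[int], user_scores: list[int]) -> float:
    """Return mean user suspicion minus mean baseline suspicion.

    A negative value means users reported less suspicion than the
    assessors who rated the test email expected (an assumed reading).
    """
    _check_scores(baseline_scores)
    _check_scores(user_scores)
    return mean(user_scores) - mean(baseline_scores)


# Hypothetical data: three assessors rated the test email before the pen test,
# and five users answered the same 0-10 question afterward.
baseline = [6, 7, 5]
users = [3, 4, 6, 2, 5]
print(suspicion_gap(baseline, users))
```

In practice the open-ended answers would also need to be coded against the components of the model, a step this sketch leaves out.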

Finally, Vishwanath offers an alternative system for assessing cyber hygiene, which he defines as “the cyber security practices online users should engage in to protect the safety and integrity of their information on Internet-enabled devices from being compromised in a cyber-attack.” The results of this assessment, he argues, can be combined with his approach to phishing pen tests to sharpen the focus of subsequent training. The trouble with most approaches to cyber hygiene, Vishwanath argues, stems from the analogy to human health precautions built into the very term “hygiene.” That implicit analogy, he contends, has too often led to recommended practices that do not fit computer systems or human interactions with them. Some of the recommendations, such as looking to see whether a website has an SSL certificate, may be less than effective, given that a substantial majority of phishing websites possess such certificates. Others, such as employing lengthy, complex passwords, may be counterproductive if they are so difficult for users to follow that they end up encouraging less secure workarounds. Drawing on interviews with security professionals in private and public organizations, Vishwanath offers a 25-question inventory, grouped in five categories: storage and devices, authentication and credentials, friend connections and social media, email and messaging, and transmission. Although his cyber hygiene inventory covers broader ground than his phishing models, Vishwanath proposes several ways in which the results of both diagnostic efforts can be combined to help security professionals reduce organizational cyber risk.

***

If “The Weakest Link” brings to a wider audience a number of practical methods for improving anti-phishing testing and training, it will have made a significant contribution. Putting aside whether every specific element of Vishwanath’s models and indexes is ideal, the importance of the basic lessons he attempts to drive home seems considerable. For phishing testing to be effective, the difficulty of the test emails used must be gauged in a systematic way. For the results of that testing to be genuinely useful, they must include data that gets at users’ habitual behaviors and thinking processes in responding to phishing emails. Without that data, neither the substance of subsequent training nor the appropriate audience for different kinds of training can be tailored for maximum effect.

If Vishwanath’s efforts at systematizing analytics for and establishing metrics to guide phishing testing and training deserve praise, his claims would be even more compelling if he laid out more clearly the empirical bases for his research. Some of those bases are provided in the research papers that first set out many of the models in “The Weakest Link.” But the book itself rarely offers more than anecdotal accounts of experiments or assignments at individual organizations. It will be important for other researchers to attempt to replicate Vishwanath’s results and for practitioners who put his methods into practice to report their results in a systematic way. If Vishwanath, ideally working in collaboration with an independent organization, were to create a database in which such academic and nonacademic findings could be collected and made available to other researchers and security professionals, then his efforts could be thoroughly tested and refined, a process not unlike the one he himself recommends for improving anti-phishing testing and training. Relying on a process of that sort may be particularly important as chat bots and other AI-enabled tools make phishing emails all the more sophisticated.


Jonathan G. Cedarbaum is a professor of practice at George Washington University Law School, affiliated with the program in national security, cybersecurity, and foreign relations law. During the first year of the Biden administration, he served as Deputy Counsel to the President and National Security Council Legal Advisor.