Saturday, September 16, 2017

The Genius Fallacy

During our department's PhD Open House this past spring, a student asked what I thought made a PhD student successful. I realized that my answer now is different from what it would have been a few years ago.

My friend Seth tells me I need to build more suspense in my writing*, so let me first tell my life story.

The whole time I was growing up, I was slightly disappointed that I wasn't some kind of prodigy. It seemed that my parents were telling me every day about so-and-so's toddler son who was playing Beethoven concertos from memory, so-and-so's daughter who, as an infant, had already completed a course on special relativity. In order to give me the same opportunities to demonstrate my genius, my parents spent all their money on piano lessons, gymnastic classes, writing camps, art camps, tennis camps, and extracurricular math classes. Unfortunately, nobody ever said, "This is the best kid I have ever seen. I must take her away from her family to train her for greatness."

"Child prodigies have hard lives," my father would tell me, probably trying to hide his disappointment. "It can be difficult for them to make friends because others can't relate to how gifted they are."

"Just work hard, be a nice person, and try to be happy," my mother would tell me. "You didn't know how to cry when you were born. I'm glad you're able to talk in full sentences."

Despite the comforting words from my parents, there was always a part of me that held out hope of discovering a secret prodigious talent. But the angst of not being a prodigy was small compared to the existential angst of being newly alive and so I mostly tried to work hard, to be a nice person, and to be happy. This got me all the way to college, where I thought I could leave all this prodigy nonsense behind me.

In college, I discovered that the pressure to be immediately and wildly gifted came in another form. In my first two years of school, I attended many talks and panels by professors telling us what we should do with our lives. I attended a research panel in the economics department, where one of the professors kept repeating the word "star."

"You have to be a super star to succeed in a department like ours," he said about what it meant to be on the tenure track in the economics department. "I want undergraduate researchers who are stars."

I didn't know what a star was and I didn't presume to be one, but I liked the professor's research, so I emailed him my resume and said I would like to work with him.

He never wrote back.

I resigned myself to not being a star. I took hard classes with people who had medaled in math, informatics, and science olympiads, wondering how it would feel to do the problem sets if I had such a gifted, well-trained mind. I also became concerned about my future. What was my place in a world that worshipped instatalent?

Things began to change when I started talking more with the professors in the Computer Science department. Despite my lack of apparent star quality, my professors seemed to like answering the questions I asked them. They pitched me projects I could do, and before I knew it I was applying to PhD programs and preparing to spend the next few years doing academic research. As I was graduating, I spoke with one of my professors to get advice about my future in research.

"Research isn't just about smarts," my professor told me. At the time, I thought this was a white lie that professors told to their students who weren't prodigies.

Then she told me something that turned my worldview upside down. "My biggest concern for you, Jean, is that you need to start finishing projects," she told me. "You need to focus."

It was then that I began to realize that the instagenius was but a myth. I had gone from interest to interest, from project to project, waiting to find It, that easy fit, that continuous honeymoon. With some projects I had It for a while, long enough to demonstrate to myself and others that I could finish. Then I moved on, waiting to fall in love with a problem, waiting for a problem to choose me. What I had failed to see was that this relationship with a problem didn't just happen: I had to do my share of the work.

Still, I clung to the dream of the easy problem. At Google, employees get to have a 20% project: a side project they spend the equivalent of one day a week working on that may or may not make its way into production eventually. In graduate school, my 20% project was looking for an easier project--a project with which I had more chemistry, a project with fewer days lost to dead ends and angst. One of my hobbies involved interviewing for internships in completely different research areas. Another one of my hobbies was fantasizing about becoming a classics PhD student, despite knowing no ancient languages. (I once took an upper-level literature seminar on Aristotle with the leading world scholar on Homeric poetry and I thought he had a pretty good life.)

But because I like to finish what I started, the PhD became a process of learning to persevere. Instead of indulging the temptation to switch projects, advisors, or even schools, I kept going. I endured something like five rounds of rejections on the first paper towards my PhD thesis, and multiple years of people telling me that maybe I should find another topic, because I didn't seem in love. Eventually, I learned that every problem that looks like it might be easy has hard parts, every problem that looks like it might be fun has boring parts, and all problems worth solving are full of dead ends. I finally learned, in the words of my friend Seth, that "the grass is brown everywhere."

And this shattering of my belief in instagenius has shaped my conception of what makes a student a star. There was a time when I, like many people, thought that the superstars were the ones who sounded the most impressive when they spoke, or who had the most raw brainpower. If you had asked me what made a good researcher, I might have named other traits, like creativity and good taste in problems. And while all of these certainly help with being a good researcher, there are plenty of people with these traits who do not end up being successful.

What I have learned is that discipline and the ability to persevere are at least as important to success as being able to look like a smart person in meetings--if not more so. All of the superstars I've known have worked harder--and often faced more obstacles, in part due to the high volume of work--than other people, despite how much it might look from the outside like they are flying from one brilliant result to another. Because of this, I now want students who accept that life is hard and that they are going to fail. I want students who accept that sometimes work is going to feel like it's going nowhere, to the point that they wish they were catastrophically failing instead, because then at least something would be happening. While confidence might signal resilience and a formidable intellect might decrease the number of obstacles, the main differentiator between a star and a merely smart person is the ability to keep showing up when things do not go well.

It has become especially important for me to fight the idolization of the lone genius because it is not just distracting, but also harmful. Currently, people who "look smart" (which often translates into looking white, male, and/or socioeconomically privileged) have a significant advantage for two main reasons. The first reason has to do with self-perception. Committing to hard work and overcoming obstacles is easier if you think it will pay off. If someone already does not feel like they belong, it is easier for them to stop trying and self-select out of a pursuit when they hit a snag. The second reason has to do with perception by others. Research suggests that in fields that value innate talent, women and other minorities are often stereotyped to have less of it, leading to unfair treatment.

And so I've written this post not just to reveal my longstanding delusions of grandeur, but also to start a discussion about how the myth of instagenius holds us back, as individual researchers and as a community. I would love to hear your thoughts about how we can move past the genius fallacy.

Related writing:

* Seth also tells me the main idea of this blog post is the same as Angela Duckworth's book Grit. I guess I should tell you that you could read that instead of this. On the subject of the lack of originality of my ideas, you should also read what Cal Newport has to say about the "passion trap."

Wednesday, August 09, 2017

Guest Post: The Real Problem Isn't Gender; It's the Modern Media

This guest post by Seth Stephens-Davidowitz was adapted from a comment he wrote on a Facebook post of mine sharing this essay.

While gender in tech is certainly an issue, a lot of the controversy over it is unnecessary. What recently happened with the Google memo is a classic case of Scott Alexander’s Toxoplasma of Rage, one of the most brilliant pieces I have ever read. Read his post. Then read it again.

The stories that go viral are those that maximize anger and foster the most disagreement.

Guy writes a memo with a lot of true statements but an aggressive tone bound to infuriate some people. Within two days, everybody is predictably furious.

My hypothesis is that an overwhelming majority of people actually agree on many of the points of contention--or would agree if they were phrased a little less aggressively, in a tone less likely to create controversy and less likely to go viral.

How many people, for example, agree with the following statement?

"We do not know why CS majors are 80% male. It is possible that, even though millions of women have a passion for computer science, there are, in aggregate, fewer women than men who have this passion. We don't know since computer science is kind of new. And also we don't really understand why female CS majors rose to 40% and then plummeted. Since it is possible that discrimination and stereotypes play a role, we should devote resources to making sure everybody with interest in these high-status jobs has ample opportunity to pursue them. Also, everybody should be judged based on their own interest and aptitude in a job, not how many people of their gender would want that job. Finally, the majority of women in tech--as well as many other high-powered fields--have said they have faced sexism, and we should work really hard to stop that."

This addresses many of the controversies that were raised by the James Damore memo and the responses to it, but is phrased in a way such that few people would find it objectionable. Perhaps we should stop falling for these traps that maximize rage and instead try sober analysis. We may find a surprising amount of consensus.

Lastly, no young person, man or woman, should actually be training for anything--driving cars, teaching kids, diagnosing diseases, or writing programs--because AI will soon do all that for us. ;)

For 352 pages of sober analysis on even more controversial topics, you can check out Seth’s book Everybody Lies.

Saturday, April 01, 2017

Techniques for Protecting Comey's Twitter: A Taxonomy

Person in the know calling me out.
After my post about how the Comey Twitter leak was the most exciting thing ever for information flow security researchers, I had some conversations with people wanting to know how to tell the difference between information that is directly leaked and information that is deduced. Someone also pointed out that I didn't mention differential privacy, a kind of statistical privacy that talks about how much information an observer can infer. It's true: there are many mechanisms for protecting sensitive information, and I focused on a particular one, both because it was the relevant one and because it's what I work on. :)

Since this Comey Twitter leak is such a nice example, I'm going to provide more context by revisiting a taxonomy I used in my spring software security course, adding statistical privacy to the list. (Last time I had to use a much less exciting example, about my mother spying on my browser cookies.)

  • Access control mechanisms resolve permissions on individual pieces of data, independently of a program that uses the data. An access control policy could say, for instance, that only Comey's followers could see who he is following. You can use access control policies to check data as it's leaving a database, or anywhere in the code. What people care about with respect to access control is that the policy language can express the desired policies, provide provable guarantees that policies won't accidentally grant access, and be checked reasonably efficiently.
  • Information flow mechanisms check the interaction of sensitive data with the rest of the program. In the case of this Comey leak, access control policies were in place some of the time. For example, if you went to Comey's profile page, you couldn't see who he was following. The journalist ended up finding his page by looking at the other users suggested by the recommendation algorithm after requesting to follow hypothesized-Comey. (This was aided by the fact that Comey follows few people.) In this case, it seems that Instagram was feeding secret follow information into the recommendation algorithm without realizing that the results could leak follow information. An information flow mechanism would make sure that any computation based on secret follow information could not make its way into the output from a recommendation algorithm. If the follow list is secret, then so is the length of that list, the people followed by people on the follow list, photos of people from the list, etc.
  • Statistical privacy mechanisms prevent aggregate computations from revealing too much information about individual sensitive values. For instance, you might want to develop a machine learning algorithm that uses medical patient records to do automated diagnosis given symptoms. It's clear that individual patient record information needs to be kept secret--in fact, there are laws that require people to keep it secret. But there can be a lot of good if we can use sensitive patient information to help other patients. What we want, then, is to allow algorithms to use this data, but with a guarantee that an observer has a very low probability of tracing diagnoses back to individual patients. The most popular formulation of statistical privacy is differential privacy, a property over computations that allows a computation only if observers can distinguish the original data from slightly different data with only very low probability. Differential privacy is very hot right now: you may have read that Apple is starting to use it. It's also not a solved problem: my collaborator and co-instructor Matt Fredrikson has an interesting paper about the tension between differential privacy and social good, calling for a reformulation of statistical privacy to address the current flaws.
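To make the statistical privacy idea concrete, here is a minimal sketch of the classic Laplace mechanism for a counting query. This is a toy illustration, not production code; the patient records, field names, and choice of epsilon are all made up:

```python
import random

def laplace_noise(scale):
    # Laplace(0, scale) sampled as the difference of two exponential draws.
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

def private_count(records, predicate, epsilon):
    # A counting query has sensitivity 1 (adding or removing one person
    # changes the count by at most 1), so Laplace noise with scale
    # 1/epsilon gives epsilon-differential privacy for this query.
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)

# Hypothetical patient records; ids and fields are invented.
patients = [{"id": 1, "diagnosis": "flu"},
            {"id": 2, "diagnosis": "flu"},
            {"id": 3, "diagnosis": "cold"}]

noisy_flu_count = private_count(patients,
                                lambda p: p["diagnosis"] == "flu",
                                epsilon=0.5)
```

The released count is useful in aggregate but noisy enough that no observer can confidently tell whether any single patient's record is in the data.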
For those wondering why I didn't talk about encryption: encryption focuses on the orthogonal problem of putting a lock on an individual piece of data, where locks can have varying cost and varying strength. Encryption involves a different kind of math--and we also don't cover encryption in my spring course for this reason.

Another discussion I had on Twitter.
Discussion. Some people may wonder whether the Comey Twitter leak is an information flow leak or some other kind of leak. It is true that in many cases this Instagram bug may not be so obvious, because someone is following many people and the recommendation algorithm has more to work with. I would argue that it is squarely in the purview of information flow mechanisms. If follow information is secret, then recommendation algorithms should not be able to compute using this data. (Here, it seems like what one means by "deducible" is "computed from," and that's an information flow property.) We're not in a situation where these recommendation engines are taking information from thousands of users and doing something important. It's very easy for information to leak here, and it's simply not worth the loss to privacy!

Poor, and in violation of our privacy settings.
Takeaways. We should stand up for ourselves when it comes to our data. Companies like Facebook are making recommendations based on private information all the time, and not only is it creepy, but it violates our privacy policies, and they can definitely do something about it. My student Scott recently made $1000 from Facebook's bug bounty program reporting that photos from protected accounts were showing up in keep-in-touch emails from Instagram. If principles alone don't provide enough motivation, maybe the $$ will incentivize you to call tech companies out when you encounter sloppy data privacy practices.

Friday, March 31, 2017

Five Research Ideas Instagram Could Have Used to Protect Comey's Secret Twitter

Even though cybersecurity is one of the hottest topics on the Internet, my specific area of research, information flow security, has remained relatively obscure. Until now, that is.

You may have heard of "information flow" as a term that has been thrown around with terms like "data breach," "information leak," and "1337 hax0r." You may not be aware that information flow is a specific term, referring to the practice of tracking sensitive data as it flows through a program. While techniques like access control and encryption protect individual pieces of data (for instance, as they leave a database), information flow techniques additionally protect the results of any computations on sensitive data.

Information flow bugs are usually not the kinds of glamorous bugs that make headlines. Many of the data leaks that have been in the public consciousness, for instance the Target and Sony hacks, happened because the data was not protected properly at all. In these cases, having the appropriate access checks, or encrypting the data, should do the trick. But "why we need to protect data more better" is harder to explain. Up through my PhD thesis defense, I had such a difficult time finding headlines that were actually information flow bugs that I resorted to general software security motivations (cars! skateboards! rifles!) instead.
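The distinction between checking data at the point of release and tracking it through computations can be sketched in a few lines. Below is a deliberately minimal taint-tracking toy in Python (the class and function names are my own invention, not any real library):

```python
class Tainted:
    """Wrap a sensitive value; anything computed from it stays tainted."""
    def __init__(self, value):
        self.value = value

    def map(self, f):
        # Derived results carry the taint along with them.
        return Tainted(f(self.value))

def release(x, viewer_is_authorized):
    # The check at the output boundary inspects taint, not just the raw
    # value, so results *computed from* sensitive data are protected too.
    if isinstance(x, Tainted):
        return x.value if viewer_is_authorized else "<redacted>"
    return x

# Hypothetical protected follow list.
follows = Tainted(["patrice", "trish"])
num_follows = follows.map(len)  # a derived value: still tainted

release(num_follows, viewer_is_authorized=True)   # 2
release(num_follows, viewer_is_authorized=False)  # "<redacted>"
```

An access check that only guards the follow list itself would happily let `num_follows` through; tracking the flow is what protects the derived value.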

From the article.
Then along came "This Is Almost Certainly James Comey's Twitter Account," an article I had been waiting for since I started working on information flow in 2010. The basic idea behind the article is this: a journalist named Ashley Feinberg wanted to find FBI director James Comey's secret Twitter account, and so started digging around the Internet. Feinberg succeeded within four hours thanks to some cleverness and a key information leak in Instagram: when you request to follow an Instagram account, Instagram makes algorithmic suggestions of other accounts to follow. In this case, the algorithmic suggestions for Comey's son Brien included several family members, including James Comey's wife--and the account that Feinberg deduced to be James Comey's. It seems that Comey uses the same "anonymous" handle on Instagram as he does on Twitter. And so Instagram's failure to protect Brien Comey's protected "following" list led to the discovery of James Comey's Twitter account.

So what happened here? Instagram promises to protect secret accounts, which it (sometimes*) does. When one directly views the Instagram page of a protected user, one cannot access that person's photos, who that user is following, or who follows that user. This might lead a person to think that all of this information is protected all of the time. Wrong! It turns out the protected account information is visible to the algorithms that suggest other users to follow, a feature that becomes--incorrectly--visible to all viewers once a follow is requested, presumably because whoever implemented this functionality forgot an access check. In this case the leak is particularly insidious because while the profile photos and names of the users shown are all already public, they are likely shown as a result of a computation on secret information: Brien Comey's protected follow information. (This is a subtle case to remember to check!) In information flow nomenclature, this is called an implicit flow. When someone is involved in a lot of Instagram activity, the implicit flow of the follow information may not be so apparent. But when many of the recommended follows are Comey family members, many of whom use their actual names, this leak becomes much more serious!
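A tiny toy recommender shows why this kind of implicit flow is so easy to miss. In the sketch below (accounts and logic invented for illustration; this is not Instagram's actual algorithm), every suggestion shown is a public profile, yet the ranking itself is computed from a protected list:

```python
# Hypothetical protected follow lists and public profiles.
protected_follows = {"brien": {"jim", "patrice", "pete"}}
public_profiles = ["jim", "patrice", "pete", "unrelated_user"]

def suggest(requested_account, limit=3):
    # Buggy recommender: each suggestion is a *public* profile, but the
    # ranking is a computation on the *protected* follow list -- an
    # implicit flow of secret data into public output.
    target = protected_follows.get(requested_account, set())
    ranked = sorted(public_profiles, key=lambda p: p in target, reverse=True)
    return ranked[:limit]

# Any viewer who requests to follow "brien" learns whom he follows,
# even though his follow list is never shown directly.
suggest("brien")  # the three accounts Brien secretly follows
```

No single output value here is secret, which is exactly why a per-value access check at the display layer would never fire.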

Creepy Facebook search, from express.co.uk.
In the world of information flow, this article is a Big Deal because it so perfectly illustrates why information flow analyses are useful. For years, I had been jumping up and down and waving my arms (see here and here, for instance) about why we need to check data in more places than the point where it leaves the database. Applications aren't just showing sensitive values directly anymore, but the results of all kinds of computations on those values! (In this case it was a recommendations algorithm.) We don't always know where sensitive data is eventually going! (As was the case when Brien Comey's protected "following" list was handed over to the algorithm.) Policies might depend on sensitive data! We may even compute where sensitive data is going based on other sensitive data! In a world where we can search over anything, no data is safe!

Until recently, my explanations have seemed subtle and abstract to most, in direct contrast to the sexy flashy security work that people imagine after watching Hackers or reading Crypto. By now, though, we information flow researchers should have your attention. We have all kinds of computations over all kinds of data going to all kinds of people, and nobody has any clue what is going on in the code. Even though digital security should be one of the main concerns of the FBI, Comey is not able to avoid the problems that arise from the mess of policy spaghetti that is modern code.

Fortunately, information flow researchers have been working for years on preventing precisely this kind of Comey leak**. In fashionable BuzzFeed style, I will list exactly five research ideas Instagram could adapt to prevent such leaks in the future:
  1. Static label-based type-checking. In most programming languages, program values have types. Types usually tell you simple things, like whether something is a number or a list, but they can be arbitrarily fancy. Types may be checked at compile time, before the program runs, or at run time, while the program is running. There has been a line of work on static (compile-time) label-based information flow type systems (starting with Jif for Java, with a survey paper here describing more of this work) that allows programmers to label data values with security levels (for instance, secret or not) as types, and that propagates those labels through the program to make sure sensitive information does not flow to places that are less sensitive. These type systems give guarantees about every run of the program. The beauty of these type systems is that while they look simple, they are clever enough to capture the kind of implicit flow that we saw with algorithms leaking Brien Comey's follow information. (We'd label the follow lists as sensitive, and then any values computed from them couldn't be leaked!)
  2. Static verification. Label-based type-checking is a lightweight way of proving the correctness of programs according to some logical specification. There are also heavier-weight ways of doing it, using systems that translate programs automatically into logical representations and check them against the specification. Various lines of work using refinement types--super fancy types that depend on program values--could be used for information flow. (An example of a refinement type is {int x | canSee(alice, x)}, the type of an integer x that can exist only if user "alice" is allowed to see it according to the "canSee" function/predicate.) Researchers have also demonstrated ways of proving information flow properties in systems like Ironclad and mCertiKOS. These efforts are pretty hardcore and require a lot of programmer effort, but they allow people to prove all sorts of boutique guarantees on boutique systems (as opposed to the generic type-system guarantees on the subset of a language that is supported).
  3. Label-based dynamic information flow tracking. Static label-based type-checking, while useful, often requires the programmer to put labels all over programs. Systems such as HiStar, Flume (the specific motivation of which was the OKCupid web server), and Hails allow labeling of data in a way similar to static label-based type systems, but track the flow of information dynamically, while the program is running. The run-time tracking, while it makes it so that programmers don't have to put labels everywhere, comes at a cost. First, it introduces performance slowdowns. Second, we can't know if a program is going to give us some kind of "access denied" error before it runs, so there could be accesses denied all over the place. Many of these systems handle these problems by doing things at the process level: if there is an unintended leak anywhere in the process, the whole process aborts. (Those who haven't heard of processes can think of the process as encapsulating a whole big task, rather than an individual action, like doing a single arithmetic operation.)
  4. Secure multi-execution. Secure multi-execution is a nice trick for running black-box code (code that you don't want to--or can't--change) in a way that is secure with respect to information flow. The trick is this: every time you reach a sensitive value, you execute the sensitive value in one process, and you spawn another process using a secure default input. The process separation guarantees that sensitive values won't leak into the process containing the default value, so you know you should always be allowed to show the result of that one. As you might guess, secure multi-execution can slow down the program quite a bit, as it needs to spawn a new process every time it sees a sensitive value. To mitigate this, my collaborators Tom Austin and Cormac Flanagan developed a faceted execution semantics for programs that lets you execute a program on multiple values at the same time, with all of the security guarantees of secure multi-execution.
  5. Policy-agnostic programming. While all of these other approaches can prevent sensitive values from leaking information, if we want programs to run most of the time, somebody needs to make sure that programs are written not to leak information in the first place. It turns out this is pretty difficult, so I have been working on a programming model that factors information flow policies out of the rest of the program. (If I'm going to write a whole essay about information flow, of course I'm going to write about my own research too!) Instead of having to implement information flow policies as checks across the program, where any missing check can lead to a bug, type error, or runtime "access denied," programmers can now specify each policy once, associated with the data, along with a default value, and rely on the language runtime and/or compiler to make the code execute according to the policies. In the policy-agnostic system, the programmer can say that Brien Comey's follows should be visible only to his followers, and the machine becomes responsible for making sure this policy is enforced everywhere, including in the code implementing the recommendations algorithm. The challenges are that policies can depend on sensitive values, that sensitive values may be shown to viewers whose identities are computed from sensitive values, and that enforcing policies usually means implementing access checks across the code. Our semantics for the Jeeves programming language (paper here) addresses all of these issues using a dynamic faceted execution approach, and we have also extended this programming model to handle applications with a SQL database backend (paper here). We are also working on a static type-driven repair approach (draft here).
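The flavor of faceted, policy-agnostic execution can be glimpsed in a few lines. Here is a heavily simplified, hypothetical sketch in Python (real Jeeves syntax and semantics differ; the accounts and policy are invented): a sensitive value carries both a secret facet and a public default, and the policy decides which facet a viewer sees only at output time.

```python
class Faceted:
    """A sensitive value with two facets: what authorized viewers see
    (secret) and the default everyone else sees (public), plus a policy."""
    def __init__(self, secret, public, policy):
        self.secret, self.public, self.policy = secret, public, policy

    def map(self, f):
        # Computation runs on both facets; the policy travels along,
        # so derived values stay protected automatically.
        return Faceted(f(self.secret), f(self.public), self.policy)

    def resolve(self, viewer):
        # Only at output time does the policy pick a facet to show.
        return self.secret if self.policy(viewer) else self.public

# Brien's follow list, visible only to approved followers (made-up policy).
approved_followers = {"mom", "dad"}
follows = Faceted(secret=["jim", "patrice"], public=[],
                  policy=lambda viewer: viewer in approved_followers)

# The recommendation code never mentions the policy...
recommendations = follows.map(sorted)

# ...yet the output is policy-correct for every viewer.
recommendations.resolve("mom")       # ['jim', 'patrice']
recommendations.resolve("stranger")  # []
```

The point of the design is that the policy is stated once, with the data, and the "recommendation" code stays entirely policy-agnostic.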
I don't know how much this Twitter account leak upset the Comeys, but reading this article was pretty much the most exciting thing that I have ever done. Up until now, most people have thought about security in terms of protecting individual data items, rather than in terms of a complex and subtle interaction with the programs that use them. This has started to change in the last few years as people have been realizing just how much of our data is online, and just how unreliable the code is that we trust with this data. I hope that this Comey leak will cause even more people to realize how important it is to reason deeply about what our software is doing. (And to fund and use our research. :))

* A student in my spring software security course (basically, programming languages applied to security), Scott, had noticed earlier this semester that emails from Instagram allowed previews of protected accounts he was not following. He reported this to Facebook's bug bounty program and made $1000. I told him to please write in the course reviews that the course helped him make money.
** Note that a lot of other things are going on in this Comey story. The reporter used facts about Comey to figure out the setup, and also some clever inference. But this clever inference exploited a specific information leak from the secret follows list to the recommendations list, and this post focuses on this kind of leak.

Wednesday, March 08, 2017

Autoresponse: Striking: A Day Without a Woman

In front of the Federal Building, Pittsburgh.
Dear Message Sender,

  I am not responding to email on March 8, 2017 because I am observing A Day Without a Woman. In the afternoon, I will be joining students at CMU in a silent protest and attending a rally at the City-County building in downtown Pittsburgh.

  Despite the efforts and progress made towards gender equality, women do not have an equal voice, and we are not appreciated equally in society. For example:
  • The gender wage gap persists, and two-thirds of minimum wage earners are women.
  • The House and Senate are currently 19% women. This means an 81% male group is making decisions that affect women's health and lives.
  • The United States still has not had a female president, even though many countries we'd like to think we're more progressive than currently have a woman in power.
  • Only 24 of the Fortune 500 CEOs are women. Money is power, and women have less of it.
Some may say that women are simply less ambitious, or don't want to be in positions of power and influence as much as men do. Study after study--and I'm happy to talk in more detail--has shown that women who do have the ambition face far more obstacles than men do. Also, my statistics above focus on what people like to call "privileged" women, but the undervaluing of female labor (including domestic and emotional labor) makes life even harder for those in less fortunate circumstances.

  There are many ways you can show support. The first is to attend local rallies, especially if your employment situation is one where you will face few consequences for doing so. Even if you are not a woman and/or not striking today, here are some things you can do:
  • Listen to women, and call people out when women's voices are not heard.
  • Question your own biases. (You can have biases even if you are a woman!)
  • Vote for women. Champion women. Mentor women. (In that order.)
  • Support people who are striking, and who are more actively fighting for women's rights and the appreciation of women's labor, both financially and by amplifying their voices.

Yours in solidarity,
Jean

Friday, December 23, 2016

Let's Talk About How We Talk About Science

A while ago, Brian Burg commented on Twitter that he would like to see more discussion of marketing in academia. I decided I'd rather write a meta-post about how we need to talk about how marketing is affecting our evaluation of science.

[Image: Beyonce on a magazine cover.]
Beyonce.
[Image: Kim Kardashian on a magazine cover.]
Kim.
If you want to be on the cover of Glamour magazine, you know what to do. Put your hair in glamorous waves, wear something small, and stare directly at the camera with slightly open lips. It helps if you have the Look. (Has anyone else noticed that Beyonce and Kim are being airbrushed to look more and more like each other all the time?)

If you want to be on the cover of a glamour journal, things are not much different. Open with a deep-sounding but incontestable vision of where you think the world is going. Home in on a specific problem. Make the problem sound hard. Make your solution easy for a casual reader to understand. Write with the voice of a winner. It helps to have picked a topic that a science journalist might drool over. Oh, and if you are going for the cover: make sure to have good images.

But, you might say, fashion magazines are frivolous, and science is Serious*. I'll be the first to agree that the investigation of the fundamental truths of reality is a worthy endeavor requiring brilliance, hard work, persistence, and all kinds of other positive qualities. (Side note: beauty is also hard work, and is used to oppress women.) But people determine which science is higher-profile than other science. People live in society, and it is widely acknowledged that society is superficial. Many a fairy tale involves a causal relationship between the changing of clothes and the changing of fortune. In Thomas Carlyle's satirical novel Sartor Resartus, religion itself is a matter of clothing.

In fact, a major part of my metamorphosis into a Real Researcher has involved accepting that appearance matters. When my advisor and I used to get papers rejected in the beginning of my PhD, we would spend a long time thinking about how to make the work so good that the paper was not rejectable. I have come to realize that this is the equivalent of failing to impress on a first date and hoping that soul-searching will address the issue for the future. Looking deeply into one's soul, while usually good in the long term, often does not address the problem of first impressions.

Sure, part of preparing one's research for wider dissemination involves doing what everyone would expect of good communication: having a clear description of the goals, clear explanations of the solutions, and a clear explanation of the context with respect to previous work. Good logical reasoning goes a long way. Good evaluation of results does as well. But if we look at the papers that do--and don't--make it into the "glamour" conferences and journals, we begin to suspect that there are other factors at play.

If we look more closely, we can see that American** science replicates patterns of elitism and gatekeeping that we see in the rest of American society. In Privilege: The Making of an Adolescent Elite, Columbia sociology professor Shamus Khan reports on behavioral traits that characterize the new elite. Khan describes how, rather than stemming from family prestige, the social status of the boarding school students he observes comes from an ease of moving through social situations and a cultural omnivorousness (embracing both the high-brow and the low-brow). Especially since these behaviors are learned at elite institutions, they serve a gate-keeping function similar to explicit markers of socioeconomic status. People look for this ease and this omnivorousness, for instance when interviewing candidates, justifying their choice with some idea that such traits somehow make people more deserving. There is also a mythology about hard work that serves more as a justification than an explanation for elite status: students feel that they are receiving the benefits they do from society not because they were born into it, but because they "worked so hard to get there."

As it turns out, the training of elite scientists also involves learning gatekeeping behaviors. In science, there is a similar mythology about hard work being responsible for differential success. In Computer Science, the privileged behaviors I've observed include having research vision (as opposed to making solid technical contributions), being aggressive about imposing that research vision upon others, and having a "genius quality," which involves pattern-matching on similarities to previous successful scientists (often white men). Like ease of interaction and cultural omnivorousness, these traits are often associated with people deserving of recognition, but their presence does not mean the work will be good. I would not be surprised if having research vision and exhibiting genius quality were more correlated with being educated at an elite American institution than with potential for long-term scientific impact. With this premise, the recipe for academic fame involves not only marketing one's work as making positive contributions to science, but also demonstrating a combination of privilege and flash. The privilege here is more subtle than that of having cover-girl looks, but it is a very real kind of privilege nonetheless.

But how, you may wonder, do people not see through the shiny exterior? Those who have been following American politics in the last year may be familiar with the answer: insufficient attention. Publications are reviewed by researchers under increasingly high demands to pass quick judgments. Between December 2015 and February 2016, for instance, I had accidentally agreed to be on two concurrent major conference Program Committees, and had a reviewing load of over 60 full-length (12-page, 9-10 pt font) papers. (And I am not the only person who had such reviewing volume!) Had I only been on one Program Committee, the reviewing load would have still required me to evaluate, on average, a paper every two days over the course of two months. Under such reviewing pressure, it is easy to succumb to flash judgments, emotional first responses to a paper's Introduction section. It is easy to accept the paper with the good story over a paper with a deeper but more subtle result.

Despite all this, I believe in the future of science, and that we can shift back to a situation where we are making space for "real" science: what science looks like before the makeup and airbrushing. To do so, we need to wage a campaign similar to the one people waged against unreasonable beauty standards. We need to teach people to recognize--and be skeptical of--"Photoshopped" results: all that is too slick, too inspiring, and too good to be true, both in individual papers and in the story of a scientist's career. We need to raise more awareness about what "real" science looks like: the incremental results required on the way to big discoveries, the science that is foundational, necessary, and often full of subtleties difficult to communicate to non-experts. Making structural changes that reduce reviewing loads and allow for deeper evaluation would also reduce the incentives that have led to the proliferation of these practices.

Elite institutions are much more than a finishing school for scientists, but we have been moving to a model where the marketing is coming to dominate the science. To protect the pursuit of truth, we need to admit that people can be shallow when it comes to evaluating science. We need to talk about how we talk about science so we can make space for science that is slow, science that is subtle, and science that is outside the mainstream.

With thanks to Seth Stephens-Davidowitz, who told me my first draft lacked a cohesive point, and Adeeti Ullal, who very patiently helped me with the last paragraph.

* I don't believe fashion magazines are blanket frivolous, but you might.
** I don't have the depth of experience to comment on how this generalizes to other cultures.

Saturday, December 17, 2016

The Structured Procrastination Trap

A wise professor once told me to take advice with a grain of salt, as it is mostly highlights and wishful thinking. Structured procrastination is a prime example of wishful thinking doled out to students eager to ease growing pains.

Structured procrastination promises a productive life with minimal pain. The basic premise is that if you always do something other than the task you are supposed to do, you will always be doing something that you want to do. Don't want to write that report? Play ping-pong with your students instead, and people will be impressed with how easily you are taking life. Don't want to respond to emails? Read papers you like instead, and people will be surprised you make time to read papers. If you keep waiting, you will eventually want to do that thing you have been procrastinating, and then you can live a completely pain-free life!

Now let's look at the premises of structured procrastination. It requires that there is always a task that you can and want to do that is productive. It requires that deadlines make tasks miraculously desirable: that something being due soon makes a task easier to do, more than other factors (like how able you are to do the task). It requires that you have a good sense of how long tasks should take. For structured procrastination to make sense, you need to be at a point where life is simply a matter of execution.

In my many years of being alive, I have discovered that these premises often do not hold. When I was a graduate student and looking for shortcuts to the Productive Life, I felt like I was doing something wrong. When I aggressively tried to apply structured procrastination to my life, I produced a lot of bad work. There were long periods of time when I would try to get into immersive "flow states" where I could have pleasurable levels of focus, but everything felt difficult. I've spent a cumulative total of days, maybe weeks, of my life wondering why it takes me so long to write a paper, or to prepare a talk, or to debug my code. For years I thought that it was possible for life to always be easy, but I had somehow not figured out how to do it.

What I realized is that life is hard, and especially hard if you want to do things you have never done before. If you are doing something that requires you to grow, what you need is a lot of time, and the discipline to force yourself to keep going even when it feels like the most painful thing in the world. If you are doing a high-growth activity, you need to abandon the idea of structured procrastination. You need set hours when you are going to sit down (or stand up, or lie down) and stare at your notebook, or laptop, or the wall, dedicated to making progress on the Very Important Task. Limiting these hours makes it psychologically bearable. Making these hours the same time every day makes it more likely you can keep this up.

Of course, structured procrastination is not all bad. I have recommended this technique to many people, as it is a great way to get oneself out of unproductive loops when a looming deadline kills all desire to do anything. If you allow yourself to admit that you are not going to work on your Very Important Task, then you can at least do "productive" things (like make Ryan Gosling memes) instead of sitting around angsting (which could also be productive according to some value systems). Procrastination is also a good way to trick yourself into doing more things, because deadlines often do make people more efficient.

While structured procrastination provides a useful execution framework, there are times in life when you need to suck it up and do the Very Important Task. In fact, structured procrastination may be most seductive when what you need most is structure. For this reason, you should always think before you procrastinate, and avoid the trap of false busyness.