Interview with Prof. Sam Ransbotham

For podcast release Monday, February 15, 2016

KENNEALLY: Data – it’s not just for engineers anymore. We eat, breathe, sleep, and live in a world of data – a myriad points of information recorded constantly on smartphones and laptops and Fitbits.

Welcome to Copyright Clearance Center’s podcast series. I’m Christopher Kenneally for Beyond the Book. For businesses and individuals, all the captured data helps inform decisions – everything from where we eat and what we eat to how to reach our customers and satisfy their demands. When data is shared, it shapes reputations, with the potential to build and to tear down.

Research by Professor Sam Ransbotham of Boston College has uncovered how shared data also changes behaviors. He joins me now from his BC office. Welcome to Beyond the Book, Sam.

RANSBOTHAM: Thank you, Chris. Glad to be here. Appreciate you having me.

KENNEALLY: Looking forward to speaking with you. We’ll tell people that Sam Ransbotham is associate professor in the information systems department at Boston College. He is the guest editor for MIT’s Sloan Management Review data and analytics initiative. In 2015, he received a National Science Foundation CAREER Award for research on analytics and information security.

Sam, we thought we would chat with you because it is certainly the year of big data – at least it is in publishing, which is really kind of catching up to a world that may be already quite current in other parts of our economy, and in particular this notion of the sharing economy. Your research has gleaned some interesting insights on what sharing can do, and in particular how sharing is rather like a game. Tell us about your research.

RANSBOTHAM: Yeah, so a game, but a game in a good sense. What we’re thinking about is how the presence of data changes our interactions. If you think about the classic sharing economy case, which is ride-sharing, if you think about that transaction, historically you would walk into a taxi and take that taxi. The taxi would never see you again, and you would never see the taxi again. What that means is that that’s a one-time game. That’s a game where each of us has an incentive to kind of act poorly towards the other person.

But what happens in the sharing economy that I think is fundamentally different is it’s changed our relationships. What happens is this game is now repeated. You’re not just taking a ride with a ride-sharing service once, you’re probably going to take it multiple times. That creates a history. Meanwhile, that ride-sharing service is going to pick you up several times. Maybe not the same person, but the service will. That creates a history, as well. It’s that history that I think changes things.

KENNEALLY: It certainly would seem to. I really like your point that this one-off arrangement does encourage bad behavior, because we feel as if we’re anonymous or nameless somehow. But with data, of course, goes identity, and that really is a very helpful thing for businesses, but it really does something. It transforms customers.

RANSBOTHAM: Sure. I’ve isolated just now a point where that’s a positive, but there’s lots of unintended consequences of the data-tracking we’re doing across society. This is one consequence that I don’t know if we maybe thought through initially, but it seems to be positive. But there are lots of other consequences across the board. Why I think this one is particularly interesting is that we now want to behave in ways that are reputation-preserving, and that’s important.

KENNEALLY: Absolutely. When it comes to data and the transparency of data, businesses also can recognize the power of this. Your suggestion is that they should really do more sharing. No matter how much they’re already doing, they should try to do more. Sharing has – even though, as you pointed out, potentially negative consequences, the potential on the positive side is so much greater that sharing more and more is really what they should be doing.

RANSBOTHAM: Nothing in our world is binary or exact. Certainly with transparency, that’s the case. There are both positives and negatives if you think about the transparency that goes into data being collected on us all the time and we don’t know how it’s being used. We don’t know how that algorithm is working. We don’t know what’s going on behind the scenes. And when that’s the case, there’s some trouble. We have a tendency, for example, to be judged unfairly. What if that data is recorded incorrectly? You just don’t have any way of knowing that if it’s not visible and transparent. Or what if the practices that companies have in place are actually unfair or discriminatory? We don’t know that unless there’s transparency. That’s some of the downsides of invisibility.

Now, I don’t want to paint visibility as a perfect solution either. I just think it’s a little bit of an improvement. With visibility, if you know the algorithms that you’re being judged on, you’re a smart person, and you’re going to manipulate your behavior to fit that algorithm. If you know, as a rider in a ride-sharing context, that you’re being judged on whether or not you greeted people with a, hi, glad to be in the car, then you’re going to say that every single time, probably insincerely. So the degree to which these algorithms become transparent and become visible, we have to think through what that’s also going to do to our behavior.

KENNEALLY: It’s an ongoing experiment, Sam Ransbotham, and a really fascinating one. We’re all kind of living through a really new environment when it comes to business transactions. I’m thinking about publishing in particular. Publishers had a different set of customers in the analog world. Their customers were bookstores. Today, their customers in the digital world are increasingly the readers of these books. So sharing data about the kinds of books that are perhaps appealing to others who’ve read those same books is a good way to help sell to them, but it also, I think on the customers’ end, may worry customers that these publishers or whoever’s gathering the data know a bit more about them than they would like to share themselves.

RANSBOTHAM: Sure. There are great examples of that in the electronic reader industry. Before, as a publisher, you didn’t know who bought your book. You just knew where it went or which reseller or which wholesaler it went to. But now you may know the individual person, and not only that, you know whether that person read through page one, page two, page 10, read it to the end, read it in one sitting, read it in 40 sittings, read it three times or four times. That is a phenomenal amount of behavioral data that people just had to guess about before. There are profound implications for that across society in multiple dimensions.

One of the things you just mentioned was what we think of as a disintermediation. You’re getting rid of a lot of the intermediaries in the publishing process. You as the publisher know a lot about your reader that before had to filter through multiple stages. With each one of those stages, just like a game of telephone as a kid, the data gets murkier and more and more uncertain. Now, you’ve just got a much better insight into that reader.

Positives and negatives, though. You may then decide that every single book you need to churn out has to be a page-gripper to keep people engaged the whole time, and we may end up cranking out yet another clone of the last book that worked well. So that’s an example of what I think of as local optimization versus sort of a holistic optimization. You may be incrementally improving that micro-experience, losing the bigger picture of the macro-experience.

KENNEALLY: It’s interesting. The other question I have is regarding the quality of data. We collect all kinds of data, but all data is not created equally. The data you described that we had in the book world in the previous era was, as you say, rather murky, and it was extended. We didn’t have the direct relationship with it. Now, though, we have this data, whether it’s about where our readers are located or what they have read or not read, as you point out. But do we have other data that we should sort of look at with a bit of a jaundiced eye and sort of say, well, that’s not as good as other data?

RANSBOTHAM: Oh, sure. There are wide, wide variations in data quality. It’s a little bit disingenuous for us to think about poor-quality and good-quality data. All data is poor-quality to some degree. It’s always a matter of degree. I think what’s going to happen in our society as we go through this process is that we’ll start to iterate and get better and better.

I’m going to tie back to the transparency point. As we get more transparent, your ability to see that that data is incorrect, then, might be improved, and then there’s a chance that processes can be built into place that could improve data collection. Now, there may be multiple incentives around that. I think you probably have answered anonymous surveys clicking the first item on every list just to get it done with to win your free iPad Mini, just like everyone else has. So there’s data quality issues around that, as well. But despite that, it’s all better. It all seems to be improving, and it all seems to be getting better. I think that’s something that’s going to happen over time as our systems get better.

KENNEALLY: Finally, Sam Ransbotham, as you describe all this, it strikes me that we are each of us becoming data scientists. Again, just as some data is better than other data, some data scientists are better than others. You’re one of the better ones, of course. Are there some tips, are there some suggestions, that you as a professional in this field can offer the casual data scientist to help them really better understand the impact on their lives and be able to sort these things out better?

RANSBOTHAM: I like that you portrayed me as knowing a lot. Let’s emphasize that. Because actually, the truth is the more you figure out, the more you realize you just don’t know. There is so much cool stuff going on now, and I think every day I figure out more and more things that I would like to know more about.

But you point out the accessibility. People use the phrase democratization of data. One of the things that I’ve been writing about recently is this idea that democracy may not be the right word. Democracy implies that we’re all going to get to participate in this data economy. We all have the opportunity to participate, but that doesn’t mean we can. Actually, as you point out, ability is the big differentiator. So I think the idea is it’s going to be much more about meritocracy versus a democracy, and the meritocracy comes from what your skills are and what your ability is to use data.

The good thing is there are so many tools out there, there’s so much data out there, and the availability of online courses and intros – the internet is filled with lots of ways to get better at that. Any number of them can work. The question is finding the right one and putting in the effort to get to that merit stage. It won’t just come to you.

KENNEALLY: It certainly has been an opportunity for a quick online seminar in data. We’ve been speaking today with Sam Ransbotham. He is associate professor in the information systems department at Boston College, guest editor for MIT’s Sloan Management Review data and analytics initiative, and a recipient of the National Science Foundation CAREER Award in analytics and information security. Sam Ransbotham, thanks so much for joining us on Beyond the Book.

RANSBOTHAM: Thanks, Chris. Appreciate you having me.

