Have you ever thought about what really goes into supporting digital scholarship? Well, some may say it takes a village but here at the University of Illinois it’s bigger than that. It Takes a Campus. The Scholarly Commons will be interviewing experts across campus about all the new and exciting things that are happening to support digital scholarship. We will sit down with a specialist to learn about what they do, how they do it and why they got started working in their field. Hear what we mean when we say it takes a campus to do what we do.
Ben Ostermeier: Hello and welcome back to another episode of “It Takes a Campus.” My name is Ben, and I am currently a graduate assistant at the Scholarly Commons, and today I am joined by Dr. Ted Underwood, who is a professor at the iSchool here at the University of Illinois. Dr. Underwood, welcome to the podcast and thank you for taking time to talk to me today.
Ted Underwood: Hi Ben, it’s a pleasure to be here.
Ben: I wanted to get started by asking how you got started integrating digital methods into your research, since as I understand it your formal academic background is in English literature.
Ted: Right, it seems like a bit of an unusual turn, but it actually has a long history. Back in the 1990s when I was in grad school, it was already beginning to be clear that there were going to be opportunities, as our digital libraries got bigger, to pose questions about, you know, the evolution of ideas, development of literary form, and I tried to do that a bit in the ’90s using the very limited collections of texts we had then, and I published an article. But, you know, I didn’t go much further with it, because the collections were very limited, and also it wasn’t easy to do things then, so, you know, fast forward to, like 2009, and John Unsworth, then Dean of what was called the Graduate School of Library and Information Sciences here at Illinois, got in touch with me and drew me into a project, and I discovered, “Wow, we have Google Books now,” first of all, and then secondarily, and I think just as importantly, it’s easy to learn stuff now, like you can just go on the web and search how to do something, and sort of teach yourself. It did help that I had a little bit of programming background from the ’80s, but that was pretty dusty by that point. But, you know, things had just gotten to the point where we had the resources, and it was easy to teach yourself how to do stuff.
Ben: Great, so, I’m going to ask you possibly an annoying question…
Ted: Go for it.
Ben: …which I think every person who works in the Digital Humanities inevitably has to answer at some point, but, it’s a question that comes up, which is how you define the Digital Humanities, and to what extent does that definition matter, because, everybody seems to have their own definition, and inevitably it leads to, perhaps, interesting conversations. So I’m wondering what your thoughts are about that conversation?
Ted: Yeah, thank you, it’s not an annoying question. It’s an inevitable and important question, I think, actually, even though it’s true that people try to avoid it, because, as you say, it’s a complex term, a term that people use in different ways, and the slipperiness of the term does generate sometimes friction. But I don’t think that that is an accident, you know, or something that we can sort of step around. It is inherent to the term, and it’s not accidental. The term is deliberately vague, digital humanities, it could encompass, say, you know, using humanistic methods to study podcasts or blog posts, or digital media generally using traditional humanistic methods to study those things. Or it could mean using digital methods, whatever those are, maybe computational methods, statistics, what have you, to study, well, podcasts, but perhaps also printed books, perhaps movies from the 1950s. So, digital methods to study more traditional media, or it could be the way scholarship is produced. It could be digital humanities is whatever you’re doing if you put it on the web, it’s digital humanities, and that’s a valid way of using the term too. So there’s a lot of looseness there, and I don’t think it’s an accident. I think it is deliberate vagueness that’s constructed in order to create a concept that is loose enough to be welcoming, to welcome lots of different people, because there’s a real danger, there’s a lot of tension at this intersection between the traditions of the humanities and computational media, computational methods, so there’s a lot of risk that if you say, specifically, computational humanities, or humanities using numbers, some people will be like “Woah, that is not what I signed up, get out of here,” you know, that’s a real risk. 
Conversely, if you say, okay we’re going to study digital media, some people will say, “Well, I’m actually more interested in the 20th century, or the 19th century, and I’m interested in history, that’s what the humanities mean to me.” So there’s these tensions there, and we’ve tried to bridge them by constructing a term that is deliberately baggy, and that works, somewhat, it’s worked to sort of create a broad community of people and a lively conversation, but we shouldn’t be surprised when that dissolves and breaks apart. It was, the instability was built into that term from the beginning. So it does matter, it does matter that we understand the term, but we shouldn’t be surprised that it doesn’t come down to a crisp definition.
Ben: Yeah, great, and that kind of leads me to my next question, because on our previous episode we had Spencer Keralis, who is the digital humanities librarian here at the University of Illinois, and they talked about some of this tension, particularly between research that is published in an interactive digital format, such as on a platform like Scalar or Omeka, versus using digital methods in support of a more traditional communication, like a journal article or a monograph.
Ben: Do you agree that such a divide exists and what your thoughts are about that and are there ways that perhaps they could work better in tandem together or is it perhaps to be expected, at least, that there would be somewhat of a divide there within the digital humanities umbrella?
Ted: Yeah, I mean I do agree that a divide exists there, it’s not a crisp one, um, because, I mean, even if your research is setting out to produce basically articles and books, if you’re doing that using digital methods, you’re going to have data or code that needs to be preserved probably online, and you probably also have, you know, visualizations. So the lines sort of between traditional scholarly formats and new platforms do get blurry, but I do agree that there are some people in digital humanities for whom stretching the boundaries of the publication format counts as scholarship, to include maybe digital editing, for instance, rather than thesis-driven argument. There are people for whom that’s central. And then there are people who may welcome digital editing, but they’re primarily interested in producing new arguments, argument-driven scholarship. And, there’s no necessary conflict between those things, but they rub against the external world in different places. They conflict with existing institutions in different places. So for instance, if you’re doing, you know, if you’re building digital exhibitions on Omeka, then it’s very important to redefine what counts as scholarship in terms of promotion and tenure review, because the role of editing and building collections is often ambiguous, at least at research universities. That then becomes the point of friction between digital humanities and the rest of the world, whereas if you’re doing, say like, quantitative scholarship but scholarship that’s ultimately going to produce an article, then the point of friction might be, say, how do we go about training students to do this, because it’s not in the curriculum. So, and, frankly, ideally it would require a sequence of three or four courses, really to prepare students to do that, statistics, programming, you know, it can’t easily be done in one or two courses.
So, it’s not that two things are in conflict, but that they have sort of different battles to fight, and I do think that that produces a conflict in the sense that, you know, people are like, “Hey we need some help over here,” you know, that’s where the conflict comes from.
Ben: Yeah, and that leads me to a follow-up question, which is, as a professor who teaches courses, in my experience there are definitely certain challenges in digital humanities of training students in digital methods, in that typically it seems like digital humanities students come from a humanities background as opposed to a computer science background, typically anyways.
Ben: And so oftentimes there can be quite a learning curve for students…
Ted: Yep [laughs]
Ben: …and oftentimes they can be scared away by the challenges involved, so how do you deal with that, and what are ways we can perhaps do better at that?
Ted: Yeah I don’t think we have a good solution there yet, actually. That is definitely the case, and I’ve been evolving in, I’ll tell you the direction I’ve been evolving on that, and I think I’m still evolving, so like ten years ago, or nine years ago, back in 2011, 2012, I had the idea that it would be possible to do all of this in a course in the English department, which is where I was located then. And maybe we’d have a graduate course in the English department where it would be something like “Digital Humanities” or “Digital Methods in Literary Study,” and we’d, oh we’d explore the controversies about the nature of digital humanities and maybe along the way introduce students to some programming and statistics, and that is so impossible. [Both laugh] That’s now ludicrous, right, that’s actually like five courses that you’re trying to compress into one, but it seemed necessary, and in some ways it was necessary at the time, because you couldn’t assume that students, say, in an English department were going to expect to have to take three or four courses in this area, or be willing to, because it was still very new and controversial, so you were going to get one course, realistically that was going to be the curriculum. So there was just a limit on what you could actually do. You could not really teach computational methods in a course like that. And so, you know, I’ve dealt with that partly by expanding my role in the university where now I’m teaching in the School of Information Sciences, where there is a bigger DH curriculum and there are more students who are likely to have taken courses in programming or data science and be able to maybe take a more advanced course to be able to apply that to say look at unstructured data, look at text or images, but it’s still definitely a challenge, because realistically, like I say, it actually would be a three or four course sequence, not a two course sequence.
And so, there’s still considerable risk of rushing things, and I don’t think I’m avoiding that successfully yet, to be honest with you. It’s sort of like a coevolution between the way we teach this and the way that the curricular institutions around us sort of frame the topic, and what they suggest is possible. There’s another approach, I should say, there is another way to go about this, which is very popular, and I just don’t think it works either, which is to try to fit it all into one course by basically ditching the programming part, and say, “there are some user friendly tools out there, we can use those.” Like Voyant is one tool, a very good, it’s about as good as can be done in that space of sort of text analysis in your browser on the web. There are some other similar tools that promise to be user friendly, and they are to a certain point, but then you rapidly will run up against the limits of what you can actually do in those, say, graphical user interfaces. So, if we do it that way, we can squeeze it into one course or two courses, maybe, but then where do students go from there, is what I’m not certain. So that is a really big challenge, but I think ideally, my view would be, it means what we need to do is maybe define a little better, say a three course sequence in this space.
Ben: Yeah, that’s definitely something that I’ve had to deal with in my experience, because, for listeners who don’t know me, I was previously the technician for the IRIS Center at Southern Illinois University Edwardsville, which that’s the digital humanities center there, and we have a digital humanities minor there, and I was actually the first student to receive that minor, but…
Ben: …in working on the curriculum after I received that minor, a big challenge we’ve had to deal with is, do students actually need to have programming experience to receive that minor, and I did, but many students are… have some hesitation about that, and it seems like, to a certain extent, I think there’s a fear at least of those that design curriculum that if you have too much programming involved you’re going to scare students away, and…
Ben: …I don’t really have a good answer for that [laughs], but it’s a definite challenge.
Ted: I don’t either, but here’s what I see as likely to happen. We can build programs in the humanities that don’t have programming as part of them, and that can be a valid thing. I’m not against doing that. But if we do that, there will also be programs that arise in the social sciences, and in information science, and in computer science for that matter. They’re beginning to happen in departments of computer science already that use more flexible and adventurous kinds of computation than you can easily fit into a graphical tool, because, you know, social scientists are used to using statistics, and departments of information science and computer science exist also. So, it’s like if we don’t do it, they definitely will, because they can, you know [both laugh]. And humanities materials, movies, books, art, are fascinating, they’re really appealing, so departments of computer science will definitely go for that, they’re not gonna wait for us to do it, so it’s fine that, we could do it both ways, but the computational way is gonna happen somewhere, it’s just a question of where.
Ben: And do you think it’s better off happening from the end of the humanities going to the computer or the other way, do you think…
Ted: I would love there to be a bridge, I would love it to be a bridge, and I think it can be, I mean in some ways that’s the promise of information science as a place, is that you can have a single institution where really both of those perspectives are represented and are joining hands and collaborating. It can work across campus too, if you’ve got a humanist in an English department or history department collaborating with someone in computer science, that can also work. But I like the feeling of a school where you’ve got people from a lot of different disciplinary backgrounds collaborating. So, yeah I hope we’re able to hold that all together, but you know, just as with the term digital humanities itself, you end up describing a very big arc or bridge that’s fragile at lots of points, and it’s hard to hold together.
Ben: Yeah, as we’ve been talking, I realize we haven’t really had a whole lot of opportunity for you to talk about your particular research interests, so I wanted to give you the opportunity to talk about some of the computational work you’ve done in your research.
Ted: Oh sure, I mean…
Ben: That’s a big question so… [Both laugh]
Ted: …it varies, I’ll organize it under two heads, things I’ve done and things, sort of, maybe looking forward, things that I think are getting exciting. But a lot of what I’ve done is use large digital libraries to pose questions about long timelines in literary history. So, you know, how has the pace of narration changed, how much time passes in a typical page of a novel? Are we talking about a week of fictional time that passes in each page of reading, or is it a day, or is it sometimes increasingly in recent years, it’s more like two minutes per page. The pace has slowed down. And when that slowdown happened, is not something we had a good picture of. A lot of literary critics, for instance, thought that that happened at the beginning of the 20th century with modernism, now that we have a big picture we can see it was much more gradual. And similar sorts of questions about concreteness, the development of concreteness in fiction. And really it’s, you know all of these things come together in a way to make a bigger story about literary history and the study of literature, which is to a large extent, our idea of what literature is for has been shaped by certain aspects of literature that only developed very recently, like this emphasis on concrete particulars and brief, sort of, fragmentary moments that we now think is sort of, that experiential vividness and particularity is crucial to the mission of literature, so much so that it’s shaped the way we think we ought to be reading and interpreting literature. But it’s actually, if you look at the big picture, you can see where we got that. It’s a long story, a gradual story, and in fact, sort of our idea that you can’t use big numbers and panoramas to understand literature is a product of a history that, if you back up, we can actually see the panorama that generated that. So, that’s the story I’ve been telling.
But I think in years to come, I’m interested in looking at, not just at, sort of, big digital libraries and long surveys across long timelines. But I’m getting increasingly interested in understanding literature in detail, like how plot works, how suspense works, and I think it’s going, the nature of machine learning and of computation is evolving so rapidly that it is gonna become increasingly possible for us to pose some questions that seem like less social sciency, more interpretive, using machine learning.
Ben: Yeah, I’ve seen arguments, and this was largely in a non-academic context…
Ben: …so take this kind of with a grain of salt, but that, like, plots are becoming increasingly complex over time in media…
Ben: …in general, and, at least I think the argument was particularly for TV shows…
Ben: …that there used to be TV shows were just linear, beginning to end, and nowadays you have a lot like, not just time travel, but flashbacks…
Ben: …and non-linear storytelling, so I’d be interested in questions like that about like, ways of seeing the way how narrative is constructed…
Ben: …is changing, because the conventional wisdom oftentimes is like, “People are getting dumb,” and like…
Ted: [Laughs] Yeah, no, no
Ben: …which, I don’t agree with, but yeah.
Ted: It’s pretty clear, I think there’s actually some agreement that TV has gotten more adventurous, for a lot of, I mean one thing, there’s a classic thesis, and I don’t know who to attribute it to, but just the development of the VCR or DVR meant that you could return and pay attention to the texture of television, you could slow down and get the joke, whereas if it’s broadcast television in the 1970s…
Ted: …it’s gone, you know, you can’t return to it. So yeah, that I think we know, but I think there are gonna be all kinds of other questions. There’s some good work that’s come out recently in the Journal of Cultural Analytics studying the sitcom using face recognition, and they’re like, “Which characters get attention, when in the plot, when in the arc of the sitcom do they get attention,” and we’re gonna be able to study that kind of formal, the formal architecture of TV genres, as well as literary genres, in new ways, I think, yeah.
Ben: Yeah, that’s exciting, and that leads me to the last formal question I had prepared, which is, and you addressed it already, but perhaps big picture…
Ted: There’s more to say.
Ben: I’m sure, there’s always more to say.
Ben: And the question is, what do you see as the future of data science in the humanities?
Ted: Yeah, I see it as being really capacious, really, so, you know, the kind of thing I’ve done, like, where it’s explicitly quantitative and it’s about big, historical stories, that’s never going to be everything the humanities are about, because we’re also about individuals, and that’s valid. We’re about individual stories. But I think, what’s happening now, the line between data science and machine learning, or, to use a term that’s sometimes used, artificial intelligence. I kind of prefer machine learning, but it’s different sides of the same coin, that is a very blurry continuum. And that means that we are not just going to be studying culture in a kind of large-scale social sciency way, but we’re going to be able to, for instance, do the things that large language models do, where you can give them the beginning of a story, and then they continue it. Like okay, if that’s how the story begins in that style, then this would be a plausible next paragraph. Or they can do something similar to that with images, and that means we can begin to pose questions about the sort of, the frame-to-frame movement of a video, or the paragraph-to-paragraph movement of a story, where we start to ask questions about, for instance, what makes some stories more predictable than others? Like is it easier to predict where this plot is gonna go? Or we can pose questions about, like, where could this story have alternatively gone? Where could it plausibly have gone, suppose it’s written in the 1870s, you know, what are the plausible alternative endings for this story. What about if we move it forward a decade? You can, we’re getting models that are good enough to be able to pose that kind of question, like what could have happened in the story under alternate circumstances? And that’s gonna open up just a huge range of questions that are not limited to the kinds of questions we think of as appropriate for data science or social science. 
They’re more, to be honest, they’re akin to creative questions. One of the things about these generative models, generative models of images or of text, is they’re basically, they’re doing something like generating the artwork. Now I don’t think that that means that we’re gonna, like, you know, have robots write all our novels for us. [Ben laughs] They’re not that good, and the arc of a whole plot is something they still struggle with, and I’m not sure we want that anyway, actually. I think it’s more fun, people enjoy fan fiction because they enjoy going back and forth with a story world, right. They enjoy participation. But it does mean that the line between, what we think of as analytical or critical tasks, and what we think of as creative play, could get really blurry, and that to me suggests that the future of data science in the humanities is not something we’re gonna want to keep walled out because it’s too much like social science, it actually could be really central to what we think of as things like play, that are central to the purpose of the humanities. But for that to happen, it’s gonna need to become kind of like a lingua franca that we’re more comfortable with than we are right now. And I think that will happen actually because the opportunities are just too huge, but how it will happen, you know, where that happens in the curriculum, remains a bit of a mystery.
Ben: Yeah, and I think, and… correct me if I’m wrong, but there is a certain level of I think fear out in the world about like artificial intelligence or machine learning…
Ted: You are not wrong. [Both laugh]
Ben: …I mean certainly more in like the context of facial recognition…
Ben: …and the use of like surveillance or what have you…
Ted: Right and social media…
Ted: …big tech corporations. These kinds of fear are all interwoven, yes.
Ben: Yeah so, I imagine that bleeds into the academy…
Ted: Oh yes
Ben: …in hesitations people might have about um…
Ted: Oh yes
Ben: …what you do.
Ted: It doesn’t just bleed into the academy, a lot of that is sort of generated in the academy…
Ted: …and it’s, I mean if I’m gonna be completely candid about that, partly that’s a reflection of an emerging competition between universities and tech companies, which are both like, there’s this space which is like, intellectual institutions and society, which universities have had kind of a monopoly there, like oh there’s gonna be research in computer science, we’re going to be doing it. Now the tech companies are kind of claiming to be the leading edge of CS research, which does not make universities comfortable, and so that’s, part of the anxiety is fueled by just sort of general social things with social media and fears of surveillance. But within the academic context, we also have to be candid that Google is a competitor for universities, and it’s not surprising that university professors like myself are real… wary of it. So, you know, but I think it’s also valid, that, to be sure, you know the steam engine was socially disruptive and was a problem and magnified social problems and was not handled well, and is machine learning going to do all those things too? Magnify social problems, increase concentration of power, not be handled well, need kinds of regulation that we don’t yet have, yes. I mean all of that will be true, for sure. So it’s going to be, at the same time, I also think everything that I said earlier is true, it can magnify human creativity and become a way we understand human creativity and be a kind of collaborative space for human creativity. So, it’s not an either or, but it means that there’s a really interesting, complex struggle and conversation that plays out there.
Ben: Yeah it almost seems like, perhaps, the attitude is that if we don’t use it or don’t acknowledge it, it doesn’t exist, or we don’t have to worry about it then.
Ben: Or maybe that’s oversimplifying it, but we’re better off at least, like, I dunno, claiming it, or taking ownership, or…
Ted: Yeah, I mean, yeah
Ben: …at least figuring out how it works, so like…
Ted: Yeah, that for sure. I mean we, everyone would agree about that, that it’s a real bad idea for people not to understand how machine learning works, I think we can all agree about that. What to do next is where things get a little complicated, but I’m not really sure yet that we have a policy disagreement about like, how should machine learning be regulated? I think, generally speaking, a lot of, when the conversation actually gets that concrete, there’s often a lot of agreement. But it’s before we get to that level, when we’re talking about like, what attitude should we adopt to machine learning, before it really gets to the concrete level of “what should we do?” then there’s a lot of tension, because people have very different attitudes and very different, I mean honestly it’s an emotional thing, like how do we, I have, when I look at, sort of, a big new language model, I have a feeling of excitement and joy and like it’s spring and there’s gonna be new [Ben laughs] flowers coming up and I don’t know what they are. But I know that people do not have that feeling [Both laugh] when they look at a large language model, and I understand that, but it’s not, I don’t think that’s purely like an intellectual debate. It’s even sort of like before we’ve gotten to the stage of framing an intellectual debate about it, it’s that people just have, at this stage, very different kind of, I guess a technical term would be priors or instincts, um yeah.
Ben: Perhaps a gut reaction of some sort or?
Ted: Gut reaction, yeah. Yeah.
Ted: And, I do understand where the other gut reaction is coming from because big tech companies are scary, legit scary, and also ways governments can use machine learning and will use machine learning I think are highly scary. So that’s all true. It’s just, how we hold those things in tension, right, ’cause there’s always a dark and a bright side to everything, including like, you know, the human body, right, it has problems. So but it’s how you hold those things in tension that remains to be seen.
Ben: Right, um the title of our podcast is “It Takes a Campus,” so I feel like I should ask something about the, I guess the campus environment in which you operate, because, speaking as someone who came from another university that had DH in various forms…
Ben: …um, University of Illinois is so big that it almost inevitably becomes, in some sense siloed, for lack of a better way of putting it. And maybe that’s not accurate, I don’t know, I’m still fairly new here…
Ted: No, it’s accurate [Both laugh]
Ben: But, yeah, it feels like that’s a definite issue for digital humanities, is there’s a certain level of siloing that occurs, and how do we deal with that?
Ted: Yeah, so there’s siloing with, sort of a university from other universities, and then there’s different communities within the campus. The thing I like to say about the University of Illinois is that it’s like a Kafka novel in that, or a short story by Kafka, in that somewhere on campus there’s an amazing resource meant for you and you alone, but you may or may not ever discover the door where it’s hidden behind, so there’s so much going on here, that means that people are not necessarily always in communication. I think that that is a challenge for DH, you know, there are connections between, in particular I would say the iSchool and HRI, which has its own, Humanities Research Institute if I’m getting the acronym right, used to be IPRH, which is sort of on a different part of campus, but we’re both doing digital humanities in different ways in some communication with each other, and then there’s communication with the outside world, which I think is also important, and I think this program Training in Digital Methods for Humanists that’s centered at HRI has done a bit to encourage people to go out to, you know, summer institutes, say, where they can be in communication with people at other universities, and I do think that’s important, because otherwise you fall behind, honestly. And it’s particularly challenging for all universities, not just for Illinois right now, because there’s not a lot of hiring happening in the humanities, so it is easy to fall behind, actually, if we don’t consciously refresh our experience. I do think that’s a challenge, but I’m optimistic that we’ll build the needed bridges.
Ben: Yeah, well great, well that’s largely what I had prepared today, but I wanted to give you the opportunity to speak to anything you feel like we should have talked about. I mean, that’s a pretty open door, so…
Ted: No, those were great questions actually, and I got a chance to go off on how large language models are like the approach of spring with unknown flowers coming up…
Ben: [Laughs]
Ted: …that’s what I wanted to say, so, yeah.
Ben: Well I’m glad you got the opportunity to say that, yeah. It’s not an army from Mordor…
Ted: Yeah, or I mean, you know, it’s also an army from Mordor, but yeah, it’s both, both things. [Laughs]
Ben: Yeah, well, the world is complex.
Ben: Well, thank you so much for taking the time to talk to me today. I really enjoyed our conversation, and I look forward to talking to you more in class later. Full disclosure to the podcast: Dr. Underwood is currently my professor for the Data Science in the Humanities course.
Ted: It’s fair to disclose.
Ted: It’s been a pleasure talking to you too.
Ben: Yes, yep. And I will talk to you again later.
It Takes a Campus is a podcast brought to you by the University of Illinois at Urbana-Champaign Scholarly Commons, located in the Main Library. If you want more from us, be sure to check out our blog, Commons Knowledge, at publish.illinois.edu/commonsknowledge and follow us on Twitter @ScholCommons. That’s S C H O L Commons. The opening and closing song is Tranquility Base by A.A. Alto. You can find their album Bright Corners in the Free Music Archive by searching for A A Alto at freemusicarchive.org. Thanks for listening.