Brian Magerko on AI to enhance human creativity, robot improv, music to learn coding, and improvisational dance with AI (AC Ep56)


“AI is not a collaborator. It’s an Oracle, it’s a tool, it’s a thing. I have a query, give me the answer. It’s not a thing where you sit down with the computer like, okay, let’s think about this problem together.”

– Brian Magerko

About Brian Magerko

Dr. Magerko is a Professor of Digital Media, Director of Graduate Studies in Digital Media, and head of the Expressive Machinery Lab at Georgia Tech. His research explores how studying human and machine cognition can inform the creation of new human/computer creative experiences. Dr. Magerko has been research lead on over $15 million of federally funded research; has authored over 100 peer-reviewed articles related to computational media, cognition, and learning; has had his work shown at galleries and museums internationally; and co-founded a music-based learning environment for computer science – called EarSketch – that has been used by over 160K learners worldwide. Dr. Magerko and his work have been featured in The New Yorker, USA Today, CNN, Yahoo! Finance, NPR, and other global and regional outlets.

Google Scholar Page: Brian Magerko

LinkedIn: Brian Magerko

Georgia Tech Profile: Brian Magerko

YouTube: Brian Magerko

What you will learn

  • Exploring the roots of AI and cognitive science
  • Improvisational AI in robotics and dance
  • The journey of the EarSketch project
  • Challenges in AI-driven collaborative creativity
  • The importance of AI literacy and education
  • Ethical considerations in AI development
  • Envisioning the future of human-AI collaboration

Episode Resources

Transcript

Ross Dawson: Brian, it’s a delight to have you on the show.

Brian Magerko: Oh, thanks for having me, Ross.

Ross: So you’re a perfect guest, in many ways. You’ve been studying human and machine cognition, and how they shape creativity, for quite a long time now. So, I’d just love to hear a little bit about how you came to this, and why it is the center of your work.

Brian: I had the good fortune of being at Carnegie Mellon for my undergrad in the late 1990s. And there were a lot of folks there doing really exciting work related to AI and cognition — the place has been like that since the field’s inception — so I got exposed to folks like John Anderson, who’s huge in the cognitive modeling community, Herb Simon, who wound up advising me, and Ken Koedinger, who has been one of the leading intelligent tutoring system minds since the 80s. So being in the mix of all those great minds, and being able to take classes with them and do research, really was a great place to start.

Ross: Those are incredible people.

Brian: Oh, yeah, right! Yeah, I took Jay McClelland’s neural networks class — and he wrote the book that we used. Jaime Carbonell — I took his advanced AI class.

Ross: So what was Herb Simon like?

Brian: Herb Simon? I mean, as undergrads, we were just in awe of him, pretty much. I was friends with… there were five cognitive science majors in our year at the time — it was a huge class. We all put him on a really high pedestal, and taking his class was absolutely phenomenal, though I feel like I would have gotten much more out of it as a graduate student than as a scatterbrained undergraduate.

He was kind enough to be my research advisor for my undergrad thesis, which was one of the first places where I was really putting together all of these ideas of studying human creativity and formalizing it computationally. I kind of went in this direction of wanting to build models of creativity, which is a very difficult environment to do creativity work in at the level that I was doing. But he advised me as I tried to study the tacit knowledge of jazz improvisers, alongside studying cognitive science and computer science at CMU. I was doing a jazz improv minor, because why not, I guess? I just wanted to explore the wide variety of things that interested me and take the opportunities that I had. A lot of my career is about synthesizing those things together, so my work with Herb was about studying jazz and jazz improvisers, which was the thing that I got exposed to and learned about as a student there. Yada, yada, yada — that informed a lot of the first NSF proposal that I ever wrote and got awarded, on studying improvisational theater and building formal representations of it.

Ross: That’s incredible. And for those listening who don’t know, Herb Simon was a Nobel Laureate in economics and sort of the foundation of modern decision theory.

Brian: He’s also one of the progenitors of artificial intelligence.

Ross: Well, yes. He was right there at the start.

Brian: There was the Dartmouth conference in ’56, I think it was. He wasn’t there the entire time, but he’s on the list of folks with Marvin Minsky and others. He and Allen Newell — one of the computer science buildings at Carnegie Mellon is named after those two guys. It’s the Newell-Simon building.

Ross: I was looking through your list of papers and found this wonderful one — I will bring this to the present soon, but you had a paper in 2000 called Robot Improv. So, let’s go back to that era of jazz improvisation and AI. I’d be interested to have the context of today — we’ve come a long way, of course, in capability — so I’d love to hear the seeds of the thinking, and also how that has evolved.

Brian: I’m not sure what the question is. I can just talk, though. That work was again sort of a product of being at Carnegie Mellon and having some wonderful people to work with there. I had the fortune of working with a robotics professor named Illah Nourbakhsh — between him and Herb Simon, the two of them really sowed the main seeds for me as a researcher. Illah was very much about doing robotics research at Carnegie Mellon but refused to take military funding. He wound up asking very different and very interesting questions that the folks doing the hardcore systems weren’t asking. He taught an intro mobile robot programming class, and I thought that sounded fun, so let’s do that. And some of us liked it so much that we bugged him to do a special class, and he did. And we got to do this little improv robot comedy troupe, which is one of the… I mean, in the 90s and even earlier there was work in sort of generative AI story systems, but this is the first robotic one that I think existed. There hasn’t been much even since, but we did this in ‘99. This was like — the fact that the computers were laptops and that they were talking to each other wirelessly, that was ‘woo’. So we’ve come a long way, in some ways. But what those robots did was pretty much a trick: they did this improv acting, but behind the scenes they were just sending each other messages saying, hey, I’m doing something mean; hey, I’m doing something angry; I’m doing something happy; whatever. Just some emotional valence.

We had this little emotional calculus where, like, ‘Oh, the other robot does this kind of emotional move, this is how you update your emotional model. And here’s how you pick a new move based off of your model’. They would basically just say one-shot lines; there was no kind of coherent dialogue back and forth. One robot was trying to leave the room and the other robot was trying to get it to stay. And it was about this tension as to whether or not the robot would leave — and it was really interesting because they would actually improvise these things! Sometimes the robot would leave, sometimes it would stay, and they always said funny things because we got to author those. When a robot turns to another robot and says, ‘Wait, don’t leave, I’m pregnant’, people laugh — it’s a really good medium for comedy. We learned from that experience — at least I learned — that improvisation through improvisation is really hard to do. So we faked it, basically: one would say, ‘I’m mad, and here’s the thing I’m saying’, and the other one would say, ‘Oh, I’m scared, here’s the thing I’m saying’, but there was no actual socio-cognitive mechanism going on. There was just this sort of independent, very, very simple Chinese Box of getting a little bit of input, making a decision, and outputting a thing. And that’s about it.
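
To make that ‘trick’ concrete, here is a minimal Python sketch of the kind of emotional calculus described above: each agent keeps a simple emotional state, updates it from the valence tag the other robot broadcasts, and then picks a pre-authored one-shot line. The class names, lines, and numbers are hypothetical illustrations, not the original 1999 implementation.

```python
import random

# Hypothetical sketch of the emotional calculus described above: each robot
# exchanges only a valence tag ("angry", "scared", ...), updates its own
# emotional model, and emits a pre-authored one-liner. No shared meaning.

LINES = {
    "angry":  ["I can't believe you'd leave now!"],
    "scared": ["Wait, don't leave, I'm pregnant!"],
    "happy":  ["Fine, go! More pie for me."],
}

# How hearing the other robot's emotion nudges my own (illustrative numbers).
INFLUENCE = {
    "angry":  {"angry": +0.3, "scared": +0.2, "happy": -0.2},
    "scared": {"angry": +0.1, "scared": +0.3, "happy": -0.1},
    "happy":  {"angry": -0.2, "scared": -0.1, "happy": +0.3},
}

class ImprovRobot:
    def __init__(self, name):
        self.name = name
        # Emotional model: one scalar per emotion, starting neutral.
        self.state = {"angry": 0.0, "scared": 0.0, "happy": 0.0}

    def hear(self, other_emotion):
        # Update my emotional model from the other robot's broadcast tag.
        for emotion, delta in INFLUENCE[other_emotion].items():
            self.state[emotion] += delta

    def act(self):
        # Pick my strongest emotion and say a pre-authored line for it.
        emotion = max(self.state, key=self.state.get)
        print(f"{self.name} ({emotion}): {random.choice(LINES[emotion])}")
        return emotion  # broadcast only the valence tag, not the meaning

a, b = ImprovRobot("Robot A"), ImprovRobot("Robot B")
tag = "angry"
for _ in range(3):  # a few turns of "improvised" back-and-forth
    a.hear(tag)
    tag = a.act()
    b.hear(tag)
    tag = b.act()
```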

Ross: I think part of the point there, though, is that the AI was very primitive then, but you had the concept.

Brian: What I’m talking about is relevant to LLMs today — there’s a lack of cognition, awareness, and reasoning about establishing meaning together. Some people would argue about this, but from an epistemological viewpoint, large language models do not have knowledge about collaborating; they can’t describe the process that we’re in, jump around in it, and reason about it. It’s called Generative AI for a reason.

Ross: This is where AI is a complement to humans.

Brian: It’s an Oracle, it’s a tool, it’s a thing: I have a query, give me the answer. It’s not a collaborator. It’s not a thing where you sit down at the computer like, ‘Okay, let’s think about this problem together and hash out our solution together’. It’s more of a ‘give me ideas’ or ‘what do you think about this idea’ kind of thing, right? This idea of establishing shared meaning, shared mental models, and making sense together — this is a lot of the work that I’ve been focused on over the past couple of decades: studying human collaborative creativity, and how we can model the things we do together that allow us to so effectively make meaning in the world together. Computers don’t do this. It’s just not their capability, right?

Ross: One of the things you’ve come to, one of your recent projects is around improvisational dance with AI.

Brian: It’s called LuminAI. It used to be called the Viewpoints AI Project, before we stopped using Viewpoints. It’s been a decade-long project. I started this question with a small group of students. I was like, hey, why is this improv theater stuff really hard? Because you have to talk so much. It’s so dialectic, and I don’t want to solve the natural language problem. Maybe somebody else will — somebody did, which was great — but still, what can we do if we do all the reasoning, all this stuff that we’re talking about with improv, but with no talking? Not even body language — no semantics, no semaphore. Just abstract, raw emotional output. Scribbling, doodling, contemporary modern movement — things that aren’t restricted to a very specific vocabulary, but are more about just, ‘I’ve got stuff in my head I have to get out physically’. We’ve been working on this for the past few years; we were lucky enough to get National Science Foundation funding through a program called M3X, which is the Mind, Machine and Motor Nexus. It’s a very futuristic-sounding program.

They funded us to study dancers: how is it that folks reason about the ebb and flow of idea introduction and idea exploration when improvising with someone else’s body across time? We’ve been working with contemporary dancers and a dance professor there, her name’s Andrea Knowlton, and she’s been amazing. We studied the dancers to inform the technology and the design of the interface, and we also took this technology and incorporated it into their classroom. For two months, we did a longitudinal study of their thoughts and their adoption of the technology over time — and students hated it at first. They were like, ‘AI, boo’, which made us wonder why they signed up for the AI and dance class. It was like, why did you? Why are you guys in the room? But after they used it in rehearsals, after a week or a week and a half, the language and attitude towards the technology really changed for the better. In early May, we had the world’s first improvisational human-AI dance performance. You can find a video of it on our website; it’s on the Expressive Machinery Lab’s YouTube page.

Other folks have done AI dance performances before — I’ve done it before — but we’ve never had one where there was an actual model of collaboration occurring. This is what we put on stage: an agent who is reasoning about improvising, who ‘knows’ it’s improvising, quote unquote, as opposed to a thing that is merely responsive to us, which is more of an intelligent tool. This is trying to see how AI can augment us as a collaborator. There are really interesting things that you can do with an AI collaborator that’s projected in midair — it was on a big scrim — you can make it really big, you can put it on a wall, you can make a dozen of them in a row. The affordances of this dancer, both for rehearsing and for performance, were just kind of different, which is why I liked this work so much: it wasn’t about replacing dancers. This was about, ‘Okay, we have dancers — how can we have them express themselves and be creative in new and interesting ways with technology?’

Ross: Fantastic. Just to hop to a different topic, which is EarSketch — a different intersection of humans, cognition, and senses. It has had a big impact, and I’d love to hear that story.

Brian: Oh, sure. EarSketch actually has an AI as the end chapter — hopefully, I’ll remember to talk about that. EarSketch is a project that has been a large team collaboration for 12-plus years at this point; it was co-founded by a School of Music professor here named Jason Freeman. We’ve been working with each other every week for over a decade. We’ve been working all this time on designing, building, and disseminating a learning environment mainly targeted at high schoolers, aimed at changing attitudes about considering computer science.

There’s a pretty big difference in representation in computing, proportionally, when you talk about gender or ethnicity. EarSketch is an attempt to design around the socio-cultural barriers that have Black, Latino, female, and other students not considering computer science in high school — because it’s nerdy, or it’s for boys, or they’ll get beaten up. There are a lot of documented issues such that unless you’re a white or Asian male, you probably have these things in front of you keeping you from even considering taking that computer science class, or checking out that workshop at the library, or whatever. So EarSketch is an attempt to circumvent those socio-cultural barriers and provide computing in a different context — one that provides meaning-making and personal expression in computing, in a domain that is especially ubiquitous across youth culture.

Making hip-hop and electronic music doesn’t touch everybody, but it touches a lot of the kids that we hope to be able to at least be more literate in computing, if not consider a more technical career down the road, right? This isn’t so much about helping feed the Silicon Valley machine as it is about empowering people in literacy that is very accessible and acceptable to a very specific part of our population and not so much to others.

Ross: Well, it’s very much in the frame of augmented cognition, through using different senses. So, how does it come together?

Brian: As you said, the environment is called EarSketch; it’s an online platform — if you just Google EarSketch, you’ll find pictures of ears, and you’ll also find our website. It’s been used by over a million and a half students, and we have about 20,000 active learners a month. It’s part of the AP curriculum for computer science programming in our country. It takes the idea of making music and the idea of programming and puts them together. Kids use Python or JavaScript, which are industry-standard languages — and still are, luckily enough. They manipulate musical samples, beats, and effects with this code. Within a single hour, kids who have never programmed before can sit down and, by the end of the hour — in our Hour of Code curricula and other curricula — have a thing they want to show off.
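
To give a flavor of what that first hour looks like, here is a rough sketch of a beginner-style EarSketch script in Python. The function names (setTempo, fitMedia, makeBeat, setEffect) follow the EarSketch API as commonly documented, but the specific sound constants are placeholders, so treat this as illustrative rather than copy-paste-ready.

```python
# Rough sketch of a beginner EarSketch-style script (Python).
# Intended to run inside the EarSketch web environment; the sound constants
# below are placeholders — pick real ones from the EarSketch sound browser.
from earsketch import *

setTempo(120)  # beats per minute for the whole song

# Lay a drum loop on track 1 from measure 1 to measure 9.
fitMedia(HOUSE_BREAKBEAT_001, 1, 1, 9)

# Bring a bass line in on track 2 for the second half.
fitMedia(ELECTRO_ANALOGUE_BASS_001, 2, 5, 9)

# Program a custom kick pattern on track 3: '0' plays the sample,
# '-' is a rest, one character per sixteenth note.
pattern = "0---0---0-0-0---"
for measure in range(1, 9):
    makeBeat(OS_KICK01, 3, measure, pattern)

# Fade the bass in by ramping a volume effect from -20 dB up to 0 dB.
setEffect(2, VOLUME, GAIN, -20, 5, 0, 9)
```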

That idea of having an artifact that you want to show off to your friends and family — that you’re not trying to hide, and that you’re actually kind of proud of or invested in — is a very unique experience in education. That’s the kind of moment that we’re trying to provide for these kids, and it’s one they’re really lacking, because they’re not checking out the robotics camps, and they’re not checking out whatever it is that works especially well for the folks who are already represented in computing, like computer games.

Ross: There’s some stuff you’ve done in other domains, such as drawing, but this goes to the broader point of this co-creative cognitive agent. For me, I’m always talking about the idea of humans plus AI — AI as a complement, AI to make us more, amplifying our cognition. That really seems to be the heart of your work, this idea of the co-creative cognitive agent. I’d love to hear a little bit more — riff on that idea of where we are today and what we need to be doing.

Brian: It’s really nice to have help in creative domains that you’re trying to learn, and it’s really difficult to get help at an individualized level on a daily basis unless you’re especially wealthy. If you can’t get a personal tutor for programming, or a personal tutor for graphic design or what have you, it would be nice to have tools that can help you — so democratizing that knowledge is really a big part of our goal. The lack of prior research in this domain is simply because it’s a lot harder. It’s easier to do AI-based assisted learning in algebra, like what Ken Koedinger made his career in.

Algebra has really well-defined rules; you can look at a problem, exhaustively search the errors that students make, and represent those errors computationally. It’s a well-understood problem. That’s hard to do with sketching. By working on sketching, some really interesting questions come up that point out the deficiencies in current AI techniques for generating images. One of the big criticisms is that they’re just copies — there’s no understanding; they’re just duplicating and mushing together things that they’ve seen before.

Now, whether or not this is a good idea to release into the world is something I’m actually wrestling with, but you can imagine improving these agents by representing actual perceptual processes — in terms of Gestalt representations, for example. This is a thing that I’ve been especially interested in. If I draw a little C shape, the AI knowing that that’s a container, and that it can put things in that container, is a very basic visual Gestalt representation. I feel like once you start being able to put those kinds of things together, you can get pretty complicated behaviors emerging pretty quickly, in terms of the intricacy and the variance of what you can do with an AI that is able to actually perceptually reason about the images, versus just ‘I’ve seen pixels, and pixels go with other pixels, and I’m gonna put some pixels here.’
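
As a toy illustration of the kind of Gestalt-level representation being described — not the lab’s actual system — here is a hedged Python sketch in which a drawn ‘C’ stroke is perceived as a container, so the agent can reason about placing other marks inside it rather than only predicting pixels. All class names and thresholds are made up for the example.

```python
from dataclasses import dataclass, field
from typing import List

# Toy Gestalt-style representation: the agent reasons over symbolic percepts
# (containers, small marks) instead of raw pixels. Illustrative only.

@dataclass
class Stroke:
    kind: str        # e.g. "c_shape", "dot", "zigzag"
    x: float
    y: float
    size: float

@dataclass
class Container:
    boundary: Stroke                      # the open C shape affording containment
    contents: List[Stroke] = field(default_factory=list)

    def can_hold(self, mark: Stroke) -> bool:
        # Crude containment test: the mark is smaller than the C and drawn near it.
        return (mark.size < self.boundary.size
                and abs(mark.x - self.boundary.x) < self.boundary.size)

def perceive(strokes: List[Stroke]) -> List[Container]:
    # Gestalt step: every open C shape is perceived as a container affordance.
    return [Container(boundary=s) for s in strokes if s.kind == "c_shape"]

def respond(containers: List[Container], strokes: List[Stroke]) -> None:
    # Co-creative move: tuck loose small marks inside a perceived container.
    for mark in strokes:
        if mark.kind == "c_shape":
            continue
        for container in containers:
            if container.can_hold(mark):
                container.contents.append(mark)
                print(f"Agent places the {mark.kind} inside the container at "
                      f"({container.boundary.x}, {container.boundary.y})")
                break

drawing = [Stroke("c_shape", 0.0, 0.0, 10.0), Stroke("dot", 2.0, 1.0, 1.0)]
respond(perceive(drawing), drawing)
```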

Ross: Well, it seems that what you’re describing is ‘bouncing off each other’ — you do something, the AI does something that is not necessarily determined, you’re not just instructing it to do something, it’s bouncing off you. It goes back to the roots of that robot improvisation, as it were. I think your point about LLMs has been basically that you ask them, you tell them to do stuff — so what do we do now to unfold this more emergent collaboration with LLMs and their ilk?

Brian: Funding labs better is part of it. I hesitate to comment on this — large language model development is 99% led by industry. Who knows what’s going to come out in a month? At some point, it felt like we were just in this holding pattern of waiting and seeing until the crazy stops. Maybe it stopped with multimodal models. I haven’t seen anything about social cognition at all in any of the work that’s out there, but after being surprised every month for the past few years, I’ve definitely lost a bit of certainty as to whether or not we’re asking questions that other people aren’t — because they might be doing it in secret, using lots of resources, and they’re suddenly going to release the thing someday. That’s when we’ll find out, when they’re out publishing papers.

Ross: There are two sides. One is how we get the AI to interact with us better and more usefully, in ways that draw us out; the other is our own skills and attitudes and the ways in which we interact with these systems — taking your mindset and propagating that as the way we should be thinking about how we work with these systems.

Brian: We have a really clear ethical problem with one of our projects now; it’s a microcosm of the larger space. As I was saying, at the end of the EarSketch journey we’ve worked on a co-creative AI — it’s called CAI. That was a collaboration with Kristy Boyer and folks at the University of Florida: a conversational AI that helped you write your EarSketch code and helped you with both the technical and aesthetic sides of the project, which, never having been done before, is a big new thing. LLMs came out right towards the end of our development and completely changed kids’ expectations. We had lots of really good positive findings on the iterative designs, and then suddenly kids’ perceptions completely changed. The ceiling — or the floor — was raised way up. We were suddenly out of touch with their expectations. This is still an interesting problem.

We’re working on it with LLM technology as a part of it. But even if we’re grossly successful — if we build this agent that really helps kids learn and write better EarSketch code — how do we release it? How do we make this a tool? As I said before, this is a thing that gets judged in AP tests, and those AP tests do not take intelligent assistants into account, right? There’s this double-sided question of what can we do, and what does the world actually want — what is actually useful for the world? We’re still trying to figure that out to some extent, because this feels like a really useful thing. Yes, kids can learn from this, but how does it fit into our current structures — how we teach right now and how we evaluate? I’m not sure about that. Part of this work really is figuring out the right way to integrate this into our current ecosystems, rather than some ideal one that we’re designing for. This might be a thing that you can turn on for your class, or a thing that you can turn on for an assignment and then turn off. There has to be some teacher control in this that we’re going to have to figure out, or some gatekeeping where it’s not just anybody who can use it anytime. It’s all for the sake of people being able to evaluate people at the end of the day, and we’re not a part of that at all.

Do we make co-creative agents for EarSketch projects, or co-creative agents for drawing? There’s still this question of what society is okay with. Right now, it seems like society is not okay with AI-generated art — visual art especially. From what I’m seeing, there’s a lot of vitriol: if a comic book artist gets accused of using AI art, that person is suddenly canceled.

Ross: Even though it is usually used in co-creative processes — it’s just part of the co-creative process.

Brian: Right. Here’s the weird and bizarre thing: there’s a slippery slope argument here, some big gray area, where AI has been used in film for decades, but now we’re worried about it being used in certain ways. How do we talk about the nuance of the difference between ‘we used AI to fix Arnold Schwarzenegger’s eyebrows or whatever in post-production’ versus ‘we used AI to take body scans of doubles and have them be the actors in all of our movies forever’? There’s a really big difference, but it’s all ‘using AI’. There’s no really good language or common literacy to talk about these differences. That’s the reason a big part of my work has been in AI literacy — EarSketch is about CS education, and as we’ve talked about, I do a lot of work in AI; those things naturally come together — in particular, AI literacy within CS education.

I don’t know if we’ve talked about this previously, but the main framework in the world for defining and discussing AI literacy is the result of a dissertation in my lab by my student and current collaborator, Duri Long. We published a paper in 2020 called ‘What is AI Literacy?’. It gets hundreds of citations a year. I just point this out to bring up the idea that a lot of how we interact with these agents depends on how we design them, but also on literacy — on how we know how to interact with them. It’s on us to design AI systems that are transparent and explainable, but it’s also on us to consume visual content skeptically now. It’s on us to have some basic understanding of the capabilities and limitations of large language models so that we handle the false information they give us appropriately. Some of the work that we do is in designing museum exhibits to try to get at specific learning objectives that center on this topic. If you’re in Chicago, we’re doing pop-up exhibits every now and then, and hopefully they’ll be there for a longer stretch in a year or so. At the Museum of Science and Industry, we’re putting exhibits on the floor that are about engaging with AI in creative ways — which is sort of my thing — and learning about AI through that creative, often embodied and tangible interaction. The dancing AI is part of that exhibit.

Ross: To round out, where should we be going in this human, computer, creativity, and expression? What are the next frontiers? What are the things which will enable us to do more with computers in being creative and expressing ourselves?

Brian: I feel like something I mentioned a minute ago — explainability, and we’re on the cusp of that in a lot of ways — is far and away one of the most important things that’s missing from current systems. So when you see — gosh, I don’t know which systems do what now — but sometimes language models will give you citations, like, ‘here’s what I said, and here’s where I got this information from’. That is fantastic compared to ‘here’s the truth’, and just nothing. In terms of implementation and technology, the socio-cognitive stuff that I was talking about earlier would be another thing. As a country and as a planet, policy is where we need to do more catching up than maybe anything else. The sudden integration of these technologies into our society is a weird experiment that we’ve decided to do. And it really feels like we’re not necessarily making the decisions that are best for us, but more the ones that are maybe best for the market, or for investors in specific companies — and I really feel like they’re not looking out for me.

So I feel like there’s a Star Trek-ian future for us where we take these technologies and use them for our betterment — to advance our lives, to help create new art, discover new scientific concepts, express ourselves, and find meaning. But there are other folks who are just using generative AI to make political spambots that argue with people on Reddit. So much about these technologies depends on the beholder rather than the technologies themselves — and I guess that’s true for pretty much any technology. But that question of what we build, and how people who aren’t like us are going to use it, is the thing that we all should be asking as researchers, and I’ve been asking myself quite a bit lately about the socio-cognitive work: if I were super successful, do I understand the actual ramifications of this technology existing in the world, and of me releasing it open source? What would that do? Having some lack of clarity and understanding there is a weird place to be in after having worked in this field for 25 years now.

Ross: I think the takeaway is that we need inspiration, and I think your work and your attitude — and all of your collaborators — are a bit of a light and a lead for us in being able to consider AI as enhancing who we are, our ability to express ourselves, and our potential. So thank you so much for your work, for everything you’re doing, and for your time today, Brian.

Brian: Thank you. That was one of the nicest summations of my work I’ve ever heard. I might have to write that down. Thank you. Appreciate it. And yeah, thanks for having me today. I love talking about this stuff. It was great.

The post Brian Magerko on AI to enhance human creativity, robot improv, music to learn coding, and improvisational dance with AI (AC Ep56) appeared first on amplifyingcognition.
