Content provided by Gaël DUEZ. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Gaël DUEZ or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process described here: https://it.player.fm/legal.
#21 Greening Software 101 with Anne Currie & Arne Tarara
Are you ready to take a closer look at the environmental consequences of coding and join the movement towards green coding? What actions can we collectively take to minimize the harmful effects of software development on our environment?
That’s what we discussed in this episode on the harmful impact of code on the environment!
Join Gaël Duez to meet Arne Tarara, CEO of Green Coding Berlin, in Berlin, and Anne Currie, in London, writer of several science fiction novels as well as the much-anticipated O’Reilly book “Building Green Software”.
➡️ Arne and Anne shared their insights on green computing practices. We tackled various topics related to:
- green software, including grassroots efforts,
- integrating sustainability in training,
- tools for reducing environmental footprints.
✅ Don't miss this episode to explore "Green Software" and gain valuable insights on the tech industry's environmental landscape and web sustainability.
❤️ Subscribe, follow, like, ... stay connected the way you want to never miss an episode!
Learn more about our guests and connect
Anne Currie is a seasoned technologist with 25+ years of experience in the industry, known for her advocacy of green software and responsible technology. She co-founded the Green Software Foundation. She is a writer of several science fiction novels as well as the much-anticipated O’Reilly book “Building Green Software”!
Arne Tarara is the CEO and Software Engineer of Green Coding Berlin Software, a company focused on developing open-source tools for Green Software to reduce the climate impact caused by bloated software. Arne's mission revolves around researching the energy consumption of software and its infrastructure, creating open-source measurement tools, and building a community and ecosystem around green software.
📧 You can also send us an email at greenio@duez.com to share your feedback and suggest future guests or topics.
Anne and Arne’s sources and other references mentioned in this episode
- Tim Frick’s book "Designing for Sustainability"
- Greenframe.io
- Green Software Foundation
- Green Coding Berlin
- O'Reilly book "Building Green Software" (pre-release)
- University of Texas paper working group and blog
- Conference in Germany about computing and the environment: EnviroInfo conference
- Scaphandre
- Cloud Carbon Footprint (CCF)
- Max Schultz SDIA
- Adrian Cockcroft's article: "Don't follow the sun"
- Dr. Melvin Vopson 'The environmental weight of data'
- Environmental Variables podcast
Transcript
[00:00:10] Anne: Write code that is absolutely good for the language that you're writing in. Because compilers do a fantastic job at optimizing your code for you. Don't second guess your compiler.
[00:00:26] Gaël: Hello, everyone. Welcome to Green IO the podcast for doers making our digital world greener, one byte at a time. I'm your host, Gaël Duez, and I invite you to meet a wide range of guests working in the tech industry to help you better understand and make sense of its sustainability issues and find inspiration to positively impact the digital world. If you like the podcast, please rate it on Apple, Spotify or your favorite platform to spread the word to more responsible technologists like you. And now enjoy the show.
So, Green Software. Quoting my dear friend Ismael Velasco, « Our code is harming the planet », and I am privileged today to have two of the best experts Europe-wide, and I dare to say, worldwide, to deep dive into it. One is based in London, England and the other in Berlin, Germany. Let's start with Anne Currie. When I think about her, I have this song from Dire Straits, ‘Lady Writer’, in mind, because Anne is a writer indeed of several science fiction novels, as well as the much-anticipated O’Reilly book ‘Building Green Software’ with her two partners in crime, Sarah Hsu and Sara Bergman. The book is in early release, and I can't wait to discover the animal which will finally be chosen. Anne and the two Sara(h)s, with and without the H, are also pillars of the Green Software Foundation, and are carrying the flag for sustainability in tech at many conferences like ‘QCon London’ or ‘Apidays Paris’ (a leading API conference series), just to name a few.
When I had the pleasure to meet Arne Tarara, we went for a walk in a small park in Berlin. This is how he likes to exchange, outside, surrounded by nature. And during our talk, I was astonished by Arne's deep knowledge on green computing, and his commitment to build efficient tools for developers in the true open-source spirit through the startup he created called « Green Coding Berlin ». As you can guess, both of them are seasoned software engineers having decades of practice behind them.
Hello Arne. Hello Anne. Nice to have you on the show today, and I'm going to be super cautious with the pronunciation of your names today, so that everyone will understand when I ask Anne, and when I ask Arne, to intervene, so as not to confuse too much all our American friends. And I'd like to start with the usual question I ask all my guests, which is: “how did you become interested in sustainability, and in the sustainability of our digital sector in the first place? You know, did you experience a light bulb moment, or was it more like something that was there forever?”. Maybe, Arne, if you want to start?
[00:03:20] Arne: Sure. Thank you for the nice introduction. I think the important point is one you already mentioned, that I have already been a software developer for quite a while, 16 plus years or so. I had just finished with my former company, which was mostly in performance marketing, advertising, online shops etc. So, one of these classical building businesses, I would say, at least at the time, and I wanted to make something that had more meaning to me, and had a more sustainable touch. So, I tried to branch out in different fields. It wasn't clear if it would be digital or not. We have a strong meet-up culture here in Berlin, actually, same as in the US, I guess. And I was introduced to the meet-ups about green coding. People that claim that they can make the world a little bit better, not only the digital sector, in particular by using software. On the one hand, it was a bit surprising that you can be a professional in your domain for 16 plus years, and you obviously know that there's stuff like supercomputer optimizations, and hardware gets better over time, and more efficient, but you didn't even think about this issue. In Germany, we have this proverb, like sweeping in front of your own door, i.e., basically just checking whatever you do, and seeing if that could be improved, made more efficient in any way. And I was introduced to this meet-up group and I would say, yes, it kind of had this mind-blowing effect on me - it was like « wow ». Actually, there is so much you can do. And this niche opened up to me. I believe green coding is still, to a certain extent, [an area] where you can get to know very many people very quickly, people who seem to be the top players, with what they are currently doing at the moment, and that there is so much potential to be lifted, there's so much to be done, so much you can do with software. So many efficiency gains [to be reaped] that are still lying around in the software field.
And I immediately knew, OK, this is what I would like to do, and I looked around, and there were not many tools at the time. One tool that I discovered which kind of had my idea implemented in part was, for instance, GreenFrame.io, which is still around. And I think it's actually from France, right? I picked up the tool and I tried to use it, and it didn't work at all for me, and it was not open source. And then I thought, this is the natural way I'm going to contribute, but it was not possible. They will not act; they are not interested. I contacted the support team, but didn't get a call back, etc. And I thought, OK, I'm going to re-implement it myself. I made the Green Metrics Tool, which is one of the main tools we're doing, and said “OK, I also want to do it a bit more professionally, and be a business, and carry out research in this field”. And this is when I created Green Coding Berlin, and set up a small team of people that we would all like to work with. And now we're doing research and trying to make the sector a bit greener.
[00:06:10] Gaël: What about you Anne?
[00:06:13] Anne: I've always been very interested in efficiency, and a lot of that comes from the fact that I've been in the software industry since the early nineties, and back then everything had to be efficient. We didn't really have any choice in that. The machines were really terrible, you know, in many ways, 1000 times worse, 1000 times less bandwidth, 1000 times less power. We had to be incredibly efficient in the way we wrote things. So back in the nineties I worked for a while on the first version of Microsoft Exchange, and these days one of my co-authors, Sara Bergman, [still] works on Exchange, you know, nearly 30 years later. It's always been interesting to me that it’s still effectively the same product, but it requires 1000 times more resources to do its job now than it did then. And when you look at the world and the energy transition that's going to have to happen, you realize that 1000 times more resources, that's a lot. And we could really do with using that electricity, those resources, as efficiently today as we did in the past. Except there are issues with that, you can't just go ahead and do it. That's the tricky thing to be doing at the moment. How do we align them both? So, I wouldn't say I had a kind of epiphany, a kind of religious belief that we had to do better. Just the knowledge that we really could (do better). And this is something that has to happen and therefore we will do it. And a key way that we will do it is through increased code efficiency. That's not the only way, but it is a fundamental way.
[00:08:02] Gaël: I didn't do any software engineering back in university, that is not what I studied, but I learned everything on the ground with a tremendous mentor called Jean Yves. And he used to tell me already about the old days, you know? And he had this expression, that in the old days we had to break every byte into two. And it had already started to be, yes, quite convenient. The computing power was going up. Storage was less and less of an issue, etc. But I could not forget what he used to tell me: you need to pay attention to memory resources. You need to pay attention, and we completely lost sight of it. And it's quite fun to hear you referring to these times where resources were super scarce and super expensive, actually. And that actually leads me to the question I wanted to ask you, and so I might lose my bet, but I would dare to say that if I bet a dollar or a euro or a pound with you, and you choose your currency, that the words ‘green software’ were not that widespread, I guess, when you started coding. So how do you see the evolution (of it) today and, maybe more specifically, during recent years? Where are we? I mean, is it that widespread? Or is it more like we (Arne, you and I) live in some kind of informational bubble, and we actually think everybody cares? But perhaps that's not the case?
[00:09:37] Anne: Well, I'll answer your first question first, which is ‘No’. 30 years ago, green software did not exist as a concept, for two reasons, really. Software just wasn't that much of a hit on the economy at that point; it didn't use that much electricity, and the software industry wasn't that big. It didn't have the impacts that it does today. Plus, also, you know, culturally, we weren't thinking about these kinds of things back then. But ironically, the kinds of things that you have to do as part of being efficient and using energy well, for software we were just doing them, because we had to, because we had those terrible machines. So, in some ways, we were great, because we were doing a lot of the right things, though not all the right things. We had efficiency absolutely down pat. But that was out of necessity rather than because we actually meant to do it. But these days, I think you're right in saying that there's a risk that we're all in a bubble, where we think this is something people care about now, but it isn't. But it has become massively, massively more top of mind in the past few years. I remember talking about this at conferences 5-6 years ago, and people looked at you as if you were crazy, and we even got complaints. When we ran tracks on this subject a few years back, people were saying, well, that's politics, it's not technology, it shouldn't be included in conferences. But now nobody says that. Everybody knows that it has to happen. There has been, I think, the IPCC report that really woke everybody up. And the fact that the tech industry is one of the biggest industries. We have to do things and yes, some of it is going to be efficiency, just like we did in the nineties. And some of it is going to be time-shifting, which is, in the long term, even more important.
[00:11:35] Gaël: Could we say that the awareness has dramatically raised? But what about the practices? Maybe Arne, you want to comment on this one? Did you really see a significant change in the way people code, even if they are aware of the ecological crisis that we are into at the moment?
[00:11:55] Arne: I think part of every business that everybody does (or should do) is to do a bit of research, asking: ‘has this maybe come up before?’ Have people been talking about green coding before? Are we currently on a hype or are we currently in some kind of a valley [trough], so to say? And if you look back in the academic world, there were already, in 2007 – 2010, very many papers around green coding. There was the university in Texas who had this Archer supercomputer where you could actually measure all your code, even before RAPL [Running Average Power Limit] was out there. Or perhaps it was around the same time. I think we will come to technologies later, but let’s put it out there for the moment that one measuring technique is [built into] the processor itself. You could already do it on systems that were out there. But basically, nobody was interested anymore. And I would say a drought of papers in the academic world happened, and now it's coming up a bit again, at least in Germany. We have a conference about it, ‘EnviroInfo’, where it's mostly about computing and ecology in general. So, I would say that green coding, at least in my historical view when I looked at it, has already had its ups and downs. And now coming back to your particular question, especially, I would say in the last year, I wouldn't say that there is necessarily a stronger move on people adopting these techniques. So, a measurement you could take, for instance, is [the popularity of] some of the most prominent software, I guess, like Cloud Carbon Footprint and Scaphandre [a metrology agent dedicated to electrical power consumption metrics].
I would put out here, for instance, how many GitHub [stars] do they have over time, is there a surge or something? I don't have all the data as I'm not the repository owner, but this is my view on how green coding has evolved over time. But, if it's OK, I will elaborate on this one a bit because you also asked me about what we are doing and how we see the sector in particular. We at Green Coding Berlin do not necessarily do what people often think green coding entails for them in particular. When we talk to companies or young developers, they ask us for optimization. They say, “OK, how can I make my code greener in particular right now? Which tool do I have to use to emit less carbon?”. This is actually something we don't focus on a lot in particular because green coding, in my view, if you look at the digital sector as a whole, is not a problem that is coming from the industry itself. The industry doesn't necessarily have an issue with the digital product that it's using. It's rather something that's coming from states as actors, from developers, and from consumers. The industry itself, in my view, has an incentive to tackle things that they think are not efficient enough. For instance, the machine learning models, because they cost them a lot of money. I think this will be resolved on its own. A bit of additional pressure might be nice, but it's not necessarily needed, I guess, for this to transition. Or things that are not cost effective, so, if you look at something like YouTube, Twitch and Bitcoin, they are in themselves, for the most part, already cost effective. But people complain about them a lot and think ‘can we not make this greener in a way?’, because they often don't use these technologies themselves. You will rarely hear complaints from people that earn their money with Bitcoin that Bitcoin should use less energy.
And generally speaking, people that don't use Twitch are more likely to complain that one such streamer can emit x amounts of carbon. But for the companies that run them, like Google, that runs YouTube, or Twitch, which I think is owned by Amazon, it's a cost-effective thing for them to run these platforms, even though they will produce an enormous amount of data, which is harmful on its own. But it works for them. The incentive is there, but it’s not that intense. And I think green coding techniques on YouTube will take a while until they're implemented if they are not directly cost effective, for instance. On the flip side, the developers are becoming more concerned. This is what we see, for instance, as a company. But this overlap of business and interest, I think this is still in the making, and I'm not really sure if this is the biggest driver. So, I think that green coding, and the effect (coming back to your question in particular), will mostly happen through regulations, and this is also what we work on at Green Coding Berlin. And this means that you have to have the transparency first. And this is what our tools are mostly doing. They are giving developers and users transparency. They make stuff comparable, and then someone can step in, like regulators or society, to enforce optimization techniques, or so that limits can be imposed.
[00:16:42] Gaël: And how do you enable more transparency to happen when we have so many issues? And I'm not going to brag or quote too much during this discussion, from Max Schulz from the SDIA. But he's got a point when he says, again and again, that especially the main hyper-scalers are not providing enough comparable and transparent data to truly leverage everything that we could do in computing. Do you also believe that it's an issue? Or actually, what you were saying is that with the tools that you've developed or the approaches that you encourage people to follow, there is a way to become more efficient? Even if some data are missing. And I know that all the hyper-scalers are, I would say, not doing things at the same speed. But I will not enter into this debate here.
[00:17:38] Anne: The hyperscalers are interesting because you can put pressure on them. Obviously, governments and so on could put pressure on them, but it's amazing how much pressure users, customers, can put on them. If you say, « Look, I want this, I'm demanding better carbon footprint measure(s), I'm demanding this information », they are quite customer responsive. Well, I say they are; not all of them are. AWS, Amazon, is quite customer responsive. Actually, Azure is quite customer responsive as well. Google not so much. But if you raise this, if enough customers raise it, and it doesn't require that many, and you keep raising it, they (hyper-scalers) will see that there's customer demand for it, and then they will do it. When Amazon talks about being customer obsessed, they actually are. If a handful of people, not that many, just keep raising this with AWS reps, we have a good chance of getting it. And we got those sustainability commitments. Whether they will be sufficient in the end remains to be seen. But we have made progress by getting folk to raise issues with their providers.
[00:18:45] Arne: I think this is one of the big levers to pull, that you have to put the pressure on the cloud providers, either from the user side or from the regulatory side. And, for instance, this is what our tools are trying to do. A lot of people run an extensive amount of CI/CD pipelines, and what our tools do is that they simply create an easy machine learning model that's based on an open database of server energy consumption, called SPECpower, and then you plug that into a bit of code, so that it can be digested by GitHub Actions, which is GitHub's pipeline product, or by GitLab CI, which is GitLab's pipeline product. And then you just see at the end how much your pipeline is consuming, and you see it for your hundreds or thousands of pipelines. And then you have a number, at least at the end of the month, and you can see if this number is going up or going down, and then you can go the route that Anne was suggesting, saying: “Hey, this number is maybe not the best, because this company, Green Coding Berlin, is doing it from the outside in. So why don't you give us these numbers?”. So, they go to GitLab and they go to GitHub, or Microsoft in particular, and they say, « We want better numbers. It's not so hard for you to give them to us. And now we see that it's actually possible, somebody can do it from the outside in. So why don't you give us these numbers so we can be better ». But we believe that people need to see this to a certain extent before they can even ask the right questions.
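To make the outside-in estimate Arne describes more concrete, here is a minimal Python sketch. It is not the Green Coding Berlin implementation: it swaps the machine learning model for a plain linear interpolation between idle and full-load power, and the `PowerCurve` class, the wattages and the utilization samples are all hypothetical placeholders rather than real SPECpower data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerCurve:
    """Hypothetical server power figures in watts (not real SPECpower data)."""
    idle_watts: float
    max_watts: float

    def watts_at(self, utilization: float) -> float:
        # Linear interpolation between idle and full load. Real machines
        # are not perfectly linear, so this is only a rough estimate.
        u = min(max(utilization, 0.0), 1.0)
        return self.idle_watts + u * (self.max_watts - self.idle_watts)


def pipeline_energy_wh(curve: PowerCurve, samples: list[float], interval_s: float) -> float:
    """Estimate energy in watt-hours from CPU utilization samples
    (values in 0..1) taken every interval_s seconds during one pipeline run."""
    joules = sum(curve.watts_at(u) * interval_s for u in samples)
    return joules / 3600.0


if __name__ == "__main__":
    curve = PowerCurve(idle_watts=50.0, max_watts=250.0)  # assumed values
    run = [0.10, 0.80, 0.90, 0.85, 0.40, 0.05]  # one CI run, sampled every 10 s
    print(f"{pipeline_energy_wh(curve, run, 10.0):.3f} Wh")
```

Summing such per-run estimates over a month gives the kind of trend number mentioned above. The absolute value is only as good as the assumed power curve, which is the very argument for asking providers for real figures.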
[00:20:06] Gaël: It’s a bit like starting with the metrics, we have to create a momentum and then in parallel, put pressure to get better metrics and better data from providers. And if you don't mind, both of you, because we could discuss a lot about cloud providers and the general approach, but actually, I'd like to deep dive a bit more with you. Could you share the top 2 or 3 techniques or approaches that you implement, I would say on, almost on a daily basis, to reduce carbon or carbon emissions caused by software?
[00:20:41] Anne: I'm a bit controversial on this one, so I'll start off and say, this is something that came up when we started writing « Building Green Software ». One of the questions that came up immediately from people [was]… “Oh, in the book can you cover some examples of efficient code?”. I used to write efficient code. Almost everybody I know writes efficient code and we all (this is terrible), we all laughed when someone said this, because almost the definition of efficient code is that it's incredibly custom. It is utterly and specifically custom to the very, very particular use case that you're interested in, and really efficient code takes ages to write. It is incredibly bad for developer productivity, so generally it's quite hard to give people advice about how to write efficient code. I mean, you can say, ‘Well, I'll use efficient languages like C or C++ or Rust, rather than less efficient ones like Python’. But even that's not so clear-cut these days, because there are new Python compilers that are compiling Python to machine code, or compiling Python to C. So, you can still write in the inefficient language and have it transformed into a more efficient one, because they know that developer productivity is really killed by writing this very, very highly custom code. So, it's hard to give generic advice. If you speak to folk who are really still writing efficient code, for example, in the networking area, they're still having to write that high[ly] proficient code, the same kind of code that we used to write 30 years ago, because they really, really need that super performance. And their feedback is generally: write code that is absolutely good for the language that you're writing in. Because compilers do a fantastic job at optimizing your code for you. Don't second guess your compiler. Follow best practice so that your compiler can optimize as far as humanly possible. It's a bit sad because everybody wants to hear some amazing C technique, whatever.
But fundamentally, it's just really, really hard and very custom. The best thing you can do is measure different tools. Get somebody else to do it for you. Don't custom write your own high[ly] proficient code. Find libraries and tools that are good and use them, which is what you need to use the measurement for. You need to measure to find out which are the good tools and libraries, and you swap out poor ones for those more optimized ones, but don't attempt to do it yourself unless you are actually writing those libraries, I would say. It's a bit sad, but I would say there's no killer technique that you can use because it's all hyper custom. You know, it's all basically messing around with your L1, L2, L3 caches for a very, very specific use case. I don't know, you might disagree with me, Arne?
[00:23:43] Arne: No, I actually have the same [outlook], I have the notion that we are very much on a par here in our view of how good these generic optimization tips are. However, what we often get is requests from users who see these articles, that Amazon has implemented a new gzip or zlib compression technique in their S3 service and it saved them, I don't know, I think it was in the tens or hundreds of millions, because they had to use less hardware to store their stuff. Or you see this article that states there is a 70% improvement in React by just stitching the virtual DOM, so apparently it is possible, on a particular product, to get these gains. However, I would very much agree that on a generic level, it's extremely hard to implement. So, there are techniques that have been known for many years, like using vector instructions, loop unrolling, etc., that do work if you really put the work in. But it's very questionable whether, in the end, if you look at the whole thing (also the time the developer had to put in, how much the software will run in the end, how much it cost you building these 50 to 100 iterations until you got it working), this really saved you something in the final calculation. So, I think this is a bigger question, and I think Gaël you might make a separate podcast on this, this whole idea of software life cycle assessment. This is also something that Max [Schultz] is very passionate about. But I would like to give you our approach on how we typically do it. I think we have the same idea that Anne mentioned, that measuring is one of the first steps, when we typically consult with companies or when we do workshops with developers. We have these five pillars, so to say, so, first of all, it’s about understanding. People often don't understand the terms that are even used.
If you talk about energy and energy efficiency, they don't even know how a network could even cost them in energy terms, that network costs can be linear, or they can be progressive in a way, and then [there is the question of] transparency. The measuring and transparency. Whatever you then have understood and measured, you should also show it to people and make it public. So as in GitHub, as a badge or something. Then continuity is a pillar we focus on a lot, so it doesn't help you if you look at it one time [only], you have to monitor it over time. So, like the GitOps approach, that with every release, with every build, you basically have to check if your initial measurement or your initial assumptions are still right, or check if the product is currently derailing, and you don't want that. Then the fourth pillar is comparing. If you are thinking about software, and the goal at the end, the optimization, is to actually save something, then comparing is often very helpful. So sometimes just looking at how much would database 1 - just technically identical to the database that I'm currently using - how much would this change? So just swapping libraries out, as you said, or swapping infrastructure out is often a better way to go than going for code level optimizations in particular. But they obviously have a point, so our fifth pillar is then code level optimizations wherever they make sense. However, this is then specific, so you have to really look into your product. It often means using specialized tools. So maybe VTune® or something, or code profiling techniques. This is very laborious, and these tools are also sometimes cumbersome.
[00:27:18] Gaël: I would say it's a lot about measuring and comparing, rather than having one silver bullet, it makes sense. If it was that easy, everyone would do it. And I guess the question of software productivity, the productivity for your developers is absolutely key here. We need to take into consideration the full life cycle, and like you should take three or four times more days, to just release one little piece of code, and so actually, you could even use the energy better.
[00:27:53] Anne: Unfortunately, it's more like 10 or 20 or 30 times as long! I remember how long things used to take. They used to take an incredible amount of time. I mean it is interesting, that in the 30 years of my career, there has been more than a 1000-fold improvement in machine productivity, and we've used it to make developers more productive. And it's very hard to make the sell to your business that you should slow down, because if you do, you'll go out of business. So you have to trade off what you can sell to your business, as well as what is a sensible thing for your business to do, as well as what is the green thing. You have to align them. I'm not saying throw out the green things, I'm saying you have to find ways to align them. And the good news is that [there are] all the modern ways of working: with microservices, with open-source libraries, with hyper-scaler services. Arne said this himself, that there's an alignment: if you're a big business, [there is an incentive] to make your stuff efficient because so many people are using it. It is worth putting in that 100x developer effort to make it efficient because you've got so many people using it, (so) that pays off. But if you're only a small business, and you only have a moderate number of people using your software, you'll probably never pay back that developer effort to make it super-efficient, so you're better off just using a library. Don't do it yourself. Use a library. Use an open-source library, use a hyper-scale service. But I mean, we talked about code efficiency here, but I'm not even sure in the long run that that's going to be the big win that we're going to make in the tech industry. I think it's going to be the time-shifting, because even now we're seeing that with renewables, you get huge amounts of energy at some times, and no energy at others. And that requires a whole different way of using electricity.
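The continuity pillar Arne describes (re-checking the measurement on every build) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `energy_regressed`, the joule figures and the 10% tolerance are hypothetical, and how the per-build energy numbers are obtained is left open.

```python
from statistics import median


def energy_regressed(history_j: list[float], current_j: float, tolerance: float = 0.10) -> bool:
    """Return True if the current build's measured energy (in joules) exceeds
    the median of earlier builds by more than the given relative tolerance."""
    if not history_j:
        return False  # nothing to compare against yet
    baseline = median(history_j)
    return current_j > baseline * (1.0 + tolerance)


if __name__ == "__main__":
    previous_builds = [100.0, 102.0, 98.0]  # hypothetical measurements
    if energy_regressed(previous_builds, 115.0):
        print("Energy use is drifting upwards; check what changed in this build.")
```

Using the median rather than the last build keeps a single noisy measurement from moving the baseline. The same comparison also serves the fourth pillar, when the two sets of numbers come from two interchangeable libraries or databases instead of two builds.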
In the old days, it was just, you know, flick of a switch, all fossil fuel driven.
[00:30:01] Gaël: Is it something that you implement quite a lot, like chasing the sun, which is time shifting and location shifting, or not that much?
[00:30:12] Arne: It is actually a technique we do implement in workshops with developers, because it's generally a very interesting technique to implement, as it suggests that there are immediate gains. I don't know if you've recently read the piece, I'm not sure who wrote it, whether it was David Mytton or Adrian Cockcroft, or maybe I'm mixing stuff up, but there was this piece called “Don't Chase the Sun”. It was a counter-argument piece, saying that [chasing the sun], at the moment at least, often doesn't make sense. I will elaborate on this a bit further, but I would like to say that I generally agree with Anne that this is an enormous [energy] saving technique, and this is actually what, at least in Germany, we are implementing with the grid. I think every country does it, but I can only really speak for Germany by saying we want to have smart meters, so that in the end, when we have surplus energy and we really need to not waste it by curtailing it, we charge electric cars at that particular time. And in Germany we have a long way to go in incentivizing people to charge at these hours, so that it's actually cheaper to wait. At the moment in Germany it's not cheaper to wait, even if we had smart meters, because there is a law that makes the pricing even throughout the day. But if you look at the current state of how time-shifting works: we are currently implementing a small plug-in for GitHub where you can say, “Hey, I want to run this pipeline at this particular time, because the forecast says that there will be green energy at that time”.
However, to my knowledge, grid operators typically plan ahead how the grid is supposed to run. So it is very likely that, if the forecast says there is a lot of green energy, the grid is already in a stable state and you demand more, the extra will not come from solar or from a wind farm, because that is already curtailed. The grid needs this bit of planning ahead, so they will more likely ramp up the coal-fired power plant a bit. But this is a temporary problem, to my understanding: if they learn these signals over time, even if you do that 5 or 10 times, the grid will learn, they will not curtail the green energy so much, and you will get it. But it's the same as those network savings. It's often not an immediate gain. It's more a theoretical long-term thing until we can understand the signals better.
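A minimal sketch of the scheduling decision behind the time-shifting idea Arne describes: given an hourly carbon-intensity forecast, pick the start time whose window has the lowest average intensity while still finishing before a deadline. This is not the GitHub plug-in mentioned above; `ForecastPoint`, `best_start` and the forecast format are hypothetical names, and the sketch assumes the forecast is a sorted list of contiguous one-hour slots.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class ForecastPoint:
    start: datetime        # start of a one-hour slot (assumed contiguous and sorted)
    gco2_per_kwh: float    # forecast grid carbon intensity for that slot


def best_start(forecast, job_hours, deadline):
    """Return the start time of the job_hours-long window with the lowest mean intensity."""
    if job_hours < 1:
        raise ValueError("job_hours must be at least 1")
    best = None  # (mean intensity, start time) of the best window seen so far
    for i in range(len(forecast) - job_hours + 1):
        window = forecast[i:i + job_hours]
        window_end = window[-1].start + timedelta(hours=1)
        if window_end > deadline:
            break  # later windows end even later, so stop searching
        mean = sum(p.gco2_per_kwh for p in window) / job_hours
        if best is None or mean < best[0]:
            best = (mean, window[0].start)
    if best is None:
        raise ValueError("no window fits before the deadline")
    return best[1]
```

As Arne notes, picking a low-intensity slot does not guarantee that the extra demand is actually served by renewables today; the sketch only covers the scheduling decision.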
[00:32:36] Gaël: It was Adrian's article, “Don’t chase the sun”. Anne, you wanted to say something, sorry…
[00:32:43] Anne: Yes, I totally agree with both of you on “don't chase the sun”. You don't really want to be moving your data around. What you want to be doing is delaying jobs, rather than moving data around to chase the sun. I agree with both Arne and Adrian on that one. It's interesting what Arne mentioned earlier: YouTube is an excellent example of one of the products that Google used to do their own kind of grid balancing on their hardware. If you upload a video to YouTube, sometimes you'll notice that it's transcoded very quickly, and sometimes it won't be transcoded for a while. And the reason for that is that they use that as one of their latency-insensitive workloads. If they've got a lot of stuff going on, if the systems are busy, they'll just shove that transcoding [down the line], to a little bit later in the day when things are less busy, so they get better utilization of their machines. And right now, they're working on a similar kind of shifting, to try and move work to when the sun is shining and when there is potential to power it greenly. But Arne is right that there isn't necessarily an immediate benefit to that, because right now the grids might not have enough green energy to provide, because they may already be curtailing it. But in the long run, if you create demand at times when there is potential solar or wind to match it, then more solar and wind will be put in [to the grid]. So it's not necessarily an instant win, but right now it's all about the transition. It's about moving to how we're going to work in that new world.
[00:34:36] Gaël: I don't remember if he also mentioned this aspect in his article, but chasing the sun is actually an issue once you start implementing a multi-criteria approach, because carbon is one thing, but water is another. And, you know, if you shift all the workloads to a country where you've got plenty of sun, water is usually pretty scarce. We are experiencing several droughts here in Europe, and the same goes for the US. So, the moment you say “OK, let's chase the sun for green electricity”, you might also create a lot of problems when it comes to water stress. That's also why I kind of like his expression “don't chase the sun”. Maybe “chase the wind” is a bit more accurate, but eventually I guess it's all about reducing the energy intensity, and not going for a silver bullet or a quick fix that actually does not exist in this energy transition. That's how I understood his main message, and I could not agree more with both of you.
If you're OK with it, because we talked a lot about measuring, metrics, etc., could you both maybe share the dos and don'ts when you measure, and maybe one or two examples of how you managed to measure for some of your clients?
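The deferral pattern Anne describes, pushing latency-insensitive work such as transcoding to quieter periods, can be sketched as a queue that only releases jobs when utilization is below a threshold. This is an illustration under stated assumptions, not how YouTube or Google implement it; `DeferredWork`, `run_if_quiet` and the 0.7 threshold are hypothetical.

```python
from collections import deque


class DeferredWork:
    """Queue for latency-insensitive jobs that only run while the system is quiet."""

    def __init__(self, busy_threshold=0.7):
        self.busy_threshold = busy_threshold  # utilization (0..1) at or above which we defer
        self._jobs = deque()

    def submit(self, job):
        """Queue a zero-argument callable to be run later."""
        self._jobs.append(job)

    def run_if_quiet(self, utilization, max_jobs=1):
        """Run up to max_jobs queued jobs, oldest first, unless the system is busy."""
        results = []
        if utilization >= self.busy_threshold:
            return results  # busy: leave everything queued for later
        while self._jobs and len(results) < max_jobs:
            job = self._jobs.popleft()
            results.append(job())
        return results

    def __len__(self):
        return len(self._jobs)
```

The same structure would work with a carbon-intensity signal in place of utilization, which is the shift Anne says is being worked on.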
[00:35:46] Arne: What I see in particular is that people very often have very different setups, which I think is normal when people are trying to find ways to measure things and there is no standard out there. I think you can separate it into two basic domains. There is the cloud, where at the moment most of the measurement techniques that we use are not available. The cloud is typically more of an estimation game. You have premeasured machines, and I will come in a bit to how you do that. So, basically, you have premeasured machines. You have something you could call a calibration curve, if you want; I know it's not technically correct, but for some people this term might mean something. You basically have a curve that tells you that, at this amount of utilization, this machine uses this amount of energy. This curve is typically nonlinear, which requires a bit more than just a simple m times x plus b. So, already getting into the technical stuff, you need more than a linear equation to solve this problem. Here an easy machine learning model is what we use, for instance, to get this curve. Then you can go into the cloud, where at least the utilization, which is a typical DevOps metric or a typical monitoring metric that is usually available in many of the products, is what you can use. And you can, to a certain degree, assume that the configuration of the machine that you have already measured is very similar to the machine that's in the cloud, as the machines in this database, where we get the data from, are typically machines that are bought by cloud vendors, and they often use standard configurations (not all, but some). And then you can get a reasonable estimate of how much energy a machine in the cloud would use. Cloud Carbon Footprint follows a similar approach. They have a linear assumption, to my knowledge, but I haven't checked it recently.
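To make the "curve" idea concrete, here is a rough sketch that maps CPU utilization to power via a premeasured curve and sums it into energy. It uses simple piecewise-linear interpolation as a stand-in for the machine learning model Arne mentions, and the curve points are made-up illustrative numbers, not measurements of any real machine; `CURVE`, `power_watts` and `energy_wh` are hypothetical names.

```python
from bisect import bisect_right

# Hypothetical premeasured points: (CPU utilization in %, power draw in watts).
# Illustrative values only; note the curve is not a straight line.
CURVE = [(0, 50.0), (10, 80.0), (50, 120.0), (100, 150.0)]


def power_watts(utilization, curve=CURVE):
    """Interpolate the power draw for a utilization between 0 and 100 percent."""
    if not 0 <= utilization <= 100:
        raise ValueError("utilization must be between 0 and 100")
    xs = [u for u, _ in curve]
    i = bisect_right(xs, utilization)
    if i == len(curve):
        return curve[-1][1]  # exactly at the last measured point
    (u0, p0), (u1, p1) = curve[i - 1], curve[i]
    return p0 + (p1 - p0) * (utilization - u0) / (u1 - u0)


def energy_wh(utilization_samples, interval_s):
    """Estimate energy in watt-hours from utilization samples taken every interval_s seconds."""
    joules = sum(power_watts(u) * interval_s for u in utilization_samples)
    return joules / 3600.0
```

With these illustrative points, going from 0% to 10% utilization costs 30 W while going from 50% to 100% costs the same 30 W, which is the kind of nonlinearity a single m·x + b line would miss.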
We have this nonlinear one, which is supposed to be a bit better, and I know there are people out there who have even better models, but they are not open source. So how do you even measure it? Most of the academic papers show that people attach a power meter to the computer, which is something familiar to everybody who has done home automation or who just wants to know how much [energy] a microwave is really using. It's basically an adapter that you put on your power plug, and it tells you how much the connected machine is currently using in watts, or in kilowatt-hours if you want an idea of energy rather than the current power draw. They also have USB access, they have Bluetooth, so you can easily hook them up in a connected system that can then also run measurement jobs for people. But there is also something that is still new for developers, because it's kind of under the hood: Intel RAPL, which is more of a hardware feature than a technique, I would say. It is something like a power meter inside the CPU. It is still more of an estimation calculation, but it's very accurate; very many papers have already confirmed that it's very accurate to their falsification standards and parameters. What it basically gives you, as a developer, is that you can write code on Linux, and there is a function you can trigger, or a hook, and then you get the energy that the CPU has used. So you basically say, “Hey, I'm going to start here”, and you make a start point [number A]; then you run a bit of code, and then you ask it again, and you get number B. Then you have number A and number B, you subtract them, and you know how much energy has been used between these two points. And what we do for measurement in particular is that we write around these frameworks that already exist: there are external power meters, and there are onboard sensors that exist.
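The "number A, number B, subtract" procedure can be sketched on Linux using the powercap sysfs interface that exposes RAPL counters. The sketch assumes the package-0 domain lives at `/sys/class/powercap/intel-rapl:0` and that the process may read it (on many systems this needs elevated privileges); `read_uj`, `delta_uj` and `measure_joules` are hypothetical helper names, and this is not how any particular tool wraps RAPL.

```python
from pathlib import Path

# Assumed location of the RAPL package-0 domain in the Linux powercap sysfs tree.
RAPL_DIR = Path("/sys/class/powercap/intel-rapl:0")


def read_uj(name="energy_uj"):
    """Read a counter from the RAPL domain, in microjoules."""
    return int((RAPL_DIR / name).read_text())


def delta_uj(before, after, max_range):
    """Energy between two counter readings, allowing for one counter wraparound."""
    if after >= before:
        return after - before
    return after + max_range - before  # the counter wrapped past max_range


def measure_joules(fn):
    """Run fn() and return the energy the CPU package used meanwhile, in joules."""
    max_range = read_uj("max_energy_range_uj")
    before = read_uj()   # number A
    fn()
    after = read_uj()    # number B
    return delta_uj(before, after, max_range) / 1_000_000
```

The reading covers the whole CPU package, so anything else running on the machine during `fn()` is included in the number.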
There are also techniques like IPMI, which are also internal power meters, and there is this RAPL stuff. We glue them together in one big open-source tool, the Green Metrics Tool, which can attach these different sensors, as we call them. And then we give this out as a fully-fledged solution to developers who already have software, which typically is now written in container form, and who have already set up their container files, something like a Docker Compose file. And then they can just say, “Hey, please take this Docker Compose file, and then, similar to a Bash script or the like on Linux, I want you to run these lines: maybe run this Node program, maybe run the browser, and then you're finished. And in between, for example every 100 milliseconds or every 50 milliseconds, I would like you to write down the energy consumed.” And then at the end, you get all the energy nicely displayed in a graph. There are some statistics applied to it. [You can ask yourself:] “Has there really been a change from the last time you tested it to the time you've tested it now?” And to make this even better, we then also offer a service, on the web for free, where we have a measurement cluster with pre-configured machines that apply best practices on how to measure. I can elaborate a bit further on them later, but they do exist. And then you get a better measurement. It doesn't fluctuate as much. It is more reliable. You don't need as many repetitions to get a good, statistically conclusive answer, and see if the code is really different from another piece of code. We try to bring it [the measurement aspect] into a tool so that developers can use it with techniques they already know, like starting and stopping containers, or firing up a tool on the command line.
And then, with the onboard mechanisms that already exist, like Intel RAPL, or using machine learning models on CPU utilization, they can already get a metric out, and so they don't have to be measurement professionals. They just need to know how to use a Linux tool.
[00:41:44] Gaël: And this is where you can start comparing, I guess, or challenging the use of this library against another, and all that stuff that you mentioned earlier.
[00:41:54] Arne: Yes, exactly. The way to go would be that you have a Docker Compose file and then, let's say, one time you use npm as a package manager to install everything, and you want to see if it goes faster or uses less energy. And then you use pnpm, or you use a different one. I think Yarn is also a package manager. And you can see if this library or tooling swap changes anything in your build process or your program.
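One way to answer "has there really been a change?" when comparing, say, repeated energy measurements of an npm build against a pnpm build is a permutation test on the two sets of readings. This is a generic statistical sketch, not necessarily the statistics the tool discussed here applies; `permutation_p_value` is a hypothetical name, and the exact enumeration is only practical for small numbers of repetitions.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(a, b):
    """Exact two-sided permutation test on the difference of means of two samples."""
    pooled = list(a) + list(b)
    n_a, n_b = len(a), len(b)
    total = sum(pooled)
    observed = abs(mean(a) - mean(b))
    hits = count = 0
    # Try every way of relabelling the readings into groups of the original sizes.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for floating-point noise
            hits += 1
        count += 1
    return hits / count
```

A small p-value suggests the difference between the two setups is unlikely to be measurement noise; a value near 1 means the readings are indistinguishable. Fewer fluctuations in the measurement setup mean fewer repetitions are needed to reach a conclusive answer, which matches Arne's point about the pre-configured measurement cluster.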
[00:42:22] Gaël: Arne, you mentioned best practices. And I know that this is something that is very close to Anne's heart. Could you Anne, maybe tell us a bit more about these best practices, and Arne if you don't mind, you might want to comment on it.
[00:42:38] Anne: Well, when Arne was talking there, the thing that immediately hit me, and I thought it quite interesting, is that in the old days, and it's worth thinking about, the reason why we did all of this stuff was performance. You know, with the machines then, you had to wring every millisecond of performance you could out of systems. We didn't use to measure energy use, we used to measure performance: time, how long every operation took. And that's a fairly good proxy for energy use, how long things take, how performant stuff is. But I was thinking about it when Arne was talking, and the trouble with it is that it's very custom if you instrument your code to say, “I assume when this message comes in here, and then this message leaves here, [I know] how long that takes, and whether it's less (than previously)”. Because how long things take is often about how many CPU cycles they have gone through. And then how many CPU cycles something goes through is basically how much energy you're using. There's a good correlation between performance and being green, which is why a lot of these kinds of fine-tuning techniques are still used in networking, where performance is absolutely key; you've got that [indicator]. But the trouble with that is, it's very specific. It's very custom. You have to know what an application is doing. You have to know which messages are going through, and know where to put your instrumentation in. Whereas if you're just measuring the energy use of a whole system, that's more generic. Therefore, you can have tools that are generally more usable by everyone, rather than doing things that are very, very specific and custom. So, I assume that's the reason why we've moved over from using performance as the key way to measure energy use, to actual energy use, because it is more generic and therefore more widely applicable. But would you say that was true?
[00:44:48] Arne: Well, I think you're absolutely right in what you're saying. And if I speak to more seasoned engineers, they often ask questions like “Do we really need green coding? I mean, we have performance optimization. So where is the knob to tune if I don't take the classical performance techniques?” And I think you mentioned some of the green coding techniques already. I think they are unique to green coding, like time-shifting in particular; it doesn't gain you any performance, right? It only saves you energy, or saves you carbon emissions in particular. However, how we see it is similar to how you [Anne] said it. If you think about green coding, and energy is now so widely available through many sensors, why not make it the first-order metric? Because this is actually what you care about, right? You don't want to save on performance at the moment. Or at least this is our mission. If you really want to save on energy, why not take it, even if it's strongly aligned, or strongly collinear, with performance metrics? When these metrics are not aligning, there is typically something a bit wrong with your code in general. There are energy anomalies, where you see that, maybe, performance goes up or goes down, but the energy budget goes in a different direction, which could be due to misconfigurations, for instance. You could have something like a vector instruction unit in the CPU, nowadays called AVX. It was called MMX or SSE before, to help get some gamers in the loop who might have heard these acronyms. These units can be turned on, and then the CPU is using more energy. But actually, it's not doing anything, because it's currently not issuing any of these instructions, and this is typically a misconfiguration. Something turned the unit on, and then it's using more energy when it's not needed.
And so, it could be that you have your hard drive misconfigured, so it's spinning all the time and your disk is not going into a sleep mode or a pause mode, where it can stop spinning the disks, even though you are not using the hard drive. This is also where discussions about idle time come into play. So, your performance metrics could be perfect, but still, the machine is on. So a green coding technique, a classic one, and this is also what our tool shows, is to ask: if your code is doing nothing at the moment, does it really have to be on? Maybe it is an architectural decision here, where you say, maybe we move from a super highly coupled, highly integrated, vertically-only scalable monolith to something like a microservice architecture that we can actually turn off between requests, because we see more pauses than we really see activity, so the node doesn't have to be on all the time. Why not use the energy or the carbon metric as your first-order metric? Then, when you lay hands on the stuff, you still tune the performance metrics, but the measurement that you want to optimize against is the one that actually follows the goal that you want to achieve.
Anne: That's a great point.
[00:47:49] Gaël: Yeah, I do agree. Especially when we know that we will, more and more, as you mentioned Arne, have to take into consideration embedded carbon and full life cycle carbon etc. And that maybe, at some point, as you say, it will be environmental metrics and not just carbon. Because we have other environmental impacts that we do need to take care of. And this is really a question of which machine shall I use. And sometimes using less powerful machines, older machines, is also a way to save carbon. But that opens a completely different debate.
[00:48:25] Anne: It is a different debate, but it is worth reminding that there are three ways that the tech industry has to improve things. It's not just code efficiency. It's not just ‘be energy efficient’. It's also about being hardware efficient because hardware embodies one heck of a lot of carbon. And time-shifting. Those are the three things and we have to do all of them. We can't just do one of them.
[00:48:50] Gaël: Yes, I know. Recently I was preparing for a conference, and I found again this amazing interview that Gerry McGovern did with Melvin Vopson. And I know this is theoretical work, just to raise the alarm, but Melvin Vopson estimated the amount of mining that will need to occur to build the servers to handle the 25% yearly growth rate in data that we have today. So, plus 25% data equals that amount [number] of servers to be built just to manage it all. And he discovered that in 2053, humanity will have to mine the equivalent of Mount Everest. So that's 175 billion tons, I think, just to build servers, just to handle the data - we're only talking about the data! And of course, then we can say we will have energy efficiency gains, but the scale is still so amazing that it is something that we will have to pay attention to in the very near future. I know that at the moment we are focusing a lot on energy and immediate carbon emissions because of the electricity (consumption). But the embedded carbon is the next big battle, and actually, I truly believe that it will be the main battle at some point.
[00:50:23] Anne: Yes, and not just in data centers. Every time I have to throw away or give away or do something with a working device, like a phone or a laptop, because it's out of support, out of security patch support, but it (still) works! You know, there's just so much embodied or embedded carbon in that device. It's immoral, basically, for us to know ((that there is so much embedded carbon and)) to give up on providing security patches.
[00:50:52] Gaël: Yes, it's true that we need to remember those end-user devices. They account for three quarters of the entire environmental footprint during the ‘building’ phase: mining, manufacturing, transport, etc. So, of course, as professionals in tech, we focus on what we can do, which is mostly data centers and networks. But it's also true that when you talk with a designer, for instance, they are more and more aware of the tradeoff: “Do I want to enhance my code, even to do green coding, versus how do I make sure I actually reduce the size of my code and not create extra complexity that will accelerate software obsolescence and hardware obsolescence?” But that's a very important battle as well. Can I ask you a final question on best practices? I know, Anne, that there is quite a lot of good and sound advice in your book, and Arne, you already touched upon them a bit, so if you want to comment, just feel free. But maybe, Anne, as one of the three authors of the next O'Reilly book, what are the best practices?
[00:52:01] Anne: Well, the introduction summarizes everything in the whole book, and that's already available in very rough, pre-release form on the O'Reilly website. And you don't even have to buy an O'Reilly subscription, because you can just do a trial and have a quick read of it. And eventually, when it's finally published, the whole book will also be simultaneously public, open source. But not until it actually is finally published next year. All of that will be available. So, principles - this is a horrible thing, nobody wants to hear this, no technology person wants to hear this, but really, the best practice is: don't focus on optimizing your own code. Instead, use code that's pre-optimized by somebody else, because that is by far the most effective thing you can do. In the long run ChatGPT is going to become much better at optimizing; compilers are getting much, much better at optimizing code. You try and push that job off onto somebody else. But do be thinking about architectures and designs that will work with time-shifting. Things like spot instances, and microservices where you can turn things off, as Arne mentioned, or you can time-shift them. Think time-shifting first is my advice.
[00:53:21] Gaël: I guess, because you're influenced by your science fiction work and you want to travel across time, and this is why you're so obsessed by time-shifting! But it is finally happening. What about you Arne, do you want to travel in time again?
[00:53:41] Arne: Yes, as I said before, I also think that time-shifting will be one of the bigger gains in the future. And embodied carbon is one of the bigger battles to fight, although there, I don't really know how the optimizations will play out, because it's so opaque at the moment, as most people don't even know how, for instance, S3 is implemented, or with what kind of hard disks. So it's very hard to say how optimizations could even work for a system like this, which stores, I think, most of the data that the Internet holds at the moment. I think my take on optimization techniques is very simple. We speak a lot about these particular vector instruction techniques, as I mentioned before, and these energy and performance metric anomalies, and we speak about them because developers like to hear these super funny edge cases where something goes horribly wrong. But I think for daily business, if you really want to save energy in your code, most developers know how to do it. So, there's really nothing you have to tell the developers to do. It's more that they are overwhelmed, as business is not giving them the time and the support to do it. I would really say that the particular key [issue] at the moment is transparency. Wherever you can measure your stuff, even if it's not the best metric, make it public, whether it's on your own blog, or in the GitHub repository, or even if it's just in your notebook, so that you at least know what your code is doing. And then the other thing is to ask your management: how much is our code emitting? Can you [management] not supply these numbers? “Ask the cloud providers” is also something that Anne mentioned, which I think will drive a lot of the transition - you have to ask for these metrics. For instance, if I go to the supermarket and I always buy a product, and I'm always angry that it's not packaged in recyclable paper, yet I never ask the vendor [about the packaging], how can something change?
There is no mind transition [mind reading]. I don't know what the English term is for it, when my mind goes into his mind, and so he obviously knows whether I'm happy (or not) with the product. I have to ask for it [change]. So, I think this is really the key, along with techniques like time-shifting. And I really have to say, and maybe this is a bit of belly rubbing for you, Gaël, we should listen to podcasts like Green IO, because you will hear about new techniques that developers find, that are useful and that should be employed.
[00:56:01] Gaël: Well, thanks. And that's a beautiful transition to my last question, which is: what are the main resources you would advise the developer community to go for when trying to green their code? But Anne, you cannot mention Environment Variables, because I’m going to do it first and give big kudos to Chris Adams and Asim Hussain and the wonderful work they do with regular guests like you. So Environment Variables is definitely a podcast to listen to, and I personally listen to pretty much every episode. I've taken this example, so you need to find another one!
[00:56:38] Anne: Well, of course I'm going to mention my book, “Building Green Software”. And I have good reason for mentioning it, because we are publishing it as we go: every month, ideally, hopefully, we'll be dropping a new rough early chapter, and we're looking for feedback. So, contact me at buildinggreensoftware@gmail.com, and you'll be able to send us feedback on what you'd like to see in the book that has not already been covered. So, if there are questions whilst you are reading it, contact me - I'm on Twitter, Sara's on Twitter, Sarah's on LinkedIn. We're very happy to hear you come back and say “But I wanted you to answer this question”. We will attempt to answer the questions.
[00:57:24] Arne: I’ll also pick up the question. I'm monitoring what’s out there a lot. I have Google Alerts that tell me about new stuff coming out. I read the Green Software Foundation newsletter. I read the ClimateAction.tech newsletter. I'm a follower of this podcast. But I would say that there is no one [single] resource. I think this is what you are shooting for, Gaël? So [for me] there is no one central place where you can find all the best information. But if I have to name something that I think has given me the most value so far, with the most helpful techniques, it is conferences. I think if people get a conference talk in somewhere where you have a sustainability track, [it gives you] something that is a bit bigger, something to be watched. I think if you just want to follow one resource in particular, set up an alert for something like sustainability conferences or sustainability tracks at IT conferences. I think it’s there that I've seen the most valuable content.
[00:58:27] Gaël: Well, that's music to my ears, knowing that I will be in charge of the sustainability track at ‘Apidays’ both in London and in Paris later this year. I've got a big blessing from you. Thanks a lot. But yes, that's so true: conferences are cool. I mean, you can interact, you can discuss with your peers, and that changes everything, I guess, from being just a passive listener. And no, I didn't aim for a single source of truth. I'm always a bit dubious about these approaches. But it's great, actually, that you mentioned conferences, because we tend to mention articles, podcasts, etcetera. So yes, conferences and the big fight made by the Green Software Foundation. I mean, they've got a speaker repository now, and I know that their approach is that no conference in tech today can do without a sustainability track, or at least some talks on sustainability. And I think that's a great approach. And they've gathered people from all over the world, saying, “Hey, these folks ((are good speakers))”, and as far as I remember, both of you are in this cohort of speakers in IT/tech/sustainability, and I am as well (full disclosure!). These folks can talk; if you cannot find anyone, then just connect with them. But you cannot have a big conference without someone talking about carbon, sustainability and so on, so it definitely makes a lot of sense. Well, thanks a lot, both of you. That was a very lively discussion. I really enjoyed it, and letting you converse with each other really was music to my ears. So, I'd like to thank you once again for all the feedback and insights that you have shared with us today. Once again, thanks a lot.
[01:00:16] Arne: Thank you, Gaël. It was great to be on the show.
[01:00:18] Anne: Thank you very much.
[01:00:19] Gaël: And that's it. Thank you for listening to Green IO. Make sure to subscribe to the mailing list to stay up to date on new episodes. If you enjoyed this one, feel free to share it on social media or with any friends or colleagues who could benefit from it. As a nonprofit podcast, we rely on you to spread the word. Last but not least, if you know someone who would make a great guest, please send them my way, so that we can make our digital world greener, one byte at a time.
Learn more about our guests and connect
Anne Currie is a seasoned technologist with 25+ years of experience in the industry, known for her advocacy of green software and responsible technology. She co-founded the Green Software Foundation. She is the writer of several science fiction novels as well as the much-anticipated O’Reilly book “Building Green Software”.
Arne Tarara is the CEO and Software Engineer of Green Coding Berlin Software, a company focused on developing open-source tools for Green Software to reduce the climate impact caused by bloated software. Arne's mission revolves around researching the energy consumption of software and its infrastructure, creating open-source measurement tools, and building a community and ecosystem around green software.
📧 You can also send us an email at greenio@duez.com to share your feedback and suggest future guests or topics.
Anne and Arne’s sources and other references mentioned in this episode
- Tim Frick’s book "Designing for Sustainability"
- Greenframe.io
- Green Software Foundation
- Green Coding Berlin
- O'Reilly book "Building Green Software" (pre-release)
- University of Texas paper working group and blog
- Conference in Germany about computing and the environment: EnviroInfo
- Scaphandre
- CCF (Cloud Carbon Footprint)
- Max Schulze, SDIA
- Adrian Cockcroft's article: "Don't follow the sun"
- Dr. Melvin Vopson 'The environmental weight of data'
- Environmental Variables podcast
Transcript
[00:00:10] Anne: Write code that is absolutely good for the language that you're writing in. Because compilers do a fantastic job at optimizing your code for you. Don't second guess your compiler.
[00:00:26] Gaël: Hello, everyone. Welcome to Green IO the podcast for doers making our digital world greener, one byte at a time. I'm your host, Gaël Duez, and I invite you to meet a wide range of guests working in the tech industry to help you better understand and make sense of its sustainability issues and find inspiration to positively impact the digital world. If you like the podcast, please rate it on Apple, Spotify or your favorite platform to spread the word to more responsible technologists like you. And now enjoy the show.
So, Green Software. Quoting my dear friend Ismael Velasco, “Our code is harming the planet”, and I am privileged today to have two of the best experts European-wide, and I dare to say worldwide, to dive deep into it. One is based in London, England and the other in Berlin, Germany. Let's start with Anne Currie. When I think about her, I have this song from Dire Straits, ‘Lady Writer’, in mind, because Anne is indeed a writer of several science fiction novels, as well as the much-anticipated O’Reilly book ‘Building Green Software’ with her two partners in crime, Sarah Hsu and Sara Bergman. The book is in early release, and I can't wait to discover the animal that will finally be chosen. Anne and the two Sara(h)s, with and without the H, are also pillars of the Green Software Foundation, and are carrying the flag for sustainability in tech at many conferences like ‘QCon London’ or ‘Apidays Paris’ (a leading API conference series), just to name a few.
When I had the pleasure to meet Arne Tarara, we went for a walk in a small park in Berlin. This is how he likes to exchange: outside, surrounded by nature. And during our talk, I was astonished by Arne's deep knowledge of green computing, and his commitment to building efficient tools for developers in the true open-source spirit through the startup he created, called “Green Coding Berlin”. As you can guess, both of them are seasoned software engineers with decades of practice behind them.
Hello Arne. Hello Anne. Nice to have you on the show today, and I'm going to be super cautious with the pronunciation of your names today, so that everyone will understand when I ask Anne, and when I ask Arne, to intervene, so as not to confuse too much all our American friends. And I'd like to start with the usual question I ask all my guests, which is: “how did you become interested in sustainability, and in the sustainability of our digital sector in the first place? You know, did you experience a light bulb moment, or was it more like something that was there forever?”. Maybe, Arne, if you want to start?
[00:03:20] Arne: Sure. Thank you for the nice introduction. I think the important point is one you already mentioned, that I have already been a software developer for quite a while, 16 plus years or so. I had just finished with my former company, which was mostly in performance marketing, advertising, online shops etc. So, one of these classical building businesses, I would say, at least at the time, and I wanted to make something that had more meaning to me, and had a more sustainable touch. So, I tried to branch out into different fields. It wasn't clear if it would be digital or not. We have a strong meet-up culture here in Berlin, actually, same as in the US, I guess. And I was introduced to the meet-ups about green coding: people who claim that they can make the world a little bit better, not only the digital sector, in particular by using software. On the one hand, it was a bit surprising that you can be a professional in your domain for 16 plus years, and you obviously know that there's stuff like supercomputer optimizations, and hardware gets better over time, and more efficient, but you didn't even think about this issue. In Germany, we have this proverb about sweeping in front of your own door, i.e., basically just checking whatever you do, and seeing if that could be improved, made more efficient in any way. And I was introduced to this meet-up group and I would say, yes, it kind of had this mind-blowing effect on me. It was like “wow”. Actually, there is so much you can do. And this niche opened up to me. I believe green coding is still, to a certain extent, [an area] where you can get to know very many people very quickly, people who seem to be the top players in what they are doing at the moment, and where there is so much potential to be lifted, so much to be done, so much you can do with software. So many efficiency gains [to be reaped] that are still lying around in the software field.
And I immediately knew, OK, this is what I would like to do, and I looked around, and there were not many tools at the time. One tool that I discovered which kind of had my idea implemented in part was, for instance, GreenFrame.io, which is still around. And I think it's actually from France, right? I picked up the tool and I tried to use it, and it didn't work at all for me, and it was not open source. And then I thought, this is the natural way I'm going to contribute, but it was not possible. They would not act; they were not interested. I contacted the support team, but didn't get a call back, etc. And I thought, OK, I'm going to re-implement it myself. I made the Green Metrics Tool, which is one of the main tools we're working on, and said “OK, I also want to do it a bit more professionally, and be a business, and carry out research in this field”. And this is when I created Green Coding Berlin, and set up a small team of people that we would all like to work with. And now we're doing research and trying to make the sector a bit greener.
[00:06:10] Gaël: What about you Anne?
[00:06:13] Anne: I've always been very interested in efficiency, and a lot of that comes from the fact that I've been in the software industry since the early nineties, and back then everything had to be efficient. We didn't really have any choice in that. The machines were really terrible, you know, in many ways 1000 times worse, with 1000 times less bandwidth, 1000 times less power. We had to be incredibly efficient in the way we wrote things. So back in the nineties I worked for a while on the first version of Microsoft Exchange. These days, one of my co-authors, Sara Bergman, [still] works on Exchange, you know, nearly 30 years later. It's always been interesting to me that it’s still effectively the same product, but it requires 1000 times more resources to do its job now than it did then. And when you look at the world and the energy transition that's going to have to happen, you realize that 1000 times more resources, that's a lot. And we could really be doing with using that electricity, those resources, as efficiently today as we did in the past. Except there are issues with that: you can't just go ahead and do it. That's the tricky thing to be doing at the moment. How do we align them both? So, I wouldn't say I had a kind of epiphany, a kind of religious belief that we had to do better. Just the knowledge that we really could (do better). And this is something that has to happen and therefore we will do it. And a key way that we will do it is through increased code efficiency. That's not the only way, but it is a fundamental way.
[00:08:02] Gaël: I didn't do any software engineering back in university, that is not what I studied, but I learned everything on the ground with a tremendous mentor called Jean Yves. And he used to tell me about the old days, you know? And he had this expression, that in the old days we had to break every byte into two. By then things had already started to be, yes, quite convenient. The computing power was going up. Storage was less and less of an issue, etc. But I could not forget what he used to tell me: you need to pay attention to memory resources. You need to pay attention, and we completely lost sight of it. And it's quite fun to hear you referring to these times when resources were super scarce and super expensive, actually. And that actually leads me to the question I wanted to ask you, and so I might lose my bet, but I would dare to say that if I bet a dollar or a euro or a pound with you, and you choose your currency, that the words ‘green software’ were not that widespread, I guess, when you started coding. So how do you see the evolution (of it) today and, maybe more specifically, during recent years? Where are we? I mean, is it that widespread? Or is it more like we (Arne, you and I) live in some kind of informational bubble, and we think everybody cares, but perhaps that's not the case?
[00:09:37] Anne: Well, I'll answer your first question first, which is ‘No’. 30 years ago, green software did not exist as a concept, for two reasons, really. Software just wasn't that much of a hit on the economy at that point: it didn't use that much electricity, and the software industry wasn't that big. It didn't have the impact that it does today. Plus, also, you know, culturally, we weren't thinking about these kinds of things back then. But ironically, the kind of things that you have to do as part of being efficient and using energy well, for software, we were just doing, because we had to, because we had those terrible machines. So, in some ways, we were great, because we were doing a lot of the right things, though not all the right things. We had efficiency absolutely down pat. But that was out of necessity rather than because we had chosen to do it. But these days, I think you're right in saying that there's a risk that we're all in a bubble, where we think this is something people care about now, when they don't. But it has become massively, massively more top of mind in the past few years. I remember talking about this at conferences 5 or 6 years ago, and people looked at you as if you were crazy, and we even got complaints. When we ran tracks on this subject a few years back, people were saying, well, that's politics, it's not technology, it shouldn't be included in conferences. But now nobody says that. Everybody knows that it has to happen. There has been, I think, the IPCC report that really woke everybody up. And there is the fact that the tech industry is one of the biggest industries. We have to do things and yes, some of it is going to be efficiency, just like we did in the nineties. And some of it is going to be time-shifting, which is, in the long term, even more important.
[00:11:35] Gaël: Could we say that the awareness has dramatically risen? But what about the practices? Maybe Arne, you want to comment on this one? Did you really see a significant change in the way people code, even if they are aware of the ecological crisis that we are in at the moment?
[00:11:55] Arne: I think part of every business that everybody does (or should do) is to do a bit of research, asking: ‘has this maybe come up before’? Have people been talking about green coding before? Are we currently on a hype or are we currently in some kind of a valley [trough], so to say? And if you look back in the academic world, there were already, from 2007 to 2010, very many papers around green coding. There was the university in Texas that had this Archer supercomputer where you could actually measure all your code, even before RAPL was out there, or perhaps it was around the same time. I think we will come to the technologies later, but let’s put it out there for the moment that one measuring technique is [built into] the processor itself. You could already do it on systems that were out there. But basically, nobody was interested anymore. And I would say a drought of papers in the academic world happened, and now it's coming up a bit again, at least in Germany. We have a conference about it, ‘EnviroInfo’, which is mostly about computing and ecology in general. So, I would say that green coding, at least in my historical view when I looked at it, has already had its ups and downs. And now, coming back to your particular question, I wouldn't say that, especially in the last year, there is necessarily a stronger move towards people adopting these techniques. So, a measurement you could take, for instance, is [the adoption of] some of the most prominent software, I guess, like Cloud Carbon Footprint and Scaphandre [a metrology agent dedicated to electrical power consumption metrics].
I would look, for instance, at how many GitHub [stars] they have over time: is there a surge or something? I don't have all the data as I'm not the repository owner, but this is my view on how green coding has evolved over time. But, if it's OK, I will elaborate on this one a bit, because you also asked me about what we are doing and how we see the sector in particular. We at Green Coding Berlin do not necessarily do what people often think green coding entails for them in particular. When we talk to companies or young developers, they ask us for optimization. They say, “OK, how can I make my code greener right now? Which tool do I have to use to emit less carbon?”. This is actually something we don't focus on a lot, because green coding, from my view, if you look at the digital sector as a whole, is not a problem that is coming from the industry itself. The industry doesn't necessarily have an issue with the digital products that it's using. It's rather something that's coming from states as actors, from developers, and from consumers. The industry itself, in my view, has an incentive to tackle things that they think are not efficient enough, for instance the machine learning models, because they cost them a lot of money. I think this will be resolved on its own. A bit of additional pressure might be nice, but it's not necessarily needed, I guess, for this to transition. The same goes for things that are not cost effective. If you look at something like YouTube, Twitch and Bitcoin, they are in themselves, for the most part, already cost effective. But people complain about them a lot and think ‘can we not make this greener in a way?’, because they often don't use these technologies themselves. You will rarely hear complaints from people who earn their money with Bitcoin that Bitcoin should use less energy.
And generally speaking, people that don't use Twitch are more likely to complain that one such streamer can emit x amounts of carbon. But for the companies that run them, like Google, which runs YouTube, or Twitch, which I think is owned by Amazon, it's cost effective to run these platforms, even though they will produce an enormous amount of data, which is harmful on its own. But it works for them. The incentive is there, but it’s not that intense. And I think green coding techniques on YouTube will take a while until they're implemented if they are not directly cost effective, for instance. On the flip side, the developers are becoming more concerned. This is what we see, for instance, as a company. But this overlap of business and interest, I think this is still in the making, and I'm not really sure if this is the biggest driver. So, coming back to your question in particular, I think that green coding, and its effect, will mostly happen through regulations, and this is also what we work on at “Green Coding Berlin”. And this means that you have to have the transparency first. And this is what our tools are mostly doing. They are giving developers and users transparency. They make stuff comparable, and then someone can step in, like regulators or society, to force optimization techniques, or to impose limits.
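A note on the processor-based measurement Arne mentions earlier in this answer: RAPL is exposed on many Linux machines through the kernel's powercap interface as a cumulative energy counter. The sketch below is only an illustration of that idea and is not taken from Green Coding Berlin's tools. It assumes an Intel-style system where the files `energy_uj` and `max_energy_range_uj` exist under `/sys/class/powercap/intel-rapl:0` and are readable (often root only); the helper names are hypothetical.

```python
from pathlib import Path

# Assumed sysfs location of the package-0 RAPL domain (Linux powercap driver).
RAPL_DIR = Path("/sys/class/powercap/intel-rapl:0")


def energy_delta_uj(start_uj: int, end_uj: int, max_range_uj: int) -> int:
    """Energy used between two counter readings, in microjoules.

    The counter wraps around after max_range_uj, so a smaller end value
    is treated as exactly one wrap-around.
    """
    if end_uj >= start_uj:
        return end_uj - start_uj
    return (max_range_uj - start_uj) + end_uj


def read_counter(name: str) -> int:
    """Read one integer file (for example energy_uj) from the RAPL directory."""
    return int((RAPL_DIR / name).read_text().strip())


def measure_joules(func) -> float:
    """Run func() and return the CPU package energy spent meanwhile, in joules.

    This covers everything running on the package, not only func().
    """
    max_range = read_counter("max_energy_range_uj")
    start = read_counter("energy_uj")
    func()
    end = read_counter("energy_uj")
    return energy_delta_uj(start, end, max_range) / 1_000_000
```

Because the counter covers the whole CPU package, a reading like this is best taken on an otherwise idle machine and repeated several times.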
[00:16:42] Gaël: And how do you enable more transparency to happen when we have so many issues? I'm not going to quote Max Schulze from the SDIA too much during this discussion, but he's got a point when he says, again and again, that the main hyperscalers especially are not providing enough comparable and transparent data to truly leverage everything that we could do in computing. Do you also believe that it's an issue? Or is what you were saying that, with the tools that you've developed or the approaches that you encourage people to follow, there is a way to become more efficient even if some data are missing? And I know that the hyperscalers are, I would say, not all doing things at the same speed. But I will not enter into this debate here.
[00:17:38] Anne: The hyperscalers are interesting because you can put pressure on them. Obviously, governments and so on could put pressure on them, but it's amazing how much pressure users, customers, can put on them. If you say, “Look, I want this, I'm demanding better carbon footprint measure(s), I'm demanding this information”. They are quite customer, well, I say ‘they are’: not all of them are. AWS, Amazon, is quite customer responsive. Actually, Azure is quite customer responsive as well. Google not so much. But if enough customers raise it, and it doesn't require that many, and you keep raising it, they (the hyperscalers) will see that there's customer demand for it, and then they will do it. When Amazon talks about being customer obsessed, they actually are. If a handful of people, not that many, just keep raising this with AWS reps, we have a good chance of getting it. And we got those sustainability commitments. Whether they will be sufficient in the end remains to be seen. But we have made progress by getting folk to raise issues with their providers.
[00:18:45] Arne: I think this is one of the big levers to go to, that you have to put the pressure on the cloud providers, either through the user side or through a regulatory side. And, for instance, this is what our tools are trying to do. A lot of people run an extensive amount of CI/CD pipelines, and what our tools do is that they simply create an easy machine learning model that's based on an open database of server energy consumption, called SPECpower, and then you plug that into a bit of code, so that it can be digested by GitHub Actions, which is GitHub's pipeline product, or by GitLab CI, which is GitLab's pipeline product. And then you just see at the end how much your pipeline is consuming, and you see it for your hundreds or thousands of pipelines. And then you have a number, at least at the end of the month, and you can see if this number is going up or going down, and then you can go the route that Anne was suggesting and say: “Hey, this number is maybe not the best, because this company, Green Coding Berlin, is doing it from the outside in. So why don't you give us these numbers?”. So, they go to GitLab and they go to GitHub, or Microsoft in particular, and they say, “We want better numbers. It's not so hard for you to give them to us. And now we see that it's actually possible, somebody can do it from the outside in. So why don't you give us these numbers so we can be better?”. But we believe that people need to see this to a certain extent before they can even ask the right questions.
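To make the "outside in" idea concrete, here is a deliberately simplified sketch. It is not the model Arne describes (he mentions a machine learning model built on SPECpower data); it only assumes that a server's power draw can be approximated by interpolating linearly between an idle and a full-load wattage. The function names and the wattage figures in the usage note are hypothetical.

```python
def estimate_power_watts(cpu_util: float, idle_watts: float, max_watts: float) -> float:
    """Approximate power draw by linear interpolation between idle and full load.

    cpu_util is a fraction between 0.0 (idle) and 1.0 (fully loaded).
    """
    if not 0.0 <= cpu_util <= 1.0:
        raise ValueError("cpu_util must be between 0 and 1")
    return idle_watts + cpu_util * (max_watts - idle_watts)


def estimate_job_energy_wh(samples, interval_s: float,
                           idle_watts: float, max_watts: float) -> float:
    """Estimate the energy of one pipeline run in watt-hours.

    samples holds CPU utilisation readings taken every interval_s seconds.
    """
    joules = sum(
        estimate_power_watts(util, idle_watts, max_watts) * interval_s
        for util in samples
    )
    return joules / 3600.0  # 1 Wh = 3600 J
```

For example, a run sampled 36 times at 100-second intervals with a constant utilisation of 0.5, on a machine assumed to draw 50 W idle and 150 W at full load, comes out at 100 Wh. Summing such estimates per month gives the kind of trend number Arne describes, with the caveat that a real server's power curve is not linear.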
[00:20:06] Gaël: It’s a bit like starting with the metrics, we have to create a momentum and then in parallel, put pressure to get better metrics and better data from providers. And if you don't mind, both of you, because we could discuss a lot about cloud providers and the general approach, but actually, I'd like to deep dive a bit more with you. Could you share the top 2 or 3 techniques or approaches that you implement, I would say on, almost on a daily basis, to reduce carbon or carbon emissions caused by software?
[00:20:41] Anne: I'm a bit controversial on this one, so I'll start off and say, this is something that came up when we started writing “Building Green Software”. One of the questions that came up immediately from people [was]… “Oh, in the book can you cover some examples of efficient code?”. I used to write efficient code. Almost everybody I know writes efficient code and we all (this is terrible) laughed when someone said this, because almost the definition of efficient code is that it's incredibly custom. It is utterly and specifically custom to the very, very particular use case that you're interested in, and really efficient code takes ages to write. It is incredibly bad for developer productivity, so generally it's quite hard to give people advice about how to write efficient code. I mean, you can say, ‘Well, use efficient languages like C or C++ or Rust, rather than less efficient ones like Python’. But even that's not so clear-cut these days, because there are new Python compilers that are compiling Python to machine code, or compiling Python to C. So, you can still write in the inefficient language and have it transformed into a more efficient one, because they know that developer productivity is really killed by writing this very, very highly custom code. So, it's hard to give generic advice. If you speak to folk who are still writing really efficient code, for example in the networking area, they still have to write that high[ly] efficient code, the same kind of code that we used to write 30 years ago, because they really, really need that super performance. And their feedback is generally: write code that is absolutely good for the language that you're writing in. Because compilers do a fantastic job at optimizing your code for you. Don't second guess your compiler. Follow best practice so that your compiler can optimize as far as humanly possible. It's a bit sad because everybody wants to hear some amazing C technique, whatever.
But fundamentally, it's just really, really hard and very custom. The best thing you can do is measure different tools. Get somebody else to do it for you. Don't custom write your own highly efficient code. Find libraries and tools that are good and use them, which is what you need the measurement for. You need to measure to find out which are the good tools and libraries, and you swap out poor ones for those more optimized ones, but don't attempt to do it yourself unless you are actually writing those libraries, I would say. It's a bit sad, but I would say there's no killer technique that you can use, because it's all hyper custom. You know, it's all basically messing around with your L1, L2, L3 caches for a very, very specific use case. I don't know, you might disagree with me, Arne?
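Anne's advice to measure alternatives and swap in the better one can be sketched with Python's standard `timeit` module. This is a minimal illustration, not something from the episode: it compares a hand-written summing loop with the built-in `sum`, and it uses run time as a rough stand-in for energy, which is an assumption that does not always hold. The `compare` helper is hypothetical.

```python
import timeit


def handwritten_total(values):
    """A hand-rolled loop, standing in for 'custom' code."""
    total = 0
    for value in values:
        total += value
    return total


def compare(candidates, data, repeats=5, number=20):
    """Time each candidate on the same data; return best seconds per candidate.

    The minimum over several repeats is used because it is the reading
    least disturbed by other activity on the machine.
    """
    results = {}
    for name, func in candidates.items():
        timings = timeit.repeat(lambda: func(data), repeat=repeats, number=number)
        results[name] = min(timings)
    return results


if __name__ == "__main__":
    data = list(range(100_000))
    timings = compare({"handwritten": handwritten_total, "builtin": sum}, data)
    for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
        print(f"{name}: {seconds:.4f} s for 20 calls")
```

The same harness works for comparing two libraries that do the same job: check first that they return the same result, then keep whichever measures better on your own workload.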
[00:23:43] Arne: No, I actually have the same [outlook]. I have the notion that we are very much on a par here in our view of how good these generic optimization tips are. However, what we often get is requests from users who see these articles, saying that Amazon has implemented a new gzip or zlib compression technique in their S3 service and it saved them, I don't know, I think it was in the tens or hundreds of millions, because they had to use less hardware to store their stuff. Or you see this article that states there is a 70% improvement in React by just stitching the virtual DOM, so apparently it is possible, on a particular product, to get these gains. However, I would very much agree that on a generic level, it's extremely hard to implement. So, there are techniques that have been known for many years, like using vector instructions, loop unrolling, etc., that do work if you really put the work in. But it's very questionable whether this really saved you something in the final calculation, if you look at the whole thing: the time the developer had to put in, how much the software will run in the end, how much it cost you to build these 50 to 100 iterations until you got it working. So, I think this is a bigger question, and I think, Gaël, you might make a separate podcast on this, this whole idea of software life cycle assessment. This is also something that Max [Schulze] is very passionate about. But I would like to give you our approach on how we typically do it. I think we have the same idea that Anne mentioned, that measuring is one of the first steps when we consult with companies or when we do workshops with developers. We have these five pillars, so to say. So, first of all, it’s about understanding. People often don't understand the terms that are even used.
If you talk about energy and energy efficiency, they don't even know how a network could cost them in energy terms, that network costs can be linear, or they can be progressive in a way. And then [there is the question of] transparency: measuring and transparency. Whatever you have then understood and measured, you should also show it to people and make it public, for example on GitHub, as a badge or something. Then continuity is a pillar we focus on a lot. It doesn't help you if you look at it one time [only], so you have to monitor it over time. It is like the GitOps approach: with every release, with every build, you basically have to check if your initial measurement or your initial assumptions are still right, or check if the product is currently derailing, and you don't want that. Then the fourth pillar is comparing. If you are thinking about software, and the goal at the end, the optimization, is to actually save something, then comparing is often very helpful. So sometimes just look at another database, technically identical to the database that you are currently using: how much would this change? So just swapping libraries out, as you said, or swapping infrastructure out is often a better way to go than going for code-level optimizations in particular. But they obviously have their place, so our fifth pillar is then code-level optimizations, wherever they make sense. However, this is then specific, so you have to really look into your product. It often means using specialized tools, so maybe VTune® or something, or code profiling techniques. This is very laborious, and these tools are also sometimes cumbersome.
[00:27:18] Gaël: I would say it's a lot about measuring and comparing rather than having one silver bullet, which makes sense. If it was that easy, everyone would do it. And I guess the question of software productivity, the productivity of your developers, is absolutely key here. We need to take the full life cycle into consideration: if you take three or four times more days just to release one little piece of code, you could actually have used that energy better.
[00:27:53] Anne: Unfortunately, it's more like 10 or 20 or 30 times as long! I remember how long things used to take. They used to take an incredible amount of time. I mean it is interesting that, in the 30 years of my career, there has been a more than 1000-fold improvement in machine productivity, and we've used it to make developers more productive. And it's very hard to make the sell to your business that you should slow down, because if you do you'll go out of business. So you have to trade off what you can sell to your business, as well as what is a sensible thing for your business to do, as well as what is the green thing. You have to align them. I'm not saying throw out the green things, I'm saying you have to find ways to align them both. And the good news is that [there are] all the modern ways of working: with microservices, with open-source libraries, with hyperscalers and hyperscaler services. Arne said this himself: there's an alignment. If you're a big business, it is worth putting in that 100x developer effort to make your stuff efficient, because so many people are using it that it pays off. But if you're only a small business, and you only have a moderate number of people using your software, you'll probably never pay back the developer effort to make it super-efficient, so you're better off just using a library. Don't do it yourself. Use a library. Use an open-source library, use a hyperscaler service. But I mean, we talked about code efficiency here, and I'm not even sure in the long run that that's going to be the big win that we're going to make in the tech industry. I think it's going to be the time-shifting, because even now we're seeing that with renewables, you get huge amounts of energy at some times, and no energy at others. And that requires a whole different way of using electricity.
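The continuity pillar, checking on every build whether the measurement has drifted, can be sketched as a small gate that a CI job might call. This is an illustrative assumption about how such a check could look, not a description of the Green Metrics Tool; the function name and the 10% default tolerance are hypothetical.

```python
def check_energy_regression(baseline_j: float, current_j: float,
                            tolerance: float = 0.10):
    """Compare the current energy measurement with a stored baseline.

    Returns (ok, relative_change). ok is False when the current value
    exceeds the baseline by more than the tolerance (0.10 means 10%).
    """
    if baseline_j <= 0:
        raise ValueError("baseline_j must be positive")
    relative_change = (current_j - baseline_j) / baseline_j
    return relative_change <= tolerance, relative_change


if __name__ == "__main__":
    # Hypothetical numbers: last release used 100 J per test run, this build 120 J.
    ok, change = check_energy_regression(100.0, 120.0)
    print(f"change: {change:+.0%}, within tolerance: {ok}")
    if not ok:
        raise SystemExit(1)  # a non-zero exit code fails the pipeline step
```

A gate like this only makes sense once measurements are repeatable, which is why understanding and transparency come first in Arne's list.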
In the old days, it was just, you know, flick of a switch, all fossil fuel driven.
[00:30:01] Gaël: Is it something that you implement quite a lot, like chasing the sun, which is time shifting and location shifting, or not that much?
[00:30:12] Arne: It is actually a technique we do implement, in workshops, with developers, because it's generally a very interesting technique to implement, as it suggests that there are immediate gains. I don't know if you've recently read the piece called “Don't Chase the Sun”. I'm not sure who wrote it, if it was David Mytton or Adrian Cockcroft, or maybe I might be mixing stuff up. It was a counter-argument piece, saying that [chasing the sun], at the moment at least, often doesn't make sense. I will elaborate on this a bit further, but I would like to say that I generally agree with Anne that this is an enormous [energy] saving technique, and this is actually what, at least in Germany, we are implementing with the grid. I think every country does it, but I can only really speak for Germany by saying we want to have smart meters, so that in the end, when we have surplus energy, and we really need to not waste it by curtailing it, we can charge electric cars at that particular time. And in Germany, we have a long way to go in incentivizing people to charge them at these hours, so that it's actually cheaper to wait. At the moment in Germany, it's not cheaper to wait, even if we had smart meters, because there is a law that makes the pricing even throughout the day. But if you look at the current state of how time-shifting works: we are currently implementing a small plug-in for GitHub where you can say, “Hey, I want to run this pipeline at this particular time because the forecast says that there will be green energy at that time”.
However, grid operators, to my knowledge, typically plan ahead how the grid is supposed to run. So it is very likely that, if the forecast says there is a lot of green energy, the grid is already in a stable state, and you demand more, the extra power will not come from solar or from a wind farm, because that is already curtailed. The grid needs this bit of planning ahead, so the operators will more likely drive the power plant that runs on coal up a bit. But this is a temporary problem, to my understanding: they learn these signals over time, so if you do that 5 or 10 times, the grid will learn, they will not curtail the green energy so much, and you will get it. But it's the same as those network savings. It's often not an immediate gain. It's more a theoretical long-term thing until we can understand the signals better.
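The time-shifting idea Arne describes can be illustrated with a minimal sketch. This is not the GitHub plug-in mentioned above; the function name `best_start_hour` and the hourly forecast values are hypothetical, and a real forecast would have to come from a grid data provider. Given a carbon-intensity forecast, it picks the start hour whose window has the lowest total intensity while still finishing before a deadline.

```python
from typing import Sequence


def best_start_hour(
    forecast_g_per_kwh: Sequence[float], job_hours: int, deadline_hour: int
) -> int:
    """Return the start index whose window has the lowest forecast carbon intensity.

    forecast_g_per_kwh: hypothetical hourly forecast (gCO2e per kWh), index 0 = now.
    job_hours: how long the job runs, in whole hours.
    deadline_hour: the job must finish at or before this index.
    """
    if job_hours <= 0:
        raise ValueError("job_hours must be positive")
    last_start = min(deadline_hour, len(forecast_g_per_kwh)) - job_hours
    if last_start < 0:
        raise ValueError("job does not fit before the deadline")

    best_start, best_total = 0, float("inf")
    for start in range(last_start + 1):
        total = sum(forecast_g_per_kwh[start:start + job_hours])
        # Strict comparison keeps the earliest window when two windows tie.
        if total < best_total:
            best_start, best_total = start, total
    return best_start


# Hypothetical forecast: intensity dips around hours 3 and 4.
forecast = [450, 420, 300, 180, 160, 210, 380, 470]
start = best_start_hour(forecast, job_hours=2, deadline_hour=8)
```

As both speakers note, a low forecast value does not guarantee that the extra demand is actually served by renewable generation, so a scheduler like this captures the long-term signal rather than an immediate saving.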
[00:32:36] Gaël: It was Adrian's article, “Don't chase the sun”. Anne, you wanted to say something, sorry…
[00:32:43] Anne: Yes, I totally agree on “don't chase the sun”. You don't really want to be moving your data around. What you want to be doing is delaying jobs, rather than moving data around to chase the sun. I agree with both Arne and Adrian on that one. It's interesting what Arne mentioned earlier: YouTube is an excellent example of one of the products that Google used to do their own kind of grid balancing, on their hardware. If you upload a video to YouTube, sometimes you'll notice that it's transcoded very quickly, and sometimes it won't be transcoded for a while. And the reason for that is that they use that as one of their latency-insensitive workloads. If they've got a lot of stuff going on, if the systems are busy, they'll just shove that transcoding [down the line] to a little bit later in the day when things are less busy, so they get better utilization on their machines. And right now, they're working on a similar kind of shifting to try and move work to when the sun is shining and when there is potential to power it greenly. But Arne is right that there isn't necessarily an immediate benefit to that, because right now the grids might not have enough green energy to provide, because they may already be curtailing it. But in the long run, if you create demand at times when there is potential solar or wind to match it, then more solar and wind will be put in [to the grid]. So it's not necessarily an instant win, but right now it's all about the transition. It's about moving to how we're going to work in that new world.
[00:34:36] Gaël: I don't remember if he also mentioned this aspect in his article. But chasing the sun is also an issue once you start implementing a multi-criteria approach, because carbon is one thing, but water is another. And, you know, if you shift all the workloads to a country where you've got plenty of sun, usually water is pretty scarce. And we are experiencing several droughts here in Europe, and the same goes for the US. So, the moment you say OK, let's chase the sun for green electricity, you might also create a lot of problems when it comes to water stress. So that's also why I kind of like his expression “don't chase the sun”. Maybe “chase the wind” is a bit more accurate, but eventually I guess it's all about reducing the energy intensity, and not going for a silver bullet or a quick fix that actually does not exist in this energy transition. That's how I understood his main message, and I could not agree more with both of you.
If you're OK with it, because we talked a lot about measuring, metrics, etc., could you maybe share a bit, both of you, the dos and don'ts when you measure, and maybe one or two examples of how you manage to measure for some of your clients?
[00:35:46] Arne: What I see in particular is that people very often have very different setups, which I think is normal when people are trying to find ways to measure things and there is no standard out there. I think you can separate it into two basic domains. There is the cloud, where at the moment most of the measurement techniques that we use are not available. The cloud is typically more an estimation game. You have premeasured machines, and I will come to how you do that in a bit. So, you basically have premeasured machines. You have something you could call a calibration curve if you want; I know it's not technically correct, but for some people this term might mean something. You basically have a curve that tells you that at this amount of utilization, this machine uses this amount of energy, and this curve is typically nonlinear, which requires a bit more than just a simple m times x plus b. So, already getting into the technical stuff, you need more than a linear equation to solve this problem. An easy machine learning model is what we use, for instance, to get this curve, and then you can go into the cloud, where at least the utilization, which is a typical DevOps metric, or a typical monitoring metric that is usually available in many of the products, is what you can use. And you can, to a certain degree, assume that the configuration of the machine that you have already measured is very similar to the machine that's in the cloud, as the machines in the database where we get the data from are typically machines that are bought by cloud vendors, and they often use standard configurations (not all, but some). And then you can get a reasonable estimate of how much energy a machine in the cloud would use. There's also a similar approach that Cloud Carbon Footprint follows. They have a linear assumption, to my knowledge, but I haven't monitored it recently.
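A minimal sketch of the "premeasured machine" idea follows. It swaps in plain piecewise-linear interpolation between a few measured points as a stand-in for the machine learning model Arne mentions, and the calibration values, names and sampling interval are all hypothetical. It only shows why a nonlinear curve gives a different answer than a single straight line through the idle and full-load points.

```python
from bisect import bisect_right

# Hypothetical calibration points for one machine type:
# (CPU utilization in percent, measured wall power in watts).
CURVE = [(0, 60.0), (10, 95.0), (50, 160.0), (100, 200.0)]


def estimate_power_watts(utilization_pct, curve=CURVE):
    """Interpolate power draw between the two nearest measured points."""
    if not 0 <= utilization_pct <= 100:
        raise ValueError("utilization must be between 0 and 100")
    xs = [u for u, _ in curve]
    i = bisect_right(xs, utilization_pct)
    if i == len(xs):  # exactly at the last measured point
        return curve[-1][1]
    (x0, y0), (x1, y1) = curve[i - 1], curve[i]
    return y0 + (y1 - y0) * (utilization_pct - x0) / (x1 - x0)


def estimate_energy_wh(utilization_samples, interval_s):
    """Sum estimated power over equally spaced utilization samples, in watt-hours."""
    total_watts = sum(estimate_power_watts(u) for u in utilization_samples)
    return total_watts * interval_s / 3600


# At 30 % utilization the curve gives 127.5 W, whereas a single straight line
# from (0, 60) to (100, 200) would give 102 W.
power_at_30 = estimate_power_watts(30)
```

In the cloud, the utilization samples would come from whatever monitoring metric is available, and the estimate is only as good as the assumption that the measured machine resembles the rented one.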
We have this nonlinear one, which is supposed to be a bit better, and I know there are people out there who have even better models, but they are not open source. So how do you even measure it? Most of the academic papers show that people attach a power meter to the computer, which is something known to everybody who has done home automation or who just wants to know how much [energy] a microwave is really using. It's basically an adapter that you put on your power plug, and it will tell you how much the machine that is connected to it is currently using in terms of watts, or kilowatt hours if you want to have more an idea of energy and not a current power draw. And they also have USB access, they have Bluetooth, so you can easily hook them up in a connected system that can then also run measurement jobs for people. But (and this is still new for some developers, because it's kind of under the hood) there is a technique that is called Intel RAPL, or more like a hardware feature, I would say, not a technique. It is something like a power meter inside of the CPU. It is still more of an estimation calculation, but it's very accurate. Very many papers have already confirmed that it's very accurate to their falsification standards and parameters. What it basically gives you, as a developer, is that you can write Linux code, and there is a function you can trigger, or a hook, and then you will get the energy that the CPU is using. So, you basically say “Hey, I'm going to start here”, and you make a start point, which gives you number A. Then you run a bit of code, and then you ask it again, and then you get number B. Then you have number A and number B and you subtract them, and then you know how much energy has been used between these two points. And what we do for measurement in particular is that we write around these frameworks that already exist, so [there are] external power meters, there are on-board sensors that exist.
There are also techniques like IPMI, which are also internal power meters. So there is this RAPL stuff, and we glue them together in one big open source tool, the Green Metrics Tool, as we call it, that can attach these different sensors. And then we give this out as a fully-fledged solution to developers that already have software, which typically is now written in container form, and developers have already set up their container files, something like a Docker Compose file. And then they can just say “Hey, please take this Docker Compose file”, and, similar to a bash script or like a Linux Easy, “then I want you to run these lines, maybe run this Node program, maybe run the browser, and then you're finished. And I would like you to write down the energy consumed in between, for example every 100 milliseconds or every 50 milliseconds.” And then at the end, you get all the energy nicely displayed in a graph. There are some statistics applied to it. [You can ask yourself:] “Has there really been a change from the last time you tested it to the time you've tested it now?” And to make this even better, we then also offer a service, on the web for free, where we have a measurement cluster with pre-configured machines that apply best practices on how to measure. I can elaborate a bit further on them later, but they do exist. And then you get a better measurement. It doesn't fluctuate as much. It is more reliable. You don't need as many repetitions to get a good, statistically conclusive answer, and see if the code is really different from another piece of code. We try to bring it [the measurement aspect] into a tool so that developers can use it with techniques they already know, like starting and stopping containers, or firing up a tool on the command line. And then, with the onboard mechanisms that already exist, like Intel RAPL, or using machine learning models through CPU utilization, they can already get a metric out, so they don't have to be measurement professionals. They just need to know how to use a Linux tool.
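The "number A, number B, subtract" pattern can be sketched as follows, assuming a Linux machine that exposes RAPL through the kernel's powercap sysfs interface. The exact directory can differ per machine, reading it often requires elevated privileges, and the helper names here are hypothetical. This is not how the Green Metrics Tool is implemented; it only shows the start/stop subtraction, including the case where the hardware counter wraps around between the two reads.

```python
from pathlib import Path

# Assumed location of the CPU package 0 RAPL zone; it may differ on your machine.
RAPL_DIR = Path("/sys/class/powercap/intel-rapl:0")


def read_counter_uj(name="energy_uj"):
    """Read a RAPL counter file (values are in microjoules)."""
    return int((RAPL_DIR / name).read_text())


def energy_delta_uj(start_uj, end_uj, max_range_uj):
    """Energy used between two counter reads, allowing for one wraparound."""
    if end_uj >= start_uj:
        return end_uj - start_uj
    # The counter overflowed and restarted from zero between the reads.
    return (max_range_uj - start_uj) + end_uj


def measure_joules(workload):
    """Run `workload()` and return the CPU package energy it consumed, in joules."""
    max_range = read_counter_uj("max_energy_range_uj")
    start = read_counter_uj()   # number A
    workload()
    end = read_counter_uj()     # number B
    return energy_delta_uj(start, end, max_range) / 1_000_000
```

A call such as `measure_joules(lambda: sum(range(10_000_000)))` would then report the package energy spent during that loop. The counter covers the whole CPU package, so anything else running on the machine at the same time is included in the number.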
[00:41:44] Gaël: And this is where you can start comparing, I guess, or challenging the use of this library against another, and all that stuff that you mentioned earlier.
[00:41:54] Arne: Yes, exactly. The way to go would be that you have a Docker Compose file, and then, let's say, one time you use NPM as a package manager to install everything, and you want to see if it goes faster or uses less energy. And then you use PNPM, or you use a different one. I think Yarn is also a package manager. And you can see if this library or tooling swap will change anything in your build process or your program.
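To make the question "has there really been a change?" concrete, here is a minimal sketch that compares repeated energy measurements of two variants using Welch's t-statistic. The readings are made-up numbers, the function name is hypothetical, and this is one common way to compare two samples rather than a description of the statistics the Green Metrics Tool applies.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t-statistic for two independent samples with unequal variances."""
    if len(sample_a) < 2 or len(sample_b) < 2:
        raise ValueError("each sample needs at least two measurements")
    standard_error = sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    return (mean(sample_a) - mean(sample_b)) / standard_error


# Hypothetical energy readings in joules from four repeated installs per tool.
tool_a_runs = [10.0, 12.0, 11.0, 13.0]
tool_b_runs = [8.0, 9.0, 7.0, 8.0]

t_stat = welch_t(tool_a_runs, tool_b_runs)
saved_joules = mean(tool_a_runs) - mean(tool_b_runs)
```

The larger the absolute t value, the less likely the difference is just measurement noise. With noisy machines you need more repetitions to reach the same confidence, which is why a stable measurement setup reduces the number of runs required.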
[00:42:22] Gaël: Arne, you mentioned best practices. And I know that this is something that is very close to Anne's heart. Could you Anne, maybe tell us a bit more about these best practices, and Arne if you don't mind, you might want to comment on it.
[00:42:38] Anne: Well, when Arne was talking there, the thing that immediately hit me, and I thought it quite interesting, is that in the old days, you know, it's worth thinking about, the reason why we did all of this stuff was performance. It was like, with the machines, you had to wring every millisecond of performance you could out of systems. We didn't use to measure energy use, we used to measure performance: time, how long every operation took. And that's a fairly good proxy for energy use, how long things take, how performant stuff is. But I was thinking about it when Arne was talking, and the trouble with it is, it's very custom if you instrument your code to say, “I assume when this message comes in here, and then this message leaves here, [I know] how long that is, and if it's less (than previously thought)”. Because how long things take is often about how many CPU cycles it's gone through. And then how many CPU cycles it goes through is basically how much energy you're using. There's a good correlation between performance and being green, which is why a lot of these kinds of high-tuning techniques are still used in networking, where performance is absolutely key; you've got that [indicator]. But the trouble with that is, it's very specific. It's very custom. You have to know what an application is doing. You have to know which messages are going through, and know where to put your instrumentation in. Whereas if you're just measuring the energy use of a whole system, that's more generic. Therefore, you can have tools that are generally more usable by everyone, rather than doing things that are very, very specific and custom. So, I assume that's the reason why we've moved over from using performance as the key way that you measure energy use, to actual energy use, because it is more generic and therefore more widely applicable. But would you say that was true?
[00:44:48] Arne: Well, I think you're absolutely right in what you're saying. And if I speak to more seasoned engineers, then they often ask questions like “Do we really need green coding? I mean, we have performance optimization. So where is the knob to tune if I don't take the classical performance techniques?” And I think you mentioned some of the green coding techniques already. I think they are unique to green coding, like time-shifting in particular: it doesn't gain you any performance, right? It only saves you energy that isn't green, or saves you carbon emissions in particular. However, how we see it is similar to how you [Anne] said it. If you think about green coding, and energy is now so widely available through many sensors, why not make it the first-order metric? Because this is actually what you care about, right? You don't want to save on performance at the moment. Or at least this is our mission. If you really want to save on energy, why not take it, even if it's strongly aligned, or strongly collinear, with performance metrics in particular? When these metrics are not aligning, there is typically something a bit wrong with your code in general. There are energy anomalies, where you see that, maybe, performance goes up or goes down, but the energy budget goes in a different direction, which could be down to misconfigurations, for instance. You could have something like a vector instruction unit in the CPU, nowadays called AVX. It was called MMX or SSE before, to help get some gamers in the loop that might have heard these acronyms. They can be turned on, and then the CPU is using more energy. But actually, it's not doing anything, because it's currently not issuing any of these instructions, and this is typically a misconfiguration. Something turned the unit on, and then it's using more energy when it's not needed.
And so, it could be that you have your hard drive misconfigured: it's spinning all the time, and so your disk is not going into a sleep mode or a pause mode, where it can stop spinning the disks, even though you are not using the hard drive. This is also where discussions about idle time come into play. So, your performance metrics could be perfect, but still, the machine is on. So a green coding technique, a classic one, and this is also what our tool shows, is to ask: if your code is doing nothing at the moment, does it really have to be on? Maybe it is an architectural decision here, where you say, maybe we move from a super highly coupled, highly integrated, vertically-only scalable monolith to something like a microservice architecture that we can actually turn off between requests, because we see more pauses than we really see activity, so the node doesn't have to be on all the time. Why not use the energy or the carbon metric as your first-order metric? And then, if you lay hands on the stuff, you [still] tune the performance metrics, but the measurement that you want to optimize against is the one that is actually following the goal that you want to achieve.
Anne: That's a great point.
[00:47:49] Gaël: Yeah, I do agree. Especially when we know that we will, more and more, as you mentioned Arne, have to take into consideration embedded carbon and full life cycle carbon etc. And that maybe, at some point, as you say, it will be environmental metrics and not just carbon. Because we have other environmental impacts that we do need to take care of. And this is really a question of which machine shall I use. And sometimes using less powerful machines, older machines, is also a way to save carbon. But that opens a completely different debate.
[00:48:25] Anne: It is a different debate, but it is worth reminding that there are three ways that the tech industry has to improve things. It's not just code efficiency. It's not just ‘be energy efficient’. It's also about being hardware efficient because hardware embodies one heck of a lot of carbon. And time-shifting. Those are the three things and we have to do all of them. We can't just do one of them.
[00:48:50] Gaël: Yes, I know. Recently I was preparing for a conference, and I just found again this amazing interview that Gerry McGovern did with Melvin Vopson. And I know this is a theoretical work, just to raise the alarm, but Melvin Vopson estimated the amount of mining that will need to occur to build the servers to handle the 25% growth rate in data on a yearly basis that we have today. So, plus 25% data equals that amount [number] of servers to be built just to manage it all. And he discovered that in 2053, humanity will have to mine the equivalent of Mount Everest. So that's 175 billion tons, I think, just to build servers, just to handle the data - we're only talking about the data! And of course, then we can say we will have energy efficiency gains, but the scale is still so amazing that it is something that we will have to pay attention to in the very near future. I know that at the moment we are focusing a lot on energy and immediate carbon emissions because of the electricity (consumption). But the embedded carbon is the next big battle, and actually, I truly believe that it will be the main battle at some point.
[00:50:23] Anne: Yes, and not just in data centers. Every time I have to throw away or give away or do something with a working device, like a phone or a laptop, because it's out of support, out of security patch support, but it (still) works! You know, there's just so much embodied or embedded carbon in that device. It's immoral, basically, for us to know ((that there is so much embedded carbon and)) to give up on providing security patches.
[00:50:52] Gaël: Yes, that's true that we need to remember that, those end user devices. They account for three quarters of the entire environmental footprint during the ‘building’ phase, mining, manufacturing, transport, etc. So, of course, as professionals in tech, we focus on what we can do, which is mostly data centers and networks. But that's also true that when you talk with a designer, for instance, they are more and more aware of the tradeoff between “Do I want to enhance my code, even to do green coding, versus, how do I make sure I actually reduce the size of my code and not create extra complexities that will accelerate software obsolescence and hardware obsolescence?” But that's a very important battle as well. Can I ask you a final question on best practices? And I know Anne that there is quite a lot of good and sound advice in your book, and Arne, you already touched upon them a bit and if you want to comment, just feel free. But maybe, Anne, as one of the three authors of the next O'Reilly book, what are the best pages?
[00:52:01] Anne: Well, the introduction summarizes everything in the whole book, and that's already available in very rough, pre-release form on the O'Reilly website. And you don't even have to buy an O'Reilly subscription, because you can just do a trial and you can have a quick read of it. And eventually, when it's finally published, the whole book will also be simultaneously public and open source. But not until it actually is finally published next year. All of that will be available. So, principles - this is a horrible thing, nobody wants to hear this, no technology person wants to hear this, but really, the best practice is: don't focus on optimizing your own code. Instead, use code that's pre-optimized by somebody else, because that is by far the most effective thing you can do. In the long run ChatGPT is going to become much better at optimizing; compilers are getting much, much better at optimizing code. You try and push that job off on to somebody else. But do be thinking about architectures and designs that will work with time-shifting. Things like spot instances, and microservices where you can turn things off, as Arne mentioned, or you can time-shift them. Think time-shifting first is my advice.
[00:53:21] Gaël: I guess, because you're influenced by your science fiction work and you want to travel across time, and this is why you're so obsessed by time-shifting! But it is finally happening. What about you Arne, do you want to travel in time again?
[00:53:41] Arne: Yes, as I said before, I also think that time-shifting will be one of the bigger gains in the future. And embodied carbon is one of the bigger battles to fight, although there, I don't really know how the optimizations will play out because it's so opaque at the moment, as most people don't even know how, for instance, S3 is implemented, and on what kind of hard disk. So it's very hard to say how optimizations could even work for a system like this, which stores, I think, most of the data that the Internet holds at the moment. I think my take on optimization techniques is very simple. We speak a lot about these, as I mentioned before, particular vector instruction techniques, and these energy and performance metric anomalies, and we speak about them because developers like to hear these super funny edge cases where something goes horribly wrong. But I think for daily business, if you really want to save energy in your code, most developers know how to do it. So, there's really nothing you have to tell the developers to do. It's more that they are overwhelmed, as business is not giving them the time and the support to do it. I would really say that the particular key [issue] at the moment is transparency. Wherever you can measure your stuff, even if it's not the best metric, make it public, whether it's on your own blog, or in the GitHub repository, or even if it's just in your notebook, so that you at least know what your code is doing. And then the other thing is to ask your management: how much is our code emitting? Can you [management] not supply these numbers? “Ask the cloud providers” is also something that Anne mentioned, which I think will drive a lot of the transition - you have to ask for these metrics. For instance, if I go to the supermarket and I always buy a product, and I'm always angry that it's not packaged in recyclable paper, yet I never ask the vendor [about the packaging], how can something change?
There is no mind transition [reading]. I don't know what the English term is for it, when my mind goes into his mind, and so he obviously knows that I'm happy (or not) with the product. I have to ask for it [change]. So, I think this is really the key, along with techniques like time-shifting. And I really have to say, and maybe this is a bit of belly rubbing for you, Gaël, we should listen to podcasts like Green I/O because you will hear about new techniques that developers find, that are useful and that should be employed.
[00:56:01] Gaël: Well, thanks. And that's a beautiful transition to my last question, which is: what are the main resources you would advise the developer community to go for when trying to green their code? But Anne, you cannot mention Environment Variables because I'm going to do it first and give big kudos to Chris Adams and Asim Hussain and the wonderful work they do with regular guests like you. So Environment Variables is definitely a podcast to listen to, and I personally listen to pretty much every episode. I've taken this example, so you need to find another one!
[00:56:38] Anne: Well, of course I'm going to mention my book, “Building Green Software”. And I have good reason for mentioning this, because we are publishing it every month. Ideally, hopefully, we'll be dropping a new rough early chapter, and we're looking for feedback. So, contact me at: buildinggreensoftware@gmail.com, and you'll be able to send us feedback on what you'd like to see in the book that has not already been covered. So, if there are questions whilst you are reading it, contact me - I'm on Twitter, Sara's on Twitter, Sarah's on LinkedIn. We're very happy to hear you come back and say “But I wanted you to answer this question”. We will attempt to answer the questions.
[00:57:24] Arne: I'll also pick up the question. So, I'm monitoring what's out there a lot. I have Google alerts that alert me about new stuff coming out. I read the Green Software Foundation newsletter. I read the Climate Action Tech newsletter. I'm a follower of this podcast. But I would say that there is no one [single] resource. I think this is what you are shooting for, Gaël? So [for me] there is no one central place where you can find all the best information. But if I have to name something that I think has given me the most value so far, with the most helpful techniques, it is conferences. I think if people get a conference talk in somewhere where you have a sustainability track, [it gives you] something that is a bit bigger, something to be watched. I think if you just want to follow one resource in particular, get an alert for something like sustainability conferences or sustainability tracks at IT conferences. I think it's there that I've seen the most valuable content.
[00:58:27] Gaël: Well, that's music to my ears, knowing that I will be in charge of the sustainability track both at ‘Apidays’ in London and in Paris later this year. I've got a big blessing from you. Thanks a lot. But yes, that's so true, conferences, they're cool. I mean, you can interact, you can discuss with your peers, and that changes everything, I guess, from being just a passive listener. And, no, I didn't aim for a single source of truth. I'm always a bit dubious about these approaches, but that's great, actually, that you mentioned conferences, because we tend to mention articles, podcasts, etcetera. So yes, conferences and the big fight made by the Green Software Foundation. I mean, they've got a speaker repository now, and I know that their approach is that no conference in tech today can do without a sustainability track, or at least some talks on sustainability. And I think that's a great approach. And they've gathered people from all over the world, saying, “Hey, these folks ((are good speakers))”, and as far as I remember, both of you are in this cohort of speakers in IT/tech/sustainability, and I am as well (full disclosure!). But these folks, they can talk; if you cannot find anyone, then just connect with them. You cannot have a big conference without someone talking about carbon, sustainability and so on, so it definitely makes a lot of sense. Well, thanks a lot, both of you. That was a very lively discussion. I really enjoyed it, and letting you converse with each other. That really was music to my ears. So, I'd like to thank you once again for all the feedback and insights that you have shared with us today. Once again, thanks a lot.
[01:00:16] Arne: Thank you, Gaël. It was great to be on the show.
[01:00:18] Anne: Thank you very much.
[01:00:19] Gaël: And that's it. Thank you for listening to Green IO. Make sure to subscribe to the mailing list to stay up to date on new episodes. If you enjoyed this one, feel free to share it on social media or with any friends or colleagues who could benefit from it. As a nonprofit podcast, we rely on you to spread the word. Last but not least, if you know someone who would make a great guest, please send them my way, so that we can make our digital world greener, one byte at a time.
❤️ Never miss an episode! Hit the subscribe button on the player above and follow us the way you like.
📧 Our Green IO monthly newsletter is also a good way to be notified, as well as getting carefully curated news on digital sustainability packed with exclusive Green IO contents.