SheCanCode's Spilling The T
Building Bridges with Data: Dr. Janette Müller-Lehmann's Journey
From decoding Wikipedia’s social networks to orchestrating a company-wide data warehouse at SoundCloud, Dr. Janette Müller-Lehmann has spent her career making sense of the complex world of data. In this episode, we delve into Janette’s journey as a trailblazing data engineer, her principles for keeping things simple, structured, and documented, and the lessons she’s learned from navigating the challenges of big data.
Beyond her professional brilliance, Janette shares insights on balancing a global career with family life, the art of time management, and how even a PhD feels like a breeze compared to parenting two young kids. Tune in for an inspiring conversation about data, life, and finding harmony in the chaos.
SheCanCode is a collaborative community of women in tech working together to tackle the tech gender gap.
Join our community to find a supportive network, opportunities, guidance and jobs, so you can excel in your tech career.
Speaker 1:Hello everyone. Thank you for tuning in again. I am Kayleigh Bateman, the Managing Director Community and Partnerships at SheCanCode, and today we are discussing building bridges with data. I've got the amazing Dr. Janette Müller-Lehmann from SoundCloud with me today. She's here to discuss her journey as a trailblazing data engineer, life in general and finding harmony in the chaos. Welcome, Janette. Thank you so much for joining us.
Speaker 2:Hello, thanks so much for inviting me.
Speaker 1:It's a pleasure to have you on. Thank you so much for taking time out of your day to come and have a chat on Spilling The T. Can we kick off with a bit of background about you? Did you fall into tech? How did you land at SoundCloud? A bit about you to set the scene for our listeners, if that's okay.
Speaker 2:Yeah, for sure. So I was always more of a mathematical, programming person. I had my first computer already in the 90s and I loved to kind of hack my way through it. I was never good at languages, only at programming languages; that's how I always put it. So it was quite clear for me after graduation that I wanted to study computer science. Also, back then there was not this huge variety of possibilities to study computers.
Speaker 2:So I moved to Berlin, studied computer science and then came upon this new field, data, starting with Wikipedia. Already at the university we had a kind of small research group doing analysis around Wikipedia: how the communities work there, how articles are structured and linked together. It was quite exciting. A long time ago we didn't have the right infrastructure, so we were really relying on our local computers, for instance, to compute the networks. It took us days. We pushed a button and waited 48 hours to get results.
Speaker 1:Wow.
Speaker 2:After I graduated, my first mentor, Claudia, who was also leading the research group, told me I should continue researching or do something in this direction. So she connected me with a research institute in Turin in Italy, because I also wanted to go abroad and improve my natural language skills. In theory it was about improving my English skills, but I ended up in Italy, and I also love Italy a lot. There I worked as a researcher for one more year, also on social science. It was very interesting because most of them worked on complex networks, which relates again to Wikipedia, which is also a complex network, but they also already looked at influencers in social networks.
Speaker 2:So Twitter was already starting to be a thing back then, and this was also very exciting. My boss there then kind of, I would say, kicked me out of his institute by telling me, "You should do a PhD, but I will connect you to another person who can do that with you." So that's how I ended up in Barcelona doing my PhD with Yahoo Research, but connected with the university there, studying Yahoo networks but also, in general, network or internet traffic and how users engage with certain websites. Apps were not a thing back then; they weren't big enough. So I really looked at how users navigate through the internet via the Yahoo browser, how much they engage with certain websites, and how it differs.
Speaker 1:Yeah, amazing. I mean, you've had quite the career so far. I was interested that where you've worked and what you've studied is all very STEM-based. You started with maths, and you really enjoyed maths. Was there just something about maths? People tend to say they're either maths or words. When you were younger, was there a teacher or someone who said maybe STEM subjects might be good for you to move into?
Speaker 2:It was also based on my interest. I always loved numbers more, and I always had problems understanding language, because it cannot be written in math. So even early on at high school I was already interested in math and working with numbers, and I wasn't even sure whether I should study computer science or math. But then I just checked: okay, what kind of possibilities do I have, what kind of work life could I have with math versus computer science? And then somehow computer science excited me more.
Speaker 1:Yeah, well, that's good to hear that you were excited by the subject. But yeah, we tend to find that people think of numbers or words as one or the other; maths is very logical, and lots of people prefer that. So, obviously, you've described that you've had an incredible career trajectory so far, from studying Wikipedia's social networks to leading data teams at SoundCloud, which you do now. What inspired your transition from academia to the tech industry, and how did those early experiences shape your approach to data engineering?
Speaker 2:Yeah. So when I finished my PhD I wasn't sure: should I stay in research and academia, or should I try out industry work, kind of for the first time in my life? I had worked in industry before, during university, but that was different, and I was really going back and forth. Back then I was already with my now husband. We had a long-distance relationship, so it was clear that after the PhD we wanted to move in together, and the question was where.
Speaker 2:Research and academia in Germany is quite challenging; it's hard even to find a permanent position. So I thought, okay, if I did research I would need to travel a lot, or we would maybe relocate to the US, and I also wanted to go back to Berlin, to my home. So it was back and forth and I really struggled with the decision. And then at a certain point I thought, okay, I will just try it out: I'll go to industry, I will search for a job. If it doesn't work out, I can always go back somehow.
Speaker 2:So I moved back to Berlin, took some time off, half a year, and then I reactivated my academic network to search for companies that would fit me. Of course, you can always find a job in a bank, but from what I've heard, and a lot of people working in banks have told me this, it can be quite boring. I wanted to have exciting data to work with, and I was also searching for a company that is social and where people enjoy working. Then various people recommended SoundCloud, and that's how I applied and got my first job at over 30 years old.
Speaker 1:Gosh, I love that you said exciting data to work with, because you're so right. We have ladies in our community who work in data who said that, when they came into the industry, there was that myth that data is all very dry and very boring. It's about finding the right company and finding something that you really love, because there is that misconception that data is all very boring.
Speaker 2:It's not.
Speaker 1:Finding the right company.
Speaker 2:Yeah, I've worked with various data sets and somehow I always find it exciting. I'm also interested in solving puzzles, and data are always like puzzles. For me, SoundCloud was also a good fit because of the size of the data. I also know a lot of data analysts and data scientists working in smaller companies, and that would be boring for me because you don't have the challenge of how to iterate on big data. SoundCloud has huge volumes of data, which is exciting, but it's still small enough to make a difference in the short term. As for working in bigger companies: I worked for Yahoo, and it was also a research area. We were never really connected to the business, and it was also more difficult to get things moving and bring about change, because there's more bureaucracy, more teams, more everything.
Speaker 1:Yeah, and that makes such a difference to your day, not being a part of that and feeling like you're really making an impact internally. Definitely. And on that note, I know you're passionate about increasing data literacy across organizations. So what does that actually mean to you, and what strategies have you found most effective for fostering it in a company as dynamic as SoundCloud?
Speaker 2:So, in general, and also at SoundCloud: I've been there now for over nine years, and I started as what we call an IC, an individual contributor, first as an analyst and a scientist. Then at a certain point I moved to become a manager, but I kind of regretted this decision. So I'm back as an IC, but as a leader. I'm not leading teams; I'm helping teams to empower data. I do not have a team under me, but I have resources through the org that I'm in, and I'm kind of connecting the dots.
Speaker 2:What I've seen is that, for me, data literacy means people have to understand the data all over the place. No matter whether it's business, product marketing or an engineer, everyone should understand the data, no matter the company we are in. Even in a bank, I think, you need to understand the numbers, which, for me, are data. And then, on the different levels, you need to be able to use the data with different tools that a data org, for instance, has to provide. In smaller companies it might just be a sheet, and you pull some statistics out of it. In bigger companies, you go bigger. For me, for instance, when I look at the data org, we have different levels of data: very raw, aggregated, and then already business-ready. In all these layers people have to understand the data. There's literacy everywhere. How do I achieve it? Are you thinking about this?
Speaker 1:That's a loaded question. I mean, yeah, to have all of that data so that people outside of your department can understand what is happening, that must be quite a challenge.
Speaker 2:I thought about this question, and what came into my mind initially was a shared terminology around data and also shared definitions. This starts with high-level KPIs, performance KPIs. We all need to understand what they mean or how they are defined, and they should all be measured in the same way. A couple of years ago we had a big problem: for instance, the number of plays or streams or monthly active users that we have on our platform was measured differently and understood differently throughout the company. And then everyone was like, "Why do you have a different number than I do? Can we trust the numbers? What has happened?" So, first, alignment across the whole company around what data mean.
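As an illustration of the shared-definition point above, here is a minimal Python sketch in which a KPI such as monthly active users is defined once and reused by every team. Everything in it is hypothetical: the `PlayEvent` fields, the 30-second threshold and the sample events are assumptions made for the example, not SoundCloud's actual definitions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PlayEvent:
    """Hypothetical event record; the field names are illustrative only."""
    user_id: str
    day: date
    seconds_played: int


# Assumed company-wide rule (not from the interview): a play counts
# only if the user listened for at least this many seconds.
MIN_SECONDS_FOR_PLAY = 30


def is_counted_play(event: PlayEvent) -> bool:
    """The single shared rule for what counts as a play."""
    return event.seconds_played >= MIN_SECONDS_FOR_PLAY


def monthly_active_users(events: list[PlayEvent], year: int, month: int) -> int:
    """Distinct users with at least one counted play in the given month."""
    users = {
        e.user_id
        for e in events
        if e.day.year == year and e.day.month == month and is_counted_play(e)
    }
    return len(users)


events = [
    PlayEvent("u1", date(2024, 5, 2), 45),
    PlayEvent("u1", date(2024, 5, 9), 120),  # same user, counted once
    PlayEvent("u2", date(2024, 5, 3), 5),    # too short to count as a play
    PlayEvent("u3", date(2024, 6, 1), 60),   # different month
]
print(monthly_active_users(events, 2024, 5))  # prints 1
```

If every dashboard and report calls the same function, two teams can no longer arrive at different numbers simply because one of them counted short plays and the other did not.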
Speaker 2:And another thing I experienced back then because of this misalignment: people need to have trust in data. We had an episode, and we still have it in certain units, where people don't trust the data because of misalignment. Things don't make sense. A couple of years ago we even ended up with a situation where it was, "Oh, I don't trust the data that this pipeline produces, so I'm producing yet another pipeline, starting from scratch." Then we got another pipeline, and then nothing matched up, and everyone was like, "I have no trust in anything." If you don't trust the data, you also cannot make decisions based on data, and that's when things went very wrong.
Speaker 2:For me, how to achieve trust in data is looking at, again, shared terminology and how we measure things through data, but also having single sources of truth. We are a big company with big data, and we also have various sources to measure things: backend events, frontend events, production databases. Depending on which data source you plug in, you get a different number. So we started to define sources of truth for data that you should rely on when measuring something. Another thing I would say is important is data quality.
Speaker 2:So, even when you measure or compute something, ensure that what you do is right and the underlying source is correct.
Speaker 1:Yeah, because you're quite a big advocate, aren't you, of just making things simple, of simplicity in your work, and that must be such a challenge in data. Like you said, it must be quite easy for people to lose trust in data very quickly if it's very complex and doesn't align. I suppose that's kind of your thing: you just go in and you think, you know what, we could make this a lot simpler, and this is how we're going to do it. That must be quite satisfying as well.
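The single-source-of-truth and data-quality points from Janette's answer above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the registry, the source names, the metric names and all the numbers are invented for the example and do not describe SoundCloud's real systems.

```python
# Hypothetical registry: each metric maps to the one source allowed to answer it.
SOURCE_OF_TRUTH = {
    "plays": "backend_events",
    "signups": "production_db",
}


def read_metric(metric: str, sources: dict[str, dict[str, int]]) -> int:
    """Return a metric from its designated source, with basic quality checks."""
    if metric not in SOURCE_OF_TRUTH:
        raise KeyError(f"no source of truth defined for {metric!r}")
    source_name = SOURCE_OF_TRUTH[metric]
    value = sources[source_name].get(metric)
    # Minimal data-quality checks: the value must exist and be non-negative.
    if value is None:
        raise ValueError(f"{source_name} has no value for {metric!r}")
    if value < 0:
        raise ValueError(f"{metric!r} is negative in {source_name}: {value}")
    return value


# Made-up numbers: the same metric differs depending on the source you plug in.
sources = {
    "backend_events": {"plays": 1000},
    "frontend_events": {"plays": 940},
    "production_db": {"signups": 25},
}
print(read_metric("plays", sources))  # prints 1000, never the frontend figure
```

Routing every read through one lookup means the frontend and backend figures can still disagree, but only one of them is ever reported as the number of plays.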
Speaker 2:It's again like solving a puzzle. I've seen so many complex data jobs, ETLs and pipelines where I was like, "What are they doing? I don't understand what they're doing." So I go really deep, speaking with the consumers of the data: what do you think this pipeline should do? And then, step by step, I come up with simpler solutions. It's also an iterative process, but I'm always very satisfied when suddenly 2,000 lines of code turn into 50 lines of code that do almost the same thing. But you actually understand what it's doing.
Speaker 1:Yeah, and that's the main thing. I think that's another myth in tech as well, that everything is very complicated, and when you work in tech it can be quite intimidating for people coming in, thinking, am I going to know everything? Actually, a lot of it is your job to communicate things to other people on your team, making things as simple as possible, because there is a misconception about being in the tech industry that you have to be highly technical, and even working with the tech team can be very complicated and at times very confusing. But if you've got people like yourself who are there to translate all of that and to simplify for everybody, then that makes sense.
Speaker 2:Yeah. Related to simplification, what I've observed in the past is that data engineers and data analysts, we all try to plug too many requirements into what I call a data product. Whereas now I'm taking the approach: okay, what are the key findings we always want to find in a data pipeline or data product? Everything else you can query from the raw data once needed, and this helps to keep the product itself simple and reliable.
Speaker 1:Yes, we're sort of always overcomplicating things, aren't we?
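The approach Janette describes, keeping a data product to its key findings and answering everything else from the raw data on demand, could look roughly like the following Python sketch. The event layout, the two key metrics and the sample rows are assumptions chosen for illustration, not an actual pipeline.

```python
from collections import Counter

# Hypothetical raw events as (user_id, track_id, country) tuples.
RAW_EVENTS = [
    ("u1", "t1", "DE"),
    ("u1", "t2", "DE"),
    ("u2", "t1", "IT"),
    ("u3", "t1", "ES"),
]


def build_data_product(raw):
    """The product answers only the agreed key questions, nothing else."""
    return {
        "total_plays": len(raw),
        "unique_listeners": len({user for user, _, _ in raw}),
    }


def plays_by_country(raw):
    """An ad hoc question, answered from raw data only when someone asks."""
    return Counter(country for _, _, country in raw)


product = build_data_product(RAW_EVENTS)
print(product)  # {'total_plays': 4, 'unique_listeners': 3}
print(plays_by_country(RAW_EVENTS)["DE"])  # prints 2
```

The product stays small enough to read and verify, while one-off questions such as plays per country are answered from the raw layer without adding another requirement to the pipeline.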
Speaker 1:We say here sometimes that we do things and we build on things and build on things, and it is almost like expanding and renovating a house, where you think, you know what, I've gone so far and nothing really makes sense anymore because we've bolted on bits. So we just need to scale everything back to the original build and what that looked like. And you are renovating as well, so I wanted to ask you a bit about that.
Speaker 1:You have a demanding career, family life and you're renovating a house as well, which sounds like just a monumental task that you're undertaking. But how do you approach time management, and what advice would you give to others juggling similar responsibilities?
Speaker 2:I very often compare my private life with my work life. It's all about time management and prioritization, and there's always more work you can do. That's how we approach the renovation too. Certain things have already been going on for five years. We are almost done there, but we all say, okay, we could put in more time and effort to get it done earlier, but then we would be so stressed, or something else would fall apart. Let's do it step by step, as long as we feel comfortable. We always try to balance between having enough time for our family, for this house and for our work.
Speaker 1:Yeah, and making sure you do it in the right order as well, especially when renovating. You are right, you can rush, but sometimes you think, oh gosh, I've done that in the wrong order and now we're having to go back over things, or I didn't think that through. You're right, it is time management and juggling that balance of getting things right and not rushing through.
Speaker 2:Yeah, I even use Jira now for managing the renovations.
Speaker 1:Yeah, you know what? I bet the skills that you learn from your private life really do transfer into your work life as well. Renovating a house, project managing that, having to deal with all of the people that come through your house, and juggling the tradesmen and the deliveries and all of those things must really help with your work life. It is like work.
Speaker 2:Yeah, different departments having different expertise. You need to connect them and ensure everything is done in the right order.
Speaker 1:Yeah, and still finding time for family as well. It's a lot to juggle.
Speaker 1:What about beyond the technical aspects? You've already spoken about the importance of clear communication and stakeholder alignment, which is a big thing as well in tech companies, or any companies. But how do you build trust and clarity when collaborating across teams with diverse levels of technical expertise?
Speaker 2:When I'm getting assigned to a new project, I usually try to meet all the stakeholders, or even participants, to understand their roles. In this process I might even just figure out what my role will be. I look at capabilities and skill sets to then make a decision: okay, what makes sense, how can I fit in there, and do we need other skills that are not present yet? This helps a lot. I'm also kind of trying to build bridges, so that matches perfectly.
Speaker 2:Also, when I'm assigned to an existing team, I first try to understand what they're doing and why things are done this way. So I'm not diving in there saying we need to change everything, because certain things are done for a reason. Certain things are done because maybe, again, things were too complex and we just added things on top, and then we can simplify. I think my specialty is then to dive very deep into the topic, also from various angles: compliance, the technical side, terminology. I'm currently working on our content catalog, trying to abstract everything to a higher level, which then helps again to speak to upper management and try to make them understand what works, what doesn't work and why not.
Speaker 1:Yes, especially with terminology. Trying to communicate that to upper management must be quite a skill, because you understand it, but you must also have certain jargon that you use within your department that needs to cross over to other teams and departments. That sounds very challenging.
Speaker 2:that sounds very challenging yeah, I like to work with visualizations or drawings or analogies, um that that on a very high level, in a very level. Or if I find an issue and it's so complex, I'm also trying to visualize it through an example and say, look, when the user is doing ABC, we expect this data to come true, but it's not happening. And this is the impact. I think, especially for upper management, it's also always important to communicate the impact, and even for us, it's always important to understand the impact. When I think about big data, I also think big data do not have to be always perfect. It depends on the use cases. A certain degree of uncertainty is always acceptable. If we still can gain insights, use it for reporting, can rely on the data, um. So I'm always trying to estimate, even sometimes in certain ways. Okay, the impact is very big, medium or low, and if low, we can even still decide to leave the issue as it is yeah, I know a lot of this.
Speaker 1:what you're discussing in this question, they're all soft skills, they're not technical skills, and I think that's sometimes something that people forget when they go into tech, especially things like data. It's going to sound very technical and quite intimidating to some people, but a lot of what you do is actually communication communicating with different teams, people that don't work in your world all the time, and I think it's almost people forget how important those those skills are in tech and that you can't you couldn't do your job without those soft skills yeah, especially when you work in data and to communicate through data, you need to design a story around it and then maybe even sometimes even I have the story in my head and then I look at the data where I can back up the story, because then the story comes, my hypothesis which makes it easier, but I really like it.
Speaker 2:It's something I learned during my PhD, because there, too, communication is very important. When you present at a conference, even if you're in a big room full of data experts, you cannot explain a 10-page paper in 10 minutes in every detail, so you again need to work at an abstract level, and whoever is interested can then go and read the paper.
Speaker 1:Yeah, you know what, it's funny you said that. I spoke to a lady only last week who said her PhD taught her something similar. She said it wasn't just the content of the PhD: "It gave me such confidence with presenting." She said it was so different having to communicate and go through the training and the presentations and what you have to do in a PhD, and that if she took nothing else from it, it really improved her confidence and set her up for her next role. So it's funny you just said that; you took that from your presentations when you went into work.
Speaker 2:Yeah, even the writing. As I said, natural languages are not my strength. My PhD advisor also had to struggle a lot with me, but I learned to write better, in short sentences, and to communicate clearly. Now AI is helping here as well, but the PhD was really valuable from the soft skills side too.
Speaker 1:Yeah, and you have a lot of experience across different industries and different roles. What trends or advancements in data engineering are you most excited about? It's a very fast-paced world in tech, especially in data engineering. And is there any advice that you would give to aspiring data professionals?
Speaker 2:Yeah, I think AI is on the rise, and I'm really excited about it because it will help our lives a lot. We haven't established it yet; we are prototyping and experimenting with different tools. But it would be really nice, because when you think about data literacy, we still have the problem that non-SQL people cannot query data easily. They can look at dashboards, but as soon as they want to dive deeper, you cannot set up a dashboard for each of the questions they have. So for now, analysts usually have to go and write the query, then the next question comes up, then the next query, and it's very time-consuming. Having this sort of AI would actually enable analysts to do the job that they want to do and are supposed to do, and also help everyone who has questions to get them answered way more easily. So that's my big, big hope for AI.
Speaker 1:Yes, and there's also the fear of it taking people's jobs. There's always that fear when something new comes in, and people go, "It's going to take all of our jobs." But everybody I know who works with it goes, "No, it's just saving us time so we can do other things that we've always wanted to do."
Speaker 2:It's like industrialization. A long time ago it was the same, the fear, but we always develop things to be faster and more efficient so we can do other things. Yes, maybe some jobs will disappear, as with industrialization, but you need to find a different opportunity and see what else you can then do in this new world. And so far in data, there's always way more work than we can ever achieve, and I hope this will balance it a little bit.
Speaker 1:Yeah, definitely, I agree. Even the small things that have come in so far have made such a difference to everybody's day-to-day, and it can only keep growing. I can only imagine what that's going to mean for people that work in data. Is there any advice you would give to people who want to go into data? Anything that you wish somebody had told you before you entered that world?
Speaker 2:Yeah, it depends on whether it is someone already having experience or someone fresh from school. Let's say fresh from school. If you're really interested in data and really want to do it, I would now actually pick a university or course of study that fits exactly this field. I learned everything about computer science, and now I'm maybe using 20 percent of it, but I'm missing all the math. I would search for something that has both computer science and math in it, especially statistics, that's tailored to this. I know it was not possible back then, but it is now.
Speaker 1:That's so interesting, and a lot of people say that what you study is sometimes not what you actually end up doing when you land in a role, and you almost wish somebody had helped you. It's difficult in tech because obviously things move so fast, and when you're trying to align what you're studying with your first role, by the time you've graduated everything's changed again. But yeah, you are right.
Speaker 2:Yeah, which is also okay, because at university you learn to learn. That's my attitude. You learn how to approach new things and how to adapt and deal with them. So I kind of learned math after I studied computer science, when I came into data, but it would have made my life way easier if I had learned it earlier.
Speaker 1:Yes, I love that: you learn to learn, because you don't stop learning in the tech industry. It isn't for everybody, and that's the thing. Everything moves so fast, and people like yourself obviously find it exciting that things move that fast and you can keep learning, but it doesn't suit everybody. But yeah, I love that, you just learn to learn. Well, we are already out of time. So, Janette, thank you so much for coming on and sharing your experiences of working in data and working at SoundCloud. It's been an absolute pleasure to pick your brains about data, so thank you so much.
Speaker 2:You're welcome. Thanks so much.
Speaker 1:And for everybody listening, as always, thank you so much for joining us, and we hope to see you again next time.