Stephanie Labou: And recording. Alright, so welcome, everyone. I know people are going to be trickling in over the next minute or so, coming from the previous session, so as we are all entering the room, I want to say thank you so much for joining us on the first day of IASSIST. This is the longest day, because we figured the first day everyone had the most energy. So we are very excited that you are sticking with us, especially if you are in a time zone where it's getting kind of late. Or, if you're one of our Australia or New Zealand colleagues, this is finally not an unrealistically early hour, so we are very excited that you can join us. We are going to have three fantastic talks in this session.
Stephanie Labou: But before we get started, a few logistics, which you're probably used to from the other sessions. Please put your questions for the panelists, as you have them, into the Zoom Q&A, so again, the Q&A from within the Zoom webinar. You can use the chat for kind of non-directed questions. And again, if you have any technical problems, you can send a message to myself or San in the chat, and we will make sure that everything is working fine. So I'm going to go ahead and introduce our first presenter. Our first presenter is Linda Lowry. Linda has been the business and economics librarian at Brock University in St. Catharines, Ontario, Canada, since 1997. She holds a Master of Library Science from the University of Toronto, a Master of Arts in Communication and Technology from the University of Alberta, and a Master of Business Administration from Niagara University. She has been a member of IASSIST for many years, which we're very thankful for, and business data continues to be both a thorn in her side and a motivation for scholarly research.
Stephanie Labou: So this is going to be her third presentation about business data at an IASSIST conference, and we are really excited to learn more. So, Linda, you have the floor.
Linda Lowry: All right, now this is where I see if I actually can share my screen. So there we go.
Stephanie Labou: And thank you. Access.
Linda Lowry: Access, awesome. Hello everyone, and welcome to my talk, "Scaling Up Research Data Services: A Saga of Organizational Design Gone Awry." Academic institutions may initiate organizational redesign in order to better pursue new strategic priorities. In the case of Brock University Library, one of these priorities was active engagement throughout the research lifecycle. The draft organizational design framework proposed the creation of a new unit that would take a holistic lifecycle approach to research, encompassing activities related to research processes and the stewardship of research output. Unfortunately, it also called for the elimination of the current liaison structure: the eight subject librarians would be redeployed to functional roles in other departments. No one was more shocked at this turn of events than me, because in my role as the business and economics liaison librarian, I knew how crucial it was to understand the disciplinary landscape with respect to research practices in order to develop research data services that aligned with researcher needs.
Linda Lowry: I wondered: would the new organizational structure be able to meet the discipline-specific needs of business and economics researchers for data reference, data literacy, and data retrieval assistance, or would this become a saga of organizational design gone awry? I'll present a brief overview of the organizational design process and where research data services will sit in the new structure. I'll follow with insights into the discipline-specific data needs of business and economics researchers, derived from a thesis content analysis and a review of librarian consultation statistics, and finally I'll conclude with just a few recommendations. Brock University is a public comprehensive university located in St. Catharines, Ontario, Canada, with a wide variety of academic programs, including master's-level research degrees in management and business economics. There are 20 librarians and about 30 library staff.
Linda Lowry: The draft organizational framework was unveiled in late 2019. Details of the new functional units, including the research lifecycle unit and a new liaison teams proposal, were fleshed out during internal consultations. Our data librarian took early retirement in December 2020, and the new GIS and data librarian won't start until June 2021. As of May 17, all stakeholder consultations have been completed, and the librarians who are going to be redeployed should be notified soon, myself included. Based in the new research lifecycle department, the GIS and data services librarian will provide a full range of GIS and data services, including consultations, instruction, and outreach in the discovery and use of secondary data sources across a range of disciplines. In the new disciplinary teams model, the library will move from having individual librarians assigned to specific programs to a team-based model, where Brock University's six academic faculties, in pairs, as indicated on my slide, will be assigned to a team of three librarians drawn from the teaching and learning, research lifecycle, and collection services departments.
Linda Lowry: Now, moving on to the evidence. Thesis content analysis is a proven discovery method for gaining insight into disciplinary data needs and practices. In 2015, I explored primary and secondary data use by business graduate students, based on a quantitative content analysis of a corpus of 32 management theses. In 2021, I used the same approach to analyze a new corpus of 57 theses completed in the same program from 2014 until 2020. A summary of my 2015 findings: over 70% of these researchers could be categorized as business data consumers, for whom the discovery and acquisition of secondary data, most often from subscription-based financial and accounting data sources, were crucial real-world activities. Less than 30% were business data producers, collecting only primary data.
Linda Lowry: Now the results from the latest study. Finance is still the largest area of specialization, comprising 42% of theses, down from 47% in the first study. The distribution of theses across the other subject specializations has evened out a little bit more since the first study. 67% of the theses I analyzed collected only secondary data, while 30% collected only primary data, so these results are comparable to the earlier cohort, with a similar 70/30 split. The finance specialization relied exclusively on secondary data sources, as did the majority of theses in accounting and in the operations and information systems specialization. This mirrors the findings from my first study.
Linda Lowry: Secondary data collectors used both open and proprietary data sources. Use of open sources (sorry, my notes are messed up) doubled from the previous analysis, from 8.6% to 17.5% of secondary data sources. So what are the top sources? Five of the most frequently used commercial data sources were subscription-based financial market datasets. The usual suspects: datasets on the WRDS platform like Compustat, CRSP, and ExecuComp; data from Bloomberg terminals; and, of course, our friends at Refinitiv with Datastream. Now, on to the second set of evidence: liaison statistics. Brock's librarians, including our data librarian, record our research consultation activity in a simple shared database.
Linda Lowry: I analyzed these statistics for the 2019 and 2020 calendar years to gain insight into the nature of data-related consultations. A frequency analysis of consultations by topic across all disciplines revealed that around 17%, or about 120 consultations per year, involved the topic of statistics and data searching, so I'm going to drill into those specific consultations next. Close to half of all data consultations came from users in the business school, followed by 31% from users in the social sciences. Oh sorry, this is the patron type one. These consultations are fairly evenly distributed by patron type: 36% coming from faculty, 32% from grad students, and 23% from undergraduates. Now, by subject: close to half were from business and about a third from the social sciences. So who answers data questions? My analysis revealed that, in my capacity as the business and economics librarian, I recorded 68% of the data consultations in our statistics database, while our data librarian, who also had some liaison areas, recorded 26% of the interactions.
Linda Lowry: So this speaks to the data-intensive nature of business and economics consultation activity, as well as to the potential volume of questions which may be referred to the data librarian after the reorganization. Now, drilling down into the courses and programs within the business school, I used Voyant Tools' Cirrus word cloud feature to visualize the top terms noted in our optional field for course and program. Not surprisingly, accounting, finance, and investment courses top this list, but entrepreneurship and marketing also occurred frequently in this notes field.
Linda Lowry: Under comments, the top 25 terms reveal that access to data tops concerns. As the WRDS librarian representative, I have the ability to approve account requests, so WRDS is up there as well. Also, during the pandemic, there was a lot of confusion about how to access sources like Bloomberg terminals or Datastream, which were limited to on campus, so I was fielding a lot of those questions as well. People were looking for industry information; market, company, and financial data are also on my top 10 list. What about the social sciences?
Linda Lowry: Economics was the primary driver of social science data consults, including course-related consults for ECON five to four, which is taken by students in our master's in business economics program, and ECON 3P10, which is the research methods course for undergrad majors, where students have to collect data for regression analysis assignments. Other programs, such as political science, geography, and labour studies, also had data consults. In the comments area, you can see that a lot of users were looking for Canadian data from Statistics Canada, survey and census data, public use microdata files, or help with data repositories. In Ontario we have something called odesi, as well as Wharton Research Data Services.
Linda Lowry: So, some observations. Organizations offering research data services should be aware that certain areas of business and management research, such as economics, financial, and accounting research, each have their own traditions and approaches, as this quote from a standard business research methods textbook notes. Data discovery, particularly of library-licensed data resources, is a common struggle for economics undergraduate and graduate students, as documented in several recent studies. Ithaka S+R's report on teaching business also noted that significant barriers exist for both instructors and students in finding and accessing data, especially industry and financial data. And finally, secondary data reference work can also be difficult and frustrating for librarians, in part due to non-standardized discovery environments.
Linda Lowry: So how can we keep this from all going awry? I have three recommendations. One, integrate discipline-specific data literacy into teaching and learning initiatives by providing guidance and support geared to class assignments in economics, entrepreneurship, marketing, and finance. Two, integrate discipline-specific content and support into research lifecycle initiatives, including support for those crucial data sources like Bloomberg, Datastream, and Wharton Research Data Services. And three, create a business and economics data collection development policy, which we don't have right now, after consultation with disciplinary faculty in those key departments: accounting, finance, the business economics program, and other subfields that rely heavily on commercial secondary sources to do their research. And that wraps up my talk. Thank you very much, in every language possible.
Stephanie Labou: Thank you, Linda. So again, as a reminder, you can put your questions for Linda into the Q&A and then I will moderate them. And as people are thinking, I actually have a question. This tracks similarly to my experience: I'm the data librarian at my institution, but we have a separate business and economics librarian.
Stephanie Labou: And so, just anecdotally, do you have any thoughts on why that split, in terms of you getting the majority of the questions and the data librarian not? Do they go directly to you, or is it somehow shuffled your way some other way?
Linda Lowry: Um, let me just stop sharing my screen.
Stephanie Labou: And then we can start.
Linda Lowry: There. Yeah, so my sense is that, you know, students are coming to me for information that turns out to be data, right? You know, they're an entrepreneur doing a feasibility study or a business plan, and they need information, and a lot of it is statistical in nature.
Linda Lowry: And so when I spoke to students, I said, you know, come to me first, and then if it's something that I need to refer you for, I will. But I feel like I am the exception amongst my peers, and I don't necessarily know that senior admin understands that that is what's happening. So, yeah.
Stephanie Labou: And it's a common problem.
Linda Lowry: I had four reference questions this morning in the email that were statistics, you know, so because we don't have a data librarian right now, we're.
Stephanie Labou: Yeah, and I'll echo what Bob raised in the chat, which is that subject knowledge is really key, which is why having a liaison is so critical. I would be lost for the economic data questions that I get without working closely with our economics and business librarian, because it's all so hyper-specific. Even the terminology, I wouldn't even know where to start. So, interesting to hear about how it's going at another institution as well. Thank you so much. As we're going on, if you think of questions, you can put them into the Q&A box, and we'll probably have a little bit of extra time at the end.
Stephanie Labou: So next up, though, we have Jennie Murack, who's going to be our main presenter, and then we also have Christine Malinowski and Madeline Wrable, who are going to be on hand to answer questions during the Q&A. Jennie is the GIS and data librarian at the MIT Libraries and, quick plug, a co-chair of the IASSIST Geospatial Interest Group. She also provides general reference help and support for social science data, and serves as a liaison to the Department of Urban Studies and Planning. Along with Christine and Madeline, she is part of the citation management and writing tools team and the statistical services team.
Stephanie Labou: Now, Christine is the research data librarian at MIT Libraries, and Madeline is with the MIT Libraries as well, as the GIS specialist. So we've got three fantastic people here, and with that I'm going to let Jennie take it away.
Jennie Murack: Okay, thank you. So our presentation today is "Co-location and Collaboration: How Space Influenced Our Library Data Services." As Stephanie mentioned, I'll be the main presenter, just in the interest of time, but Madeline and Christine will be here to answer any questions in the Q&A. The reason we're talking about this at all is that, unlike some of our other library services, our new physical space was really the catalyst for the new services we started implementing around data, and it doesn't always happen that way in the library. So I'm going to be talking about the space and how its design led to increased collaboration among staff and to new services.
Jennie Murack: We initially proposed this topic when the conference theme was Data by Design, in Sweden, and so we were going to showcase both the physical design and the service design. We have not been in our space since March 2020, but I will talk about how space influenced our services prior to going remote, and we've also added a little section at the end about the virtual space.
Jennie Murack: So, existing services and spaces. My co-authors and I are part of the Department of Specialized Services within the MIT Libraries, and our department focuses on providing data and software-oriented services across all departments at MIT. Prior to moving into our new space, staff were dispersed throughout the architecture and urban planning library and other libraries across the campus. The result was that our team was fairly siloed, without much communication outside of departmental meetings. The main services we had prior to creating this new physical space were our GIS services, data management services, and our citation management and writing tools service.
Jennie Murack: The only physical space held by our department was the GIS lab. This was a small space that only had seven computers. There was little area for group work, because the space is pretty small, and teaching in the space was limited as well. We also didn't have any point of service, with staff offices scattered throughout the library and across campus. Students could use the space on their own, but were unlikely to be able to get help unless they scheduled an appointment or we were there for the designated drop-in help.
Jennie Murack: And the space was focused on GIS consultations, so it wasn't apparent that data management services or citation management services even existed. During 2017 and 2018, we reconfigured the first floor of our architecture and planning library, which was formerly book stacks and microfiche readers. The goals of the project included creating a space where GIS and data management services have a combined presence, increasing the size of the space, allowing for flexibility in the configuration, and increasing the visibility of staff to our community.
Jennie Murack: And this was also meant to be an experimental space, where the design and the services can be modified as needed. We opened the new space in September of 2018, and it ended up being a mix of staff offices, computer lab space, and seating for individuals and groups. We also created an XR space. We had just begun exploring XR technologies, led by Madeline, and this really allowed the program to gain visibility and to expand. We also added places where students can work alone or in groups, and this was actually unique for our library, because the main level of the library and the upper floors are completely open, and so they're more geared towards individual work. So just having this kind of more collaborative space in itself really attracted students who might not otherwise have come into the library. One of the more notable features of the new space was our much expanded and rebranded GIS and Data Lab, which had 16 computers and had AV equipment and monitors, so we could do some teaching in the space.
Jennie Murack: And one of the more unique features is that our staff cubicle offices are positioned throughout the space, making us more visible and accessible to our users. At first, we were concerned about having offices that don't have real walls within a public space, and about getting interrupted, but that actually ended up happening pretty infrequently, and I'll talk a little bit more about that. So our new space greatly impacted our services going forward.
Jennie Murack: The new space really became a place to get centralized help. Within months of opening, the GIS and Data Lab became a space where students felt comfortable coming with any questions related to data, digital resources, or software. We haven't had reference desks in our libraries for some time now, probably 10 years, and so students would stop by during designated drop-in help hours to ask questions on a variety of specialized topics, in addition to GIS and data management. As I just said, we had joint GIS and data management drop-in hours, and we often referred users between the services, which also gave visibility to both of the services. The closer proximity of staff offices to the user space further enabled users to interact with us as the experts. Users were able to find staff if they had questions, and we would often informally check on students who were using computers or working, as we were coming and going out of the space. And, as I mentioned before, all of our teams included some staff who primarily report to other departments or have offices in other libraries, and so the lab became a space where they and other staff in the library could schedule a consultation with users, which further increased staff visibility and collaboration among staff in our department.
Jennie Murack: As I mentioned previously, users stopped by to ask questions about topics that were not officially part of our service offerings, and that's really what led us to pilot some new services. Previously, the only statistics help available was through a partnership we have with Harvard, where their data science services staff would provide help to MIT affiliates. That service was more limited, in that it was more geared towards research publications and often towards more experienced users. While we were doing our joint drop-in hours, we found that students were stopping by the lab to ask pretty basic questions about statistical software as part of their introductory courses. My co-authors and I had some experience with statistical software, and so we developed a pilot service aimed at those new users. Similarly, based on questions we received about data visualization, GIS and Data Lab staff, led by Christine, piloted two data preparation and visualization workshops in this space and also expanded the general workshop offerings on this topic.
Jennie Murack: There was increased visibility for existing services as well. Madeline collaborated with one of the MIT labs to offer virtual reality workshops in this space, and also began offering equipment loans and self-service documentation for using the XR space. We supported these new services by adding software and computers related to data analysis and visualization, so we installed things like NVivo, Gephi, OpenRefine, and Stata. And with the rebranding of the space as the GIS and Data Lab, it became kind of a natural place for users to look for this type of software.
Jennie Murack: And so we found that as students and faculty discovered the space, we began receiving requests to use it for class workshops, in addition to the workshops that the library staff were teaching. During the first year we had 20 workshops in the space, and those included workshops taught by library staff but also by faculty and staff from other departments, who reserved it for special workshops so they could use the computers and the software that was on them. That further increased the visibility of our services, having students from classes coming in, which was not possible before. So, like most library services that are new, we did an assessment. We conducted a variety of surveys to learn more about the usage of the space and the opinions of it, and this involved leaving out surveys and also staff walking around at different times to see who was using the space and how they were using it. Overall, the space received positive feedback from users as a place to work collaboratively at tables or computers. Not surprisingly, some of the drawbacks were things associated with having a fully open space, such as noise traveling back and forth and lack of privacy.
Jennie Murack: The computer space layout was set up well for both individual work and group work during instruction sessions, and this is something that other computer spaces the libraries have on campus didn't really provide. Because it was not originally designed as an instruction space, we did have to work with what was there, so some of the drawbacks were pillars holding up the ceiling that obscured sight lines and some issues with noise traveling back and forth. But it was still a much preferable space for instruction, especially with data and software, as opposed to the other spaces available to us. And users liked the availability and visibility of staff. For us, these serendipitous encounters allowed us to learn more about user needs related to data services, and led to all the additional pilots and collaboration that I mentioned. The results of this assessment were actually used for the renovation of our main library, which is just wrapping up now, especially around space and just having staff space, student space, individual space, and teaching space all in one, just to see how that worked. So both the positives and the negatives were taken into consideration for that renovation.
Jennie Murack: And many of the identified challenges that I just mentioned were actually addressed to some extent in the last couple of years, because this was seen as an experimental space, and so luckily our administration was willing to work with us and continue to tweak the space over that time. So now we'll talk about what happened when we started working remotely. Like most libraries, we shifted to fully virtual services during March 2020, and our library locations are currently closed. But we continue to provide all of our services virtually, and that includes our workshops, access to our lab computers, and continuing consultations remotely.
Jennie Murack: We've continued to expand our service offerings and our visibility even though we went virtual, including the ability to allow more people to attend our workshops, because we're not constrained by the size of the space, and the ability to reserve a lab computer for remote usage. The remote computer usage has actually been incredibly popular, as it allows students to complete work using software that can't sit on their own machines or won't run efficiently on their computers. They're able to log in directly to the remote computer; it's not any sort of cloud or virtual service.
Jennie Murack: And this type of service would have been difficult or impossible if we had not developed a more robust lab space prior to working virtually. Even though we don't have joint drop-in help at the moment, we're continuing to bring others from different groups in our department into our virtual consultations as needed. And we've been doing a weekly happy hour among the lab staff in order to maintain our informal collaborations, so even though we do not see each other in our cubicles every day, we get to see each other every week virtually that way.
Jennie Murack: And having experience with in-person collaboration has really led to an easier transition to virtual collaboration, since we were already used to collaborating with each other during our consultations. We further mirrored our centralized physical space by creating a centralized data services page on the library website, which didn't exist prior to this. And so, finally, our community is now more comfortable with working in a virtual environment, so we plan to continue some of the services that were influenced by our virtual space, such as our online workshops and our computer reservations. We may also pilot some online drop-in help to mirror the in-person help we provided. And our physical space might be used in different ways as well once MIT returns to in-person learning and working, and we may develop new in-person services as a result of our virtual time. So, in conclusion, we began using this experimental space without knowing exactly what would happen in the space or how. We learned to be open to anything and to shift and pivot our ways of working as needed.
Jennie Murack: The positive effects of the new space overshadowed the few negatives and led to new opportunities for collaborations and services to better meet the needs of our community. And our experimental mindset led to a fairly easy transition to virtual working, and we were able to continue and even expand upon our services in a virtual environment. And thank you, that's it.
Stephanie Labou: Thank you, Jennie. That was really interesting. We have a, you know, not quite similar, but data and GIS lab thing as well, so it's super interesting to hear about how it's going at other libraries. And we do have a couple of questions.
Stephanie Labou: So a really good question, and I had a similar one when I was listening to the talk, is: do you think it is the visibility, or novelty, or convenience that is drawing the students to this space?
Jennie Murack: I think, and my co-authors can chime in, it's a combination of everything. In addition to it just being kind of a novelty, we worked with our marketing director to do a lot of branding, so on the entrance floor to the library they put, what are they called, sticker things that say, like, "new GIS lab, visit," with the arrows and stuff. So the visibility is definitely part of it as well. I guess that's what I'll say on that. Madeline, Christine, if you want to chime in, feel free.
Christine Malinowski (she/her): I was going to actually say, you know, the floor it's on is actually not the most visible in the library, so the marketing is partially to be like, hey, this exists, it's no longer bound journals, you should come check it out. But I think we sort of started to see a little bit of word of mouth, of like, oh, there's a place where you can sit at tables and talk to people in this library. Because, as Jennie mentioned, there aren't a lot of spaces in this particular library where that is doable without sort of everybody in the library hearing. So I think it became a space where people were like, oh, we can do group work here, and I think that brought in people, and then it was like, oh, they're teaching, and so it sort of just snowballed a little bit that way.
Stephanie Labou: Okay, and then another question. We have time for another question before we move on, and a lot of people are interested in the remote access. Stephanie Labou: So I'm going to phrase this generally: could you talk briefly about how you're going about providing remote access, and, Stephanie Labou: again quickly, any kind of challenges or issues that you ran into and how you managed to get past them, because not every institution was able to spin up remote access. Jennie Murack: Yeah, I thought that might strike a chord with a lot of us as data professionals. So we just use, like, Remote Desktop, and we actually had that set up
Jennie Murack: prior to going remote. We weren't really using it or advertising it to people, but it was on the computers, so we had our IT allow remote access. Jennie Murack: Without getting into too many logistics, we just do it that way, but we basically manually have people email us, and we book them for a certain computer. Jennie Murack: And then we send them instructions and policies and stuff, and they're able to log in, so it's fairly low tech as far as that goes.
Jennie Murack: The main issue, which hasn't happened as recently, has been that sometimes the computers would stop working or we couldn't access them remotely. Jennie Murack: But luckily our IT staff were able to get access to the campus, and so they were actually great about going into the space within like 24 to 48 hours and rebooting stuff or troubleshooting.
Jennie Murack: And so we worked through that and came up with some policies for restarting them periodically to prevent that from happening. That's the gist of it, but anyone can feel free to post on the website or follow up after if you want any more details. Stephanie Labou: Thank you. Yeah, we also did remote access, and it looks like Harrison in the chat is saying that they did as well. Stephanie Labou: We should start a thread about remote access and what did and didn't work, because we ended up using Google Calendar, and let me tell you, that does not work well when you have Stephanie Labou: 100 people trying to make reservations, but it worked well at the beginning. So thank you, that was a really interesting presentation, so thank you to all of you.
Stephanie Labou: So for our final presentation of the day we have Alex Storer and Julie Williamsen, and they're going to talk about Stephanie Labou: the Research Hub. Alex leads the Data, Analytics, and Research Computing team at the Stanford Graduate School of Business, so he consults with researchers on challenges around data storage, cloud infrastructure, Stephanie Labou: all manner of things, and Julie is the assistant dean of the Research Hub. I'm going to keep it brief because they are going to talk a little bit more about their roles as well, so with that I will hand it over to Alex and Julie.
Alex Storer: Great, thank you. Alex Storer: Is my screen visible? Okay, excellent. All right, so diving into it, we're going to tell you today a little bit about Alex Storer: what we do at the Research Hub and, specifically, how we provide cross-functional data services to researchers at the Graduate School of Business at Stanford University. So I'm Alex, I'm joined by Julie, and we're both going to talk and kind of pass it back and forth. Alex Storer: Yeah. Alex Storer: So when people think about the Graduate School of Business at Stanford, I think most of the time they don't think about research. I think that the MBA program is much more widely known. Alex Storer: But there is a very active research community at the GSB. There are PhD students, there are a ton of faculty members, and many of these faculty members are very accomplished. Alex Storer: So the 2020 Nobel laureates in economics, Robert Wilson and Paul Milgrom, are both GSB faculty members, and the research that happens at the GSB is actually
Alex Storer: quite broad, so it spans a lot of the social sciences, in addition to what you might expect at a business school, like we've heard earlier in the session about accounting and finance. Alex Storer: So a lot of the previous research that the GSB is known for is very theoretical, so those Nobel laureates did a lot of economic modeling. Alex Storer: But they didn't use a ton of data, and now we're seeing this major shift across the entire school where data is becoming absolutely crucial to refine and confirm those groundbreaking theories, and the data can come from so many different places. Alex Storer: The diversity of this data is really key: it can be historical data from books or other scanned records, it could be donated from a friendly company, it could be purchased from a data vendor or licensed from that vendor.
Alex Storer: Our researchers also collect data experimentally at the Behavioral Lab, and then, finally, this data can be scraped from the Internet, which is a less conventional data source. All of these together provide a set of challenges that we aim to meet at the Research Hub, so take it away, Julie. Alex Storer: You are muted. Julie Williamsen: Okay, so the Research Hub is a relatively new organization and we are super excited to be more involved in IASSIST. Julie Williamsen: So it's a unique multi-unit organization that provides research support services to the Stanford Graduate School of Business Julie Williamsen: and library services to the GSB and the broader Stanford community. It includes the GSB business library, Julie Williamsen: a data analytics and research computing team, which Alex oversees, and that's data comma analytics comma and research computing; we lovingly call them DARC,
Julie Williamsen: and a human behavior research lab and a convenient programs and operations unit. Our organizational structure helps us take a holistic approach to our services: we consider the research process as a whole for every research project, dataset, or question that comes to us. Julie Williamsen: Let's jump to the next slide. So today we are going to discuss how the services of our DARC team intersect with the GSB business library services to provide a richly interconnected data support program. Julie Williamsen: We are able to provide end-to-end custom data services through interconnected roles, functions, and systems. The team includes research librarians, legal and acquisitions librarians, a data curation manager, and, Julie Williamsen: note, we are hiring, so if you want to come and work with fun and smart people and live in sunny California, come and join us.
Julie Williamsen: We also have data engineers, research computing specialists, and research analytics scientists, so these individuals self-organize to address the unique research inquiries, storage, and compute needs Julie Williamsen: for each dataset and for each researcher. What you don't see pictured here is Julie Williamsen: Piper. Julie Williamsen: So we work closely with Stanford; like this morning, we connected with Ashley Jester (many of you know Ashley) regarding a data security question about a dataset she purchased that one of our doctoral students would like to use. Julie Williamsen: So now let's talk about how these individuals work together when considering datasets in the Research Hub. Julie Williamsen: One of the first questions we ask when considering a dataset is: is it useful for researchers? Research librarians assist
Julie Williamsen: researchers to ensure each dataset is suitable for each research project. They help to identify relevant data sources and data providers for each research project and for classroom use. Julie Williamsen: So this includes helping researchers discover existing library resources, as well as identifying and vetting new sources, comparing variables, date coverage, company coverage, and data collection methodologies. Julie Williamsen: Approximately half of our research data is licensed from non-standard providers or is obtained by our DARC team by scraping or other methods, so librarians and DARC team members regularly consult on these types of data acquisition projects.
Julie Williamsen: So another question the team asks when considering a dataset is: is this dataset licensed for research? Julie Williamsen: Librarians provide assistance for all aspects of data acquisition that support academic research, from negotiating master data agreements to individual data use agreements, to managing vendor payments, to getting the appropriate final signatures from various Stanford offices. Julie Williamsen: The library's data licensing team negotiates contracts with data providers, such as ensuring appropriate publication rights and ensuring the GSB gets the best deal on pricing. Julie Williamsen: Data acquisition agreements can be quite complex, and this is where a lot of the magic happens. We have Julie Williamsen: research and contracts librarians working with data scientists and engineers to ensure each acquisition is in compliance with Stanford policies, including appropriate security and access controls.
Julie Williamsen: So during the contracting phase, contractual agreement support is tied to a foreseeable data use case, data storage, research computing, and security. Julie Williamsen: The teams also coordinate on delivery and continuous support on compliance issues for all aspects of data lifecycle management, from onboarding to offboarding users, Julie Williamsen: resolving issues with providers, and advising users on their rights and obligations, so it all ties back to the researchers' needs. Alex Storer: So, speaking of the researchers' needs, ultimately
Alex Storer: We have to ask if using a data set. Alex Storer: is even feasible, given what a researcher is hoping to do with it, and this is a really big and complicated. Alex Storer: framework for questions that really ties in a lot of the research computing aspect of this. Alex Storer: So, if you think about acquiring the data set, particularly a really large data set often the platform where you're going to be doing that analysis is really important. Alex Storer: So you might have to use a separate platform and it might be not equipped with the software that you want to use or maybe that you have to use to do your analysis, it might be stuck in a sandbox where you can't export all of the data out of that environment.
Alex Storer: And it might be restricted in terms of what systems, you can use it on at all. Alex Storer: So, on top of that, the data from a provider might not be in a very useful format, even if it's licensed for your use, even if the contents of the data are good. Alex Storer: The formatting of the data can make a huge difference, particularly if it's a really big data set that spans a long period of time. Alex Storer: After that you might have to combine that data with other important data sets which can sometimes be prohibited by one or both of the licenses for those data sets plus.
Alex Storer: Researchers are frequently collaborating with people at other institutions, and that's something that's often called out in these license agreements as well. Alex Storer: So I think we've talked about this abstractly a lot, so let's dive into some specific examples. We'll kind of start at the end, with the paper, Alex Storer: to show that it worked, I guess, and then go through some of the details of how we were able to get this data on board. So this paper is about using a donated dataset of microtransactions to evaluate health care spending Alex Storer: as a result of being insured under the Affordable Care Act: before, during, and after, how much do people spend on medical care? So it's an incredibly rich dataset
Alex Storer: that was multiple terabytes, and it came with a lot of challenges that Julie is going to go into. Julie Williamsen: Okay, so the initial acquisition of this dataset: it came from a well-intentioned professor who originally acquired this particular dataset, and he negotiated the license, so we inherited Julie Williamsen: a poorly crafted agreement that required us to do a significant amount of extra due diligence. Julie Williamsen: The contract itself was so poor that the contracts librarian and the professor tried to renegotiate it with the company, but this was not possible. Julie Williamsen: So then we took the agreement to university general counsel, and they advised us on some things that we could do to continue to use these data. Julie Williamsen: And then we brought in the senior dean over the Research Hub to consult on how we could best comply in light of the terms and conditions.
Julie Williamsen: So, the application process: this led us to create an application process for these data. Julie Williamsen: Those who want to use these data first need to discuss their project with a research librarian and the professor who originally acquired these data. Julie Williamsen: And if this sounds like a good project, the next step is for the applicant to actually fill out an application. Julie Williamsen: This application is then sent to faculty who have used or are using these data, to ensure it doesn't overlap with current research projects, and then, finally, our senior associate dean
Julie Williamsen: will sign off if everything looks good. So for these data, the next step for us is Julie Williamsen: tracking users, and the contracts librarian will then consult with the researcher and have them sign an internal data use agreement. Julie Williamsen: And then, after that, we send them to one of our fabulous data engineers, who has made these data available in a few different environments for various query use cases. Alex Storer: Yeah, so these data are pretty big, so we had to actually evaluate some new cloud platforms for this data, keeping in mind what the license agreements were for the data as well as
Alex Storer: what the specific researcher use cases were. As a result of this we've actually iterated through several different platforms, and there are a few different places Alex Storer: where you can work on the data, depending on what sort of cut you're interested in getting. Being able to work with the researchers on that helps us provide a holistic view of helping them achieve their goals. Julie Williamsen: So another interesting dataset that we acquired Julie Williamsen: was a private equity dataset, and these data were initially the basis of a research investigation by Professor Ilya Strebulaev. He found with these data that the valuation methods for unicorn companies Julie Williamsen: primarily determine valuation based on the terms of the last financing round and share values,
Julie Williamsen: and that this does not give an accurate reflection of a unicorn company's value when it IPOs, and that more than half are overvalued. Julie Williamsen: So this has direct implications for employee and shareholder stock ownership and can make it very difficult for employees to know the true value of their stock, which can often be a significant portion of their salary. Julie Williamsen: So, as we look at the matrix of the self-organizing teams who work with this dataset: Julie Williamsen: initially, when we acquired these data, the primary researcher and research librarian determined the most reliable data provider for these types of data, so that was the first step, and then the research librarian worked with the data provider to explain academic research needs. Julie Williamsen: Like we were saying, we work with some non-standard data providers, and this particular data provider had never worked with academic researchers, so we determined the variables that we needed. Julie Williamsen: We couldn't afford everything that we could actually see on the platform, so we were getting a feed; we couldn't get everything we needed from the platform.
Julie Williamsen: But we still thought deeply about what variables we could get that would benefit the greatest number of researchers. So then it came to contracting, and the contracts librarian Julie Williamsen: worked again with the provider to create an academic use contract. Julie Williamsen: And during this we also partnered with the data engineer, and this person assisted with how the feed would be structured, so the storage location, update frequency, etc. Julie Williamsen: For new projects, when a new researcher comes in who wants to use these data, Julie Williamsen: they consult with a research librarian, and if the project is sound, they are required to sign an internal data use agreement with the contracts librarian, and then at that point they are referred to a data engineer who assists with the particular access needs.
Alex Storer: And this is another one of those datasets where the way the company provides the data is not the way the researchers are hoping to use it, so we need to not only Alex Storer: provide access to the data, but actually restructure it and make sure that it's available Alex Storer: both in the original form that was provided and with all of the continuous updates. Then we're able to collaborate with the researchers and make sure that they're able to perform the analyses that they need to do. Alex Storer: Great, so thank you so much for sticking around to the very end and listening to what we do at the Research Hub, and again, if this sounds like a fun, challenging, interesting job for somebody that you know, send them our way. We'd love to talk about data curation. Thanks a lot.
Stephanie Labou: Thank you, Alex and Julie. So we do have a few minutes left, so a reminder that you can put your questions into the Q&A. Stephanie Labou: So those big matrices make my eyes bleed and make me break out in hives just thinking about Stephanie Labou: tackling those, so fantastic job. I have a million questions, and now I would love to see, you know, a "the things you wouldn't believe I've seen" as like a birds of a feather for difficult data access questions, because I think we've all, Stephanie Labou: you know, wondered what it takes to get access to data from some of these non-standard providers, and so seeing
Stephanie Labou: the number of people involved is really helpful in terms of getting a sense of what it takes. So while people are thinking about whether they have a question, I have one. Stephanie Labou: I picked up on something small you said that I'm really curious about: you mentioned that you have staff in the Research Hub who will actually scrape data. Stephanie Labou: So I'm super curious if I heard that right, if they will actually go out and scrape data, because we sometimes get requests from faculty as well, Stephanie Labou: and we draw this line, like, we can kind of teach you how to do it, but you have to go and do it, and also please read the terms of service and don't get us in trouble. Stephanie Labou: So I'm super curious if you have that kind of web scraping for data access as part of the service, or if that was wishful thinking and mishearing on my part.
Alex Storer: That's such a fantastic question. So we've Alex Storer: kind of gone back and forth on Alex Storer: the acceptability of scraping, and I think that there are a lot of legal decisions that are still Alex Storer: forthcoming about how appropriate it is. Alex Storer: So, in terms of the best practice, which is one that we follow fairly often actually, we will get permission from the company to scrape their site. Alex Storer: One of the nice things about being the Stanford Graduate School of Business is that there's a pretty rich alumni network.
Alex Storer: So often there's somebody in that alumni network who you can get in touch with and say, like, hey, are you able to provide us with this data, like, can you Alex Storer: give us a database dump of it? And the answer is always no, but then sometimes the answer after that is, but you're welcome to scrape it. So once we get that okay, then we'll provide that technical assistance, or sometimes outsource it, depending.
Stephanie Labou: Thanks. Julie Williamsen: So, just to be clear, the service is specifically for GSB faculty, and so it's pretty unique that the school offers this and, you know, has a team that supports this kind of work. Stephanie Labou: That's amazing, and I mean, I can only imagine that if it was wider you would be inundated with requests to scrape. I do think I've seen
Stephanie Labou: some, you know, business data that people want, and I kind of introduce them to some of the other tools and say, I hope you like pulling tables out of spreadsheets or PDFs Stephanie Labou: and things like that, so that's really interesting. So, kind of last call for questions from our wonderful attendees; you can also always put Stephanie Labou: additional questions in Hoover for the presenters or send them a message as well, but I know that everyone has been Stephanie Labou: really good sports to sit with us on Zoom for quite a few hours, so thank you all for a great session. I have so many Stephanie Labou: thoughts, I have so many questions about, you know, following up with the MIT library folks on how you get people to come into your space, because we have one, but nobody comes.
Stephanie Labou: And it sounds like there's a lot of interest in remote access as well, so we can continue that conversation on Hoover, and thank you, Alex and Julie, again for that really Alex Storer: enlightening and Stephanie Labou: you know, claps for the amazing amount of work that that is. Stephanie Labou: There's a lot going on, so thank you all. I'm going to go ahead and, you know, stop the recording and then see you all Stephanie Labou: tomorrow, whatever time of day that may be for you, for day two of our wonderful presentations. In the meantime, feel free to use all the chat and community on Huda, and thank you so much to all our panelists.