Earth Science Information Partners Celebrate 25 Years of Collaboration



Allison Mills, Earth Science Information Partners, allisonmills@esipfed.org
Susan Shingledecker, Earth Science Information Partners, susanshingledecker@esipfed.org

Photo 1. Some of the in-person participants at the July 2023 ESIP Meeting. ESIP celebrated its twenty-fifth anniversary in 2023. Founded as a knowledge-sharing space, the nonprofit has grown into a collaborative data hub.
Photo credit: Homer Horowitz/Homer Horowitz Photography

Introduction

In 2023, the Earth Science Information Partners (ESIP) community celebrated 25 years since the nonprofit’s founding. Serving as a home for Earth science data and computing professionals, ESIP has evolved alongside the tools and the vast expansion of Earth science data now available.

Building on the deep roots of collaboration that ground ESIP and honoring the 2023 Year of Open Science, the 2023 July ESIP Meeting’s theme was “Opening Doors to Open Science.” Open science is a collaborative culture enabled by technology that empowers the open sharing of data, information, and knowledge within the scientific community and the wider public to accelerate scientific research and understanding. This definition of open science comes from a 2021 article on the topic published in Earth and Space Science. (To learn more about how open science is being implemented within the context of NASA’s Earth Science Division – see Open Source Science: The NASA Earth Science Perspective, in the September–October 2021 issue of The Earth Observer [Volume 33, Issue 5, pp. 5–9, 11].)

Participants from around the world gathered July 18–21, 2023, in Burlington, VT, to explore this theme. One of the strengths of the ESIP community is how it brings people together from government agencies, academia, and industry to work toward common goals. Altogether, nearly 400 attendees from nearly as many institutions, spanning many technical domains and career stages, gathered for the 4-day meeting, which featured a hybrid format that allowed for both in-person participation and virtual access to all plenaries and breakout sessions. Some of the in-person attendees are shown in Photo 1.

This article begins with a brief section on the history and purpose of ESIP followed by a summary of the highlights from each day of the July 2023 meeting. 

History and Purpose of ESIP

ESIP was created in response to a National Research Council (NRC) review of the Earth Observing System Data and Information System (EOSDIS). (To learn more about EOSDIS, see Earth Science Data Operations: Acquiring, Distributing, and Delivering NASA Data for the Benefit of Society, in the March–April 2017 issue of The Earth Observer [Volume 29, Issue 2, pp. 4–18].) As NASA’s first Earth Observing System (EOS) missions were launching or preparing to launch, the NRC called on NASA to develop a new, distributed structure that would be operated and managed by the Earth science community and would include observation and research, application, and education data.

ESIP began with 24 NASA-funded partners, whose purpose was to experiment with and evolve methods to make Earth science data easy to preserve, locate, access, and use by a broad community encompassing research, education, and commercial interests. NASA adopted a deliberate and incremental approach in developing ESIP by starting with a limited set of prototype projects called ESIPs, representing both the research and applications development communities. These working prototype ESIP projects were joined by nine NASA distributed active archive centers (DAACs) to form the core of what was then known as the Federation of ESIPs and were responsible for creating its governing structures and the collaborative community it is today.

Although it started as a federation of partners connected by a NASA mandate, ESIP has grown into an organization of organizations, and its membership has grown substantially and diversified significantly. Today, there are more than 170 partner organizations – with room to grow. ESIP has held twice-annual meetings without interruption since 1998, and all past meeting material is available online. (To see an example of topics discussed at an early ESIP Federation meeting, see Meeting of the Federation of Earth Science Information Partners in the September–October 2001 issue of The Earth Observer [Volume 13, Issue 5, pp. 19–20, 26].)

ESIP also currently supports about 30 collaboration areas, which include 11 standing committees and numerous smaller clusters, or working groups. These committees and clusters conduct business both during and especially between meetings. ESIP also started the ESIP Lab, a microfunding initiative that supports learning objectives alongside technical skill-building. The establishment of an ESIP Community Fellows program has carved out a stronger foothold for early career professionals, while the Awards and Endorsement programs offer knowledge sharing and recognition at all career stages.

ESIP still brings people together to work on complex Earth science issues – an important task that has not changed in over 25 years – but clearly the world is not the same as it was in 1998 when ESIP was established. This holds true for the hardware, software, remote sensing tools, and computing resources that have changed along with the people and communities who use them. In recognition of this, ESIP has developed new mission and vision statements and a new list of core values. A key moment in the 2023 July ESIP meeting (reported on below) was the unveiling of these new statements, which were then refined during the meeting and voted on by the Board on July 17, 2023 – see ESIP Vision and Mission Statements and Core Values below.


ESIP Vision and Mission Statements and Core Values

Vision. We envision a world where data-driven solutions are a reality for all by making Earth science data actionable by all who need them anytime, anywhere.

Mission. To empower innovative use and stewardship of Earth science data to solve our planet’s greatest challenges.

Core Values. Integrity, inclusiveness, collaboration, openness, and curiosity.


The new vision statement was intentionally worded to acknowledge how much power is at the fingertips of all data users. The new mission statement honors the depth of knowledge that is required to make data-driven decisions. Much like open science itself, there is a productive tension between wanting to make data as easy to use as possible and upholding the rigor of scientific standards.

All ESIP collaborations are open to everyone, whether an individual’s home institution is an ESIP partner or not.

Overview of the 2023 July ESIP Meeting

The 2023 July ESIP Meeting showcased how the attitudes, behaviors, connections, engagement, and responses of people to the natural environment as well as to agricultural and food systems – known as human dimensions – inform the ways the community tackles technical challenges and how important it is to gather, work together, and find inspiration. Summary highlights from the meeting follow – organized by day. All the meeting sessions were recorded and are available publicly through the ESIP YouTube channel. The reader is referred to these recordings to learn more about the topics mentioned here. 

The 2023 July ESIP meeting brought together 366 attendees – including 120 first-time participants. Through 4 plenaries and 44 breakout sessions, more than 100 organizers and speakers addressed the latest updates in Earth science data. Through the lens of open science, the community considered both the impact of the past 25 years of ESIP as well as how to move forward into the next quarter century. 


Opening Doors – and Knocking Down Barriers – to Open Science

Throughout the organization’s history, ESIP meetings have brought together the most innovative thinkers and leaders around Earth observation data, forming a community dedicated to making Earth observations more discoverable, accessible, and useful to researchers, practitioners, policy makers, and the public. Openness is simply how work is done in ESIP.

Many participants are drawn to ESIP’s approach because they find roadblocks to open collaboration and innovation elsewhere. While the ESIP community values the transparency and accountability that are fundamental to open science processes, ESIP participants also recognize the challenges in implementing those practices more broadly.

The 2023 July meeting was an excellent example. The “Opening Doors to Open Science” theme provided a space for participants to talk honestly about the institutional inertia, lack of incentives, and unintended consequences that hinder the open science approach. Often, the barriers are specific to particular domains, organizations, or roles. The ESIP meeting content explored such challenges – and solutions – for researchers, agencies, repositories, data managers, software developers, curriculum designers, and many other groups.

Daniel Segessenman [ESIP Community Fellow] explains his poster at the Research Showcase in Burlington, VT.
Photo credit: Homer Horowitz

DAY ONE

Susan Shingledecker [ESIP—Executive Director] gave the opening remarks and rallied the audience with interactive activities codesigned with Charley Haley [Way Foragers Consulting]. As a collaborative space, ESIP often breaks the norm of lecture-and-listen modes. The discussion and audience-driven talking points helped the community frame the week’s explorations of open science in Earth science data and computing.

Ken Casey [NOAA, National Centers for Environmental Information (NCEI)—Deputy Chief of Data Stewardship and ESIP President 2021–2023] shared ESIP’s new mission, vision, and core values.

Kari Jordan [The Carpentries—Chief Executive Officer (CEO)] addressed the importance of authentic diversity and inclusion as a key function of open science. While she laid out systemic issues and barriers, her presentation focused mostly on action and solutions. She advised the ESIP community to use the organization’s core values and mission to continue opening doors to communities that have been historically left out of Science, Technology, Engineering, and Math (STEM) careers, leadership, and tech development.

The rest of the day was filled with deep dives into many Earth science data and computing topics. Notable highlights include the hands-on, knowledge-sharing sessions led by the ESIP Cloud Computing Cluster, chaired by Aimee Barciauskas [Development Seed]. The sessions were often standing room only. They ranged from kerchunk tutorials to overviews of geospatial packages for the Python programming language to lightning talks where speakers gave walkthroughs of tools used for cloud computing applications (e.g., JupyterHub and GeoZarr, a geospatial extension to the Zarr specification for processing, storing, and manipulating multidimensional arrays, or tensors, on the cloud).

In addition to exploring technical tools, another breakout session motif centered around discussions on engaging stakeholders. One session featured Lesley-Ann Dupigny-Giroux [University of Vermont—State Climatologist], who spoke about climate preparedness for small communities, which was particularly relevant in light of the record-setting flooding that had taken place in Vermont just prior to the meeting. In another session, a team from NASA, including Grace Llewellyn [NASA/Jet Propulsion Laboratory—Software Engineer], Stephanie Schollaert Uz [NASA’s Goddard Space Flight Center (GSFC)—Applied Sciences Manager], and Jennifer Wei [GSFC—Scientist], alongside their collaborators Robert Gradeck [University of Pittsburgh], Mukul Sonwalkar [George Mason University], and Michiaki Tatsubori [IBM Research–Tokyo—Senior Technical Staff Member and Manager], focused on broader collaborations for natural disaster response. Several other sessions focused on specific end users in data centers, repositories, and universities.

DAY TWO

The second day of ESIP’s in-person meetings was nicknamed “Workshop Wednesday.” The day began with the ESIP Lab Plenary, followed by longer, in-depth sessions, and capped with the crowd-favorite Research Showcase Poster and Demo Reception.

Annie Burgess [ESIP—ESIP Lab Director] gave the opening remarks and welcomed Corine Farewell [University of Vermont Innovations] to share her perspective on open science and technology transfer. Many in the research community see the two as fundamentally at odds – which the audience made clear during the question-and-answer session – but Farewell laid out how interactions between open science and technology transfer can open opportunities to tailor licensing and rollouts and to help ensure technology is shared and supported.

Scott Reinhard [New York Times—Graphics Editor] took the stage and showed a room full of data managers, researchers, and program directors just how powerful their work can be with the right color choice and analytical filtering for an audience’s intuitive ease – see Figure. As a data visualization expert, Reinhard laid out his creative process for making award-winning news graphics, built with data from sources such as the Moderate Resolution Imaging Spectroradiometer (MODIS) on NASA’s Terra and Aqua platforms, and from instruments on NASA–U.S. Geological Survey Landsat missions. His advice during the question-and-answer session was that “less is more.” He said sharing data with public audiences should be about meeting their needs with clarity and succinctness, which means removing ancillary data that is often included in more dense, scientific presentations.

Figure. This graphic shows an example of work by Scott Reinhard [New York Times], who uses national and state geospatial data to create data visualizations for broad audiences. This map depicts the Dixie Fire in California in 2021 and is shown in a newsprint layout.
Figure Credit: Scott Reinhard/New York Times

The rest of the day continued with community-led breakout sessions that dove into additional tools like OPeNDAP, Amazon Web Services’ SageMaker, and open data resources in NASA’s Earth Science Division. The day also featured a special plated lunch with presentations from ESIP Award winners. Falkenberg Awardee Angelia Seyfferth [University of Delaware] and Raskin Scholar Alexis Garretson [Tufts University] each shared their domain specialties, with Seyfferth focusing on arsenic uptake in crops and Garretson on the ecology of mouse genomes.

In the afternoon, the ESIP Education Committee led the annual ESIP Teacher’s Workshop. The organizers brought together about a dozen instructors keen to learn more about Earth science data tools for use in their middle and high school classrooms. Every participant was given a solar eclipse kit, including eclipse glasses and lesson plans – see Photo 2.

The evening concluded with the Research Showcase, which featured 47 posters and demonstrations. This is a particularly important event for early career meeting attendees, including the ESIP Community Fellows.

Photo 2. The ESIP Teacher Workshop took participants outside to test the solar eclipse gear they will use in their classrooms.
Photo credit: Homer Horowitz

DAY THREE

While there was no plenary to start the day, breakout sessions continued throughout the morning and late afternoon. Covering topics from artificial intelligence (AI) tools for wildfires to the United Nations Decade of Ocean Science for Sustainable Development (2021–2030), also known as the Ocean Decade, these ESIP sessions spanned the interdisciplinary breadth of the community. While many attendees have different backgrounds and career paths, it is the technical challenges and opportunities that bring everyone together.

A longer scheduled lunch break transitioned to the unconference, a space for on-the-fly and emergent discussions. Organizers pitched their mini-session ideas, the audience voted, then everyone split into discussion groups similar to organized coffee-break hallway chats. ESIP meeting feedback data shows that in-person attendees value time to integrate new knowledge and network; a short unconference has proven to be a productive way to encourage this.

Another key networking opportunity was the FUNding Friday microfunding competition. On Thursday night, participants gathered at a local eatery to ideate, write, and even draw their projects, which would be pitched the next morning.

DAY FOUR

While short, the final day of the ESIP meeting proved to be lively. The morning started with the FUNding Friday pitches and voting followed by the closing plenary and Partner Assembly Business meeting. The day concluded with the final breakout sessions, which highlighted the human and social aspects of implementing open science in an Earth data context. From the process of public comments to AI and large-language models, the breakouts illustrated how entangled human challenges are with technical and environmental ones.

Conclusion

Celebrating the organization’s twenty-fifth anniversary at the 2023 July ESIP Meeting tapped into the community’s deep roots while highlighting how much the gathering has grown and evolved. Over the next 25 years, the Earth sciences and their technology will continue to expand – and so will the user base.

To help make Earth science data and its tools accessible, ESIP is committed to making its meetings as open as possible. All ESIP meeting content is made freely available on the ESIP YouTube channel with no time limit.

In general, the ESIP community is open to all people interested in making Earth science data accessible and actionable. The community gathers twice each year in January and July, but the ESIP Collaboration Areas host monthly gatherings throughout the year. Additionally, the ESIP Lab offers seed funding for pilot projects.

Readers who wish to stay informed on the latest from ESIP, Earth science data community events, jobs, and resources are invited to subscribe to the weekly ESIP Update. The next ESIP meeting will take place in July 2024; watch the ESIP website and social media for more details.


  • Similar Topics

    • By NASA
      Linking Satellite Data and Community Knowledge to Advance Alaskan Snow Science
      Seasonal snow plays a significant role in global water and energy cycles, and billions of people worldwide rely on snowmelt for water resources needs, including water supply, hydropower, agriculture, and more. Monitoring snow water equivalent (SWE) is critical for supporting these applications and for mitigating damages caused by snowmelt flooding, avalanches, and other snow-related disasters. However, our ability to measure SWE remains a challenge, particularly in northern latitudes where in situ SWE observations are sparse and satellite observations are impacted by the boreal forest and environmental conditions. Despite limited in situ SWE measurements, local residents in Arctic and sub-Arctic regions provide a vast and valuable body of place-based knowledge and observations that are essential for understanding snowpack behavior in northern regions.
      As part of a partnership among NASA SnowEx, NASA’s Minority University Research and Education Project (MUREP) for American Indian and Alaska Native STEM (Science, Technology, Engineering, & Mathematics) Engagement (MAIANSE), and the Global Learning & Observations to Benefit the Environment (GLOBE) Program, a team of scientists including NASA intern Julia White (NASA Goddard Space Flight Center, University of Alaska Fairbanks), Carrie Vuyovich (NASA Goddard Space Flight Center), Alicia Joseph (NASA Goddard Space Flight Center), and Christi Buffington (University of Alaska Fairbanks, GLOBE Implementation Office) is studying SWE across Interior Alaska. This project combines satellite-based interferometric synthetic aperture radar (InSAR) data, primarily from the Sentinel-1 satellite, with ground-based observations from the Snow Telemetry (SNOTEL) network and GLOBE. Together, these data sources help the team investigate how SWE varies across the landscape and how it affects local ecosystems and communities. The team is also preparing for future integration of data from NASA’s upcoming NISAR (NASA ISRO Synthetic Aperture Radar) mission, which is expected to enhance SWE retrieval capabilities.
      After a collaborative visit to the classroom of Tammie Kovalenko in November 2024, Delta Junction junior and senior high school students in vocational agriculture (Vo Ag) classes, including members of Future Farmers of America (FFA), began collecting GLOBE data on a snowdrift located just outside their classroom. As the project progressed, students developed their own research questions. One student, Fianna Rooney, took the project even further, presenting research posters at the GLOBE International Virtual Science Symposium (IVSS) and at both the FFA Regional and National Conventions. Her work highlights the growing role of Alaskan youth in science, and how student-led inquiry can enrich both education and research outcomes. (This trip was funded by the NASA Science Activation Program’s Arctic and Earth SIGNs – STEM Integrating GLOBE & NASA – project at the University of Alaska Fairbanks.)
      In February 2025, the team collaborated with Delta Junction Junior High and High School students, along with the Delta Junction Trails Association, to conduct a GLOBE Intensive Observation Period (IOP), “Delta Junction Snowdrifts,” to collect land cover photos, snow depth, and snow water equivalent data. Thanks to aligned interests and research goals at the Alaska Satellite Facility (ASF), the project was further expanded into Spring 2025. Collaborators from ASF and the Alaska Center for Unmanned Aircraft Systems Integration (ACUASI) collected high-resolution airborne data over the snowdrift at the Delta Junction Junior and Senior High School. This complementary dataset helped strengthen connections between satellite observations and ground-based student measurements.
      This effort, led by a NASA intern, scientists, students, and Alaskan community members, highlights the power of collaboration in advancing science and education. Next steps will include collaboration with Native Alaskan communities near Delta Junction, including the Healy Lake Tribe, whose vast, generational knowledge will be of great value to deepening our understanding of Alaskan snow dynamics.
      Learn more about how NASA’s Science Activation program connects NASA science experts, real content, and experiences with community leaders to do science in ways that activate minds and promote deeper understanding of our world and beyond: https://science.nasa.gov/learn/about-science-activation/
      Julia White and a Delta Junction student following GLOBE protocols for snow depth.
      Photo credit: Tori Brannan
      Last Updated: Jul 14, 2025. Editor: NASA Science Editorial Team. Location: Goddard Space Flight Center.
    • By NASA
      This image, taken by NASA’s New Horizons spacecraft on July 14, 2015, is the most accurate natural-color image of Pluto. It results from refined calibration of data gathered by New Horizons’ color Multispectral Visible Imaging Camera (MVIC). The processing creates images that approximate the colors that the human eye would perceive, bringing them closer to “true color” than the images released near the encounter. This single color MVIC scan includes no data added from other New Horizons imagers or instruments. The striking features on Pluto are clearly visible, including the bright expanse of Pluto’s icy, nitrogen-and-methane rich “heart,” Sputnik Planitia.
      Image credit: NASA/Johns Hopkins University Applied Physics Laboratory/Southwest Research Institute/Alex Parker
    • By European Space Agency
      Image: The varied landscape of England’s Lake District is featured in this image captured by the Copernicus Sentinel-2 mission.
    • By NASA
      The TRACERS (Tandem Reconnection and Cusp Electrodynamics Reconnaissance Satellites) mission will help scientists understand an explosive process called magnetic reconnection and its effects in Earth’s atmosphere. Credit: University of Iowa/Andy Kale
      NASA will hold a media teleconference at 11 a.m. EDT on Thursday, July 17, to share information about the agency’s upcoming Tandem Reconnection and Cusp Electrodynamics Reconnaissance Satellites, or TRACERS, mission, which is targeted to launch no earlier than late July.
      The TRACERS mission is a pair of twin satellites that will study how Earth’s magnetic shield — the magnetosphere — protects our planet from the supersonic stream of material from the Sun called solar wind. As they fly pole to pole in a Sun-synchronous orbit, the two TRACERS spacecraft will measure how magnetic explosions send these solar wind particles zooming down into Earth’s atmosphere — and how these explosions shape the space weather that impacts our satellites, technology, and astronauts.
      Also launching on this flight will be three additional NASA-funded payloads. The Athena EPIC (Economical Payload Integration Cost) SmallSat, led by NASA’s Langley Research Center in Hampton, Virginia, is designed to demonstrate an innovative, configurable way to put remote-sensing instruments into orbit faster and more affordably. The Polylingual Experimental Terminal technology demonstration, managed by the agency’s SCaN (Space Communications and Navigation) program, will showcase new technology that empowers missions to roam between communications networks in space, like cell phones roam between providers on Earth. Finally, the Relativistic Electron Atmospheric Loss (REAL) CubeSat, led by Dartmouth College in Hanover, New Hampshire, will use space as a laboratory to understand how high-energy particles within the bands of radiation that surround Earth are naturally scattered into the atmosphere, aiding the development of methods for removing these damaging particles to better protect satellites and the critical ground systems they support.
      Audio of the teleconference will stream live on the agency’s website at:
      nasa.gov/live
      Participants include:
        • Joe Westlake, division director, Heliophysics, NASA Headquarters
        • Kory Priestley, principal investigator, Athena EPIC, NASA Langley
        • Greg Heckler, deputy program manager for capability development, SCaN, NASA Headquarters
        • David Miles, principal investigator for TRACERS, University of Iowa
        • Robyn Millan, REAL principal investigator, Dartmouth College
      To participate in the media teleconference, media must RSVP no later than 10 a.m. on July 17 to Sarah Frazier at: sarah.frazier@nasa.gov. NASA’s media accreditation policy is available online.
      The TRACERS mission will launch on a SpaceX Falcon 9 rocket from Space Launch Complex 4 East at Vandenberg Space Force Base in California.
      This mission is led by David Miles at the University of Iowa with support from the Southwest Research Institute in San Antonio. NASA’s Heliophysics Explorers Program Office at the agency’s Goddard Space Flight Center in Greenbelt, Maryland, manages the mission for the agency’s Heliophysics Division at NASA Headquarters in Washington. The University of Iowa, Southwest Research Institute, University of California, Los Angeles, and University of California, Berkeley, all lead instruments on TRACERS that will study changes in the Earth’s magnetic field and electric field. NASA’s Launch Services Program, based at the agency’s Kennedy Space Center in Florida, manages the Venture-class Acquisition of Dedicated and Rideshare contract.
      To learn more about TRACERS, please visit:
      nasa.gov/tracers
      -end-
      Abbey Interrante / Karen Fox
      Headquarters, Washington
      301-201-0124 / 202-358-1600
      abbey.a.interrante@nasa.gov / karen.c.fox@nasa.gov
      Sarah Frazier
      Goddard Space Flight Center, Greenbelt, Maryland
      202-853-7191
      sarah.frazier@nasa.gov
      Last Updated: Jul 10, 2025. Location: NASA Headquarters.
    • By NASA
      Smarter Searching: NASA AI Makes Science Data Easier to Find
      Image snapshot taken from NASA Worldview of NASA’s Global Precipitation Measurement (GPM) mission on March 15, 2025, showing heavy rain across the southeastern U.S. with an overlay of the GCMD Keyword Recommender for Earth Science, Atmosphere, Precipitation, Droplet Size. Credit: NASA Worldview
      Imagine shopping for a new pair of running shoes online. If each seller described them differently (one calling them “sneakers,” another “trainers,” and someone else “footwear for exercise”), you’d quickly feel lost in a sea of mismatched terminology. Fortunately, most online stores use standardized categories and filters, so you can click through a simple path, Women’s > Shoes > Running Shoes, and quickly find what you need.
      Now, scale that problem to scientific research. Instead of sneakers, think “aerosol optical depth” or “sea surface temperature.” Instead of a handful of retailers, it is thousands of researchers, instruments, and data providers. Without a common language for describing data, finding relevant Earth science datasets would be like trying to locate a needle in a haystack, blindfolded.
      That’s why NASA created the Global Change Master Directory (GCMD), a standardized vocabulary that helps scientists tag their datasets in a consistent and searchable way. But as science evolves, so does the challenge of keeping metadata organized and discoverable. 
      To meet that challenge, NASA’s Office of Data Science and Informatics (ODSI) at the agency’s Marshall Space Flight Center (MSFC) in Huntsville, Alabama, developed the GCMD Keyword Recommender (GKR): a smart tool designed to help data providers and curators assign the right keywords, automatically.
      Smarter Tagging, Accelerated Discovery
      The upgraded GKR model isn’t just a technical improvement; it’s a leap forward in how we organize and access scientific knowledge. By automatically recommending precise, standardized keywords, the model reduces the burden on human curators while ensuring metadata quality remains high. This makes it easier for researchers, students, and the public to find exactly the datasets they need.
      It also sets the stage for broader applications. The techniques used in GKR, like applying focal loss to rare-label classification problems and adapting pre-trained transformers to specialized domains, can benefit fields well beyond Earth science.
      Metadata Matchmaker
      The newly upgraded GKR model tackles a massive challenge in information science known as extreme multi-label classification. That’s a mouthful, but the concept is straightforward: Instead of predicting just one label, the model must choose many, sometimes dozens, from a set of thousands. Each dataset may need to be tagged with multiple, nuanced descriptors pulled from a controlled vocabulary.
      Think of it like trying to identify all the animals in a photograph. If there’s just a dog, it’s easy. But if there’s a dog, a bird, a raccoon hiding behind a bush, and a unicorn that only shows up in 0.1% of your training photos, the task becomes far more difficult. That’s what GKR is up against: tagging complex datasets with precision, even when examples of some keywords are scarce.
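      The multi-label idea above can be sketched in a few lines of Python. This is only an illustration, not the GKR implementation: the four-keyword vocabulary, the raw scores, and the 0.5 threshold are all made-up assumptions, and the real vocabulary has thousands of entries. The point is that each keyword gets its own independent probability, so the model can select several keywords at once or none at all.

```python
import numpy as np

def sigmoid(z):
    """Map raw scores to independent per-keyword probabilities."""
    return 1.0 / (1.0 + np.exp(-z))

def predict_keywords(logits, vocabulary, threshold=0.5):
    """Return every keyword whose probability meets the threshold.

    Unlike single-label classification (pick the one best class),
    each keyword is decided independently of the others.
    """
    probs = sigmoid(np.asarray(logits, dtype=float))
    return [kw for kw, p in zip(vocabulary, probs) if p >= threshold]

# Hypothetical toy vocabulary and scores, for illustration only.
vocabulary = [
    "PRECIPITATION",
    "DROPLET SIZE",
    "SEA SURFACE TEMPERATURE",
    "AEROSOL OPTICAL DEPTH",
]
logits = [2.1, 0.4, -3.0, -1.2]

selected = predict_keywords(logits, vocabulary)
print(selected)
```

      With these made-up scores, the first two keywords clear the threshold and the other two do not. Raising the threshold trades recall for precision, which is one of the choices a curator-facing tool would have to tune.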
      And the problem is only growing. The new version of GKR now considers more than 3,200 keywords, up from about 430 in its earlier iteration. That’s a sevenfold increase in vocabulary complexity, and a major leap in what the model needs to learn and predict.
      To handle this scale, the GKR team didn’t just add more data; they built a more capable model from the ground up. At the heart of the upgrade is INDUS, an advanced language model trained on a staggering 66 billion words drawn from scientific literature across disciplines—Earth science, biological sciences, astronomy, and more.
      NASA ODSI’s GCMD Keyword Recommender AI model automatically tags scientific datasets with the help of INDUS, a large language model trained on NASA scientific publications across the disciplines of astrophysics, biological and physical sciences, Earth science, heliophysics, and planetary science. Credit: NASA
      “We’re at the frontier of cutting-edge artificial intelligence and machine learning for science,” said Sajil Awale, a member of the NASA ODSI AI team at MSFC. “This problem domain is interesting, and challenging, because it’s an extreme classification problem where the model needs to differentiate even very similar keywords/tags based on small variations of context. It’s exciting to see how we have leveraged INDUS to build this GKR model because it is designed and trained for scientific domains. There are opportunities to improve INDUS for future uses.”
      This means that the new GKR isn’t just guessing based on word similarities; it understands the context in which keywords appear. It’s the difference between a model that only knows “precipitation” might relate to weather and one that recognizes when the word refers to a climate variable in satellite data.
      And while the older model was trained on only 2,000 metadata records, the new version had access to a much richer dataset of more than 43,000 records from NASA’s Common Metadata Repository. That increased exposure helps the model make more accurate predictions.
      The Common Metadata Repository is the backend behind the following data search and discovery services:
        • Earthdata Search
        • International Data Network
      Learning to Love Rare Words
      One of the biggest hurdles in a task like this is class imbalance. Some keywords appear frequently; others might show up just a handful of times. Standard training objectives, like the cross-entropy loss that was initially used to train the model, tend to favor the easy, common labels and neglect the rare ones.
      To solve this, NASA’s team turned to focal loss, a strategy that reduces the model’s attention to obvious examples and shifts focus toward the harder, underrepresented cases. 
      The result? A model that performs better across the board, especially on the keywords that matter most to specialists searching for niche datasets.
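      To make the contrast between cross-entropy and focal loss concrete, here is a minimal NumPy sketch. It is not NASA’s training code: the two example predictions are invented, and the focusing parameter gamma = 2.0 is an assumed, commonly used value, since the article does not state the settings the team chose. Focal loss multiplies the cross-entropy term by (1 - p_t)^gamma, where p_t is the probability the model assigns to the correct answer, so confident, easy predictions contribute almost nothing.

```python
import numpy as np

def binary_cross_entropy(probs, targets, eps=1e-12):
    """Per-label cross-entropy: -log(p_t)."""
    probs = np.clip(probs, eps, 1 - eps)
    # p_t is the probability assigned to the true outcome of each label.
    p_t = np.where(targets == 1, probs, 1 - probs)
    return -np.log(p_t)

def focal_loss(probs, targets, gamma=2.0, eps=1e-12):
    """Per-label focal loss: -(1 - p_t)**gamma * log(p_t).

    gamma=2.0 is an assumed default; gamma=0 recovers cross-entropy.
    """
    probs = np.clip(probs, eps, 1 - eps)
    p_t = np.where(targets == 1, probs, 1 - probs)
    return -((1 - p_t) ** gamma) * np.log(p_t)

# Two labels that should both be predicted (target = 1), made up for
# illustration: a common keyword the model already gets right (p = 0.95)
# and a rare keyword it is currently missing (p = 0.10).
probs = np.array([0.95, 0.10])
targets = np.array([1, 1])

ce = binary_cross_entropy(probs, targets)
fl = focal_loss(probs, targets)

print("cross-entropy:", ce)
print("focal loss:   ", fl)
print("hard/easy ratio, cross-entropy:", ce[1] / ce[0])
print("hard/easy ratio, focal loss:   ", fl[1] / fl[0])
```

      Under both losses the missed rare keyword costs more than the easy one, but focal loss widens that gap considerably, which is how it steers training toward the underrepresented cases.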
      From Metadata to Mission
      Ultimately, science depends not only on collecting data, but on making that data usable and discoverable. The updated GKR tool is a quiet but critical part of that mission. By bringing powerful AI to the task of metadata tagging, it helps ensure that the flood of Earth observation data pouring in from satellites and instruments around the globe doesn’t get lost in translation.
      In a world awash with data, tools like GKR help researchers find the signal in the noise and turn information into insight.
      Beyond powering GKR, the INDUS large language model is also enabling innovation across other NASA SMD projects. For example, INDUS supports the Science Discovery Engine by helping automate metadata curation and improving the relevancy ranking of search results. These diverse applications reflect INDUS’s growing role as a foundational AI capability for SMD.
      The INDUS large language model is funded by the Office of the Chief Science Data Officer within NASA’s Science Mission Directorate at NASA Headquarters in Washington. The Office of the Chief Science Data Officer advances scientific discovery through innovative applications and partnerships in data science, advanced analytics, and artificial intelligence.
      Last Updated: Jul 09, 2025
      Related Terms: Science & Research, Artificial Intelligence (AI)