  • Similar Topics

    • By NASA
      6 min read
      Smarter Searching: NASA AI Makes Science Data Easier to Find
      Image snapshot taken from NASA Worldview of NASA’s Global Precipitation Measurement (GPM) mission on March 15, 2025, showing heavy rain across the southeastern U.S. with an overlay of the GCMD Keyword Recommender for Earth Science, Atmosphere, Precipitation, Droplet Size. Credit: NASA Worldview
      Imagine shopping for a new pair of running shoes online. If each seller described them differently, with one calling them “sneakers,” another “trainers,” and someone else “footwear for exercise,” you’d quickly feel lost in a sea of mismatched terminology. Fortunately, most online stores use standardized categories and filters, so you can click through a simple path (Women’s > Shoes > Running Shoes) and quickly find what you need.
      Now, scale that problem to scientific research. Instead of sneakers, think “aerosol optical depth” or “sea surface temperature.” Instead of a handful of retailers, it is thousands of researchers, instruments, and data providers. Without a common language for describing data, finding relevant Earth science datasets would be like trying to locate a needle in a haystack, blindfolded.
      That’s why NASA created the Global Change Master Directory (GCMD), a standardized vocabulary that helps scientists tag their datasets in a consistent and searchable way. But as science evolves, so does the challenge of keeping metadata organized and discoverable. 
      To meet that challenge, NASA’s Office of Data Science and Informatics (ODSI) at the agency’s Marshall Space Flight Center (MSFC) in Huntsville, Alabama, developed the GCMD Keyword Recommender (GKR): a smart tool designed to help data providers and curators assign the right keywords, automatically.
      Smarter Tagging, Accelerated Discovery
      The upgraded GKR model isn’t just a technical improvement; it’s a leap forward in how we organize and access scientific knowledge. By automatically recommending precise, standardized keywords, the model reduces the burden on human curators while ensuring metadata quality remains high. This makes it easier for researchers, students, and the public to find exactly the datasets they need.
      It also sets the stage for broader applications. The techniques used in GKR, like applying focal loss to rare-label classification problems and adapting pre-trained transformers to specialized domains, can benefit fields well beyond Earth science.
      Metadata Matchmaker
      The newly upgraded GKR model tackles a massive challenge in information science known as extreme multi-label classification. That’s a mouthful, but the concept is straightforward: Instead of predicting just one label, the model must choose many, sometimes dozens, from a set of thousands. Each dataset may need to be tagged with multiple, nuanced descriptors pulled from a controlled vocabulary.
      Think of it like trying to identify all the animals in a photograph. If there’s just a dog, it’s easy. But if there’s a dog, a bird, a raccoon hiding behind a bush, and a unicorn that only shows up in 0.1% of your training photos, the task becomes far more difficult. That’s what GKR is up against: tagging complex datasets with precision, even when examples of some keywords are scarce.
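      To make the multi-label idea concrete, here is a minimal sketch in Python. It is not GKR's actual code: the keyword names, the raw scores, and the 0.5 threshold are all illustrative assumptions. The point it shows is that each keyword gets its own independent probability, so any number of keywords can be selected for a single dataset.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_keywords(logits: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Return every keyword whose independent probability clears the threshold.

    Unlike single-label classification, the probabilities are not forced to
    sum to 1, so zero, one, or many keywords may be returned.
    """
    scored = [(keyword, sigmoid(score)) for keyword, score in logits.items()]
    chosen = [(keyword, prob) for keyword, prob in scored if prob >= threshold]
    chosen.sort(key=lambda item: item[1], reverse=True)  # most confident first
    return [keyword for keyword, _ in chosen]


# Hypothetical raw scores for one dataset description (assumed values).
example_logits = {
    "PRECIPITATION": 3.1,
    "DROPLET SIZE": 1.2,
    "SEA SURFACE TEMPERATURE": -2.4,
    "AEROSOL OPTICAL DEPTH": -0.3,
}

selected = predict_keywords(example_logits)
print(selected)
```

      In a real system the scores would come from a trained model and the vocabulary would hold thousands of entries rather than four, but the selection step would follow the same pattern.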
      And the problem is only growing. The new version of GKR now considers more than 3,200 keywords, up from about 430 in its earlier iteration. That’s a sevenfold increase in vocabulary complexity, and a major leap in what the model needs to learn and predict.
      To handle this scale, the GKR team didn’t just add more data; they built a more capable model from the ground up. At the heart of the upgrade is INDUS, an advanced language model trained on a staggering 66 billion words drawn from scientific literature across disciplines—Earth science, biological sciences, astronomy, and more.
      NASA ODSI’s GCMD Keyword Recommender AI model automatically tags scientific datasets with the help of INDUS, a large language model trained on NASA scientific publications across the disciplines of astrophysics, biological and physical sciences, Earth science, heliophysics, and planetary science. Credit: NASA
      “We’re at the frontier of cutting-edge artificial intelligence and machine learning for science,” said Sajil Awale, a member of the NASA ODSI AI team at MSFC. “This problem domain is interesting, and challenging, because it’s an extreme classification problem where the model needs to differentiate even very similar keywords/tags based on small variations of context. It’s exciting to see how we have leveraged INDUS to build this GKR model because it is designed and trained for scientific domains. There are opportunities to improve INDUS for future uses.”
      This means that the new GKR isn’t just guessing based on word similarities; it understands the context in which keywords appear. It’s the difference between a model knowing that “precipitation” might relate to weather versus recognizing when it means a climate variable in satellite data.
      And while the older model was trained on only 2,000 metadata records, the new version had access to a much richer dataset of more than 43,000 records from NASA’s Common Metadata Repository. That increased exposure helps the model make more accurate predictions.
      The Common Metadata Repository is the backend behind the following data search and discovery services:
      Earthdata Search
      International Data Network
      Learning to Love Rare Words
      One of the biggest hurdles in a task like this is class imbalance. Some keywords appear frequently; others might show up just a handful of times. Traditional machine learning approaches, like cross-entropy loss, which was used initially to train the model, tend to favor the easy, common labels, and neglect the rare ones.
      To solve this, NASA’s team turned to focal loss, a strategy that reduces the model’s attention to obvious examples and shifts focus toward the harder, underrepresented cases. 
      The result? A model that performs better across the board, especially on the keywords that matter most to specialists searching for niche datasets.
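      The contrast between cross-entropy and focal loss can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the GKR training code: the article does not give the team's hyperparameters, so gamma = 2.0 and alpha = 0.25 are simply the commonly cited defaults, and the two probabilities are made-up examples of an "easy" and a "hard" prediction.

```python
import math


def binary_cross_entropy(p: float, y: int) -> float:
    """Standard cross-entropy for one keyword; p is the predicted probability."""
    p_t = p if y == 1 else 1.0 - p  # probability assigned to the true class
    return -math.log(p_t)


def focal_loss(p: float, y: int, gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Focal loss: cross-entropy scaled by (1 - p_t)^gamma.

    The scaling factor shrinks toward 0 for confident, correct predictions,
    so easy examples contribute little and hard ones dominate the loss.
    gamma and alpha are assumed defaults, not values from the article.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = 0.95  # assumed: common keyword, model is confidently right
hard = 0.10  # assumed: rare keyword, model mostly misses it

# How many times larger is the loss on the hard example than on the easy one?
bce_ratio = binary_cross_entropy(hard, 1) / binary_cross_entropy(easy, 1)
focal_ratio = focal_loss(hard, 1) / focal_loss(easy, 1)

print(f"cross-entropy hard/easy ratio: {bce_ratio:.1f}")
print(f"focal loss hard/easy ratio:    {focal_ratio:.1f}")
```

      With these example numbers the hard case outweighs the easy one far more strongly under focal loss than under cross-entropy, which is the shift of attention toward underrepresented keywords described above.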
      From Metadata to Mission
      Ultimately, science depends not only on collecting data, but on making that data usable and discoverable. The updated GKR tool is a quiet but critical part of that mission. By bringing powerful AI to the task of metadata tagging, it helps ensure that the flood of Earth observation data pouring in from satellites and instruments around the globe doesn’t get lost in translation.
      In a world awash with data, tools like GKR help researchers find the signal in the noise and turn information into insight.
      Beyond powering GKR, the INDUS large language model is also enabling innovation across other NASA SMD projects. For example, INDUS supports the Science Discovery Engine by helping automate metadata curation and improving the relevancy ranking of search results. The diverse applications reflect INDUS’s growing role as a foundational AI capability for SMD.
      The INDUS large language model is funded by the Office of the Chief Science Data Officer within NASA’s Science Mission Directorate at NASA Headquarters in Washington. The Office of the Chief Science Data Officer advances scientific discovery through innovative applications and partnerships in data science, advanced analytics, and artificial intelligence.
      Last Updated Jul 09, 2025
      Related Terms: Science & Research, Artificial Intelligence (AI)
      View the full article
    • By European Space Agency
      The second of the Meteosat Third Generation (MTG) satellites and the first instrument for the Copernicus Sentinel-4 mission lifted off at 23:04 CEST on Tuesday, 1 July. The satellite is now on its way to monitor Earth’s atmosphere from an altitude of 36 000 km. From this geostationary orbit, the missions can provide game-changing data for forecasting severe storms and air pollution over Europe.
      View the full article
    • By NASA
      Credit: NASA/Krystofer Kim
      Read this story in Spanish here.
      On Tuesday, NASA released the first episode of the third season of Universo curioso de la NASA, the agency’s only Spanish-language podcast.
      Episodes focus on some of NASA’s top missions and research topics for 2025, bringing the wonder of exploration, space technology, and scientific discoveries to Spanish-speaking audiences around the world. 
      “NASA Science is literally everywhere, transcending geography and language to provide real time benefits to everyday lives across the globe using our scientific innovations, data, and discoveries from the unique vantage point of space,” said Dr. Nicky Fox, associate administrator, Science Mission Directorate, at NASA Headquarters in Washington. “The Universo curioso de la NASA podcast shares NASA’s discoveries with Spanish-speaking communities across the globe, inspiring future explorers to join our journey as we return to the Moon and venture onward to Mars for the benefit of all humanity.”


      New episodes will post every month through the end of the year. The first episode, centered on the science objectives of NASA’s Artemis II mission to the Moon, is available at:
      https://go.nasa.gov/4l9lmbN

      Universo curioso is hosted by Noelia González, communications specialist at NASA’s Goddard Space Flight Center in Greenbelt, Maryland. This season introduces co-host Andrés Almeida, technical writer and host of NASA’s Small Steps, Giant Leaps podcast at NASA Headquarters. Throughout the season, listeners will celebrate the legacy of NASA’s Hubble Space Telescope, learn about an upcoming mission to the Sun, and explore dark energy and how the future Roman Space Telescope will study it, among other topics.

      Universo curioso de la NASA is a joint initiative of the agency’s Spanish-language communications and audio programs. The new season, as well as previous episodes, is available on Apple Podcasts, Spotify, SoundCloud, and NASA’s website.

      Listen to the podcast and download related art materials at:
      https://ciencia.nasa.gov/universocurioso
      Last Updated Jul 01, 2025
      Editor: Jessica Taveau
      Location: NASA Headquarters
      Related Terms: Podcasts, General
      View the full article
    • By NASA
      The four crew members of NASA’s SpaceX Crew-11 mission to the International Space Station train inside a SpaceX Dragon spacecraft in Hawthorne, California. From left to right: Roscosmos cosmonaut Oleg Platonov, NASA astronauts Mike Fincke and Zena Cardman, and JAXA astronaut Kimiya Yui. Credit: SpaceX
      Media accreditation is open for the launch of NASA’s 11th rotational mission of a SpaceX Falcon 9 rocket and Dragon spacecraft carrying astronauts to the International Space Station for a science expedition. NASA’s SpaceX Crew-11 mission is targeted to launch in the late July/early August timeframe from Launch Complex 39A at the agency’s Kennedy Space Center in Florida.
      The mission includes NASA astronauts Zena Cardman, serving as commander; Mike Fincke, pilot; JAXA (Japan Aerospace Exploration Agency) astronaut Kimiya Yui, mission specialist; and Roscosmos cosmonaut Oleg Platonov, mission specialist. This is the first spaceflight for Cardman and Platonov, the fourth trip for Fincke, and the second for Yui, to the orbiting laboratory.
      Media accreditation deadlines for the Crew-11 launch as part of NASA’s Commercial Crew Program are as follows:
      International media without U.S. citizenship must apply by 11:59 p.m. EDT on Sunday, July 6.
      U.S. media and U.S. citizens representing international media organizations must apply by 11:59 p.m. on Monday, July 14.
      All accreditation requests must be submitted online at:
      https://media.ksc.nasa.gov
      NASA’s media accreditation policy is online. For questions about accreditation or special logistical requests, email: ksc-media-accreditat@mail.nasa.gov. Requests for space for satellite trucks, tents, or electrical connections are due by Monday, July 14.
      For other questions, please contact NASA Kennedy’s newsroom at: 321-867-2468.
      For information on Spanish-language coverage at Kennedy Space Center, or to request interviews in Spanish, contact Antonia Jaramillo: 321-501-8425, or Messod Bendayan: 256-930-1371.
      For launch coverage and more information about the mission, visit:
      https://www.nasa.gov/commercialcrew
      -end-
      Joshua Finch / Claire O’Shea
      Headquarters, Washington
      202-358-1100
      joshua.a.finch@nasa.gov / claire.a.o'shea@nasa.gov
      Steve Siceloff / Stephanie Plucinsky
      Kennedy Space Center, Florida
      321-867-2468
      steven.p.siceloff@nasa.gov / stephanie.n.plucinsky@nasa.gov
      Joseph Zakrzewski
      Johnson Space Center, Houston
      281-483-5111
      joseph.a.zakrzewski@nasa.gov
      Last Updated Jul 01, 2025
      Editor: Jessica Taveau
      Location: NASA Headquarters
      Related Terms: Commercial Crew, Commercial Space, Humans in Space, International Space Station (ISS), ISS Research, Space Operations Mission Directorate
      View the full article