How GDS Used Data to Improve Content and User Journeys on GOV.UK

GOV.UK hosts more than 400,000 pieces of content, and users found it hard to move between related pages. The Government Digital Service (GDS) Data Science team used machine learning and network analysis to identify related content automatically and to generate related links. The work improved journeys for more than 10,000 users a day and reduced internal search usage by 20-40%.

Project Overview
Timeline: May - August 2019
Team: GDS Data Science Team
Tools: Machine Learning, Network Analysis, Google Universal Sentence Encoder

Background

Content Scale
  • 400,000+ unique content pieces
  • 700 new pieces published weekly
  • Only 2% of content had related links
Challenges
  • Duplicate content
  • Navigation difficulties
  • Limited content relationships

Technical Approach

Semantic Vectors
  • Used Google's Universal Sentence Encoder
  • Identified similar content
  • Supported taxonomy decisions
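
A minimal sketch of how semantic vectors can surface similar content, assuming each page has already been encoded as a vector. The three-dimensional vectors, the page paths, and the function names below are hypothetical and chosen for illustration; real sentence embeddings have hundreds of dimensions, and the source does not describe the team's actual similarity measure. Cosine similarity is used here as a common choice.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def most_similar(target, embeddings, top_n=2):
    """Rank every other page by similarity to the target page."""
    scores = [
        (path, cosine_similarity(embeddings[target], vector))
        for path, vector in embeddings.items()
        if path != target
    ]
    scores.sort(key=lambda item: item[1], reverse=True)
    return scores[:top_n]


# Hypothetical toy embeddings standing in for sentence-encoder output.
embeddings = {
    "/apply-passport": [0.9, 0.1, 0.0],
    "/renew-passport": [0.8, 0.2, 0.1],
    "/vehicle-tax": [0.0, 0.1, 0.9],
}

ranked = most_similar("/apply-passport", embeddings)
```

Pages whose vectors score close to 1.0 against each other are candidates for duplicate review or for grouping under the same taxonomy branch.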
Network Analysis
  • Created Network Data Pipeline
  • Analyzed user journeys
  • Compared functional vs structural networks
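
One way to picture the functional versus structural comparison, assuming the functional network is built from the page-to-page moves users actually make and the structural network from the hyperlinks that exist between pages. The session data, page paths, and function names are hypothetical; this is a sketch of the idea, not the team's pipeline.

```python
from collections import Counter


def functional_edges(sessions):
    """Count how often users move directly from one page to another."""
    counts = Counter()
    for pages in sessions:
        for source, target in zip(pages, pages[1:]):
            if source != target:  # ignore page reloads
                counts[(source, target)] += 1
    return counts


def missing_links(functional, structural, min_count=2):
    """Moves users make often that no existing hyperlink supports."""
    return sorted(
        edge
        for edge, count in functional.items()
        if count >= min_count and edge not in structural
    )


# Hypothetical sessions: each list is the ordered pages of one visit.
sessions = [
    ["/apply-passport", "/passport-fees", "/passport-photos"],
    ["/apply-passport", "/passport-fees"],
    ["/passport-fees", "/passport-photos"],
    ["/apply-passport", "/passport-photos"],
]

# Hypothetical structural network: hyperlinks that already exist.
structural = {("/apply-passport", "/passport-fees")}

gaps = missing_links(functional_edges(sessions), structural)
```

Edges that appear often in the functional network but not in the structural one suggest journeys where users are finding their own way, for example through search, and where a related link could help.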
Machine Learning
  • Implemented the node2vec algorithm
  • Generated automated related links
  • Conducted A/B testing
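
node2vec learns page embeddings from biased random walks over a graph. The sketch below covers only the walk-sampling step on an unweighted graph; the full algorithm then feeds the walks into a skip-gram model, which is omitted here. The toy graph, page paths, and parameter values are hypothetical, and this is not the team's implementation.

```python
import random


def node2vec_walk(graph, start, length, p=1.0, q=1.0, rng=None):
    """Sample one second-order biased walk.

    p is the return parameter: a lower p makes stepping back more likely.
    q is the in-out parameter: a lower q pushes the walk further away.
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbours = sorted(graph.get(current, ()))
        if not neighbours:
            break  # dead end: stop the walk early
        if len(walk) == 1:
            # The first step has no previous node, so choose uniformly.
            walk.append(rng.choice(neighbours))
            continue
        previous = walk[-2]
        weights = []
        for candidate in neighbours:
            if candidate == previous:
                weights.append(1.0 / p)  # return to the previous node
            elif candidate in graph.get(previous, ()):
                weights.append(1.0)      # stay close to the previous node
            else:
                weights.append(1.0 / q)  # move outward
        walk.append(rng.choices(neighbours, weights=weights, k=1)[0])
    return walk


# Hypothetical undirected page graph, stored as adjacency sets.
graph = {
    "/a": {"/b", "/c"},
    "/b": {"/a", "/c", "/d"},
    "/c": {"/a", "/b"},
    "/d": {"/b"},
}

walk = node2vec_walk(graph, "/a", length=5, p=0.5, q=2.0, rng=random.Random(0))
```

Pages that co-occur in many walks end up with similar embeddings, so the nearest neighbours of a page in that embedding space become its candidate related links.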

Results and Impact

Improvements
  • Enhanced user journeys for 10,000+ users daily
  • 20-40% reduction in internal search usage
  • Automated link generation every 3 weeks
Quality Controls
  • Confidence thresholds for generated links
  • Content designer review for top 200 pages
  • Exclusion rules for inappropriate linking
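
A minimal sketch of how a confidence threshold and exclusion rules might be applied to generated links. The scores, threshold, excluded prefix, and link cap are hypothetical values chosen for illustration; the source does not state the actual rules or thresholds used.

```python
def select_links(candidates, threshold=0.5, excluded_prefixes=(), max_links=5):
    """Keep the highest-scoring links per page that pass every check."""
    links = {}
    ranked = sorted(candidates, key=lambda item: item[2], reverse=True)
    for source, target, score in ranked:
        if score < threshold:
            continue  # below the confidence threshold
        if source == target:
            continue  # a page should not link to itself
        if target.startswith(tuple(excluded_prefixes)):
            continue  # target falls under an exclusion rule
        chosen = links.setdefault(source, [])
        if len(chosen) < max_links and target not in chosen:
            chosen.append(target)
    return links


# Hypothetical candidates: (source page, target page, confidence score).
candidates = [
    ("/apply-passport", "/renew-passport", 0.91),
    ("/apply-passport", "/news/passport-office-update", 0.88),
    ("/apply-passport", "/passport-fees", 0.74),
    ("/apply-passport", "/vehicle-tax", 0.12),
]

selected = select_links(candidates, threshold=0.5, excluded_prefixes=("/news/",))
```

Links that survive the automated checks could still be reviewed by content designers on the most visited pages, as described above.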

Key Learnings

  • Automated links can effectively supplement manual curation
  • Data-driven approach improved user navigation
  • Regular updates maintain link relevance
  • Balance needed between automation and human oversight